Winged helix DNA-binding proteins share a related winged helix-turn-helix DNA-binding motif, where the "wings", or loops, are small beta-sheets. The winged helix motif consists of two wings (W1, W2), three alpha helices (H1, H2, H3) and three beta-strands (S1, S2, S3) arranged in the order H1-S1-H2-H3-S2-W1-S3-W2. The DNA-recognition helix makes sequence-specific DNA contacts with the major groove of DNA, while the wings make different DNA contacts, often with the minor groove or the backbone of DNA. Several winged-helix proteins display an exposed patch of hydrophobic residues thought to mediate protein-protein interactions.
Many different proteins with diverse biological functions contain a winged helix DNA-binding domain, including transcriptional repressors such as biotin repressor, LexA repressor and the arginine repressor; transcription factors such as the hepatocyte nuclear factor-3 proteins involved in cell differentiation, heat-shock transcription factor, and the general transcription factors TFIIE and TFIIF; helicases such as RuvB that promotes branch migration at the Holliday junction, and CDC6 in the pre-replication complex; endonucleases such as FokI and TnsA; histones; and Mu transposase, where the flexible wing of the enhancer-binding domain is essential for efficient transposition.
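The element order given for the winged helix motif (H1-S1-H2-H3-S2-W1-S3-W2) can be treated as a small data structure. The following is a minimal sketch, not an established tool: the function names and the short kind labels ("helix", "beta", "wing") are illustrative assumptions, and only the topology string itself comes from the description above.

```python
# Topology string taken from the motif description; everything else is illustrative.
TOPOLOGY = "H1-S1-H2-H3-S2-W1-S3-W2"

# Assumed short labels for the three element classes (H, S, W).
KINDS = {"H": "helix", "S": "beta", "W": "wing"}


def parse_topology(topology):
    """Split a dash-separated topology string into (label, kind) pairs, in order."""
    return [(label, KINDS[label[0]]) for label in topology.split("-")]


def count_by_kind(elements):
    """Count how many elements of each kind occur in a parsed topology."""
    counts = {}
    for _, kind in elements:
        counts[kind] = counts.get(kind, 0) + 1
    return counts


elements = parse_topology(TOPOLOGY)
counts = count_by_kind(elements)
print(elements)
print(counts)
```

Parsing the string this way makes it easy to check that a stated topology agrees with the stated element counts (three helices, three S elements, two wings).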
Cytochrome c oxidase is an oligomeric enzymatic complex that is a component of the respiratory chain complex and is involved in the transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane.
In eukaryotes, in addition to the three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there are a variable number of small polypeptide subunits. One of these subunits is the potentially haem-binding subunit, VIb, which is encoded in the nucleus.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein L11 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L11 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, plant chloroplast, red algal chloroplast, cyanelle and archaebacterial L11; and mammalian, plant and yeast L12 (YL15). L11 is a protein of 140 to 165 amino-acid residues. In E. coli, the C-terminal half of L11 has been shown to be in an extended and loosely folded conformation and is likely to be buried within the ribosomal structure.
Members of this entry adopt a structure consisting of four alpha helices, arranged in an array. They bind specifically and directly to the xeroderma pigmentosum group C protein (XPC) to initiate nucleotide excision repair.
Homeodomain proteins are transcription factors that share a related DNA binding homeodomain. The homeodomain was first identified in a number of Drosophila homeotic and segmentation proteins, but is now known to be well conserved in many other animals, including vertebrates. The domain binds DNA through a helix-turn-helix (HTH) structure. The HTH motif is characterised by two alpha helices, which make intimate contacts with the DNA and are joined by a short turn. The second helix binds to DNA via a number of hydrogen bonds and hydrophobic interactions, which occur between specific side chains and the exposed bases and thymine methyl groups within the major groove of the DNA. The first helix helps to stabilise the structure. Many proteins contain homeodomains, including Drosophila Engrailed, yeast mating type proteins, hepatocyte nuclear factor 1a and HOX proteins.
The homeodomain motif is very similar in sequence and structure to domains in a wide range of DNA-binding proteins, including recombinases, Myb proteins, GARP response regulators, human telomeric proteins (hTRF1), paired domain proteins (PAX), yeast RAP1, centromere-binding proteins CENP-B and ABP-1, transcriptional regulators (TyrR), AraC-type transcriptional activators, and tetracycline repressor-like proteins (TetR, QacR, YcdC).
Iron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S]. FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems.
The ISC system is conserved in eubacteria and eukaryotes (mitochondria), and has broad specificity, targeting general FeS proteins. It is encoded by the isc operon (iscRSUA-hscBA-fdx-iscX). IscS is a cysteine desulphurase, which obtains S from cysteine (converting it to alanine) and serves as a S donor for FeS cluster assembly. IscU and IscA act as scaffolds to accept S and Fe atoms, assembling clusters and transferring them to recipient apoproteins. HscA is a molecular chaperone and HscB is a co-chaperone. Fdx is a [2Fe-2S]-type ferredoxin. IscR is a transcription factor that regulates expression of the isc operon. IscX (also known as YfhJ) appears to interact with IscS and may function as an Fe donor during cluster assembly.
The SUF system is an alternative pathway to the ISC system that operates under iron starvation and oxidative stress. It is found in eubacteria, archaea and eukaryotes (plastids). The SUF system is encoded by the suf operon (sufABCDSE), and the six encoded proteins are arranged into two complexes (SufSE and SufBCD) and one protein (SufA). SufS is a pyridoxal-phosphate (PLP) protein displaying cysteine desulphurase activity. SufE acts as a scaffold protein that accepts S from SufS and donates it to SufA. SufC is an ATPase with an unorthodox ATP-binding cassette (ABC)-like component. No specific functions have been assigned to SufB and SufD. SufA is homologous to IscA, acting as a scaffold protein in which Fe and S atoms are assembled into [FeS] cluster forms, which can then easily be transferred to apoprotein targets.
In the NIF system, NifS and NifU are required for the formation of metalloclusters of nitrogenase in Azotobacter vinelandii and other organisms, as well as in the maturation of other FeS proteins. Nitrogenase catalyses the fixation of nitrogen. It contains a complex cluster, the FeMo cofactor, which contains molybdenum, Fe and S. NifS is a cysteine desulphurase. NifU binds one Fe atom at its N terminus, assembling an FeS cluster that is transferred to nitrogenase apoproteins. Nif proteins involved in the formation of FeS clusters can also be found in organisms that do not fix nitrogen.
This entry represents IscX proteins (also known as hypothetical protein YfhJ) that are part of the ISC system. IscX is active as a monomer. The structure of YfhJ is an orthogonal alpha-bundle. YfhJ is a small acidic protein that binds IscS, and contains a modified winged helix motif that is usually found in DNA-binding proteins. YfhJ/IscX can bind Fe, and may function as an Fe donor in the assembly of FeS clusters.
This entry represents a multi-helical domain found in several NAD or NADP-utilizing dehydrogenases, including 6-phosphogluconate dehydrogenase, classes I and II ketol-acid reductoisomerases, L-3-hydroxyacyl CoA dehydrogenase, UDP-glucose dehydrogenase, glycerol-3-phosphate dehydrogenase, ketopantoate reductase, N-(1-D-carboxyethyl)-L-norvaline dehydrogenase, and mannitol 2-dehydrogenase. This domain is often found in the C-terminal region of the protein.
Fumarate reductase catalyses the reduction of fumarate to succinate, coupling the reaction to the oxidation of quinol to quinone. This reaction is opposite to that catalysed by succinate dehydrogenase. This entry represents the C-terminal domain of fumarate reductase, which is structurally related to the N-terminal domain of dihydropyrimidine dehydrogenase, an enzyme that catalyses the NADPH-dependent conversion of pyrimidines to 5,6-dihydro compounds.
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.
Phosphatidylinositol 3-kinase (PI3-kinase) is an enzyme that phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol ring. The three products of PI3-kinase (PI-3-P, PI-3,4-P(2) and PI-3,4,5-P(3)) function as second messengers in cell signalling. Phosphatidylinositol 4-kinase (PI4-kinase) is an enzyme that acts on phosphatidylinositol (PI) in the first committed step in the production of the second messenger inositol-1,4,5-trisphosphate. This domain is also present in a wide range of protein kinases involved in diverse cellular functions, such as control of cell growth, regulation of cell cycle progression, a DNA damage checkpoint, recombination, and maintenance of telomere length. Despite significant homology to lipid kinases, no lipid kinase activity has been demonstrated for any of the PIK-related kinases.
The PI3- and PI4-kinases share a well conserved domain at their C-terminal section; this domain seems to be distantly related to the catalytic domain of protein kinases. The catalytic domain of PI3K has the typical bilobal structure that is seen in other ATP-dependent kinases, with a small N-terminal lobe and a large C-terminal lobe. The core of this domain is the most conserved region of the PI3Ks. The ATP cofactor binds in the crevice formed by the N- and C-terminal lobes; a loop between two strands provides a hydrophobic pocket for binding of the adenine moiety, and a lysine residue interacts with the alpha-phosphate. In contrast to protein kinases, the PI3K loop that interacts with the phosphates of the ATP, known as the glycine-rich or P-loop, contains no glycine residues. Instead, contact with the ATP -phosphate is maintained through the side chain of a conserved serine residue.
The cytochrome bd type terminal oxidases catalyse quinol-dependent, Na+-independent oxygen uptake. Members of this family are integral membrane proteins and contain a protohaem IX centre B558.
Cytochrome bd may play an important role in microaerobic nitrogen fixation in the enteric bacterium Klebsiella pneumoniae, where it is expressed under all conditions that permit diazotrophy.
The 14 kDa (or VI) subunit of the complex is not directly involved in electron transfer, but has a role in assembly of the complex.
This entry represents the ribosomal protein L19 from eukaryotes, as well as L19e from archaea. L19/L19e is absent in bacteria. L19/L19e is part of the large ribosomal subunit, whose structure has been determined in a number of eukaryotic and archaeal species. L19/L19e is a multi-helical protein consisting of two different 3-helical domains connected by a long, partly helical linker. This entry represents an alpha-helical domain that assumes an orthogonal bundle topology.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.
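The classification described above can be written down as a simple lookup. This is a minimal sketch under stated assumptions: the dictionary layout and the function names `classify` and `type_i_uses_atp` are hypothetical, and the code encodes only what the text states (which enzymes belong to types IA, IB and II, how many strands each type breaks, and that reverse gyrase is the exception to ATP independence among type I enzymes).

```python
# Membership lists follow the classification given in the text; names are illustrative.
TYPE_I_SUBTYPES = {
    "IA": ("bacterial and archaeal topoisomerase I", "topoisomerase III", "reverse gyrase"),
    "IB": ("eukaryotic topoisomerase I", "topoisomerase V"),
}
TYPE_II_ENZYMES = ("topoisomerase II", "topoisomerase IV", "topoisomerase VI")


def classify(enzyme):
    """Return (type label, number of DNA strands broken) for a named enzyme."""
    for subtype, members in TYPE_I_SUBTYPES.items():
        if enzyme in members:
            return subtype, 1  # type I enzymes break a single strand
    if enzyme in TYPE_II_ENZYMES:
        return "II", 2  # type II enzymes break both strands
    raise KeyError(f"unclassified enzyme: {enzyme}")


def type_i_uses_atp(enzyme):
    """Type I enzymes are ATP-independent, with reverse gyrase as the stated exception."""
    _, strands = classify(enzyme)
    if strands != 1:
        raise ValueError("only defined for type I enzymes")
    return enzyme == "reverse gyrase"


print(classify("eukaryotic topoisomerase I"))
print(type_i_uses_atp("reverse gyrase"))
```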
This entry represents the alpha/beta subdomain that comprises part of the catalytic core of eukaryotic and viral topoisomerase I (type IB) enzymes, which occurs near the C-terminal region of the protein.
Human topoisomerase I has been shown to be inhibited by camptothecin (CPT), a plant alkaloid with antitumour activity. The crystal structures of human topoisomerase I comprising the core and carboxyl-terminal domains in covalent and noncovalent complexes with 22-base pair DNA duplexes reveal an enzyme that "clamps" around essentially B-form DNA. The core domain and the first eight residues of the carboxyl-terminal domain of the enzyme, including the active-site nucleophile tyrosine-723, share significant structural similarity with the bacteriophage family of DNA integrases. A binding mode for the anticancer drug camptothecin has been proposed on the basis of chemical and biochemical information combined with the three-dimensional structures of topoisomerase I-DNA complexes.
Vaccinia virus, a cytoplasmically-replicating poxvirus, encodes a type I DNA topoisomerase that is biochemically similar to eukaryotic-like DNA topoisomerases I, and which has been widely studied as a model topoisomerase. It is the smallest topoisomerase known and is unusual in that it is resistant to the potent chemotherapeutic agent camptothecin. The crystal structure of an amino-terminal fragment of vaccinia virus DNA topoisomerase I shows that the fragment forms a five-stranded, antiparallel beta-sheet with two short alpha-helices and connecting loops. Residues that are conserved between all eukaryotic-like type I topoisomerases are not clustered in particular regions of the structure.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
Mammalian DNA polymerase beta (polB) is a 39-kDa protein with both nucleotidyltransferase and 5'-deoxyribose phosphodiesterase activities, playing a role in both excision repair and meiosis. polB has a modular organisation with an 8-kDa N-terminal domain (NTD) connected to the 31-kDa C-terminal domain by a protease-hypersensitive hinge region. The NTD acts as a single-stranded DNA binding domain, interacting most efficiently with the 5'-phosphate of the downstream primer of the gapped DNA. This interaction is mediated by a helix-hairpin-helix motif (HhH), which is also found in several other DNA repair enzymes. The residue threonine 79 (T79), which is located within the NTD, was identified as being critical to polB function, even though it makes no contact with either DNA template or dNTP substrate; T79 is located between two HhH motifs, and acts as a hinge residue that is important for positioning the DNA within the active site.
The catalytic core (residues 148-242) of murine terminal deoxynucleotidyl transferase (TdT) displays a structural fold that is similar to polB, and shares a common two-metal ion mechanism of nucleotidyl transfer with polB. TdT elongates DNA strands in a template-independent manner, and belongs to the pol X family of polymerases. TdT has only been found in vertebrates, where it is highly conserved. TdT brings additional diversity in the immune repertoire by adding nucleotides, called N regions, to the V(D)J recombination junction sites of immunoglobulin and T-cell receptor genes.
Sterile alpha motif (SAM) domains are known to be involved in diverse protein-protein interactions, associating with both SAM-containing and non-SAM-containing proteins. SAM domains exhibit a conserved structure, consisting of a 4-5-helical bundle of two orthogonally packed alpha-hairpins. However, SAM domains display a diversity of function, being involved in interactions with proteins, DNA and RNA. The name sterile alpha motif arose from its presence in proteins that are essential for yeast sexual differentiation. The SAM domain has had various names, including SPM, PTN (pointed), SEP (yeast sterility, Ets-related, PcG proteins), NCR (N-terminal conserved region) and HLH (helix-loop-helix) domain, all of which are related and can be classified as SAM domains.
SAM domains occur in eukaryotic and in some bacterial proteins. Structures have been determined for several proteins that contain SAM domains, including Ets-1 transcription factor, which plays a role in the development and invasion of tumour cells by regulating the expression of matrix-degrading proteases; Etv6 transcription factor, gene rearrangements of which have been demonstrated in several malignancies; EphA4 receptor tyrosine kinase, which is believed to be important for the correct localization of a motoneuron pool to a specific position in the spinal cord; EphB2 receptor, which is involved in spine morphogenesis via intersectin, Cdc42 and N-Wasp; p73, a p53 homologue involved in neuronal development; and polyhomeotic, which is a member of the Polycomb group of genes (Pc-G) required for the maintenance of the spatial expression pattern of homeotic genes.
Members of the recently discovered ARID (AT-rich interaction domain) family of DNA-binding proteins are found in fungi and invertebrate and vertebrate metazoans. ARID-encoding genes are involved in a variety of biological processes including embryonic development, cell lineage gene regulation and cell cycle control. Although the specific roles of this domain and of ARID-containing proteins in transcriptional regulation are yet to be elucidated, they include both positive and negative transcriptional regulation and a likely involvement in the modification of chromatin structure. The basic structure of the ARID domain appears to be a series of six alpha-helices separated by beta-strands, loops, or turns, but the structured region may extend to an additional helix at either or both ends of the basic six. Based on primary sequence homology, they can be partitioned into three structural classes: Minimal ARID proteins that consist of a core domain formed by six alpha helices; ARID proteins that supplement the core domain with an N-terminal alpha-helix; and Extended-ARID proteins, which contain the core domain and additional alpha-helices at their N- and C-termini.
The human SWI-SNF complex protein p270 is an ARID family member with non-sequence-specific DNA binding activity. The ARID consensus and other structural features are common to both p270 and yeast SWI1, suggesting that p270 is a human counterpart of SWI1. The approximately 100-residue ARID sequence is present in a series of proteins strongly implicated in the regulation of cell growth, development, and tissue-specific gene expression. Although about a dozen ARID proteins can be identified from database searches, to date, only Bright (a regulator of B-cell-specific gene expression), dead ringer (a Drosophila melanogaster gene product required for normal development), and MRF-2 (which represses expression from the Cytomegalovirus enhancer) have been analyzed directly in regard to their DNA binding properties. Each binds preferentially to AT-rich sites. In contrast, p270 shows no sequence preference in its DNA binding activity, thereby demonstrating that AT-rich binding is not an intrinsic property of ARID domains and that ARID family proteins may be involved in a wider range of DNA interactions.
The "beige" mouse is established as an animal model of Chediak-Higashi Syndrome (CHS). The BEACH domain was described in the BEIGE protein (D1035670) and in the highly homologous CHS protein. It is also found in distantly related proteins, such as those annotated as factor associated with neutral sphingomyelinase activation.
The BEACH domain is usually followed by a series of WD repeats. The function of the BEACH domain is unknown.
A number of eukaryotic and archaebacterial large subunit ribosomal proteins can be grouped on the basis of sequence similarities. These proteins are very basic. About 50 residues long, they are the smallest proteins of eukaryotic-type ribosomes.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.
This family constitutes the mitochondrial ATP synthase epsilon subunit, which is distinct from the bacterial epsilon subunit (the latter being homologous to the mitochondrial delta subunit). The mitochondrial epsilon subunit is located in the stalk region of the F1 complex, and acts as an inhibitor of the ATPase catalytic core. The epsilon subunit can assume two conformations, contracted and extended, where the latter inhibits ATP hydrolysis. The conformation of the epsilon subunit is determined by the direction of rotation of the gamma subunit, and possibly by the presence of ADP. The epsilon subunit is thought to become extended in the presence of ADP, thereby acting as a safety lock to prevent wasteful ATP hydrolysis.
More information about this protein can be found at Protein of the Month: ATP Synthases.
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases.
Casein kinase, a ubiquitous, well-conserved protein kinase involved in cell metabolism and differentiation, is characterised by its preference for Ser or Thr in acidic stretches of amino acids. The enzyme is a tetramer of 2 alpha- and 2 beta-subunits. However, some species (e.g., mammals) possess 2 related forms of the alpha-subunit (alpha and alpha'), while others (e.g., fungi) possess 2 related beta-subunits (beta and beta'). The alpha-subunit is the catalytic unit and contains regions characteristic of serine/threonine protein kinases. The beta-subunit is believed to be regulatory, possessing an N-terminal auto-phosphorylation site, an internal acidic domain, and a potential metal-binding motif. The beta subunit is a highly conserved protein of about 25 kD that contains, in its central section, a cysteine-rich motif, CX(n)C, that could be involved in binding a metal such as zinc. The mammalian beta-subunit gene promoter shares common features with those of other mammalian protein kinases and is closely related to the promoter of the regulatory subunit of cAMP-dependent protein kinase.
This entry represents the N-terminal alpha-helical domain, which has an orthogonal bundle topology.
Histones mediate DNA organisation and play a dominant role in regulating eukaryotic transcription. The histone-fold consists of a core of three helices, where the long middle helix is flanked at each end by shorter ones. Proteins displaying this structure include the nucleosome core histones, which form octamers composed of two copies of each of the four histones, H2A, H2B, H3 and H4; archaeal histone, which possesses only the core domain part of eukaryotic histone; and the TATA-box binding protein (TBP)-associated factors (TAF), where the histone fold is a common motif for mediating TAF-TAF interactions. TAF proteins include TAF(II)18 and TAF(II)28, which form a heterodimer, TAF(II)42 and TAF(II)62, which form a heterotetramer similar to (H3-H4)2, and the negative cofactor 2 (NC2) alpha and beta chains, which form a heterodimer. The TAF proteins are a component of transcription factor IID (TFIID), along with the TBP protein. TFIID forms part of the pre-initiation complex on core promoter elements required for RNA polymerase II-dependent transcription. The TAF subunits of TFIID mediate transcriptional activation of subsets of eukaryotic genes. The NC2 complex mediates the inhibition of TATA-dependent transcription through interactions with TBP.
This domain consists of a duplication of two EF-hand units, where each unit is composed of two helices connected by a twelve-residue calcium-binding loop. The calcium ion in the EF-hand loop is coordinated in a pentagonal bipyramidal configuration. Many calcium-binding proteins contain an EF-hand type calcium-binding domain. These include: calbindin D9K, S100 proteins such as calcyclin, polcalcin phl p 7 (a calcium-binding pollen allergen), osteonectin, parvalbumin, the calmodulin family of proteins (troponin C, caltractin, cdc4p, myosin essential chain, calcineurin, recoverin, neurocalcin), plasmodial-specific CaII-binding protein Cbp40, penta-EF-Hand proteins (sorcin, grancalcin, calpain), as well as multidomain proteins such as phosphoinositide-specific phospholipase C, dystrophin, Cbl and alpha-actinin. The fold consists of four helices and an open array of two hairpins.
The SWI/SNF family of complexes, which are conserved from yeast to humans, are ATP-dependent chromatin-remodelling proteins that facilitate transcription activation. The mammalian complexes are made up of 9-12 proteins called BAFs (BRG1-associated factors). The BAF60 family have at least three members: BAF60a, which is ubiquitous, BAF60b and BAF60c, which are expressed in muscle and pancreatic tissues, respectively. BAF60b is present in alternative forms of the SWI/SNF complex, including complex B (SWIB), which lacks BAF60a. The SWIB domain is a conserved region found within the BAF60b proteins, and can be found fused to the C-terminus of DNA topoisomerase in Chlamydia.
MDM2 is an oncoprotein that acts as a cellular inhibitor of the p53 tumour suppressor by binding to the transactivation domain of p53 and suppressing its ability to activate transcription. p53 acts in response to DNA damage, inducing cell cycle arrest and apoptosis. Inactivation of p53 is a common occurrence in neoplastic transformations. The core of MDM2 folds into an open bundle of four helices, which is capped by two small 3-stranded beta-sheets. It consists of a duplication of two structural repeats. MDM2 has a deep hydrophobic cleft on which the p53 alpha-helix binds; p53 residues involved in transactivation are buried deep within the cleft of MDM2, thereby concealing the p53 transactivation domain.
The SWIB and MDM2 domains are homologous and share a common fold.
The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.
This entry represents the M domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M domain, which binds both the 7S RNA and the signal sequence. The extreme C-terminal region is glycine-rich, of low complexity, and poorly conserved between species.
These proteins include Escherichia coli and Bacillus subtilis ffh protein (P48), which seems to be the prokaryotic counterpart of SRP54; signal recognition particle receptor alpha subunit (docking protein), an integral membrane GTP-binding protein which ensures, in conjunction with SRP, the correct targeting of nascent secretory proteins to the endoplasmic reticulum membrane; bacterial FtsY protein, which is believed to play a similar role to that of the docking protein in eukaryotes; the pilA protein from Neisseria gonorrhoeae, the homolog of ftsY; and bacterial flagellar biosynthesis protein flhF.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils.
Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions (differs between eukaryotic and bacterial enzymes), domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA.
Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.
This entry represents a mainly alpha helical domain of subunit A (gyrA and parC) of bacterial gyrase and topoisomerase IV. It does not include the topoisomerase II enzymes composed of a single polypeptide, as are found in most eukaryotes.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
The RNA-binding domains of the ribosomal protein S15 and the influenza virus non-structural protein NS1 share the same structural fold, consisting of three helices in an irregular array. S15 is one of 21 proteins in the small, bacterial 30S ribosomal subunit, and is required for assembly of the subunit through its binding to 16S rRNA. The multifunctional glutamyl-prolyl-tRNA synthetase (EPRS) contains three tandem repeats linking two catalytic domains, all three of which contribute to RNA-binding; the second repeated element bears structural resemblance to the S15/NS1 RNA-binding domain.
The prokaryotic heat shock protein DnaJ interacts with the chaperone hsp70-like DnaK protein. Structurally, the DnaJ protein consists of an N-terminal conserved domain (called 'J' domain) of about 70 amino acids, a glycine-rich region ('G' domain) of about 30 residues, a central domain containing four repeats of a CXXCXGXG motif ('CRR' domain) and a C-terminal region of 120 to 170 residues.
Schematically, from N- to C-terminus, the arrangement is: 'J' domain, 'G' domain, 'CRR' domain, C-terminal region.
It is thought that the 'J' domain of DnaJ mediates the interaction with the DnaK protein and consists of four helices, the second of which has a charged surface that includes at least one pair of basic residues that are essential for interaction with the ATPase domain of Hsp70. The J- and CRR-domains are found in many prokaryotic and eukaryotic proteins, either together or separately. In yeast, J-domains have been classified into 3 groups; the class III proteins are functionally distinct and do not appear to act as molecular chaperones.
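The CXXCXGXG repeat of the CRR domain described above can be located in a sequence with a simple pattern scan. The following is a minimal sketch, not a curated profile: it assumes that X stands for any single residue, the function name `find_crr_repeats` is hypothetical, and the toy sequence is invented purely for illustration and is not a real DnaJ sequence.

```python
import re

# CXXCXGXG as a regular expression, assuming X means "any one residue".
# The lookahead wrapper lets overlapping occurrences be reported as well.
CRR_MOTIF = re.compile(r"(?=(C..C.G.G))")


def find_crr_repeats(sequence):
    """Return (1-based start position, matched text) for each CXXCXGXG hit."""
    return [(m.start() + 1, m.group(1)) for m in CRR_MOTIF.finditer(sequence.upper())]


# Invented toy sequence containing two copies of the motif (illustration only).
toy_sequence = "MAK" + "CPTCHGTG" + "AAA" + "CEKCGGSG" + "KK"
hits = find_crr_repeats(toy_sequence)
for start, text in hits:
    print(start, text)
```

A real CRR domain would be expected to yield four hits; a pattern this short can also match by chance, so hits in an uncharacterised protein are only suggestive.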
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein L29 is one of the proteins from the large ribosomal subunit. L29 belongs to a family of ribosomal proteins of 63 to 138 amino-acid residues that have been grouped on the basis of sequence similarities.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.
This entry represents the N-terminal domain of seryl-tRNA synthetase, which consists of two helices in a long alpha-hairpin. Seryl-tRNA synthetase exists as a monomer and belongs to class IIa.
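The class I / class II assignment listed above amounts to a partition of the 20 amino acids, which can be captured in a small lookup. This is only an illustrative sketch: the names `CLASS_I`, `CLASS_II` and `synthetase_class` are hypothetical, the membership is copied from the text above, and subclasses (a, b, c) are not modelled.

```python
# Amino acid specificities by synthetase class, as listed in the text above.
CLASS_I = frozenset({
    "arginine", "cysteine", "glutamic acid", "glutamine", "isoleucine",
    "leucine", "methionine", "tyrosine", "tryptophan", "valine",
})
CLASS_II = frozenset({
    "alanine", "asparagine", "aspartic acid", "glycine", "histidine",
    "lysine", "phenylalanine", "proline", "serine", "threonine",
})


def synthetase_class(amino_acid):
    """Return "I" or "II" for the synthetase specific for the given amino acid."""
    name = amino_acid.strip().lower()
    if name in CLASS_I:
        return "I"
    if name in CLASS_II:
        return "II"
    raise ValueError(f"not one of the 20 listed amino acids: {amino_acid!r}")


# Sanity checks: the two classes are disjoint and together cover 20 amino acids.
assert CLASS_I.isdisjoint(CLASS_II)
assert len(CLASS_I | CLASS_II) == 20

print(synthetase_class("Serine"))
```

The lookup returns class II for serine, which matches the class IIa assignment of seryl-tRNA synthetase given above.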
Synucleins are small, soluble proteins expressed primarily in neural tissue and in certain tumors. The family includes three known proteins: alpha-synuclein, beta-synuclein, and gamma-synuclein. All synucleins have in common a highly conserved alpha-helical lipid-binding motif with similarity to the class-A2 lipid-binding domains of the exchangeable apolipoproteins.
Synuclein family members are not found outside vertebrates, although they have some conserved structural similarity with plant 'late-embryo-abundant' proteins. The alpha- and beta-synuclein proteins are found primarily in brain tissue, where they are seen mainly in presynaptic terminals. The gamma-synuclein protein is found primarily in the peripheral nervous system and retina, but its expression in breast tumors is a marker for tumor progression. Normal cellular functions have not been determined for any of the synuclein proteins, although some data suggest a role in the regulation of membrane stability and/or turnover. Mutations in alpha-synuclein are associated with rare familial cases of early-onset Parkinson's disease, and the protein accumulates abnormally in Parkinson's disease, Alzheimer's disease, and several other neurodegenerative illnesses.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.
Type IA topoisomerases comprise four domains that together form a toroidal structure with a central hole large enough to accommodate single- and double-stranded DNA: domain 1 is an N-terminal alpha/beta Toprim domain, domain 2 and the C-terminal domain 4 are winged-helix domains, and domain 3 is a beta-barrel. Domains 1 (Toprim) and 3 form the active site of the enzyme, while the winged helix domains 2 and 4 form a single-strand DNA-binding groove. This entry represents the alpha-bundle subdomain 3 of the central region of topoisomerase type IA enzymes, where the central region covers both domains 2 and 3.
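The topoisomerase classification used in these entries (type I versus type II, with subtypes IA, IB, IIA and IIB) can be summarised as a small data structure. The sketch below is illustrative only: the names `TopoSubtype`, `SUBTYPES` and `is_atp_dependent` are hypothetical, and the membership and ATP rule are taken from the classification paragraphs of the topoisomerase entries in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopoSubtype:
    strands_broken: int  # 1 = single-strand break (type I), 2 = double-strand break (type II)
    members: tuple


SUBTYPES = {
    "IA": TopoSubtype(1, ("bacterial and archaeal topoisomerase I", "topoisomerase III", "reverse gyrase")),
    "IB": TopoSubtype(1, ("eukaryotic topoisomerase I", "topoisomerase V")),
    "IIA": TopoSubtype(2, ("topoisomerase II (gyrase)", "topoisomerase IV")),
    "IIB": TopoSubtype(2, ("topoisomerase VI",)),
}


def is_atp_dependent(subtype, member):
    """Type II enzymes are ATP-dependent; type I enzymes are not, except reverse gyrase."""
    info = SUBTYPES[subtype]
    if member not in info.members:
        raise ValueError(f"{member!r} is not listed under type {subtype}")
    return info.strands_broken == 2 or member == "reverse gyrase"


print(is_atp_dependent("IA", "topoisomerase III"))
print(is_atp_dependent("IA", "reverse gyrase"))
print(is_atp_dependent("IIB", "topoisomerase VI"))
```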
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
High mobility group (HMG or HMGB) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG1 (also called HMG-T in fish) and HMG2 are two highly related proteins that bind single-stranded DNA preferentially and unwind double-stranded DNA. Although they have no sequence specificity, they have a high affinity for bent or distorted DNA, and bend linear DNA. HMG1 and HMG2 contain two DNA-binding HMG-box domains (A and B) that show structural and functional differences, and have a long acidic C-terminal domain rich in aspartic and glutamic acid residues. The acidic tail modulates the affinity of the tandem HMG boxes in HMG1 and 2 for a variety of DNA targets. HMG1 and 2 appear to play important architectural roles in the assembly of nucleoprotein complexes in a variety of biological processes, for example V(D)J recombination, the initiation of transcription, and DNA repair.
The profile in this entry describing the HMG-domains is much more general than the signature. In addition to the HMG1 and HMG2 proteins, HMG-domains occur in single or multiple copies in the following protein classes: the SOX family of transcription factors; SRY sex determining region Y protein and related proteins; LEF1 lymphoid enhancer binding factor 1; SSRP recombination signal recognition protein; MTF1 mitochondrial transcription factor 1; UBF1/2 nucleolar transcription factors; Abf2 yeast ARS-binding factor; and Saccharomyces cerevisiae transcription factors Ixr1, Rox1, Nhp6a, Nhp6b and Spp41.
Members of this family are essential for gametocytogenesis in Plasmodium falciparum. They contain a fold composed of two pseudo dyad-related repeats of the helix-turn-helix motif, serving as a platform for RNA and Src homology-3 (SH3) binding.
The calponin homology domain (also known as CH-domain) is a superfamily of actin-binding domains found in both cytoskeletal proteins and signal transduction proteins. It comprises the following groups of actin-binding domains:
A comprehensive review of proteins containing this type of actin-binding domains is given in.
The CH domain is involved in actin binding in some members of the family. However, in calponins there is evidence that the CH domain is not involved in actin binding. Most proteins have two copies of the CH domain, but some, such as calponin and the human vav proto-oncogene, have only a single copy. The structure of an example CH-domain has been solved.
Phage integrases are enzymes that mediate unidirectional site-specific recombination between two DNA recognition sequences, the phage attachment site, attP, and the bacterial attachment site, attB. Integrases may be grouped into two major families, the tyrosine recombinases and the serine recombinases, based on their mode of catalysis. Tyrosine family integrases, such as lambda integrase, utilise a catalytic tyrosine to mediate strand cleavage, tend to recognize longer attP sequences, and require other proteins encoded by the phage or the host bacteria.
The 356 amino acid lambda integrase consists of two domains: an N-terminal domain that includes residues 1-64 and is responsible for binding the arm-type sites of attP, and a C-terminal domain (CTD) that binds the lower affinity core-type sites and contains the catalytic site. The CTD can be further divided into the core-type binding domain (residues 65-169) and the catalytic core domain (170-356), the latter representing this entry. The catalytic core adopts an alpha3-beta3-alpha4 fold, where one side of the beta sheet is exposed.
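The residue ranges given above can be expressed as a simple lookup that maps a residue number to its domain. This is a minimal sketch using only the boundaries stated in the text; the names `LAMBDA_INT_DOMAINS` and `domain_of` and the short domain labels are hypothetical.

```python
# Domain boundaries of lambda integrase as stated in the text
# (1-based, inclusive residue numbers).
LAMBDA_INT_DOMAINS = [
    ("N-terminal arm-type binding domain", 1, 64),
    ("core-type binding domain", 65, 169),
    ("catalytic core domain", 170, 356),
]


def domain_of(residue):
    """Return the name of the domain containing the given residue number."""
    for name, start, end in LAMBDA_INT_DOMAINS:
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} is outside the 356-residue protein")


# The three ranges are contiguous and together cover all 356 residues.
covered = sum(end - start + 1 for _, start, end in LAMBDA_INT_DOMAINS)
assert covered == 356

print(domain_of(64))
print(domain_of(65))
print(domain_of(356))
```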
The recombinases Cre from phage P1, XerD from Escherichia coli and Flp from yeast are members of the tyrosine recombinase family, and have a two-domain motif resembling that of lambda integrase, as well as sharing a conserved binding mechanism. The structural fold of their catalytic core domains resembles that of lambda integrase.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein S7 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S7 is known to bind directly to part of the 3' end of 16S ribosomal RNA. It belongs to a family of ribosomal proteins which have been grouped on the basis of sequence similarities. The structure of S7 is known.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.
Type IA topoisomerases comprise four domains that together form a toroidal structure with a central hole large enough to accommodate single- and double-stranded DNA: domain 1 is an N-terminal alpha/beta Toprim domain, domain 2 and the C-terminal domain 4 are winged-helix domains, and domain 3 is a beta-barrel. Domains 1 (Toprim) and 3 form the active site of the enzyme, while the winged helix domains 2 and 4 form a single-strand DNA-binding groove. This entry represents the alpha-bundle subdomain 1 of the central region of topoisomerase type IA enzymes, where the central region covers both domains 2 and 3.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles, and regulate cyclin dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins, G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E.
Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi's sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus.
This domain is also found as the core domain in transcription factor IIB (TFIIB) and in the retinoblastoma tumour suppressor.
Transcription factor S-II (TFIIS) is a eukaryotic protein which induces mRNA cleavage by enhancing the intrinsic nuclease activity of RNA polymerase (Pol) II, past template-encoded pause sites. TFIIS shows DNA-binding activity only in the presence of RNA polymerase II. It is widely distributed, being found in mammals, Drosophila, yeast and the archaeon Sulfolobus acidocaldarius. S-II proteins have a relatively conserved C-terminal region but variable N-terminal region, and some members of this family are expressed in a tissue-specific manner.
TFIIS is a modular factor that comprises an N-terminal domain I, a central domain II, and a C-terminal domain III. The weakly conserved domain I forms a four-helix bundle and is not required for TFIIS activity. Domain II forms a three-helix bundle, and domain III adopts a zinc-ribbon fold with a thin protruding beta-hairpin. Domain II and the linker between domains II and III are required for Pol II binding, whereas domain III is essential for stimulation of RNA cleavage. TFIIS extends from the polymerase surface via a pore to the internal active site, spanning a distance of 100 Angstroms. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.
This domain is found in the central region of transcription elongation factor S-II and in several hypothetical proteins.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.
This family represents subunits called delta in bacterial and chloroplast ATPase, or OSCP (oligomycin sensitivity conferral protein) in mitochondrial ATPase (note that in mitochondria there is a different delta subunit). The OSCP/delta subunit appears to be part of the peripheral stalk that holds the F1 complex alpha3beta3 catalytic core stationary against the torque of the rotating central stalk, and links subunit A of the F0 complex with the F1 complex. In mitochondria, the peripheral stalk consists of OSCP, as well as F0 components F6, B and D. In bacteria and chloroplasts the peripheral stalks have different subunit compositions: delta and two copies of F0 component B (bacteria), or delta and F0 components B and B' (chloroplasts).
More information about this protein can be found at Protein of the Month: ATP Synthases.
The enzymes belonging to this family are involved in phosphate ester hydrolysis and contain a triad of closely spaced zinc ions at their active centres. Both families of enzymes hydrolyse phosphodiesters. Substrates for phospholipase C are phosphatidylinositol and phosphatidylcholine, while P1 nuclease is an endonuclease hydrolysing single stranded ribo- and deoxyribonucleotides. P1 nuclease also has activity as a phosphomonoesterase against 3'-terminal phosphates of nucleotides. The Zn ions in both enzymes form almost identical trinuclear sites.
Citrate synthase is a member of a small family of enzymes that can directly form a carbon-carbon bond without the presence of metal ion cofactors. It catalyses the first reaction in the Krebs' cycle, namely the conversion of oxaloacetate and acetyl-coenzyme A into citrate and coenzyme A. This reaction is important for energy generation and for carbon assimilation. The reaction proceeds via a non-covalently bound citryl-coenzyme A intermediate in a 2-step process (aldol-Claisen condensation followed by the hydrolysis of citryl-CoA).
Citrate synthase enzymes are found in two distinct structural types: type I enzymes (found in eukaryotes, Gram-positive bacteria and archaea) form homodimers and have shorter sequences than type II enzymes, which are found in Gram-negative bacteria and are hexameric in structure. In both types, the monomer is composed of two domains: a large alpha-helical domain consisting of two structural repeats, where the second repeat is interrupted by a small alpha-helical domain. The cleft between these domains forms the active site, where both citrate and acetyl-coenzyme A bind. The enzyme undergoes a conformational change upon binding of the oxaloacetate ligand, whereby the active site cleft closes over in order to form the acetyl-CoA binding site. The energy required for domain closure comes from the interaction of the enzyme with the substrate. Type II enzymes possess an extra N-terminal beta-sheet domain, and some type II enzymes are allosterically inhibited by NADH.
This entry represents the large alpha-helical domain from type I and II citrate synthase enzymes, as well as a homologous domain found in the related enzyme 2-methylcitrate synthase. 2-methylcitrate synthase catalyses the conversion of oxaloacetate and propanoyl-CoA into (2R,3S)-2-hydroxybutane-1,2,3-tricarboxylate and coenzyme A. This enzyme is induced during bacterial growth on propionate, while type II hexameric citrate synthase is constitutive.
Terpenoid cyclases catalyze remarkably complex cyclisation cascades that are initiated by the formation of a highly reactive carbocation in a polyisoprene substrate. The pathways of monoterpene, sesquiterpene, and diterpene biosynthesis are conveniently divided into several stages. The first encompasses the synthesis of isopentenyl diphosphate, isomerization to dimethylallyl diphosphate, prenyltransferase-catalysed condensation of these two C5-units to geranyl diphosphate (GDP), and the subsequent 1'-4 additions of isopentenyl diphosphate to generate farnesyl (FDP) and geranylgeranyl (GGDP) diphosphate. In the second stage, the prenyl diphosphates undergo a range of cyclisations based on variations on the same mechanistic theme to produce the parent skeletons of each class. Thus, GDP (C10) gives rise to monoterpenes, FDP (C15) to sesquiterpenes, and GGDP (C20) to diterpenes. These transformations catalysed by the terpenoid synthases (cyclases) may be followed by a variety of redox modifications of the parent skeletal types to produce the many thousands of different terpenoid metabolites of the essential oils, turpentines, and resins of plant origin. Terpenoid synthase enzymes provide a template for binding and stabilizing the flexible substrate in the precise orientation required for catalysis, trigger carbocation formation, chaperone the conformations of the reactive carbocation intermediates through a unique cyclisation sequence, and sequester and stabilize carbocations from premature quenching.
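The chain-length bookkeeping described above (C5 units condensed into GDP, FDP and GGDP, each feeding one terpene class) can be sketched in Python. This is only an illustrative lookup built from the values stated in the text; the names `PRENYL_DIPHOSPHATES`, `carbon_count` and `terpene_class` are hypothetical and do not come from any database or library.

```python
# Illustrative lookup: prenyl diphosphate precursor -> (number of C5 units, terpene class).
# All names here are hypothetical; the values are taken from the paragraph above.
PRENYL_DIPHOSPHATES = {
    "GDP": (2, "monoterpenes"),     # geranyl diphosphate
    "FDP": (3, "sesquiterpenes"),   # farnesyl diphosphate
    "GGDP": (4, "diterpenes"),      # geranylgeranyl diphosphate
}


def carbon_count(precursor: str) -> int:
    """Number of carbons in the prenyl chain: five per isoprene (C5) unit."""
    units, _ = PRENYL_DIPHOSPHATES[precursor]
    return 5 * units


def terpene_class(precursor: str) -> str:
    """Terpene class whose parent skeletons derive from this precursor."""
    return PRENYL_DIPHOSPHATES[precursor][1]
```

For example, `carbon_count("FDP")` gives 15 and `terpene_class("FDP")` gives `"sesquiterpenes"`, matching the FDP (C15) assignment in the text.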
The R2 protein of ribonucleotide reductase catalyses the reduction of all four ribonucleotides to deoxyribonucleotides for use in DNA synthesis. This catalysis involves generating and storing a tyrosyl radical, which is essential for ribonucleotide reduction. The crystal structure consists of a core of four helices in a closed bundle with a left-handed twist and one crossover connection, and a bimetal-ion centre in the middle of the bundle.
This entry represents a family of proteins that are structurally related to the R2 protein of class I ribonucleotide reductase, which includes the alpha and beta subunits of methane monooxygenase hydroxylase, and delta 9-stearoyl-acyl carrier protein desaturase.
After cytochrome c is synthesized in the cytoplasm as apocytochrome c, it is transported through the outer mitochondrial membrane to the intermembrane space, where haem is covalently attached by thioether bonds to two cysteine residues located in the cytochrome c centre. Cytochrome c is required during oxidative phosphorylation as an electron shuttle between Complex III (cytochrome c reductase) and IV (cytochrome c oxidase). In addition, cytochrome c is involved in apoptosis in more complex organisms such as Xenopus, rats and humans. Cellular stress can induce cytochrome c release from the mitochondrial membrane. In mammals, cytochrome c triggers the assembly of the apoptosome, consisting of cytochrome c, Apaf-1 and dATP, which activates caspase-9, leading to cell death. There are several different members of the cytochrome c family with different functional roles, for instance cytochrome c549 is associated with photosystem II.
The known structures of c-type cytochromes have six different classes of fold. Of these, four are unique to c-type cytochromes. The consensus sequence for the cytochrome c centre is Cys-X-X-Cys-His, where the histidine residue is one of the two axial ligands of the haem iron. This arrangement is shared by all proteins known to belong to the cytochrome c family, which presently includes both mono-haem proteins and multi-haem proteins. This entry represents mono-haem cytochrome c proteins (excluding class II and f-type cytochromes), such as cytochromes c, c1, c2, c5, c555, c550 to c553, c556, and c6.
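The Cys-X-X-Cys-His consensus given above lends itself to a simple pattern scan. The following is a minimal Python sketch, assuming a protein sequence in one-letter code; the function name `find_haem_c_sites` and the toy sequence in the usage note are hypothetical. A match only flags a candidate haem-binding site and says nothing about whether the protein actually belongs to the cytochrome c family.

```python
import re
from typing import List, Tuple

# Cys-X-X-Cys-His in one-letter code. The lookahead lets overlapping
# occurrences be reported instead of only non-overlapping ones.
CXXCH = re.compile(r"(?=(C..CH))")


def find_haem_c_sites(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based position of the first Cys, matched pentapeptide) pairs."""
    seq = sequence.strip().upper()
    return [(m.start() + 1, m.group(1)) for m in CXXCH.finditer(seq)]
```

On the toy sequence `"AAACAACHAAA"`, the scan reports a single candidate site, `(4, "CAACH")`, and a sequence without the motif returns an empty list.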
Cytochrome c-type centres are also found in the active sites of many enzymes, including cytochrome cd1-nitrite reductase as the N-terminal haem c domain, in quinoprotein alcohol dehydrogenase as the C-terminal domain, in quinohemoprotein amine dehydrogenase A chain as domains 1 and 2, and in the cytochrome bc1 complex as the cytochrome bc1 domain.
This protein family is found in archaea and eukaryota. The human TFAR19 gene encodes a protein which shares significant homology with the corresponding proteins of species ranging from yeast to mice. TFAR19 exhibits a ubiquitous expression pattern and its expression is up-regulated in tumour cells undergoing apoptosis. TFAR19 may play a general role in the apoptotic process. Also included in this family is a DNA-binding protein from the archaeon Methanobacterium thermoautotrophicum.
In eukaryotes, glutathione S-transferases (GSTs) participate in the detoxification of reactive electrophilic compounds by catalysing their conjugation to glutathione. GST is found as a domain in S-crystallins from squid, and proteins with no known GST activity, such as eukaryotic elongation factors 1-gamma and the HSP26 family of stress-related proteins, which include auxin-regulated proteins in plants and stringent starvation proteins in Escherichia coli. The major lens polypeptide of cephalopods is also a GST. Bacterial GSTs of known function often have a specific, growth-supporting role in biodegradative metabolism: epoxide ring opening and tetrachlorohydroquinone reductive dehalogenation are two examples of the reactions catalysed by these bacterial GSTs. Some regulatory proteins, like the stringent starvation proteins, also belong to the GST family. GST seems to be absent from Archaea, in which gamma-glutamylcysteine substitutes for glutathione as the major thiol.
Glutathione S-transferases form homodimers, but in eukaryotes can also form heterodimers of the A1 and A2 or YC1 and YC2 subunits. The homodimeric enzymes display a conserved structural fold. Each monomer is composed of a distinct N-terminal sub-domain, which adopts the thioredoxin fold, and a C-terminal all-helical sub-domain, which adopts a 4-helical bundle fold. This entry is the C-terminal domain.
Glutaredoxin 2 (Grx2), a glutathione-dependent disulphide oxidoreductase, is structurally similar to GSTs, even though the two lack any sequence similarity. Grx2 is also composed of N- and C-terminal subdomains. It is thought that the primary function of Grx2 is to catalyse reversible glutathionylation of proteins with glutathione in cellular redox regulation, including the response to oxidative stress. Grx2 is dissimilar to other glutaredoxins apart from containing the conserved active site residues.
A number of transmembrane (TM) channel proteins can be grouped together on the basis of sequence similarities.
These include:
MIP family proteins are thought to contain 6 TM domains. Sequence analysis suggests that the proteins may have arisen through tandem, intragenic duplication from an ancestral protein that contained 3 TM domains.
Some of the proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins. Aquaporin-CHIP (Aquaporin 1) belongs to the Colton blood group system and is associated with Co(a/b) antigen.
The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.
This entry represents the N-terminal helical bundle domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7S RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich, lower in complexity and poorly conserved between species.
These proteins include Escherichia coli and Bacillus subtilis ffh protein (P48), which seems to be the prokaryotic counterpart of SRP54; signal recognition particle receptor alpha subunit (docking protein), an integral membrane GTP-binding protein which ensures, in conjunction with SRP, the correct targeting of nascent secretory proteins to the endoplasmic reticulum membrane; bacterial FtsY protein, which is believed to play a similar role to that of the docking protein in eukaryotes; the pilA protein from Neisseria gonorrhoeae, the homolog of ftsY; and bacterial flagellar biosynthesis protein flhF.
PA28 activator complex (also known as 11S regulator of 20S proteasome) is a ring-shaped hexameric structure of alternating alpha (PA28alpha) and beta (PA28beta) subunits. The catalytic properties of PA28alpha- and PA28beta-activated proteasome are similar. This entry represents the beta subunit. The activator complex binds to the 20S proteasome and stimulates peptidase activity in an ATP-independent manner.
The 14-3-3 proteins are a large family of approximately 30 kDa acidic proteins which exist primarily as homo- and heterodimers within all eukaryotic cells. There is a high degree of sequence identity and conservation between all the 14-3-3 isotypes, particularly in the regions which form the dimer interface or line the central ligand binding channel of the dimeric molecule. Each 14-3-3 protein sequence can be roughly divided into three sections: a divergent amino terminus, the conserved core region and a divergent carboxyl terminus. The conserved middle core region of the 14-3-3s encodes an amphipathic groove that forms the main functional domain, a cradle for interacting with client proteins. The monomer consists of nine helices organised in an antiparallel manner, forming an L-shaped structure. The interior of the L-structure is composed of four helices: H3 and H5, which contain many charged and polar amino acids, and H7 and H9, which contain hydrophobic amino acids. These four helices form the concave amphipathic groove that interacts with target peptides.
14-3-3 proteins mainly bind proteins containing phosphothreonine or phosphoserine motifs; however, exceptions to this rule do exist. Extensive investigation of the 14-3-3 binding site of the mammalian serine/threonine kinase Raf-1 has produced a consensus sequence for 14-3-3-binding, RSxpSxP (in the single-letter amino-acid code, where x denotes any amino acid and p indicates that the next residue is phosphorylated). 14-3-3 proteins appear to affect intracellular signalling in one of three ways: by direct regulation of the catalytic activity of the bound protein, by regulating interactions between the bound protein and other molecules in the cell by sequestration or modification, or by controlling the subcellular localisation of the bound ligand. Proteins appear to initially bind to a single dominant site and then subsequently to many, much weaker secondary interaction sites. The 14-3-3 dimer is capable of changing the conformation of its bound ligand whilst itself undergoing minimal structural alteration.
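The RSxpSxP consensus above can be turned into a rough sequence scan. The Python sketch below is an illustration under stated assumptions: a plain one-letter sequence carries no phosphorylation information, so the pattern R-S-x-S-x-P only identifies serines that would have to be phosphorylated for the site to fit the consensus. The function name `candidate_14_3_3_sites` is hypothetical, and, as the text notes, real binding sites can deviate from this rule.

```python
import re
from typing import List

# R-S-x-S-x-P in one-letter code; the second serine is the residue that must be
# phosphorylated (pS). The lookahead allows overlapping candidates to be found.
RSXSXP = re.compile(r"(?=(RS.S.P))")


def candidate_14_3_3_sites(sequence: str) -> List[int]:
    """Return 1-based positions of serines that would need to be phosphorylated."""
    seq = sequence.strip().upper()
    # m.start() is the 0-based index of the arginine; the candidate pS lies
    # three residues downstream, hence +3 for the offset and +1 for 1-based numbering.
    return [m.start() + 4 for m in RSXSXP.finditer(seq)]
```

On the toy sequence `"AARSASAPAA"`, the only candidate is the serine at position 6.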
6-phosphogluconate dehydrogenase catalyses the oxidative decarboxylation of 6-phosphogluconate to ribulose 5-phosphate with the concomitant reduction of NADP to NADPH. This reaction is a component of the hexose mono-phosphate shunt and pentose phosphate pathways (PPP), which functions to generate ribose 5-phosphate for nucleotide and nucleic acid synthesis. Prokaryotic and eukaryotic 6PGD are proteins of about 470 amino acids whose sequences are highly conserved. The protein is a homodimer in which the monomers act independently: each contains a large, mainly alpha-helical domain and a smaller beta-alpha-beta domain, containing a mixed parallel and anti-parallel 6-stranded beta sheet. NADP is bound in a cleft in the small domain, and the substrate binds in an adjacent pocket.
This entry represents the terminal 30-40 residues of 6-phosphogluconate dehydrogenase C-terminal domain, which is lacking in certain 6PGD enzymes. The core of the C-terminal domain is represented by a separate entry. This region bears structural resemblance to the C-terminal portion of the Bacteriophage T4 fibritin protein, which is responsible for the attachment of long tail fibres to virus particles, and forms the "whiskers", or fibres, on the neck of the virion.
This entry represents a structural domain with a core structure consisting of a 3-helical closed bundle with a left-handed twist, in an up-and-down arrangement. This structural motif occurs as subdomain 2 within FERM domains, as well as in acyl-CoA-binding proteins. The FERM domain (band F ezrin-radixin-moesin homology domain) has such a structure, acting as a common membrane-binding module involved in localising proteins to the plasma membrane. Proteins containing FERM include cytoskeletal proteins such as erythrocyte membrane protein 4.1R, talin, and the ezrin-radixin-moesin protein family, as well as several protein tyrosine kinases and phosphatases, and the neurofibromatosis 2 tumour suppressor protein merlin. The function of the ezrin-radixin-moesin protein family is to crosslink the actin filaments of cytoskeletal structures to the plasma membrane.
In addition, acyl-CoA-binding protein (ACBP) contains a domain with a similar 3-helical bundle structure. ACBP plays an important role in fatty acid metabolism, maintaining a pool of fatty acyl-CoA molecules in the cell.
The precise function of the domain is unclear, but it may be involved in protein-protein interactions and may play a role in assembly or activity of multi-component complexes involved in transcriptional activation.
Transcription factor S-II (TFIIS) is a eukaryotic protein which induces mRNA cleavage by enhancing the intrinsic nuclease activity of RNA polymerase (Pol) II, past template-encoded pause sites. TFIIS shows DNA-binding activity only in the presence of RNA polymerase II. It is widely distributed, being found in mammals, Drosophila, yeast and the archaebacterium Sulfolobus acidocaldarius. S-II proteins have a relatively conserved C-terminal region but a variable N-terminal region, and some members of this family are expressed in a tissue-specific manner.
TFIIS is a modular factor that comprises an N-terminal domain I, a central domain II, and a C-terminal domain III. The weakly conserved domain I forms a four-helix bundle and is not required for TFIIS activity. Domain II forms a three-helix bundle, and domain III adopts a zinc-ribbon fold with a thin protruding beta-hairpin. Domain II and the linker between domains II and III are required for Pol II binding, whereas domain III is essential for stimulation of RNA cleavage. TFIIS extends from the polymerase surface via a pore to the internal active site, spanning a distance of 100 Angstroms. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.
This entry represents the conserved N-terminal domain found in the transcription elongation factors TFIIS. This entry contains predominantly fungal forms of TFIIS, which is encoded by the gene PPR2 in Saccharomyces cerevisiae (Baker's yeast). The N-terminal domain in these transcription factors is conserved from yeast to man, and has a 4-helical bundle fold with a left-handed twist within a left-handed superhelix.
The splicing factor Prp18 is required for the second step of pre-mRNA splicing. PRP18 appears to be primarily associated with the U5 snRNP.
The structure of a large fragment of the Saccharomyces cerevisiae Prp18 is known. This fragment is fully active in yeast splicing in vitro and includes the sequences of Prp18 that have been evolutionarily conserved. The core structure consists of five alpha-helices that adopt a novel fold. The most highly conserved region of Prp18, a nearly invariant stretch of 19 aa, forms part of a loop between two alpha-helices and may interact with the U5 small nuclear ribonucleoprotein particles.
This domain consists of a multi-helical fold comprised of two curved layers of alpha helices arranged in a regular right-handed superhelix, where the repeats that make up this structure are arranged about a common axis. These superhelical structures present an extensive solvent-accessible surface that is well suited to binding large substrates such as proteins and nucleic acids. This topology has been found with a number of repeats and domains, including the armadillo repeat (found in beta-catenins and importins), the HEAT repeat (found in protein phosphatase 2a and initiation factor eIF4G), the PHAT domain (found in Smaug RNA-binding protein), the leucine-rich repeat variant, the Pumilio repeat, and in the H regulatory subunit of V-type ATPases. The sequence similarity among these different repeats or domains is low; however, they exhibit considerable structural similarity. Furthermore, the number of repeats present in the superhelical structure can vary between orthologues, indicating that rapid loss/gain of repeats has occurred frequently in evolution. A common phylogenetic origin has been proposed for the armadillo and HEAT repeats.
This domain consists of a multi-helical fold comprised of two curved layers of alpha helices arranged in a regular right-handed superhelix, where the repeats that make up this structure are arranged about a common axis. These superhelical structures present an extensive solvent-accessible surface that is well suited to binding large substrates such as proteins and nucleic acids. This topology has been found with a number of repeats and domains, including the tetratricopeptide repeat (TPR) (found in kinesin light chains, SNAP regulatory proteins, clathrin heavy chains and bacterial aspartyl-phosphate phosphatases), and the pentatricopeptide repeat (PPR) (RNA-processing proteins). The TPR is likely to be an ancient repeat, since it is found in eukaryotes, bacteria and archaea, whereas the PPR repeat is found predominantly in higher plants. The superhelix formed from these repeats can bind ligands at a number of different regions, and has the ability to acquire multiple functional roles.
Protein prenyltransferases catalyze the transfer of the carbon moiety of C15 farnesyl pyrophosphate or C20 geranylgeranyl pyrophosphate to a conserved cysteine residue in a CaaX motif of protein and peptide substrates. The addition of a farnesyl group is required to anchor proteins to the cell membrane. In the 3D structure of a mammalian Ras farnesyltransferase (FTase), both subunits are largely composed of alpha-helices. The alpha-2 to alpha-15 helices in the alpha subunit fold into a novel helical hairpin structure, resulting in a crescent-shaped domain that envelops part of the beta subunit. The 12 helices of the beta-subunit form an alpha-alpha barrel. Six additional helices connect the inner core of helices and form the outside of the helical barrel. A deep cleft surrounded by hydrophobic amino acids in the centre of the barrel is proposed as the FPP-binding pocket. A single Zn2+ ion is located at the junction between the hydrophilic surface groove near the subunit interface and this deep hydrophobic cleft.
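A C-terminal CaaX motif of the kind mentioned above can be checked with a short Python sketch. The text does not spell out the motif, so the reading used here is an assumption: C is the cysteine that receives the prenyl group, each "a" is an aliphatic residue (simplified here to A, V, I or L), and X is any residue at the C-terminus. The names `ALIPHATIC`, `CAAX` and `has_caax_box` are hypothetical, and real substrates may tolerate residues outside this simplified set.

```python
import re

# Assumption: "aliphatic" is simplified to Ala, Val, Ile and Leu.
ALIPHATIC = "AVIL"

# Cys, two aliphatic residues, then any residue, anchored at the C-terminus.
CAAX = re.compile("C[" + ALIPHATIC + "]{2}.$")


def has_caax_box(sequence: str) -> bool:
    """True if the last four residues fit the simplified CaaX pattern."""
    seq = sequence.strip().upper()
    return CAAX.search(seq) is not None
```

For instance, a toy sequence ending in `CVLS` passes the check, while the same sequence with one extra residue appended does not, because the motif is no longer at the C-terminus.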
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins.
This entry represents the C-terminal domain of subunit H (also known as Vma13p) found in the V1 complex of V-ATPases. This subunit has a regulatory function, being responsible for activating ATPase activity and coupling ATPase activity to proton flow. The yeast enzyme contains five motifs similar to the HEAT or Armadillo repeats seen in the importins, and can be divided into two distinct domains: a large N-terminal domain consisting of stacked alpha helices, and a smaller C-terminal alpha-helical domain with a similar superhelical topology to an armadillo repeat.
More information about this protein can be found at Protein of the Month: ATP Synthases.
This entry represents an MIF4G-like domain. MIF4G domains share a common structure but can differ in sequence. The MIF4G domain is a structural motif with an ARM (Armadillo) repeat-type fold, consisting of a 2-layer alpha/alpha right-handed superhelix. Family members contain two or more structurally similar domains of this fold connected by unstructured linkers; this entry covers types 1, 2 and 3 MIF4G-like domains. MIF4G domains are found in several proteins involved in RNA metabolism, including eIF4G (eukaryotic initiation factor 4-gamma), eIF-2b (translation initiation factor), UPF2 (regulator of nonsense transcripts 2), and nuclear cap-binding proteins (CBP80, CBC1, NCBP1), although the sequence identity between them may be low.
The nuclear cap-binding complex (CBC) is a heterodimer. Human CBC consists of a large CBP80 subunit and a small CBP20 subunit, the latter being critical for cap binding. CBP80 contains three MIF4G domains connected with long linkers, while CBP20 has an RNP (ribonucleoprotein)-type domain that associates with domains 2 and 3 of CBP80. The complex binds to the 5'-cap of eukaryotic RNA polymerase II transcripts, such as mRNA and U snRNA. The binding is important for several mRNA nuclear maturation steps and for nonsense-mediated decay. It is also essential for nuclear export of U snRNAs in metazoans.
Eukaryotic translation initiation factor 4 gamma (eIF4G) plays a critical role in protein expression, and is at the centre of a complex regulatory network. Together with the cap-binding protein eIF4E, it recruits the small ribosomal subunit to the 5'-end of mRNA and promotes the assembly of a functional translation initiation complex, which scans along the mRNA to the translation start codon. The activity of eIF4G in translation initiation could be regulated through intra- and inter-protein interactions involving the ARM repeats. In eIF4G, the MIF4G domain binds eIF4A, eIF3, RNA and DNA.
Nonsense-mediated mRNA decay (NMD) in eukaryotes involves UPF1, UPF2 and UPF3 to accelerate the decay rate of two unique classes of transcripts: (1) nonsense mRNAs that arise through errors in gene expression, and (2) naturally occurring transcripts that lack coding errors but have built-in features that target them for accelerated decay (error-free mRNAs). NMD can trigger decay during any round of translation and can target CBC-bound or eIF-4E-bound transcripts. UPF2 contains MIF4G domains, while UPF3 contains an RNP domain.
The ankyrin repeat is one of the most common protein-protein interaction motifs in nature. Ankyrin repeats are tandemly repeated modules of about 33 amino acids. They occur in a large number of functionally diverse proteins mainly from eukaryotes. The few known examples from prokaryotes and viruses may be the result of horizontal gene transfers. The repeat has been found in proteins of diverse function such as transcriptional initiators, cell-cycle regulators, cytoskeletal proteins, ion transporters and signal transducers. The ankyrin fold appears to be defined by its structure rather than its function since there is no specific sequence or structure which is universally recognised by it.
The conserved fold of the ankyrin repeat unit is known from several crystal and solution structures. Each repeat folds into a helix-loop-helix structure with a beta-hairpin/loop region projecting out from the helices at a 90° angle. The repeats stack together to form an L-shaped structure.
This entry represents the N-terminal domain found in several eukaryotic translation initiation factor 3 subunit 12 (eIF-3 p25) proteins. Eukaryotic initiation factor 3 (eIF3) is a multi-subunit complex that is required for binding of mRNA to 40S ribosomal subunits, stabilisation of ternary complex binding to 40S subunits, and dissociation of 40S and 60S subunits.
Transcription factor S-II (TFIIS) is a eukaryotic protein which induces mRNA cleavage by enhancing the intrinsic nuclease activity of RNA polymerase (Pol) II, past template-encoded pause sites. TFIIS shows DNA-binding activity only in the presence of RNA polymerase II. It is widely distributed, being found in mammals, Drosophila, yeast and the archaebacterium Sulfolobus acidocaldarius. S-II proteins have a relatively conserved C-terminal region but a variable N-terminal region, and some members of this family are expressed in a tissue-specific manner.
TFIIS is a modular factor that comprises an N-terminal domain I, a central domain II, and a C-terminal domain III. The weakly conserved domain I forms a four-helix bundle and is not required for TFIIS activity. Domain II forms a three-helix bundle, and domain III adopts a zinc-ribbon fold with a thin protruding beta-hairpin. Domain II and the linker between domains II and III are required for Pol II binding, whereas domain III is essential for stimulation of RNA cleavage. TFIIS extends from the polymerase surface via a pore to the internal active site, spanning a distance of 100 Angstroms. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.
This entry represents the conserved N-terminal domain found in the transcription elongation factors TFIIS (predominantly the metazoan and plant forms), and CRSP70. The N-terminal domain in these transcription factors is conserved from yeast to man, and has a 4-helical bundle fold with a left-handed twist within a left-handed superhelix. CRSP70 is an essential subunit of the CRSP complex, which is required for the activity of the enhancer-binding protein Sp1.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.
Clathrin is a trimer composed of three heavy chains and three light chains, each monomer projecting outwards like a leg; this three-legged structure is known as a triskelion. The heavy chains form the legs, their N-terminal beta-propeller regions extending outwards, while their C-terminal alpha-alpha-superhelical regions form the central hub of the triskelion. Peptide motifs can bind between the beta-propeller blades. The light chains appear to have a regulatory role, and may help orient the assembly and disassembly of clathrin coats as they interact with hsc70 uncoating ATPase. Clathrin triskelia self-polymerise into a curved lattice by twisting individual legs together. The clathrin lattice forms around a vesicle as it buds from the TGN, plasma membrane or endosomes, acting to stabilise the vesicle and facilitate the budding process. The multiple blades created when the triskelia polymerise are involved in multiple protein interactions, enabling the recruitment of different cargo adaptors and membrane attachment proteins.
This entry represents the alpha-helical zigzag linker region connecting the conserved N-terminal beta-propeller region to the C-terminal alpha-alpha-superhelical region in clathrin heavy chains.
More information about these proteins can be found at Protein of the Month: Clathrin.
Phosphatidylinositol 3-kinase (PI3-kinase) is an enzyme that phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol ring. The role of the accessory domain of PI3-kinase is unclear; it may be involved in substrate presentation.
This entry represents domains with a multi-helical, alpha-alpha 2-layered structural fold as found in: the ENTH domain of Epsin; the VHS domain of Hrs, Tom1, and ADP-ribosylation factors; the RPR domain of PCF11 protein; and the N-terminal domain of phosphoinositide-binding clathrin adaptor.
The epsin NH2-terminal homology (ENTH) domain is a membrane interacting module composed of a superhelix of alpha-helices. It is present at the NH2-terminus of proteins that often contain consensus sequences for binding to clathrin coat components and their accessory factors, and therefore function as endocytic adaptors. ENTH domain containing proteins have additional roles in signalling and actin regulation and may have yet other actions in the nucleus. The ENTH domain is structurally similar to the VHS domain.
The ENTH domain is approximately 150 amino acids long. The ENTH domain forms a compact globular structure, composed of eight alpha-helices connected by loops of varying length. Three helical hairpins that are stacked consecutively with a right-handed twist determine the general topology of the domain. This stacking gives the ENTH domain a rectangular appearance when viewed face on. The most highly conserved amino acids fall roughly into two classes: internal residues that are involved in packing and therefore are necessary for structural integrity, and solvent accessible residues that may be involved in protein-protein interactions.
VHS domains are found at the N-termini of select proteins involved in intracellular membrane trafficking. The domain consists of eight helices arranged in a superhelix. The surface of the domain has two main features: a basic patch on one side due to several conserved positively charged residues on helix 3 and a negatively charged ridge on the opposite side, formed by residues on helix 2. Comparison of the two VHS domains and the ENTH domain reveals a conserved surface, composed of helices 2 and 4, that is utilised for protein-protein interactions. In addition, VHS domain-containing proteins are also often localized to membranes. It has therefore been suggested that the conserved positively charged surface of helix 3 in VHS and ENTH domains plays a role in membrane binding.
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
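The relationship between linear triad order and clan can be illustrated with a small lookup. This is a minimal sketch: the table holds only the three clans named above, and the function names and the input format (a mapping from one-letter residue code to sequence position) are hypothetical choices made for illustration. A matching order is consistent with a clan but does not by itself establish membership.

```python
# Linear order of the catalytic triad residues for the three clans
# mentioned in the text (H = histidine, D = aspartate, S = serine).
TRIAD_ORDER_BY_CLAN = {
    "PA": "HDS",  # chymotrypsin clan
    "SB": "DHS",  # subtilisin clan
    "SC": "SDH",  # carboxypeptidase clan
}


def triad_order(positions):
    """Return the triad letters sorted by their position in the sequence.

    `positions` maps a one-letter residue code to a sequence position
    (hypothetical input format for this sketch).
    """
    return "".join(sorted(positions, key=positions.get))


def clans_matching(positions):
    """List the clans in the table whose triad order matches the input."""
    order = triad_order(positions)
    return [clan for clan, expected in TRIAD_ORDER_BY_CLAN.items()
            if expected == order]


# Example using the commonly cited chymotrypsin numbering
# (His57, Asp102, Ser195).
print(clans_matching({"H": 57, "D": 102, "S": 195}))
```

Sorting the residue letters by position reduces the comparison to a plain string match against the table, so the table can be extended with further clans without changing the functions.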
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This entry represents the C-terminal domain of the Escherichia coli LexA protein and the C-terminal domain of the E. coli signal peptidase (SPase). They share the same structural topology, consisting of a complex fold made of several coiled beta-sheets, and containing an SH3-like beta-barrel. This entry is associated with serine peptidases belonging to the MEROPS peptidase families S24 (LexA family, clan SF), S26A (signal peptidase I) and S26B (signalase).
The S26 family includes E. coli signal peptidase, SPase, which is a membrane-bound endopeptidase, with two N-terminal transmembrane segments and a C-terminal catalytic region. SPase functions to release proteins that have been translocated into the inner membrane from the cell interior, by cleaving off their signal peptides.
The S24 family includes:
All of these proteins, with the possible exception of RulA, interact with RecA, which activates self-cleavage, either derepressing transcription in the case of CI and LexA or activating the lesion-bypass polymerase in the case of UmuD and MucA. UmuD'2 is the homodimeric component of DNA pol V, which is produced from UmuD by RecA-facilitated self-cleavage. The first 24 N-terminal residues of UmuD are removed; UmuD'2 is a DNA lesion bypass polymerase. MucA, like UmuD, is a plasmid-encoded DNA polymerase (pol RI) which is converted into the active lesion-bypass polymerase by a self-cleavage reaction involving RecA.
This group of proteins also contains proteins not recognised as peptidases as well as those classified as non-peptidase homologues as they either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for catalytic activity.
Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolyzing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate binding domain. Dimeric GrpE is the co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle.
Besides stimulating the ATPase activity of DnaK through its J-domain, DnaJ also associates with unfolded polypeptide chains and prevents their aggregation. Thus, DnaK and DnaJ may bind to one and the same polypeptide chain to form a ternary complex. The formation of a ternary complex may result in cis-interaction of the J-domain of DnaJ with the ATPase domain of DnaK. An unfolded polypeptide may enter the chaperone cycle by associating first either with ATP-liganded DnaK or with DnaJ. DnaK interacts with both the backbone and side chains of a peptide substrate; it thus shows binding polarity and admits only L-peptide segments. In contrast, DnaJ has been shown to bind both L- and D-peptides and is assumed to interact only with the side chains of the substrate.
This entry represents complement control protein (CCP) modules, which are also known as sushi domains or short consensus repeats (SCR). The CCP module is a disulphide-rich domain with an all-beta fold. These domains are found in a wide variety of complement and adhesion proteins, such as complement receptors Cr1 and Cr2, complement C1R and C1S proteases, and complement decay-accelerating factor CD55, as well as in mannan-binding lectin serine protease 2 (MASP-2), GABA-B receptor 1, beta2-glycoprotein, membrane cofactor protein CD46, and as the 15th and 16th modules of Factor H.
Ubiquinol-cytochrome c reductase (bc1 complex or complex III) is an enzyme complex of bacterial and mitochondrial oxidative phosphorylation systems. It catalyses the oxidoreduction of the mobile redox components ubiquinol and cytochrome c, generating an electrochemical potential, which is linked to ATP synthesis. The complex consists of three subunits in most bacteria, and nine in mitochondria: both bacterial and mitochondrial complexes contain cytochrome b and cytochrome c1 subunits, and an iron-sulphur 'Rieske' subunit, which contains a high potential 2Fe-2S cluster. The mitochondrial form also includes six other subunits that do not possess redox centres. Plastoquinone-plastocyanin reductase (b6f complex), present in cyanobacteria and the chloroplasts of plants, catalyses the oxidoreduction of plastoquinol and cytochrome f. This complex, which is functionally similar to ubiquinol-cytochrome c reductase, comprises cytochrome b6, cytochrome f and Rieske subunits.
The Rieske subunit acts by binding either a ubiquinol or plastoquinol anion, transferring an electron to the 2Fe-2S cluster, then releasing the electron to the cytochrome c or cytochrome f haem iron. The Rieske domain has a [2Fe-2S] centre. Two conserved cysteines coordinate one Fe ion, while the other Fe ion is coordinated by two conserved histidines. The 2Fe-2S cluster is bound in the highly conserved C-terminal region of the Rieske subunit.
This entry represents a six-bladed beta-propeller domain consisting of six 4-stranded beta-sheet motifs. This domain can be found in TolB proteins (C-terminal), in soluble quinoprotein glucose dehydrogenase, in calcium-dependent phosphotriesterases, in the low density lipoprotein (LDL) receptor YWTD domain, in nidogen, and in serine/threonine-protein kinase (PknD) NHL repeat domain.
TolB is a periplasmic protein from Escherichia coli that is part of the Tol-dependent translocation system involving group A and E colicins that is used to penetrate and kill cells. TolB has two domains, an alpha-helical N-terminal domain that shares structural similarity with the C-terminal domain of transfer RNA ligases, and a beta-propeller C-terminal domain that shares structural similarity with numerous members of the prolyl oligopeptidase family and, to a lesser extent, with class B metallo-beta-lactamases (although it does not necessarily occur at the C-terminal in these proteins). The C-terminal domain of TolB may mediate protein-protein interactions with colicins.
Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified. This sequence motif represents one beta-sheet blade, and several of these repeats can associate to form a beta-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein, creating a 6-bladed beta-propeller. The motif is also found in mouse protein MIPP and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin, and in galactose oxidase from the fungus Dactylium dendroides. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded anti-parallel beta-sheet motif that forms the repeat unit in a super-barrel structural fold.
The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase.
This entry represents the 6-bladed Kelch beta-propeller, which consists of six 4-stranded beta-sheet motifs (or six Kelch repeats).
This entry represents a WD40/YVTN repeat-like domain. Both the WD40 and the YVTN repeated motifs consist of about 40 residues, and although they consist of distinct sequences, they do share a similar structure. Structurally, both the WD40 and the YVTN repeated motifs form seven-bladed propellers (although some members can contain eight blades), which consist of seven 4-stranded beta-sheets.
The WD40-type repeat domain is found in the beta-1 subunit of the signal-transducing G protein, in yeast Tup1 protein, in Groucho, in the yeast cell cycle protein Cdc4 and in actin-interacting protein 1.
The YVTN-type repeat domain is found in archaeal surface layer proteins (SLPs) that protect cells from extreme environments, in quinohemoprotein amine dehydrogenase (QHNDH), and in methylamine dehydrogenase.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.
Clathrin is a trimer composed of three heavy chains and three light chains, each monomer projecting outwards like a leg; this three-legged structure is known as a triskelion. The heavy chains form the legs, their N-terminal beta-propeller regions extending outwards, while their C-terminal alpha-alpha-superhelical regions form the central hub of the triskelion. Peptide motifs can bind between the beta-propeller blades. The light chains appear to have a regulatory role, and may help orient the assembly and disassembly of clathrin coats as they interact with hsc70 uncoating ATPase. Clathrin triskelia self-polymerise into a curved lattice by twisting individual legs together. The clathrin lattice forms around a vesicle as it buds from the TGN, plasma membrane or endosomes, acting to stabilise the vesicle and facilitate the budding process. The multiple blades created when the triskelia polymerise are involved in multiple protein interactions, enabling the recruitment of different cargo adaptors and membrane attachment proteins.
This entry represents a region covering the N-terminal beta-propeller region of clathrin heavy chains that extends away from the hub of triskelia, and which is responsible for peptide binding, as well as the core motif for the alpha-helical zigzag linker region connecting the conserved N-terminal beta-propeller region to the C-terminal alpha-alpha-superhelical region in clathrin heavy chains.
More information about these proteins can be found at Protein of the Month: Clathrin.
The beta-lactamase-inhibitor protein II (BLIP-II) is a secreted protein produced by the soil bacterium Streptomyces exfoliatus SMF19. BLIP-II acts as a potent inhibitor of beta-lactamases such as TEM-1, which is the most widespread resistance enzyme to penicillin antibiotics. BLIP-II binds competitively to TEM-1, but no direct contacts are made with TEM-1 active site residues. BLIP-II shows no sequence similarity with BLIP, even though both bind to and inhibit TEM-1. However, BLIP-II does share significant sequence identity with the regulator of chromosome condensation (RCC1) family of proteins. These two families are clearly related, both having a seven-bladed beta-propeller structure, although they differ in the number of strands per blade: BLIP-II has three antiparallel beta-strands per blade, while RCC1 has four-stranded blades. RCC1 is a eukaryotic nuclear protein that acts as a guanine nucleotide exchange factor for Ran, a member of the Ras GTPase family. RCC1 mediates a Ran-GTP gradient necessary for the regulation of spindle formation and nuclear assembly during mitosis, as well as for the transport of macromolecules across the nuclear membrane during interphase.
Glutamate synthase (GltS) is a complex iron-sulphur flavoprotein that catalyses the reductive synthesis of L-glutamate from 2-oxoglutarate and L-glutamine via intramolecular channelling of ammonia, a reaction in the bacterial, yeast and plant pathways for ammonia assimilation. GltS is a multifunctional enzyme that functions through three distinct active centres carrying out multiple reaction steps: L-glutamine hydrolysis, conversion of 2-oxoglutarate into L-glutamate, and electron uptake from an electron donor. The active centres are synchronised to avoid the wasteful consumption of L-glutamine. There are three classes of GltS, which share many functional properties: bacterial NADPH-dependent GltS, ferredoxin-dependent GltS from photosynthetic cells, and NAD(P)H-dependent GltS from yeast, fungi and lower animals.
The dimeric alpha subunits each consist of four domains: N-terminal amidotransferase domain, the central domain, the FMN binding domain and the C-terminal domain. The C-terminal domain forms a right-handed beta-helix that comprises seven helical turns. Each helical turn has a sharp bend that is associated with a repeated sequence motif consisting of G-XX-G-XXX-G. This domain does not contain any residues directly involved in catalysis, but has a crucial structural role.
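The repeated G-XX-G-XXX-G motif can be located in a sequence with a regular expression. The sketch below assumes that each X stands for any single residue, which is the usual reading but is not stated in this entry; the function name and the toy sequences are hypothetical. A lookahead is used so that overlapping occurrences are also reported.

```python
import re

# G-XX-G-XXX-G written as a regular expression, assuming X = any residue.
# The lookahead (?=...) lets finditer report overlapping occurrences.
GLYCINE_MOTIF = re.compile(r"(?=(G.{2}G.{3}G))")


def find_glycine_motifs(sequence):
    """Return (0-based start, matched segment) for every motif occurrence."""
    return [(m.start(), m.group(1)) for m in GLYCINE_MOTIF.finditer(sequence)]


# Toy sequence, not taken from any real protein.
print(find_glycine_motifs("AAGAAGAAAGTT"))
```

Counting the hits along a C-terminal domain sequence would give a rough estimate of how many turns carry the motif, although real repeats may deviate from the consensus and be missed by an exact pattern.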
This domain is also found in proteins such as subunit C of formylmethanofuran dehydrogenase, which catalyses the first step in methane formation from carbon dioxide in methanogenic archaea. There are two isoenzymes of formylmethanofuran dehydrogenase: a tungsten-containing isoenzyme (FwdC) and a molybdenum-containing isoenzyme (FmdC). The tungsten isoenzyme is constitutively transcribed, whereas transcription of the molybdenum operon is induced by molybdate.
Cyclase-associated proteins (CAPs) are highly conserved monomeric actin-binding proteins present in a wide range of organisms including yeast, fly, plants, and mammals. CAPs are multifunctional proteins that contain several structural domains. CAP is involved in species-specific signalling pathways. Only yeast CAPs are involved in adenylate cyclase activation. The C-terminal domain of CAP proteins is responsible for G-actin-binding that regulates actin remodelling in response to cellular signals and is required for normal cellular morphology, cell division, growth and locomotion in eukaryotes.
In Escherichia coli, three Min proteins (MinC, MinD and MinE) negatively regulate FtsZ assembly at the cell poles in order to ensure the Z-ring only assembles at cell midpoint. MinC inhibits formation of the Z-ring by preventing FtsZ assembly. MinD binds to MinC near the cell poles, sequestering MinC away from the cell midpoint so the Z-ring can form there. MinC is an oligomer, probably a dimer, that consists of two domains: the N-terminal domain is responsible for FtsZ inhibition, while the C-terminal domain is responsible for binding to MinD and to a component of the division septum.
This entry represents a structural domain found at the C-terminal of CAP proteins as well as MinC. This domain has a superhelical structure, where the superhelix turns are made of either two (CAP) or three (MinC) beta-strands each.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.
This entry represents a structural motif, consisting of a complex alpha/beta topology that forms the N-terminal DNA-binding domain of certain eukaryotic topoisomerase I (type IB) enzymes. To cleave the DNA backbone, these enzymes must make a transient phosphotyrosine bond. The N-terminal domain of human topoisomerase I is thought to coordinate the restriction of free strand rotation during the topoisomerisation step of catalysis. A conserved tryptophan residue may be important for the DNA-interaction ability of the N-terminal domain. Human topoisomerase I has been shown to be inhibited by camptothecin (CPT), a plant alkaloid with antitumour activity. A binding mode for the anticancer drug camptothecin has been proposed on the basis of chemical and biochemical information combined with the three-dimensional structures of topoisomerase I-DNA complexes.
More information about this protein can be found at Protein of the Month: DNA Topoisomerases.
DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.
RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:
RNA polymerase (RNAP) II, which is responsible for all mRNA synthesis in eukaryotes, consists of 12 subunits. Subunits Rpb3 and Rpb11 form a heterodimer that is functionally analogous to the archaeal RNAP D/L heterodimer, and to the prokaryotic RNAP alpha (RpoA) subunit homodimer. In each case, they play a key role in RNAP assembly by forming a platform on which the catalytic subunits (eukaryotic Rpb1/Rpb2, and prokaryotic beta/beta') can interact.
The dimerisation domains differ between the different subunit families. In eukaryotic Rpb3, archaeal D and bacterial RpoA subunits, the dimerisation domain is comprised of a central insert domain, which interrupts an Rpb11-like domain, dividing it into two halves. In eukaryotic Rpb11 and archaeal L subunits, the insert domain is lacking, leaving the Rpb11-like domain intact and contiguous.
The LCCL domain has been named after the best characterised proteins that were found to contain it, namely Limulus factor C, Coch-5b2 and Lgl1. It is a domain of about 100 amino acids whose C-terminal part contains a highly conserved histidine in a conserved motif YxxxSxxCxAAVHxGVI. The LCCL module is thought to be an autonomously folding domain that has been used for the construction of various modular proteins through exon-shuffling. It has been found in various metazoan proteins in association with complement B-type domains, C-type lectin domains, von Willebrand type A domains, CUB domains, discoidin lectin domains or CAP domains. It has been proposed that the LCCL domain could be involved in lipopolysaccharide (LPS) binding. Secondary structure prediction suggests that the LCCL domain contains six beta strands and two alpha helices.
Some proteins known to contain an LCCL domain include Limulus factor C, an LPS endotoxin-sensitive trypsin-type serine protease which serves to protect the organism from bacterial infection; vertebrate cochlear protein cochlin or coch-5b2, which is probably a secreted protein (mutations affecting the LCCL domain of coch-5b2 cause the deafness disorder DFNA9 in humans); and mammalian late gestation lung protein Lgl1, which contains two tandem copies of the LCCL domain.
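The conserved motif quoted above can be converted into a search pattern to locate the conserved histidine in a sequence. This is a minimal sketch, assuming that each lowercase x stands for any single residue; the function names and the toy sequence are hypothetical and not part of any database tool.

```python
import re

LCCL_MOTIF = "YxxxSxxCxAAVHxGVI"  # motif as written in the entry text


def motif_to_regex(motif):
    """Compile the motif, treating each lowercase 'x' as any one residue
    (an assumption made for this sketch)."""
    return re.compile(motif.replace("x", "."))


def find_conserved_histidines(sequence, motif=LCCL_MOTIF):
    """Return the 0-based position of the motif histidine for each match."""
    his_offset = motif.index("H")  # offset of H within the motif
    pattern = motif_to_regex(motif)
    return [m.start() + his_offset for m in pattern.finditer(sequence)]


# Toy sequence built to contain the motif once; not a real protein.
toy_sequence = "MK" + "YAAASAACAAAVHAGVI" + "LL"
print(find_conserved_histidines(toy_sequence))
```

An exact pattern such as this only finds sequences that match the consensus at every fixed position, so divergent family members would need a profile-based search instead.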
This entry represents a structural domain with a complex fold consisting of several coiled beta-sheets. This domain exists as a duplication, consisting of a tandem repeat of two similar structural motifs. This entry represents copies of this structural motif in the following protein families:
Mss4 is a conserved accessory factor for Rab GTPases, which function as ubiquitous regulators of intracellular membrane trafficking. Mss4 acts to promote nucleotide release from exocytic but not endocytic Rab GTPases. Mss4 has a complex fold made of several coiled beta-sheets, and consists of a duplication of tandem repeats of two similar structural motifs. It contains a zinc-binding site.
Other proteins that show structural similarity to Mss4 include the translationally controlled tumour-associated proteins TCTPs, which contain an insertion of an alpha helical hairpin, and lack the zinc-binding site. TCTPs are a highly conserved and abundantly expressed family of eukaryotic proteins that are implicated in both cell growth and the human acute allergic response.
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.
Casein kinase, a ubiquitous, well-conserved protein kinase involved in cell metabolism and differentiation, is characterised by its preference for Ser or Thr in acidic stretches of amino acids. The enzyme is a tetramer of 2 alpha- and 2 beta-subunits. However, some species (e.g., mammals) possess 2 related forms of the alpha-subunit (alpha and alpha'), while others (e.g., fungi) possess 2 related beta-subunits (beta and beta'). The alpha-subunit is the catalytic unit and contains regions characteristic of serine/threonine protein kinases. The beta-subunit is believed to be regulatory, possessing an N-terminal auto-phosphorylation site, an internal acidic domain, and a potential metal-binding motif. The beta subunit is a highly conserved protein of about 25 kD that contains, in its central section, a cysteine-rich motif, CX(n)C, that could be involved in binding a metal such as zinc. The mammalian beta-subunit gene promoter shares common features with those of other mammalian protein kinases and is closely related to the promoter of the regulatory subunit of cAMP-dependent protein kinase.
This entry represents the C-terminal beta-sheet domain.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
This entry represents the core domain of ribosomal proteins L37ae and L37e, which share a common rubredoxin-like metal-binding fold containing two CX(n)C motifs (where n is usually two).
A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of mammalian ribosomal protein L24; yeast ribosomal protein L30A/B (Rp29) (YL21); Kluyveromyces lactis ribosomal protein L30; Arabidopsis thaliana ribosomal protein L24 homolog; Haloarcula marismortui ribosomal protein HL21/HL22; and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ1201. These proteins have 60 to 160 amino-acid residues.
Transcription factor IIA (TFIIA) is one of several factors that form part of a transcription pre-initiation complex along with RNA polymerase II, the TATA-box-binding protein (TBP) and TBP-associated factors, on the TATA-box sequence upstream of the initiation start site. After initiation, some components of the pre-initiation complex (including TFIIA) remain attached and re-initiate a subsequent round of transcription. TFIIA binds to TBP to stabilise TBP binding to the TATA element. TFIIA also inhibits the cytokine HMGB1 (high mobility group 1 protein) binding to TBP, and can dissociate HMGB1 already bound to TBP/TATA-box.
Human and Drosophila TFIIA have three subunits: two large subunits, LN/alpha and LC/beta, derived from the same gene, and a small subunit, S/gamma. Yeast TFIIA has two subunits: a large TOA1 subunit that shows sequence similarity to the N-terminal of LN/alpha and the C-terminal of LC/beta, and a small subunit, TOA2 that is highly homologous with S/gamma. The conserved regions of the large and small subunits of TFIIA combine to form two domains: a four-helix bundle (helical domain) composed of two helices from each of the N-terminal regions of TOA1 and TOA2 in yeast; and a beta-barrel (beta-barrel domain) composed of beta-sheets from the C-terminal regions of TOA1 and TOA2.
This entry represents the beta-barrel domain found at the C-terminal of both TOA1 (or alpha/beta) and TOA2 (or gamma) subunits of TFIIA, and their homologues.
The FAS1 (fasciclin-like) domain is an extracellular module of about 140 amino acid residues. It has been suggested that the FAS1 domain represents an ancient cell adhesion domain common to plants and animals; related FAS1 domains are also found in bacteria.
The crystal structure of FAS1 domains 3 and 4 of fasciclin I from Drosophila melanogaster (Fruit fly) has been determined, revealing a novel domain fold consisting of a seven-stranded beta wedge and at least five alpha helices; two well-ordered N-acetylglucosamine groups attached to a conserved asparagine are located in the interface region between the two FAS1 domains. Fasciclin I is an insect neural cell adhesion molecule involved in axonal guidance that is attached to the membrane by a GPI anchor.
FAS1 domains are present in many secreted and membrane-anchored proteins. These proteins are usually GPI anchored and consist of: (i) a single FAS1 domain, (ii) a tandem array of FAS1 domains, or (iii) FAS1 domain(s) interspersed with other domains.
Proteins known to contain a FAS1 domain include:
The FAS1 domains of both human periostin and BIgH3 proteins were found to contain vitamin K-dependent gamma-carboxyglutamate residues. Gamma-carboxyglutamate residues are more commonly associated with GLA domains, where they occur through post-translational modification catalysed by the vitamin K-dependent enzyme gamma-glutamylcarboxylase.
In prokaryotes, the nucleotide exchange factor GrpE and the chaperone DnaJ are required for nucleotide binding of the molecular chaperone DnaK. The DnaK reaction cycle involves rapid peptide binding and release, which is dependent upon nucleotide binding. DnaJ accelerates the hydrolysis of ATP by DnaK, which enables the ADP-bound DnaK to tightly bind peptide. GrpE catalyses the release of ADP from DnaK, which is required for peptide release. In eukaryotes, GrpE is essential for mitochondrial Hsp70 function; however, the cytosolic Hsp70 homologues are GrpE-independent.
GrpE binds as a homodimer to the ATPase domain of DnaK, and may interact with the peptide-binding domain of DnaK. GrpE accomplishes nucleotide exchange by opening the nucleotide-binding cleft of DnaK. GrpE is composed of two domains: the N-terminal coiled-coil domain, which may facilitate peptide release, and the C-terminal head domain, which forms part of the contact surface with the ATPase domain of DnaK. The head domain is composed of six short beta strands with a limited hydrophobic core.
Pleckstrin homology (PH) domains are small modular domains that occur once, or occasionally several times, in a large variety of signalling proteins, where they serve as simple targeting domains that recognize only phosphoinositide headgroups. PH domains can target their host protein to the plasma and internal membranes through their association with phosphoinositides. PH domains have a partly opened beta-barrel topology that is capped by an alpha helix. Proteins containing PH domains include pleckstrin (N-terminal), phospholipase C delta-1, beta-spectrin, dynamin, son-of-sevenless, Grp1, Unc-89, Tapp1 and Rac-alpha kinase.
The structure of PH domains is similar to the phosphotyrosine-binding domain (PTB) found in IRS-1 (insulin receptor substrate 1), Shc adaptor and Numb; to the Ran-binding domain, found in Nup nuclear pore complex and Ranbp1; to the Enabled/VASP homology domain 1 (EVH1 domain), found in Enabled, VASP (vasodilator-stimulated phosphoprotein), Homer and WASP actin regulatory protein; to the third domain of FERM, found in moesin, radixin, ezrin, merlin and talin; and to the PH-like domain of neurobeachin.
Several prokaryotic and eukaryotic proteins that are involved in the translation process contain an SH3-like domain, consisting of a partly opened beta barrel, where the last strand is interrupted by a 3-10 helical turn. This entry represents the SH3-like beta barrel domain found in the ribosomal proteins L24 and L26, the structure of which has been determined for L24 from the archaeon Haloarcula marismortui. The 50S subunit proteins function primarily to stabilize inter-domain interactions that are necessary to maintain the subunit's structural integrity, displaying a wide variety of protein-RNA interactions. Interactions between RNA and the SH3 domains appear to be mediated by the loops connecting the beta-strands and not the beta-barrel itself. L24 uses these loops between beta-strands to contact H19 and H24.
This entry represents the p29 subunit (also known as Rpp29 or Pop4) of the related ribonucleoproteins ribonuclease (RNase) P and RNase MRP, which can be found in both eukaryotes and archaea. The structure of the RNase P subunit, Rpp29, from Methanobacterium thermoautotrophicum has been determined. Mth Rpp29 is a member of the oligonucleotide/oligosaccharide binding fold family. It contains a structured beta-barrel core and unstructured N- and C-terminal extensions bearing several highly conserved amino acid residues that could be involved in RNA contacts in the protein-RNA complex. Rpp29 catalyses the endonucleolytic cleavage of RNA, removing 5'-extra nucleotides from tRNA precursor. It interacts with the Rpp25 and Pop5 subunits.
RNase P is a ubiquitous ribonucleoprotein enzyme primarily responsible for cleaving the 5' leader sequence during maturation of tRNAs in all three domains of life. In eubacteria, this enzyme is made up of two subunits: a large RNA (approximately 120 kDa) responsible for mediating catalysis, and a small protein cofactor (approximately 15 kDa) that modulates substrate recognition and is required for efficient in vivo catalysis. In contrast, multiple proteins are associated with eukaryotic and archaeal RNase P, and these proteins exhibit no recognizable homology to the conserved bacterial protein subunit. Reconstitution experiments with recombinantly expressed and purified protein subunits show that Mth Rpp29, a homologue of the Rpp29 protein subunit from eukaryotic RNase P, is an essential protein component of the archaeal holoenzyme. In Saccharomyces cerevisiae (Baker's yeast), RNase P consists of 9 protein subunits (Pop1, Pop3-8, Rpr2 and Rpp1), while in humans there are 10 subunits (Rpp14, 20, 21, 25, 29, 30, 38, 40, hPop1, 5).
RNase MRP (mitochondrial RNA processing) is an rRNA processing enzyme that cleaves a specific site within precursor rRNA to generate the mature 5'-end of 5.8S rRNA. RNase MRP also cleaves primers for mitochondrial DNA replication and CLB2 mRNA. In yeast, RNase MRP possesses one putatively catalytic RNA and at least 9 protein subunits (Pop1, Pop3-Pop8, Rpp1, Snm1 and Rmp1), and is highly related to RNase P.
The fundamental activity of the ribosome is two-fold: to decode the message of the mRNA in the small subunit, and to form a peptide bond between peptidyl-tRNA and aminoacyl-tRNA by a peptidyl transferase activity in the large subunit. Several prokaryotic and eukaryotic proteins that are involved in the translation process contain an SH3-like domain. The structure of the translation protein SH3-like domain is a partly opened beta barrel, where the last strand is interrupted by a 3-10 helical turn. The structure of the RNA-binding C-terminal domain of the Bacillus stearothermophilus (Geobacillus stearothermophilus) ribosomal protein L2 has been shown to adopt the SH3-like barrel topology. The L2 protein is located near the peptidyl transferase centre in the large ribosomal subunit, where it may contribute to peptidyl transferase activity, and is involved in the assembly of the 23S rRNA. Likewise, the N-terminal domain of the ubiquitous eukaryotic translation initiation factor 5A (IF-5A) protein adopts the SH3-like barrel topology. IF-5A is involved in the initial step of peptide bond formation in translation and in cell-cycle regulation. IF-5A acts as a cofactor of the Rev protein in HIV-1-infected cells and of the Rex protein in T-cell leukaemia virus 1-infected cells.
This entry represents a subset of those identified in.
The chaperonins are 'helper' molecules required for correct folding and subsequent assembly of some proteins. These are required for normal cell growth, and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions. Type I chaperonins present in eubacteria, mitochondria and chloroplasts require the concerted action of 2 proteins, chaperonin 60 (cpn60) and chaperonin 10 (cpn10).
The 10 kDa chaperonin (cpn10, or groES in bacteria) exists as a ring-shaped oligomer of between six and eight identical subunits, while the 60 kDa chaperonin (cpn60, or groEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The central cavity of the cylindrical cpn60 tetradecamer provides an isolated environment for protein folding whilst cpn10 binds to cpn60 and synchronizes the release of the folded protein in an Mg2+-ATP dependent manner. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60.
Escherichia coli GroES has also been shown to bind ATP cooperatively, and with an affinity comparable to that of GroEL. Each GroEL subunit contains three structurally distinct domains: an apical, an intermediate and an equatorial domain. The apical domain contains the binding sites for both GroES and the unfolded protein substrate. The equatorial domain contains the ATP-binding site and most of the oligomeric contacts. The intermediate domain links the apical and equatorial domains and transfers allosteric information between them. The GroEL oligomer is a tetradecamer, cylindrically shaped, that is organised in two heptameric rings stacked back to back. Each GroEL ring contains a central cavity, known as the 'Anfinsen cage', that provides an isolated environment for protein folding. The identical 10 kDa subunits of GroES form a dome-like heptameric oligomer in solution. ATP binding to GroES may be important in charging the seven subunits of the interacting GroEL ring with ATP, to facilitate cooperative ATP binding and hydrolysis for substrate protein release.
Cyclophilin is the major high-affinity binding protein in vertebrates for the immunosuppressive drug cyclosporin A (CSA), but is also found in other organisms. It exhibits a peptidyl-prolyl cis-trans isomerase activity (PPIase or rotamase). PPIase is an enzyme that accelerates protein folding by catalysing the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides. It is probable that CSA mediates some of its effects by forming a tight complex with cyclophilin that inhibits the phosphatase activity of calcineurin. Cyclophilin A is a cytosolic and highly abundant protein. The protein belongs to a family of isozymes, including cyclophilins B and C, and natural killer cell cyclophilin-related protein. Major isoforms have been found throughout the cell, including the ER, and some are even secreted. The sequences of the different forms of cyclophilin-type PPIases are well conserved.
Ribosomal protein L14 is one of the proteins from the large ribosomal subunit. In eubacteria, L14 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins, which have been grouped on the basis of sequence similarities. Based on amino-acid sequence homology, it is predicted that ribosomal protein L14 is a member of a recently identified family of structurally related RNA-binding proteins. L14 is a protein of 119 to 137 amino-acid residues.
Riboflavin is converted into catalytically active cofactors (FAD and FMN) by the actions of riboflavin kinase, which converts it into FMN, and FAD synthetase, which adenylates FMN to FAD. Eukaryotes usually have two separate enzymes, while most prokaryotes have a single bifunctional protein that can carry out both catalyses, although exceptions occur in both cases. While eukaryotic monofunctional riboflavin kinase is orthologous to the bifunctional prokaryotic enzyme, the monofunctional FAD synthetase differs from its prokaryotic counterpart, and is instead related to the PAPS-reductase family. The bacterial FAD synthetase that is part of the bifunctional enzyme has remote similarity to nucleotidyl transferases and, hence, it may be involved in the adenylylation reaction of FAD synthetases.
This entry represents riboflavin kinase, which occurs as part of a bifunctional enzyme or a stand-alone enzyme.
Beta barrels are commonly observed in protein structures. They are classified in terms of two integral parameters: the number of strands in the sheet, n, and the shear number, S, a measure of the stagger of the strands in the beta-sheet. These two parameters have been shown to determine the major geometrical features of beta-barrels. Six-stranded beta-barrels with a pseudo-twofold axis are found in several proteins. One involving parallel strands forming two psi structures is known as the double-psi barrel. The first psi structure consists of the loop connecting strands beta1 and beta2 (a 'psi loop') and the strand beta5, whereas the second psi structure consists of the loop connecting strands beta4 and beta5 and the strand beta2. All the psi structures in double-psi barrels have a unique handedness, in that beta1 (beta4), beta2 (beta5) and the loop following beta5 (beta2) form a right-handed helix. The unique handedness may be related to the fact that the twisting angle between the parallel pair of strands is always larger than that between the antiparallel pair.
In many cases, including aspartate decarboxylase and aspartic proteinases, strands 1 and 4 are each bent and consist of two sections. The two sections normally make a right angle; sometimes their hydrogen-bond patterns are disrupted at the corner by a bulge or even by a large insertion. In these cases, the barrel can also be viewed as a pair of orthogonally packed sheets, each with four strands.
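The statement above that the strand number n and the shear number S determine the main geometrical features of a beta-barrel can be illustrated numerically. The sketch below uses the commonly quoted idealised barrel relations, tan(alpha) = S*a / (n*b) and R = b / (2 sin(pi/n) cos(alpha)), where alpha is the strand tilt relative to the barrel axis and R is the barrel radius. The relations, the constants a = 3.3 Å and b = 4.4 Å, the function name `barrel_geometry` and the example values n = 6, S = 10 are assumptions introduced for illustration; none of them is taken from this entry.

```python
import math

# Assumed geometric constants for an idealised barrel (in angstroms).
A_RISE_PER_RESIDUE = 3.3      # assumed C-alpha spacing along a strand
B_INTERSTRAND_DISTANCE = 4.4  # assumed distance between adjacent strands


def barrel_geometry(n, S, a=A_RISE_PER_RESIDUE, b=B_INTERSTRAND_DISTANCE):
    """Return (tilt angle in degrees, radius in angstroms) for an ideal barrel.

    n is the number of strands and S is the shear number.
    Hypothetical helper; real barrels deviate from this idealised model.
    """
    alpha = math.atan((S * a) / (n * b))
    radius = b / (2.0 * math.sin(math.pi / n) * math.cos(alpha))
    return math.degrees(alpha), radius


# Illustrative values only: a six-stranded barrel with shear number 10.
tilt_deg, radius_angstrom = barrel_geometry(n=6, S=10)
print(f"tilt = {tilt_deg:.1f} degrees, radius = {radius_angstrom:.2f} angstroms")
```

In this idealised model, increasing S at fixed n increases both the strand tilt and the radius, which is one way to see why the pair (n, S) is sufficient to classify barrels.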
RNA polymerases catalyse the DNA-dependent polymerisation of RNA, using the four ribonucleoside triphosphates as substrates. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Eukaryotic RNA polymerase I is essentially used to transcribe ribosomal RNA units, polymerase II is used for mRNA precursors, and III is used to transcribe 5S and tRNA genes. Each class of RNA polymerase is assembled from nine to fourteen different polypeptides. Members of the family include the largest subunit from eukaryotes; the gamma subunit from Cyanobacteria; the beta' subunit from bacteria; the A' subunit from archaea; and the B'' subunit from chloroplast RNA polymerases.
A five-stranded beta-barrel was first noted as a common structure among four proteins binding single-stranded nucleic acids (staphylococcal nuclease and aspartyl-tRNA synthetase) or oligosaccharides (B subunits of enterotoxin and verotoxin-1), and has been termed the oligonucleotide/oligosaccharide binding motif, or OB fold, a five-stranded beta-sheet coiled to form a closed beta-barrel capped by an alpha helix located between the third and fourth strands. Two ribosomal proteins, S17 and S1, are members of this class, and have different variations of the OB fold theme. Comparisons with other OB fold nucleic acid binding proteins suggest somewhat different mechanisms of nucleic acid recognition in each case.
There are many nucleic acid-binding proteins that contain domains with this OB-fold structure, including anticodon-binding tRNA synthetases, ssDNA-binding proteins (CDC13, telomere-end binding proteins), phage ssDNA-binding proteins (gp32, gp2.5, gpV), cold shock proteins, DNA ligases, RNA-capping enzymes, DNA replication initiators and RNA polymerase subunit RPB8.
Staphylococcus aureus nuclease (SNase) homologues, previously thought to be restricted to bacteria and archaea, are also found in eukaryotes. Staphylococcal nuclease has a multidomain organization. The human cellular coactivator p100 contains four repeats, each of which is a SNase homologue. These repeats are unlikely to possess SNase-like activities as each lacks equivalent SNase catalytic residues, yet they may mediate p100's single-stranded DNA-binding function. A variety of proteins, including many that are still uncharacterised, belong to this group.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described.
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.
These aspartate proteases, which include pepsin, cathepsin, chymosin, beta-secretase, plasmepsin, plant acid proteases and retroviral proteases, all contain a common closed beta-barrel structure.
Cytochrome c oxidase is an oligomeric enzymatic complex which is a component of the respiratory chain complex and is involved in the transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane.
In eukaryotes, in addition to the three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there are a variable number of small polypeptide subunits. One of these subunits, which is known as Vb in mammals, V in Dictyostelium discoideum (Slime mold) and IV in yeast, binds a zinc atom. The sequence of subunit Vb is well conserved and includes three conserved cysteines that coordinate the zinc ion. Two of these cysteines are clustered in the C-terminal section of the subunit.
This entry represents domains with a double-stranded beta-helix jelly roll fold such as that found in RmlC (deoxythymidine diphosphate-4-dehydrorhamnose 3,5-epimerase), a dTDP-sugar isomerase enzyme involved in the synthesis of L-rhamnose, a saccharide required for the virulence of some pathogenic bacteria.
Other protein families contain domains that share this jelly roll fold, including glucose-6-phosphate isomerase; germin, a metal-binding protein with oxalate oxidase and superoxide dismutase activities; auxin-binding protein; seed storage protein 7S; acireductone dioxygenase; as well as three proteins that have metal-binding sites similar to that of germin, namely quercetin 2,3-dioxygenase, phosphomannose isomerase and homogentisate dioxygenase, the last three sharing a 2-domain fold with storage protein 7S.
The cAMP-binding domains found in the cAMP receptor protein (CRP) family display a similar double-stranded beta-helix jelly roll fold. These proteins include CooA, a CO-sensing haem protein that functions as a transcription activator, and the CnbD (cyclic nucleotide binding domain) of the HCN cation channel in which cAMP binding modulates gating of the channel.
Lectins and glucanases exhibit the common property of reversibly binding to specific complex carbohydrates. The lectins/glucanases are a diverse group of proteins found in a wide range of species from prokaryotes to humans. The different family members all contain a concanavalin A-like domain, which consists of a sandwich of 12-14 beta strands in two sheets with a complex topology. Members of this family are diverse, and include the lectins: legume lectins, cereal lectins, viral lectins, and animal lectins. Plant lectins function in the storage and transport of carbohydrates in seeds, the binding of nitrogen-fixing bacteria to root hairs, the inhibition of fungal growth or insect feeding, and in hormonally regulated plant growth. Protein members include concanavalin A (Con A), favin, isolectin I, lectin IV, soybean agglutinin and lentil lectin. Animal lectins include the galectins, which are S-type lactose-binding and IgE-binding proteins such as S-lectin, CLC protein, galectin1, galectin2, galectin3 CRD, and Congerin I.
Other members with a Con A-like domain include the glucanases. Bacterial and fungal beta-glucanases, such as Bacillus 1-3,1-4-beta-glucanase, carry out the acid catalysis of beta-glucans found in microorganisms and plants. Similarly, kappa-carrageenase degrades kappa-carrageenans from marine red algae cell walls.
This entry differs from by omitting the xylanases and glycosyl hydrolases.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.
This family represents subunits called delta (in mitochondrial ATPase) or epsilon (in bacteria or chloroplast ATPase). The interaction site of subunit C of the F0 complex with the delta or epsilon subunit of the F1 complex may be important for connecting the rotor of F1 (gamma subunit) to the rotor of F0 (C subunit). In bacterial species, the delta subunit is the equivalent of the oligomycin sensitivity-conferring protein (OSCP) in metazoans. The C-terminal domain of the epsilon subunit appears to act as an inhibitor of ATPase activity.
The forkhead-associated (FHA) domain is a phosphopeptide recognition domain found in many regulatory proteins. It displays specificity for phosphothreonine-containing epitopes but will also recognise phosphotyrosine with relatively high affinity. It spans approximately 80-100 amino acid residues folded into an 11-stranded beta sandwich, which sometimes contains small helical insertions between the loops connecting the strands.
To date, genes encoding FHA-containing proteins have been identified in eubacterial and eukaryotic but not archaeal genomes. The domain is present in a diverse range of proteins, such as kinases, phosphatases, kinesins, transcription factors, RNA-binding proteins and metabolic enzymes which partake in many different cellular processes - DNA repair, signal transduction, vesicular transport and protein degradation are just a few examples.
The proteins in this entry are variously annotated as iron-sulphur cluster insertion protein or Fe/S biogenesis protein. They appear to be involved in Fe-S cluster biogenesis. This family includes IscA, HesB, YadR and YfhF-like proteins. The hesB gene is expressed only under nitrogen fixation conditions. IscA, an 11 kDa member of the hesB family of proteins, binds iron and [2Fe-2S] clusters, and participates in the biosynthesis of iron-sulphur proteins. IscA is able to bind at least 2 iron ions per dimer. Other members of this family include various hypothetical proteins that also contain the NifU-like domain, suggesting that they too are able to bind iron and are involved in Fe-S cluster biogenesis. Members of the HesB family are found in species as divergent as Homo sapiens (Human) and Haemophilus influenzae, suggesting that these proteins are involved in basic cellular functions.
This entry represents domains with an immunoglobulin-like (Ig-like) fold, which consists of a beta-sandwich of seven or more strands in two sheets with a Greek-key topology. Ig-like domains are one of the most common protein modules found in animals, occurring in a variety of different proteins. These domains are often involved in interactions, commonly with other Ig-like domains via their beta-sheets. Domains within this fold-family share the same structure, but can diverge with respect to their sequence. Based on sequence, Ig-like domains can be classified as V-set domains (antibody variable domain-like), C1-set domains (antibody constant domain-like), C2-set domains, and I-set domains (antibody intermediate domain-like). Proteins can contain more than one of these types of Ig-like domains. For example, in the human T-cell receptor antigen CD2, domain 1 (D1) is a V-set domain, while domain 2 (D2) is a C2-set domain, both domains having the same Ig-like fold.
Domains with an Ig-like fold can be found in many, diverse proteins in addition to immunoglobulin molecules. For example, Ig-like domains occur in several different types of receptors (such as various T-cell antigen receptors), several cell adhesion molecules, MHC class I and II antigens, as well as the hemolymph protein hemolin, and the muscle proteins titin, telokin and twitchin.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.
AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.
This entry represents a beta-sandwich structural motif found in the appendage (ear) domain of alpha-adaptin from AP2 clathrin adaptor complexes. This subdomain has an immunoglobulin-like beta-sandwich fold containing 7 strands in 2 beta-sheets in a Greek key topology. Alpha-adaptin has a hinge region and an ear domain. The appendage domain can bind directly to clathrin and accessory proteins forming an interconnected network, and can regulate the translocation of several endocytic accessory proteins to the bud site. The N-terminal domain of the alpha subunit binds to PtdIns(4,5)P2 and has been implicated in the recruitment of AP2 to the plasma membrane.
More information about these proteins can be found at Protein of the Month: Clathrin.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.
AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.
This entry represents a beta-sandwich structural motif found in the appendage (ear) domain of gamma1-adaptin from AP1 clathrin adaptor complex, and the homologous C-terminal GAE (gamma-adaptin ear) domain of GGA adaptor proteins. These domains have an immunoglobulin-like beta-sandwich fold containing 8 strands in 2 beta-sheets in a Greek key topology. This is a similar fold to that found in alpha- and beta-adaptins, but there is little sequence identity between them. The GAE domain is involved in the recruitment of accessory proteins, such as gamma-synergin, Rabaptin-5, Eps15 and cyclin G-associated kinase, which modulate the functions of GAE domain-containing proteins in membrane trafficking events. The binding site in GAE for accessory proteins is located in a shallow hydrophobic trough surrounded by charged (mainly basic) residues.
More information about these proteins can be found at Protein of the Month: Clathrin.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.
AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.
GGAs (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) are a family of monomeric clathrin adaptor proteins that are conserved from yeasts to humans. GGAs regulate the clathrin-mediated transport of proteins (such as mannose 6-phosphate receptors) from the TGN to endosomes and lysosomes through interactions with TGN-sorting receptors, sometimes in conjunction with AP-1. GGAs bind cargo, membranes, clathrin and accessory factors. GGA1, GGA2 and GGA3 all contain a domain homologous to the ear domain of gamma-adaptin. GGAs are composed of a single polypeptide with four domains: an N-terminal VHS (Vps27p/Hrs/Stam) domain, a GAT (GGA and Tom1) domain, a hinge region, and a C-terminal GAE (gamma-adaptin ear) domain. The VHS domain is responsible for endocytosis and signal transduction, recognising transmembrane cargo through the ACLL sequence in the cytoplasmic domains of sorting receptors. The GAT domain (also found in Tom1 proteins) interacts with ARF (ADP-ribosylation factor) to regulate membrane trafficking, and with ubiquitin for receptor sorting. The hinge region contains a clathrin box for recognition and binding to clathrin, similar to that found in AP adaptins. The GAE domain is similar to the AP gamma-adaptin ear domain, and is responsible for the recruitment of accessory proteins that regulate clathrin-mediated endocytosis.
This entry represents a beta-sandwich structural motif found in the appendage (ear) domain of gamma1-adaptin from AP1 clathrin adaptor complex, and the homologous C-terminal GAE (gamma-adaptin ear) domain of GGA adaptor proteins. These domains have an immunoglobulin-like beta-sandwich fold containing 8 strands in 2 beta-sheets in a Greek key topology. This is a similar fold to that found in alpha- and beta-adaptins, but there is little sequence identity between them. The GAE domain is involved in the recruitment of accessory proteins, such as gamma-synergin, Rabaptin-5, Eps15 and cyclin G-associated kinase, which modulate the functions of GAE domain-containing proteins in membrane trafficking events. The binding site in GAE for accessory proteins is located in a shallow hydrophobic trough surrounded by charged (mainly basic) residues.
More information about these proteins can be found at Protein of the Month: Clathrin.
Proteins synthesised on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer. While clathrin mediates endocytic protein transport, and transport from ER to Golgi, coatomers primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins. For example, the coatomer COP1 (coat protein complex 1) is responsible for reverse transport of recycled proteins from Golgi and pre-Golgi compartments back to the ER, while COPII buds vesicles from the ER to the Golgi. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and for budding from Golgi membranes. Activated small guanine triphosphatases (GTPases) attract coat proteins to specific membrane export sites, thereby linking coatomers to export cargos. As coat proteins polymerise, vesicles are formed and budded from membrane-bound organelles. Coatomer complexes also influence Golgi structural integrity, as well as the processing, activity, and endocytic recycling of LDL receptors. In mammals, coatomer complexes can only be recruited by membranes associated to ADP-ribosylation factors (ARFs), which are small GTP-binding proteins. Coatomer complexes are hetero-oligomers composed of at least an alpha, beta, beta', gamma, delta, epsilon and zeta subunits.
This entry represents a beta-sandwich structural motif found in the appendage domain of the gamma subunit of coatomer complexes. This subdomain has an immunoglobulin-like beta-sandwich fold containing 7 strands in 2 beta-sheets in a Greek key topology. The appendage domain of the gamma coatomer subunit has a similar overall fold to the appendage domain of clathrin adaptors, and can also share the same motif-based cargo recognition and accessory factor recruitment mechanisms.
More information about these proteins can be found at Protein of the Month: Clathrin.
This family includes the yeast and human ASF1 proteins. These proteins have histone chaperone activity. ASF1 participates in both the replication-dependent and replication-independent pathways. The three-dimensional structure has been determined as a compact immunoglobulin-like beta sandwich fold topped by three helical linkers.
The PapD-like superfamily of periplasmic chaperones directs the assembly of over 30 diverse adhesive surface organelles that mediate the attachment of many different pathogenic bacteria to host tissues, a critical early step in the development of disease. PapD, the prototypical chaperone, is necessary for the assembly of P pili. P pili contain the adhesin PapG, which mediates the attachment of uropathogenic Escherichia coli to Gal(alpha) Gal receptors present on kidney cells and are critical for the initiation of pyelonephritis. The PapD-like chaperones consist of two Ig-like domains oriented toward each other, forming L-shaped molecules. In the chaperone-subunit complex, the G1beta strand of the chaperone completes an atypical Ig fold in the subunit by occupying the groove and running parallel to the subunit C-terminal F strand. This donor strand complementation interaction simultaneously stabilizes pilus subunits and caps their interactive surfaces, preventing their premature oligomerisation in the periplasm. During pilus biogenesis, the highly conserved N-terminal extension of one subunit has been proposed to displace the chaperone G1beta strand from its neighbouring subunit in a mechanism termed donor strand exchange.
This entry represents the immunoglobulin (Ig)-like beta-sandwich domain found in PapD, as well as in other periplasmic chaperone proteins that include FimC and SfaE from E. coli, and Caf1m from Yersinia pestis. In addition, major sperm proteins (MSP) and other related sperm proteins (such as WR4 and SSP-19) contain an Ig-like domain with a similar structural fold to PapD. Major sperm proteins are central components in molecular interactions underlying sperm motility, with many isoforms existing in Caenorhabditis elegans.
Copper is one of the most prevalent transition metals in living organisms and its biological function is intimately related to its redox properties. Since free copper is toxic, even at very low concentrations, its homeostasis in living organisms is tightly controlled by subtle molecular mechanisms. In eukaryotes, before being transported inside the cell via the high-affinity copper transporters of the CTR family, the copper (II) ion is reduced to copper (I). In blue copper proteins such as Cupredoxin, the copper (I) ion form is stabilised by a constrained His2Cys coordination environment.
This entry represents cupredoxin proteins, as well as structural homologues to cupredoxin. Structurally, the cupredoxin-like fold consists of a beta-sandwich with 7 strands in 2 beta-sheets, which is arranged in a Greek-key beta-barrel. Some of these proteins have lost the ability to bind copper. Proteins with a cupredoxin-type fold are found in the following family groups:
Lipoxygenases are a class of iron-containing dioxygenases which catalyse the hydroperoxidation of lipids containing a cis,cis-1,4-pentadiene structure. They are common in plants, where they may be involved in a number of diverse aspects of plant physiology including growth and development, pest resistance, and senescence or responses to wounding. In mammals a number of lipoxygenase isozymes are involved in the metabolism of prostaglandins and leukotrienes. Sequence data is available for the following lipoxygenases:
The iron atom in lipoxygenases is bound by four ligands, three of which are histidine residues. Six histidines are conserved in all lipoxygenase sequences; five of them are clustered in a stretch of 40 amino acids. This region contains two of the three histidine iron ligands; the other histidines have been shown to be important for the activity of lipoxygenases.
This entry represents a domain found in lipoxygenases and other enzymes. It is known as the PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology) domain, and is found in a variety of membrane- or lipid-associated proteins. Structurally, this domain forms a beta-sandwich composed of two sheets of four strands each. The most highly conserved regions coincide with the beta-strands, with most of the highly conserved residues being buried within the protein. An exception to this is a lysine or arginine that occurs on the surface of the fifth beta-strand of the eukaryotic domains. In pancreatic lipase, the lysine in this position forms a salt bridge with the procolipase protein. The conservation of a charged surface residue may indicate the location of a conserved ligand-binding site. It is thought that this domain may mediate membrane attachment via other protein binding partners.
Several proteins have recently been shown to contain the 5 structural motifs characteristic of GTP-binding proteins. These include murine DRG protein; GTP1 protein from Schizosaccharomyces pombe; OBG protein from Bacillus subtilis; and several others. Although the proteins contain GTP-binding motifs and are similar to each other, they do not share sequence similarity with other GTP-binding proteins, and have thus been classed as a novel group, the GTP1/OBG family. As yet, the functions of these proteins are uncertain, but they have been shown to be important in development and normal cell metabolism.
This entry represents a structural domain with an alpha-beta(4)-alpha(3) core fold. Domains of this structure are found in:
The 3D structure of bovine cyt b5 is known, the fold belonging to the alpha+beta class, with 5 strands and 5 short helices forming a framework for supporting a central haem group. The cytochrome b5 domain is similar to that of a number of oxidoreductases, such as plant and fungal nitrate reductases, sulphite oxidase, yeast flavocytochrome b2 (L-lactate dehydrogenase) and plant cyt b5/acyl lipid desaturase fusion protein.
This domain has a beta-grasp fold with a core structure consisting of beta(2)-alpha-beta(2), which is similar to that found in ubiquitin. Domains with this type of structure are found in the 2Fe-2S ferredoxin family (including putidaredoxin and adrenodoxin), the 2Fe-2S ferredoxin-related family (including aldehyde reductase, and xanthine dehydrogenase), the TGS family (including threonyl-tRNA synthetase) and the MoaD/ThiS family (including molybdopterin, and thiamine biosynthesis sulphur carrier protein).
Carbonic anhydrases (CAs) are zinc metalloenzymes which catalyse the reversible hydration of carbon dioxide to bicarbonate. CAs have essential roles in facilitating the transport of carbon dioxide and protons in the intracellular space, across biological membranes and in the layers of the extracellular space; they are also involved in many other processes, from respiration and photosynthesis in eukaryotes to cyanate degradation in prokaryotes. There are five known evolutionarily distinct CA families (alpha, beta, gamma, delta and epsilon) that have no significant sequence identity and have structurally distinct overall folds. Some CAs are membrane-bound, while others act in the cytosol; there are several related proteins that lack enzymatic activity. The active site of alpha-CAs is well described, consisting of a zinc ion coordinated through 3 histidine residues and a water molecule/hydroxide ion that acts as a potent nucleophile. The enzyme employs a two-step mechanism: in the first step, there is a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide; in the second step, the active site is regenerated by the ionisation of the zinc-bound water molecule and the removal of a proton from the active site. Beta- and gamma-CAs also employ a zinc hydroxide mechanism, although at least some beta-class enzymes do not have water directly coordinated to the metal ion.
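The two-step mechanism described above can be written out schematically. This is only a sketch: EZn is shorthand introduced here for the enzyme-bound zinc ion, and the displacement of bicarbonate by a water molecule at the end of the first step is the commonly described textbook route, which the text above does not spell out.

```latex
\begin{align}
\text{Step 1:}\quad
  \mathrm{EZn{-}OH^{-}} + \mathrm{CO_2}
    &\rightleftharpoons \mathrm{EZn{-}HCO_3^{-}} \\
  \mathrm{EZn{-}HCO_3^{-}} + \mathrm{H_2O}
    &\rightleftharpoons \mathrm{EZn{-}OH_2} + \mathrm{HCO_3^{-}} \\
\text{Step 2:}\quad
  \mathrm{EZn{-}OH_2}
    &\rightleftharpoons \mathrm{EZn{-}OH^{-}} + \mathrm{H^{+}} \\
\text{Net:}\quad
  \mathrm{CO_2} + \mathrm{H_2O}
    &\rightleftharpoons \mathrm{HCO_3^{-}} + \mathrm{H^{+}}
\end{align}
```

Summing the three lines cancels the enzyme species and leaves the net reversible hydration of carbon dioxide to bicarbonate.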
This entry represents alpha class carbonic anhydrases.
More information about these proteins can be found at Protein of the Month: Carbonic Anhydrase.
Methylpurine-DNA glycosylase (MPG) is a base excision-repair protein. It is responsible for the hydrolysis of the deoxyribose N-glycosidic bond, excising 3-methyladenine and 3-methylguanine from damaged DNA. Its action is induced by alkylating chemotherapeutics, as well as deaminated and lipid peroxidation-induced purine adducts. MPG without an N-terminal extension excises hypoxanthine with one-third of the efficiency of full-length MPG under similar conditions, suggesting that its function may largely be attributable to the N-terminal extension.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families includes mammalian, yeast, Chlamydomonas reinhardtii and Entamoeba histolytica S27, and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ0250. These proteins contain 62 to 87 amino acids. They contain, in their central section, a putative zinc-finger region of the type C-x(2)-C-x(14)-C-x(2)-C.
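The zinc-finger pattern quoted above is written in PROSITE-style notation, where x(n) stands for n residues of any type. As a minimal sketch, assuming only literal residues and x(n) gaps need to be supported, the pattern can be translated into a regular expression and used to scan a sequence. The helper names and the toy sequence below are hypothetical and are not taken from any real S27 protein.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex.

    Only the elements needed here are handled: literal residues
    (e.g. "C"), single wildcards ("x") and fixed-length gaps ("x(n)").
    """
    parts = []
    for element in pattern.split("-"):
        gap = re.fullmatch(r"x\((\d+)\)", element)
        if gap:
            parts.append(".{%s}" % gap.group(1))  # n residues of any type
        elif element == "x":
            parts.append(".")                      # one residue of any type
        else:
            parts.append(re.escape(element))       # literal residue
    return "".join(parts)


ZINC_FINGER = "C-x(2)-C-x(14)-C-x(2)-C"
ZINC_FINGER_RE = re.compile(prosite_to_regex(ZINC_FINGER))


def find_motifs(sequence: str):
    """Return (start, matched_text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in ZINC_FINGER_RE.finditer(sequence)]


# Hypothetical toy sequence built to contain exactly one match.
toy = "MK" + "C" + "AA" + "C" + "G" * 14 + "C" + "PP" + "C" + "KR"
hits = find_motifs(toy)
```

Note that `finditer` reports non-overlapping matches only, which is usually adequate for a motif of this length; a match here indicates that the cysteine spacing fits the pattern, not that the region actually binds zinc.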
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic and archaebacterial large subunit ribosomal proteins can be grouped on the basis of sequence similarities. These proteins have 87 to 128 amino-acid residues. This family consists of:
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of mammalian, Trypanosoma brucei, Caenorhabditis elegans and fungal L44, and Haloarcula marismortui LA.
A number of Fe-S cluster-containing hydro-lyases share a conserved motif, including argininosuccinate lyase, adenylosuccinate lyase, aspartase, class I fumarate hydratase (fumarase), and tartrate dehydratase. Proteins in this group represent a subset of closely related proteins or modules, including the Escherichia coli tartrate dehydratase beta chain and the C-terminal region of the class I fumarase (where the N-terminal region is homologous to the tartrate dehydratase alpha chain). The activity of the archaeal proteins in this group is unknown.
3-isopropylmalate dehydratase (or isopropylmalate isomerase) catalyses the stereo-specific isomerisation of 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate. This enzyme performs the second step in the biosynthesis of leucine, and is present in most prokaryotes and many fungal species. The prokaryotic enzyme is a heterodimer composed of a large (LeuC) and small (LeuD) subunit, while the fungal form is a monomeric enzyme. Both forms of isopropylmalate dehydratase are related and are part of the larger aconitase family. Aconitases are mostly monomeric proteins which share four domains in common and contain a single, labile [4Fe-4S] cluster. Three structural domains (1, 2 and 3) are tightly packed around the iron-sulphur cluster, while a fourth domain (4) forms a deep active-site cleft. The prokaryotic enzyme is encoded by two adjacent genes, leuC and leuD, corresponding to aconitase domains 1-3 and 4 respectively. LeuC does not bind an iron-sulphur cluster. It is thought that some prokaryotic isopropylmalate dehydratases can also function as homoaconitase, converting cis-homoaconitate to homoisocitric acid in lysine biosynthesis. Homoaconitase has been identified in higher fungi (mitochondria), several archaea and one thermophilic species of bacteria, Thermus thermophilus.
Aconitase (aconitate hydratase; is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles, however unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also 2 forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is small than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3). Below is a description of some of the multi-functional activities associated with different aconitases.
This entry represents the 'swivel' domain found at the C terminus of eukaryotic mAcn, cAcn/IRP1 and IRP2, and bacterial AcnA, but in the N-terminal region following the HEAT-like domain in bacterial AcnB. This domain has a three-layer beta/beta/alpha structure, and in cytosolic Acn is known to rotate between the cAcn and IRP1 forms of the enzyme. This domain is also found in the small subunit of isopropylmalate dehydratase (LeuD).
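The two domain orders described above can be written down as ordered lists, which makes the position of the 'swivel' domain easy to compare between proteins. The following is a minimal illustrative sketch in Python; the names DOMAIN_ORDER and swivel_position are hypothetical and simply encode the orders stated in the text (1-2-3-linker-4 and HEAT-4-1-2-3).

```python
# Domain orders as stated in the text, N terminus first.
# "4" denotes the 'swivel' domain; all names here are illustrative only.
DOMAIN_ORDER = {
    "mAcn": ["1", "2", "3", "linker", "4"],
    "cAcn": ["1", "2", "3", "linker", "4"],
    "AcnA": ["1", "2", "3", "linker", "4"],
    "AcnB": ["HEAT", "4", "1", "2", "3"],
}


def swivel_position(protein: str) -> str:
    """Describe where the swivel domain (domain 4) sits in the chain."""
    order = DOMAIN_ORDER[protein]
    index = order.index("4")
    if index == len(order) - 1:
        return "C-terminal"
    if index == 0:
        return "N-terminal"
    return f"after {order[index - 1]}"


for name, order in DOMAIN_ORDER.items():
    print(name, "-".join(order), swivel_position(name))
```

Keeping the order as a plain list mirrors the notation used in the text and avoids implying anything about domain boundaries or residue numbers, which are not given here.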
More information about these proteins can be found at Protein of the Month: Aconitase.
The aldo-keto reductase family includes a number of related monomeric NADPH-dependent oxidoreductases, such as aldehyde reductase, aldose reductase, prostaglandin F synthase, xylose reductase, rho crystallin, and many others. All possess a similar structure, with a beta-alpha-beta fold characteristic of nucleotide-binding proteins. The fold comprises a parallel beta-8/alpha-8-barrel, which contains a novel NADP-binding motif. The binding site is located in a large, deep, elliptical pocket at the C-terminal end of the beta sheet, the substrate being bound in an extended conformation. The hydrophobic nature of the pocket favours aromatic and apolar substrates over highly polar ones.
Binding of the NADPH coenzyme causes a massive conformational change, reorienting a loop, effectively locking the coenzyme in place. This binding is more similar to FAD- than to NAD(P)-binding oxidoreductases.
Some proteins of this entry contain a K+ ion channel beta chain regulatory domain; these are reported to have oxidoreductase activity.
This entry represents a structural motif with a beta/alpha TIM barrel found in several protein families:
These proteins share similar, but not identical, metal-binding sites. In addition, xylose isomerase and L-rhamnose isomerase each have additional alpha-helical domains involved in tetramer formation. This entry differs from IPR012307 in having a wider coverage of TIM-barrel protein families.
This entry represents a structural domain consisting of a TIM beta/alpha-barrel. These domains are found in several phospholipase C (PLC) like phosphodiesterases, including:
Phospholipase C (PLC) isozymes are directly activated by heterotrimeric G proteins and Ras-like GTPases to hydrolyze phosphatidylinositol 4,5-bisphosphate into the second messengers diacylglycerol and inositol 1,4,5-trisphosphate. PLC enzymes often play central roles in various signalling cascades.
All organisms require reduced folate cofactors for the synthesis of a variety of metabolites. Most microorganisms must synthesize folate de novo because they lack the active transport system of higher vertebrate cells that allows these organisms to use dietary folates. Proteins containing this domain include dihydropteroate synthase as well as a group of methyltransferase enzymes including methyltetrahydrofolate:corrinoid iron-sulphur protein methyltransferase (MeTr), which catalyses a key step in the Wood-Ljungdahl pathway of carbon dioxide fixation.
Dihydropteroate synthase (DHPS) catalyses the condensation of 6-hydroxymethyl-7,8-dihydropteridine pyrophosphate to para-aminobenzoic acid to form 7,8-dihydropteroate. This is the second step in the three-step pathway leading from 6-hydroxymethyl-7,8-dihydropterin to 7,8-dihydrofolate. DHPS is the target of sulphonamides, which are substrate analogues that compete with para-aminobenzoic acid. Bacterial DHPS (gene sul or folP) is a protein of about 275 to 315 amino acid residues that is either chromosomally encoded or found on various antibiotic resistance plasmids. In the lower eukaryote Pneumocystis carinii, DHPS is the C-terminal domain of a multifunctional folate synthesis enzyme (gene fas).
Pyruvate kinase controls the exit from the glycolysis pathway, catalysing the transfer of phosphate from phosphoenolpyruvate (PEP) to ADP. Mammalian pyruvate kinase is a homotetramer, where each polypeptide subunit consists of four domains: N-terminal, A domain, B domain and C-terminal. Activation of the enzyme is believed to occur via the clamping down of the B domain onto the A domain to dehydrate the active site cleft. The N- and C-terminal domains are situated at inter-subunit contact sites, and could be involved in assembly and communication within the complex. The N-terminal domain has a TIM beta/alpha-barrel structure. Homologous TIM-barrel domains are found in the following proteins:
This entry represents the TIM beta/alpha barrel found in aldolase and in related proteins. This TIM barrel usually covers the entire protein structure. Proteins containing this TIM barrel domain include class I aldolases, class I DAHP synthases, class II fructose-bisphosphate aldolases (FBP aldolases), and 5-aminolaevulinate dehydratase (a hybrid of classes I and II aldolases).
O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in 'clans'.
This entry represents the catalytic TIM beta/alpha barrel common to many different families of glycosyl hydrolases. Structures have been determined for several proteins containing this domain, including family 13 glycosyl hydrolases (such as alpha-amylase), beta-glycanases, family 1 glycosyl hydrolases (such as beta-glucosidase), type II chitinases, 1,4-beta-N-acetylmuraminidases, and beta-N-acetylhexosaminidases.
More information about this protein can be found at Protein of the Month: alpha-Amylase.
Tubby, an autosomal recessive mutation, mapping to mouse chromosome 7, was recently found to be the result of a splicing defect in a novel gene with unknown function. This mutation maps to the tub gene. The mouse tubby mutation is the cause of maturity-onset obesity, insulin resistance and sensory deficits. By contrast with the rapid juvenile-onset weight gain seen in diabetes (db) and obese (ob) mice, obesity in tubby mice develops gradually, and strongly resembles the late-onset obesity observed in the human population. Excessive deposition of adipose tissue culminates in a two-fold increase of body weight. Tubby mice also suffer retinal degeneration and neurosensory hearing loss. The tripartite character of the tubby phenotype is highly similar to human obesity syndromes, such as Alstrom and Bardet-Biedl. Although these phenotypes indicate a vital role for tubby proteins, no biochemical function has yet been ascribed to any family member; it has been suggested, however, that the phenotypic features of tubby mice may be the result of cellular apoptosis triggered by expression of the mutated tub gene. TUB is the founding member of the tubby-like proteins, the TULPs. TULPs are found in multicellular organisms from both the plant and animal kingdoms. Ablation of members of this protein family causes disease phenotypes that are indicative of their importance in nervous-system function and development.
Mammalian TUB is a hydrophilic protein of ~500 residues. The N-terminal portion of the protein is conserved neither in length nor sequence, but, in TUB, contains the nuclear localisation signal and may have transcriptional-activation activity. The C-terminal 250 residues are highly conserved. The C-terminal extremity contains a cysteine residue that might play an important role in the normal functioning of these proteins. The crystal structure of the C-terminal core domain from mouse tubby has been determined to 1.9 Å resolution. This domain is arranged as a 12-stranded, all anti-parallel, closed beta-barrel that surrounds a central alpha helix (which is at the extreme carboxyl terminus of the protein) that forms most of the hydrophobic core. Structural analyses suggest that TULPs constitute a unique family of bipartite transcription factors.
Initiation factor 3 (IF-3) (gene infC) is one of the three factors required for the initiation of protein biosynthesis in bacteria. IF-3 is thought to function as a fidelity factor during the assembly of the ternary initiation complex, which consists of the 30S ribosomal subunit, the initiator tRNA and the messenger RNA. IF-3 is a basic protein that binds to the 30S ribosomal subunit. The chloroplast initiation factor IF-3(chl) is a protein that enhances the poly(A,U,G)-dependent binding of the initiator tRNA to chloroplast ribosomal 30S subunits; its central section is evolutionarily related to the sequence of bacterial IF-3.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.
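The class assignment listed above amounts to a simple lookup from amino acid to synthetase class. As a hedged illustration only, the sketch below encodes exactly the two lists given in the text, together with the hydroxyl preference stated for each class; the names CLASS_I, CLASS_II, PREFERRED_HYDROXYL and synthetase_class are hypothetical and not part of any established library.

```python
# Amino acid specificities exactly as listed in the text.
CLASS_I = {
    "arginine", "cysteine", "glutamic acid", "glutamine", "isoleucine",
    "leucine", "methionine", "tyrosine", "tryptophan", "valine",
}
CLASS_II = {
    "alanine", "asparagine", "aspartic acid", "glycine", "histidine",
    "lysine", "phenylalanine", "proline", "serine", "threonine",
}

# tRNA hydroxyl to which the aminoacyl group is coupled, per the text.
PREFERRED_HYDROXYL = {"I": "2'-hydroxyl", "II": "3'-hydroxyl"}


def synthetase_class(amino_acid: str) -> str:
    """Return "I" or "II" for the synthetase specific for this amino acid."""
    name = amino_acid.strip().lower()
    if name in CLASS_I:
        return "I"
    if name in CLASS_II:
        return "II"
    raise ValueError(f"not one of the 20 listed amino acids: {amino_acid!r}")


for aa in ("proline", "valine"):
    cls = synthetase_class(aa)
    print(f"{aa}: class {cls}, couples to the {PREFERRED_HYDROXYL[cls]}")
```

The subclasses (a, b and c) are left out because the text does not say which synthetase falls into which subclass.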
Prolyl-tRNA synthetase exists in two forms, which are loosely related. The first form is present in the majority of eubacterial species. The second, although present in some eubacteria, is found mainly in archaea and eukaryota. Prolyl-tRNA synthetase belongs to class IIa.
This domain is found at the C terminus in archaeal and eukaryotic enzymes, as well as in certain bacterial ones.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein S3 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S3 is known to be involved in the binding of initiator Met-tRNA. This family of ribosomal proteins includes S3 from bacteria, algae and plant chloroplast, cyanelle, archaebacteria, plant mitochondria, vertebrates, insects, Caenorhabditis elegans and yeast. This entry is the C-terminal domain.
This entry describes proteins of unknown function. Structures for two of these proteins, YggU from Escherichia coli and MTH637 from the archaeon Methanobacterium thermoautotrophicum, have been determined; they have a core 2-layer alpha/beta structure consisting of beta(2)-loop-alpha-beta(2)-alpha.
Ribosomal protein S16 is one of the proteins from the small ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:
This domain is found in the tubulin alpha, beta and gamma chains, as well as the bacterial FtsZ family of proteins. These proteins are GTPases and are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea. This is the C-terminal domain.
This entry represents MECDP (2-C-methyl-D-erythritol 2,4-cyclodiphosphate) synthetase, an enzyme in the non-mevalonate pathway of isoprenoid synthesis, isoprenoids being essential in all organisms. Isoprenoids can also be synthesized through the mevalonate pathway. The non-mevalonate route is used by many bacteria and human pathogens, including Mycobacterium tuberculosis and Plasmodium falciparum. This route appears to involve seven enzymes. MECDP synthetase catalyses the intramolecular attack by a phosphate group on a diphosphate, with cytidine monophosphate (CMP) acting as the leaving group to give the cyclic diphosphate product MECDP. The enzyme is a trimer with three active sites shared between adjacent copies of the protein. The enzyme also has two metal binding sites, the metals playing key roles in catalysis.
A number of proteins from eukaryotes and prokaryotes share this common N-terminal signature and appear to be involved in terpenoid biosynthesis. The ygbB protein is a putative enzyme of this type.
DCoH is the dimerisation cofactor of hepatocyte nuclear factor 1 (HNF-1) that functions as both a transcriptional coactivator and a pterin dehydratase. X-ray crystallographic studies have shown that the ligand binds at four sites per tetrameric enzyme, with little apparent conformational change in the protein.
This entry represents a domain found at the C-terminus of ribosomal proteins L7 and L12, and also in the adaptor protein ClpS, forming an alpha/beta sandwich.
The L7 and L12 ribosomal proteins are part of the large 50S ribosomal subunit, and occur in four copies organised as two dimers. The L7/L12 dimer probably interacts with EF-Tu. L7 and L12 differ only in a single post-translational modification: the addition of an acetyl group to the N terminus of L7.
ClpS is an adaptor protein that influences protein degradation through its binding to the N-terminal domain of the chaperone ClpA in the ClpAP chaperone-protease pair. The degradation of ClpAP substrates, both SsrA-tagged proteins and ClpA itself, is specifically inhibited by ClpS. ClpS modifies ClpA substrate specificity, potentially redirecting degradation by ClpAP toward aggregated proteins.
Ribosomal protein L5 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L5 is known to be involved in binding 5S RNA to the large ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:
L5 is a protein of about 180 amino-acid residues.
The PX (phox) domain occurs in a variety of eukaryotic proteins and has been implicated in highly diverse functions such as cell signalling, vesicular trafficking, protein sorting and lipid modification. PX domains are important phosphoinositide-binding modules that have varying lipid-binding specificities. The PX domain is approximately 120 residues long, and folds into a three-stranded beta-sheet followed by three alpha-helices and a proline-rich region that immediately precedes a membrane-interaction loop and spans approximately eight hydrophobic and polar residues. The PX domain of p47phox binds to the SH3 domain in the same protein. Phosphorylation of p47(phox), a cytoplasmic activator of the microbicidal phagocyte oxidase (phox), elicits interaction of p47(phox) with phosphoinositides. The protein phosphorylation-driven conformational change of p47(phox) enables its PX domain to bind to phosphoinositides, the interaction of which plays a crucial role in recruitment of p47(phox) from the cytoplasm to membranes and subsequent activation of the phagocyte oxidase. The lipid-binding activity of this protein is normally suppressed by intramolecular interaction of the PX domain with the C-terminal Src homology 3 (SH3) domain.
The PX domain is conserved from yeast to human. Multiple alignment of representative PX domain sequences shows relatively little sequence conservation, although their structure appears to be highly conserved. Although phosphatidylinositol-3-phosphate (PtdIns(3)P) is the primary target of PX domains, binding to phosphatidic acid, phosphatidylinositol-3,4-bisphosphate (PtdIns(3,4)P2), phosphatidylinositol-3,5-bisphosphate (PtdIns(3,5)P2), phosphatidylinositol-4,5-bisphosphate (PtdIns(4,5)P2), and phosphatidylinositol-3,4,5-trisphosphate (PtdIns(3,4,5)P3) has been reported as well. The PX domain is also a protein-protein interaction domain.
The double-stranded RNA-binding domain (dsRBD), which is found in a variety of proteins, shares a common structure with the N-terminal domain of ribosomal protein S5, namely an alpha-beta(3)-alpha structure that folds into two layers, alpha/beta. The dsRBD is found in a variety of functionally distinct proteins, including Drosophila staufen proteins (five copies of motif), dsRNA-dependent protein kinase pkr, and RNase III. Ribosomal protein S5 functions in the small ribosomal subunit, and in Escherichia coli has been shown to be important in the assembly and function of the 30S subunit.
Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including vitamin B12, haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin.
The first stage in tetrapyrrole synthesis is the synthesis of 5-aminolaevulinic acid (ALA) via two possible routes: (1) condensation of succinyl CoA and glycine (C4 pathway) using ALA synthase, or (2) decarboxylation of glutamate (C5 pathway) via three different enzymes, glutamyl-tRNA synthetase to charge a tRNA with glutamate, glutamyl-tRNA reductase to reduce glutamyl-tRNA to glutamate-1-semialdehyde (GSA), and GSA aminotransferase to catalyse a transamination reaction to produce ALA.
The second stage is to convert ALA to uroporphyrinogen III, the first macrocyclic tetrapyrrolic structure in the pathway. This is achieved by the action of three enzymes in one common pathway: porphobilinogen (PBG) synthase (or ALA dehydratase) to condense two ALA molecules to generate porphobilinogen; hydroxymethylbilane synthase (or PBG deaminase) to polymerise four PBG molecules into preuroporphyrinogen (tetrapyrrole structure); and uroporphyrinogen III synthase to link two pyrrole units together (rings A and D) to yield uroporphyrinogen III.
Uroporphyrinogen III is the first branch point of the pathway. To synthesise cobalamin (vitamin B12), sirohaem, and coenzyme F430, uroporphyrinogen III needs to be converted into precorrin-2 by the action of uroporphyrinogen III methyltransferase. To synthesise haem and chlorophyll, uroporphyrinogen III needs to be decarboxylated into coproporphyrinogen III by the action of uroporphyrinogen III decarboxylase.
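The second stage and the branch point described above can be summarised as a small data structure, from which the number of ALA molecules needed per uroporphyrinogen III follows directly. This is an illustrative sketch only: the names STAGE_TWO, BRANCHES and ala_required are hypothetical, and the ratios are taken from the text (two ALA per PBG, four PBG per preuroporphyrinogen, one preuroporphyrinogen per uroporphyrinogen III).

```python
# (enzyme, substrate, product, substrate molecules consumed per product)
# Ratios follow the text; the structure itself is only an illustration.
STAGE_TWO = [
    ("porphobilinogen synthase", "ALA", "PBG", 2),
    ("hydroxymethylbilane synthase", "PBG", "preuroporphyrinogen", 4),
    ("uroporphyrinogen III synthase", "preuroporphyrinogen",
     "uroporphyrinogen III", 1),
]

# Branch point at uroporphyrinogen III: enzyme -> (product, end products).
BRANCHES = {
    "uroporphyrinogen III methyltransferase":
        ("precorrin-2", ["cobalamin", "sirohaem", "coenzyme F430"]),
    "uroporphyrinogen III decarboxylase":
        ("coproporphyrinogen III", ["haem", "chlorophyll"]),
}


def ala_required(n_products: int = 1) -> int:
    """ALA molecules needed to make n molecules of uroporphyrinogen III."""
    needed = n_products
    # Walk the pathway backwards, multiplying by each step's ratio.
    for _enzyme, _substrate, _product, ratio in reversed(STAGE_TWO):
        needed *= ratio
    return needed


print(ala_required())  # 2 * 4 * 1 = 8 ALA per uroporphyrinogen III
```

Under these ratios, each uroporphyrinogen III molecule derives from eight molecules of ALA.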
This entry represents hydroxymethylbilane synthase (or porphobilinogen deaminase), which functions during the second stage of tetrapyrrole biosynthesis. This enzyme catalyses the polymerisation of four PBG molecules into the tetrapyrrole structure, preuroporphyrinogen, with the concomitant release of four molecules of ammonia. This enzyme uses a unique dipyrromethane cofactor made from two molecules of PBG, which is covalently attached to a cysteine side chain. The tetrapyrrole product is synthesized in an ordered, sequential fashion, by initial attachment of the first pyrrole unit (ring A) to the cofactor, followed by subsequent additions of the remaining pyrrole units (rings B, C, D) to the growing pyrrole chain. The link between the pyrrole ring and the cofactor is broken once all the pyrroles have been added. This enzyme is folded into three distinct domains that enclose a single, large active site that makes use of an aspartic acid as its one essential catalytic residue, acting as a general acid/base during catalysis. A deficiency of hydroxymethylbilane synthase is implicated in the neuropathic disease, Acute Intermittent Porphyria (AIP).
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.
In eukaryotes, cyclin-dependent protein kinases interact with cyclins to regulate cell cycle progression, and are required for the G1 and G2 stages of cell division. The proteins bind to a regulatory subunit, cyclin-dependent kinase regulatory subunit (CKS), which is essential for their function. This regulatory subunit is a small protein of 79 to 150 residues. In yeast (gene CKS1) and in fission yeast (gene suc1) a single isoform is known, while mammals have two highly related isoforms. The regulatory subunits exist as hexamers, formed by the symmetrical assembly of three interlocked homodimers, creating an unusual 12-stranded beta-barrel structure. Through the barrel centre runs a 12 Å diameter tunnel, lined by six exposed helix pairs. Six kinase units can be modelled to bind the hexameric structure, which may thus act as a hub for cyclin-dependent protein kinase multimerisation.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents PARP (poly(ADP-ribose) polymerase) type zinc finger domains.
NAD(+) ADP-ribosyltransferase is a eukaryotic enzyme that catalyses the covalent attachment of ADP-ribose units from NAD(+) to various nuclear acceptor proteins. This post-translational modification of nuclear proteins is dependent on DNA. It appears to be involved in the regulation of various important cellular processes such as differentiation, proliferation and tumour transformation as well as in the regulation of the molecular events involved in the recovery of the cell from DNA damage. Structurally, NAD(+) ADP-ribosyltransferase consists of three distinct domains: an N-terminal zinc-dependent DNA-binding domain, a central automodification domain and a C-terminal NAD-binding domain. The DNA-binding region contains a pair of PARP-type zinc finger domains which have been shown to bind DNA in a zinc-dependent manner. The PARP-type zinc finger domains seem to bind specifically to single-stranded DNA and to act as a DNA nick sensor. DNA ligase III contains, in its N-terminal section, a single copy of a zinc finger highly similar to those of PARP.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein L1 is the largest protein from the large ribosomal subunit. The L1 protein contains two domains: a 2-layer alpha/beta domain and a 3-layer alpha/beta domain, which interrupts the first domain. This entry represents the 2-layer sandwich domain.
In Escherichia coli, L1 is known to bind to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:
Domain 2 of the ribosomal protein S5 has a left-handed beta-alpha-beta fold that is found in numerous RNA/DNA-binding proteins, as well as in kinases from the GHMP kinase family. Proteins containing this beta-alpha-beta fold domain include:
This entry represents prokaryotic-type K homology domains, as well as related domains that share the same 2-layer alpha/beta structure.
The K homology domain is a common RNA-binding motif present in one or multiple copies in both prokaryotic and eukaryotic regulatory proteins. The KH motifs may act cooperatively to bind RNA in the case of multiple motifs, or independently in the case of single KH motif proteins. Prokaryotic (pKH) and eukaryotic (eKH) KH domains share a KH-motif, but have different topologies. The pKH domain has been found in a number of proteins, including the N-terminal domain of the S3 ribosomal protein, the C-terminal domain of Era GTPase and the two C-terminal domains of the NusA transcription factor. The structure of the pKH domain consists of a two-layer alpha/beta fold in the arrangement alpha/beta(2)/alpha/beta.
More information about these proteins can be found at Protein of the Month: RNA Exosomes.
The TATA-box binding protein (TBP) is required for the initiation of transcription by RNA polymerases I, II and III, from promoters with or without a TATA box. TBP associates with a host of factors, including the general transcription factors TFIIA, -B, -D, -E, and -H, to form huge multi-subunit pre-initiation complexes on the core promoter. Through its association with different transcription factors, TBP can initiate transcription from different RNA polymerases. There are several related TBPs, including TBP-like (TBPL) proteins. The C-terminal core of TBP (~180 residues) is highly conserved and contains two 77-amino acid repeats that produce a saddle-shaped structure that straddles the DNA; this region binds to the TATA box, and interacts with transcription factors and regulatory proteins.
Beta(2)-adaptin is one of four subunits that comprise the clathrin adaptor, which plays a central role in clathrin-mediated endocytosis by linking transmembrane receptors to be internalised to the clathrin lattice. The C-terminal domain of beta(2)-adaptin is the appendage or ear domain, which is involved in clathrin polymerisation.
Even though the C-terminus of beta(2)-adaptin has a very low sequence identity with the C-terminus of the TATA-box binding protein, they do share structural similarities, namely a beta-alpha-beta(4)-alpha core structure.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer.
Clathrin coats contain both clathrin and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors. All AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). Each subunit has a specific function. Adaptin subunits recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal appendage domains. By contrast, GGAs are monomers composed of four domains, which have functions similar to AP subunits: an N-terminal VHS (Vps27p/Hrs/Stam) domain, a GAT (GGA and Tom1) domain, a hinge region, and a C-terminal GAE (gamma-adaptin ear) domain. The GAE domain is similar to the AP gamma-adaptin ear domain, being responsible for the recruitment of accessory proteins that regulate clathrin-mediated endocytosis.
While clathrin mediates endocytic protein transport from the plasma membrane, as well as transport between the trans-Golgi network and endosomes, coatomers (COPI, COPII) primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and budding from Golgi membranes. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta', gamma, delta, epsilon and zeta subunits.
This entry represents a subdomain of the appendage (ear) domain of alpha-adaptin from AP clathrin adaptor complexes, and the appendage domain of the gamma subunit of coatomer complexes. These domains have a three-layer arrangement, alpha-beta-alpha, with a bifurcated antiparallel beta-sheet. Although the appendage domains from AP adaptins and coatomers share a similar fold, there is little sequence identity between them. However, they also share similar motif-based cargo recognition and accessory factor recruitment mechanisms.
More information about these proteins can be found at Protein of the Month: Clathrin.
This entry represents a dimerisation domain that is usually found at the C-terminal of both class I and class II oxidoreductases, as well as in NADH oxidases and peroxidases.
Proteins containing this domain form structural complexes with other known families.
The carbon monoxide (CO) dehydrogenase of Oligotropha carboxidovorans is a heterotrimeric complex composed of an apoflavoprotein, a molybdoprotein, and an iron-sulphur protein. It can be dissociated with sodium dodecylsulphate. CO dehydrogenase catalyzes the oxidation of CO according to the following equation:
CO + H2O = CO2 + 2e- + 2H+
Subunit S represents the iron-sulphur protein of CO dehydrogenase and is clearly divided into a C- and an N-terminal domain, each binding a [2Fe-2S] cluster.
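As a quick sanity check on the CO oxidation equation above, the following minimal sketch verifies that atoms and charge balance on both sides. The species table and the helper names (`SPECIES`, `totals`, `is_balanced`) are illustrative assumptions, not part of any database or library.

```python
from collections import Counter

# Each species maps to (element counts, net charge).
# Electrons carry charge -1 and contribute no atoms.
SPECIES = {
    "CO": (Counter({"C": 1, "O": 1}), 0),
    "H2O": (Counter({"H": 2, "O": 1}), 0),
    "CO2": (Counter({"C": 1, "O": 2}), 0),
    "H+": (Counter({"H": 1}), +1),
    "e-": (Counter(), -1),
}


def totals(side):
    """Sum atoms and charge for a list of (coefficient, species) pairs."""
    atoms, charge = Counter(), 0
    for coeff, name in side:
        elements, q = SPECIES[name]
        for element, n in elements.items():
            atoms[element] += coeff * n
        charge += coeff * q
    return atoms, charge


def is_balanced(left, right):
    """True when both sides have identical atom counts and net charge."""
    return totals(left) == totals(right)


# CO + H2O = CO2 + 2e- + 2H+
LEFT = [(1, "CO"), (1, "H2O")]
RIGHT = [(1, "CO2"), (2, "e-"), (2, "H+")]

if __name__ == "__main__":
    print(is_balanced(LEFT, RIGHT))
```

Both sides carry one carbon, two oxygens, two hydrogens and zero net charge, so the two electrons released are exactly offset by the two protons.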
This entry represents RING-, PHD-, and FYVE-type zinc finger domains, which share a common dimetal (zinc)-bound alpha/beta structural fold, as well as the non-zinc-containing U-box domain, which is similar to the RING zinc finger only lacking the metal ion-binding residues (U-box associated with multi-ubiquitination).
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Ribosomal protein S11 plays an essential role in selecting the correct tRNA in protein biosynthesis. It is located on the large lobe of the small ribosomal subunit. On the basis of sequence similarities, S11 belongs to a family of bacterial, archaeal and eukaryotic ribosomal proteins.
The histidine triad motif (HIT) is related to the sequence H-phi-H-phi-H-phi-phi (where phi is a hydrophobic amino acid). Proteins containing HIT domains form a superfamily of nucleotide hydrolases and transferases that act on the alpha-phosphate of ribonucleotides. This entry covers two HIT-containing protein families:
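The H-phi-H-phi-H-phi-phi pattern described above can be sketched as a regular expression over one-letter amino acid codes. This is only an illustration: the residue set used for "phi" is an assumption (the text says only "hydrophobic"), the names `HYDROPHOBIC`, `HIT_PATTERN` and `find_hit_motifs` are hypothetical, and real HIT signatures in curated databases are more specific than this.

```python
import re

# Assumption: "phi" is approximated by this set of hydrophobic residues.
HYDROPHOBIC = "AVILMFWYC"

# H-phi-H-phi-H-phi-phi written as a regular expression.
HIT_PATTERN = re.compile(
    f"H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]{{2}}"
)


def find_hit_motifs(sequence):
    """Return (start, matched_text) for each non-overlapping HIT-like match."""
    return [(m.start(), m.group()) for m in HIT_PATTERN.finditer(sequence.upper())]


if __name__ == "__main__":
    # Made-up peptide used purely for demonstration.
    print(find_hit_motifs("MKHVHAHLIG"))
```

A scan like this will produce false positives on short sequences, so it is better suited to illustrating the motif than to classifying proteins.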
Tautomerase superfamily members have a (beta-alpha-beta)2 structure in two layers, and use a similar mechanism of action involving an amino-terminal proline as a general base in a keto-enol tautomerisation reaction. Members of this superfamily include macrophage migration inhibitory factor (MIF) and related proteins such as D-dopachrome tautomerase; 4-oxalocrotonate tautomerase and related enzymes such as trans-3-chloroacrylic acid dehalogenase; and 5-carboxymethyl-2-hydroxymuconate isomerase (CHMI).
Macrophage migration inhibitory factor (MIF) is a key regulatory cytokine within innate and adaptive immune responses, capable of promoting and modulating the magnitude of the response. MIF is released from T-cells and macrophages, and it can regulate cytokine secretion and the expression of receptors involved in the immune response. MIF has been linked to various inflammatory diseases, such as rheumatoid arthritis and atherosclerosis.
4-Oxalocrotonate tautomerase (4-OT) is a plasmid-encoded enzyme that catalyzes the isomerisation of beta,gamma-unsaturated enones to their alpha,beta-isomers. This enzyme is part of the plasmid-encoded catechol meta-fission pathway, which enables the bacteria to use various aromatic hydrocarbons as their sole sources of carbon and energy.
5-carboxymethyl-2-hydroxymuconate isomerase (CHMI) is a trimeric enzyme involved in the homoprotocatechuate pathway in Escherichia coli. This enzyme catalyses the isomerisation of 5-carboxymethyl-2-hydroxymuconate (CHM) to 5-carboxymethyl-2-oxo-3-hexene-1,6-dioate (COHED).
VAMPs (and their homologues, the synaptobrevins) define a group of SNARE proteins that contain a C-terminal coiled-coil/SNARE domain, in combination with variable N-terminal domains that are used to classify VAMPs: those containing longin N-terminal domains (~150 aa) are referred to as longins, while those with shorter N-termini are referred to as brevins. Longins are the only type of VAMP protein found in all eukaryotes, suggesting that their longin domain is essential. The longin domain is thought to exert a regulatory function. Longin domains have been shown to share the same structural fold, a profilin-like globular domain consisting of a five-stranded antiparallel beta-sheet that is sandwiched by an alpha-helix on one side, and two alpha-helices on the other (beta(2)-alpha-beta(3)-alpha(2)).
The ATP-grasp fold is one of several distinct ATP-binding folds, and is found in enzymes that catalyze the formation of amide bonds through the ATP-dependent ligation of a carboxylate-containing molecule to an amino or thiol group-containing molecule. This fold is found in many different enzyme families, including various peptide synthetases, biotin carboxylase, synapsin, succinyl-CoA synthetase, pyruvate phosphate dikinase, and glutathione synthetase, amongst others. These enzymes contribute predominantly to macromolecular synthesis, using ATP hydrolysis to activate their substrates.
The ATP-grasp fold shares functional and structural similarities with the PIPK (phosphatidylinositol phosphate kinase) and protein kinase superfamilies. The ATP-grasp domain consists of two subdomains with different alpha+beta folds, which grasp the ATP molecule between them. Each subdomain provides a variable loop that forms part of the active site, with regions from other domains also contributing to the active site, even though these other domains are not conserved between the various ATP-grasp enzymes. This entry represents subdomain 2 found at the C-terminal end of the ATP-grasp domain (the N-terminal subdomain is represented by a separate entry).
Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.
Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta).
This entry represents the C-terminal dimerisation domain found primarily in EF-Tu (EF1A) proteins from bacteria, mitochondria and chloroplasts.
More information about these proteins can be found at Protein of the Month: Elongation Factors.
3-isopropylmalate dehydratase (or isopropylmalate isomerase) catalyses the stereo-specific isomerisation of 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate. This enzyme performs the second step in the biosynthesis of leucine, and is present in most prokaryotes and many fungal species. The prokaryotic enzyme is a heterodimer composed of a large (LeuC) and small (LeuD) subunit, while the fungal form is a monomeric enzyme. Both forms of isopropylmalate dehydratase are related and are part of the larger aconitase family. Aconitases are mostly monomeric proteins which share four domains in common and contain a single, labile [4Fe-4S] cluster. Three structural domains (1, 2 and 3) are tightly packed around the iron-sulphur cluster, while a fourth domain (4) forms a deep active-site cleft. The prokaryotic enzyme is encoded by two adjacent genes, leuC and leuD, corresponding to aconitase domains 1-3 and 4 respectively. LeuC does not bind an iron-sulphur cluster. It is thought that some prokaryotic isopropylmalate dehydratases can also function as homoaconitase, converting cis-homoaconitate to homoisocitric acid in lysine biosynthesis. Homoaconitase has been identified in higher fungi (mitochondria), in several archaea and in one thermophilic species of bacteria, Thermus thermophilus.
Aconitase (aconitate hydratase) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also 2 forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3). Below is a description of some of the multi-functional activities associated with different aconitases.
This entry represents a domain with an alpha/beta/alpha topology. This structural domain usually occurs in triplicate, with domains 1 and 3 being the most closely related since they share the same pseudo 2-fold symmetry. This entry represents domains 1 and 3. This triple domain region is found at the N-terminus of eukaryotic mAcn, cAcn/IRP1 and IRP2, and bacterial AcnA, but at the C-terminus of bacterial AcnB; in each case, this region binds the [4Fe-4S]-cluster. This triple domain region is also found in the large subunit of isopropylmalate dehydratase (LeuC).
More information about these proteins can be found at Protein of the Month: Aconitase.
Domain B5 is found in phenylalanine-tRNA synthetase beta subunits. This domain has been shown to bind DNA through a winged helix-turn-helix motif. Phenylalanine-tRNA synthetase may influence common cellular processes via DNA binding, in addition to its aminoacylation function.
The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.
This entry represents the SRP19 subunit. The SRP19 protein is unstructured but forms a compact core domain and two extended RNA-binding loops upon binding the signal recognition particle (SRP) RNA.
This domain is found in several ATP-binding proteins, for example histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
5,10-methylenetetrahydrofolate + dUMP = dihydrofolate + dTMP
This reaction provides the sole de novo pathway for production of dTMP and is the only enzyme-catalysed step in folate metabolism in which the 5,10-methylenetetrahydrofolate is oxidised during one-carbon transfer. The enzyme is essential for regulating the balanced supply of the 4 DNA precursors in normal DNA replication: defects in the enzyme activity affecting the regulation process cause various biological and genetic abnormalities, such as thymineless death. The enzyme is an important target for certain chemotherapeutic drugs. Thymidylate synthase is an enzyme of about 30 to 35 kDa in most species except in protozoa and plants, where it exists as a bifunctional enzyme that includes a dihydrofolate reductase domain. A cysteine residue is involved in the catalytic mechanism (it covalently binds the 5,6-dihydro-dUMP intermediate). The sequence around the active site of this enzyme is conserved from phages to vertebrates.
The C-terminal catalytic domains of glutamine synthetase and the guanido kinase family (which includes creatine kinase and arginine kinase) share a common structural fold, namely a common core consisting of two beta-alpha-beta2-alpha repeats.
Glutamine synthetase (GS) plays an essential role in the metabolism of nitrogen by catalysing the condensation of glutamate and ammonia to form glutamine. There seem to be three different classes of GS. Class I enzymes (GSI) are specific to prokaryotes, and are oligomers of 12 identical subunits; the activity of GSI-type enzymes is controlled by the adenylation of a tyrosine residue. Class II enzymes (GSII) are found in eukaryotes and in bacteria, and are oligomers of 8 identical subunits. Class III enzymes (GSIII) have been found in Bacteroides fragilis and in Butyrivibrio fibrisolvens, and are oligomers of six identical subunits. While the three classes of GS are clearly structurally related, their sequence similarities are not extensive.
ATP:guanido phosphotransferases are a family of structurally and functionally related enzymes that reversibly catalyse the transfer of phosphate between ATP and various phosphogens. The enzymes belonging to this family include:
This entry represents a conserved domain usually found near the C-terminus of EF1B-gamma chains, a peptide of 410-440 residues. The gamma chain appears to play a role in anchoring the EF1B complex to the beta and delta chains and to other cellular components.
More information about these proteins can be found at Protein of the Month: Elongation Factors.
This entry represents nucleotide excision repair (NER) proteins, such as TTDA subunit of TFIIH basal transcription factor complex (also known as subunit 5 of RNA polymerase II transcription factor B), and Rex1. These proteins have a structural motif consisting of a 2-layer sandwich structure with an alpha/beta plait topology. Nucleotide excision repair is a major pathway for repairing UV light-induced DNA damage in most organisms.
Transcription/repair factor IIH (TFIIH) is essential for RNA polymerase II transcription and nucleotide excision repair. The TFIIH complex consists of ten subunits: ERCC2, ERCC3, GTF2H1, GTF2H2, GTF2H3, GTF2H4, GTF2H5, MNAT1, CDK7 and CCNH. Defects in GTF2H5 cause the disease trichothiodystrophy (TTD); GTF2H5 (general transcription factor 2H subunit 5) is therefore also known as the TTD group A (TTDA) subunit (and as Tfb5). The TTDA subunit is responsible for the DNA repair function of the complex. TTDA is present both bound to TFIIH, and as a free fraction that shuttles between the cytoplasm and nucleus; induction of NER-type DNA lesions shifts the balance towards TTDA's more stable association with TFIIH. TTDA is also required for the stability of the TFIIH complex and for the presence of normal levels of TFIIH in the cell.
REX1 (required for excision 1) is required for DNA repair in the single-celled, photosynthetic alga Chlamydomonas reinhardtii, and has homologues in other eukaryotes.
Guanylate cyclases catalyse the formation of cyclic GMP (cGMP) from GTP. cGMP acts as an intracellular messenger, activating cGMP-dependent kinases and regulating cGMP-sensitive ion channels. The role of cGMP as a second messenger in vascular smooth muscle relaxation and retinal photo-transduction is well established. Guanylate cyclase is found both in the soluble and particulate fractions of eukaryotic cells. The soluble and plasma membrane-bound forms differ in structure, regulation and other properties. Most currently known plasma membrane-bound forms are receptors for small polypeptides. The soluble forms of guanylate cyclase are cytoplasmic heterodimers having alpha and beta subunits.
In all characterised eukaryote guanylyl- and adenylyl cyclases, cyclic nucleotide synthesis is carried out by the conserved class III cyclase domain.
Nucleoside diphosphate kinases (NDK) are enzymes required for the synthesis of nucleoside triphosphates (NTP) other than ATP. They provide NTPs for nucleic acid synthesis, CTP for lipid synthesis, UTP for polysaccharide synthesis and GTP for protein elongation, signal transduction and microtubule polymerization.
In eukaryotes, there seems to be a small family of NDK isozymes each of which acts in a different subcellular compartment and/or has a distinct biological function. Eukaryotic NDK isozymes are hexamers of two highly related chains (A and B). By random association (A6, A5B...AB5, B6), these two kinds of chain form isoenzymes differing in their isoelectric point.
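The random association described above (A6, A5B...AB5, B6) gives seven possible hexamer compositions. Under the simplifying assumption that each of the six positions is filled independently from a pool in which a fraction `fraction_a` of chains is type A, the expected share of each composition follows a binomial distribution. The sketch below illustrates this; the function name and the independence assumption are illustrative and are not taken from the source.

```python
from math import comb


def hexamer_distribution(fraction_a, subunits=6):
    """Binomial probability of each A(k)B(n-k) composition.

    Assumes every subunit position is filled independently, with
    probability fraction_a of being an A chain.
    """
    fraction_b = 1.0 - fraction_a
    return {
        f"A{k}B{subunits - k}": comb(subunits, k)
        * fraction_a**k
        * fraction_b ** (subunits - k)
        for k in range(subunits, -1, -1)
    }


if __name__ == "__main__":
    # Equal pools of A and B chains.
    for name, p in hexamer_distribution(0.5).items():
        print(f"{name}: {p:.4f}")
```

With equal amounts of A and B chains, the mixed A3B3 hexamer is the most common species, while the pure A6 and B6 forms are the rarest.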
NDKs are proteins of 17 kDa that act via a ping-pong mechanism in which a histidine residue is phosphorylated by transfer of the terminal phosphate group from ATP. In the presence of magnesium, the phosphoenzyme can transfer its phosphate group to any NDP, to produce an NTP.
NDK isozymes have been sequenced from prokaryotic and eukaryotic sources. It has also been shown that the Drosophila awd (abnormal wing discs) protein is a microtubule-associated NDK. Mammalian NDK is also known as metastasis inhibition factor nm23. The sequence of NDK has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. Our signature pattern contains this residue.
Elongation factor EF2 (EF-G) is a G-protein. It brings about the translocation of peptidyl-tRNA and mRNA through a ratchet-like mechanism: the binding of GTP-EF2 to the ribosome causes a counter-clockwise rotation in the small ribosomal subunit; the hydrolysis of GTP to GDP by EF2 and the subsequent release of EF2 causes a clockwise rotation of the small subunit back to the starting position. This twisting action destabilises tRNA-ribosome interactions, freeing the tRNA to translocate along the ribosome upon GTP-hydrolysis by EF2. EF2 binding also affects the entry and exit channel openings for the mRNA, widening it when bound to enable the mRNA to translocate along the ribosome.
This entry represents the C-terminal domain found in EF2 (or EF-G) of both prokaryotes and eukaryotes (also known as eEF2), as well as in some tetracycline-resistance proteins. This domain adopts a ferredoxin-like fold consisting of an alpha/beta sandwich with anti-parallel beta-sheets. It resembles the topology of domain III found in these elongation factors, with which it forms the C-terminal block, but these two domains cannot be superimposed. This domain is often found associated with, which contains the signatures for the N-terminus of the proteins.
More information about these proteins can be found at Protein of the Month: Elongation Factors.
This entry represents nucleotide-binding domains with an alpha-beta plait structure, which consists of either a ferredoxin-like (beta-alpha-beta)2 fold, such as that found in RNA-binding domains of various ribonucleoproteins or in viral DNA-binding domains; or a beta-(alpha)-beta-alpha-beta(2) fold, such as that found in the ribosomal protein L23.
This is the anticodon binding domain found in some phenylalanyl tRNA synthetases. The domain has a ferredoxin fold, consisting of an alpha+beta sandwich with anti-parallel beta-sheets (beta-alpha-beta x2).
In eukaryotes, polyadenylation of pre-mRNA plays an essential role in the initiation step of protein synthesis, as well as in the export and stability of mRNAs. Poly(A) polymerase, the enzyme at the heart of the polyadenylation machinery, is a template-independent RNA polymerase that specifically incorporates ATP at the 3' end of mRNA. The crystal structure of bovine poly(A) polymerase bound to an ATP analogue at 2.5 Å resolution has been determined. The structure revealed expected and unexpected similarities to other proteins. As expected, the catalytic domain of poly(A) polymerase shares substantial structural homology with other nucleotidyl transferases such as DNA polymerase beta and kanamycin transferase.
The C-terminal domain unexpectedly folds into a compact domain reminiscent of the RNA-recognition motif fold. The three invariant aspartates of the catalytic triad ligate two of the three active site metals. One of these metals also contacts the adenine ring. Furthermore, conserved, catalytically important residues contact the nucleotide. These contacts, taken together with metal coordination of the adenine base, provide a structural basis for ATP selection by poly(A) polymerase.
An alpha+beta sandwich domain with a ferredoxin-like fold can be found in the beta chain of the translation elongation factor EF1B, and in the ribosomal protein S6 from the small subunit.
Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta).
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins.
The small ribosomal subunit protein S10 consists of about 100 amino acid residues. In Escherichia coli, S10 is involved in binding tRNA to the ribosome, and also operates as a transcriptional elongation factor. Experimental evidence has revealed that S10 has virtually no groups exposed on the ribosomal surface, and is one of the "split proteins": these are a discrete group that are selectively removed from 30S subunits under low salt conditions and are required for the formation of activated 30S reconstitution intermediate (RI*) particles. S10 belongs to a family of proteins that includes: bacterial S10; algal chloroplast S10; cyanelle S10; archaebacterial S10; Marchantia polymorpha and Prototheca wickerhamii mitochondrial S10; Arabidopsis thaliana mitochondrial S10 (nuclear encoded); vertebrate S20; plant S20; and yeast URP2.
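The primary/secondary/tertiary classification amounts to a dependency ordering, which can be sketched as a small dependency graph. The protein names below (`P1`, `Sec1`, `Ter1`, ...) and their dependencies are hypothetical placeholders, not real assembly data; the sketch only shows how the three classes constrain binding order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical proteins: name -> (binding class, proteins that must already be bound)
PROTEINS = {
    "P1": ("primary", set()),
    "P2": ("primary", set()),
    "Sec1": ("secondary", {"P1"}),
    "Ter1": ("tertiary", {"Sec1"}),
    "Ter2": ("tertiary", {"Sec1", "Ter1"}),
}


def check_class_rules(proteins):
    """Verify that each protein's prerequisites match its binding class."""
    for name, (cls, deps) in proteins.items():
        dep_classes = {proteins[d][0] for d in deps}
        if cls == "primary" and deps:
            raise ValueError(f"{name}: primary binders need no other protein")
        if cls == "secondary" and "primary" not in dep_classes:
            raise ValueError(f"{name}: secondary binders need a primary binder")
        if cls == "tertiary" and "secondary" not in dep_classes:
            raise ValueError(f"{name}: tertiary binders need a secondary binder")


def assembly_order(proteins):
    """Return one binding order in which every prerequisite is bound first."""
    check_class_rules(proteins)
    graph = {name: deps for name, (_cls, deps) in proteins.items()}
    return list(TopologicalSorter(graph).static_order())


order = assembly_order(PROTEINS)
```

Any order returned places primary binders before the secondary binders that depend on them, and those before the dependent tertiary binders; several orders can be equally valid.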
CutA1 is a widespread protein of about 12 kDa found in bacteria, plants, and animals, including humans. The protein was originally identified in a gene locus of Escherichia coli called cutA, which is involved in divalent metal tolerance. The cutA locus consists of two operons, one containing a single gene encoding a cytoplasmic protein, CutA1, and the other composed of two genes encoding 50-kDa (CutA2) and 24-kDa (CutA3) inner membrane proteins. Molecular genetics studies on the E. coli cutA locus showed that some mutations lead to copper sensitivity due to its increased uptake. However, the specific function of CutA1 in E. coli is still unknown.
However, a possible role has been proposed for mammalian CutA1 in the anchoring of the enzyme acetylcholinesterase (AChE) in neuronal cell membranes. CutA1 does not directly interact with AChE, but the CutA1 gene is widely expressed in different regions of the brain with an expression pattern that parallels that of AChE. In addition, CutA1 co-purified with AChE from human caudate nucleus. CutA1, thus, might provide an intriguing link between copper tolerance in bacteria and a complex process in the brain of the most evolved organisms.
Both rat and E. coli CutA1 have been crystallised. Both proteins are trimeric in the crystals and in solution through an inter-subunit beta-sheet formation. Each monomer exhibits the same overall structure, adopting a ferredoxin-like fold made of an alpha-beta sandwich with antiparallel beta-sheet and containing an additional short strand and a C-terminal helix. In the beta-sheet, alternate strands are connected by helices with positive crossovers, resulting in a double beta-alpha-beta motif where the antiparallel beta-sheet packs against antiparallel alpha-helices. The C-terminal helix packs orthogonal to the N terminus.
The strong structural similarity of CutA1 to PII proteins might point to a role for CutA1 in signalling through allosteric communication between monomers. CutA1 may be involved in the tuning of a disulphide bond cascade in bacteria and mammals, acting as the PII proteins do in the nitrogen signal cascade in bacteria and plants.
The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is a versatile protein-protein interaction motif involved in many cellular functions, including transcriptional regulation, cytoskeleton dynamics, ion channel assembly and gating, and targeting proteins for ubiquitination. The BTB domain can occur alongside other domains: BTB-zinc finger (BTB-ZF), BTB-BACK-Kelch (BBK), voltage-gated potassium channel T1 (T1-Kv), MATH-BTB, BTB-NPH3 and BTB-BACK-PHR (BBP). Other proteins, such as Skp1 and ElonginC, consist almost exclusively of the core BTB fold. In all of these protein families, the BTB core fold is structurally conserved, consisting of a 2-layer alpha/beta topology where a cluster of alpha helices is flanked by short beta-sheets. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN.
This entry differs from IPR000210 in including POZ-containing Skp1 proteins.
The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.
This entry represents both the 9 kDa SRP9 and the 14 kDa SRP14 components. Both SRP9 and SRP14 have the same (beta)-alpha-beta(3)-alpha fold. The heterodimer has pseudo two-fold symmetry and is saddle-like, consisting of a curved six-stranded beta-sheet that has four helices packed on the convex side and an exposed concave surface lined with positively charged residues. The SRP9/SRP14 heterodimer is essential for SRP RNA binding, mediating the pausing of synthesis of ribosome associated nascent polypeptides that have been engaged by the targeting domain of SRP.
Dynein is a multisubunit microtubule-dependent motor enzyme that acts as the force generating protein of eukaryotic cilia and flagella. The cytoplasmic isoform of dynein acts as a motor for the intracellular retrograde motility of vesicles and organelles along microtubules.
Dynein is composed of a number of ATP-binding large subunits, intermediate-size subunits and small subunits. Among the small subunits, there is a group of highly conserved proteins which make up this family.
Both type 1 (DLC1) and 2 (DLC2) dynein light chains have a similar two-layer alpha-beta core structure consisting of beta-alpha(2)-beta-X-beta(2).
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
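The difference between the loose HEXXH motif and the more stringent 'abXHEbbHbc' definition can be illustrated with two regular expressions. This is a sketch under stated assumptions: the residue classes chosen for "uncharged" and "hydrophobic" are one reasonable reading and are not taken from the source, position 'a' is left unrestricted apart from proline, proline is excluded from every position, and the test peptides are made up.

```python
import re

# Assumed residue classes (not defined in the source text):
ANY_NOT_P = "ACDEFGHIKLMNQRSTVWY"   # any standard residue except proline
UNCHARGED = "ACFGILMNQSTVWY"        # excludes D, E, K, R, H and P
HYDROPHOBIC = "AFILMVWY"

# Lookaheads are used so that overlapping occurrences are all reported.
LOOSE = re.compile(r"(?=(HE..H))")
STRICT = re.compile(
    rf"(?=([{ANY_NOT_P}][{UNCHARGED}][{ANY_NOT_P}]"   # a b X
    rf"HE[{UNCHARGED}]{{2}}H"                         # H E b b H
    rf"[{UNCHARGED}][{HYDROPHOBIC}]))"                # b c
)


def find_motifs(sequence, pattern):
    """Return (0-based start, matched text) for every motif occurrence."""
    return [(m.start(1), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Made-up peptides for illustration only
print(find_motifs("MKVVAHELTHAVGG", LOOSE))   # loose HEXXH hit
print(find_motifs("MKVVAHELTHAVGG", STRICT))  # also satisfies the strict form
print(find_motifs("AAGGHEDKHGGAA", STRICT))   # charged residues at 'b': no hit
```

The strict pattern rejects sequences that contain HEXXH by chance but carry charged residues at the 'b' positions, which is the practical point of the narrower definition.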
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
The majority of the sequences in this entry are metallopeptidases and non-peptidase homologues belonging to MEROPS peptidase family M16 (clan ME), subfamilies M16A, M16B and M16C; they include:
These proteins do not share many regions of sequence similarity; the most noticeable is in the N-terminal section. This region includes a conserved histidine followed, two residues later, by a glutamate and another histidine. In pitrilysin, it has been shown that this H-x-x-E-H motif is involved in enzymatic activity; the two histidines bind zinc and the glutamate is necessary for catalytic activity. The proteins classified as non-peptidase homologues either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity.
The small subunit ribosomal proteins can be categorised as: primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S19 contains 88-144 amino acid residues. In Escherichia coli, S19 is known to form a complex with S13 that binds strongly to 16S ribosomal RNA. Experimental evidence has revealed that S19 is moderately exposed on the ribosomal surface, and is designated a secondary rRNA binding protein. S19 belongs to a family of ribosomal proteins that includes: eubacterial S19; algal and plant chloroplast S19; cyanelle S19; archaebacterial S19; plant mitochondrial S19; and eukaryotic S15 ('rig' protein).
Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including cobalamin (vitamin B12), haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin.
This entry represents the C-terminal subdomain 2 from several tetrapyrrole methylases, which consist of two non-similar domains. These enzymes catalyse the methylation of their substrates using S-adenosyl-L-methionine as a methyl source. Enzymes in this family include:
This entry represents a structural domain consisting of a 3-layer alpha/beta/alpha fold. The beta layer is composed of seven beta-sheets, and the overall order is: (beta-hairpin)-beta(3)-alpha-beta(4)-alpha. Domains with this structure are found in the following protein families:
This entry represents the N-terminal subdomain 1 from several tetrapyrrole methylases, which consist of two non-similar domains. These enzymes catalyse the methylation of their substrates using S-adenosyl-L-methionine as a methyl source. Enzymes in this family include:
3-isopropylmalate dehydratase (or isopropylmalate isomerase) catalyses the stereo-specific isomerisation of 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate. This enzyme performs the second step in the biosynthesis of leucine, and is present in most prokaryotes and many fungal species. The prokaryotic enzyme is a heterodimer composed of a large (LeuC) and small (LeuD) subunit, while the fungal form is a monomeric enzyme. Both forms of isopropylmalate dehydratase are related and are part of the larger aconitase family. Aconitases are mostly monomeric proteins which share four domains in common and contain a single, labile [4Fe-4S] cluster. Three structural domains (1, 2 and 3) are tightly packed around the iron-sulphur cluster, while a fourth domain (4) forms a deep active-site cleft. The prokaryotic enzyme is encoded by two adjacent genes, leuC and leuD, corresponding to aconitase domains 1-3 and 4 respectively. LeuD does not bind an iron-sulphur cluster. It is thought that some prokaryotic isopropylmalate dehydratases can also function as homoaconitase, converting cis-homoaconitate to homoisocitric acid in lysine biosynthesis. Homoaconitase has been identified in higher fungi (mitochondria), several archaea and one thermophilic species of bacteria, Thermus thermophilus.
Aconitase (aconitate hydratase) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins, which function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also two forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3). Below is a description of some of the multi-functional activities associated with different aconitases.
This entry represents a domain with an alpha/beta/alpha topology. This structural domain usually occurs in triplicate, with domains 1 and 3 being the most closely related since they share the same pseudo 2-fold symmetry. This entry represents domain 2. This triple domain region is found at the N-terminal of eukaryotic mAcn, cAcn/IRP1 and IRP2, and bacterial AcnA, but in the C-terminal of bacterial AcnB; in each case, this region binds the [4Fe-4S]-cluster. This triple domain region is also found in the large subunit of isopropylmalate dehydratase (LeuC).
More information about these proteins can be found at Protein of the Month: Aconitase.
A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of:
These proteins have about 200 amino acid residues.
Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.
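What "mispaired nucleotides" means at the sequence level can be shown with a short scan for non-Watson-Crick pairs between two aligned strands. This is only a conceptual sketch, not a model of MutS: the function name and the convention that index i of one strand pairs with index i of the other are assumptions for the example, and insertion/deletion loops are not handled.

```python
# Canonical Watson-Crick pairs (template base, opposite-strand base)
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def find_mismatches(template, daughter):
    """List (position, template base, daughter base) for every mispaired position.

    Both strands are given position-aligned, so template[i] is paired with
    daughter[i]. Only point mismatches are detected.
    """
    if len(template) != len(daughter):
        raise ValueError("strands must be the same length (no indel handling)")
    pairs = zip(template.upper(), daughter.upper())
    return [(i, t, d) for i, (t, d) in enumerate(pairs)
            if (t, d) not in WATSON_CRICK]


# A G:T mismatch at position 4 of an otherwise complementary duplex
print(find_mismatches("ATGCGT", "TACGTA"))
```

In the cell the repair machinery must also decide which strand is newly synthesised; that strand discrimination step is outside the scope of this sketch.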
MutS is a modular protein with a complex structure, and is composed of:
Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.
This entry represents the N-terminal domain of proteins in the MutS family of DNA mismatch repair proteins. The N-terminal domain of MutS is responsible for mismatch recognition and forms a 6-stranded mixed beta-sheet surrounded by three alpha-helices, which is similar to the structure of tRNA endonuclease.
Synonym(s): Di-trans-poly-cis-undecaprenyl-diphosphate synthase, Undecaprenyl pyrophosphate synthetase, Undecaprenyl pyrophosphate synthase, UPP synthetase
Di-trans-poly-cis-decaprenylcistransferase (UPP synthetase) generates undecaprenyl pyrophosphate (UPP) from isopentenyl pyrophosphate (IPP). This bacterial enzyme is also found in archaebacteria and in a number of uncharacterised proteins including some from yeasts.
This entry also matches related enzymes that transfer alkyl groups, such as dehydrodolichyl diphosphate synthase.
The bacterial cell wall provides strength and rigidity to counteract internal osmotic pressure, and protection against the environment. The peptidoglycan layer gives the cell wall its strength, and helps maintain the overall shape of the cell. The basic peptidoglycan structure of both Gram-positive and Gram-negative bacteria is comprised of a sheet of glycan chains connected by short cross-linking polypeptides. Biosynthesis of peptidoglycan is a multi-step (11-12 steps) process comprising three main stages:
Stage two involves four key Mur ligase enzymes: MurC, MurD, MurE and MurF. These four Mur ligases are responsible for the successive additions of L-alanine, D-glutamate, meso-diaminopimelate or L-lysine, and D-alanyl-D-alanine to UDP-N-acetylmuramic acid. All four Mur ligases are topologically similar to one another, even though they display low sequence identity. They are each composed of three domains: an N-terminal Rossmann-fold domain responsible for binding the UDPMurNAc substrate; a central domain (similar to ATP-binding domains of several ATPases and GTPases); and a C-terminal domain (similar to dihydrofolate reductase fold) that appears to be associated with binding the incoming amino acid. The conserved sequence motifs found in the four Mur enzymes also map to other members of the Mur ligase family, including folylpolyglutamate synthetase, cyanophycin synthetase and the capB enzyme from Bacillales.
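The successive additions carried out by the four Mur ligases can be written out as an ordered pipeline. The sketch below simply concatenates residue names in the order given in the text; the function name, the abbreviations and the string representation of the intermediates are illustrative assumptions rather than a formal nomenclature.

```python
# (enzyme, residue or dipeptide added), in the order described in the text
MUR_STEPS = [
    ("MurC", "L-Ala"),
    ("MurD", "D-Glu"),
    ("MurE", "meso-DAP"),      # L-Lys is added instead in some organisms
    ("MurF", "D-Ala-D-Ala"),
]


def build_precursor(start="UDP-MurNAc", mur_e_residue="meso-DAP"):
    """Return the (enzyme, product) pair after each successive ligation step."""
    if mur_e_residue not in ("meso-DAP", "L-Lys"):
        raise ValueError("MurE adds either meso-DAP or L-Lys")
    product = start
    intermediates = []
    for enzyme, residue in MUR_STEPS:
        if enzyme == "MurE":
            residue = mur_e_residue
        product = f"{product}-{residue}"
        intermediates.append((enzyme, product))
    return intermediates


for enzyme, product in build_precursor():
    print(f"{enzyme}: {product}")
```

Each step takes the product of the previous one as its substrate, which is why the four enzymes act in a fixed order.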
This entry represents the C-terminal domain from all four stage 2 Mur enzymes: UDP-N-acetylmuramate-L-alanine ligase (MurC), UDP-N-acetylmuramoylalanine-D-glutamate ligase (MurD), UDP-N-acetylmuramoylalanyl-D-glutamate-2,6-diaminopimelate ligase (MurE), and UDP-N-acetylmuramoyl-tripeptide-D-alanyl-D-alanine ligase (MurF). This entry also includes folylpolyglutamate synthase that transfers glutamate to folylpolyglutamate and cyanophycin synthetase that catalyses the biosynthesis of the cyanobacterial reserve material multi-L-arginyl-poly-L-aspartate (cyanophycin).
The alpha-D-phosphohexomutase superfamily is composed of four related enzymes, each of which catalyses a phosphoryl transfer on its sugar substrate: phosphoglucomutase (PGM), phosphoglucomutase/phosphomannomutase (PGM/PMM), phosphoglucosamine mutase (PNGM), and phosphoacetylglucosamine mutase (PAGM). PGM converts D-glucose 1-phosphate into D-glucose 6-phosphate, and participates in both the breakdown and synthesis of glucose. PGM/PMM are primarily bacterial enzymes that use either glucose or mannose as substrate, participating in the biosynthesis of a variety of carbohydrates such as lipopolysaccharides and alginate. Both PNGM and PAGM are involved in the biosynthesis of UDP-N-acetylglucosamine.
Despite differences in substrate specificity, these enzymes share a similar catalytic mechanism, converting 1-phospho-sugars to 6-phospho-sugars via a bisphosphorylated 1,6-phospho-sugar. The active enzyme is phosphorylated at a conserved serine residue and binds one magnesium ion; residues around the active site serine are well conserved among family members. The reaction mechanism involves phosphoryl transfer from the phosphoserine to the substrate to create a bisphosphorylated sugar, followed by a phosphoryl transfer from the substrate back to the enzyme.
The structures of PGM and PGM/PMM have been determined, and were found to be very similar in topology. These enzymes are both composed of four domains and a large central active site cleft, where each domain contains residues essential for catalysis and/or substrate recognition. Domain I contains the catalytic phosphoserine, domain II contains a metal-binding loop to coordinate the magnesium ion, domain III contains the sugar-binding loop that recognises the two different binding orientations of the 1- and 6-phospho-sugars, and domain IV contains a phosphate-binding site required for orienting the incoming phospho-sugar substrate.
This entry represents domains I, II and III found in alpha-D-phosphohexomutase enzymes. All three domains share a 3-layer alpha/beta/alpha topology.
Prokaryotes contain a single DNA-dependent RNA polymerase (RNAP) that is responsible for the transcription of all genes, while eukaryotes have three classes of RNAPs (I-III) that transcribe different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. Certain subunits of RNAPs, including RPB5 (POLR2E in mammals), are common to all three eukaryotic polymerases. RPB5 plays a role in the transcription activation process. Eukaryotic RPB5 has a bipartite structure consisting of a unique N-terminal region, plus a C-terminal region that is structurally homologous to the prokaryotic RPB5 homologue, subunit H (gene rpoH).
This entry represents the N-terminal domain of eukaryotic RPB5, which has a core structure consisting of 3 layers alpha/beta/alpha. The N-terminal domain is involved in DNA binding and is part of the jaw module in the RNA pol II structure. This module is important for positioning the downstream DNA.
This entry represents a structural motif found in three types of endonucleases: TnsA endonuclease (N-terminal), Hjc-type resolvase, and tRNA-intron endonuclease (C-terminal). These domains have a 3-layer alpha/beta/alpha topology, which is similar in structure to a motif found in several restriction endonucleases.
TnsA endonuclease is a catalytic component of the Tn7 transposition system. Tn7 transposase is composed of four proteins: TnsA, TnsB, TnsC and TnsD. DNA breakage at the 5' end of the transposon is carried out by TnsA, and breakage and joining at the 3' end is carried out by TnsB. TnsC is the molecular switch that regulates transposition. The N-terminal domain of TnsA is catalytic.
Hjc is a type of Holliday junction resolvase. The Holliday junction is an essential intermediate of homologous recombination, comprising four-stranded DNA complexes that are formed during recombination and related DNA repair events. During homologous recombination, genetic information is physically exchanged between parental DNAs via crossing single strands of the same polarity within the four-way Holliday structure. This process is terminated by the endonucleolytic activity of resolvases, which convert the four-way DNA back to two double strands. Hjc is an archaeal endonuclease that specifically resolves the junction DNA to produce two separate recombinant DNA duplexes.
tRNA-intron endonucleases cleave pre-tRNA producing 5'-hydroxyl and 2',3'-cyclic phosphate termini, and specifically removing the intron. The splicing of transfer RNA precursors is similar in Eucarya and Archaea. In both kingdoms an endonuclease recognises the splice sites and releases the intron, but the mechanism of splice site recognition is different in each kingdom.
Pyruvate kinase (PK) catalyses the final step in glycolysis, the conversion of phosphoenolpyruvate to pyruvate with concomitant phosphorylation of ADP to ATP:
ADP + phosphoenolpyruvate = ATP + pyruvate
The enzyme, which is found in all living organisms, requires both magnesium and potassium ions for its activity. In vertebrates, there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart and brain), and M2 (early foetal tissue). In plants, PK exists as cytoplasmic and plastid isozymes, while most bacteria and lower eukaryotes have one form; certain bacteria, such as Escherichia coli, have two isozymes. All isozymes appear to be tetramers of identical subunits of ~500 residues.
PK helps control the rate of glycolysis, along with phosphofructokinase and hexokinase. PK possesses allosteric sites for numerous effectors, yet the isozymes respond differently, in keeping with their different tissue distributions. The activity of L-type (liver) PK is increased by fructose-1,6-bisphosphate (F1,6BP) and lowered by ATP and alanine (a gluconeogenic precursor); therefore, when glucose levels are high, glycolysis is promoted, and when levels are low, gluconeogenesis is promoted. L-type PK is also hormonally regulated, being activated by insulin and inhibited by glucagon, which covalently modifies the PK enzyme. M1-type (muscle, brain) PK is inhibited by ATP, but F1,6BP and alanine have no effect, which correlates with the function of muscle and brain, as opposed to the liver.
The structures of several pyruvate kinases from various organisms have been determined. The protein comprises three or four domains: a small N-terminal helical domain (absent in bacterial PK), a beta/alpha-barrel domain, a beta-barrel domain (inserted within the beta/alpha-barrel domain), and a 3-layer alpha/beta/alpha sandwich domain.
This entry represents the 3-layer alpha/beta/alpha sandwich domain.
Rhodanese, a sulphurtransferase involved in cyanide detoxification, shares an evolutionary relationship with a large family of proteins.
Rhodanese has an internal duplication. This domain is found as a single copy in other proteins, including phosphatases and ubiquitin C-terminal hydrolases.
Several biological processes regulate the activity of target proteins through changes in the redox state of thiol groups (S2 to SH2), where a hydrogen donor is linked to an intermediary disulphide protein. Such processes include the ferredoxin/thioredoxin system, the NADP/thioredoxin system, and the glutathione/glutaredoxin system. Several of these disulphide proteins share a common structure, consisting of a three-layer alpha/beta/alpha core. Proteins that contain domains with a thioredoxin fold include:
Phosphoenolpyruvate carboxykinase (PEPCK) catalyses the first committed (rate-limiting) step in hepatic gluconeogenesis, namely the reversible decarboxylation of oxaloacetate to phosphoenolpyruvate (PEP) and carbon dioxide, using either ATP or GTP as a source of phosphate. The ATP-utilising and GTP-utilising enzymes form two divergent subfamilies, which have little sequence similarity but which retain conserved active site residues. ATP-utilising PEPCKs are monomers or oligomers of identical subunits found in certain bacteria, yeast, trypanosomatids, and plants, while GTP-utilising PEPCKs are mainly monomers found in animals and some bacteria. Both require divalent cations for activity, such as magnesium or manganese. One cation interacts with the enzyme at metal binding site 1 to elicit activation, while the second cation interacts at metal binding site 2 to serve as a metal-nucleotide substrate. In bacteria, fungi and plants, PEPCK is involved in the glyoxylate bypass, an alternative to the tricarboxylic acid cycle.
PEPCK helps to regulate blood glucose levels. The rate of gluconeogenesis can be controlled through transcriptional regulation of the PEPCK gene by cAMP (the mediator of glucagon and catecholamines), glucocorticoids and insulin. In general, PEPCK expression is induced by glucagon, catecholamines and glucocorticoids during periods of fasting and in response to stress, but is inhibited by (glucose-induced) insulin upon feeding. With type II diabetes, this regulation system can fail, resulting in increased gluconeogenesis that in turn raises glucose levels.
PEPCK consists of an N-terminal and a catalytic C-terminal domain, with the active site and metal ions located in a cleft between them. Both domains have an alpha/beta topology that is partly similar to one another. Substrate binding causes PEPCK to undergo a conformational change, which accelerates catalysis by forcing bulk solvent molecules out of the active site. PEPCK uses an alpha/beta/alpha motif for nucleotide binding; this motif differs from other kinase domains. GTP-utilising PEPCK has a PEP-binding domain and two kinase motifs to bind GTP and magnesium.
This entry represents the N-terminal domain found in both GTP-utilising and ATP-utilising phosphoenolpyruvate carboxykinase enzymes.
This entry represents a subgroup of thiolase-like domains (missing a few subfamilies). These domains have a 3-layer structure with an alpha/beta/alpha topology. This domain usually occurs in two similar copies that are related by a pseudo-dyad, and which arose through duplication. The proteins in this entry can be split into two groups: those related to thiolase, and those related to chalcone synthase. The thiolase-like enzymes include:
The chalcone synthase-like enzymes include:
This entry represents various uracil-DNA glycosylases and related DNA glycosylases, such as uracil-DNA glycosylase, thermophilic uracil-DNA glycosylase, G:T/U mismatch-specific DNA glycosylase (Mug), and single-strand selective monofunctional uracil-DNA glycosylase (SMUG1). These proteins have a 3-layer alpha/beta/alpha structure. Uracil-DNA glycosylases are DNA repair enzymes that excise uracil residues from DNA by cleaving the N-glycosylic bond, initiating the base excision repair pathway. Uracil in DNA can arise either through the deamination of cytosine to form mutagenic U:G mispairs, or through the incorporation of dUMP by DNA polymerase to form U:A pairs. These aberrant uracil residues are genotoxic. The sequence of uracil-DNA glycosylase is extremely well conserved in bacteria and eukaryotes as well as in herpes viruses. More distantly related uracil-DNA glycosylases are also found in poxviruses. In eukaryotic cells, UNG activity is found in both the nucleus and the mitochondria. Human UNG1 protein is transported to both the mitochondria and the nucleus. The N-terminal 77 amino acids of UNG1 seem to be required for mitochondrial localization, but the presence of a mitochondrial transit peptide has not been directly demonstrated. The most N-terminal conserved region contains an aspartic acid residue which has been proposed, based on X-ray structures, to act as a general base in the catalytic mechanism.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein L9 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L9 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities.
The crystal structure of Bacillus stearothermophilus L9 shows the 149-residue protein comprises two globular domains connected by a rigid linker. Each domain contains an rRNA binding site, and the protein functions as a structural protein in the large subunit of the ribosome. The C-terminal domain consists of two loops, an alpha-helix and a three-stranded mixed parallel, anti-parallel beta-sheet packed against the central alpha-helix. The long central alpha-helix is exposed to solvent in the middle and participates in the hydrophobic cores of the two domains at both ends.
This entry represents a structural motif found in several DNA repair nucleases, such as Rad1/Mus81/XPF endonucleases, and in ATP-dependent helicases. The XPF/Rad1/Mus81-dependent nuclease family specifically cleaves branched structures generated during DNA repair, replication, and recombination, and is essential for maintaining genome stability. The nuclease domain architecture exhibits remarkable similarity to those of restriction endonucleases.
This entry represents the C-terminal domain found in DNA/pantothenate metabolism flavoproteins, which affects synthesis of DNA and pantothenate metabolism. These proteins contain ATP, phosphopantothenate, and cysteine binding sites. The structure of this domain has been determined in human phosphopantothenoylcysteine (PPC) synthetase and as the PPC synthase domain (CoaB) from the Escherichia coli coenzyme A bifunctional protein CoaBC. This domain adopts a 3-layer alpha/beta/alpha fold with mixed beta-sheets, which topologically resembles a combination of Rossmann-like and ribokinase-like folds. The structure of these proteins predicts a ping pong mechanism with initial formation of an acyladenylate intermediate, followed by release of pyrophosphate and attack by cysteine to form the final products PPC and AMP.
Phosphoglycerate kinase (PGK) is an enzyme that catalyses the reversible transfer of a phosphate group from 1,3-diphosphoglycerate to ADP, forming 3-phosphoglycerate and ATP. In the second step of the second phase of glycolysis, 1,3-diphosphoglycerate is converted to 3-phosphoglycerate, forming one molecule of ATP. In the reverse reaction, one molecule of ADP is formed. This reaction is essential in most cells for the generation of ATP in aerobes, for fermentation in anaerobes and for carbon fixation in plants.
PGK is found in all living organisms and its sequence has been highly conserved throughout evolution. The enzyme exists as a monomer containing two nearly equal-sized domains that correspond to the N- and C-termini of the protein (the last 15 C-terminal residues loop back into the N-terminal domain). 3-phosphoglycerate (3-PG) binds to the N-terminal domain, while the nucleotide substrates, MgATP or MgADP, bind to the C-terminal domain of the enzyme. This extended two-domain structure is associated with large-scale 'hinge-bending' conformational changes, similar to those found in hexokinase. At the core of each domain is a 6-stranded parallel beta-sheet surrounded by alpha helices. Domain 1 has a parallel beta-sheet of six strands with an order of 342156, while domain 2 has a parallel beta-sheet of six strands with an order of 321456. Analysis of the reversible unfolding of yeast phosphoglycerate kinase leads to the conclusion that the two lobes are capable of folding independently, consistent with the presence of intermediates on the folding pathway with a single domain folded.
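The strand orders quoted above can be unpacked with a short Python sketch. It assumes the usual convention that each digit is a strand's number along the sequence and that the digits are listed in spatial order across the sheet; the function name `strand_neighbours` is hypothetical.

```python
def strand_neighbours(order):
    """Map each strand number to the strands lying next to it in the sheet."""
    strands = [int(ch) for ch in order]
    neighbours = {}
    for i, strand in enumerate(strands):
        adjacent = []
        if i > 0:
            adjacent.append(strands[i - 1])
        if i < len(strands) - 1:
            adjacent.append(strands[i + 1])
        neighbours[strand] = adjacent
    return neighbours


domain1 = strand_neighbours("342156")  # PGK domain 1
domain2 = strand_neighbours("321456")  # PGK domain 2

# Strand 1 sits between different partners in the two domains.
print(domain1[1], domain2[1])  # [2, 5] [2, 4]
```

Under this reading, both sheets contain the same six strands but differ in which strands pack side by side, which is the information the two order strings encode.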
Phosphoglycerate kinase (PGK) deficiency is associated with haemolytic anaemia and mental disorders in man.
This entry represents the N-terminal domain of PGK.
This entry represents the C-terminal domain of PGK.
This domain is found in all tubulin chains, as well as the bacterial FtsZ family of proteins. These proteins are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of the cell membrane and cell envelope to yield two daughter cells. FtsZ and tubulin are GTPases, and this entry represents the GTPase domain. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea.
The trifunctional glycinamide ribonucleotide synthetase-aminoimidazole ribonucleotide synthetase-glycinamide ribonucleotide transformylase catalyses the second, third and fifth steps in de novo purine biosynthesis. The glycinamide ribonucleotide transformylase belongs to this group.
This entry represents the substrate-binding domain of glutathione synthetase (GSS), a homodimeric enzyme that catalyses the conversion of gamma-L-glutamyl-L-cysteine and glycine to phosphate and glutathione in the presence of ATP. This is the second step in glutathione biosynthesis, the first step being catalysed by gamma-glutamylcysteine synthetase. In humans, defects in GSS are inherited in an autosomal recessive way and are the cause of severe metabolic acidosis, 5-oxoprolinuria, and increased rate of haemolysis and defective function of the central nervous system. The substrate-binding domain has a 3-layer alpha/beta/alpha structure.
The ATP-grasp fold is one of several distinct ATP-binding folds, and is found in enzymes that catalyse the formation of amide bonds through the ATP-dependent ligation of a carboxylate-containing molecule to an amino or thiol group-containing molecule. This fold is found in many different enzyme families, including various peptide synthetases, biotin carboxylase, synapsin, succinyl-CoA synthetase, pyruvate phosphate dikinase, and glutathione synthetase, amongst others. These enzymes contribute predominantly to macromolecular synthesis, using ATP hydrolysis to activate their substrates.
This entry represents the pre-ATP-grasp domain, which precedes the ATP-grasp domain in all superfamily members, and which usually occurs at the N-terminus of the protein. The structure of the pre-ATP-grasp domain consists of alpha/beta/alpha in three layers, and is possibly a rudimentary form of the Rossmann fold. This domain can have a substrate-binding function.
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
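The relationship between linear triad order and clan can be expressed as a small lookup, sketched here in Python. The function names are hypothetical, the table contains only the three orders named above, and the residue numbers in the usage lines are the commonly cited positions for chymotrypsin and subtilisin, included purely for illustration. A matching order is a hint at clan membership, not proof of it.

```python
# Linear order of the catalytic residues -> clan, as listed in the text.
CLAN_BY_TRIAD_ORDER = {"HDS": "PA", "DHS": "SB", "SDH": "SC"}


def triad_order(positions):
    """Return the triad letters sorted by sequence position, e.g. 'HDS'.

    positions maps a one-letter residue code (S, D or H) to its
    position in the sequence.
    """
    return "".join(sorted(positions, key=positions.get))


def guess_clan(positions):
    """Suggest a clan from the triad order, or None if the order is not listed."""
    return CLAN_BY_TRIAD_ORDER.get(triad_order(positions))


# Chymotrypsin-like numbering: His57, Asp102, Ser195.
print(guess_clan({"S": 195, "H": 57, "D": 102}))  # PA
# Subtilisin-like numbering: Asp32, His64, Ser221.
print(guess_clan({"D": 32, "H": 64, "S": 221}))   # SB
```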
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of serine peptidases belongs to the MEROPS peptidase families S8 (subfamilies S8A (subtilisin) and S8B (kexin)) and S53 (sedolisin), both of which are members of clan SB.
The subtilisin family is the second largest serine protease family characterised to date. Over 200 subtilases are presently known, and the complete amino acid sequence is available for more than 170 of them. The family is widespread, being found in eubacteria, archaebacteria, eukaryotes and viruses. The vast majority of the family are endopeptidases, although there is an exopeptidase, tripeptidyl peptidase. Structures have been determined for several members of the subtilisin family: they exploit the same catalytic triad as the chymotrypsins, although the residues occur in a different order (HDS in chymotrypsin and DHS in subtilisin), but the structures show no other similarity. Some subtilisins are mosaic proteins, and others contain N- and C-terminal extensions that show no sequence similarity to any other known protein. Based on sequence homology, a subdivision into six families has been proposed.
The proprotein-processing endopeptidases kexin, furin and related enzymes form a distinct subfamily known as the kexin subfamily (S8B). These preferentially cleave C-terminally to paired basic amino acids. Members of this subfamily can be identified by subtly different motifs around the active site. Members of the kexin family, along with endopeptidases R, T and K from the yeast Tritirachium and cuticle-degrading peptidase from Metarhizium, require thiol activation. This can be attributed to the presence of Cys-173 near to the active histidine. Only 1 viral member of the subtilisin family is known, a 56-kDa protease from herpes virus 1, which infects the channel catfish.
Sedolisins (serine-carboxyl peptidases) are proteolytic enzymes whose fold resembles that of subtilisin; however, they are considerably larger, with the mature catalytic domains containing approximately 375 amino acids. The defining features of these enzymes are a unique catalytic triad, Ser-Glu-Asp, as well as the presence of an aspartic acid residue in the oxyanion hole. High-resolution crystal structures have now been solved for sedolisin from Pseudomonas sp. 101, as well as for kumamolisin from a thermophilic bacterium, Bacillus sp. MN-32. Mutations in the human gene lead to a fatal neurodegenerative disease.
This entry represents a structural domain consisting of 3 layers, alpha/beta/alpha. This domain is found in both the alpha and beta chains of succinyl-CoA synthase (both the GDP-forming and the ADP-forming enzymes). This domain can also be found in ATP citrate synthase, malate-CoA ligase and acetate-CoA ligase (or acetyl-CoA synthase), as well as bacterial Fdr. Some members of the domain utilise ATP, while others use GTP.
This entry represents domains related by a common ancestor that have a Rossmann-like, 3-layer, alpha/beta/alpha sandwich fold, as found in the protein families listed below:
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils.
Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions (differs between eukaryotic and bacterial enzymes), domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA.
Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.
This entry represents the alpha-beta domain of subunit B (gyrB and parE) of bacterial gyrase and topoisomerase IV, and the equivalent N-terminal region in eukaryotic topoisomerase II composed of a single polypeptide.
This entry represents NAD- and NADP-binding domains with a core Rossmann-type fold, which consists of 3-layers alpha/beta/alpha, where the six beta strands are parallel in the order 321456. Many different enzymes contain an NAD/NADP-binding domain, including:
Ribosomal protein L1 is the largest protein from the large ribosomal subunit. The L1 protein contains two domains: a 2-layer alpha/beta domain and a 3-layer alpha/beta domain (which interrupts the first domain). This entry represents the 3-layer domain.
In Escherichia coli, L1 is known to bind to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:
Transketolase C-terminal-like domains can be found in a number of different enzymes, including the C-terminal domain of the pyruvate dehydrogenase E1 component, the C-terminal domain of branched-chain alpha-keto acid dehydrogenases, and domain II of pyruvate-ferredoxin oxidoreductase (PFOR). Structural studies reveal this domain to comprise three layers, alpha/beta/alpha. The mixed beta sheet consists of five strands in the order 13245, where strand 1 is antiparallel to the others.
Other members of the family are transfer proteins, which include a guanine nucleotide exchange factor that may function as an effector of RAC1, a phosphatidylinositol/phosphatidylcholine transfer protein that is required for the transport of secretory proteins from the Golgi complex, and an alpha-tocopherol transfer protein that enhances the transfer of the ligand between separate membranes.
Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.
This group of cysteine peptidases belongs to the MEROPS peptidase family C12 (ubiquitin C-terminal hydrolase family, clan CA). Families within the CA clan are loosely termed papain-like, as the protein fold of the peptidase unit resembles that of papain, the type example for clan CA. The type example of family C12 is the human ubiquitin C-terminal hydrolase UCH-L1.
Ubiquitin is highly conserved, commonly found conjugated to proteins in eukaryotic cells, where it may act as a marker for rapid degradation, or it may have a chaperone function in protein assembly. The ubiquitin is released by cleavage from the bound protein by a protease. A number of deubiquitinising proteases are known: all are activated by thiol compounds, and inhibited by thiol-blocking agents and ubiquitin aldehyde, and as such have the properties of cysteine proteases.
The deubiquitinising proteases can be split into 2 size ranges (20-30 kDa and 100-200 kDa): this family comprises the 20-30 kDa polypeptides, which include the yeast yuh1. Yeast yuh1 protease is known to be active only against small ubiquitin conjugates, being inactive against conjugated beta-galactosidase. A mammalian homologue, UCH (ubiquitin conjugate hydrolase), is one of the most abundant proteins in the brain. Only one conserved cysteine can be identified, along with two conserved histidines. The spacing between the cysteine and the second histidine is thought to be more representative of the cysteine/histidine spacing of a cysteine protease catalytic dyad.
This entry represents a structural domain found in several acyl-CoA acyltransferase enzymes. This domain has a 3-layer alpha/beta/alpha structure that contains mixed beta-sheets, and can be found in the following proteins:
Several proteins carry a duplication of this domain, which consists of two NAT-like domains swapped with the C-terminal strands, including:
Pyridoxal phosphate (PLP) is the active form of vitamin B6 (pyridoxine or pyridoxal). PLP is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy.
PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the epsilon-amino group of an active site lysine residue on the enzyme. The alpha-amino group of the substrate displaces the lysine epsilon-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic.
This entry represents subdomain 1 of the major region of PLP-dependent transferases. This domain has a 3-layer alpha/beta/alpha sandwich topology. The major region can be found in the following PLP-dependent transferase families:
Isocitrate dehydrogenase (IDH) is an important enzyme of carbohydrate metabolism which catalyses the oxidative decarboxylation of isocitrate into alpha-ketoglutarate. IDH is dependent on either NAD+ or NADP+. In eukaryotes there are at least three isozymes of IDH: two are located in the mitochondrial matrix (one NAD+-dependent, the other NADP+-dependent), while the third one (also NADP+-dependent) is cytoplasmic. In Escherichia coli the activity of an NADP+-dependent form of the enzyme is controlled by the phosphorylation of a serine residue; the phosphorylated form of IDH is completely inactivated.
3-isopropylmalate dehydrogenase (IMDH) catalyses the third step in the biosynthesis of leucine in bacteria and fungi, the oxidative decarboxylation of 3-isopropylmalate into 2-oxo-4-methylvalerate. Tartrate dehydrogenase catalyses the reduction of tartrate to oxaloglycolate.
These enzymes are evolutionarily related. The best conserved region of these enzymes is a glycine-rich stretch of residues located in the C-terminal section.
The ureohydrolase superfamily includes arginase, agmatinase, formiminoglutamase and proclavaminate amidinohydrolase. These enzymes share a 3-layer alpha-beta-alpha structure, and play important roles in arginine/agmatine metabolism, the urea cycle, histidine degradation, and other pathways.
Arginase, which catalyses the conversion of arginine to urea and ornithine, is one of the five urea cycle enzymes that convert ammonia to urea as the principal product of nitrogen excretion. There are several arginase isozymes that differ in catalytic, molecular and immunological properties. Deficiency in the liver isozyme leads to argininemia, which is usually associated with hyperammonemia.
Agmatinase hydrolyses agmatine to putrescine, the precursor for the biosynthesis of higher polyamines, spermidine and spermine. In addition, agmatine may play an important regulatory role in mammals.
Formiminoglutamase catalyses the fourth step in histidine degradation, acting to hydrolyse N-formimidoyl-L-glutamate to L-glutamate and formamide.
Proclavaminate amidinohydrolase is involved in clavulanic acid biosynthesis. Clavulanic acid acts as an inhibitor of a wide range of beta-lactamase enzymes that are used by various microorganisms to resist beta-lactam antibiotics. As a result, the product of this pathway improves the effectiveness of beta-lactam antibiotics.
Kinesin is a microtubule-associated force-producing protein that may play a role in organelle transport. The kinesin motor activity is directed toward the microtubule's plus end. Kinesin is an oligomeric complex composed of two heavy chains and two light chains. The maintenance of the quaternary structure does not require interchain disulphide bonds.
The heavy chain is composed of three structural domains: a large globular N-terminal domain which is responsible for the motor activity of kinesin (it is known to hydrolyse ATP, and to bind and move on microtubules); a central alpha-helical coiled coil domain that mediates heavy chain dimerisation; and a small globular C-terminal domain which interacts with other proteins (such as the kinesin light chains), vesicles and membranous organelles.
A number of proteins have been recently found that contain a domain similar to that of the kinesin 'motor' domain:
The kinesin motor domain is located in the N-terminal part of most of the above proteins, with the exception of KAR3, klpA, and ncd where it is located in the C-terminal section.
The kinesin motor domain contains about 330 amino acids. An ATP-binding motif of type A is found near positions 80 to 90; the C-terminal half of the domain is involved in microtubule binding.
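A type A ATP-binding motif is commonly written as the P-loop (Walker A) consensus [AG]-x(4)-G-K-[ST]. The following is a minimal Python sketch of how such a motif could be located in a sequence, assuming that consensus; the function name `find_walker_a` and the toy sequence are hypothetical and are not taken from any kinesin entry.

```python
import re

# Assumed consensus for a type A ATP-binding motif (P-loop / Walker A):
# [AG]-x(4)-G-K-[ST]. The lookahead lets overlapping matches be reported.
WALKER_A = re.compile(r"(?=([AG].{4}GK[ST]))")

def find_walker_a(sequence):
    """Return (1-based start, matched text) for every motif occurrence."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in WALKER_A.finditer(seq)]

# Hypothetical toy sequence, invented for illustration only.
toy = "MADLAEVNIKVVGQTSSGKTYTMLGQ"
hits = find_walker_a(toy)
for start, text in hits:
    print(f"motif {text} starts at residue {start}")
```

In a real motor domain one would additionally check that the hit falls near the expected position (around residues 80 to 90) before treating it as the nucleotide-binding site.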
Spermidine + [eIF-5A]-lysine = 1,3-diaminopropane + [eIF-5A]-deoxyhypusine
The modified version of eIF-5A, and DS, are required for eukaryotic cell proliferation. The structure is known for this enzyme in complex with its NAD+ cofactor.
This homodimeric enzyme appears able to cleave any D-amino acid (and glycine, which does not have distinct D/L forms) from charged tRNA. The name reflects characterization with respect to D-Tyr on tRNA(Tyr) as established in the literature, but substrate specificity seems much broader.
Proteins in this entry are found in archaea, bacteria and eukaryotes. Their function is unknown, but alignment shows several conserved polar residues which are potential catalytic residues. The structure of one of these proteins has been determined and shows homology to heat shock protein 33, which is a chaperone protein that inhibits the aggregation of partially denatured proteins.
Chorismate + L-glutamine = anthranilate + pyruvate + L-glutamate
The enzyme is a tetramer comprising two component I and two component II subunits: this entry is restricted to component I, which catalyses the formation of anthranilate using ammonia rather than glutamine, while component II provides glutamine amidotransferase activity.
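Reaction equations of this kind follow a simple "substrates = products" layout with " + " between participants. As a hedged illustration, the sketch below splits such a string into its two sides; `parse_reaction` is a hypothetical helper, and it assumes that separators are always surrounded by spaces so that names such as NAD+ stay intact.

```python
def parse_reaction(equation):
    """Split 'A + B = C + D' into (substrates, products) lists.

    Assumes ' = ' separates the two sides and ' + ' separates participants.
    """
    left, sep, right = equation.partition(" = ")
    if not sep:
        raise ValueError("no ' = ' separator found")

    def split_side(side):
        return [name.strip() for name in side.split(" + ") if name.strip()]

    return split_side(left), split_side(right)

substrates, products = parse_reaction(
    "chorismate + L-glutamine = anthranilate + pyruvate + L-glutamate"
)
print(substrates)
print(products)
```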
This domain is found in protein phosphatase 2C, as well as in other proteins, e.g. [pyruvate dehydrogenase (lipoamide)]-phosphatase and adenylate cyclase.
Protein phosphatase 2C (PP2C) is one of the four major classes of mammalian serine/threonine specific protein phosphatases. PP2C is a monomeric enzyme of about 42 kDa which shows broad substrate specificity and is dependent on divalent cations (mainly manganese and magnesium) for its activity. Its exact physiological role is still unclear. Three isozymes are currently known in mammals: PP2C-alpha, -beta and -gamma. In yeast, there are at least four PP2C homologs: phosphatase PTC1, which has weak tyrosine phosphatase activity in addition to its activity on serines; phosphatases PTC2 and PTC3; and hypothetical protein YBR125c. Isozymes of PP2C are also known from Arabidopsis thaliana (ABI1, PPH1), Caenorhabditis elegans (FEM-2, F42G9.1, T23F11.1), Leishmania chagasi and Paramecium tetraurelia. In A. thaliana, the kinase associated protein phosphatase (KAPP) is an enzyme that dephosphorylates the Ser/Thr receptor-like kinase RLK5 and which contains a C-terminal PP2C domain.
PP2C does not seem to be evolutionarily related to the main family of serine/threonine phosphatases: PP1, PP2A and PP2B. However, it is significantly similar to the catalytic subunit of pyruvate dehydrogenase phosphatase (PDPC), which catalyzes dephosphorylation and concomitant reactivation of the alpha subunit of the E1 component of the pyruvate dehydrogenase complex. PDPC is a mitochondrial enzyme and, like PP2C, is magnesium-dependent.
S-adenosylmethionine decarboxylase (AdoMetDC) catalyzes the removal of the carboxylate group of S-adenosylmethionine to form S-adenosyl-5'-3-methylthiopropylamine, which then acts as the n-propylamine group donor in the synthesis of the polyamines spermidine and spermine from putrescine.
The catalytic mechanism of AdoMetDC involves a covalently-bound pyruvoyl group. This group is post-translationally generated by a self-catalyzed intramolecular proteolytic cleavage reaction between a glutamate and a serine. This cleavage generates two chains, beta (N-terminal) and alpha (C-terminal). The N-terminal serine residue of the alpha chain is then converted by nonhydrolytic serinolysis into a pyruvoyl group.
ATP + RNA 3'-terminal-phosphate = AMP + diphosphate + RNA terminal-2',3'-cyclic-phosphate
These enzymes might be responsible for production of the cyclic phosphate RNA ends that are known to be required by many RNA ligases in both prokaryotes and eukaryotes.
RNA cyclase is a protein of 36 to 42 kDa. The best conserved region is a glycine-rich stretch of residues located in the central part of the sequence, which is reminiscent of various ATP, GTP or AMP glycine-rich loops.
The crystal structure of RNA 3'-terminal phosphate cyclase shows that each molecule consists of two domains. The larger domain contains three repeats of a folding unit comprising two parallel alpha helices and a four-stranded beta sheet; this fold was previously identified in translation initiation factor 3 (IF3). The large domain is similar to one of the two domains of 5-enolpyruvylshikimate-3-phosphate synthase and UDP-N-acetylglucosamine enolpyruvyl transferase. The smaller domain uses a similar secondary structure element with different topology, observed in many other proteins such as thioredoxin. Although the active site of this enzyme could not be unambiguously assigned, it can be mapped to a region surrounding His309, an adenylate acceptor, in which a number of amino acids are highly conserved in the enzyme from different sources.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein L17 is one of the proteins from the large ribosomal subunit. Bacterial L17 is a protein of 120 to 130 amino-acid residues while yeast YmL8 is twice as large (238 residues). The N-terminal half of YmL8 is colinear with the sequence of L17 from Escherichia coli.
This entry represents a structural motif found at the C-terminal of lactate dehydrogenases and malate dehydrogenases, as well as at the C-terminal of family 4 glycoside hydrolases. These domains have an unusual fold consisting of segregated alpha-helical and beta-sheet regions, although they contain predominantly anti-parallel beta-sheets.
L-lactate dehydrogenases are metabolic enzymes that catalyse the conversion of L-lactate to pyruvate, the last step in anaerobic glycolysis. L-lactate dehydrogenase is also found as a lens crystallin in bird and crocodile eyes. Malate dehydrogenases catalyse the interconversion of malate to oxaloacetate. The enzyme participates in the citric acid cycle.
O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in 'clans'. Glycoside hydrolase family 4 comprises enzymes with several known activities: 6-phospho-beta-glucosidase, 6-phospho-alpha-glucosidase and alpha-galactosidase.
RNA polymerases catalyse the DNA-dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 1, represents the clamp domain, which is a mobile domain involved in positioning the DNA, maintenance of the transcription bubble and positioning of the nascent RNA strand.
Pyridoxal phosphate (PLP) is the active form of vitamin B6 (pyridoxine or pyridoxal). PLP is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy.
PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the epsilon-amino group of an active site lysine residue on the enzyme. The alpha-amino group of the substrate displaces the lysine epsilon-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic.
This entry represents subdomain 2 of the major region of PLP-dependent transferases. This domain has a complex alpha/beta structure. The major region can be found in the following PLP-dependent transferase families:
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein L13 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L13 is known to be one of the early assembly proteins of the 50S ribosomal subunit.
Amidase signature (AS) enzymes are a large group of hydrolytic enzymes that contain a conserved stretch of approximately 130 amino acids known as the AS sequence. They are widespread, being found in both prokaryotes and eukaryotes. AS enzymes catalyse the hydrolysis of amide bonds (CO-NH2), although the family has diverged widely with regard to substrate specificity and function. Nonetheless, these enzymes maintain a core alpha/beta/alpha structure, where the topologies of the N- and C-terminal halves are similar. AS enzymes characteristically have a highly conserved C-terminal region rich in serine and glycine residues, but devoid of aspartic acid and histidine residues; they therefore differ from classical serine hydrolases. These enzymes possess a unique, highly conserved Ser-Ser-Lys catalytic triad used for amide hydrolysis, although the catalytic mechanism for acyl-enzyme intermediate formation can differ between enzymes.
Examples of AS enzymes include:
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.
This entry represents the alpha-helical subdomain that comprises part of the catalytic core of eukaryotic and viral topoisomerase I (type IB) enzymes, which occurs near the C-terminal region of the protein.
Human topoisomerase I has been shown to be inhibited by camptothecin (CPT), a plant alkaloid with antitumour activity. The crystal structures of human topoisomerase I comprising the core and carboxyl-terminal domains in covalent and noncovalent complexes with 22-base pair DNA duplexes reveal an enzyme that "clamps" around essentially B-form DNA. The core domain and the first eight residues of the carboxyl-terminal domain of the enzyme, including the active-site nucleophile tyrosine-723, share significant structural similarity with the bacteriophage family of DNA integrases. A binding mode for the anticancer drug camptothecin has been proposed on the basis of chemical and biochemical information combined with the three-dimensional structures of topoisomerase I-DNA complexes.
Vaccinia virus, a cytoplasmically-replicating poxvirus, encodes a type I DNA topoisomerase that is biochemically similar to eukaryotic-like DNA topoisomerases I, and which has been widely studied as a model topoisomerase. It is the smallest topoisomerase known and is unusual in that it is resistant to the potent chemotherapeutic agent camptothecin. The crystal structure of an amino-terminal fragment of vaccinia virus DNA topoisomerase I shows that the fragment forms a five-stranded, antiparallel beta-sheet with two short alpha-helices and connecting loops. Residues that are conserved between all eukaryotic-like type I topoisomerases are not clustered in particular regions of the structure.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
The bacterial cell wall provides strength and rigidity to counteract internal osmotic pressure, and protection against the environment. The peptidoglycan layer gives the cell wall its strength, and helps maintain the overall shape of the cell. The basic peptidoglycan structure of both Gram-positive and Gram-negative bacteria consists of a sheet of glycan chains connected by short cross-linking polypeptides. Biosynthesis of peptidoglycan is a multi-step (11-12 steps) process comprising three main stages:
Stage two involves four key Mur ligase enzymes: MurC, MurD, MurE and MurF. These four Mur ligases are responsible for the successive additions of L-alanine, D-glutamate, meso-diaminopimelate or L-lysine, and D-alanyl-D-alanine to UDP-N-acetylmuramic acid. All four Mur ligases are topologically similar to one another, even though they display low sequence identity. They are each composed of three domains: an N-terminal Rossmann-fold domain responsible for binding the UDP-MurNAc substrate; a central domain (similar to ATP-binding domains of several ATPases and GTPases); and a C-terminal domain (similar to the dihydrofolate reductase fold) that appears to be associated with binding the incoming amino acid. The conserved sequence motifs found in the four Mur enzymes also map to other members of the Mur ligase family, including folylpolyglutamate synthetase, cyanophycin synthetase and the capB enzyme from Bacillales.
This entry represents the C-terminal domain from all four stage 2 Mur enzymes: UDP-N-acetylmuramate-L-alanine ligase (MurC), UDP-N-acetylmuramoylalanine-D-glutamate ligase (MurD), UDP-N-acetylmuramoylalanyl-D-glutamate-2,6-diaminopimelate ligase (MurE), and UDP-N-acetylmuramoyl-tripeptide-D-alanyl-D-alanine ligase (MurF). This entry also includes the C-terminal domain of folylpolyglutamate synthase that transfers glutamate to folylpolyglutamate and cyanophycin synthetase that catalyses the biosynthesis of the cyanobacterial reserve material multi-L-arginyl-poly-L-aspartate (cyanophycin).
The C-terminal domain is almost always associated with the N-terminal domain of the cytoplasmic peptidoglycan synthetases.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils.
Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions (differs between eukaryotic and bacterial enzymes), domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA.
Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.
This entry represents the alpha-beta domain of subunit A (gyrA and parC) of bacterial gyrase and topoisomerase IV, and the equivalent C-terminal region in eukaryotic topoisomerase II composed of a single polypeptide.
In prokaryotes, the nucleotide exchange factor GrpE and the chaperone DnaJ are required for nucleotide binding of the molecular chaperone DnaK. The DnaK reaction cycle involves rapid peptide binding and release, which is dependent upon nucleotide binding. DnaJ accelerates the hydrolysis of ATP by DnaK, which enables the ADP-bound DnaK to tightly bind peptide. GrpE catalyses the release of ADP from DnaK, which is required for peptide release. In eukaryotes, GrpE is essential for mitochondrial Hsp70 function; however, the cytosolic Hsp70 homologues are GrpE-independent.
GrpE binds as a homodimer to the ATPase domain of DnaK, and may interact with the peptide-binding domain of DnaK. GrpE accomplishes nucleotide exchange by opening the nucleotide-binding cleft of DnaK. GrpE is composed of two domains: the N-terminal coiled-coil domain, which may facilitate peptide release, and the C-terminal head domain, which forms part of the contact surface with the ATPase domain of DnaK. This entry represents the N-terminal coiled-coil domain.
Phosphoenolpyruvate carboxykinase (PEPCK) catalyses the first committed (rate-limiting) step in hepatic gluconeogenesis, namely the reversible decarboxylation of oxaloacetate to phosphoenolpyruvate (PEP) and carbon dioxide, using either ATP or GTP as a source of phosphate. The ATP-utilising and GTP-utilising enzymes form two divergent subfamilies, which have little sequence similarity but which retain conserved active site residues. ATP-utilising PEPCKs are monomers or oligomers of identical subunits found in certain bacteria, yeast, trypanosomatids, and plants, while GTP-utilising PEPCKs are mainly monomers found in animals and some bacteria. Both require divalent cations for activity, such as magnesium or manganese. One cation interacts with the enzyme at metal binding site 1 to elicit activation, while the second cation interacts at metal binding site 2 to serve as a metal-nucleotide substrate. In bacteria, fungi and plants, PEPCK is involved in the glyoxylate bypass, an alternative to the tricarboxylic acid cycle.
PEPCK helps to regulate blood glucose levels. The rate of gluconeogenesis can be controlled through transcriptional regulation of the PEPCK gene by cAMP (the mediator of glucagon and catecholamines), glucocorticoids and insulin. In general, PEPCK expression is induced by glucagon, catecholamines and glucocorticoids during periods of fasting and in response to stress, but is inhibited by (glucose-induced) insulin upon feeding. With type II diabetes, this regulation system can fail, resulting in increased gluconeogenesis that in turn raises glucose levels.
PEPCK consists of an N-terminal and a catalytic C-terminal domain, with the active site and metal ions located in a cleft between them. Both domains have an alpha/beta topology that is partly similar to one another. Substrate binding causes PEPCK to undergo a conformational change, which accelerates catalysis by forcing bulk solvent molecules out of the active site. PEPCK uses an alpha/beta/alpha motif for nucleotide binding; this motif differs from those of other kinase domains. GTP-utilising PEPCK has a PEP-binding domain and two kinase motifs to bind GTP and magnesium.
This entry represents the C-terminal domain found in both GTP-utilising and ATP-utilising phosphoenolpyruvate carboxykinase enzymes.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
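The difference between the loose HEXXH motif and the more stringent 'abXHEbbHbc' definition can be sketched as two regular expressions. This is a minimal illustration under stated assumptions: 'b' (uncharged) is taken to exclude D, E, K and R; 'c' (hydrophobic) is taken as A, F, I, L, M, V, W or Y; proline is excluded at every position; and 'a' is left unrestricted apart from proline because the text says only that it is "most often" valine or threonine. The function name and the toy sequences are hypothetical.

```python
import re

# Residue classes are assumptions made for this sketch (see lead-in).
B = "[^DEKRP]"        # 'b': uncharged, and never proline
C = "[AFILMVWY]"      # 'c': hydrophobic
A_OR_X = "[^P]"       # 'a' and 'X': any residue except proline

# Lookaheads allow overlapping occurrences to be reported.
LOOSE = re.compile(r"(?=(HE..H))")
STRICT = re.compile("(?=(" + A_OR_X + B + A_OR_X + "HE" + B + B + "H" + B + C + "))")

def scan_hexxh(sequence):
    """Return 1-based starts and matched text for both motif definitions."""
    seq = sequence.upper()
    return {
        "loose": [(m.start() + 1, m.group(1)) for m in LOOSE.finditer(seq)],
        "strict": [(m.start() + 1, m.group(1)) for m in STRICT.finditer(seq)],
    }

# Invented toy sequences: the first satisfies both definitions,
# the second only the loose HEXXH pattern.
print(scan_hexxh("GSVVAHELTHAVTDY"))
print(scan_hexxh("GSKDAHEKRHDPT"))
```

The stricter pattern discards HEXXH occurrences that sit in a charged or proline-containing context, which is the practical reason for preferring it when screening for candidate metalloproteases.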
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This entry contains proteins that belong to MEROPS peptidase family M24 (clan MG), which share a common structural fold, the "pita-bread" fold. The fold contains both alpha helices and an anti-parallel beta sheet within two structurally similar domains that are thought to be derived from an ancient gene duplication. The active site, where conserved, is located between the two domains. The fold is common to methionine aminopeptidase, aminopeptidase P, prolidase, agropine synthase and creatinase. Though many of these peptidases require a divalent cation, creatinase is not a metal-dependent enzyme.
The entry also contains proteins that have lost catalytic activity, for example Spt16, which is a component of the FACT complex. The crystal structure of the N-terminal domain of Spt16, determined to 2.1 Å, reveals an aminopeptidase P fold whose enzymatic activity has been lost. This fold binds directly to histones H3-H4 through an interaction with their globular core domains, as well as with their N-terminal tails.
The FACT complex is a stable heterodimer in Saccharomyces cerevisiae (Baker's yeast) comprising Spt16p and Pob3p. The complex plays a role in transcription initiation and promotes binding of TATA-binding protein (TBP) to a TATA box in chromatin; it also facilitates RNA Polymerase II transcription elongation through nucleosomes by destabilizing and then reassembling nucleosome structure.
The PEBP (PhosphatidylEthanolamine-Binding Protein) family is a highly conserved group of proteins that have been identified in numerous tissues in a wide variety of organisms, including bacteria, yeast, nematodes, plants, Drosophila and mammals. The various functions described for members of this family include lipid binding, neuronal development, serine protease inhibition, the control of the morphological switch between shoot growth and flower structures, and the regulation of several signalling pathways such as the MAP kinase pathway and the NF-kappaB pathway. The control of the latter two pathways involves the PEBP protein RKIP, which interacts with MEK and Raf-1 to inhibit the MAP kinase pathway, and with TAK1, NIK, IKKalpha and IKKbeta to inhibit the NF-kappaB pathway. Other PEBP-like proteins that show strong structural homology to PEBP include Escherichia coli YBHB and YBCL, the Rattus norvegicus (Rat) neuropeptide HCNP, and Antirrhinum majus (Garden snapdragon) protein centroradialis (CEN).
Structures have been determined for several members of the PEBP-like family, all of which show extensive fold conservation. The structure consists of a large central beta-sheet flanked by a smaller beta-sheet on one side, and an alpha helix on the other. Sequence alignments show two conserved central regions, CR1 and CR2, that form a consensus signature for the PEBP family. These two regions form part of the ligand-binding site, which can accommodate various anionic groups. The N- and C-terminal regions are the least conserved, and may be involved in interactions with different protein partners. The N-terminal residues 2-12 form the natural cleavage peptide HCNP involved in neuronal development. The C-terminal region is deleted in plant and bacterial PEBP homologues, and may help control accessibility to the active site.
Peptide deformylase (PDF) is an essential metalloenzyme required for the removal of the formyl group at the N-terminus of nascent polypeptide chains in eubacteria. The enzyme acts as a monomer and binds a single zinc ion, catalysing the reaction:
N-formyl-L-methionine + H2O = formate + methionyl peptide
Catalytic efficiency strongly depends on the identity of the bound metal.
The structure of these enzymes is known. PDF, a member of the zinc metalloprotease family, comprises an active core domain of 147 residues and a C-terminal tail of 21 residues. The 3D fold of the catalytic core has been determined by X-ray crystallography and NMR. Overall, the structure contains a series of anti-parallel beta-strands that surround two perpendicular alpha-helices. The C-terminal helix contains the characteristic HEXXH motif of metalloenzymes, which is crucial for activity. The helical arrangement, and the way the histidine residues bind the zinc ion, is reminiscent of other metalloproteases, such as thermolysin or metzincins. However, the arrangement of secondary and tertiary structures of PDF, and the positioning of its third zinc ligand (a cysteine residue), are quite different. These discrepancies, together with notable biochemical differences, suggest that PDF constitutes a new class of zinc-metalloproteases.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein L22 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L22 is known to bind 23S rRNA. It belongs to a family of ribosomal proteins which includes: bacterial L22; algal and plant chloroplast L22 (in legumes L22 is encoded in the nucleus instead of the chloroplast); cyanelle L22; archaebacterial L22; mammalian L17; plant L17 and yeast YL17.
The generic name 'NUDIX hydrolases' (NUcleoside DIphosphate linked to some other moiety X) has been coined for this domain family. The family can be divided into a number of subgroups, of which MutT anti-mutagenic activity represents only one type; most of the rest hydrolyse diverse nucleoside diphosphate derivatives (including ADP-ribose, GDP-mannose, TDP-glucose, NADH, UDP-sugars, dNTP and NTP).
Inorganic pyrophosphatase (PPase) is the enzyme responsible for the hydrolysis of pyrophosphate (PPi), which is formed principally as the product of the many biosynthetic reactions that utilise ATP. All known PPases require the presence of divalent metal cations, with magnesium conferring the highest activity. Among other residues, a lysine has been postulated to be part of or close to the active site. PPases have been sequenced from bacteria such as Escherichia coli (homohexamer), Bacillus PS3 (Thermophilic bacterium PS-3) and Thermus thermophilus, from the archaebacterium Thermoplasma acidophilum, from fungi (homodimer), from a plant, and from bovine retina. In yeast, a mitochondrial isoform of PPase has been characterised which seems to be involved in energy production and whose activity is stimulated by uncouplers of ATP synthesis.
The sequences of PPases share some regions of similarity, among which is a region that contains three conserved aspartates that are involved in the binding of cations.
This entry represents a structural domain consisting of segregated alpha and beta regions in 3-layers. Homologous domains with this structure are found in:
DHBP synthase RibB catalyses the conversion of D-ribulose 5-phosphate to formate and 3,4-dihydroxy-2-butanone 4-phosphate, the latter serving as the biosynthetic precursor for the xylene ring of riboflavin. In Photobacterium leiognathi, the riboflavin synthesis genes ribB (DHBP synthase), ribE (riboflavin synthase), ribH (lumazine synthase) and ribA (GTP cyclohydrolase II) all reside in the lux operon. RibB is sometimes found as a bifunctional enzyme with GTP cyclohydrolase II that catalyses the first committed step in the biosynthesis of riboflavin. No sequences with significant homology to DHBP synthase are found in the metazoa.
The YrdC family of hypothetical proteins is widely distributed in eukaryotes and prokaryotes, and its members occur as: (i) independent proteins, (ii) proteins with C-terminal extensions, and (iii) domains in larger proteins, some of which are implicated in regulation. YrdC from Escherichia coli preferentially binds to double-stranded RNA and DNA. YrdC is predicted to be an rRNA maturation factor, as deletions in its gene lead to immature ribosomal 30S subunits and, consequently, fewer translating ribosomes. Therefore, YrdC may function by keeping an rRNA structure needed for proper processing of 16S rRNA, especially at lower temperatures. Sua5 is an example of a multi-domain protein that contains an N-terminal YrdC-like domain and a C-terminal Sua5 domain. Sua5 was identified in Saccharomyces cerevisiae (Baker's yeast) as a suppressor of a translation initiation defect in the cytochrome c gene and is required for normal growth in yeast; however, its exact function remains unknown. HypF is involved in the synthesis of the active site of [NiFe]-hydrogenases.
L6 is a protein from the large (50S) subunit. In Escherichia coli, it is located in the aminoacyl-tRNA binding site of the peptidyltransferase centre, and is known to bind directly to 23S rRNA. It belongs to a family of ribosomal proteins, including L6 from bacteria, cyanelles (structures that perform similar functions to chloroplasts, but have structural and biochemical characteristics of Cyanobacteria) and mitochondria; and L9 from mammals, Drosophila, plants and yeast. L6 comprises 2 almost identical folds, suggesting that it was derived by the duplication of an ancient RNA-binding protein gene. Analysis reveals several sites on the protein surface where interactions with other ribosome components may occur, the N-terminus being involved in protein-protein interactions and the C-terminus containing possible RNA-binding sites.
Prokaryotes contain a single RNA polymerase (RNAP) that is responsible for the transcription of all genes, while eukaryotes have three classes of RNAPs (I-III) with specific transcriptional roles. In eukaryotes, the RPB6 subunit is common to all three polymerases. RPB6 is involved in the initiation of transcription. Bacterial DNA-dependent RNAP contains a small subunit termed omega, where the complete RNAP composition is beta'-beta-alpha(I)-alpha(II)-omega. The bacterial omega subunit is homologous in sequence and structure to the eukaryotic RPB6 subunit; they also have similar functional roles, being able to promote RNA polymerase assembly, possibly through a latching mechanism.
Prokaryotes contain a single DNA-dependent RNA polymerase (RNAP) that is responsible for the transcription of all genes, while eukaryotes have three classes of RNAPs (I-III) that transcribe different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. Certain subunits of RNAPs, including RPB5 (POLR2E in mammals), are common to all three eukaryotic polymerases. RPB5 plays a role in the transcription activation process. Eukaryotic RPB5 has a bipartite structure consisting of a unique N-terminal region, plus a C-terminal region that is structurally homologous to the prokaryotic RPB5 homologue, subunit H (gene rpoH).
This entry represents prokaryotic subunit H and the C-terminal domain of eukaryotic RPB5, which share a two-layer alpha/beta fold, with a core structure of beta/alpha/beta/alpha/beta(2).
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents the AN1-type zinc finger domain, which has a dimetal (zinc)-bound alpha/beta fold. This domain was first identified as a zinc finger at the C-terminus of AN1, a ubiquitin-like protein in Xenopus laevis. The AN1-type zinc finger contains six conserved cysteines and two histidines that could potentially coordinate 2 zinc atoms.
Certain stress-associated proteins (SAP) contain an AN1 domain, often in combination with A20 zinc finger domains (SAP8) or C2H2 domains (SAP16). For example, the human protein Znf216 has an A20 zinc-finger at the N-terminus and an AN1 zinc-finger at the C-terminus, acting to negatively regulate the NFkappaB activation pathway and to interact with components of the immune response like RIP, IKKgamma and TRAF6. The interaction of Znf216 with IKK-gamma and RIP is mediated by the A20 zinc-finger domain, while its interaction with TRAF6 is mediated by the AN1 zinc-finger domain; therefore, both zinc-finger domains are involved in regulating the immune response. The AN1 zinc finger domain is also found in proteins containing a ubiquitin-like domain, which are involved in the ubiquitination pathway. Proteins containing an AN1-type zinc finger include:
Bacteria synthesize a set of small, usually basic proteins of about 90 residues that bind DNA and are known as histone-like proteins. Examples include the HU protein, which in Escherichia coli is a dimer of closely related alpha and beta chains and in other bacteria can be a dimer of identical chains. HU-type proteins have been found in a variety of eubacteria, cyanobacteria and archaebacteria, and are also encoded in the chloroplast genome of some algae. The integration host factor (IHF), a dimer of closely related chains that seems to function in genetic recombination as well as in translational and transcriptional control, is found in enterobacteria. Viral members include the African swine fever virus protein A104R (or LMW5-AR).
The exact function of these proteins is not yet clear, but they are capable of wrapping DNA and protecting it from denaturation under extreme environmental conditions. The structure is known for one of these proteins. The protein exists as a dimer, and two "beta-arms" function as the non-specific binding site for bacterial DNA.
Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins.
The small ribosomal subunit protein S18 is known to be involved in binding the aminoacyl-tRNA complex in Escherichia coli, and appears to be situated at the tRNA A-site. Experimental evidence has revealed that S18 is well exposed on the surface of the E. coli ribosome, and is a secondary rRNA binding protein. S18 belongs to a family of ribosomal proteins that includes: eubacterial S18; metazoan mitochondrial S18; algal and plant chloroplast S18; and cyanelle S18.
This entry represents domain 3 of the ribosomal protein L2 from the large 50S subunit. The 50S subunit proteins function primarily to stabilize inter-domain interactions that are necessary to maintain the subunit's structural integrity, displaying a wide variety of protein-RNA interactions. This domain has an irregular structure.
This family contains both lactate and malate dehydrogenases. Malate dehydrogenases catalyse the interconversion of malate and oxaloacetate. The enzyme participates in the citric acid cycle.
L-lactate dehydrogenase (LDH) catalyses the reversible NAD-dependent interconversion of pyruvate and L-lactate. In vertebrate muscles and in lactic acid bacteria it represents the final step in anaerobic glycolysis. This tetrameric enzyme is present in prokaryotic and eukaryotic organisms. In vertebrates there are three isozymes of LDH: the M form (LDH-A), found predominantly in muscle tissues; the H form (LDH-B), found in heart muscle; and the X form (LDH-C), found only in the spermatozoa of mammals and birds. In bird and crocodilian eye lenses, LDH-B serves as a structural protein and is known as epsilon-crystallin.
L-2-hydroxyisocaproate dehydrogenase (L-hicDH) catalyses the reversible and stereospecific interconversion between 2-ketocarboxylic acids and L-2-hydroxy-carboxylic acids. L-hicDH is evolutionarily related to LDHs.
Isocitrate dehydrogenase (IDH) is an important enzyme of carbohydrate metabolism which catalyzes the oxidative decarboxylation of isocitrate into alpha-ketoglutarate. IDH is dependent either on NAD+ or on NADP+. In eukaryotes there are at least three isozymes of IDH: two are located in the mitochondrial matrix (one NAD+-dependent, the other NADP+-dependent), while the third one (also NADP+-dependent) is cytoplasmic. In Escherichia coli the activity of an NADP+-dependent form of the enzyme is controlled by the phosphorylation of a serine residue; the phosphorylated form of IDH is completely inactivated.
This group defines the eukaryotic NADP-dependent isocitrate dehydrogenases, including the cytosolic, mitochondrial and chloroplast enzymes; it also matches a small number of bacterial proteins.
Synonyms: Inosine-5'-monophosphate dehydrogenase, Inosinic acid dehydrogenase
IMP dehydrogenase (IMPDH) catalyzes the rate-limiting reaction of de novo GTP biosynthesis, the NAD-dependent oxidation of IMP to XMP.
Inosine 5'-phosphate + NAD+ + H2O = xanthosine 5'-phosphate + NADH
IMP dehydrogenase is associated with cell proliferation and is a possible target for cancer chemotherapy. Mammalian and bacterial IMPDHs are tetramers of identical chains. There are two IMP dehydrogenase isozymes in humans. IMP dehydrogenase adopts a TIM barrel structure and nearly always contains a long insertion that has two CBS domains within it.
Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) plays an important role in glycolysis and gluconeogenesis by reversibly catalysing the oxidation and phosphorylation of D-glyceraldehyde-3-phosphate to 1,3-diphosphoglycerate. The enzyme exists as a tetramer of identical subunits, each containing 2 conserved functional domains: an NAD-binding domain, and a highly conserved catalytic domain. The enzyme has been found to bind to actin and tropomyosin, and may thus have a role in cytoskeleton assembly. Alternatively, the cytoskeleton may provide a framework for precise positioning of the glycolytic enzymes, thus permitting efficient passage of metabolites from enzyme to enzyme.
GAPDH displays diverse non-glycolytic functions as well, its role depending upon its subcellular location. For instance, the translocation of GAPDH to the nucleus acts as a signalling mechanism for programmed cell death, or apoptosis. The accumulation of GAPDH within the nucleus is involved in the induction of apoptosis, where GAPDH functions in the activation of transcription. The presence of GAPDH is associated with the synthesis of pro-apoptotic proteins like BAX, c-JUN and GAPDH itself.
GAPDH has been implicated in certain neurological diseases: GAPDH is able to bind to the gene products from neurodegenerative disorders such as Huntington's disease, Alzheimer's disease, Parkinson's disease and Machado-Joseph disease through stretches encoded by their CAG repeats. Abnormal neuronal apoptosis is associated with these diseases. Propargylamines such as deprenyl increase neuronal survival by interfering with apoptosis signalling pathways via their binding to GAPDH, which decreases the synthesis of pro-apoptotic proteins.
2-oxoglutarate dehydrogenase is a key enzyme in the TCA cycle, converting 2-oxoglutarate, coenzyme A and NAD(+) to succinyl-CoA, NADH and carbon dioxide. The activity of this enzyme is tightly regulated and it is a major determinant of the metabolic flux through the TCA cycle. This enzyme is composed of multiple copies of three different subunits: 2-oxoglutarate dehydrogenase (E1), dihydrolipoamide succinyltransferase (E2) and lipoamide dehydrogenase (E3), which is often shared with similar enzymes such as pyruvate dehydrogenase. The E2 component forms a large multimeric core which binds the peripheral E1 and E3 subunits. The substrate is transferred between the active sites of the different subunits by a lipoyl moiety, bound to a lysine residue from the E2 polypeptide.
This entry represents the E1 subunit of 2-oxoglutarate dehydrogenase. It catalyses the decarboxylation of 2-oxoglutarate in a thiamine pyrophosphate-dependent manner, transferring the resultant succinyl group onto the lipoyl moiety bound to the E2 subunit. The E1 ortholog from Corynebacterium glutamicum (Brevibacterium flavum) is unusual in having an N-terminal extension that resembles the E2 component of the 2-oxoglutarate dehydrogenase enzyme.
This entry represents a glutamate dehydrogenase.
Glutathione peroxidase (GSHPx) is an enzyme that catalyses the reduction of hydroperoxides by glutathione. Its main function is to protect against the damaging effect of endogenously formed hydroperoxides. In higher vertebrates, several forms of GSHPx are known, including a ubiquitous cytosolic form (GSHPx-1), a gastrointestinal cytosolic form (GSHPx-GI), a plasma secreted form (GSHPx-P), and an epididymal secretory form (GSHPx-EP). In addition to these characterised forms, the sequence of a protein of unknown function has been shown to be evolutionarily related to those of GSHPxs.
In filarial nematode parasites, the major soluble cuticular protein (gp29) is a secreted GSHPx, which may provide a mechanism of resistance to the immune reaction of the mammalian host by neutralising the products of the oxidative burst of leukocytes. The Escherichia coli protein btuE, a periplasmic protein involved in vitamin B12 transport, is evolutionarily related to GSHPxs, although the significance of this relationship is unclear. The structure of bovine seleno-glutathione peroxidase has been determined. The protein belongs to the alpha-beta class, with a 3-layer (aba) sandwich architecture. The catalytic site of GSHPx contains a conserved residue which is either a cysteine or, in many eukaryotic GSHPxs, a selenocysteine.
Superoxide dismutases (SODs) catalyse the conversion of superoxide radicals to molecular oxygen and hydrogen peroxide. Their function is to destroy the radicals that are normally produced within cells and are toxic to biological systems. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one. This family includes both single metal-binding SODs and cambialistic SOD, which can bind either Mn or Fe. Fe/MnSODs are ubiquitous enzymes that are responsible for the majority of SOD activity in prokaryotes, fungi, blue-green algae and mitochondria. Fe/MnSODs are found as homodimers or homotetramers.
The structure of Fe/MnSODs can be divided into two domains, an alpha N-terminal domain and an alpha/beta C-terminal domain, connected by a loop. The structure of the N-terminal domain consists of two helices in an antiparallel hairpin, with a left-handed twist. The structure of the C-terminal domain is of the alpha/beta type, and consists of a three-stranded antiparallel beta-sheet in the order 213, along with four helices in the arrangement alpha/beta(2)/alpha/beta/alpha(2).
Ferredoxin-NADP+ reductase (FNR) is one of several soluble partners that can receive an electron from ferredoxin once it has been reduced by photosystem I in chloroplasts and cyanobacteria. FNR catalyses the reduction of NADP+ to NADPH, using the electrons provided by the reduced ferredoxin, with the aid of a FAD cofactor.
This group represents a bifunctional dihydrofolate reductase/thymidylate synthase found in some plant species and protozoal parasites including malarial species and trypanosomes. In other species dihydrofolate reductase and thymidylate synthase are encoded on separate polypeptides.
Thymidylate synthase catalyzes the reductive methylation of dUMP to dTMP with concomitant conversion of 5,10-methylenetetrahydrofolate to dihydrofolate:
5,10-methylenetetrahydrofolate + dUMP = dihydrofolate + dTMP
This provides the sole de novo pathway for production of dTMP and is the only enzyme in folate metabolism in which the 5,10-methylenetetrahydrofolate is oxidised during one-carbon transfer. The enzyme is important for regulating the balanced supply of the 4 DNA precursors in normal DNA replication: defects in the enzyme activity affecting the regulation process can cause various biological and genetic abnormalities. A cysteine residue is involved in the catalytic mechanism (it covalently binds the 5,6-dihydro-dUMP intermediate). The sequence around the active site of this enzyme is conserved from phages to vertebrates.
Dihydrofolate reductase (DHFR) catalyses the NADPH-dependent reduction of dihydrofolate to tetrahydrofolate:
5,6,7,8-tetrahydrofolate + NADP+ = 7,8-dihydrofolate + NADPH + H+
This is an essential step in de novo synthesis both of glycine and of purines and deoxythymidine phosphate (the precursors of DNA synthesis), and important also in the conversion of deoxyuridine monophosphate to deoxythymidine monophosphate. Although DHFR is found ubiquitously in prokaryotes and eukaryotes, and is found in all dividing cells, maintaining levels of fully reduced folate coenzymes, the catabolic steps are still not well understood.
As this enzyme is essential in both nucleic acid and amino acid biosynthesis, it is an important target of antiparasitic drugs. Resistance to antimalarial drugs that target this enzyme is often due to mutations that prevent drug binding but maintain enzyme activity. The structure of the wild-type and drug resistant malarial enzymes provides insights into the development of resistance and suggests approaches for the design of new drugs against this target.
Serine hydroxymethyltransferase (SHMT) is a pyridoxal phosphate (PLP) dependent enzyme and belongs to the aspartate aminotransferase superfamily (fold type I). The pyridoxal-P group is attached to a lysine residue around which the sequence is highly conserved in all forms of the enzyme. The enzyme carries out interconversion of serine and glycine using PLP as the cofactor. SHMT catalyses the transfer of a hydroxymethyl group from N5,N10-methylenetetrahydrofolate to glycine, resulting in the formation of serine and tetrahydrofolate. Both eukaryotic and prokaryotic SHMT enzymes form tight obligate homodimers and the mammalian enzyme forms a homotetramer. PLP dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalysed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into five major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), D-amino acid superfamily (fold type IV) and glycogen phosphorylase family (fold type V).
In vertebrates, glycine hydroxymethyltransferase exists in a cytoplasmic and a mitochondrial form whereas only one form is found in prokaryotes.
Two different types of thiolase are found both in eukaryotes and in prokaryotes: acetoacetyl-CoA thiolase and 3-ketoacyl-CoA thiolase. 3-ketoacyl-CoA thiolase (also called thiolase I) has a broad chain-length specificity for its substrates and is involved in degradative pathways such as fatty acid beta-oxidation. Acetoacetyl-CoA thiolase (also called thiolase II) is specific for the thiolysis of acetoacetyl-CoA and involved in biosynthetic pathways such as poly beta-hydroxybutyrate synthesis or steroid biogenesis.
In eukaryotes, there are two forms of 3-ketoacyl-CoA thiolase: one located in the mitochondrion and the other in peroxisomes.
There are two conserved cysteine residues important for thiolase activity. The first, located in the N-terminal section of the enzymes, is involved in the formation of an acyl-enzyme intermediate; the second, located at the C-terminal extremity, is the active site base involved in deprotonation in the condensation reaction.
Mammalian nonspecific lipid-transfer protein (nsL-TP) (also known as sterol carrier protein 2) is a protein which seems to exist in two different forms: a 14 kDa protein (SCP-2) and a larger 58 kDa protein (SCP-x). The former is found in the cytoplasm or the mitochondria and is involved in lipid transport; the latter is found in peroxisomes. The C-terminal part of SCP-x is identical to SCP-2 while the N-terminal portion is evolutionarily related to thiolases.
This group represents a glycerol-3-phosphate O-acyltransferase.
S-adenosylmethionine synthetase (MAT) is the enzyme that catalyzes the formation of S-adenosylmethionine (AdoMet) from methionine and ATP. AdoMet is an important methyl donor for transmethylation and is also the propylamino donor in polyamine biosynthesis.
Bacteria have a single isoform of AdoMet synthetase (gene metK), budding yeast (genes SAM1 and SAM2) and mammals each have two, and plants generally have a multigene family.
The sequence of AdoMet synthetase is highly conserved throughout isozymes and species. The active sites of both the Escherichia coli and rat liver MAT reside between two subunits, with contributions from side chains of residues from both subunits, resulting in a dimer as the minimal catalytic entity. The side chains that contribute to the ligand binding sites are conserved between the two proteins. In the structures of complexes with the E. coli enzyme, the phosphate groups have the same positions in the (PPi plus Pi) complex and the (ADP plus Pi) complex, and are located at the bottom of a deep cavity with the adenosyl group nearer the entrance.
This group represents a predicted hydroxyethylthiazole kinase. THZ kinase activity is involved in the salvage synthesis of TH-P from the thiazole:
2-methyl-4-amino-5-hydroxymethylpyrimidine diphosphate + 4-methyl-5-(2-phosphonooxyethyl)-thiazole = pyrophosphate + thiamin monophosphate
Hydroxyethylthiazole kinase expression is regulated at the mRNA level by intracellular thiamin pyrophosphate.
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases.
In the absence of cAMP, protein kinase A (PKA) exists as an equimolar tetramer of regulatory (R) and catalytic (C) subunits. In addition to its role as an inhibitor of the C subunit, the R subunit anchors the holoenzyme to specific intracellular locations and prevents the C subunit from entering the nucleus. Typical R subunits have a conserved domain structure, consisting of the N-terminal dimerisation domain, inhibitory region, cAMP-binding domain A and cAMP-binding domain B. R subunits interact with C subunits primarily through the inhibitory site. The cAMP-binding domains show extensive sequence similarity and bind cAMP cooperatively.
On the basis of phylogenetic trees generated from multiple sequence alignment of complete sequences, this family was divided into four sub-families, types I to IV. Types I and II, found in animals, differ in molecular weight, sequence, autophosphorylation capability, cellular location and tissue distribution. Types I and II are further sub-divided into alpha and beta subtypes, based mainly on sequence similarity. Type III are from fungi and type IV are from alveolates.
DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The bacterial core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.
RNA synthesis begins after RNA polymerase attaches to a specific site, the promoter, on the template DNA strand, and continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I synthesises the precursors of most ribosomal RNAs, RNA polymerase II synthesises mRNA precursors, and RNA polymerase III synthesises the precursors of 5S ribosomal RNA and the tRNAs.
This entry represents a DNA-directed RNA polymerase, RPB5 subunit.
This entry represents a CDP-diacylglycerol-inositol 3-phosphatidyltransferase.
The 14-3-3 proteins are a large family of approximately 30 kDa acidic proteins which exist primarily as homo- and heterodimers within all eukaryotic cells. There is a high degree of sequence identity and conservation between all the 14-3-3 isotypes, particularly in the regions which form the dimer interface or line the central ligand binding channel of the dimeric molecule. Each 14-3-3 protein sequence can be roughly divided into three sections: a divergent amino terminus, the conserved core region and a divergent carboxyl terminus. The conserved middle core region of the 14-3-3s encodes an amphipathic groove that forms the main functional domain, a cradle for interacting with client proteins. The monomer consists of nine helices organised in an antiparallel manner, forming an L-shaped structure. The interior of the L-structure is composed of four helices: H3 and H5, which contain many charged and polar amino acids, and H7 and H9, which contain hydrophobic amino acids. These four helices form the concave amphipathic groove that interacts with target peptides.
14-3-3 proteins mainly bind proteins containing phosphothreonine or phosphoserine motifs; however, exceptions to this rule do exist. Extensive investigation of the 14-3-3 binding site of the mammalian serine/threonine kinase Raf-1 has produced a consensus sequence for 14-3-3-binding, RSxpSxP (in the single-letter amino-acid code, where x denotes any amino acid and p indicates that the next residue is phosphorylated). 14-3-3 proteins appear to effect intracellular signalling in one of three ways: by direct regulation of the catalytic activity of the bound protein; by regulating interactions between the bound protein and other molecules in the cell, through sequestration or modification; or by controlling the subcellular localisation of the bound ligand. Proteins appear to bind initially to a single dominant site and subsequently to many much weaker secondary interaction sites. The 14-3-3 dimer is capable of changing the conformation of its bound ligand whilst itself undergoing minimal structural alteration.
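The RSxpSxP consensus described above can be turned into a simple sequence scan. The following is a minimal sketch, not a validated predictor: the function name `find_candidate_sites` and the test sequence are hypothetical, and because a plain amino-acid sequence carries no phosphorylation marks, the scan can only flag the serine at the fourth position as a candidate phosphorylation site.

```python
import re

# Consensus from the text: R-S-x-pS-x-P, where x is any amino acid.
# A lookahead is used so that overlapping candidate motifs are all reported.
CONSENSUS = re.compile(r"(?=(RS[A-Z]S[A-Z]P))")


def find_candidate_sites(sequence):
    """Return (motif_start, candidate_serine_position, motif) tuples.

    Positions are 1-based. The candidate serine is the fourth residue of
    the motif; whether it is actually phosphorylated cannot be decided
    from the sequence alone.
    """
    hits = []
    for match in CONSENSUS.finditer(sequence.upper()):
        start = match.start(1) + 1  # convert to 1-based numbering
        hits.append((start, start + 3, match.group(1)))
    return hits


# Hypothetical sequence used only for illustration (not taken from Raf-1).
print(find_candidate_sites("AARSTSAPGG"))  # [(3, 6, 'RSTSAP')]
```

Since the text notes that exceptions to the consensus exist, an empty result from such a scan does not rule out 14-3-3 binding.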
Dihydroorotase belongs to MEROPS peptidase family M38 (clan MJ), where it is classified as a non-peptidase homologue. DHOase catalyses the third step in the de novo biosynthesis of pyrimidine, the conversion of ureidosuccinic acid (N-carbamoyl-L-aspartate) into dihydroorotate. Dihydroorotase binds a zinc ion which is required for its catalytic activity.
In bacteria, DHOase is a dimer of identical chains of about 400 amino-acid residues (gene pyrC). In the metazoa, DHOase is part of a large multi-functional protein, known as 'rudimentary' in Drosophila melanogaster and CAD in mammals, which catalyzes the first three steps of pyrimidine biosynthesis. The DHOase domain is located in the central part of this polyprotein. In yeast, DHOase is a monofunctional protein (gene URA4). However, a defective DHOase domain is found in a multifunctional protein (gene URA2) that catalyzes the first two steps of pyrimidine biosynthesis.
The comparison of DHOase sequences from various sources shows that there are two highly conserved regions. The first, located in the N-terminal extremity, contains two histidine residues suggested to be involved in binding the zinc ion. The second is found in the C-terminal part. Members of this family of proteins are predicted to adopt a TIM barrel fold.
This family represents the homodimeric form of dihydroorotase. It is found in bacteria, plants and fungi; URA4 of yeast is a member of this group of sequences.
Two types of proteins that hydrolyse inorganic pyrophosphate (PPi), very different in both amino acid sequence and structure, have been characterised to date: soluble and membrane-bound proton-pumping pyrophosphatases (sPPases and H+-PPases, respectively). sPPases are ubiquitous proteins that hydrolyse PPi to release heat, whereas H+-PPases, so far unidentified in animal and fungal cells, couple the energy of PPi hydrolysis to proton movement across biological membranes. The latter type is represented by this group of proteins. H+-PPases are also called vacuolar-type inorganic pyrophosphatases (V-PPase) or pyrophosphate-energised vacuolar membrane proton pumps. In plants, vacuoles contain two enzymes for acidifying the interior of the vacuole, the V-ATPase and the V-PPase (V is for vacuolar).
Two distinct biochemical subclasses of H+-PPases have been characterised to date: K+-stimulated and K+-insensitive.
Class I aldolases catalyse carbon-carbon bond formation using a 'Schiff base' mechanism. This entry represents deoxyribose-phosphate aldolase, a widely distributed enzyme, which catalyses the following reversible reaction:
2-deoxy-D-ribose 5-phosphate = D-glyceraldehyde 3-phosphate + acetaldehyde
While the physiological role of this enzyme remains unknown in eukaryotes, in prokaryotes it is thought to function in the catabolism of deoxyribonucleotides.
In all studied structures, the deoxyribose-phosphate aldolase subunits adopt the classical eight-bladed TIM barrel fold. The oligomerisation state of the enzyme appears to depend on the growth temperature of the organism: the Escherichia coli enzyme is a homodimer, while the enzymes from the thermophilic microorganisms Thermus thermophilus and Aeropyrum pernix are homotetramers. The degree of oligomerisation does not, however, appear to affect catalysis.
Enolase (2-phospho-D-glycerate hydrolase) is an essential glycolytic enzyme that catalyses the interconversion of 2-phosphoglycerate and phosphoenolpyruvate. In vertebrates, there are 3 different, tissue-specific isoenzymes, designated alpha, beta and gamma. Alpha is present in most tissues, beta is localised in muscle tissue, and gamma is found only in nervous tissue. The functional enzyme exists as a dimer of any 2 isoforms. In immature organs and in adult liver, it is usually an alpha homodimer, in adult skeletal muscle, a beta homodimer, and in adult neurons, a gamma homodimer. In developing muscle, it is usually an alpha/beta heterodimer, and in the developing nervous system, an alpha/gamma heterodimer. The tissue specific forms display minor kinetic differences. Tau-crystallin, one of the major lens proteins in some fish, reptiles and birds, has been shown to be evolutionarily related to enolase.
Neuron-specific enolase is released in a variety of neurological diseases, such as multiple sclerosis and after seizures or acute stroke. Several tumour cells have also been found positive for neuron-specific enolase. Beta-enolase deficiency is associated with glycogenosis type XIII defect.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.
Seryl-tRNA synthetase exists as a monomer and belongs to class IIa.
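The class assignments listed above lend themselves to a small lookup table. The sketch below is illustrative only: the three-letter keys and the function names `synthetase_class` and `preferred_hydroxyl` are assumptions made for this example, and the subclasses a, b and c are not encoded because the text does not list their members.

```python
# Class membership as stated in the text, keyed by three-letter amino acid code.
CLASS_I = {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
CLASS_II = {"Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"}


def synthetase_class(amino_acid):
    """Return 'I' or 'II' for the synthetase specific to the given amino acid."""
    code = amino_acid.capitalize()  # accept 'ser', 'SER' or 'Ser'
    if code in CLASS_I:
        return "I"
    if code in CLASS_II:
        return "II"
    raise ValueError(f"unknown amino acid code: {amino_acid!r}")


def preferred_hydroxyl(amino_acid):
    """Return the tRNA hydroxyl the text associates with each class."""
    return "2'" if synthetase_class(amino_acid) == "I" else "3'"


for aa in ("Ser", "Val"):
    print(aa, synthetase_class(aa), preferred_hydroxyl(aa))
```

The 3'-hydroxyl is described in the text as preferred rather than exclusive for class II, so `preferred_hydroxyl` should be read as the typical case only.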
This entry represents eukaryotic glutathione synthetase (GSS), a homodimeric enzyme that catalyses the conversion of gamma-L-glutamyl-L-cysteine and glycine to phosphate and glutathione in the presence of ATP. This is the second step in glutathione biosynthesis, the first step being catalysed by gamma-glutamylcysteine synthetase. In humans, defects in GSS are inherited in an autosomal recessive way and are the cause of severe metabolic acidosis, 5-oxoprolinuria, and increased rate of haemolysis and defective function of the central nervous system.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families includes yeast S7 (YS6); archaeal S4e; and mammalian and plant cytoplasmic S4. Two highly similar isoforms of mammalian S4 exist, one coded by a gene on chromosome Y, and the other on chromosome X. These proteins have 233 to 264 amino acids.
Ribosomal protein S11 plays an essential role in selecting the correct tRNA in protein biosynthesis. It is located on the large lobe of the small ribosomal subunit. On the basis of sequence similarities, S11 belongs to a family of bacterial, archaeal and eukaryotic ribosomal proteins.
Ribosomal protein S12 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S12 is known to be involved in the translation initiation step. It is a very basic protein of 120 to 150 amino-acid residues. S12 belongs to a family of ribosomal proteins which are grouped on the basis of sequence similarities. This protein is known typically as S12 in bacteria, S23 in eukaryotes and as either S12 or S23 in the Archaea.
Bacterial S12 molecules contain a conserved aspartic acid residue which undergoes a novel post-translational modification, beta-methylthiolation, to form the corresponding 3-methylthioaspartic acid.
Ribosomal protein S13 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S13 is known to be involved in binding fMet-tRNA and, hence, in the initiation of translation. It is a basic protein of 115 to 177 amino-acid residues. This family of ribosomal proteins is present in prokaryotes and eukaryotes.
The small subunit ribosomal proteins can be categorised as: primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S19 contains 88-144 amino acid residues. In Escherichia coli, S19 is known to form a complex with S13 that binds strongly to 16S ribosomal RNA. Experimental evidence has revealed that S19 is moderately exposed on the ribosomal surface, and is designated a secondary rRNA binding protein. S19 belongs to a family of ribosomal proteins that includes: eubacterial S19; algal and plant chloroplast S19; cyanelle S19; archaebacterial S19; plant mitochondrial S19; and eukaryotic S15 ('rig' protein).
A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. These proteins have 82 to 87 amino acids. The amino termini are all N alpha-acetylated. The N-terminal halves of the protein molecules are highly conserved in contrast to the carboxy-terminal parts.
Ribosomal protein L1 is the largest protein from the large ribosomal subunit. The L1 protein contains two domains: a 2-layer alpha/beta domain and a 3-layer alpha/beta domain, which interrupts the first domain. In Escherichia coli, L1 is known to bind to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:
Ribosomal protein L2 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L2 is known to bind to the 23S rRNA and to have peptidyltransferase activity. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:
Ribosomal protein L5 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L5 is known to be involved in binding 5S RNA to the large ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:
L5 is a protein of about 180 amino-acid residues.
L6 is a protein from the large (50S) subunit. In Escherichia coli, it is located in the aminoacyl-tRNA binding site of the peptidyltransferase centre, and is known to bind directly to 23S rRNA. It belongs to a family of ribosomal proteins, including L6 from bacteria, cyanelles (structures that perform similar functions to chloroplasts, but have structural and biochemical characteristics of Cyanobacteria) and mitochondria; and L9 from mammals, Drosophila, plants and yeast. L6 comprises 2 almost identical folds, suggesting that it was derived by the duplication of an ancient RNA-binding protein gene. Analysis reveals several sites on the protein surface where interactions with other ribosome components may occur, the N-terminus being involved in protein-protein interactions and the C-terminus containing possible RNA-binding sites.
Ribosomal protein L13 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L13 is known to be one of the early assembly proteins of the 50S ribosomal subunit.
This group represents a clathrin heavy chain.
This group represents an adaptor protein complex, beta subunit.
Prokaryotes and eukaryotes respond to heat shock and other forms of environmental stress by inducing synthesis of heat-shock proteins (hsp). The 90 kDa heat shock protein, Hsp90, is one of the most abundant proteins in eukaryotic cells, comprising 1-2% of cellular proteins under non-stress conditions. Its contribution to various cellular processes including signal transduction, protein folding, protein degradation and morphological evolution has been extensively studied. The full functional activity of Hsp90 is gained in concert with other co-chaperones, playing an important role in the folding of newly synthesised proteins and stabilisation and refolding of denatured proteins after stress. Apart from its co-chaperones, Hsp90 binds to an array of client proteins, where the co-chaperone requirement varies and depends on the actual client.
The sequences of hsp90s show a distinctive domain structure, with a highly-conserved N-terminal domain separated from a conserved, acidic C-terminal domain by a highly-acidic, flexible linker region.
Translation initiation factor 5A (IF-5A) is reported to be involved in the first step of peptide bond formation in translation, to be involved in cell-cycle regulation and to be a cofactor for the Rev and Rex transactivator proteins of human immunodeficiency virus-1 and T-cell leukaemia virus I, respectively. IF-5A contains an unusual amino acid, hypusine (N-epsilon-(4-aminobutyl-2-hydroxy)lysine), that is required for its function. The first step in the post-translational modification of lysine to hypusine is catalysed by the enzyme deoxyhypusine synthase, the structure of which has been reported.
The crystal structure of IF-5A from the archaeon Pyrobaculum aerophilum has been determined to 1.75 Å. Unmodified P. aerophilum IF-5A is found to be a beta structure with two domains and three separate hydrophobic cores. The lysine (Lys42) that is post-translationally modified by deoxyhypusine synthase is found at one end of the IF-5A molecule in a turn between beta strands beta4 and beta5; this lysine residue is freely solvent accessible. The C-terminal domain is found to be homologous to the cold-shock protein CspA of E. coli, which has a well characterised RNA-binding fold, suggesting that IF-5A is involved in RNA binding.
Cells have evolved elaborate mechanisms to rid themselves of aberrant proteins and transcripts. The nonsense-mediated mRNA decay (NMD) pathway is one example: it eliminates aberrant mRNAs. The SUI1 protein functions in concert with eIF-2 and the initiator tRNA-Met to direct the ribosome to the proper start site of translation, which underlies its role in recognition of the AUG codon during translation initiation and in maintenance of the appropriate reading frame during translation elongation. In addition to these roles, SUI1 plays a role in the NMD pathway.
Secretion across the inner membrane in some Gram-negative bacteria occurs via the preprotein translocase pathway. Proteins are produced in the cytoplasm as precursors, and require a chaperone subunit to direct them to the translocase component. From there, the mature proteins are either targeted to the outer membrane, or remain as periplasmic proteins. The translocase protein subunits are encoded on the bacterial chromosome.
The translocase itself comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of the mature peptide into the periplasm (SecD and SecF). The chaperone protein SecB is a highly acidic homotetrameric protein that exists as a "dimer of dimers" in the bacterial cytoplasm. SecB maintains preproteins in an unfolded state after translation, and targets these to the peripheral membrane protein ATPase SecA for secretion. The structure of the Escherichia coli SecYEG assembly revealed a sandwich of two membranes interacting through the extensive cytoplasmic domains. Each membrane is composed of dimers of SecYEG. The monomeric complex contains 15 transmembrane helices.
The eubacterial SecY protein interacts with the signal sequences of secretory proteins as well as with two other components of the protein translocation system: SecA and SecE. SecY is an integral plasma membrane protein of 419 to 492 amino acid residues that apparently contains 10 transmembrane (TM), 6 cytoplasmic and 5 periplasmic regions.
Cytoplasmic regions 2 and 3, and TM domains 1, 2, 4, 5, 7 and 10 are well conserved: the conserved cytoplasmic regions are believed to interact with cytoplasmic secretion factors, while the TM domains may participate in protein export. Homologs of secY are found in archaebacteria. SecY is also encoded in the chloroplast genome of some algae where it could be involved in a prokaryotic-like protein export system across the two membranes of the chloroplast endoplasmic reticulum (CER) which is present in chromophyte and cryptophyte algae.
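The region counts quoted for SecY follow from simple alternation: 10 TM segments separate 11 non-membrane regions, which must alternate between the two sides of the membrane. The sketch below assumes the N-terminus is cytoplasmic, which is the only starting side that yields 6 cytoplasmic and 5 periplasmic regions; the function name `secy_topology` is hypothetical.

```python
def secy_topology(n_tm=10, n_terminus="cytoplasmic"):
    """List the non-membrane regions in order from the N- to the C-terminus.

    Each TM segment crosses the membrane once, so consecutive regions
    alternate sides and there is always one more region than TM segments.
    """
    sides = ["cytoplasmic", "periplasmic"]
    start = sides.index(n_terminus)
    return [sides[(start + i) % 2] for i in range(n_tm + 1)]


regions = secy_topology()
counts = {side: regions.count(side) for side in ("cytoplasmic", "periplasmic")}
print(counts)
```

With an even number of TM segments both termini end up on the same side, so under this assumption the C-terminus of SecY is also cytoplasmic.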
Peptide deformylase (PDF) is an essential metalloenzyme required for the removal of the formyl group at the N-terminus of nascent polypeptide chains in eubacteria. The enzyme acts as a monomer and binds a single zinc ion, catalysing the reaction:
N-formyl-L-methionyl peptide + H2O = formate + methionyl peptide
Catalytic efficiency strongly depends on the identity of the bound metal.
The structure of these enzymes is known. PDF, a member of the zinc metalloprotease family, comprises an active core domain of 147 residues and a C-terminal tail of 21 residues. The 3D fold of the catalytic core has been determined by X-ray crystallography and NMR. Overall, the structure contains a series of anti-parallel beta-strands that surround two perpendicular alpha-helices. The C-terminal helix contains the characteristic HEXXH motif of metalloenzymes, which is crucial for activity. The helical arrangement, and the way the histidine residues bind the zinc ion, is reminiscent of other metalloproteases, such as thermolysin or metzincins. However, the arrangement of secondary and tertiary structures of PDF, and the positioning of its third zinc ligand (a cysteine residue), are quite different. These discrepancies, together with notable biochemical differences, suggest that PDF constitutes a new class of zinc-metalloproteases.
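As an illustration of the HEXXH motif mentioned above, the following sketch scans a protein sequence for His-Glu-X-X-His, where X is any residue. The function name `find_hexxh` and the demo sequence are hypothetical; the sequence is made up for the example and is not a real PDF sequence.

```python
import re

# HEXXH: His, Glu, any two residues, His. The lookahead makes the search
# report overlapping occurrences as well (e.g. in "HEAAHEAAH").
HEXXH = re.compile(r"(?=(HE..H))")


def find_hexxh(sequence):
    """Return (1-based position, matched residues) for each HEXXH occurrence."""
    return [(m.start() + 1, m.group(1)) for m in HEXXH.finditer(sequence.upper())]


demo = "MKTLAIHEAGHALVG"  # hypothetical sequence, for illustration only
print(find_hexxh(demo))
```

A hit from such a scan only flags a candidate site; whether the histidines actually coordinate a metal depends on the fold, as the text notes for the third zinc ligand.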
Pyridoxal phosphate is the active form of vitamin B6 (pyridoxine or pyridoxal). PLP is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy.
PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the epsilon-amino group of an active site lysine residue on the enzyme. The alpha-amino group of the substrate displaces the lysine epsilon-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic.
Proteins in this entry occur in archaea, bacteria and eukaryotes. They are encoded by genes which are often co-transcribed with proline biosynthesis genes, although their function in vivo has not yet been demonstrated.
The structure of the yeast protein YBL036C has been determined to a resolution of 2.0 Å. Similar in structure to the N-terminal domains of alanine racemase and ornithine decarboxylase, it forms a TIM barrel fold which begins with a long N-terminal helix, rather than the classical beta strand found at the beginning of most other TIM barrels. Unlike alanine racemase and ornithine decarboxylase, which are two-domain dimeric proteins, the yeast protein is a single domain monomer. A pyridoxal 5'-phosphate cofactor is covalently bound towards the C-terminal end of the barrel, which is the usual active site in TIM-barrel folds. Some racemase activity was observed for this protein and it was suggested by the authors that it may function as a general racemase.
This group represents a diphthamide biosynthesis protein 1.
This group represents a predicted translation machinery-associated RNA binding protein.
This group represents ATP-dependent RNA helicases, including the antiviral protein SKI2 and DOB1, which is involved in 3' end formation of rRNA and mRNA transport.
This group represents a 23S ribosomal RNA methyltransferase.
This group represents a coatomer, beta' subunit.
A variety of eukaryotic and plant ribosomal L10e proteins can be grouped. This family consists of vertebrate L10 (QM), plant L10, Caenorhabditis elegans L10, yeast L10 (QSR1) and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ0543.
This group represents a number of eukaryotic and archaeal DNA repair and recombination proteins which are homologous to the bacterial protein RecA.
The recA gene product is a multifunctional enzyme that plays a role in homologous recombination, DNA repair and induction of the SOS response.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.
AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.
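The heterotetrameric composition given above for AP1 and AP2 can be written down as a small lookup table. This is only an illustrative data structure; the names `AP_SUBUNITS` and `subunit_names` are hypothetical, and AP3 and AP4 are omitted because the text does not list their subunits.

```python
# Subunit composition of AP1 and AP2 as stated in the text:
# two large subunits (adaptins), one medium (mu) and one small (sigma).
AP_SUBUNITS = {
    "AP1": {
        "large": ("gamma-1-adaptin", "beta-1-adaptin"),
        "medium": "mu-1",
        "small": "sigma-1",
    },
    "AP2": {
        "large": ("alpha-adaptin", "beta-2-adaptin"),
        "medium": "mu-2",
        "small": "sigma-2",
    },
}


def subunit_names(complex_name):
    """Return all subunits of one AP complex as a flat list."""
    entry = AP_SUBUNITS[complex_name]
    return list(entry["large"]) + [entry["medium"], entry["small"]]


for name in AP_SUBUNITS:
    print(name, subunit_names(name))
```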
This entry represents the mu subunit of various clathrin adaptors (AP1, AP2 and AP3). The mu subunit regulates the coupling of clathrin lattices with particular membrane proteins by self-phosphorylation via a mechanism that is still unclear. The mu subunit possesses a highly conserved N-terminal domain of around 230 amino acids, which may be the region of interaction with other AP proteins; a linker region of between 10 and 42 amino acids; and a less well-conserved C-terminal domain of around 190 amino acids, which may be the site of specific interaction with the protein being transported in the vesicle.
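The domain sizes quoted for the mu subunit imply an approximate overall length, which a short calculation makes explicit. The figures are the rounded values from the text ("around 230", "around 190"), so the result is a rough estimate rather than a measured length; the constant and function names are hypothetical.

```python
# Approximate domain sizes of the mu subunit, in amino acids, from the text.
N_TERMINAL = 230          # highly conserved N-terminal domain (approximate)
LINKER_RANGE = (10, 42)   # shortest and longest linker
C_TERMINAL = 190          # less well-conserved C-terminal domain (approximate)


def mu_length_range():
    """Return the (minimum, maximum) total length implied by the domain sizes."""
    return tuple(N_TERMINAL + linker + C_TERMINAL for linker in LINKER_RANGE)


shortest, longest = mu_length_range()
print(f"approximately {shortest} to {longest} amino acids")
```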
Phosphoenolpyruvate carboxykinase (PEPCK) catalyses the first committed (rate-limiting) step in hepatic gluconeogenesis, namely the reversible decarboxylation of oxaloacetate to phosphoenolpyruvate (PEP) and carbon dioxide, using either ATP or GTP as a source of phosphate. The ATP-utilising and GTP-utilising enzymes form two divergent subfamilies, which have little sequence similarity but which retain conserved active site residues. ATP-utilising PEPCKs are monomers or oligomers of identical subunits found in certain bacteria, yeast, trypanosomatids, and plants, while GTP-utilising PEPCKs are mainly monomers found in animals and some bacteria. Both require divalent cations for activity, such as magnesium or manganese. One cation interacts with the enzyme at metal binding site 1 to elicit activation, while the second cation interacts at metal binding site 2 to serve as a metal-nucleotide substrate. In bacteria, fungi and plants, PEPCK is involved in the glyoxylate bypass, an alternative to the tricarboxylic acid cycle.
PEPCK helps to regulate blood glucose levels. The rate of gluconeogenesis can be controlled through transcriptional regulation of the PEPCK gene by cAMP (the mediator of glucagon and catecholamines), glucocorticoids and insulin. In general, PEPCK expression is induced by glucagon, catecholamines and glucocorticoids during periods of fasting and in response to stress, but is inhibited by (glucose-induced) insulin upon feeding. With type II diabetes, this regulation system can fail, resulting in increased gluconeogenesis that in turn raises glucose levels.
PEPCK consists of an N-terminal and a catalytic C-terminal domain, with the active site and metal ions located in a cleft between them. Both domains have an alpha/beta topology, and the two topologies are partly similar to one another. Substrate binding causes PEPCK to undergo a conformational change, which accelerates catalysis by forcing bulk solvent molecules out of the active site. PEPCK uses an alpha/beta/alpha motif for nucleotide binding; this motif differs from those of other kinase domains. GTP-utilising PEPCK has a PEP-binding domain and two kinase motifs to bind GTP and magnesium.
This entry represents ATP-utilising phosphoenolpyruvate carboxykinase enzymes.
A conserved heterotrimeric integral membrane protein complex--the Sec61 complex (eukaryotes) or SecY complex (prokaryotes)--forms a protein-conducting channel that allows polypeptides to be transferred across (or integrated into) the endoplasmic reticulum (eukaryotes) or across the cytoplasmic membrane (prokaryotes). This complex is itself a part of a larger translocase complex.
The alpha subunits, called Sec61alpha in mammals, Sec61p in Saccharomyces cerevisiae (Baker's yeast), and SecY in prokaryotes, and the gamma subunits, called Sec61gamma in mammals, Sss1p in S. cerevisiae, and SecE in prokaryotes, show significant sequence conservation. Both subunits are required for cell viability in S. cerevisiae and Escherichia coli. The beta subunits, called Sec61beta in mammals, Sbh in S. cerevisiae, and Sec-beta in archaea, are not essential for cell viability; they are similar in eukaryotes and archaea, but show no obvious homology to the corresponding SecG subunits in bacteria. SecY forms the channel pore, and it is the cross-linking partner of polypeptide chains passing through the membrane. SecY and SecE constitute the high-affinity SecA-binding site on the membrane.
The channel is a passive conduit for polypeptides. It must therefore associate with other components that provide a driving force. The partner proteins in bacteria and eukaryotes differ. In bacteria, the translocase complex comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of the mature peptide into the periplasm (SecD and SecF). The SecA ATPase interacts dynamically with the SecYEG integral membrane components to drive the transmembrane movement of newly synthesized preproteins. In S. cerevisiae (and probably in all eukaryotes), the full translocase comprises another membrane protein subcomplex (the tetrameric Sec62/63p complex), and the lumenal protein BiP, a member of the Hsp70 family of ATPases. BiP promotes translocation by acting as a molecular ratchet, preventing the polypeptide chain from sliding back into the cytosol.
This family includes eukaryotic translation initiation factor 6 (eIF6) as well as presumed archaeal homologues.
The assembly of 80S ribosomes requires joining of the 40S and 60S subunits, which is triggered by the formation of an initiation complex on the 40S subunit. This event is rate-limiting for translation, and depends on external stimuli and the status of the cell. Eukaryotic translation initiation factor 6 (eIF6) binds specifically to the free 60S ribosomal subunit and prevents its association with the 40S ribosomal subunit. Furthermore, eIF6 interacts in the cytoplasm with RACK1, a receptor for activated protein kinase C (PKC). RACK1 is a major component of translating ribosomes, which harbour significant amounts of PKC. Loading 60S subunits with eIF6 causes a dose-dependent translational block and impairment of 80S formation, both of which are reversed by expression of RACK1 and stimulation of PKC in vivo and in vitro. PKC stimulation leads to eIF6 phosphorylation and its release, promoting 80S ribosome formation. RACK1 thus provides a physical and functional link between PKC signalling and ribosome activation.
This is a subfamily of glycine cleavage T proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase that catalyses the following reaction:
(6S)-tetrahydrofolate + S-aminomethyldihydrolipoylprotein = (6R)-5,10-methylenetetrahydrofolate + NH3 + dihydrolipoylprotein
This group represents a tyrosine tRNA ligase, archaeal/eukaryotic types.
This group represents a small nuclear ribonucleoprotein SmF.
Members of this family catalyse the reduction of the 5,6-double bond of a uridine residue on tRNA. Dihydrouridine modification of tRNA is widely observed in prokaryotes and eukaryotes, and also in some archaea. Most dihydrouridines are found in the D loop of tRNAs. The role of dihydrouridine in tRNA is currently unknown, but it may increase conformational flexibility of the tRNA. It is likely that different family members have different substrate specificities, which may overlap. Dus 1 from Saccharomyces cerevisiae (Baker's yeast) acts on pre-tRNA-Phe, while Dus 2 acts on pre-tRNA-Tyr and pre-tRNA-Leu. Dus 1 is active as a single subunit, requiring NADPH or NADH, and is stimulated by the presence of FAD. Some family members may be targeted to the mitochondria and may even function there.
This entry represents transcription elongation factors of the IIS type. TFIIS is a component of RNA polymerase II preinitiation complexes, and is required for preinitiation complex assembly and stability. The association of TFIIS with a promoter depends on functional preinitiation complex components including Mediator and the SAGA complex. TFIIS is composed of three domains: domain 1 forms a 4-helical bundle that appears to bind certain initiation factors; domain 2 forms a 3-helical bundle and is required for Pol II binding; domain 3 forms a zinc ribbon and is essential for stimulation of RNA cleavage.
This group represents a phosphatidylinositol N-acetylglucosaminyltransferase, GPI19/PIG-P subunit.
This group represents a mitochondrial fission 1 protein.
This group represents a DNA primase, large subunit.
This group represents a polyubiquitin-tagged protein recognition complex, Npl4 component.
1L-myo-Inositol-1-phosphate synthase catalyzes the conversion of D-glucose 6-phosphate to 1L-myo-inositol-1-phosphate, the first committed step in the production of all inositol-containing compounds, including phospholipids, either directly or by salvage. The enzyme exists in a cytoplasmic form in a wide range of plants, animals, and fungi. It has also been detected in several bacteria, and a chloroplast form is observed in algae and higher plants. Inositol phosphates play an important role in signal transduction.
In Saccharomyces cerevisiae (Baker's yeast), the transcriptional regulation of the INO1 gene has been studied in detail and its expression is sensitive to the availability of phospholipid precursors as well as growth phase. The regulation of the structural gene encoding 1L-myo-inositol-1-phosphate synthase has also been analyzed at the transcriptional level in the aquatic angiosperm, Spirodela polyrrhiza (Giant duckweed) and the halophyte, Mesembryanthemum crystallinum (Common ice plant).
This group represents an adaptor protein complex, sigma subunit.
This protein family is found in archaea and eukaryota. The human TFAR19 gene encodes a protein which shares significant homology with the corresponding proteins of species ranging from yeast to mice. TFAR19 exhibits a ubiquitous expression pattern and its expression is up-regulated in tumour cells undergoing apoptosis. TFAR19 may play a general role in the apoptotic process. Also included in this family is a DNA-binding protein from the archaeon Methanobacterium thermoautotrophicum.
This group represents a nascent polypeptide-associated complex, alpha subunit.
This group represents a TFIIH basal transcription factor complex, subunit SSL1.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins.
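The subunit naming above is case-sensitive: upper-case letters A-H belong to V1, while lower-case a, c, c', c'' and d belong to V0. The following is a minimal sketch of that assignment, using only the subunit lists given in the paragraph above; the names `V1_SUBUNITS`, `V0_SUBUNITS` and `complex_of` are hypothetical and chosen for illustration.

```python
# Subunit lists taken from the description above; all names are illustrative.
V1_SUBUNITS = set("ABCDEFGH")               # catalytic V1 complex, subunits A-H
V0_SUBUNITS = {"a", "c", "c'", "c''", "d"}  # membrane-spanning V0 complex


def complex_of(subunit: str) -> str:
    """Return 'V1' or 'V0' for a V-ATPase subunit name (case-sensitive)."""
    if subunit in V1_SUBUNITS:
        return "V1"
    if subunit in V0_SUBUNITS:
        return "V0"
    raise ValueError(f"unknown V-ATPase subunit: {subunit!r}")


if __name__ == "__main__":
    for name in ["F", "H", "D", "d", "c''"]:
        print(name, "->", complex_of(name))
```

Because subunit D (V1) and subunit d (V0) differ only in case, any lookup of this kind should avoid normalising the case of its input.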
This entry represents subunit F found in the V1 complex of V-ATPases in eukaryotes. Subunit F is a 16 kDa protein that is required for the assembly and activity of V-ATPase, and has a potential role in the differential targeting and regulation of the enzyme for specific organelles. This subunit is not necessary for the rotation of the ATPase V1 rotor, but it does promote catalysis.
More information about this protein can be found at Protein of the Month: ATP Synthases.
Intracellular proteins, including short-lived proteins such as cyclin, Mos, Myc, p53, NF-kappaB, and IkappaB, are degraded by the ubiquitin-proteasome system. The 26S proteasome is a self-compartmentalising protease responsible for the regulated degradation of intracellular proteins in eukaryotes. This giant intracellular protease is formed by several subunits arranged into two 19S polar caps, where protein recognition and ATP-dependent unfolding occur, flanking a 20S central barrel-shaped structure with an inner proteolytic chamber. This overall structure is highly conserved among eukaryotes and is essential for cell viability. Proteins targeted to the 26S proteasome are conjugated with a polyubiquitin chain by an enzymatic cascade before delivery to the 26S proteasome for degradation into oligopeptides.
The 19S component is divided into a "base" subunit containing six ATPases (Rpt proteins) and two non-ATPases (Rpn1, Rpn2), and a "lid" subunit composed of eight stoichiometric proteins (Rpn3, Rpn5, Rpn6, Rpn7, Rpn8, Rpn9, Rpn11, Rpn12). Additional non-essential and species-specific proteins may also be present. The 19S unit performs several essential functions including binding the specific protein substrates, unfolding them, cleaving the attached ubiquitin chains, opening the 20S subunit, and driving the unfolded polypeptide into the proteolytic chamber for degradation. The 26S proteasome and 19S regulator are of medical interest due to their involvement in burn rehabilitation.
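The base/lid composition described above can be written down as a small lookup table. This is only a sketch: the Rpn names come from the paragraph above, whereas the individual names Rpt1 to Rpt6 for the six ATPases are an assumption based on the usual yeast nomenclature, and `REGULATORY_PARTICLE_19S` and `subcomplex_of` are hypothetical names.

```python
# Composition of the 19S regulatory particle as described above.
# Rpt1-Rpt6 are assumed names for the "six ATPases (Rpt proteins)".
REGULATORY_PARTICLE_19S = {
    "base": {
        "ATPase": [f"Rpt{i}" for i in range(1, 7)],
        "non-ATPase": ["Rpn1", "Rpn2"],
    },
    "lid": {
        "ATPase": [],
        "non-ATPase": ["Rpn3", "Rpn5", "Rpn6", "Rpn7",
                       "Rpn8", "Rpn9", "Rpn11", "Rpn12"],
    },
}


def subcomplex_of(protein: str):
    """Return 'base' or 'lid' for a listed subunit, or None if it is not listed."""
    for subcomplex, classes in REGULATORY_PARTICLE_19S.items():
        if any(protein in members for members in classes.values()):
            return subcomplex
    return None


if __name__ == "__main__":
    for name in ["Rpn1", "Rpn2", "Rpn11", "Rpt3"]:
        print(name, "->", subcomplex_of(name))
```

Proteins that the text does not list (for example the additional non-essential or species-specific components) return `None` rather than being guessed into a subcomplex.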
This group represents a 26S proteasome regulatory complex, non-ATPase subcomplex, Rpn2/Psmd1 subunit.
A large ribonuclear protein complex is required for the processing of the small-ribosomal-subunit rRNA - the small-subunit (SSU) processome. This preribosomal complex contains the U3 snoRNA and at least 40 proteins, which have the following properties:
There appears to be a linkage between polymerase I transcription and the formation of the SSU processome, as some, but not all, of the SSU processome components are required for pre-rRNA transcription initiation. These SSU processome components have been termed t-Utps. They form a pre-complex with pre-18S rRNA in the absence of snoRNA U3 and other SSU processome components. It has been proposed that the t-Utp complex proteins are both rDNA- and rRNA-binding proteins that are involved in the initiation of pre-18S rRNA transcription, initially binding to rDNA and then associating with the 5' end of the nascent pre-18S rRNA. The t-Utp complex forms the nucleus around which the rest of the SSU processome components, including snoRNA U3, assemble. Electron microscopy suggests that the SSU processome may correspond to the terminal knobs visualized at the 5' ends of nascent 18S rRNA.
This entry contains Utp11, a large ribonuclear protein that associates with snoRNA U3.
This group represents a 26S proteasome regulatory complex, non-ATPase subcomplex, Rpn1 (regulatory-particle non-ATPase subunit 1). This subunit is essential for embryogenesis in Arabidopsis thaliana.
RER1 family proteins are involved in the retrieval of some endoplasmic reticulum membrane proteins from the early Golgi compartment. The C terminus of yeast Rer1p interacts with a coatomer complex.
Glycosylphosphatidylinositol (GPI) represents an important anchoring molecule for cell surface proteins. The first step in its synthesis is the transfer of N-acetylglucosamine (GlcNAc) from UDP-N-acetylglucosamine to phosphatidylinositol (PI). This step involves the products of three genes in yeast (GPI1, GPI2 and GPI3) and four genes in mammals (GPI1, PIG A, PIG H and PIG C).
This group represents a eukaryotic translation initiation factor 3, subunit 6.
Leucine carboxymethyltransferases methylate the carboxyl group of leucine residues to form alpha-leucine ester residues. The family includes LCMT1, which regulates the activity of serine/threonine phosphatase 2A (PP2A) through methylation of the C-terminal leucine residue of the catalytic subunit of PP2A. This affects the heteromultimeric composition of PP2A, which in turn affects protein recognition and substrate specificity. Like many other methyltransferases, LCMT1 uses S-adenosylmethionine (SAM) as the methyl donor. LCMT1 contains the common SAM-dependent methyltransferase core fold, with various insertions and additions creating a specific PP2A binding site.
This group represents the LCMT1 subgroup of leucine carboxymethyltransferases.
This group represents a ubiquitin-specific protease (ubiquitin carboxyl-terminal hydrolase).
This group represents a tRNA (guanine-N(1)-)-methyltransferase, eukaryotic type.
This group represents a U6 snRNA-associated Sm-like protein LSm2.
This group represents a prefoldin, subunit 3.
This group represents a prefoldin, subunit 4.
This entry represents 60S ribosome subunit biogenesis protein Nip7, which is required for proper 27S pre-rRNA processing and 60S ribosome subunit assembly. In yeast, Nip7 interacts with nucleolar proteins such as Nol8, and with the exosome subunit Rrp43p. Nip7 contains a PUA domain.
Thioredoxins are small disulphide-containing redox proteins that have been found in all the kingdoms of living organisms. Thioredoxin serves as a general protein disulphide oxidoreductase. It interacts with a broad range of proteins by a redox mechanism based on reversible oxidation of 2 cysteine thiol groups to a disulphide, accompanied by the transfer of 2 electrons and 2 protons. The net result is the covalent interconversion of a disulphide and a dithiol.
Compared to human thioredoxin, human U5 snRNP-specific protein U5-15kD contains 37 additional residues that may cause structural changes which most likely form putative binding sites for other spliceosomal proteins or RNA. Although U5-15kD apparently lacks protein disulphide isomerase activity, it is strictly required for pre-mRNA splicing.
This group represents a predicted transmembrane protein 85.
This group represents a translation initiation factor eIF-2A.
This group represents a tRNA guanosine-2'-O-methyltransferase, TRM11 type.
This group represents a TRAPP I complex, Trs31 subunit.
This group represents a cleavage and polyadenylation specificity factor, 25 kDa subunit.
This group represents a TRAPP I complex, Bet3 subunit.
This group represents a DNA polymerase alpha, subunit B.
Members of this group are poly(A) polymerases (polynucleotide adenylyltransferases, PAP). In eukaryotes, polyadenylation of pre-mRNA plays an essential role in the initiation step of protein synthesis, as well as in the export and stability of mRNAs. Poly(A) polymerase, the central enzyme of the polyadenylation machinery, is a template-independent RNA polymerase that specifically incorporates ATP at the 3' end of mRNA.
The catalytic domain of poly(A) polymerase shares substantial structural homology with other nucleotidyl transferases such as DNA polymerase beta and kanamycin transferase. The three invariant aspartates of the catalytic triad ligate two of the three active site metals. One of these metals also contacts the adenine ring. Other conserved, catalytically important residues contact the nucleotide. These contacts, taken together with metal coordination of the adenine base, provide a structural basis for ATP selection by poly(A) polymerase.
The central domain of poly(A) polymerase shares structural similarity with the allosteric activity domain of ribonucleotide reductase R1, which comprises a four-helix bundle and a three-stranded mixed beta-sheet. Even though the two enzymes bind ATP, the ATP-recognition motifs are different. The C-terminal domain is predicted to be an RNA-binding domain because it folds into a compact domain reminiscent of the RNA-recognition motif fold.
The C-terminal region beyond the predicted RNA-binding domain is only conserved in vertebrates and is dispensable for catalytic activity in vitro. The extended C-terminal domain of vertebrate PAPs is rich in serines and threonines, and enzyme activity can be downregulated by phosphorylation at multiple sites. The extreme C terminus of PAP is also the target for another type of regulation. The U1A protein, a component of the U1 snRNP which functions in 5' splice site recognition, is known to inhibit polyadenylation of its own mRNA by binding to PAP. The C terminus of PAP is also involved in protein-protein interactions with the splicing factor U2AF65 and the snRNP protein U1-70K.
Note that there is also an unrelated group in which at least some of the members display poly(A) polymerase activity.
The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesizes ATP, and the V0 or A0 complex forms the membrane-spanning pore. The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases.
This entry represents subunit d from the V0 complex of V-ATPases, which are involved in the translocation of protons across a membrane. There is more than one type of d subunit in V-ATPases: the d1 subunit is ubiquitous, while the d2 subunit has limited tissue expression, possibly to account for differential functions, targeting or regulation of V-ATPase activity.
This group represents a vesicle-associated membrane protein.
This group represents a profilin, apicomplexa type.
This group represents a nitric oxide synthase-interacting protein.
This family consists of several eukaryotic transcription elongation Spt4 proteins. Three transcription-elongation factors, Spt4, Spt5, and Spt6, are conserved among eukaryotes and are essential for transcription via the modulation of chromatin structure. Spt4 and Spt5 are tightly associated in a complex, while the physical association of the Spt4-Spt5 complex with Spt6 is considerably weaker. It has been demonstrated that Spt4, Spt5, and Spt6 play roles in transcription elongation in both yeast and humans, including a role in activation by Tat. It is known that Spt4, Spt5, and Spt6 are general transcription-elongation factors, controlling transcription both positively and negatively in important regulatory and developmental roles.
This group represents a Cdk5 and c-Abl linker protein cables.
This group represents a negative regulatory factor PREG.
This group represents an E3 ubiquitin ligase SCF complex, Skp subunit.
The Saccharomyces cerevisiae ISN1 (YOR155c) gene encodes an IMP-specific 5'-nucleotidase, which catalyses degradation of IMP to inosine as part of the purine salvage pathway.
Snz1p is a highly conserved protein involved in growth arrest in Saccharomyces cerevisiae (Baker's yeast). Sor1 (singlet oxygen resistance) is essential in pyridoxine (vitamin B6) synthesis in Cercospora nicotianae and Aspergillus flavus. Pyridoxine quenches singlet oxygen at a rate comparable to that of vitamins C and E, two of the most highly efficient biological antioxidants, suggesting a previously unknown role for pyridoxine in active oxygen resistance.
This entry represents subunit H (also known as Vma13p) found in the V1 complex of V-ATPases. This subunit has a regulatory function, being responsible for activating ATPase activity and coupling ATPase activity to proton flow. The yeast enzyme contains five motifs similar to the HEAT or Armadillo repeats seen in the importins, and can be divided into two distinct domains: a large N-terminal domain consisting of stacked alpha helices, and a smaller C-terminal alpha-helical domain with a similar superhelical topology to an armadillo repeat.
This group represents a serine/threonine protein phosphatase, BSU1 type.
This group represents a translation initiation factor eIF-3b, which binds to the 40S ribosome and promotes the binding of methionyl-tRNAi and mRNA. eIF-3 is composed of at least 12 different subunits.
Diphthine synthase, also known as diphthamide biosynthesis S-adenosylmethionine-dependent methyltransferase, participates in the modification of a specific histidine residue in elongation factor 2 (EF-2) of eukaryotes and archaea to diphthamide. It is required for the methylation step in diphthamide biosynthesis. The protein was characterised in Saccharomyces cerevisiae and designated DPH5.
This group represents a DNA replication factor C, large subunit.
Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles, and regulate cyclin dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins, G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E.
Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi's sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus.
This group represents a cyclin, L type.
This group represents a predicted 26S proteasome regulatory complex, non-ATPase subcomplex, subunit s5a, Plasmodium type.
This family consists of several eukaryotic transcription elongation Spt5 proteins. These proteins contain two copies of a domain (Supt5) that is characteristic of proteins involved in chromatin regulation. An NGN domain separates the Supt5 domains. In the yeast Spt5 protein, this domain possesses an RNP-like fold and is thought to confer affinity for the Spt4 protein. The Supt5 domains are followed by four to five copies of a KOW domain, which is present in many ribosomal proteins.
Three transcription-elongation factors, Spt4, Spt5, and Spt6, are conserved among eukaryotes and are essential for transcription via modulation of chromatin structure. Spt4 and Spt5 are tightly associated in a complex, while the physical association of this complex with Spt6 is considerably weaker. It has been demonstrated that Spt4, Spt5, and Spt6 play roles in transcription elongation in both yeast and humans, including a role in activation by Tat. It is known that Spt4, Spt5, and Spt6 are general transcription-elongation factors, controlling transcription both positively and negatively in important regulatory and developmental roles.
This information was partially derived from InterPro.
This group represents a splicing factor 3B, subunit 5.
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
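The relationship between the linear order of the triad residues and the clan can be sketched as a simple lookup. This is an illustration only, covering just the three clans named above; the function and variable names are hypothetical, and the residue positions in the usage lines (His57/Asp102/Ser195 for chymotrypsin, Asp32/His64/Ser221 for subtilisin) are the conventional textbook numberings, supplied here as assumptions rather than taken from the text.

```python
# Linear (N- to C-terminal) order of catalytic residues for the clans named above.
TRIAD_ORDER_TO_CLAN = {
    "HDS": "PA",  # chymotrypsin clan
    "DHS": "SB",  # subtilisin clan
    "SDH": "SC",  # carboxypeptidase clan
}


def triad_order(positions: dict) -> str:
    """Sort one-letter residue codes by sequence position and join them."""
    ordered = sorted(positions.items(), key=lambda item: item[1])
    return "".join(residue for residue, _ in ordered)


def clan_for(positions: dict):
    """Return the clan implied by the triad order, or None if it is not listed."""
    return TRIAD_ORDER_TO_CLAN.get(triad_order(positions))


if __name__ == "__main__":
    chymotrypsin = {"H": 57, "D": 102, "S": 195}  # assumed textbook numbering
    subtilisin = {"D": 32, "H": 64, "S": 221}     # assumed textbook numbering
    print(triad_order(chymotrypsin), clan_for(chymotrypsin))
    print(triad_order(subtilisin), clan_for(subtilisin))
```

A matching residue order is consistent with a clan but does not establish membership, since clans are assigned on structural and other evidence.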
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This entry represents Apicomplexa rhomboid-like proteins, Rom4/Rom5, which are members of the S54 peptidase family of proteins. These proteins are putative serine proteases involved in intra-membrane proteolysis and the subsequent release of polypeptides from their membrane anchors. They cleave type-1 transmembrane domains using a catalytic triad composed of serine, histidine and asparagine contributed by different transmembrane domains.
This group represents a coatomer, gamma subunit.
Adapter-like complex 4 (AP-4) is a heterotetramer composed of two large adaptins (epsilon-type subunit AP4E1 and beta-type subunit AP4B1), a medium adaptin (mu-type subunit AP4M1) and a small adaptin (sigma-type AP4S1). It is a subunit of a novel type of clathrin- or non-clathrin-associated protein coat involved in targeting proteins from the trans-Golgi network (TGN) to the endosomal-lysosomal system.
This group represents an adaptor protein complex AP-4, epsilon subunit.
This group represents a D-site 20S pre-rRNA nuclease.
This group represents a U6 snRNA-associated Sm-like protein LSm7.
This protein previously of unknown biochemical function is essential in Escherichia coli. It has now been characterised as 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase, which converts 2C-methyl-D-erythritol 2,4-cyclodiphosphate (ME-2,4CPP) into 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate in the sixth step of nonmevalonate terpenoid biosynthesis. The family is restricted to bacteria, where it is widely but not universally distributed. No homology can be detected between this family and other proteins.
This entry represents a group of atypical 4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthases which contain a partially-duplicated domain.
This group represents a multidomain scavenger receptor-like protein PxSR.
This group represents a transmembrane transporter protein, chloroquine resistant type.
This group represents a DNA mismatch repair protein Msh6.
This entry represents cytochrome c oxidase assembly factors Sco1 and Sco2 (Synthesis of Cytochrome c Oxidase, factors 1 and 2), mitochondrial inner membrane-tethered metallochaperones that have regulatory roles in the maintenance of cellular copper homeostasis. These proteins are essential for the assembly of the catalytic core of cytochrome c oxidase (COX or complex IV), and have other roles in copper homeostasis such as mitochondrial redox signalling. Both Sco1 and Sco2 contain highly conserved CXXXC motifs thought to be required for copper binding.
COX is the terminal enzyme of the energy-transducing respiratory chain in eukaryotes and certain prokaryotes. It catalyses the transfer of electrons from cytochrome c to molecular oxygen and pumps protons across the mitochondrial inner membrane to establish a proton gradient for ATP synthesis. It consists of 12-13 protein subunits, with 3 subunits (Cox1-Cox3) forming the enzyme core. COX uses haem and copper as cofactors: Cox1 contains a 1-copper centre (CuB) that interacts with the haem moiety and Cox2 contains a 2-copper centre (CuA). Sco1 and Sco2 act as copper chaperones, transporting copper to the CuA site in Cox2, and are thought to have cooperative functions in COX assembly. In addition, human Sco2 is the downstream mediator of the balance between the utilization of respiratory and glycolytic pathways, and both Sco1 and Sco2 may help regulate cellular copper levels (homeostasis). Sco2 may have a copper-level-detection signalling role, acting upstream of and in conjunction with Sco1.
Defects in Sco1 are a cause of cytochrome c oxidase deficiency (COX deficiency) (OMIM:220110), a clinically heterogeneous disorder with features ranging from isolated myopathy to severe multisystem disease, and onset from infancy to adulthood. Defects in Sco2 are the cause of fatal infantile cardioencephalomyopathy with cytochrome c oxidase deficiency (FIC) (OMIM:604377, OMIM:220110), which is characterised by hypertrophic cardiomyopathy, lactic acidosis, and gliosis.
This group represents a predicted methyltransferase, METTL2 type.
This group represents a histone deposition protein Asf1.
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
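The relationship between linear triad order and clan can be illustrated with a minimal sketch. The three orders come from the paragraph above; the function and dictionary names are hypothetical, and the residue positions in the usage note are the commonly cited numbering for chymotrypsin and subtilisin, supplied here as an assumption rather than taken from this entry.

```python
# Linear order of the catalytic triad residues for the three clans named in the text.
TRIAD_ORDER_BY_CLAN = {
    "PA": "HDS",  # chymotrypsin clan
    "SB": "DHS",  # subtilisin clan
    "SC": "SDH",  # carboxypeptidase clan
}


def triad_order(positions):
    """Return the one-letter residue codes sorted by sequence position.

    positions maps a residue code ("S", "D" or "H") to its position
    in the polypeptide chain.
    """
    return "".join(sorted(positions, key=positions.get))


def clans_matching(positions):
    """List the clans whose triad order matches the given positions."""
    order = triad_order(positions)
    return [clan for clan, expected in TRIAD_ORDER_BY_CLAN.items() if expected == order]
```

For instance, `clans_matching({"H": 57, "D": 102, "S": 195})` gives `["PA"]`, and `clans_matching({"D": 32, "H": 64, "S": 221})` gives `["SB"]`. A matching order is only suggestive: it does not by itself establish clan membership, which rests on structural evidence.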
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Limited proteolysis of most large protein precursors is carried out in vivo by the subtilisin-like pro-protein convertases. Many important biological processes such as peptide hormone synthesis, viral protein processing and receptor maturation involve proteolytic processing by these enzymes. The subtilisin-serine protease (SRSP) family hormone and pro-protein convertases (furin, PC1/3, PC2, PC4, PACE4, PC5/6, and PC7/7/LPC) act within the secretory pathway to cleave polypeptide precursors at specific basic sites, generating their biologically active forms. Serum proteins, pro-hormones, receptors, zymogens, viral surface glycoproteins, bacterial toxins and others are activated by this route. The SRSPs share the same domain structure, including a signal peptide, the pro-peptide, the catalytic domain, the P/middle or homo B domain, and the C-terminus.
This entry contains serine peptidases belonging to MEROPS peptidase family S8A (subtilisin family, clan SB). All of the peptidases in this entry derive from Plasmodium spp. and are called 'subtilisin-like peptidase 1'.
The peptidase in Plasmodium falciparum (isolate 3D7) termed pfSUB1 is a component of the exoneme. PfSUB1 mediates the proteolytic maturation of at least two essential members of another enzyme family called SERA. This proteolytic processing event is required for the release of viable parasites from the host erythrocyte.
HDAs function in multi-subunit complexes, reversing the acetylation of histones by histone acetyltransferases, and are also believed to deacetylate general transcription factors such as TFIIF and sequence-specific transcription factors such as p53. Thus, HDAs contribute to the regulation of transcription, in particular transcriptional repression. At N-terminal tails of histones, removal of the acetyl group from the epsilon-amino group of a lysine side chain will restore its positive charge, which may stabilise the histone-DNA interaction and prevent activating transcription factors binding to promoter elements. HDAs play important roles in the cell cycle and differentiation, and their deregulation can contribute to the development of cancer.
This group represents a translation initiation factor 3, RNA-binding subunit.
This group represents an uncharacterised protein with zinc finger Ran-binding domain, ZRANB2-type.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
This family consists of ribosomal protein L5 from eukaryotes. The ribosomal 5S RNA is the only known rRNA species to bind a ribosomal protein before its assembly into the ribosomal subunits. In eukaryotes, the 5S rRNA molecule binds one protein species, a 34-kDa protein which has been implicated in the intracellular transport of 5S rRNA.
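The naming convention for ribosomal proteins (an S or L prefix for the small or large subunit, followed by a number) can be captured in a short sketch. The numeric ranges are the ones quoted above; the function name and the strictness of the pattern are illustrative assumptions, and real annotations often carry extra suffixes that this sketch rejects.

```python
import re

# Prefix -> (subunit, highest number quoted in the text).
SUBUNIT_RANGES = {"S": ("small", 31), "L": ("large", 44)}


def subunit_of(protein_name):
    """Return "small" or "large" for a ribosomal protein name such as "S4" or "L22"."""
    match = re.fullmatch(r"([SL])(\d+)", protein_name)
    if match is None:
        raise ValueError(f"not a plain S/L ribosomal protein name: {protein_name!r}")
    prefix, number = match.group(1), int(match.group(2))
    subunit, highest = SUBUNIT_RANGES[prefix]
    if not 1 <= number <= highest:
        raise ValueError(f"{protein_name!r} is outside the range {prefix}1 to {prefix}{highest}")
    return subunit
```

With this sketch, `subunit_of("L22")` returns `"large"` and `subunit_of("S7")` returns `"small"`.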
This family covers bacterial ribosomal protein L20 and its chloroplast equivalent.
L27 is a protein from the large (50S) subunit; it is essential for ribosome function, but its exact role is unclear. It belongs to a family of ribosomal proteins, examples of which are found in bacteria, chloroplasts of plants and red algae and the mitochondria of fungi (e.g. MRP7 from yeast mitochondria). The schematic relationship between these groups of proteins is shown below.
L35 is a basic protein of 60 to 70 amino-acid residues from the large (50S) subunit. Like many basic polypeptides, L35 completely inhibits ornithine decarboxylase when present unbound in the cell, but the inhibitory function is abolished upon its incorporation into ribosomes. It belongs to a family of ribosomal proteins, including L35 from bacteria, plant chloroplast, red algae chloroplasts and cyanelles. In plants it is a nuclear encoded gene product, which suggests a chloroplast-to-nucleus relocation during the evolution of higher plants.
Fatty acid desaturases are enzymes that catalyze the insertion of a double bond at the delta position of fatty acids.
There seem to be two distinct families of fatty acid desaturases that do not appear to be evolutionarily related: the first contains stearoyl-CoA desaturase (SCD); the second includes plant stearoyl-acyl-carrier protein desaturase and the cyanobacterial desA protein.
SCD is a key regulatory enzyme of unsaturated fatty acid biosynthesis. In association with cytochrome b5 and NADP-dependent cytochrome b5 reductase, it constitutes part of a microsomal membrane-bound 3-component system in animals and fungi. SCD contains 4 putative transmembrane (TM) regions that anchor it in the microsomal membrane. SCD uses oxygen and electrons from reduced cytochrome b5 to catalyse the insertion of a cis double bond between carbons 9 and 10 of a spectrum of fatty acids. The preferred substrates of SCD are palmitoyl-CoA and stearoyl-CoA, which are converted to palmitoleic (16:1) and oleic (18:1) acids respectively. These unsaturated molecules are the major storage form of fatty acids (as triacylglycerols) in adipocytes.
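The substrate-to-product relationship described above (16:0 to 16:1, 18:0 to 18:1) can be expressed in the usual carbons:double-bonds shorthand. This is a minimal sketch under the assumption that only saturated substrates are considered; the function name is hypothetical and the code models the notation, not the enzyme.

```python
def desaturate_delta9(shorthand):
    """Convert a saturated "C:0" fatty acid shorthand to its mono-unsaturated form.

    The single cis double bond is assumed to lie between carbons 9 and 10,
    so the chain must be at least 10 carbons long.
    """
    carbons, double_bonds = (int(part) for part in shorthand.split(":"))
    if double_bonds != 0:
        raise ValueError("this sketch only handles saturated substrates")
    if carbons < 10:
        raise ValueError("chain too short for a double bond between carbons 9 and 10")
    return f"{carbons}:{double_bonds + 1}"
```

Here `desaturate_delta9("16:0")` returns `"16:1"` (palmitoyl to palmitoleoyl) and `desaturate_delta9("18:0")` returns `"18:1"` (stearoyl to oleoyl).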
6-Phosphogluconate dehydrogenase (6PGD) is an oxidative decarboxylase that catalyses the NADP-dependent conversion of 6-phosphogluconate into ribulose 5-phosphate and CO2, reducing NADP to NADPH. This reaction is a component of the hexose mono-phosphate shunt and pentose phosphate pathways (PPP). Prokaryotic and eukaryotic 6PGD are proteins of about 470 amino acids whose sequences are highly conserved. The protein is a homodimer in which the monomers act independently: each contains a large, mainly alpha-helical domain and a smaller beta-alpha-beta domain, containing a mixed parallel and anti-parallel 6-stranded beta sheet. NADP is bound in a cleft in the small domain, with the substrate binding in an adjacent pocket.
Glutamate, leucine, phenylalanine and valine dehydrogenases are structurally and functionally related. They contain a Gly-rich region containing a conserved Lys residue, which has been implicated in the catalytic activity, in each case a reversible oxidative deamination reaction.
Glutamate dehydrogenases (GluDH) are enzymes that catalyse the NAD- and/or NADP-dependent reversible deamination of L-glutamate into alpha-ketoglutarate. GluDH isozymes are generally involved with either ammonia assimilation or glutamate catabolism. Two separate enzymes are present in yeasts: the NADP-dependent enzyme, which catalyses the amination of alpha-ketoglutarate to L-glutamate; and the NAD-dependent enzyme, which catalyses the reverse reaction. The latter form links the L-amino acids with the Krebs cycle, which provides a major pathway for metabolic interconversion of alpha-amino acids and alpha-keto acids.
Leucine dehydrogenase (LeuDH) is an NAD-dependent enzyme that catalyses the reversible deamination of leucine and several other aliphatic amino acids to their keto analogues. Each subunit of this octameric enzyme from Bacillus sphaericus contains 364 amino acids and folds into two domains, separated by a deep cleft. The nicotinamide ring of the NAD+ cofactor binds deep in this cleft, which is thought to close during the hydride transfer step of the catalytic cycle.
Phenylalanine dehydrogenase (PheDH) is an NAD-dependent enzyme that catalyses the reversible deamination of L-phenylalanine into phenyl-pyruvate.
Valine dehydrogenase (ValDH) is an NADP-dependent enzyme that catalyses the reversible deamination of L-valine into 3-methyl-2-oxobutanoate.
This family contains both lactate and malate dehydrogenases. Malate dehydrogenases catalyse the interconversion of malate and oxaloacetate. The enzyme participates in the citric acid cycle.
L-lactate dehydrogenase (LDH) catalyses the reversible NAD-dependent interconversion of pyruvate and L-lactate. In vertebrate muscles and in lactic acid bacteria it represents the final step in anaerobic glycolysis. This tetrameric enzyme is present in prokaryotic and eukaryotic organisms. In vertebrates there are three isozymes of LDH: the M form (LDH-A), found predominantly in muscle tissues; the H form (LDH-B), found in heart muscle; and the X form (LDH-C), found only in the spermatozoa of mammals and birds. In the eye lenses of birds and crocodilians, LDH-B serves as a structural protein and is known as epsilon-crystallin.
L-2-hydroxyisocaproate dehydrogenase (L-hicDH) catalyses the reversible and stereospecific interconversion between 2-ketocarboxylic acids and L-2-hydroxy-carboxylic acids. L-hicDH is evolutionarily related to LDHs.
Glutamine amidotransferase (GATase) activity involves the removal of the ammonia group from a glutamine molecule and its subsequent transfer to a specific substrate, thus creating a new carbon-nitrogen group on the substrate. This activity is found in a range of biosynthetic enzymes, including glutamine amidotransferase, anthranilate synthase component II, p-aminobenzoate, and glutamine-dependent carbamoyl-phosphate synthase (CPSase). GATase domains can occur either as single polypeptides, as in glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. On the basis of sequence similarities, two classes of GATase domains have been identified: class-I (also known as trpG-type) and class-II (also known as purF-type). Class-I GATase domains are defined by a conserved catalytic triad consisting of cysteine, histidine and glutamate. Class-I GATase domains have been found in the following enzymes: the second component of anthranilate synthase and 4-amino-4-deoxychorismate (ADC) synthase; CTP synthase; GMP synthase; glutamine-dependent carbamoyl-phosphate synthase; phosphoribosylformylglycinamidine synthase II; and the histidine amidotransferase hisH.
In some bacteria, such as Escherichia coli, anthranilate synthase component II can be much larger than in other organisms, due to the presence of phosphoribosyl-anthranilate transferase (PRTase) activity. This is the second step in tryptophan biosynthesis and results in the addition of 5-phosphoribosyl-1-pyrophosphate to anthranilate to create N-5'-phosphoribosyl-anthranilate. Some studies have suggested that the larger component II could have arisen by gene fusion, a hypothesis supported by the fact that the two activities are found in discrete domains that are physically separated in the 3D model.
Carbamoyl phosphate synthase (CPSase) is a heterodimeric enzyme composed of a small and a large subunit (with the exception of CPSase III, see below). CPSase catalyses the synthesis of carbamoyl phosphate from bicarbonate, ATP and glutamine or ammonia, and represents the first committed step in pyrimidine and arginine biosynthesis in prokaryotes and eukaryotes, and in the urea cycle in most terrestrial vertebrates. CPSase has three active sites, one in the small subunit and two in the large subunit. The small subunit contains the glutamine binding site and catalyses the hydrolysis of glutamine to glutamate and ammonia. The large subunit has two homologous carboxy phosphate domains, both of which have ATP-binding sites; however, the N-terminal carboxy phosphate domain catalyses the phosphorylation of bicarbonate, while the C-terminal domain catalyses the phosphorylation of the carbamate intermediate. The carboxy phosphate domain found duplicated in the large subunit of CPSase is also present as a single copy in the biotin-dependent enzymes acetyl-CoA carboxylase (ACC), propionyl-CoA carboxylase (PCCase), pyruvate carboxylase (PC) and urea carboxylase.
Most prokaryotes carry one form of CPSase that participates in both arginine and pyrimidine biosynthesis; however, certain bacteria can have separate forms. The large subunit in bacterial CPSase has four structural domains: the carboxy phosphate domain 1, the oligomerisation domain, the carbamoyl phosphate domain 2 and the allosteric domain. CPSase heterodimers from Escherichia coli contain two molecular tunnels: an ammonia tunnel and a carbamate tunnel. These inter-domain tunnels connect the three distinct active sites, and function as conduits for the transport of unstable reaction intermediates (ammonia and carbamate) between successive active sites. The catalytic mechanism of CPSase involves the diffusion of carbamate through the interior of the enzyme from the site of synthesis within the N-terminal domain of the large subunit to the site of phosphorylation within the C-terminal domain.
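The hand-off of intermediates between the three active sites can be checked with a simple bookkeeping sketch that sums the partial reactions into a net reaction. The glutamine hydrolysis and the two phosphorylation steps follow the description above; the carboxyphosphate intermediate and the release of inorganic phosphate (Pi) when ammonia forms carbamate are not spelled out in this entry and are filled in from the commonly described mechanism, so they should be read as assumptions. All names in the code are illustrative.

```python
from collections import Counter

# Each step is (substrates, products); species are counted per catalytic cycle.
STEPS = [
    # Small subunit: glutamine is hydrolysed to glutamate and ammonia.
    (Counter({"glutamine": 1, "H2O": 1}), Counter({"glutamate": 1, "NH3": 1})),
    # Large subunit, N-terminal domain: bicarbonate is phosphorylated.
    (Counter({"bicarbonate": 1, "ATP": 1}), Counter({"carboxyphosphate": 1, "ADP": 1})),
    # Assumed step: ammonia from the tunnel reacts to give carbamate.
    (Counter({"carboxyphosphate": 1, "NH3": 1}), Counter({"carbamate": 1, "Pi": 1})),
    # Large subunit, C-terminal domain: carbamate is phosphorylated.
    (Counter({"carbamate": 1, "ATP": 1}), Counter({"carbamoyl phosphate": 1, "ADP": 1})),
]


def net_reaction(steps):
    """Sum the partial reactions and cancel species that are made and then used."""
    consumed, produced = Counter(), Counter()
    for substrates, products in steps:
        consumed.update(substrates)
        produced.update(products)
    # Counter subtraction keeps only positive counts, so intermediates drop out.
    return dict(consumed - produced), dict(produced - consumed)
```

Calling `net_reaction(STEPS)` leaves glutamine, water, bicarbonate and two ATP on the substrate side, and glutamate, two ADP, Pi and carbamoyl phosphate on the product side. Ammonia, carboxyphosphate and carbamate cancel, consistent with their role as tunnelled intermediates.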
Eukaryotes have two distinct forms of CPSase: a mitochondrial enzyme (CPSase I) that participates in both arginine biosynthesis and the urea cycle; and a cytosolic enzyme (CPSase II) involved in pyrimidine biosynthesis. CPSase II occurs as part of a multi-enzyme complex along with aspartate transcarbamoylase and dihydroorotase; this complex is referred to as the CAD protein. The hepatic expression of CPSase is transcriptionally regulated by glucocorticoids and/or cAMP. There is a third form of the enzyme, CPSase III, found in fish, which uses glutamine as a nitrogen source instead of ammonia. CPSase III is closely related to CPSase I, and is composed of a single polypeptide that may have arisen from gene fusion of the glutaminase and synthetase domains.
This entry represents the large subunit of carbamoyl phosphate synthase.
This entry represents the domain responsible for GATase (glutamine amidotransferase) activity in CPSases, which catalyses the hydrolysis of glutamine to glutamate and ammonia. This reaction occurs at the active site on the small subunit of CPSases. This function has been detected in some other enzymes, including aminodeoxychorismate synthase and anthranilate synthase component II, all of which show sequence similarity in the area thought to contain the GATase activity. The active site contains a conserved Cys residue, which is necessary for catalytic activity, and several conserved residues in the areas surrounding this Cys have also been found to be important.
DNA is the biological information that instructs cells how to exist in an ordered fashion: accurate replication is thus one of the most important events in the life cycle of a cell. This function is performed by DNA-directed DNA polymerases, which add deoxynucleotide triphosphate (dNTP) residues to the 3'-end of the growing chain of DNA, using a complementary DNA chain as a template. Small RNA molecules are generally used as primers for chain elongation, although terminal proteins may also be used for the de novo synthesis of a DNA chain. Even though there are 2 different methods of priming, these are mediated by 2 very similar polymerase classes, A and B, with similar methods of chain elongation.
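Template-directed chain elongation from a primer can be sketched in a few lines. This is a toy model, not a description of any particular polymerase: the function name is hypothetical, the primer is written with DNA letters for simplicity even though primers are usually RNA, and each new residue is appended at the 3' end of the growing strand, which is where polymerases attach incoming nucleotides.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend_primer(template_3to5, primer_5to3):
    """Extend a primer along a template by adding complementary bases.

    template_3to5: template strand written 3'->5', so that position i pairs
                   with position i of the new strand written 5'->3'.
    primer_5to3:   primer annealed to the start of the template.
    """
    if len(primer_5to3) > len(template_3to5):
        raise ValueError("primer is longer than the template")
    for template_base, primer_base in zip(template_3to5, primer_5to3):
        if COMPLEMENT[template_base] != primer_base:
            raise ValueError("primer does not pair with the template")
    strand = list(primer_5to3)
    for template_base in template_3to5[len(primer_5to3):]:
        strand.append(COMPLEMENT[template_base])  # new residue joins the 3' end
    return "".join(strand)
```

For example, `extend_primer("TACGGT", "AT")` returns `"ATGCCA"`: the primer `AT` pairs with the first two template bases and the remaining four are filled in by complementarity.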
A number of DNA polymerases have been grouped under the designation of DNA polymerase family B. Six regions of similarity (numbered from I to VI) are found in all or a subset of the B family polymerases. The most conserved region (I) includes a conserved tetrapeptide with two aspartate residues. Its function is not yet known, although it has been suggested that it may be involved in binding a magnesium ion. All sequences in the B family contain a characteristic DTDS motif, and possess many functional domains, including a 5'-3' elongation domain, a 3'-5' exonuclease domain, a DNA binding domain, and binding domains for both dNTPs and pyrophosphate.
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.
Tyrosine phosphorylating activity was originally detected in two viral transforming proteins, but many retroviral transforming proteins and their cellular counterparts have since been shown to possess such activity. The growth factor receptors, which are activated by ligand binding, and the insulin-related peptide receptor, are also family members.
Protein phosphorylation plays a central role in the regulation of cell functions, causing the activation or inhibition of many enzymes involved in various biochemical pathways. Kinases and phosphatases are the enzymes responsible for this, and may themselves be subject to control through the action of hormones and growth factors. Serine/threonine (S/T) phosphatases catalyse the dephosphorylation of phosphoserine and phosphothreonine residues. In mammalian tissues four different types of protein phosphatase (PP) have been identified and are known as PP1, PP2A, PP2B and PP2C. Except for PP2C, these enzymes are evolutionarily related. The catalytic regions of the proteins are well conserved and have a slow mutation rate, suggesting that major changes in these regions are highly detrimental.
Protein phosphatase-1 (PP1) and protein phosphatase-2A (PP2A) have a broad specificity and there are two closely related isoforms of each, alpha and beta. PP2A is a trimeric enzyme that consists of a core composed of a catalytic subunit associated with a 65 kDa regulatory subunit and a third variable subunit. Protein phosphatase-2B (PP2B or calcineurin), a calcium-dependent enzyme whose activity is stimulated by calmodulin, is composed of two subunits: the catalytic A-subunit and the calcium-binding B-subunit. The specificity of PP2B is restricted. Other serine/threonine specific protein phosphatases that have been characterised include mammalian phosphatase-X (PP-X) and Drosophila phosphatase-V (PP-V), which are closely related to, yet distinct from, PP2A; yeast phosphatase PPH3, which is similar to PP2A, but with different enzymatic properties; and Drosophila phosphatase-Y (PP-Y) and yeast phosphatases Z1 and Z2, which are closely related to, yet distinct from, PP1.
L-Arginine is converted to nitric oxide and citrulline by the enzyme nitric oxide synthase, and to ornithine and urea by the enzyme arginase as part of the hepatic urea cycle. Arginase is a manganese metalloenzyme containing a metal-activated hydroxide ion, a critical nucleophile in metalloenzymes that catalyse hydrolysis or hydration reactions. A hydrogen bond formed by the metal-bound hydroxide holds the enzyme in the proper orientation for catalysis; however, non-metal substrate-binding sites are also implicated in the enzyme mechanism. Regeneration of the metal-bound hydroxide ion from a metal-bound water molecule requires proton transfer to bulk solvent, mediated by a histidine proton shuttle residue.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
P-ATPases (sometimes known as E1-E2 ATPases) are found in bacteria and in a number of eukaryotic plasma membranes and organelles. P-ATPases function to transport a variety of different compounds, including ions and phospholipids, across a membrane using ATP hydrolysis for energy. There are many different classes of P-ATPases, each of which transports a specific type of ion: H+, Na+, K+, Mg2+, Ca2+, Ag+ and Ag2+, Zn2+, Co2+, Pb2+, Ni2+, Cd2+, Cu+ and Cu2+. P-ATPases can be composed of one or two polypeptides, and can usually assume two main conformations called E1 and E2.
This entry represents several classes of P-type ATPases, including those that transport K+, Mg2+, Cd2+, Cu2+, Zn2+, Na+, Ca2+, Na+/K+, and H+/K+. These P-ATPases are found in both prokaryotes and eukaryotes.
More information about this protein can be found at Protein of the Month: ATP Synthases.
This entry represents the alpha subunit found in the P-type cation exchange ATPases found in the plasma membranes of both prokaryotes and eukaryotes. These P-ATPases include both H+/K+-ATPases and Na+/K+-ATPases, which belong to the IIC subfamily of ATPases. These ATPases catalyse the hydrolysis of ATP coupled with the exchange of cations, pumping one cation out of the cell (H+ or Na+) in exchange for K+. These ATPases contain an alpha subunit that is the catalytic component, and a regulatory beta subunit that stabilizes the alpha/beta assembly. Different alpha and beta isoforms exist, permitting greater regulatory control.
An example of a H+/K+-ATPase is the gastric pump responsible for acid secretion in the stomach, transporting protons from the cytoplasm of parietal cells to create a large pH gradient in exchange for the internalization of potassium ions, using ATP hydrolysis to drive the pump.
V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins.
This entry represents the 16 kDa proteolipid subunit c that is part of the V0 complex of V-ATPase in eukaryotic organelles and in certain bacteria. There are three proteolipid subunits (c, c' and c'') that form part of the proton-conducting pore, each containing a buried glutamic acid residue that is essential for proton transport, and together they form a hexameric ring spanning the membrane.
F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.
This family represents subunits called delta in bacterial and chloroplast ATPase, or OSCP (oligomycin sensitivity conferral protein) in mitochondrial ATPase (note that in mitochondria there is a different delta subunit). The OSCP/delta subunit appears to be part of the peripheral stalk that holds the F1 complex alpha3beta3 catalytic core stationary against the torque of the rotating central stalk, and links subunit A of the F0 complex with the F1 complex. In mitochondria, the peripheral stalk consists of OSCP, as well as F0 components F6, B and D. In bacteria and chloroplasts the peripheral stalks have different subunit compositions: delta and two copies of F0 component B (bacteria), or delta and F0 components B and B' (chloroplasts).
The ATPase F1 complex gamma subunit forms the central shaft that connects the F0 rotary motor to the F1 catalytic core. The gamma subunit functions as a rotary motor inside the cylinder formed by the alpha(3)beta(3) subunits in the F1 complex. The best-conserved region of the gamma subunit is its C-terminus, which seems to be essential for assembly and catalysis.
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
In the MEROPS database, peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.
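The relationship between linear triad order and clan can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a MEROPS tool: the function names are hypothetical, the clan table contains only the three orderings named in the text, and the residue numbers in the usage line are illustrative inputs.

```python
# Triad orderings and clan labels as given in the text; everything else is hypothetical.
TRIAD_ORDER_TO_CLAN = {
    "HDS": "PA",  # chymotrypsin clan
    "DHS": "SB",  # subtilisin clan
    "SDH": "SC",  # carboxypeptidase clan
}


def triad_order(his: int, asp: int, ser: int) -> str:
    """Return the catalytic residues as one-letter codes, sorted by sequence position."""
    residues = sorted([(his, "H"), (asp, "D"), (ser, "S")])
    return "".join(code for _, code in residues)


def clan_for_triad(his: int, asp: int, ser: int) -> str:
    """Look up the clan whose linear triad order matches, or 'unknown' if none does."""
    return TRIAD_ORDER_TO_CLAN.get(triad_order(his, asp, ser), "unknown")


# Illustrative input: histidine at 57, aspartate at 102, serine at 195.
print(triad_order(57, 102, 195), clan_for_triad(57, 102, 195))
```

A matching order is only a hint at clan membership. The text notes that the arrangement "commonly" reflects clan relationships, so a real assignment would also need structural evidence.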
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of serine peptidases belongs to the MEROPS peptidase family S14 (ClpP endopeptidase family, clan SK). ClpP is an ATP-dependent protease that cleaves a number of proteins, such as casein and albumin. It exists as a heterodimer of ATP-binding regulatory A and catalytic P subunits, both of which are required for effective levels of protease activity in the presence of ATP, although the P subunit alone does possess some catalytic activity. This family of sequences represents the P subunit.
Proteases highly similar to ClpP have been found to be encoded in the genome of bacteria, metazoa, some viruses and in the chloroplast of plants. A number of the proteins in this family are classified as non-peptidase homologues as they have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for catalytic activity.
Threonine peptidases are characterised by a threonine nucleophile at the N terminus of the mature enzyme. The threonine peptidases belong to clan PB or are unassigned, clan T-. The type example for this clan is the archaean proteasome beta component of Thermoplasma acidophilum.
This family of threonine peptidases belongs to MEROPS peptidase family T1 (clan PB(T)), subfamily T1A.
The proteasome (or macropain) is a eukaryotic and archaeal multicatalytic proteinase complex that seems to be involved in an ATP/ubiquitin-dependent nonlysosomal proteolytic pathway. In eukaryotes the proteasome is composed of about 28 distinct subunits which form a highly ordered ring-shaped structure (20S ring) of about 700 kDa. Most proteasome subunits can be classified, on the basis of sequence similarities, into two groups, alpha (A) and beta (B). This family contains the beta subunit sequences, which range from 190 to 290 amino acids.
Citrate synthase is a member of a small family of enzymes that can directly form a carbon-carbon bond without the presence of metal ion cofactors. It catalyses the first reaction in the Krebs' cycle, namely the conversion of oxaloacetate and acetyl-coenzyme A into citrate and coenzyme A. This reaction is important for energy generation and for carbon assimilation. The reaction proceeds via a non-covalently bound citryl-coenzyme A intermediate in a 2-step process (aldol-Claisen condensation followed by the hydrolysis of citryl-CoA).
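The figures quoted for the proteasome can be checked against each other with simple arithmetic. The sketch below assumes an average residue mass of roughly 110 Da, a common rule of thumb that does not come from the text; the function names are hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110  # assumption: rough average mass of one amino acid residue


def mean_subunit_mass_kda(complex_mass_kda: float, n_subunits: int) -> float:
    """Average mass per subunit if the complex mass is shared equally."""
    return complex_mass_kda / n_subunits


def chain_mass_kda(n_residues: int, residue_mass_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Approximate polypeptide mass in kDa from its length in residues."""
    return n_residues * residue_mass_da / 1000


# Values from the text: about 700 kDa, about 28 subunits, 190 to 290 residues.
mean_mass = mean_subunit_mass_kda(700, 28)
low, high = chain_mass_kda(190), chain_mass_kda(290)
print(f"mean subunit mass: {mean_mass:.1f} kDa")
print(f"beta subunit range: {low:.1f} to {high:.1f} kDa")
```

Under this assumption the mean subunit mass falls inside the range estimated from the beta subunit lengths, so the quoted complex size, subunit count and sequence lengths are mutually consistent.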
Citrate synthase enzymes are found in two distinct structural types: type I enzymes (found in eukaryotes, Gram-positive bacteria and archaea) form homodimers and have shorter sequences than type II enzymes, which are found in Gram-negative bacteria and are hexameric in structure. In both types, the monomer is composed of two domains: a large alpha-helical domain consisting of two structural repeats, where the second repeat is interrupted by a small alpha-helical domain. The cleft between these domains forms the active site, where both citrate and acetyl-coenzyme A bind. The enzyme undergoes a conformational change upon binding of the oxaloacetate ligand, whereby the active site cleft closes over in order to form the acetyl-CoA binding site. The energy required for domain closure comes from the interaction of the enzyme with the substrate. Type II enzymes possess an extra N-terminal beta-sheet domain, and some type II enzymes are allosterically inhibited by NADH.
This entry represents types I and II citrate synthase enzymes, as well as the related enzymes 2-methylcitrate synthase and ATP citrate synthase. 2-methylcitrate synthase catalyses the conversion of oxaloacetate and propanoyl-CoA into (2R,3S)-2-hydroxybutane-1,2,3-tricarboxylate and coenzyme A. This enzyme is induced during bacterial growth on propionate, while type II hexameric citrate synthase is constitutive. ATP citrate synthase (also known as ATP citrate lyase) catalyses the MgATP-dependent, CoA-dependent cleavage of citrate into oxaloacetate and acetyl-CoA, a key step in the reductive tricarboxylic acid pathway of CO2 assimilation used by a variety of autotrophic bacteria and archaea to fix carbon dioxide. ATP citrate synthase is composed of two distinct subunits. In eukaryotes, ATP citrate synthase is a homotetramer of a single large polypeptide, and is used to produce cytosolic acetyl-CoA from mitochondrially produced citrate.
Enolase (2-phospho-D-glycerate hydrolase) is an essential glycolytic enzyme that catalyses the interconversion of 2-phosphoglycerate and phosphoenolpyruvate. In vertebrates, there are 3 different, tissue-specific isoenzymes, designated alpha, beta and gamma. Alpha is present in most tissues, beta is localised in muscle tissue, and gamma is found only in nervous tissue. The functional enzyme exists as a dimer of any 2 isoforms. In immature organs and in adult liver, it is usually an alpha homodimer; in adult skeletal muscle, a beta homodimer; and in adult neurons, a gamma homodimer. In developing muscle, it is usually an alpha/beta heterodimer, and in the developing nervous system, an alpha/gamma heterodimer. The tissue-specific forms display minor kinetic differences. Tau-crystallin, one of the major lens proteins in some fish, reptiles and birds, has been shown to be evolutionarily related to enolase.
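Because the functional enzyme is described as a dimer of any 2 isoforms, the possible pairings can be enumerated directly. The following is a minimal sketch; the function name is hypothetical, and it lists combinatorial possibilities only, without implying that every pairing is observed in tissue.

```python
from itertools import combinations_with_replacement

ISOFORMS = ("alpha", "beta", "gamma")


def possible_dimers(isoforms=ISOFORMS):
    """List every unordered pair of isoforms, homodimers included."""
    return list(combinations_with_replacement(isoforms, 2))


for first, second in possible_dimers():
    kind = "homodimer" if first == second else "heterodimer"
    print(f"{first}/{second}: {kind}")
```

Three isoforms give six unordered pairs: three homodimers and three heterodimers. The text names five of them in specific tissues; the beta/gamma pairing follows only from the general "any 2 isoforms" statement.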
Neuron-specific enolase is released in a variety of neurological diseases, such as multiple sclerosis and after seizures or acute stroke. Several tumour cells have also been found positive for neuron-specific enolase. Beta-enolase deficiency is associated with glycogenosis type XIII defect.
Phosphoenolpyruvate carboxylase (PEPCase), an enzyme found in all multicellular plants, catalyses the formation of oxaloacetate from phosphoenolpyruvate (PEP) and a hydrocarbonate ion. This reaction is harnessed by C4 plants to capture and concentrate carbon dioxide into the photosynthetic bundle sheath cells. It also plays a key role in the nitrogen fixation pathway in legume root nodules: here it functions in concert with glutamine, glutamate and asparagine synthetases and aspartate amido transferase, to synthesise aspartate and asparagine, the major nitrogen transport compounds in various amine-transporting plant species.
PEPCase also plays an anaplerotic role in bacteria and plant cells, supplying oxaloacetate to the TCA cycle, which requires continuous input of C4 molecules in order to replenish the intermediates removed for amino acid biosynthesis. The C-terminus of the enzyme contains the active site that includes a conserved lysine residue, involved in substrate binding, and other conserved residues important for the catalytic mechanism.
Cyclophilin is the major high-affinity binding protein in vertebrates for the immunosuppressive drug cyclosporin A (CSA), but is also found in other organisms. It exhibits a peptidyl-prolyl cis-trans isomerase activity (PPIase or rotamase). PPIase is an enzyme that accelerates protein folding by catalysing the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides. It is probable that CSA mediates some of its effects by forming a tight complex with cyclophilin that inhibits the phosphatase activity of calcineurin. Cyclophilin A is a cytosolic and highly abundant protein. The protein belongs to a family of isozymes, including cyclophilins B and C, and natural killer cell cyclophilin-related protein. Major isoforms have been found throughout the cell, including the ER, and some are even secreted. The sequences of the different forms of cyclophilin-type PPIases are well conserved.
Actin is a ubiquitous protein involved in the formation of filaments that are major components of the cytoskeleton. These filaments interact with myosin to produce a sliding effect, which is the basis of muscular contraction and many aspects of cell motility, including cytokinesis. Each actin protomer binds one molecule of ATP and has one high affinity site for either calcium or magnesium ions, as well as several low affinity sites. Actin exists as a monomer in low salt concentrations, but filaments form rapidly as salt concentration rises, with the consequent hydrolysis of ATP. Actin from many sources forms a tight complex with deoxyribonuclease (DNase I) although the significance of this is still unknown. The formation of this complex results in the inhibition of DNase I activity, and actin loses its ability to polymerise. It has been shown that an ATPase domain of actin shares similarity with ATPase domains of hexokinase and hsp70 proteins.
In vertebrates there are three groups of actin isoforms: alpha, beta and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. In plants there are many isoforms which are probably involved in a variety of functions such as cytoplasmic streaming, cell shape determination, tip growth, graviperception and cell wall deposition.
Recently some divergent actin-like proteins have been identified in several species. These proteins include centractin (actin-RPV) from mammals, fungi (yeast ACT5, Neurospora crassa ro-4) and Pneumocystis carinii, which seems to be a component of a multi-subunit centrosomal complex involved in microtubule-based vesicle motility (this subfamily is known as ARP1); the ARP2 subfamily, which includes chicken ACTL, Saccharomyces cerevisiae ACT2, Drosophila melanogaster 14D and Caenorhabditis elegans actC; the ARP3 subfamily, which includes actin 2 from mammals, Drosophila 66B, yeast ACT4 and Schizosaccharomyces pombe act2; and the ARP4 subfamily, which includes yeast ACT3 and Drosophila 13E.
The actin filament system, a prominent part of the cytoskeleton in eukaryotic cells, is both a static structure and a dynamic network that can undergo rearrangements: it is thought to be involved in processes such as cell movement and phagocytosis, as well as muscle contraction.
The F-actin capping protein binds in a calcium-independent manner to the fast growing ends of actin filaments (barbed end) thereby blocking the exchange of subunits at these ends. Unlike gelsolin and severin this protein does not sever actin filaments. The F-actin capping protein is a heterodimer composed of two unrelated subunits: alpha and beta. Neither of the subunits shows sequence similarity to other filament-capping proteins.
The beta subunit is a protein of about 280 amino acid residues whose sequence is well conserved in eukaryotic species.
Muscle contraction is caused by sliding between the thick and thin filaments of the myofibril. Myosin is a major component of thick filaments and exists as a hexamer of 2 heavy chains, 2 alkali light chains, and 2 regulatory light chains. The heavy chain can be subdivided into the N-terminal globular head and the C-terminal coiled-coil rod-like tail, although some forms have a globular region in their C-terminal. There are many cell-specific isoforms of myosin heavy chains, coded for by a multi-gene family. Myosin interacts with actin to convert chemical energy, in the form of ATP, to mechanical energy. The 3-D structure of the head portion of myosin has been determined and a model for actin-myosin complex has been constructed.
The globular head is well conserved, with some highly conserved regions possibly relating to functional and structural domains. The rod-like tail starts with an invariant proline residue, and contains many repeats of a 28-residue region, interrupted at 4 regularly spaced points known as skip residues. Although the sequence of the tail is not well conserved, its chemical character is, with hydrophobic, charged and skip residues occurring in a highly ordered and repeated fashion.
Membrane transport between compartments in eukaryotic cells requires proteins that allow the budding and scission of nascent cargo vesicles from one compartment and their targeting and fusion with another. Dynamins are large GTPases that belong to a protein superfamily that, in eukaryotic cells, includes classical dynamins, dynamin-like proteins, OPA1, Mx proteins, mitofusins and guanylate-binding proteins/atlastins, and are involved in the scission of a wide range of vesicles and organelles. They play a role in many processes including budding of transport vesicles, division of organelles, cytokinesis and pathogen resistance.
The minimal distinguishing architectural features that are common to all dynamins and are distinct from other GTPases are the structure of the large GTPase domain (300 amino acids) and the presence of two additional domains; the middle domain and the GTPase effector domain (GED), which are involved in oligomerization and regulation of the GTPase activity.
This entry represents the GTPase domain, containing the GTP-binding motifs that are needed for guanine-nucleotide binding and hydrolysis. The conservation of these motifs is absolute except for the final motif in guanylate-binding proteins. The GTPase catalytic activity can be stimulated by oligomerisation of the protein, which is mediated by interactions between the GTPase domain, the middle domain and the GED.
Synaptobrevin is an intrinsic membrane protein of small synaptic vesicles, specialised secretory organelles of neurons that actively accumulate neurotransmitters and participate in their calcium-dependent release by exocytosis. Vesicle function is mediated by proteins in their membranes, although the precise nature of the protein-protein interactions underlying this is still uncertain. Synaptobrevin may play a role in the molecular events underlying neurotransmitter release and vesicle recycling and may be involved in the regulation of membrane flow in the nerve terminal, a process mediated by interaction with low molecular weight GTP-binding proteins. Synaptic vesicle-associated membrane proteins (VAMPs) from Torpedo californica (Pacific electric ray) and SNC1 from yeast are related to synaptobrevin.
The chaperonins are 'helper' molecules required for correct folding and subsequent assembly of some proteins. These are required for normal cell growth, and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions. Type I chaperonins present in eubacteria, mitochondria and chloroplasts require the concerted action of 2 proteins, chaperonin 60 (cpn60) and chaperonin 10 (cpn10).
The 10 kDa chaperonin (cpn10 - or groES in bacteria) exists as a ring-shaped oligomer of between six and eight identical subunits, while the 60 kDa chaperonin (cpn60 - or groEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The central cavity of the cylindrical cpn60 tetradecamer provides an isolated environment for protein folding, whilst cpn10 binds to cpn60 and synchronizes the release of the folded protein in an Mg2+-ATP-dependent manner. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60.
Escherichia coli GroES has also been shown to bind ATP cooperatively, and with an affinity comparable to that of GroEL. Each GroEL subunit contains three structurally distinct domains: an apical, an intermediate and an equatorial domain. The apical domain contains the binding sites for both GroES and the unfolded protein substrate. The equatorial domain contains the ATP-binding site and most of the oligomeric contacts. The intermediate domain links the apical and equatorial domains and transfers allosteric information between them. The GroEL oligomer is a tetradecamer, cylindrically shaped, that is organised in two heptameric rings stacked back to back. Each GroEL ring contains a central cavity, known as the 'Anfinsen cage', that provides an isolated environment for protein folding. The identical 10 kDa subunits of GroES form a dome-like heptameric oligomer in solution. ATP binding to GroES may be important in charging the seven subunits of the interacting GroEL ring with ATP, to facilitate cooperative ATP binding and hydrolysis for substrate protein release.
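The oligomer stoichiometry described above translates into approximate complex sizes. This is a rough sketch that assumes the nominal subunit masses implied by the names (60 kDa and 10 kDa); actual subunit masses differ somewhat, and the function name is hypothetical.

```python
def oligomer_mass_kda(rings: int, subunits_per_ring: int, subunit_kda: float) -> float:
    """Nominal mass of a ring-shaped oligomer built from identical subunits."""
    return rings * subunits_per_ring * subunit_kda


# GroEL: two heptameric rings of nominal 60 kDa subunits (a tetradecamer).
groel_subunits = 2 * 7
groel_mass = oligomer_mass_kda(rings=2, subunits_per_ring=7, subunit_kda=60)

# GroES: a single heptameric dome of nominal 10 kDa subunits.
groes_mass = oligomer_mass_kda(rings=1, subunits_per_ring=7, subunit_kda=10)

print(f"GroEL: {groel_subunits} subunits, about {groel_mass:.0f} kDa")
print(f"GroES: 7 subunits, about {groes_mass:.0f} kDa")
```

On these nominal figures the GroEL tetradecamer is roughly twelve times the mass of the GroES heptamer that caps it.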
The assembly of proteins has been thought to be the sole result of properties inherent in the primary sequence of polypeptides themselves. In some cases, however, structural information from other protein molecules is required for correct folding and subsequent assembly into oligomers. These 'helper' molecules are referred to as molecular chaperones, a subfamily of which are the chaperonins. They are required for normal cell growth (as demonstrated by the fact that no temperature sensitive mutants for the chaperonin genes can be found in the temperature range 20 to 43 degrees centigrade), and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions. Type I chaperonins present in eubacteria, mitochondria and chloroplasts require the concerted action of 2 proteins, chaperonin 60 (cpn60) and chaperonin 10 (cpn10). Type II chaperonins, found in eukaryotic cytosol and in Archaebacteria, comprise only a cpn60 member.
The 10 kDa chaperonin (cpn10 - or groES in bacteria) exists as a ring-shaped oligomer of between 6 and 8 identical subunits, whereas the 60 kDa chaperonin (cpn60 - or groEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The central cavity of the cylindrical cpn60 tetradecamer provides an isolated environment for protein folding, whilst cpn10 binds to cpn60 and synchronizes the release of the folded protein in an Mg2+-ATP-dependent manner. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60.
The 60 kDa form of chaperonin is the immunodominant antigen of patients with Legionnaires' disease, and is thought to play a role in the protection of the Legionella spp. bacteria from oxygen radicals within macrophages. This hypothesis is based on the finding that the cpn60 gene is upregulated in response to hydrogen peroxide, a source of oxygen radicals. Cpn60 has also been found to display strong antigenicity in many bacterial species, and has the potential for inducing immune protection against unrelated bacterial infections. The RuBisCO subunit binding protein (which has been implicated in the assembly of RuBisCO) and cpn60 have been found to be evolutionary homologues, the RuBisCO subunit binding protein having the C-terminal Gly-Gly-Met repeat found in all bacterial cpn60 sequences. Although the precise function of this repeat is unknown, it is thought to be important as it is also found in 70 kDa heat-shock proteins. The crystal structure of Escherichia coli GroEL has been resolved to 2.8 Å.
A group of ATP-binding proteins that includes the regulatory subunit of the ATP-dependent protease clpA; heat shock proteins clpB, 104 and 78; and chloroplast proteins CD4a (ClpC) and CD4b belong to this family. The proteins are thought to protect cells from stress by controlling the aggregation and denaturation of vital cellular structures. They vary in size, but share a domain which contains an ATP-binding site.
These signatures, which span the ATP-binding region, also identify the bacterial DNA polymerase III subunit tau, ATP-dependent protease La and the mitochondrial lon protease homolog, the latter two of which belong to MEROPS peptidase family S16.
Little is known of the function of hsp70 proteins. Some evidence suggests that the constitutive members have a role in the disassembly of clathrin cages, and may also participate in the post-translational transmembrane targeting of proteins to cellular organelles. No specific activities or associations have been found for the inducible members, although it has been suggested that they may accept incoming precursor proteins, keep them unfolded, then pass them on to the hsp60/hsp10 (cpn60/cpn10) complex for folding and assembly.
Protein folding is thought to be the sole result of properties inherent in polypeptide primary sequences. Sometimes, however, additional proteins are required to mediate correct folding and subsequent oligomer assembly. These 'helpers', or chaperones, bind to specific protein surfaces, preventing incorrect folding and formation of non-functional structures.
The tailless complex polypeptide 1 (TCP-1) is a highly structurally conserved molecular chaperone located in the cytosol. The protein has also been shown to bind to Golgi membranes and to microtubules, this latter property suggesting a role in mitotic spindle formation in dividing cells (especially in sperm, where it is highly abundant). TCP-1 forms a double ring structure, similar to the 10kDa and 60kDa chaperonins, with 6-8 subunits per ring. The amino acid sequence is significantly similar to the 60kDa chaperonin, and to TF55, a chaperone from the archaebacterium Sulfolobus shibatae.
The 14-3-3 proteins are a large family of approximately 30kDa acidic proteins which exist primarily as homo- and heterodimers within all eukaryotic cells. There is a high degree of sequence identity and conservation between all the 14-3-3 isotypes, particularly in the regions which form the dimer interface or line the central ligand binding channel of the dimeric molecule. Each 14-3-3 protein sequence can be roughly divided into three sections: a divergent amino terminus, the conserved core region and a divergent carboxyl terminus. The conserved middle core region of the 14-3-3s encodes an amphipathic groove that forms the main functional domain, a cradle for interacting with client proteins. The monomer consists of nine helices organised in an antiparallel manner, forming an L-shaped structure. The interior of the L-structure is composed of four helices: H3 and H5, which contain many charged and polar amino acids, and H7 and H9, which contain hydrophobic amino acids. These four helices form the concave amphipathic groove that interacts with target peptides.
14-3-3 proteins mainly bind proteins containing phosphothreonine or phosphoserine motifs; however, exceptions to this rule do exist. Extensive investigation of the 14-3-3 binding site of the mammalian serine/threonine kinase Raf-1 has produced a consensus sequence for 14-3-3-binding, RSxpSxP (in the single-letter amino-acid code, where x denotes any amino acid and p indicates that the next residue is phosphorylated). 14-3-3 proteins appear to affect intracellular signalling in one of three ways: by direct regulation of the catalytic activity of the bound protein; by regulating interactions between the bound protein and other molecules in the cell by sequestration or modification; or by controlling the subcellular localisation of the bound ligand. Proteins appear to initially bind to a single dominant site and then subsequently to many, much weaker secondary interaction sites. The 14-3-3 dimer is capable of changing the conformation of its bound ligand whilst itself undergoing minimal structural alteration.
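The RSxpSxP consensus above can be turned into a simple sequence scan. The sketch below is illustrative only: a plain amino-acid sequence carries no phosphorylation information, so the pattern R-S-x-S-x-P can only flag candidate sites where the second serine would need to be phosphorylated. The function name and the toy peptide are hypothetical.

```python
import re

# R-S-x-S-x-P, wrapped in a lookahead so overlapping candidates are reported.
CANDIDATE_SITE = re.compile(r"(?=(RS.S.P))")


def candidate_1433_sites(sequence):
    """Return (1-based position of the would-be phosphoserine, peptide) pairs."""
    hits = []
    for match in CANDIDATE_SITE.finditer(sequence.upper()):
        # The serine marked 'pS' is the fourth residue of the six-residue motif.
        hits.append((match.start() + 4, match.group(1)))
    return hits


# Hypothetical toy peptide, used only to show the output shape.
print(candidate_1433_sites("MKRSTSTPLL"))
```

Because exceptions to the consensus exist, an empty result does not rule out 14-3-3 binding, and a hit does not establish it.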
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.
AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.
This entry represents the mu subunit of various clathrin adaptors (AP1, AP2 and AP3). The mu subunit regulates the coupling of clathrin lattices with particular membrane proteins by self-phosphorylation via a mechanism that is still unclear. The mu subunit possesses a highly conserved N-terminal domain of around 230 amino acids, which may be the region of interaction with other AP proteins; a linker region of between 10 and 42 amino acids; and a less well-conserved C-terminal domain of around 190 amino acids, which may be the site of specific interaction with the protein being transported in the vesicle.
More information about these proteins can be found at Protein of the Month: Clathrin.
In both prokaryotes and eukaryotes, there are three distinct types of elongation factors: EF-1alpha (EF-Tu), which binds GTP and an aminoacyl-tRNA and delivers the latter to the A site of ribosomes; EF-1beta (EF-Ts), which interacts with EF-1alpha/EF-Tu to displace GDP and thus allows the regeneration of GTP-EF-1alpha; and EF-2 (EF-G), which binds GTP and peptidyl-tRNA and translocates the latter from the A site to the P site. In EF-1alpha, a specific region has been shown to be involved in a conformational change mediated by the hydrolysis of GTP to GDP. This region is conserved in both EF-1alpha/EF-Tu and EF-2/EF-G and thus seems typical for GTP-dependent proteins which bind non-initiator tRNAs to the ribosome. The GTP-binding protein synthesis factor family also includes the eukaryotic peptide chain release factor GTP-binding subunits and prokaryotic peptide chain release factor 3 (RF-3); the prokaryotic GTP-binding protein lepA and its homolog in yeast (GUF1) and Caenorhabditis elegans (ZK1236.1); yeast HBS1; rat statin S1; and the prokaryotic selenocysteine-specific elongation factor selB.
Several proteins have recently been shown to contain the 5 structural motifs characteristic of GTP-binding proteins. These include murine DRG protein; GTP1 protein from Schizosaccharomyces pombe; OBG protein from Bacillus subtilis; and several others. Although the proteins contain GTP-binding motifs and are similar to each other, they do not share sequence similarity to other GTP-binding proteins, and have thus been classed as a novel group, the GTP1/OBG family. As yet, the functions of these proteins are uncertain, but they have been shown to be important in development and normal cell metabolism.
The small ADP ribosylation factor (Arf) GTP-binding proteins are major regulators of vesicle biogenesis in intracellular traffic. They are the founding members of a growing family that includes Arl (Arf-like), Arp (Arf-related proteins) and the remotely related Sar (Secretion-associated and Ras-related) proteins. Arf proteins cycle between inactive GDP-bound and active GTP-bound forms that bind selectively to effectors. The classical structural GDP/GTP switch is characterised by conformational changes at the so-called switch 1 and switch 2 regions, which bind tightly to the gamma-phosphate of GTP but poorly or not at all to the GDP nucleotide. Structural studies of Arf1 and Arf6 have revealed that although these proteins feature the switch 1 and 2 conformational changes, they depart from other small GTP-binding proteins in that they use an additional, unique switch to propagate structural information from one side of the protein to the other.
The GDP/GTP structural cycles of human Arf1 and Arf6 feature a unique conformational change that affects the beta2-beta3 strands connecting switch 1 and switch 2 (interswitch) and also the amphipathic helical N-terminus. In GDP-bound Arf1 and Arf6, the interswitch is retracted and forms a pocket to which the N-terminal helix binds, the latter serving as a molecular hasp to maintain the inactive conformation. In the GTP-bound form of these proteins, the interswitch undergoes a two-residue register shift that pulls switch 1 and switch 2 'up', restoring an active conformation that can bind GTP. In this conformation, the interswitch projects out of the protein and extrudes the N-terminal hasp by occluding its binding pocket.
Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), and a ubiquitin ligase (E3), which work sequentially in a cascade. There are many different E3 ligases, which are responsible for the type of ubiquitin chain formed, the specificity of the target protein, and the regulation of the ubiquitinylation process. Ubiquitinylation is an important regulatory tool that controls the concentration of key signalling proteins, such as those involved in cell cycle control, as well as removing misfolded, damaged or mutant proteins that could be harmful to the cell. Several ubiquitin-like molecules have been discovered, such as Ufm1, SUMO1, NEDD8, Rad23, Elongin B and Parkin, the latter being involved in Parkinson's disease.
Ubiquitin is a protein of 76 amino acid residues, found in all eukaryotic cells, whose sequence is extremely well conserved from protozoa to vertebrates. Ubiquitin acts through its post-translational attachment (ubiquitinylation) to other proteins, where these modifications alter the function, location or trafficking of the protein, or target it for destruction by the 26S proteasome. The terminal glycine in the C-terminal 4-residue tail of ubiquitin can form an isopeptide bond with a lysine residue in the target protein, or with a lysine in another ubiquitin molecule to form a ubiquitin chain that attaches itself to a target protein. Ubiquitin has seven lysine residues, any one of which can be used to link ubiquitin molecules together, resulting in different structures that alter the target protein in different ways. It appears that Lys(11)-, Lys(29)- and Lys(48)-linked poly-ubiquitin chains target the protein to the proteasome for degradation, while mono-ubiquitinylated and Lys(6)- or Lys(63)-linked poly-ubiquitin chains signal reversible modifications in protein activity, location or trafficking. For example, Lys(63)-linked poly-ubiquitinylation is known to be involved in DNA damage tolerance, inflammatory response, protein trafficking and signal transduction through kinase activation. In addition, the length of the ubiquitin chain alters the fate of the target protein. Regulatory proteins such as transcription factors and histones are frequent targets of ubiquitinylation.
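The relationship between lysine position and chain outcome described above can be laid out as a small lookup. This is a sketch under stated assumptions: the sequence below is the commonly cited human ubiquitin sequence, which is not given in the text and is included here as an assumption, and the outcome labels simply paraphrase the paragraph above.

```python
# Assumed human ubiquitin sequence (76 residues); not part of the original text.
UBIQUITIN = (
    "MQIFVKTLTG" "KTITLEVEPS" "DTIENVKAKI" "QDKEGIPPDQ"
    "QRLIFAGKQL" "EDGRTLSDYN" "IQKESTLHLV" "LRLRGG"
)

# Outcomes as summarised in the text; lysines it does not discuss are omitted.
LINKAGE_OUTCOME = {
    11: "proteasomal degradation",
    29: "proteasomal degradation",
    48: "proteasomal degradation",
    6: "reversible change in activity, location or trafficking",
    63: "reversible change in activity, location or trafficking",
}


def lysine_positions(sequence):
    """Return the 1-based positions of all lysine (K) residues."""
    return [i + 1 for i, residue in enumerate(sequence) if residue == "K"]


def linkage_outcome(position):
    """Look up the outcome the text associates with a given linkage lysine."""
    return LINKAGE_OUTCOME.get(position, "not described in the text")


for pos in lysine_positions(UBIQUITIN):
    print(f"Lys({pos}): {linkage_outcome(pos)}")
```

The lookup is deliberately coarse: as the text notes, chain length also influences the fate of the target protein, which a position-only table cannot capture.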
Despite functional similarities, oxidoreductases of this family show no sequence similarity with adrenodoxin reductases and flavoprotein pyridine nucleotide cytochrome reductases (FPNCR). Assuming that disulphide reductase activity emerged later, during divergent evolution, the family can be referred to as FAD-dependent pyridine nucleotide reductases, FADPNR.
To date, 3D structures of glutathione reductase, thioredoxin reductase, mercuric reductase, lipoamide dehydrogenase, trypanothione reductase and NADH peroxidase have been solved. The enzymes share similar tertiary structures based on a doubly-wound alpha/beta fold, but the relative orientations of their FAD- and NAD(P)H-binding domains may vary significantly. By contrast with the FPNCR family, the folds of the FAD- and NAD(P)H-binding domains are similar, suggesting that the domains evolved by gene duplication.
Flavoprotein pyridine nucleotide cytochrome reductases (FPNCR) catalyse the interchange of reducing equivalents between one-electron carriers and the two-electron-carrying nicotinamide dinucleotides. The enzymes include ferredoxin:NADP+ reductases (FNR), plant and fungal NAD(P)H:nitrate reductases, NADH:cytochrome b5 reductases, NADPH:P450 reductases, NADPH:sulphite reductases, nitric oxide synthases, phthalate dioxygenase reductase, and various other flavoproteins.
Despite functional similarities, FPNCRs show no sequence similarity to NADPH:adrenodoxin reductases, nor to bacterial ferredoxin:NAD+ reductases and their homologues. To date, 3D-structures of 4 members of the family have been solved: Spinacia oleracea (Spinach) ferredoxin:NADP+ reductase; Burkholderia cepacia (Pseudomonas cepacia) phthalate dioxygenase reductase; the flavoprotein domain of Zea mays (Maize) nitrate reductase; and Sus scrofa (Pig) NADH:cytochrome b5 reductase. In all of them, the FAD-binding domain (N-terminal) has the topology of an anti-parallel beta-barrel, while the NAD(P)-binding domain (C-terminal) has the topology of a classical pyridine dinucleotide-binding fold (i.e. a central parallel beta-sheet with 2 helices on each side). In spite of such structural similarities, the level of amino acid identity between family members is at or below the limit of significance (e.g., nitrate reductase is only 15% identical to FNR).
Kinesin is a microtubule-associated force-producing protein that may play a role in organelle transport. The kinesin motor activity is directed toward the microtubule's plus end. Kinesin is an oligomeric complex composed of two heavy chains and two light chains. The maintenance of the quaternary structure does not require interchain disulphide bonds.
The heavy chain is composed of three structural domains: a large globular N-terminal domain, which is responsible for the motor activity of kinesin (it is known to hydrolyse ATP, and to bind and move on microtubules); a central alpha-helical coiled-coil domain that mediates heavy chain dimerisation; and a small globular C-terminal domain, which interacts with other proteins (such as the kinesin light chains), vesicles and membranous organelles.
A number of proteins have been recently found that contain a domain similar to that of the kinesin 'motor' domain:
The kinesin motor domain is located in the N-terminal part of most of the above proteins, with the exception of KAR3, klpA, and ncd where it is located in the C-terminal section.
The kinesin motor domain contains about 330 amino acids. An ATP-binding motif of type A is found near position 80 to 90; the C-terminal half of the domain is involved in microtubule binding.
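A scan for the type A ATP-binding motif might look like the sketch below. It assumes that "motif of type A" refers to the P-loop (Walker A) consensus commonly written as [AG]-x(4)-G-K-[ST]; that consensus, the function name and the toy sequence are all assumptions and do not come from the text.

```python
import re

# Assumed P-loop / Walker A consensus: [AG]-x(4)-G-K-[ST].
P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def find_p_loops(sequence):
    """Return (1-based start, matched peptide) for each non-overlapping match."""
    return [
        (match.start() + 1, match.group())
        for match in P_LOOP.finditer(sequence.upper())
    ]


# Hypothetical toy sequence, not a real kinesin motor domain.
print(find_p_loops("MSDEGQTGSGKTFTMEG"))
```

In a real motor domain of about 330 residues, a hit starting somewhere around residues 80 to 90 would be consistent with the description above; the short pattern also matches many unrelated nucleotide-binding proteins, so a hit alone is weak evidence.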
The cyclic nucleotide phosphodiesterases (PDE) comprise a group of enzymes that degrade the phosphodiester bond in the second messenger molecules cAMP and cGMP. They are divided into 11 families. They regulate the localisation, duration and amplitude of cyclic nucleotide signalling within subcellular domains. PDEs are therefore important for signal transduction.
PDE enzymes are often targets for pharmacological inhibition due to their unique tissue distribution, structural properties, and functional properties. Inhibitors include: Roflumilast for chronic obstructive pulmonary disease and asthma, Sildenafil for erectile dysfunction and Cilostazol for peripheral arterial occlusive disease, amongst others.
Retinal 3',5'-cGMP phosphodiesterase is located in photoreceptor outer segments: it is light activated, playing a pivotal role in signal transduction. In rod cells, PDE is oligomeric, comprising an alpha-, a beta- and 2 gamma-subunits, while in cones, PDE is a homodimer of alpha chains, which are associated with several smaller subunits. Both rod and cone PDEs catalyse the hydrolysis of cAMP or cGMP to the corresponding nucleoside 5' monophosphates, both enzymes also binding cGMP with high affinity. The cGMP-binding sites are located in the N-terminal half of the protein sequence, while the catalytic core resides in the C-terminal portion.
Phosphoinositide-specific phospholipase C (PLC) plays an important role in signal transduction processes, mediating the cellular actions of a variety of hormones, neurotransmitters and growth factors. Upon agonist-dependent activation, PLC catalyses the hydrolysis of membrane phosphatidylinositol 4,5-bisphosphate (PIP2), generating the second messengers inositol 1,4,5-trisphosphate (IP3) and diacylglycerol (DAG). IP3 binds specific intracellular receptors to trigger Ca2+ mobilisation, while DAG mediates activation of a family of protein kinase C isozymes. This catalytic process is tightly regulated by reversible phosphorylation and binding of regulatory proteins. Based on molecular size, immunoreactivity and amino acid sequence, several subtypes have been classified. Overall, sequence identity between subtypes is low, yet all isoforms share two conserved domains, designated X and Y.
All eukaryotic PI-PLCs contain two regions of homology, sometimes referred to as 'X-box' and 'Y-box'. The order of these two regions is always the same (NH2-X-Y-COOH), but the spacing is variable. In most isoforms, the distance between these two regions is only 50-100 residues; for example, in PLC-beta subtypes, the X and Y domains are separated by a stretch of 70-120 amino acids rich in Ser, Thr and acidic residues (their C terminus is rich in basic residues). However, in PLC-gammas, there is an insert of more than 400 residues containing a PH domain, two SH2 domains, and one SH3 domain. The two conserved X and Y domains have been shown to be important for the catalytic activity. C-terminal to the Y-box, there is a C2 domain, possibly involved in Ca2+-dependent membrane attachment. PLCs show little similarity in the 300-residue N-terminal region preceding the X-domain.
This entry represents a PLC region found towards the C-terminus which contains the X and Y boxes and the Ca2+-dependent membrane-targeting module of these proteins.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal S2 proteins have been shown to belong to a family that includes 40S ribosomal subunit 40kDa proteins, putative laminin-binding proteins, NAB-1 protein and 29.3kDa protein from Haloarcula marismortui. The laminin-receptor proteins are thus predicted to be the eukaryotic homologue of the eubacterial S2 ribosomal proteins.
This entry describes a family of small GTPase-activating proteins, such as ARF1-directed GTPase-activating protein and the cycle control GTPase-activating protein (GAP) GCS1, which is important for the regulation of the ADP ribosylation factor ARF, a member of the Ras superfamily of GTP-binding proteins. The GTP-bound form of ARF is essential for the maintenance of normal Golgi morphology; it participates in the recruitment of coat proteins, which are required for budding and fission of membranes. Before fusion with an acceptor compartment, the membrane must be uncoated. This step requires the hydrolysis of GTP associated with ARF. These proteins contain a characteristic zinc finger motif (Cys-x2-Cys-x(16,17)-Cys-x2-Cys) which displays some similarity to the C4-type GATA zinc finger. The ARFGAP domain displays no obvious similarity to other GAP proteins.
The 3D structure of the ARFGAP domain of the PYK2-associated protein beta has been solved. It consists of a three-stranded beta-sheet surrounded by 5 alpha helices. The domain is organised around a central zinc atom which is coordinated by 4 cysteines. The ARFGAP domain is clearly unrelated to the other GAP protein structures, which are exclusively helical. Classical GAP proteins accelerate GTPase activity by supplying an arginine finger to the active site. The crystal structure of ARFGAP bound to ARF revealed that the ARFGAP domain does not supply an arginine to the active site, which suggests a more indirect role of the ARFGAP domain in GTP hydrolysis.
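The zinc finger motif of these proteins can be expressed as a regular expression. The sketch below assumes the four-cysteine reading Cys-x2-Cys-x(16,17)-Cys-x2-Cys, which is consistent with the zinc atom being coordinated by 4 cysteines; the function name and the toy sequence are hypothetical.

```python
import re

# Assumed four-cysteine spacing: C-x(2)-C-x(16,17)-C-x(2)-C.
ARFGAP_ZINC_FINGER = re.compile(r"C.{2}C.{16,17}C.{2}C")


def find_zinc_fingers(sequence):
    """Return (1-based start, 1-based end) for each non-overlapping match."""
    return [
        (match.start() + 1, match.end())
        for match in ARFGAP_ZINC_FINGER.finditer(sequence.upper())
    ]


# Hypothetical toy sequence with a 16-residue spacer between the cysteine pairs.
toy_sequence = "MA" + "CAAC" + "G" * 16 + "CAAC" + "KL"
print(find_zinc_fingers(toy_sequence))
```

The pattern only checks cysteine spacing, so it cannot distinguish an ARFGAP zinc finger from other C4-type fingers with similar spacing, such as the GATA type mentioned above.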
The Rev protein of human immunodeficiency virus type 1 (HIV-1) facilitates nuclear export of unspliced and partly-spliced viral RNAs. Rev contains an RNA-binding domain and an effector domain; the latter is believed to interact with a cellular cofactor required for the Rev response and hence HIV-1 replication. Human Rev interacting protein (hRIP) specifically interacts with the Rev effector. The amino acid sequence of hRIP is characterised by an N-terminal, C-4 class zinc finger motif.
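Flavoprotein pyridine nucleotide cytochrome reductases (FPNCR) catalyse the interchange of reducing equivalents between one-electron carriers and the two-electron-carrying nicotinamide dinucleotides. The enzymes include ferredoxin:NADP+ reductases (FNR), plant and fungal NAD(P)H:nitrate reductases, NADH:cytochrome b5 reductases, NADPH:P450 reductases, NADPH:sulphite reductases, nitric oxide synthases, phthalate dioxygenase reductase, and various other flavoproteins.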
NADH:cytochrome b5 reductase (CBR) serves as electron donor for cytochrome b5, a ubiquitous electron carrier, thus participating in a variety of metabolic pathways (including steroid biosynthesis, desaturation and elongation of fatty acids, P450-dependent reactions, methaemoglobin reduction, etc.). A membrane-bound form of CBR is located on the cytosolic side of the endoplasmic reticulum, while a soluble form is found in erythrocytes. In the membrane-bound form, the N-terminal residue is myristoylated. Deficiency of the erythrocyte form causes hereditary methaemoglobinemia.
In biological nitrate assimilation, reduction of nitrate to nitrite is catalysed by the multidomain redox enzyme NAD(P)H:nitrate reductase (NR). Three forms of NR are known: an NADH-specific enzyme found in higher plants and algae; an NAD(P)H-bispecific enzyme found in higher plants, algae and fungi; and an NADPH-specific enzyme found only in fungi. NR can be divided into 3 structure/function domains: the molybdopterin cofactor binds in the N-terminal domain; the central region is the cytochrome b domain, which is similar to animal cytochrome b5; and the C-terminal portion of the protein is occupied by the FAD/NAD(P)H binding domain, which is similar to CBR. The catalytic reduction of nitrate to nitrite can be viewed as a single polypeptide electron transport chain with electron flow from NAD(P)H -> FAD -> cytochrome b5 -> molybdopterin -> NO(3). Thus, the flavin domain of NR is functionally identical to CBR.
To date, the 3D-structures of the flavoprotein domain of Zea mays (Maize) nitrate reductase and of Sus scrofa (Pig) NADH:cytochrome b5 reductase have been solved. The overall fold is similar to that of ferredoxin:NADP+ reductase: the FAD-binding domain (N-terminal) has the topology of an anti-parallel beta-barrel, while the NAD(P)-binding domain (C-terminal) has the topology of a classical pyridine dinucleotide-binding fold (i.e. a central parallel beta-sheet flanked by 2 helices on each side).
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.
This entry describes the core region of type IA topoisomerases, which are highly conserved enzymes that are structurally distinct from type IB enzymes. The structures of both topoisomerases I and III have been elucidated, and consist of four domains that together form a toroidal molecule with a central hole that is large enough to accommodate single- and double-stranded DNA. It is believed that the domains transiently separate from one another to allow the entrance and exit of DNA strands.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils.
Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions (differs between eukaryotic and bacterial enzymes), domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA.
Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.
This entry represents subunit B (gyrB and parE) of bacterial gyrase and topoisomerase IV, and the equivalent N-terminal region in eukaryotic topoisomerase II composed of a single polypeptide. This subunit has ATPase and subunit interaction capacity.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
Thioredoxins are small disulphide-containing redox proteins that have been found in all the kingdoms of living organisms. Thioredoxin serves as a general protein disulphide oxidoreductase. It interacts with a broad range of proteins by a redox mechanism based on reversible oxidation of two cysteine thiol groups to a disulphide, accompanied by the transfer of two electrons and two protons. The net result is the covalent interconversion of a disulphide and a dithiol. In the NADPH-dependent protein disulphide reduction, thioredoxin reductase (TR) catalyses the reduction of oxidised thioredoxin (trx) by NADPH using FAD and its redox-active disulphide; reduced thioredoxin then directly reduces the disulphide in the substrate protein.
Thioredoxin is present in prokaryotes and eukaryotes and the sequence around the redox-active disulphide bond is well conserved. All thioredoxins contain a cis-proline located in a loop preceding beta-strand 4, which makes contact with the active site cysteines, and is important for stability and function. Thioredoxin belongs to a structural family that includes glutaredoxin, glutathione peroxidase, bacterial protein disulphide isomerase DsbA, and the N-terminal domain of glutathione transferase. Thioredoxins have a beta-alpha unit preceding the motif common to all these proteins.
A number of eukaryotic proteins contain domains evolutionary related to thioredoxin, most of them are protein disulphide isomerases (PDI). PDI is an endoplasmic reticulum multi-functional enzyme that catalyses the formation and rearrangement of disulphide bonds during protein folding. All PDI contains two or three (ERp72) copies of the thioredoxin domain, each of which contributes to disulphide isomerase activity, but which are functionally non-equivalent. Moreover, PDI exhibits chaperone-like activity towards proteins that contain no disulphide bonds, i.e. behaving independently of its disulphide isomerase activity. The various forms of PDI which are currently known are:
Bacterial proteins that act as thiol:disulphide interchange proteins, allowing disulphide bond formation in some periplasmic proteins, also contain a thioredoxin domain. These proteins are:
This entry represents the thioredoxin domain and homologous domains in other proteins. The motifs in this signature span an invariant Trp residue, the N-terminus of helix 2 that contains two cysteines that form the redox-active disulphide bond, and the fourth beta strand containing an invariant cis-proline.
The natural resistance-associated macrophage protein (NRAMP) family consists of Nramp1, Nramp2, and yeast proteins Smf1 and Smf2. The NRAMP family is a novel family of functionally related proteins defined by a conserved hydrophobic core of ten transmembrane domains. Nramp1 is an integral membrane protein expressed exclusively in cells of the immune system and is recruited to the membrane of a phagosome upon phagocytosis. Nramp2 is a multiple divalent cation transporter for Fe2+, Mn2+ and Zn2+ amongst others. It is expressed at high levels in the intestine and is the major transferrin-independent iron uptake system in mammals. The yeast proteins Smf1 and Smf2 may also transport divalent cations.
The natural resistance of mice to infection with intracellular parasites is controlled by the Bcg locus, which modulates the cytostatic/cytocidal activity of phagocytes. Nramp1, the gene responsible, is expressed exclusively in macrophages and polymorphonuclear leukocytes, and encodes a polypeptide (natural resistance-associated macrophage protein) with features typical of integral membrane proteins. Other transporter proteins from a variety of sources also belong to this family.
Regulated exocytosis of neurotransmitters and hormones, as well as intracellular traffic, requires fusion of two lipid bilayers. SNARE proteins are thought to form a protein bridge, the SNARE complex, between an incoming vesicle and the acceptor compartment. SNARE proteins contribute to the specificity of membrane fusion, implying that the mechanisms by which SNAREs are targeted to subcellular compartments are important for specific docking and fusion of vesicles. This mechanism involves a family of conserved proteins, members of which appear to function at all sites of constitutive and regulated secretion in eukaryotes. Among them are 2 types of cytosolic protein, NSF (N-ethylmaleimide-sensitive factor) and the SNAPs (alpha-, beta- and gamma-soluble NSF attachment proteins). The yeast vesicular fusion protein, sec17, a cytoplasmic peripheral membrane protein involved in vesicular transport between the endoplasmic reticulum and the Golgi apparatus, shows a high degree of sequence similarity to the alpha-SNAP family.
SNAP-25 and its non-neuronal homologue Syndet/SNAP-23 are synthesized as soluble proteins in the cytosol. Both SNAP-25 and Syndet/SNAP-23 are palmitoylated at cysteine residues clustered in a loop between two N- and C-terminal coils, and palmitoylation is essential for membrane binding and plasma membrane targeting. The C-terminal and the N-terminal helices of SNAP-25 are each targeted to the plasma membrane by two distinct cysteine-rich domains and appear to regulate the availability of SNAP to form complexes with SNARE.
Many members of the Ras superfamily of GTPases have been implicated in the regulation of hematopoietic cells, with roles in growth, survival, differentiation, cytokine production, chemotaxis, vesicle-trafficking, and phagocytosis. The Ras superfamily of proteins now includes over 150 small GTPases (distinguished from the large, heterotrimeric GTPases, the G-proteins). It comprises six subfamilies, the Ras, Rho, Ran, Rab, Arf, and Kir/Rem/Rad subfamilies. They exhibit remarkable overall amino acid identities, especially in the regions interacting with the guanine nucleotide exchange factors that catalyse their activation.
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.
Casein kinase, a ubiquitous, well-conserved protein kinase involved in cell metabolism and differentiation, is characterised by its preference for Ser or Thr in acidic stretches of amino acids. The enzyme is a tetramer of 2 alpha- and 2 beta-subunits. However, some species (e.g., mammals) possess 2 related forms of the alpha-subunit (alpha and alpha'), while others (e.g., fungi) possess 2 related beta-subunits (beta and beta'). The alpha-subunit is the catalytic unit and contains regions characteristic of serine/threonine protein kinases. The beta-subunit is believed to be regulatory, possessing an N-terminal auto-phosphorylation site, an internal acidic domain, and a potential metal-binding motif. The beta-subunit is a highly conserved protein of about 25 kDa that contains, in its central section, a cysteine-rich motif, CX(n)C, that could be involved in binding a metal such as zinc. The mammalian beta-subunit gene promoter shares common features with those of other mammalian protein kinases and is closely related to the promoter of the regulatory subunit of cAMP-dependent protein kinase.
Hexokinase is an important enzyme that catalyses the ATP-dependent conversion of aldo- and keto-hexose sugars to the hexose-6-phosphate (H6P). The enzyme can catalyse this reaction on glucose, fructose, sorbitol and glucosamine, and as such is the first step in a number of metabolic pathways. The addition of a phosphate group to the sugar acts to trap it in a cell, since the negatively charged phosphate cannot easily traverse the plasma membrane.
The enzyme is widely distributed in eukaryotes. There are three isozymes of hexokinase in yeast (PI, PII and glucokinase): isozymes PI and PII phosphorylate both aldo- and keto-sugars; glucokinase is specific for aldo-hexoses. All three isozymes contain two domains. Structural studies of yeast hexokinase reveal a well-defined catalytic pocket that binds ATP and hexose, allowing easy transfer of the phosphate from ATP to the sugar. Vertebrates contain four hexokinase isozymes, designated I to IV, where types I to III contain a duplication of the two-domain yeast-type hexokinases. Both the N- and C-terminal halves bind hexose and H6P, though in types I and III only the C-terminal half supports catalysis, while both halves support catalysis in type II. The N-terminal half is the regulatory region. Type IV hexokinase is similar to the yeast enzyme in containing only the two domains, and is sometimes incorrectly referred to as glucokinase.
The different vertebrate isozymes differ in their catalysis, localisation and regulation, thereby contributing to the different patterns of glucose metabolism in different tissues. Whereas types I to III can phosphorylate a variety of hexose sugars and are inhibited by glucose-6-phosphate (G6P), type IV is specific for glucose and shows no G6P inhibition. Type I enzyme may have a catabolic function, producing H6P for energy production in glycolysis; it is bound to the mitochondrial membrane, which enables the coordination of glycolysis with the TCA cycle. Types II and III enzyme may have anabolic functions, providing H6P for glycogen or lipid synthesis. Type IV enzyme is found in the liver and pancreatic beta-cells, where it is controlled by insulin (activation) and glucagon (inhibition). In pancreatic beta-cells, type IV enzyme acts as a glucose sensor to modify insulin secretion. Mutations in type IV hexokinase have been associated with diabetes mellitus.
Phosphofructokinase (PFK) is ~300 amino acids in length, and structural studies of the bacterial enzyme have shown it comprises two similar (alpha/beta) lobes: one involved in ATP binding and the other housing both the substrate-binding site and the allosteric site (a regulatory binding site distinct from the active site, but that affects enzyme activity). The identical tetramer subunits adopt 2 different conformations: in a 'closed' state, the bound magnesium ion bridges the phosphoryl groups of the enzyme products (ADP and fructose-1,6-bisphosphate); and in an 'open' state, the magnesium ion binds only the ADP, as the 2 products are now further apart. These conformations are thought to be successive stages of a reaction pathway that requires subunit closure to bring the 2 molecules sufficiently close to react.
Deficiency in PFK leads to glycogenosis type VII (Tarui's disease), an autosomal recessive disorder characterised by severe nausea, vomiting, muscle cramps and myoglobinuria in response to bursts of intense or vigorous exercise. Sufferers are usually able to lead a reasonably ordinary life by learning to adjust activity levels.
Phosphoglycerate kinase (PGK) is an enzyme that catalyses the reversible transfer of a phosphate group between 1,3-diphosphoglycerate and ADP, interconverting ADP and ATP. In the second step of the second phase of glycolysis, 1,3-diphosphoglycerate is converted to 3-phosphoglycerate, forming one molecule of ATP; in the reverse reaction, one molecule of ADP is formed. This reaction is essential in most cells for the generation of ATP in aerobes, for fermentation in anaerobes and for carbon fixation in plants.
PGK is found in all living organisms and its sequence has been highly conserved throughout evolution. The enzyme exists as a monomer containing two nearly equal-sized domains that correspond to the N- and C-termini of the protein (the last 15 C-terminal residues loop back into the N-terminal domain). 3-phosphoglycerate (3-PG) binds to the N-terminal domain, while the nucleotide substrates, MgATP or MgADP, bind to the C-terminal domain of the enzyme. This extended two-domain structure is associated with large-scale 'hinge-bending' conformational changes, similar to those found in hexokinase. At the core of each domain is a 6-stranded parallel beta-sheet surrounded by alpha helices. Domain 1 has a parallel beta-sheet of six strands with an order of 342156, while domain 2 has a parallel beta-sheet of six strands with an order of 321456. Analysis of the reversible unfolding of yeast phosphoglycerate kinase leads to the conclusion that the two lobes are capable of folding independently, consistent with the presence of intermediates on the folding pathway with a single domain folded.
Phosphoglycerate kinase (PGK) deficiency is associated with haemolytic anaemia and mental disorders in man.
This entry represents the full PGK enzyme.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
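As a rough illustration of how the two motif definitions above differ in stringency, the following Python sketch scans a sequence for the plain HEXXH pattern and for the stricter 'abXHEbbHbc' pattern. The residue classes are assumptions made for this example only: 'a' is restricted to valine or threonine (the text says only "most often"), 'b' is taken to be any residue other than D, E, K, R, H or P, and 'c' is taken to be one of A, V, L, I, M, F, W or Y. The function name and the test fragment are hypothetical.

```python
import re

# Assumed residue classes; the entry text does not enumerate them.
UNCHARGED = "ACFGILMNQSTVWY"   # assumption: excludes D, E, K, R, H and P
HYDROPHOBIC = "AVLIMFWY"       # assumption

# Lookaheads are used so that overlapping occurrences are also reported.
HEXXH = re.compile(r"(?=(HE..H))")
STRINGENT = re.compile(
    rf"(?=([VT][{UNCHARGED}].HE[{UNCHARGED}]{{2}}H[{UNCHARGED}][{HYDROPHOBIC}]))"
)


def find_motifs(sequence, pattern):
    """Return (1-based start, matched text) for each occurrence of pattern."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    fragment = "GSVVAHELTHAVTDKHEPRHD"  # made-up test fragment
    print(find_motifs(fragment, HEXXH))
    print(find_motifs(fragment, STRINGENT))
```

In this made-up fragment the plain pattern matches twice, while the stricter pattern keeps only the occurrence whose flanking residues fit the assumed classes, which is the sense in which the longer definition is "more stringent".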
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of metallopeptidases belong to the MEROPS peptidase family M17 (leucyl aminopeptidase family, clan MF), the type example being leucyl aminopeptidase from Bos taurus (Bovine).
Aminopeptidases are exopeptidases involved in the processing and regular turnover of intracellular proteins, although their precise role in cellular metabolism is unclear. Leucine aminopeptidases cleave leucine residues from the N-terminal of polypeptide chains, but substantial rates are evident for all amino acids.
The enzymes exist as homo-hexamers, comprising 2 trimers stacked on top of one another. Each monomer binds 2 zinc ions and folds into 2 alpha/beta-type quasi-spherical globular domains, producing a comma-like shape. The N-terminal 150 residues form a 5-stranded beta-sheet with 4 parallel and 1 anti-parallel strand sandwiched between 4 alpha-helices. An alpha-helix extends into the C-terminal domain, which comprises a central 8-stranded saddle-shaped beta-sheet sandwiched between groups of helices, forming the monomer hydrophobic core. A 3-stranded beta-sheet resides on the surface of the monomer, where it interacts with other members of the hexamer. The 2 zinc ions and the active site are entirely located in the C-terminal catalytic domain.
Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified. This sequence motif represents one beta-sheet blade, and several of these repeats can associate to form a beta-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein, creating a 6-bladed beta-propeller. The motif is also found in mouse protein MIPP and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin, and in galactose oxidase from the fungus Dactylium dendroides. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded anti-parallel beta-sheet motif that forms the repeat unit in a super-barrel structural fold.
The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase.
This entry represents a kelch sequence motif that comprises one beta-sheet blade.
The precise function of the domain is unclear, but it may be involved in protein-protein interactions and may play a role in assembly or activity of multi-component complexes involved in transcriptional activation.
In prokaryotes, the major role of DNA methylation is to protect host DNA against degradation by restriction enzymes. There are 2 major classes of DNA methyltransferase that differ in the nature of the modifications they effect. The members of one class (C-MTases) methylate a ring carbon and form C5-methylcytosine. Members of the second class (N-MTases) methylate exocyclic nitrogens and form either N4-methylcytosine (N4-MTases) or N6-methyladenine (N6-MTases). Both classes of MTase utilise the cofactor S-adenosyl-L-methionine (SAM) as the methyl donor and are active as monomeric enzymes.
N-6 adenine-specific DNA methylases (A-Mtase) are enzymes that specifically methylate the amino group at the C-6 position of adenines in DNA. Such enzymes are found in the three existing types of bacterial restriction-modification systems (in type I system the A-Mtase is the product of the hsdM gene, and in type III it is the product of the mod gene). All of these enzymes recognise a specific sequence in DNA and methylate an adenine in that sequence. It has been shown that A-Mtases contain a conserved motif Asp/Asn-Pro-Pro-Tyr/Phe in their N-terminal section; this conserved region could be involved in substrate binding or in the catalytic activity. The structure of N6-MTase TaqI (M.TaqI) has been resolved to 2.4 Å. The molecule folds into 2 domains: an N-terminal catalytic domain, which contains the catalytic and cofactor binding sites and comprises a central 9-stranded beta-sheet surrounded by 5 helices; and a C-terminal DNA recognition domain, which is formed by 4 small beta-sheets and 8 alpha-helices. The N- and C-terminal domains form a cleft that accommodates the DNA substrate. A classification of N-MTases has been proposed, based on conserved motif (CM) arrangements. According to this classification, N6-MTases that have an NPPY motif (CM II) occurring after the FxGxG motif (CM I) are designated N12 class N6-adenine MTases.
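The motif-order criterion described above can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the two conserved motifs are approximated by the regular expressions `F.G.G` (CM I) and `[DN]PP[YF]` (CM II), only the first occurrence of each is considered, and the function name and toy sequences are hypothetical. A real classification would rely on curated alignments rather than bare pattern matches.

```python
import re

CM_I = re.compile(r"F.G.G")        # FxGxG motif (CM I), x = any residue
CM_II = re.compile(r"[DN]PP[YF]")  # Asp/Asn-Pro-Pro-Tyr/Phe motif (CM II)


def motif_order(sequence):
    """Report the relative order of the first CM I and CM II matches.

    Returns 'CM I before CM II', 'CM II before CM I',
    or None when either motif is absent.
    """
    seq = sequence.upper()
    cm1 = CM_I.search(seq)
    cm2 = CM_II.search(seq)
    if cm1 is None or cm2 is None:
        return None
    if cm1.start() < cm2.start():
        return "CM I before CM II"
    return "CM II before CM I"


if __name__ == "__main__":
    print(motif_order("MKAFSGAGLLLNPPYTT"))  # toy sequence
    print(motif_order("MDPPFAAAFAGSGK"))     # toy sequence
```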
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of metallopeptidases belong to MEROPS peptidase family M24 (clan MG), subfamilies M24A and M24B.
Methionine aminopeptidase (MAP) is responsible for the removal of the amino-terminal (initiator) methionine from nascent eukaryotic cytosolic and cytoplasmic prokaryotic proteins if the penultimate amino acid is small and uncharged. All MAP studied to date are monomeric proteins that require cobalt ions for activity.
Two subfamilies of MAP enzymes are known to exist. Although evolutionarily related, they share only a limited amount of sequence similarity, mostly clustered around the residues shown to be involved in cobalt-binding. The first subfamily consists of enzymes from prokaryotes as well as eukaryotic MAP-1, while the second is made up of archaeal MAP and eukaryotic MAP-2. The second subfamily also includes proteins which do not seem to be MAP, but that are clearly evolutionarily related, such as mouse proliferation-associated protein 1 and fission yeast curved DNA-binding protein.
Ambler recognised four classes of cytC.
Class I includes the low-spin soluble cytC of mitochondria and bacteria, with the haem-attachment site towards the N-terminus, and the sixth ligand provided by a methionine residue about 40 residues further on towards the C-terminus. On the basis of sequence similarity, class I cytC were further subdivided into five classes, IA to IE. Class IB includes the eukaryotic mitochondrial cyt C and prokaryotic 'short' cyt C2 exemplified by Rhodopila globiformis cyt C2; Class IA includes 'long' cyt C2, such as Rhodospirillum rubrum cyt C2 and Aquaspirillum itersonii cyt C-550, which have several extra loops by comparison with Class IB cyt C.
The 3D structures of a considerable number of class IA and IB cytC have been determined. The proteins consist of 3-6 alpha-helices; the three most conserved 'core' helices form a 'basket' around the haem group, with one haem edge exposed to the solvent. Most class I cytC have conserved aromatic residues clustered around the haem and axial ligands.
The CCAAT-binding factor (CBF) is a mammalian transcription factor that binds to a CCAAT motif in the promoters of a wide variety of genes, including type I collagen and albumin. The factor is a heteromeric complex of A and B subunits, both of which are required for DNA-binding. The subunits can interact in the absence of DNA-binding, conserved regions in each being important in mediating this interaction.
The A subunit can be split into 3 domains on the basis of sequence similarity: a non-conserved N-terminal 'A domain'; a highly-conserved central 'B domain' involved in DNA-binding; and a C-terminal 'C domain', which contains a number of glutamine and acidic residues involved in protein-protein interactions. The A subunit shows striking similarity to the HAP3 subunit of the yeast CCAAT-binding heterotrimeric transcription factor. The Kluyveromyces lactis HAP3 protein has been predicted to contain a 4-cysteine zinc finger, which is thought to be present in similar HAP3 and CBF subunit A proteins, in which the third cysteine is replaced by a serine. This family also includes DNA topoisomerase II, which controls the topology of DNA by transient breaking of the strands and rejoining.
Histone H3 is one of the four histones, along with H2A, H2B and H4, which form the eukaryotic nucleosome octamer core; the nucleosome octamer winds ~146 DNA base-pairs. It is a highly conserved protein of 135 amino acid residues.
Several proteins have been found to contain a C-terminal H3-like domain, including the mammalian centromeric protein CENP-A (which may act as a core histone necessary for the assembly of centromeres); yeast chromatin-associated protein CSE4; and Caenorhabditis elegans chromosome III proteins YL82_CAEEL and YMH3_CAEEL, whose function is unknown.
Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolysing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate binding domain. Dimeric GrpE is the co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle.
Besides stimulating the ATPase activity of DnaK through its J-domain, DnaJ also associates with unfolded polypeptide chains and prevents their aggregation. Thus, DnaK and DnaJ may bind to one and the same polypeptide chain to form a ternary complex. The formation of a ternary complex may result in cis-interaction of the J-domain of DnaJ with the ATPase domain of DnaK. An unfolded polypeptide may enter the chaperone cycle by associating first either with ATP-liganded DnaK or with DnaJ. DnaK interacts with both the backbone and side chains of a peptide substrate; it thus shows binding polarity and admits only L-peptide segments. In contrast, DnaJ has been shown to bind both L- and D-peptides and is assumed to interact only with the side chains of the substrate.
DnaJ comprises a 70-residue N-terminal domain (the J-domain); a 30-residue glycine-rich region (the G-domain); a central domain containing 4 repeats of a CxxCxGxG motif (the CRR-domain); and a 120-170 residue C-terminal region. The J- and CRR-domains are found in many prokaryotic and eukaryotic proteins, either together or separately.
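The CRR-domain description above lends itself to a simple pattern count. The sketch below, assuming that 'x' stands for any single residue, counts non-overlapping matches of CxxCxGxG in a sequence; the function name and the toy sequence are hypothetical and not taken from any real DnaJ protein.

```python
import re

# CxxCxGxG, with x interpreted as any single residue (assumption).
CRR_REPEAT = re.compile(r"C..C.G.G")


def count_crr_repeats(sequence):
    """Count non-overlapping CxxCxGxG matches in a protein sequence."""
    return len(CRR_REPEAT.findall(sequence.upper()))


if __name__ == "__main__":
    # Toy sequence built from four made-up repeats separated by short linkers.
    toy = "MA" + "CPTCHGSG" + "VK" + "CKECQGRG" + "LL" + "CTTCGGAG" + "EE" + "CAHCNGKG" + "KK"
    print(count_crr_repeats(toy))
```

A count of four would be consistent with the central domain described in the text, although a pattern match alone does not establish that a protein is a DnaJ homologue.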
The regulator of chromosome condensation (RCC1) is a eukaryotic protein which binds to chromatin and interacts with ran, a nuclear GTP-binding protein to promote the loss of bound GDP and the uptake of fresh GTP, thus acting as a guanine-nucleotide dissociation stimulator (GDS). The interaction of RCC1 with ran probably plays an important role in the regulation of gene expression.
RCC1, known as PRP20 or SRM1 in yeast, pim1 in fission yeast and BJ1 in Drosophila, is a protein that contains seven tandem repeats of a domain of about 50 to 60 amino acids. As shown in the following schematic representation, the repeats make up the major part of the length of the protein. Outside the repeat region, there is just a small N-terminal domain of about 40 to 50 residues and, in the Drosophila protein only, a C-terminal domain of about 130 residues.
The RCC1-type of repeat is also found in the X-linked retinitis pigmentosa GTPase regulator. The RCC repeats form a beta-propeller structure.
Proteins resident in the lumen of the endoplasmic reticulum (ER) contain a C-terminal tetrapeptide, commonly known as Lys-Asp-Glu-Leu (KDEL) in mammals and His-Asp-Glu-Leu (HDEL) in yeast (Saccharomyces cerevisiae) that acts as a signal for their retrieval from subsequent compartments of the secretory pathway. The receptor for this signal is a ~26 kDa Golgi membrane protein, initially identified as the ERD2 gene product in S. cerevisiae. The receptor molecule, known variously as the ER lumen protein retaining receptor or the 'KDEL receptor', is believed to cycle between the cis side of the Golgi apparatus and the ER. It has also been characterised in a number of other species, including plants, Plasmodium, Drosophila and mammals. In mammals, 2 highly related forms of the receptor are known.
The KDEL receptor is a highly hydrophobic protein of 220 residues; its sequence exhibits 7 hydrophobic regions, all of which have been suggested to traverse the membrane. More recently, however, it has been suggested that only 6 of these regions are transmembrane (TM), resulting in both N- and C-termini on the cytoplasmic side of the membrane.
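The C-terminal retrieval signal recognised by this receptor can be checked with a very small Python sketch. It tests only for the two tetrapeptides named in this entry (KDEL and HDEL) at the extreme C-terminus; the function name and example sequences are hypothetical, and other variants of the signal are deliberately not modelled.

```python
# Tetrapeptides named in the entry text; other variants are not covered here.
RETRIEVAL_SIGNALS = ("KDEL", "HDEL")


def has_er_retrieval_signal(sequence):
    """Return True if the sequence ends in KDEL or HDEL (case-insensitive)."""
    return sequence.strip().upper().endswith(RETRIEVAL_SIGNALS)


if __name__ == "__main__":
    print(has_er_retrieval_signal("MKLLAAKDEL"))  # signal at the C-terminus
    print(has_er_retrieval_signal("MKDELAAA"))    # internal KDEL does not count
```

The check is anchored to the end of the sequence because the text describes the signal as a C-terminal tetrapeptide; an internal occurrence is not treated as a signal.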
Phosphoglucose isomerase (PGI) is a dimeric enzyme that catalyses the reversible isomerisation of glucose-6-phosphate and fructose-6-phosphate. PGI is involved in different pathways: in most higher organisms it is involved in glycolysis; in mammals it is involved in gluconeogenesis; in plants in carbohydrate biosynthesis; in some bacteria it provides a gateway for fructose into the Entner-Doudoroff pathway. The multifunctional protein, PGI, is also known as neuroleukin (a neurotrophic factor that mediates the differentiation of neurons), autocrine motility factor (a tumour-secreted cytokine that regulates cell motility), differentiation and maturation mediator and myofibril-bound serine proteinase inhibitor, and has different roles inside and outside the cell. In the cytoplasm, it catalyses the second step in glycolysis, while outside the cell it serves as a nerve growth factor and cytokine.
PGI from Bacillus stearothermophilus has an open twisted alpha/beta structural motif consisting of two globular domains and two protruding parts. It has been suggested that the top part of the large domain together with one of the protruding loops might participate in inducing the neurotrophic activity. The structure of rabbit muscle phosphoglucose isomerase complexed with various inhibitors shows that the enzyme is a dimer with two alpha/beta-sandwich domains in each subunit. The location of the bound D-gluconate 6-phosphate inhibitor leads to the identification of residues involved in substrate specificity. In addition, the positions of amino acid residues that are substituted in the genetic disease nonspherocytic hemolytic anemia suggest how these substitutions can result in altered catalysis or protein stability.
Genes that negatively regulate proliferation inside the cell are of considerable interest because of the implications in processes such as development and cancer. Prohibitin, a novel cytoplasmic anti-proliferative protein widely expressed in a variety of tissues, inhibits DNA synthesis. Studies have suggested that prohibitin may be a suppressor gene and is associated with tumour development and/or progression of at least some breast cancers. Sequence comparisons suggest that the prohibitin gene is an analogue of Cc, a Drosophila melanogaster gene that is vital for normal development.
In eukaryotes, transcription initiation by polymerase II is modulated by both general and specific transcription factors. The general factors (which include TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIG and TFIIH) operate through common promoter elements, such as the TATA box. Transcription factor IIB (TFIIB) is of central importance in transcription of class II genes. It associates with TFIID-TFIIA bound to DNA (the DA complex) to form a ternary TFIID-IIA-IIB (DAB) complex, which is recognised by RNA polymerase II. TFIIB comprises ~315-340 residues and contains an imperfect C-terminal repeat of a 75-residue domain that may contribute to the symmetry of the folded protein.
The TATA-box binding protein (TBP) is required for the initiation of transcription by RNA polymerases I, II and III, from promoters with or without a TATA box. TBP associates with a host of factors, including the general transcription factors TFIIA, -B, -D, -E, and -H, to form huge multi-subunit pre-initiation complexes on the core promoter. Through its association with different transcription factors, TBP can initiate transcription from different RNA polymerases. There are several related TBPs, including TBP-like (TBPL) proteins.
The C-terminal core of TBP (~180 residues) is highly conserved and contains two 77-amino acid repeats that produce a saddle-shaped structure that straddles the DNA; this region binds to the TATA box and interacts with transcription factors and regulatory proteins. By contrast, the N-terminal region varies in both length and sequence.
Acyl-CoA-binding protein (ACBP) is a small (10 kDa) protein that binds medium- and long-chain acyl-CoA esters with very high affinity and may function as an intracellular carrier of acyl-CoA esters. ACBP is also known as diazepam binding inhibitor (DBI) or endozepine (EP) because of its ability to displace diazepam from the benzodiazepine (BZD) recognition site located on the GABA type A receptor. It is therefore possible that this protein also acts as a neuropeptide to modulate the action of the GABA receptor.
ACBP is a highly conserved protein of about 90 residues that is found in all four eukaryotic kingdoms, Animalia, Plantae, Fungi and Protista, and in some eubacterial species.
Although ACBP occurs as a completely independent protein, intact ACB domains have been identified in a number of large, multifunctional proteins in a variety of eukaryotic species. These include large membrane-associated proteins with N-terminal ACB domains, multifunctional enzymes with both ACB and peroxisomal enoyl-CoA Delta(3), Delta(2)-enoyl-CoA isomerase domains, and proteins with both an ACB domain and ankyrin repeats.
The ACB domain consists of four alpha-helices arranged in a bowl shape with a highly exposed acyl-CoA-binding site. The ligand is bound through specific interactions with residues on the protein, most notably several conserved positive charges that interact with the phosphate group on the adenosine-3'-phosphate moiety, and the acyl chain is sandwiched between the hydrophobic surfaces of CoA and the protein.
Other proteins containing an ACB domain include:
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.
This group of proteins belongs to the peptidase family C1, sub-family C1A (papain family, clan CA). It includes proteins classed as non-peptidase homologues. These have either been shown experimentally to lack peptidase activity or lack one or more of the active site residues.
The papain family has a wide variety of activities, including broad-range (papain) and narrow-range endo-peptidases, aminopeptidases, dipeptidyl peptidases and enzymes with both exo- and endo-peptidase activity. Members of the papain family are widespread, found in baculovirus, eubacteria, yeast, and practically all protozoa, plants and mammals. The proteins are typically lysosomal or secreted, and proteolytic cleavage of the propeptide is required for enzyme activation, although bleomycin hydrolase is cytosolic in fungi and mammals. Papain-like cysteine proteinases are essentially synthesised as inactive proenzymes (zymogens) with N-terminal propeptide regions. The activation process of these enzymes includes the removal of propeptide regions. The propeptide regions serve a variety of functions in vivo and in vitro. The pro-region is required for the proper folding of the newly synthesised enzyme, the inactivation of the peptidase domain and stabilisation of the enzyme against denaturing at neutral to alkaline pH conditions. Amino acid residues within the pro-region mediate membrane association of the proenzyme, and play a role in its transport to lysosomes. Among the most notable features of propeptides are their ability to inhibit the activity of their cognate enzymes and the high selectivity that certain propeptides show for the peptidases from which they originate.
The catalytic residues of papain are Cys-25 and His-159, other important residues being Gln-19, which helps form the 'oxyanion hole', and Asn-175, which orientates the imidazole ring of His-159.
This group of cysteine peptidases belongs to the MEROPS peptidase family C12 (ubiquitin C-terminal hydrolase family, clan CA). Families within the CA clan are loosely termed papain-like, as the protein fold of the peptidase unit resembles that of papain, the type example for clan CA. The type example for family C12 is the human ubiquitin C-terminal hydrolase UCH-L1.
Ubiquitin is highly conserved, commonly found conjugated to proteins in eukaryotic cells, where it may act as a marker for rapid degradation, or it may have a chaperone function in protein assembly. The ubiquitin is released by cleavage from the bound protein by a protease. A number of deubiquitinising proteases are known: all are activated by thiol compounds, and inhibited by thiol-blocking agents and ubiquitin aldehyde, and as such have the properties of cysteine proteases.
The deubiquitinising proteases can be split into two size ranges (20-30 kDa and 100-200 kDa): this family comprises the 20-30 kDa polypeptides, which include the yeast yuh1. Yeast yuh1 protease is known to be active only against small ubiquitin conjugates, being inactive against conjugated beta-galactosidase. A mammalian homologue, UCH (ubiquitin conjugate hydrolase), is one of the most abundant proteins in the brain. Only one conserved cysteine can be identified, along with two conserved histidines. The spacing between the cysteine and the second histidine is thought to be more representative of the cysteine/histidine spacing of a cysteine protease catalytic dyad.
Stomatin is also known as erythrocyte membrane protein band 7.2b. It is a 31 kDa membrane protein, and was named after the rare human disease: haemolytic anaemia hereditary stomatocytosis. The protein contains a single hydrophobic domain, close to the N-terminus, and is phosphorylated.
Stomatin is believed to be involved in regulating monovalent cation transport through lipid membranes. Absence of the protein in hereditary stomatocytosis is believed to be the reason for the leakage of Na+ and K+ ions into and from erythrocytes.
A second function of stomatin is to act as a cytoskeletal anchor. One possible example of this is its interaction with some anti-malarial drugs. Current opinion speculates that such drugs bind to high density lipoproteins in serum. The lipoproteins are delivered to erythrocytes, where it is believed they interact with stomatin as a means of transfer to the intracellular parasite, via a pathway used for the uptake of exogenous phospholipid.
Stomatin-like proteins have been identified in various organisms, including Caenorhabditis elegans and Mus musculus.
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
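The relationship between the linear order of the catalytic residues and the clan can be illustrated with a minimal Python sketch. The function names and the lookup table are hypothetical helpers written for this illustration; they cover only the three orderings quoted above. The example positions are those given for mature E. coli DegP elsewhere in this text (His105, Asp135, Ser210).

```python
# Linear (N- to C-terminal) order of the catalytic Ser/Asp/His residues,
# mapped to the clans named in the text. Only these three cases are covered.
TRIAD_ORDER_TO_CLAN = {
    "HDS": "PA (chymotrypsin)",
    "DHS": "SB (subtilisin)",
    "SDH": "SC (carboxypeptidase)",
}


def triad_order(triad):
    """Return the one-letter residue codes sorted by sequence position.

    triad: dict mapping a one-letter residue code to its position,
    e.g. {"H": 105, "D": 135, "S": 210}.
    """
    return "".join(sorted(triad, key=triad.get))


def clan_from_triad(triad):
    """Look up the clan implied by the triad order, or 'unknown'."""
    return TRIAD_ORDER_TO_CLAN.get(triad_order(triad), "unknown")


if __name__ == "__main__":
    degp = {"H": 105, "D": 135, "S": 210}  # positions quoted for mature DegP
    print(triad_order(degp), "->", clan_from_triad(degp))
```

A triad order is suggestive rather than decisive: clan assignment in practice rests on the structural fold, so a lookup like this is only a first-pass consistency check.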
This group of serine peptidases belongs to the MEROPS peptidase families S8 (subfamilies S8A (subtilisin) and S8B (kexin)) and S53 (sedolisin), both of which are members of clan SB.
The subtilisin family is the second largest serine protease family characterised to date. Over 200 subtilases are presently known, more than 170 of them with complete amino acid sequences. The family is widespread, being found in eubacteria, archaebacteria, eukaryotes and viruses. The vast majority of the family are endopeptidases, although there is an exopeptidase, tripeptidyl peptidase. Structures have been determined for several members of the subtilisin family: they exploit the same catalytic triad as the chymotrypsins, although the residues occur in a different order (HDS in chymotrypsin and DHS in subtilisin), but the structures show no other similarity. Some subtilisins are mosaic proteins, and others contain N- and C-terminal extensions that show no sequence similarity to any other known protein. Based on sequence homology, a subdivision into six families has been proposed.
The proprotein-processing endopeptidases kexin, furin and related enzymes form a distinct subfamily known as the kexin subfamily (S8B). These preferentially cleave C-terminally to paired basic amino acids. Members of this subfamily can be identified by subtly different motifs around the active site. Members of the kexin family, along with endopeptidases R, T and K from the fungus Tritirachium and cuticle-degrading peptidase from Metarhizium, require thiol activation. This can be attributed to the presence of Cys-173 near to the active histidine. Only one viral member of the subtilisin family is known, a 56 kDa protease from herpes virus 1, which infects the channel catfish.
Sedolisins (serine-carboxyl peptidases) are proteolytic enzymes whose fold resembles that of subtilisin; however, they are considerably larger, with the mature catalytic domains containing approximately 375 amino acids. The defining features of these enzymes are a unique catalytic triad, Ser-Glu-Asp, as well as the presence of an aspartic acid residue in the oxyanion hole. High-resolution crystal structures have now been solved for sedolisin from Pseudomonas sp. 101, as well as for kumamolisin from a thermophilic bacterium, Bacillus sp. MN-32. Mutations in the human gene for a member of this family lead to a fatal neurodegenerative disease.
This group of serine peptidases belongs to MEROPS peptidase family S26 (signal peptidase I family, clan SF), subfamily S26A.
At least 3 eubacterial leader peptidases are known: murein prelipoprotein peptidase, which cleaves the leader peptide from a component of the bacterial outer membrane; type IV prepilin leader peptidase; and the serine-dependent leader peptidase 1, which has the more general role of cleaving the leader peptide from a variety of secreted proteins and proteins directed to the periplasm and periplasmic membrane. Leader peptidase 1 is similar to the eukaryotic signal peptidase, although the bacterial protein is monomeric, while the eukaryotic protein is multimeric.
Mitochondria contain a similar two-subunit serine protease that removes leader peptides from nuclear- and mitochondrial-encoded proteins, which localise in the inner mitochondrial space. The catalytic residues of a number of these peptidases have been identified as a serine/lysine dyad.
This group of serine peptidases belongs to MEROPS peptidase family S26 (signal peptidase I family, clan SF), subfamily S26B.
Eukaryotic microsomal signal peptidase is involved in the removal of signal peptides from secretory proteins as they pass into the endoplasmic reticulum lumen. The peptidase is more complex than its mitochondrial and bacterial counterparts, containing a number of subunits, ranging from two in the chicken oviduct peptidase, to five in the dog pancreas protein. They share sequence similarity with the bacterial leader peptidases (family S26A), although activity here is mediated by a serine/histidine dyad rather than a serine/lysine dyad. Archaeal signal peptidases also belong to this group.
Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolysing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate binding domain. In prokaryotes, dimeric GrpE is the co-chaperone for DnaK and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle.
The X-ray crystal structure of GrpE in complex with the ATPase domain of DnaK revealed that GrpE is an asymmetric homodimer, bent in a manner that favours extensive contacts with only one DnaK ATPase monomer. GrpE does not actively compete for the atomic positions occupied by the nucleotide. GrpE and ADP mutually reduce one another's affinity for DnaK 200-fold, and ATP instantly dissociates GrpE from DnaK.
Prokaryotes and eukaryotes respond to heat shock and other forms of environmental stress by inducing synthesis of heat-shock proteins (hsp). The 90 kDa heat shock protein, Hsp90, is one of the most abundant proteins in eukaryotic cells, comprising 1–2% of cellular proteins under non-stress conditions. Its contribution to various cellular processes including signal transduction, protein folding, protein degradation and morphological evolution has been extensively studied. The full functional activity of Hsp90 is gained in concert with other co-chaperones, playing an important role in the folding of newly synthesised proteins and stabilisation and refolding of denatured proteins after stress. Apart from its co-chaperones, Hsp90 binds to an array of client proteins, where the co-chaperone requirement varies and depends on the actual client.
The sequences of hsp90s show a distinctive domain structure, with a highly-conserved N-terminal domain separated from a conserved, acidic C-terminal domain by a highly-acidic, flexible linker region.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
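The difference between the loose HEXXH motif and the more stringent 'abXHEbbHbc' form can be sketched as two regular expressions in Python. This is only an illustration: the residue sets used for 'b' (uncharged) and 'c' (hydrophobic), the restriction of 'a' to valine or threonine, the exclusion of proline from both sets, and the two test sequences are assumptions made for the example, not definitions taken from the text.

```python
import re

# Loose motif: His-Glu-X-X-His. The lookahead allows overlapping hits.
HEXXH = re.compile(r"(?=(HE..H))")

# Assumed residue classes (illustrative only):
UNCHARGED = "ACFGILMNQSTVWY"   # assumption: excludes D, E, K, R, H and P
HYDROPHOBIC = "ACFILMVWY"      # assumption: one common hydrophobic set

# Stricter form 'abXHEbbHbc'; 'a' is restricted here to V or T, although the
# text says only that 'a' is "most often" valine or threonine.
STRICT = re.compile(
    rf"(?=([VT][{UNCHARGED}].HE[{UNCHARGED}]{{2}}H[{UNCHARGED}][{HYDROPHOBIC}]))"
)


def find_motif(pattern, seq):
    """Return (0-based start, matched text) for every hit, overlaps included."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    toy_a = "MKVAGHELGHSLAAA"   # hypothetical sequence, satisfies both forms
    toy_b = "DDKHEPPHDD"        # hypothetical sequence, loose motif only
    for seq in (toy_a, toy_b):
        print(seq, find_motif(HEXXH, seq), find_motif(STRICT, seq))
```

The second toy sequence shows why the stricter pattern is useful: it contains HEXXH by chance but lacks the flanking context, so only the loose pattern reports it.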
This group of metallopeptidases belongs to MEROPS peptidase family M22 (clan MK). The type example is O-sialoglycoprotein endopeptidase from Pasteurella haemolytica (Mannheimia haemolytica).
O-Sialoglycoprotein endopeptidase is secreted by the bacterium P. haemolytica, and digests only proteins that are heavily sialylated, in particular those with sialylated serine and threonine residues. Substrate proteins include glycophorin A and leukocyte surface antigens CD34, CD43, CD44 and CD45. Removal of glycosylation, by treatment with neuraminidase, completely negates susceptibility to O-sialoglycoprotein endopeptidase digestion.
Sequence similarity searches have revealed other members of the M22 family, from yeast, Mycobacterium, Haemophilus influenzae and the cyanobacterium Synechocystis. The zinc-binding and catalytic residues of this family have not been determined, although the motif HMEGH may be a zinc-binding region.
Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described.
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.
This group of aspartic peptidases belongs to MEROPS peptidase family A1 (pepsin family, clan AA). The type example is pepsin A from Homo sapiens (Human).
More than 70 aspartic peptidases, all from eukaryotic organisms, have been identified. These include pepsins, cathepsins, and renins. The enzymes are synthesised with signal peptides, and the proenzymes are secreted or passed into the lysosomal/endosomal system, where acidification leads to autocatalytic activation.
Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions. Crystallography has shown the active site to form a groove across the junction of the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors within the active site. Specificity is determined by several hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap. Cysteine residues are well conserved within the pepsin family, pepsin itself containing three disulphide loops. The first loop is found in all but the fungal enzymes, and is usually around five residues in length, but is longer in barrierpepsin and candidapepsin; the second loop is also small and found only in the animal enzymes; and the third loop is the largest, found in all members of the family, except for the cysteine-free polyporopepsin. The loops are spread unequally throughout the two lobes, suggesting that they formed after the initial gene duplication and fusion event.
This family includes neither the retroviral nor the retrotransposon aspartic proteases, which are much smaller and appear to be homologous to the single domain aspartic proteases.
L-aspartate + 2-oxoglutarate = oxaloacetate + L-glutamate
Aminotransferases share certain mechanistic features with other pyridoxal-phosphate-dependent enzymes, such as the covalent binding of the pyridoxal-phosphate group to a lysine residue. This family also includes some aromatic-amino-acid aminotransferases.
This group of serine peptidases belongs to the MEROPS peptidase family S16 (lon protease family, clan SF). The type example is the lon protease of Escherichia coli.
Lon (La) protease was the first ATP-dependent protease to be purified from E. coli. The enzyme is a homotetramer of 87 kDa subunits, with one proteolytic and one ATP-binding site per monomer, making it structurally less complex than other known ATP-dependent proteases. Despite this relative structural simplicity, lon recognises its substrates directly, without delegating the task of substrate recognition to other enzymes. By contrast, ClpP endopeptidases (S14, clan SK) are multimeric assemblies of two different types of subunit, one of which has ATPase activity, and the other has proteolytic activity.
Other members of this group include:
The family also includes proteins classified as non-peptidase homologues that either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity. A significant number of the non-peptidase homologues of S16 are described as Mg chelatase-related proteins.
This group of serine peptidases and non-peptidase homologues belongs to the MEROPS peptidase family S1, subfamily S1C (protease Do subfamily, clan PA(S)). A type example is the protease Do from Escherichia coli.
Other members of this group include the E. coli htrA gene product (HtrA or DegP protein), which is essential for bacterial survival at temperatures above 42 °C and for digesting misfolded protein in the periplasm. Mature DegP from E. coli has 448 residues, of which His105, Asp135, and Ser210 form the catalytic triad. The protein has an N-terminal sequence typical of a leader peptide. Structural analysis indicates that bacterial HtrA is a serine protease belonging to the family of cage-forming proteases and that only unfolded polypeptides can be threaded in extended conformation into the cage to access the proteolytic sites. Disulphide bonds of partially unfolded substrates impede protein breakdown and represent a conformational constraint for entering the inner cavity. This preference for unfolded polypeptides might also be a reason for the ATP-independent mode of action and for the increased proteolytic activity at higher temperatures.
The HtrA family shares a modular architecture composed of an N-terminal segment believed to have regulatory functions, a conserved trypsin-like protease domain, and one or two PDZ domains which mediate specific protein-protein interactions and bind preferentially to the C-terminal three to four residues of the target protein. HtrA belongs to the trypsin clan SA. SA proteases have a two-domain structure with each domain forming a six-stranded barrel. The active site cleft is located at the interface of the two perpendicularly arranged barrel domains. The active site is constructed by several loops located at the C-terminal side of both barrel domains. The functional unit of HtrA appears to be a trimer, which is stabilized exclusively by residues of the protease domains. The basic trimer has a funnel-like shape with the protease domains located at its top and the PDZ domains protruding to the outside. Once substrates have been bound, they have to be delivered into the interior of the funnel and the proteolytic sites. In contrast to other protease-chaperone systems, ATP does not drive binding and release of substrates.
The degQ and degS genes of E. coli encode proteins of 455 and 355 residues that are homologues of the DegP protease. Purified DegQ protein has the properties of a serine endopeptidase, and is processed by the removal of a 27-residue N-terminal signal sequence. Deletion studies suggest that DegQ, like DegP, functions as a periplasmic protease in vivo.
Xeroderma pigmentosum (XP) is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. The skin cells of people with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. At least seven genetic complementation groups are involved in this pathway: XP-A to XP-G. XP-G is one of the rarest and most phenotypically heterogeneous XP groups, showing anything from slight to extreme dysfunction in DNA excision repair. XP-G can be corrected by a 133 kDa nuclear protein, XPGC. XPGC is an acidic protein that confers normal UV resistance in expressing cells. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms. XPGC cleaves one strand of the duplex at the border with the single-stranded region.
XPG belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker's yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5'-3' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1, and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C-terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]). It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved.
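The conserved pentapeptide E-A-[DE]-A-[QS] is written in a PROSITE-like notation, which translates directly into a regular expression. The sketch below is a minimal illustration in Python; the converter handles only plain letters, bracketed sets and an `x` wildcard, and both the helper names and the test sequence are hypothetical.

```python
import re


def prosite_to_regex(pattern):
    """Convert a simple dash-separated pattern to a regex string.

    Supports single residues, [..] alternatives and 'x' (any residue) only;
    repeats, exclusions and anchors are deliberately not handled here.
    """
    parts = []
    for element in pattern.split("-"):
        parts.append("." if element.lower() == "x" else element)
    return "".join(parts)


# Pentapeptide quoted in the text; becomes the regex EA[DE]A[QS].
PENTAPEPTIDE = re.compile(prosite_to_regex("E-A-[DE]-A-[QS]"))


def pentapeptide_hits(seq):
    """Return (0-based start, matched text) for non-overlapping hits."""
    return [(m.start(), m.group()) for m in PENTAPEPTIDE.finditer(seq)]


if __name__ == "__main__":
    toy = "GGEAEAQLLEADASKK"  # hypothetical sequence, not a real protein
    print(prosite_to_regex("E-A-[DE]-A-[QS]"), pentapeptide_hits(toy))
```

A five-residue pattern will match many unrelated proteins by chance, so a hit is meaningful only in the context of the surrounding conserved I-region.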
This entry represents XP group B (XP-B) proteins, mutations in which give rise to both XP and Cockayne syndrome. The DNA/RNA helicase domain is also present in this group of proteins.
Xeroderma pigmentosum (XP) is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. The skin cells of people with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-G is one of the rarest and most phenotypically heterogeneous XP groups, showing anything from slight to extreme dysfunction in DNA excision repair. XP-G can be corrected by a 133 kDa nuclear protein, XPGC. XPGC is an acidic protein that confers normal UV resistance in expressing cells. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms. XPGC cleaves one strand of the duplex at the border with the single-stranded region.
XPG belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker's yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5'-3' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1, and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C-terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]). It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved.
DNA carries the biological information that instructs cells how to exist in an ordered fashion. Accurate replication is thus one of the most important events in the cell life cycle. This function is mediated by DNA-directed DNA polymerases, which add nucleotide triphosphate (dNTP) residues to the 3'-end of the growing DNA chain, using a complementary DNA as template. Small RNA molecules are generally used as primers for chain elongation, although terminal proteins may also be used. DNA-dependent DNA polymerases have been grouped into families, denoted A, B and X, on the basis of sequence similarities. Members of family A, which includes bacterial and bacteriophage polymerases, share significant similarity to Escherichia coli polymerase I; hence family A is also known as the pol I family. The bacterial polymerases also contain an exonuclease activity, which is coded for in the N-terminal portion. Three motifs, A, B and C, are seen to be conserved across all DNA polymerases, with motifs A and C also seen in RNA polymerases. They are centred on invariant residues, and their structural significance was implied from the Klenow (E. coli) structure. Motif A contains a strictly-conserved aspartate at the junction of a beta-strand and an alpha-helix; motif B contains an alpha-helix with positive charges; and motif C has a doublet of negative charges, located in a beta-turn-beta secondary structure.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
The genomic structure and sequence of the human ribosomal protein L7a has been determined and shown to resemble other mammalian ribosomal protein genes. The sequence of a gene for ribosomal protein L4 of yeast has also been determined; its single open reading frame is highly similar to mammalian ribosomal protein L7a. Several other ribosomal proteins have been found to share sequence similarity with L7a, including Saccharomyces cerevisiae NHP2, Bacillus subtilis hypothetical protein ylxQ, Haloarcula marismortui Hs6, and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ1203.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
The genomic structure and sequence of the human ribosomal protein L7a has been determined. The gene contains 8 exons and 7 introns, encompassing 3179 bp. The human gene resembles other mammalian ribosomal protein genes in so far as it contains a short first exon, a short 5' untranslated leader and its transcriptional start sites at C residues embedded in a poly-pyrimidine tract.
The sequence of a gene for ribosomal protein L4 of Saccharomyces cerevisiae (Baker's yeast) has also been determined, which, unlike most of its other ribosomal protein genes, has no intron. The single open reading frame is highly similar to mammalian ribosomal protein L7a.
There appear to be two genes for L4, both of which are active. Yeast cells containing a disruption of the L4-1 gene form smaller colonies than either wild-type or disrupted L4-2 strains. Disruption of both L4 genes is lethal, probably resulting from an inability of the organism to produce functional ribosomes.
Several other ribosomal proteins have been found to share sequence similarity with L7a, including yeast NHP2, Bacillus subtilis hypothetical protein ylxQ, Haloarcula marismortui (Halobacterium marismortui) Hs6, and Methanocaldococcus jannaschii MJ1203.
This InterPro entry focuses on regions that characterise the ribosomal L7A proteins but distinguish them from the rest of the HMG-like family.
The biological implications of the observed similarities to S6 and L7a are unclear, as biochemical studies have indicated that NHP2 is not a ribosomal protein. Nevertheless, deletion experiments have indicated NHP2 to have an essential physiological function.
High mobility group (HMG or HMGB) proteins constitute a family of relatively low molecular weight non-histone components in chromatin. HMG1 and HMG2 are highly similar, and preferentially bind single-stranded DNA and unwind double-stranded DNA. Although they have no sequence specificity, they have a high affinity for bent or distorted DNA, and bend linear DNA. HMG1 and HMG2 contain two DNA-binding HMG-box domains (A and B) that show structural and functional differences, and have a long acidic C-terminal domain rich in aspartic and glutamic acid residues. The acidic tail modulates the affinity of the tandem HMG boxes in HMG1 and 2 for a variety of DNA targets. HMG1 and 2 appear to play important architectural roles in the assembly of nucleoprotein complexes in a variety of biological processes, for example V(D)J recombination, the initiation of transcription, and DNA repair.
The 3D structure of part of the sequence (57-136), termed box 2, has been determined using 3D NMR. The protein exhibits an unusual all-alpha fold, which forms a V-shaped arrow-head, with helices along two edges and one rather flat face. Such an architecture is not shown by any of the currently known DNA-binding motifs. The majority of conserved residues in the HMG box family are those involved in maintaining the 3D fold.
This entry contains Pob3, which is a subunit of the heterodimeric yeast FACT complex (Spt16p-Pob3p). The FACT complex facilitates RNA polymerase II transcription elongation through nucleosomes by destabilizing and then reassembling nucleosome structure.
The mechanism of REP-1-mediated membrane association of Rab5 is similar to that mediated by Rab GDP dissociation inhibitor (GDI). REP-1 and Rab GDI also share other functional properties, including the ability to inhibit the release of GDP and to remove Rab proteins from membranes.
The crystal structure of the bovine alpha-isoform of Rab GDI has been determined to a resolution of 1.81 Å. The protein is composed of two main structural units: a large complex multi-sheet domain I, and a smaller alpha-helical domain II.
The structural organisation of domain I is closely related to FAD-containing monooxygenases and oxidases. Conserved regions common to GDI and the choroideraemia gene product, which delivers Rab to catalytic subunits of Rab geranylgeranyltransferase II, are clustered on one face of the domain. The two most conserved regions form a compact structure at the apex of the molecule; site-directed mutagenesis has shown these regions to play a critical role in the binding of Rab proteins.
The crystal structure of the bovine alpha-isoform of Rab GDI has been determined to a resolution of 1.81 Å. The protein is composed of two main structural units: a large complex multi-sheet domain I, and a smaller alpha-helical domain II.
The structural organisation of domain I is closely related to FAD-containing monooxygenases and oxidases. Conserved regions common to GDI and the choroideraemia gene product, which delivers Rab to catalytic subunits of Rab geranylgeranyltransferase II, are clustered on one face of the domain. The two most conserved regions form a compact structure at the apex of the molecule; site-directed mutagenesis has shown these regions to play a critical role in the binding of Rab proteins.
A variety of substrate carrier proteins that are involved in energy transfer are found in the inner mitochondrial membrane. Such proteins include: ADP/ATP carrier protein (ADP/ATP translocase); 2-oxoglutarate/malate carrier protein; phosphate carrier protein; tricarboxylate transport protein (or citrate transport protein); Graves disease carrier protein; yeast mitochondrial proteins MRS3 and MRS4; yeast mitochondrial FAD carrier protein; and many others.
Sequence analysis of selected members of the carrier protein family has suggested the presence of six transmembrane (TM) domains, with varying degrees of sequence conservation and hydrophilicity. The TM regions, and adjacent hydrophilic loops, are more highly conserved than other regions of the proteins. All members of the family appear to consist of a tripartite structure, each of the repeated segments being about 100 residues in length. Each repeat contains two TM domains, the first being more hydrophobic, with conserved glycyl and prolyl residues. Five of the six TM domains are followed by the conserved sequence (D/E)-Hy(K/R) {where - denotes any residue, and Hy is a hydrophobic position}.
A variety of substrate carrier proteins that are involved in energy transfer are found in the inner mitochondrial membrane. Such proteins include: ADP/ATP carrier protein (ADP/ATP translocase); 2-oxoglutarate/malate carrier protein; phosphate carrier protein; tricarboxylate transport protein (or citrate transport protein); Graves disease carrier protein; yeast mitochondrial proteins MRS3 and MRS4; yeast mitochondrial FAD carrier protein; and many others.
Sequence analysis of selected members of the carrier protein family has suggested the presence of six transmembrane (TM) domains, with varying degrees of sequence conservation and hydrophilicity. The TM regions, and adjacent hydrophilic loops, are more highly conserved than other regions of the proteins. All members of the family appear to consist of a tripartite structure, each of the repeated segments being about 100 residues in length. Each repeat contains two TM domains, the first being more hydrophobic, with conserved glycyl and prolyl residues. Five of the six TM domains are followed by the conserved sequence (D/E)-Hy(K/R) {where - denotes any residue, and Hy is a hydrophobic position}.
Mitochondrial ADP/ATP translocase, an abundant component of the inner membrane, carries ATP from the matrix into the inter-membrane space and transports ADP back. The protein is an integral membrane protein that functions as a homodimer.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
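The difference between the loose HEXXH motif and the stricter 'abXHEbbHbc' definition can be illustrated with two pattern scans. This is only a sketch under stated assumptions: the residue sets chosen for "uncharged" and "hydrophobic" are illustrative choices rather than definitions from the text, position 'a' is left unconstrained because valine or threonine is only the most frequent residue there, and the function name and toy sequences are hypothetical.

```python
import re

# Assumed residue classes (illustrative, not defined in the source text).
UNCHARGED = "ACFGILMNQSTVWY"   # excludes D, E, K, R, H and P
HYDROPHOBIC = "ACFILMVWY"

# Lookaheads report overlapping hits as well as separate ones.
HEXXH = re.compile(r"(?=(HE..H))")
# a b X H E b b H b c, with 'a' and 'X' unconstrained.
STRICT = re.compile(
    rf"(?=(.[{UNCHARGED}].HE[{UNCHARGED}]{{2}}H[{UNCHARGED}][{HYDROPHOBIC}]))"
)


def scan_metalloprotease_motifs(sequence):
    """Return 0-based hits for the loose and the strict motif."""
    seq = sequence.upper()
    return {
        "HEXXH": [(m.start(), m.group(1)) for m in HEXXH.finditer(seq)],
        "strict": [(m.start(), m.group(1)) for m in STRICT.finditer(seq)],
    }


# Hypothetical toy sequences: the first satisfies both definitions,
# the second has HEXXH but charged residues at the 'b' positions.
print(scan_metalloprotease_motifs("AAVLGHEVTHGFKK"))
print(scan_metalloprotease_motifs("AAAAHEDKHAA"))
```

With these assumed classes, the second toy sequence is reported only under "HEXXH", which mirrors the point that the loose motif is relatively common while the stricter form is more selective.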
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of metallopeptidases belongs to the MEROPS peptidase family M18 (clan MH). The proteins have two catalytic zinc ions at the active site, bound by His/Asp, Asp, Glu, Asp/Glu and His. The catalysed reaction involves the release of an N-terminal amino acid, usually neutral or hydrophobic, from a polypeptide.
The type example is aminopeptidase I from Saccharomyces cerevisiae (Baker's yeast), the sequence of which has been deduced, and the mature protein shown to consist of 469 amino acids. A 45-residue presequence contains both positively- and negatively-charged and hydrophobic residues, which could be arranged in an N-terminal amphiphilic alpha-helix. The presequence differs from signal sequences that direct proteins across bacterial plasma membranes and endoplasmic reticulum or into mitochondria. It is unclear how this unique presequence targets aminopeptidase I to yeast vacuoles, and how this sorting utilises classical protein secretory pathways.
A conserved 30-residue domain has been found in a number of heavy metal transport or detoxification proteins. The domain, which has been termed Heavy-Metal-Associated (HMA), contains two conserved cysteines that are probably involved in metal binding. The HMA domain has been identified in the N-terminal regions of a variety of cation-transporting ATPases (E1-E2 ATPases). In addition, the domain has been found in bacterial mercuric reductase; the copP copper-binding protein of Helicobacter pylori; and in the N-terminal regions of mercuric transport protein periplasmic component (gene merP), which is encoded on plasmids carried by mercury-resistant Gram-negative bacteria, where it seems to be a mercury scavenger that specifically binds one Hg(2+) ion and passes it to mercuric reductase via the merT protein.
The structure of the mercuric ion-binding protein MerP from Shigella flexneri has been determined. The fold has been classed as a ferredoxin-like alpha-beta sandwich, having a beta-alpha-beta-beta-alpha-beta architecture, with the two alpha-helices overlaying a four-stranded anti-parallel beta-sheet. Structural differences between the reduced and mercury-bound forms of merP are localised to the metal-binding loop containing the consensus sequence GMTCXXC, the two cysteines of which are involved in bi-coordination of Hg(2+).
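As a small illustration, the GMTCXXC consensus can be scanned for and the positions of its two metal-coordinating cysteines reported. This is a hedged sketch: X is treated as any residue, the function name is hypothetical, and the toy sequence is invented rather than taken from a real MerP entry.

```python
import re

# GMTCXXC with X as any residue; the lookahead keeps every start position.
HMA_LOOP = re.compile(r"(?=(GMTC..C))")


def metal_binding_cysteines(sequence):
    """Return 0-based positions of the two motif cysteines for each hit."""
    seq = sequence.upper()
    return [(m.start() + 3, m.start() + 6) for m in HMA_LOOP.finditer(seq)]


# Hypothetical toy sequence, used only to exercise the pattern.
print(metal_binding_cysteines("AAGMTCSSCKK"))  # [(5, 8)]
```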
Mercuric reductase, which contains a single copy of the HMA domain, is involved in a specialised system that confers resistance to Hg(2+) by catalysing the reaction:
Hg + NADP+ + H+ = Hg2+ + NADPH
The protein is an FAD-containing flavoprotein that functions as a homodimer; its active site is a redox-active disulphide bond.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. The small ribosomal subunit protein S12 contains 130-150 amino acid residues, and is thought to be involved in the translation initiation step. This family consists of eukaryotic S12 ribosomal proteins, including those from vertebrates, Trypanosoma brucei, Caenorhabditis elegans, Drosophila and Saccharomyces cerevisiae (Baker's yeast).
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
The ribosomal proteins catalyse ribosome assembly and stabilise the rRNA, tuning the structure of the ribosome for optimal function. Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S17 is known to bind specifically to the 5' end of 16S ribosomal RNA in Escherichia coli (primary rRNA binding protein), and is thought to be involved in the recognition of termination codons. Experimental evidence has revealed that S17 has virtually no groups exposed on the ribosomal surface.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
The small subunit ribosomal proteins can be categorised as: primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S19 contains 88-144 amino acid residues. In Escherichia coli, S19 is known to form a complex with S13 that binds strongly to 16S ribosomal RNA. Experimental evidence has revealed that S19 is moderately exposed on the ribosomal surface, and is designated a secondary rRNA binding protein. S19 belongs to a family of ribosomal proteins that includes: eubacterial S19; algal and plant chloroplast S19; cyanelle S19; archaebacterial S19; plant mitochondrial S19; and eukaryotic S15 ('rig' protein).
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.
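The class assignment listed above can be captured as a simple lookup. The sketch below assumes three-letter amino acid codes and uses a hypothetical function name; it encodes only the general two-class split given in the text and ignores the a, b and c subclasses and any organism-specific exceptions.

```python
# Class membership exactly as listed in the paragraph above.
CLASS_I = {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
CLASS_II = {"Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"}


def synthetase_class(amino_acid):
    """Return "I" or "II" for a three-letter amino acid code."""
    code = amino_acid.strip().capitalize()
    if code in CLASS_I:
        return "I"
    if code in CLASS_II:
        return "II"
    raise ValueError(f"unknown amino acid code: {amino_acid!r}")


print(synthetase_class("Ala"))  # II
print(synthetase_class("trp"))  # I
```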
Alanyl-tRNA synthetase is an alpha4 tetramer that belongs to class IIc.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.
Seryl-tRNA synthetase exists as a monomer and belongs to class IIa.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossman fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases.
Lysyl-tRNA synthetase is an alpha2 homodimer, and different forms belong to class I and class II. In eubacteria and eukaryota, lysyl-tRNA synthetases belong to class II, in the same family as aspartyl-tRNA synthetase. The class Ic lysyl-tRNA synthetase family is present in archaea and some eubacteria. Moreover, in some eubacteria there is a gene X, which is similar to a part of the class II lysyl-tRNA synthetase. Lysyl-tRNA synthetase is duplicated in some species; in Escherichia coli, for example, there is a constitutive gene (lysS) and an induced one (lysU). Lysine is activated by being attached to the alpha-phosphate of AMP before being transferred to the cognate tRNA. The refined crystal structures give "snapshots" of the active site corresponding to key steps in the aminoacylation reaction and provide the structural framework for understanding the mechanism of lysine activation. The active site of LysU is shaped to position the substrates for the nucleophilic attack of the lysine carboxylate on the ATP alpha-phosphate. No residues are directly involved in catalysis, but a number of highly conserved amino acids and three metal ions coordinate the substrates and stabilise the pentavalent transition state. A loop close to the catalytic pocket, disordered in the lysine-bound structure, becomes ordered upon adenine binding.
Cysteinyl-tRNA synthetase is an alpha monomer and belongs to class Ia.
Isoleucyl-tRNA synthetase is an alpha monomer that belongs to class Ia. The enzyme activates not only the cognate substrate L-isoleucine but also the minimally distinct L-valine in the first, aminoacylation step. Then, in a second, "editing" step, the synthetase itself rapidly hydrolyses only the valylated products, as shown by the crystal structures.
Leucyl-tRNA synthetase is an alpha monomer that belongs to class Ia. There are two different families of leucyl-tRNA synthetases. This family includes the eubacterial and mitochondrial synthetases. The crystal structure of leucyl-tRNA synthetase from the hyperthermophile Thermus thermophilus has an overall architecture that is similar to that of isoleucyl-tRNA synthetase, except that the putative editing domain is inserted at a different position in the primary structure. This feature is unique to prokaryote-like leucyl-tRNA synthetases, as is the presence of a novel additional flexibly inserted domain.
Valyl-tRNA synthetase is an alpha monomer that belongs to class Ia.
Glutamyl-tRNA synthetase is a class Ic synthetase and shows several similarities with glutaminyl-tRNA synthetase in structure and catalytic properties. It is an alpha2 dimer. To date, one crystal structure of a glutamyl-tRNA synthetase (from Thermus thermophilus) has been solved. The molecule has the form of a bent cylinder and consists of four domains. The N-terminal half (domains 1 and 2) contains the 'Rossmann fold' typical of class I synthetases and resembles the corresponding part of Escherichia coli GlnRS, whereas the C-terminal half exhibits a GluRS-specific structure.
sn-glycerol-3-phosphate + acceptor = glycerone phosphate + reduced acceptor
Insulin exposure often stimulates G3PDH activity, and thus is key to reducing the effects of diabetes. In obese people, where insulin resistance has been demonstrated, the amount of G3PDH has been shown to be correspondingly lower than that in people of normal weight. In bacteria it is associated with the utilization of glycerol coupled to respiration. In Escherichia coli and Haemophilus influenzae, two isozymes are known: one expressed under anaerobic conditions (gene glpA) and one under aerobic conditions (gene glpD). In eukaryotes, a mitochondrial form of GPD participates in the glycerol phosphate shuttle in conjunction with an NAD-dependent cytoplasmic GPD. This mechanism is responsible for the preservation of a redox balance. In this environment, the enzyme has been recorded to increase activity in the presence of calcium. These enzymes are proteins of about 60 to 70 kDa which contain a probable FAD-binding domain at their N-terminal extremity. The mammalian enzyme differs from the bacterial or yeast proteins by having an EF-hand calcium-binding region at its C-terminal extremity.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein S12 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S12 is known to be involved in the translation initiation step. It is a very basic protein of 120 to 150 amino-acid residues. S12 belongs to a family of ribosomal proteins which are grouped on the basis of sequence similarities. This family consists of ribosomal protein S12 from bacteria, mitochondria, and chloroplasts.
The antibiotic tetracycline has a broad spectrum of activity, acting to inhibit bacterial protein synthesis by binding to the 30S ribosomal subunit, which prevents the association of the aminoacyl-tRNA with the ribosomal acceptor (A) site. Tetracycline binding is reversible, so diluting out the antibiotic can reverse its effects. Tetracycline resistance genes are often located on mobile elements, such as plasmids, transposons and/or conjugative transposons, which can sometimes be transferred between bacterial species. In certain cases, tetracycline can enhance the transfer of these elements, thereby promoting resistance within a bacterial colony. There are three types of tetracycline resistance: tetracycline efflux, ribosomal protection, and tetracycline modification.
The expression of several of these tet genes is controlled by a family of tetracycline transcriptional regulators known as TetR. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in over 115 genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds the Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein is either modulated by another regulator or itself triggers the cellular response.
This entry represents the tetracycline resistance protein Tet(A), a tetracycline efflux protein that functions as a metal-tetracycline/H+ antiporter. This is an energy-dependent process that decreases the accumulation of the antibiotic in whole cells. Tet(A) is encoded by the transposon Tn10, and is an integral membrane protein with twelve potential transmembrane domains. Site-directed mutagenesis studies have shown that a negative charge at position 66 is essential for tetracycline transport, and that the region that includes the dipeptide plays an important role in metal-tetracycline transport; it perhaps acts as a gate that opens on the charge-charge interaction between Asp66 and the metal-tetracycline.
This entry represents the core region of arginyl-tRNA synthetase. The enzyme has been crystallized, and a preliminary X-ray crystallographic analysis of yeast arginyl-tRNA synthetase-yeast tRNA(Arg) complexes is available.
Tryptophanyl-tRNA synthetase is an alpha2 dimer that belongs to class Ib. The crystal structure of tryptophanyl-tRNA synthetase is known.
Tyrosyl-tRNA synthetase is an alpha2 dimer that belongs to class Ib. Studies on tyrosyl-tRNA synthetase provide the first kinetic evidence that the 'KMSKS' motif plays a role in the initial binding of tRNA(Tyr) to tyrosyl-tRNA synthetase.
Methionyl-tRNA synthetase is an alpha2 dimer that belongs to class Ia. In some species (archaea, eubacteria and eukaryotes) a coding sequence similar to the C-terminal end of MetRS is present as an independent gene, whose product is a tRNA-binding domain that functions as a dimer. In eubacteria, MetRS can also be split into two sub-classes corresponding to the presence of one or two CXXC domains specific to zinc binding. The crystal structures of a number of methionyl-tRNA synthetases are known.
Aspartyl-tRNA synthetase is an alpha2 dimer that belongs to class IIb. Structural analysis combined with mutagenesis and enzymology data on the yeast enzyme points to a tRNA binding process that starts with a recognition event between the tRNA anticodon loop and the synthetase anticodon-binding module.
Prolyl-tRNA synthetase exists in two forms, which are loosely related. The first form is present in the majority of eubacterial species. The second form is found mainly in archaea and eukaryota, and also in some eubacteria. Prolyl-tRNA synthetase belongs to class IIa. The enzyme from Escherichia coli contains all three of the conserved consensus motifs characteristic of class II aminoacyl-tRNA synthetases. The complex between Thermus thermophilus prolyl-tRNA synthetase (ProRSTT) and its cognate tRNA has been crystallized using two different isoacceptors of tRNA(Pro).
Threonyl-tRNA synthetase exists as a monomer and belongs to class IIa. The enzyme from Escherichia coli represses the translation of its own mRNA. The crystal structure of the complex between tRNA(Thr) and ThrRS shows structural features that reveal novel strategies for providing specificity in tRNA selection. These include an amino-terminal domain containing a novel protein fold that makes minor groove contacts with the tRNA acceptor stem. The enzyme induces a large deformation of the anticodon loop, resulting in an interaction between two adjacent anticodon bases, which accounts for their prominent role in tRNA identity and translational regulation. A zinc ion found in the active site is implicated in amino acid recognition and discrimination. The zinc ion may act to ensure that only amino acids that possess a hydroxyl group attached to the beta-position are activated.
Pyruvate kinase (PK) catalyses the final step in glycolysis, the conversion of phosphoenolpyruvate to pyruvate with concomitant phosphorylation of ADP to ATP:
ADP + phosphoenolpyruvate = ATP + pyruvate
The enzyme, which is found in all living organisms, requires both magnesium and potassium ions for its activity. In vertebrates, there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart and brain), and M2 (early foetal tissue). In plants, PK exists as cytoplasmic and plastid isozymes, while most bacteria and lower eukaryotes have one form, although certain bacteria, such as Escherichia coli, have two isozymes. All isozymes appear to be tetramers of identical subunits of ~500 residues.
PK helps control the rate of glycolysis, along with phosphofructokinase and hexokinase. PK possesses allosteric sites for numerous effectors, yet the isozymes respond differently, in keeping with their different tissue distributions. The activity of L-type (liver) PK is increased by fructose-1,6-bisphosphate (F1,6BP) and lowered by ATP and alanine (a gluconeogenic precursor). Therefore, when glucose levels are high, glycolysis is promoted, and when levels are low, gluconeogenesis is promoted. L-type PK is also hormonally regulated, being activated by insulin and inhibited by glucagon, which leads to covalent modification of the PK enzyme. M1-type (muscle, brain) PK is inhibited by ATP, but F1,6BP and alanine have no effect, which correlates with the function of muscle and brain, as opposed to the liver.
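The isozyme-specific regulation described above can be summarised as a lookup table. This is a minimal sketch that records only the effects stated in the text; the names `PK_EFFECTORS` and `pk_effect` are hypothetical, and any isozyme/effector pair the text does not mention is reported as "not stated" rather than guessed.

```python
# Effects of regulators on pyruvate kinase isozymes, as stated in the text.
# Assumption: the keys ("L", "M1") and effector spellings are illustrative labels.
PK_EFFECTORS = {
    "L": {
        "fructose-1,6-bisphosphate": "activates",
        "ATP": "inhibits",
        "alanine": "inhibits",
        "insulin": "activates",
        "glucagon": "inhibits",
    },
    "M1": {
        "ATP": "inhibits",
        "fructose-1,6-bisphosphate": "no effect",
        "alanine": "no effect",
    },
}


def pk_effect(isozyme: str, effector: str) -> str:
    """Return the stated effect of an effector on an isozyme, or "not stated"."""
    return PK_EFFECTORS.get(isozyme, {}).get(effector, "not stated")
```

For instance, `pk_effect("M1", "alanine")` returns `"no effect"`, whereas `pk_effect("L", "alanine")` returns `"inhibits"`, reflecting the liver-specific response.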
The structures of several pyruvate kinases from various organisms have been determined. The protein comprises three to four domains: a small N-terminal helical domain (absent in bacterial PK), a beta/alpha-barrel domain, a beta-barrel domain (inserted within the beta/alpha-barrel domain), and a 3-layer alpha/beta/alpha sandwich domain.
This entry represents the two barrel domains, the beta/alpha-barrel, and the beta-barrel inserted within it.
2-methyl-4-amino-5-hydroxymethylpyrimidine diphosphate + 4-4-methyl-5-(2-phosphonooxyethyl)-thiazole = pyrophosphate + thiamin monophosphate
Hydroxyethylthiazole kinase expression is regulated at the mRNA level by intracellular thiamin pyrophosphate.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils.
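The classification in the two paragraphs above can be written down as a small table. The sketch below encodes only what the text states; the names `TOPOISOMERASES` and `describe_topoisomerase` are hypothetical, and ATP dependence is left as `None` for type I enzymes because the text does not comment on it.

```python
# Topoisomerase classification as given in the text.
# Each entry: roman numeral -> (type, subtype or None).
TOPOISOMERASES = {
    "I": ("I", None),
    "III": ("I", None),
    "V": ("I", None),
    "II": ("II", "IIA"),   # called gyrase in bacteria
    "IV": ("II", "IIA"),
    "VI": ("II", "IIB"),
}


def describe_topoisomerase(numeral: str) -> dict:
    """Summarise the class, subtype and strand-break mode of a topoisomerase."""
    topo_type, subtype = TOPOISOMERASES[numeral.strip().upper()]
    return {
        "type": topo_type,
        "subtype": subtype,
        # Type I enzymes break a single strand, type II enzymes break both.
        "break": "single-strand" if topo_type == "I" else "double-strand",
        # The text states ATP dependence only for type II enzymes.
        "atp_dependent": True if topo_type == "II" else None,
    }
```

As a usage example, `describe_topoisomerase("VI")` reports a type II, subtype IIB, ATP-dependent enzyme that makes double-strand breaks.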
This entry represents DNA topoisomerase II enzymes from eukaryotes and viruses. Topoisomerase II primarily functions in introducing negative supercoils into DNA, and is of particular importance during the segregation of chromosomes during mitosis. In eukaryotes and viruses, this enzyme occurs as a single polypeptide, with the N-terminal portion (homologous to subunit B of bacterial topoisomerase II, or gyraseB) responsible for ATPase activity and the C-terminal portion (homologous to subunit A of bacterial topoisomerase II, or gyraseA) responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges. In mammals, there are at least two isozymes of this enzyme, topoisomerases II-alpha and II-beta, which are similar in structure and catalytic properties. The alpha isoform is involved in chromosome condensation and segregation.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions (differs between eukaryotic and bacterial enzymes), domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA.
Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.
This entry represents subunit B found in topoisomerase II (gyrB) and topoisomerase IV (parE), primarily of bacterial origin, and which functions in ATP hydrolysis and subunit interaction. It does not include the topoisomerase II enzymes composed of a single polypeptide, as are found in most eukaryotes.
Microtubules are polymers of tubulin, a dimer of two 55-kDa subunits, designated alpha and beta. Within the microtubule lattice, alpha-beta heterodimers associate in a head-to-tail fashion, giving rise to microtubule polarity. Fluorescent labelling studies have suggested that tubulin is oriented in microtubules with beta-tubulin toward the plus end.
For maximal rate and extent of polymerisation into microtubules, tubulin requires GTP. Two molecules of GTP are bound at different sites, termed N and E. At the E (Exchangeable) site, GTP is hydrolysed during incorporation into the microtubule. Close to the E site is an invariant region rich in glycine residues, which is found in both chains and is thought to control access of the nucleotide to its binding site.
Most species, excepting simple eukaryotes, express a variety of closely related alpha- and beta-isotypes. A third family member, gamma tubulin, has also been identified in a number of species. Gamma tubulin is found at microtubule-organising centres, such as the spindle poles or the centrosome, suggesting that it is involved in minus-end nucleation of microtubule assembly.
Most species, excepting simple eukaryotes, express a variety of closely related alpha- and beta-isotypes. A third family member, gamma tubulin, has also been identified in a number of species. Gamma-tubulins constitute a ubiquitous and highly conserved subfamily of the tubulin family. The protein is found at microtubule-organising centres, such as the spindle poles or the centrosome. It remains associated with the centrosome when microtubules are depolymerised, suggesting that it is an integral component that might play a role in minus-end nucleation of microtubule assembly.
Cytochrome c oxidase is an oligomeric enzymatic complex which is a component of the respiratory chain and is involved in the transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane. The number of polypeptides in the complex ranges from 3-4 (prokaryotes) up to 13 (mammals).
Subunit 2 (CO II) transfers the electrons from cytochrome c to the catalytic subunit 1. It contains two adjacent transmembrane regions in its N-terminus and the major part of the protein is exposed to the periplasmic or to the mitochondrial intermembrane space, respectively. CO II provides the substrate-binding site and contains a copper centre called Cu(A), probably the primary acceptor in cytochrome c oxidase. An exception is the corresponding subunit of the cbb3-type oxidase which lacks the copper A redox-centre. Several bacterial CO II have a C-terminal extension that contains a covalently bound haem c.
The ornithine decarboxylases catalyse the transformation of ornithine into putrescine. Phylogenetic analysis of the mRNAs from several mammalian species suggests that ODC is encoded by orthologous genes in the different species. Analysis of divergence patterns in a number of subregions showed that the domains have evolved in a noncoordinate fashion. Evolution of each subregion has been episodic, with periods of both rapid and slow divergence, possibly indicating the existence of selection pressures that were exerted in a time- and domain-specific manner during mammalian speciation. The active form of mammalian ODC is a homodimer of 53 kDa subunits (the monomer retains no enzymatic activity). In vitro hybridisation and cross-linkage analysis have suggested that the active site of ODC is formed at the interface of the two monomers via the interaction of the cysteine-360-containing region of one subunit with the lysine-69-containing region of the other.
Ribonucleotide reductase catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs divide into three classes on the basis of their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, bacteriophage and viruses, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in anaerobic bacteria and bacteriophage, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes.
Ribonucleotide reductase is an oligomeric enzyme composed of a large subunit (700 to 1000 residues) and a small subunit (300 to 400 residues) - class II RNRs are less complex, using the small molecule B12 in place of the small chain.
The reduction of ribonucleotides to deoxyribonucleotides involves the transfer of free radicals; the function of each metallocofactor is to generate an active site thiyl radical. This thiyl radical then initiates the nucleotide reduction process by hydrogen atom abstraction from the ribonucleotide. The radical-based reaction involves five cysteines: two of these are located at adjacent anti-parallel strands in a new type of ten-stranded alpha/beta-barrel; two others reside at the carboxyl end in a flexible arm; and the fifth, in a loop in the centre of the barrel, is positioned to initiate the radical reaction. There are several regions of similarity in the sequence of the large chain of prokaryotes, eukaryotes and viruses spread across 3 domains: an N-terminal domain common to the mammalian and bacterial enzymes; a C-terminal domain common to the mammalian and viral ribonucleotide reductases; and a central domain common to all three.
Nucleoside diphosphate kinases (NDK) are enzymes required for the synthesis of nucleoside triphosphates (NTP) other than ATP. They provide NTPs for nucleic acid synthesis, CTP for lipid synthesis, UTP for polysaccharide synthesis and GTP for protein elongation, signal transduction and microtubule polymerization.
In eukaryotes, there seems to be a small family of NDK isozymes each of which acts in a different subcellular compartment and/or has a distinct biological function. Eukaryotic NDK isozymes are hexamers of two highly related chains (A and B). By random association (A6, A5B...AB5, B6), these two kinds of chain form isoenzymes differing in their isoelectric point.
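The random association of A and B chains into hexamers can be made concrete with a short calculation. The sketch below enumerates the seven possible compositions (A6, A5B ... AB5, B6) and, as an explicit assumption not taken from the text, models each of the six positions as an independent draw with probability equal to the fraction of A chains in the pool, which gives a binomial distribution. The function names are hypothetical.

```python
from math import comb


def composition_label(n_a, n_b):
    """Format a hexamer composition such as A5B or A3B3."""
    parts = []
    for chain, count in (("A", n_a), ("B", n_b)):
        if count == 1:
            parts.append(chain)
        elif count > 1:
            parts.append(f"{chain}{count}")
    return "".join(parts)


def hexamer_distribution(fraction_a, size=6):
    """Expected share of each isoenzyme, assuming independent random
    association of chains (binomial model; an assumption of this sketch)."""
    if not 0.0 <= fraction_a <= 1.0:
        raise ValueError("fraction_a must lie between 0 and 1")
    distribution = {}
    for n_b in range(size + 1):
        n_a = size - n_b
        weight = comb(size, n_a) * fraction_a**n_a * (1.0 - fraction_a)**n_b
        distribution[composition_label(n_a, n_b)] = weight
    return distribution
```

With equal amounts of both chains, `hexamer_distribution(0.5)` assigns the largest share to the mixed A3B3 species and the smallest to the pure A6 and B6 hexamers.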
NDK are proteins of 17 kDa that act via a ping-pong mechanism in which a histidine residue is phosphorylated by transfer of the terminal phosphate group from ATP. In the presence of magnesium, the phosphoenzyme can transfer its phosphate group to any NDP, to produce an NTP.
NDK isozymes have been sequenced from prokaryotic and eukaryotic sources. It has also been shown that the Drosophila awd (abnormal wing discs) protein is a microtubule-associated NDK. Mammalian NDK is also known as metastasis inhibition factor nm23. The sequence of NDK has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. Our signature pattern contains this residue.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic and archaebacterial ribosomal proteins belong to the L34e family. These include vertebrate L34, mosquito L31, plant L34, yeast putative ribosomal protein YIL052c and archaebacterial L34e.
The enzyme diadenosine 5',5'''-P1,P4-tetraphosphate pyrophosphohydrolase asymmetrically hydrolyses AP4A to yield AMP and ATP. The catalysed reaction is as follows:
P(1),P(4)-bis(5'-adenosyl)tetraphosphate + H(2)O = ATP + AMP.
The cDNA and derived amino acid sequence of human diadenosine 5',5'''-P1,P4-tetraphosphate pyrophosphohydrolase have been determined by means of EST analysis. The protein possesses a modification of the MutT domain found in certain nucleotide pyrophosphatases.
Microtubules are polymers of tubulin, a dimer of two 55-kDa subunits, designated alpha and beta. Within the microtubule lattice, alpha-beta heterodimers associate in a head-to-tail fashion, giving rise to microtubule polarity. Fluorescent labelling studies have suggested that tubulin is oriented in microtubules with beta-tubulin toward the plus end.
For maximal rate and extent of polymerisation into microtubules, tubulin requires GTP. Two molecules of GTP are bound at different sites, termed N and E. At the E (Exchangeable) site, GTP is hydrolysed during incorporation into the microtubule. Close to the E site is an invariant region rich in glycine residues, which is found in both chains and is thought to control access of the nucleotide to its binding site.
Most species, excepting simple eukaryotes, express a variety of closely related alpha- and beta-isotypes. A third family member, gamma tubulin, has also been identified in a number of species. Gamma tubulin is found at microtubule-organising centres, such as the spindle poles or the centrosome, suggesting that it is involved in minus-end nucleation of microtubule assembly. More recently, epsilon-tubulin has been identified in humans and Trypanosomes; in humans, it has been localised to centrosomes.
Members of this family have been called SAND proteins although these proteins do not contain a SAND domain. In Saccharomyces cerevisiae a protein complex of Mon1 and Ccz1 functions with the small GTPase Ypt7 to mediate vesicle trafficking to the vacuole. The Mon1/Ccz1 complex is conserved in eukaryotic evolution and members of this family (previously known as DUF254) are distant homologues to domains of known structure that assemble into cargo vesicle adapter (AP) complexes.
This entry represents Spo11, a meiotic recombination protein found in eukaryotes, and subunit A of topoisomerase VI, a type IIB topoisomerase found predominantly in archaea. These two types of proteins share structural homology.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. They can be divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA. Topoisomerase VI is a type IIB enzyme that assembles as a heterotetramer, consisting of two A subunits required for DNA cleavage and two B subunits required for ATP hydrolysis. The B subunit is structurally similar to the ATPase domain of type IIA topoisomerases, but the A subunit is distinct, and instead shares homology with the Spo11 protein.
Spo11 is a meiosis-specific protein that is responsible for the initiation of recombination through the formation of DNA double-strand breaks by a type II DNA topoisomerase-like activity. Spo11 acts in conjunction with several other proteins, including Rec102 in yeast, to bring about meiotic recombination.
Peptide deformylase (PDF) is an essential metalloenzyme required for the removal of the formyl group at the N-terminus of nascent polypeptide chains in eubacteria. The enzyme acts as a monomer and binds a single zinc ion, catalysing the reaction:
N-formyl-L-methionine + H2O = formate + methionyl peptide
Catalytic efficiency strongly depends on the identity of the bound metal.
The structure of these enzymes is known. PDF, a member of the zinc metalloproteases family, comprises an active core domain of 147 residues and a C-terminal tail of 21 residues. The 3D fold of the catalytic core has been determined by X-ray crystallography and NMR. Overall, the structure contains a series of anti-parallel beta-strands that surround two perpendicular alpha-helices. The C-terminal helix contains the characteristic HEXXH motif of metalloenzymes, which is crucial for activity. The helical arrangement, and the way the histidine residues bind the zinc ion, is reminiscent of other metalloproteases, such as thermolysin or metzincins. However, the arrangement of secondary and tertiary structures of PDF, and the positioning of its third zinc ligand (a cysteine residue), are quite different. These discrepancies, together with notable biochemical differences, suggest that PDF constitutes a new class of zinc-metalloproteases.
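The HEXXH motif mentioned above (His-Glu, any two residues, His) is simple enough to locate with a regular expression. The following is a minimal sketch rather than a validated signature: the function name is hypothetical, the example sequence is an invented toy string and not a real peptide deformylase, and a match says nothing about whether the surrounding fold is actually a metalloprotease.

```python
import re

# H, E, any two residues, H. The lookahead lets overlapping matches be found.
HEXXH = re.compile(r"(?=(HE[A-Z]{2}H))")


def find_hexxh(sequence):
    """Return (1-based position, matched residues) for each HEXXH occurrence
    in a one-letter amino acid sequence."""
    seq = sequence.upper()
    return [(m.start(1) + 1, m.group(1)) for m in HEXXH.finditer(seq)]


# Toy sequence invented for illustration only.
hits = find_hexxh("MKTAHEIDHLNG")
```

Here `hits` holds a single match, the pentapeptide HEIDH starting at residue 5.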
Mammalian translationally controlled tumour protein (TCTP) (or P23) is a protein which has been found to be preferentially synthesised in cells during the early growth phase of some types of tumour, but which is also expressed in normal cells. The physiological function of TCTP is still not known. It was first identified as a histamine-releasing factor, acting in IgE-dependent allergic reactions. In addition, TCTP has been shown to bind to tubulin in the cytoskeleton, has a high affinity for calcium, is the binding target for the antimalarial compound artemisinin, and is induced in vitamin D-dependent apoptosis. TCTP production is thought to be controlled at the translational as well as the transcriptional level.
TCTP is a hydrophilic protein of 18 to 20 kDa. TCTPs do not share significant sequence similarity with any other class of proteins. Recently, the structure of TCTP was determined and exhibited significant structural similarity to the human protein Mss4, which is a guanine nucleotide-free chaperone of the Rab protein. Close homologues have been found in plants, earthworm, Caenorhabditis elegans (F52H2.11), Hydra, Saccharomyces cerevisiae (YKL056c) and Schizosaccharomyces pombe (SpAC1F12.02c).
MCM proteins are DNA-dependent ATPases required for the initiation of eukaryotic DNA replication. In eukaryotes there is a family of six proteins, MCM2 to MCM7. They were first identified in yeast where most of them have a direct role in the initiation of chromosomal DNA replication by interacting directly with autonomously replicating sequences (ARS). They were thus called minichromosome maintenance proteins, MCM proteins.
This family is also present in the archaebacteria in 1 to 4 copies. Methanocaldococcus jannaschii (Methanococcus jannaschii) has four members, MJ0363, MJ0961, MJ1489 and MJECL13.
The "MCM motif" contains Walker-A and Walker-B type nucleotide binding motifs. The diagnostic sequence defining the MCMs is IDEFDKM. Only Mcm2 (aka Cdc19 or Nda1) has been subjected to mutational analysis in this region, and most mutations abolish its activity. The presence of a putative ATP-binding domain implies that these proteins may be involved in an ATP-consuming step in the initiation of DNA replication in eukaryotes.
The MCM proteins bind together in a large complex. Within this complex, individual subunits associate with different affinities, and there is a tightly associated core of Mcm4 (Cdc21), Mcm6 (Mis5) and Mcm7. This core complex in human MCMs has been associated with helicase activity in vitro, leading to the suggestion that the MCM proteins are the eukaryotic replicative helicase.
Schizosaccharomyces pombe (Fission yeast) MCMs, like those in metazoans, are found in the nucleus throughout the cell cycle. This is in contrast to the Saccharomyces cerevisiae (Baker's yeast) in which MCM proteins move in and out of the nucleus during each cell cycle. The assembly of the MCM complex in S. pombe is required for MCM localisation, ensuring that only intact MCM complexes remain in the nucleus.
The MCM2-7 complex consists of six closely related proteins that are highly conserved throughout the eukaryotic kingdom. During late mitosis and G1, replication origins are 'licensed' for replication by loading the minichromosome maintenance (MCM) 2-7 proteins to form the pre-replicative complex, which is essential for initiating and elongating replication forks during S phase.
The components of the MCM2-7 complex in Homo sapiens (Human) are MCM2, MCM3, MCM4, MCM5, MCM6 and MCM7.
Studies in Xenopus eggs have shown that the six MCM proteins form hexamers, where each class is present in equal stoichiometry. The initiation of DNA synthesis in eukaryotes requires the binding of origin recognition complex (ORC) - a complex of six subunits - to the autonomously replicating sequences (ARS) of replication origins, the recruitment of CDC6 and binding of the MCM protein complex to the ARS to form the prereplicative complex (pre-RC). DNA synthesis is subsequently initiated by the activation of pre-RC by CDC7 and CDC28 protein kinases.
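The sequence of events at a replication origin described above can be sketched as a small state tracker. This is a deliberately simplified illustration: it assumes the factors load strictly in the order listed in the text (ORC, then CDC6, then the MCM complex) and that both kinases must act before initiation, neither of which the text states outright. The state names and the function name are hypothetical.

```python
# Loading order and kinase requirement are assumptions of this sketch.
ASSEMBLY_ORDER = ("ORC", "CDC6", "MCM")
ACTIVATING_KINASES = ("CDC7", "CDC28")


def origin_state(events):
    """Return 'unlicensed', 'pre-RC' or 'initiated' for a list of events."""
    loaded = 0
    kinases_seen = set()
    for event in events:
        if loaded < len(ASSEMBLY_ORDER) and event == ASSEMBLY_ORDER[loaded]:
            loaded += 1  # next expected factor has bound the origin
        elif event in ACTIVATING_KINASES and loaded == len(ASSEMBLY_ORDER):
            kinases_seen.add(event)  # kinases act only on a complete pre-RC
        else:
            raise ValueError(f"unexpected event {event!r} at this stage")
    if loaded < len(ASSEMBLY_ORDER):
        return "unlicensed"
    if kinases_seen == set(ACTIVATING_KINASES):
        return "initiated"
    return "pre-RC"
```

For instance, `origin_state(["ORC", "CDC6", "MCM"])` returns `"pre-RC"`, and appending both kinase events moves the origin to `"initiated"`.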
MCM proteins associate with chromatin during G1 phase and dissociate again during S phase, remaining unbound until the end of mitosis. Periodic chromatin association of the MCM complex ensures that DNA synthesis from replication origins is initiated only once during the cell cycle, avoiding over-replication of parts of the genome. Elongation of replication forks away from individual replication origins results in displacement of the MCM-containing complex from chromatin. Budding yeast MCM proteins are translocated in and out of the nucleus during each cell cycle. However, fission yeast MCMs, like those in metazoans, are constitutively nuclear.
The six classes of MCM protein together share a conserved 200 amino acid residue domain, while sequences within the same class show more extensive similarity outside this region. The conserved central domain is similar to the A motif of the Walker-type NTP-binding domain; it also shares similarity with ATPase domains of prokaryotic NtrC-related transcription regulators. The ATP-binding motif is thought to mediate ATP-dependent opening of double-stranded DNA at replication origins. In addition to the central region, MCM2, 4, 6 and 7 contain a zinc-finger-type motif thought to have a role in mediating protein-protein interactions. Moreover, a conserved alpha-helical structure in the C-terminal region has been noted; this comprises a conserved heptad repeat and a putative four-helix bundle. Most of the MCM proteins contain acidic regions, or alternately repeated clusters of acidic and basic residues.
In addition to its role in initiation of DNA replication, MCM2 is able to inhibit the MCM4,6,7 helicase. Studies on murine MCM2 indicate that its C-terminus is required for interaction with MCM4, as well as for inhibition of the DNA helicase activity of the MCM4,6,7 complex. The N-terminal region, which contains an H3-binding domain and a region required for nuclear localisation, is required for the phosphorylation by CDC7 kinase.
Members of the MCM3 class have been isolated from a number of organisms. Human MCM3 was first described as a protein associated with DNA polymerase alpha-primase, although subsequent analysis failed to show a direct interaction between them. The gene encoding human MCM3 has been localised to chromosome 6p21.1-p12. In Saccharomyces cerevisiae (Baker's yeast), MCM3 is a phospho-protein that exists in multiple isoforms; distinct isoforms can be detected at specific stages of the cell cycle. MCM3 has been implicated in limb development in Xenopus; identification of maternal and zygotic proteins suggests that specific forms may be used at different developmental stages. The MCM3 protein contains a nuclear localisation signal, which is necessary for its translocation into the nucleus.
MCM4 is thought to play a pivotal role in ensuring DNA replication occurs only once per cell cycle. Phosphorylation of MCM4 dramatically reduces its affinity for chromatin - it has been proposed that this cell cycle-dependent phosphorylation is the mechanism that inactivates the MCM complex from late S phase through mitosis, thus preventing illegitimate DNA replication during that period of the cell cycle.
In addition to its role as a replication factor, the MCM6 protein has DNA helicase activity when complexed as a hexamer (containing two molecules each of MCM4, MCM6 and MCM7), suggesting that this complex is involved in the initiation of DNA replication as a DNA-unwinding enzyme. Xenopus MCM6 exists in two forms, maternal and zygotic, suggesting that specific forms of MCM6 may be used at different developmental stages.
RNA-binding motif protein 8 (RBM8) contains a putative RNA-binding domain known as an RNA recognition motif (RRM). The RRM motif is found in numerous RNA-binding proteins, including heterogeneous nuclear ribonucleoproteins (hnRNPs), and proteins implicated in regulation of alternative splicing. The RRM is a 90-residue domain that binds single-stranded RNA; the structure consists of four beta-strands and two alpha-helices arranged in an alpha/beta sandwich, with a third helix present in some cases during RNA binding. Three-dimensional modelling of the RBM8 RRM domain indicates that the sequences fold into an RNA-binding domain, forming a hydrophobic core between a beta-sheet and two helices.
The human RBM8A protein is ubiquitously expressed; the protein is localised predominantly in the cell nucleus and diffused throughout the cytoplasm. It preferentially associates with mRNAs produced by splicing, including both nuclear mRNAs and newly exported cytoplasmic mRNAs. Evidence suggests the protein remains associated with spliced mRNAs as a tag to indicate the position of spliced introns. Human RBM8A protein specifically binds to MAGOH, the human homologue of Drosophila mago nashi, a protein required for normal germ plasm development in the Drosophila embryo; a similar association occurs with the Drosophila RBM8 protein, Tsunagi.
The RBM8A and RBM8B protein sequences contain a putative bipartite nuclear localisation signal at the N-terminus, as well as a stretch of glycine residues. In addition, the RRM contained within RBM8A and RBM8B contains one set of the two consensus nucleic acid-binding motifs, RNP-1 and RNP-2, characteristic of heterogeneous nuclear ribonucleoprotein (hnRNP).
There are four different enzymes that share a similar catalytic mechanism, which involves the phosphorylation by ATP (or GTP) of a specific histidine residue in the active site. These enzymes are: ATP citrate-lyase, the primary enzyme responsible for the synthesis of cytosolic acetyl-CoA in many tissues, which catalyzes the formation of acetyl-CoA and oxaloacetate from citrate and CoA with the concomitant hydrolysis of ATP to ADP and phosphate. ATP-citrate lyase is a tetramer of identical subunits; succinyl-CoA ligase (GDP-forming), a mitochondrial enzyme that catalyzes the substrate-level phosphorylation step of the tricarboxylic acid cycle: the formation of succinyl-CoA from succinate with a concomitant hydrolysis of GTP to GDP and phosphate. This enzyme is a dimer composed of an alpha and a beta subunit; succinyl-CoA ligase (ADP-forming), a bacterial enzyme that during aerobic metabolism functions in the citric acid cycle, coupling the hydrolysis of succinyl-CoA to the synthesis of ATP. It can also function in the other direction for anabolic purposes. This enzyme is a tetramer composed of two alpha and two beta subunits; and malate-CoA ligase (malyl-CoA synthetase), a bacterial enzyme that forms malyl-CoA from malate and CoA with the concomitant hydrolysis of ATP to ADP and phosphate. Malate-CoA ligase is composed of two different subunits.
This entry corresponds to two regions, a glycine-rich conserved region, located in the second half of ATP citrate lyase and in the alpha subunits of succinyl-CoA ligases and malate-CoA ligase; and the active site phosphorylated histidine residue, which is located some 50 residues to the C-terminal of the first region.
Striated fibre assemblin (SFA), an acidic 33kDa protein, is the major component of striated microtubule-associated fibres (SMAFs) in the flagellar basal apparatus of green flagellates. In Chlamydomonas, and other green flagellates, the SMAFs form a cross-like pattern and run alongside the proximal parts of four bundles of flagellar root microtubules.
The sequence of SFA contains two structurally distinct domains. The head domain, with ~30 residues, contains all the prolines (3-8 depending on species) and is rich in hydroxyamino acids. This non-helical domain is further characterised by the presence of repetitive SP-motifs, some of them in the context SP(M/T)R, which is a putative substrate for p34-CDC2 kinase. The rod domain, with ~250 residues, is predicted to be mostly alpha-helical (the alpha-helix content was estimated to be 76% for the entire molecule or 85% for the postulated rod domain). This domain shows a pronounced coiled-coil-forming ability and contains a 29-residue repeat pattern based on four heptads, followed by a skip residue.
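The two sequence features described above lend themselves to a small illustration. The following is a minimal sketch, not a validated annotation method: it scans a sequence for the SP(M/T)R context and checks the arithmetic of the 29-residue repeat (four heptads plus one skip residue). The function names and the toy sequence are hypothetical and were invented for this example; the toy sequence is not a real SFA head domain.

```python
import re

# SP(M/T)R context named in the text as a putative p34-CDC2 kinase substrate.
SP_MOTIF = re.compile(r"SP[MT]R")


def find_sp_sites(seq):
    """Return 0-based start positions of SP(M/T)R motifs in seq."""
    return [m.start() for m in SP_MOTIF.finditer(seq)]


def repeat_length(heptads=4, skip=1):
    """Length of one repeat unit: `heptads` heptads plus `skip` skip residues."""
    return heptads * 7 + skip


# Hypothetical fragment, made up for illustration only.
toy_head = "MSPMRASSPTRGSPAK"

print(find_sp_sites(toy_head))   # [1, 7]  (SPMR and SPTR; SPAK does not match)
print(repeat_length())           # 29
print(250 // repeat_length())    # 8 full repeat units fit in a ~250-residue rod
```

The last line is only a back-of-the-envelope bound on how many 29-residue units could fit in a rod of roughly 250 residues; the text does not state the actual number of repeats.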
All proteins in this family for which functions are known are components of a multiprotein complex used for targeting nucleotide excision repair to specific parts of the genome. Rad23 contains a ubiquitin-like domain that interacts with catalytically active proteasomes and two ubiquitin (Ub)-associated (UBA) sequences that bind Ub. Rad23 interacts with ubiquitinated cellular proteins through the synergistic action of its UBA domains.
In humans, Rad23 complexes with the XPC protein.
The U2 small nuclear ribonucleoprotein auxiliary factor (U2AF) is a heterodimeric splicing factor composed of a large and a small subunit. The large U2AF subunit recognises the intronic polypyrimidine tract, a sequence located adjacent to the 3' splice site that serves as an important signal for both constitutive and regulated pre-mRNA splicing. The small subunit interacts with the 3' splice site dinucleotide AG and is essential for regulated splicing. The subunits shuttle continuously between the nucleus and the cytoplasm via a mechanism that involves carrier receptors and is independent of binding to mRNA. Both subunits contain an arginine/serine-rich (RS) domain, which acts as a nuclear localisation signal. Furthermore, the presence of an RS domain on either subunit is sufficient to trigger the nucleocytoplasmic import of the heterodimeric complex.
The human form of the U2 auxiliary factor small subunit, hU2AF35, contains a degenerate RNA recognition motif (RRM) and a C-terminal RS domain. The murine form has been shown to be genomically imprinted with monoallelic expression from the paternal allele. However, this is not the case in humans.
The post-translational attachment of ubiquitin to proteins (ubiquitinylation) alters the function, location or trafficking of a protein, or targets it to the 26S proteasome for degradation. Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), and a ubiquitin ligase (E3), which work sequentially in a cascade. The E1 enzyme is responsible for activating ubiquitin, the first step in ubiquitinylation. The E1 enzyme hydrolyses ATP and adenylates the C-terminal glycine residue of ubiquitin, and then links this residue to the active site cysteine of E1, yielding a ubiquitin-thioester and free AMP. To be fully active, E1 must non-covalently bind to and adenylate a second ubiquitin molecule. The E1 enzyme can then transfer the thioester-linked ubiquitin molecule to a cysteine residue on the ubiquitin-conjugating enzyme, E2, in an ATP-dependent reaction.
ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.
ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain.
The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop (Walker A) and a magnesium-binding site (Walker B). Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ) specific to the ABC transporter; and the Q-motif, or Q-loop (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site.
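The motif layout described above can be illustrated with a simple pattern scan. This is a rough sketch under stated assumptions, not a substitute for profile-based family detection. The LSGGQ signature comes from the text; the Walker A pattern G-x(4)-G-K-[ST] is the commonly cited P-loop consensus and is not given in this entry. The function name and the toy sequence are hypothetical.

```python
import re

# Signature motif quoted in the text.
SIGNATURE = re.compile(r"LSGGQ")
# Assumption: commonly cited Walker A (P-loop) consensus, G-x(4)-G-K-[ST].
WALKER_A = re.compile(r"G.{4}GK[ST]")


def scan_abc_motifs(seq):
    """Return 0-based start positions of Walker A and signature matches."""
    return {
        "walker_a": [m.start() for m in WALKER_A.finditer(seq)],
        "signature": [m.start() for m in SIGNATURE.finditer(seq)],
    }


# Hypothetical toy sequence, invented for illustration only.
toy = "MKGPNGAGKSTLLAAAAQRLSGGQKQRV"
hits = scan_abc_motifs(toy)
print(hits)  # {'walker_a': [2], 'signature': [19]}

# Walker A precedes the signature, which is consistent with the text placing
# the Q-motif between Walker A and the signature.
print(hits["walker_a"][0] < hits["signature"][0])  # True
```

Exact-string matching of LSGGQ is deliberately strict; real ABC cassettes show variation in the signature, so a hit or miss here should be read as illustrative only.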
The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis.
The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).
This entry represents the ABCE family of ATP-binding cassette (ABC) transporters and comprises solely the ABCE1 gene product, a 68kDa polypeptide found in insect cells and multicellular eukaryotes, but not in yeast. ABCE1 contains 2 nucleotide-binding domains (NBDs) typical of the ABC transporter protein superfamily; however, it lacks the transmembrane domains required for membrane transport functions. ABCE1 is an endoribonuclease inhibitor that interacts directly with RNase L to prevent it from binding 2-5A (5'-phosphorylated 2',5'-linked oligoadenylates). RNase L plays a major role in the anti-viral and anti-proliferative activities of interferons, and its inhibition by ABCE1 occurs in a concentration-dependent manner. Recently, ABCE1 has been shown to be essential for the assembly of immature HIV-1 capsids in insect cells and higher eukaryotic cell types. ABCE1 expression is induced during HIV type I infection, and is understood to bind HIV-1 Gag (p55) polypeptides following their translation, and to promote their assembly into immature HIV-1 capsids.
Prokaryotic and eukaryotic organisms respond to heat shock or other environmental stress by inducing the synthesis of proteins collectively known as heat-shock proteins (hsp). Amongst them is a family of proteins with an average molecular weight of 20 kDa, known as the hsp20 proteins. These seem to act as chaperones that can protect other proteins against heat-induced denaturation and aggregation. Hsp20 proteins seem to form large heterooligomeric aggregates. Structurally, this family is characterised by the presence of a conserved C-terminal domain of about 100 residues.
The 'pleckstrin homology' (PH) domain is a domain of about 100 residues that occurs in a wide range of proteins involved in intracellular signalling or as constituents of the cytoskeleton.
The function of this domain is not clear; several putative functions have been suggested:
It is possible that different PH domains have totally different ligand requirements.
The 3D structure of several PH domains has been determined. All known cases have a common structure consisting of two perpendicular anti-parallel beta sheets, followed by a C-terminal amphipathic helix. The loops connecting the beta-strands differ greatly in length, making the PH domain relatively difficult to detect. There are no totally invariant residues within the PH domain.
Proteins reported to contain one or more PH domains belong to the following families:
The 3D structure of the C2 domain of synaptotagmin has been reported; the domain forms an eight-stranded beta sandwich constructed around a conserved 4-stranded motif, designated a C2 key. Calcium binds in a cup-shaped depression formed by the N- and C-terminal loops of the C2-key motif. Structural analyses of several C2 domains have shown them to consist of similar tertiary structures in which three Ca2+-binding loops are located at the end of an 8-stranded antiparallel beta sandwich.
The tetratrico peptide repeat region (TPR) is a structural motif present in a wide range of proteins. It mediates protein-protein interactions and the assembly of multiprotein complexes. The TPR motif consists of 3-16 tandem repeats of 34 amino acid residues, although individual TPR motifs can be dispersed in the protein sequence. Sequence alignment of the TPR domains reveals a consensus sequence defined by a pattern of small and large amino acids. TPR motifs have been identified in various different organisms, ranging from bacteria to humans. Proteins containing TPRs are involved in a variety of biological processes, such as cell cycle regulation, transcriptional control, mitochondrial and peroxisomal protein transport, neurogenesis and protein folding.
The X-ray structure of a domain containing three TPRs from protein phosphatase 5 revealed that TPR adopts a helix-turn-helix arrangement, with adjacent TPR motifs packing in a parallel fashion, resulting in a spiral of repeating anti-parallel alpha-helices. The two helices are denoted helix A and helix B. The packing angle between helix A and helix B is ~24° within a single TPR and generates a right-handed superhelical shape. Helix A interacts with helix B and with helix A' of the next TPR. Two protein surfaces are generated: the inner concave surface is contributed to mainly by residues on helices A, and the other surface presents residues from both helices A and B.
The forkhead-associated (FHA) domain is a phosphopeptide recognition domain found in many regulatory proteins. It displays specificity for phosphothreonine-containing epitopes but will also recognise phosphotyrosine with relatively high affinity. It spans approximately 80-100 amino acid residues folded into an 11-stranded beta sandwich, which sometimes contains small helical insertions between the loops connecting the strands.
To date, genes encoding FHA-containing proteins have been identified in eubacterial and eukaryotic but not archaeal genomes. The domain is present in a diverse range of proteins, such as kinases, phosphatases, kinesins, transcription factors, RNA-binding proteins and metabolic enzymes which partake in many different cellular processes - DNA repair, signal transduction, vesicular transport and protein degradation are just a few examples.
Phosphatidylinositol-specific phospholipase C, a eukaryotic intracellular enzyme, plays an important role in signal transduction processes. It catalyzes the hydrolysis of 1-phosphatidyl-D-myo-inositol-4,5-bisphosphate into the second messenger molecules diacylglycerol and inositol-1,4,5-trisphosphate. This catalytic process is tightly regulated by reversible phosphorylation and binding of regulatory proteins.
In mammals, there are at least 6 different isoforms of PI-PLC; they differ in their domain structure, their regulation, and their tissue distribution. Lower eukaryotes also possess multiple isoforms of PI-PLC.
All eukaryotic PI-PLCs contain two regions of homology, sometimes referred to as 'X-box' and 'Y-box'. The order of these two regions is always the same (NH2-X-Y-COOH), but the spacing is variable. In most isoforms, the distance between these two regions is only 50-100 residues but in the gamma isoforms one PH domain, two SH2 domains, and one SH3 domain are inserted between the two PLC-specific domains. The two conserved regions have been shown to be important for the catalytic activity. At the C-terminal of the Y-box, there is a C2 domain possibly involved in Ca-dependent membrane attachment.
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases.
Eukaryotic protein kinases are enzymes that belong to a very extensive family of proteins which share a conserved catalytic core common to both serine/threonine and tyrosine protein kinases. There are a number of conserved regions in the catalytic domain of protein kinases. In the N-terminal extremity of the catalytic domain there is a glycine-rich stretch of residues in the vicinity of a lysine residue, which has been shown to be involved in ATP binding. In the central part of the catalytic domain there is a conserved aspartic acid residue which is important for the catalytic activity of the enzyme. This entry includes protein kinases from eukaryotes and viruses and may include some bacterial hits too.
The regulator of chromosome condensation (RCC1) is a eukaryotic protein which binds to chromatin and interacts with ran, a nuclear GTP-binding protein, to promote the loss of bound GDP and the uptake of fresh GTP, thus acting as a guanine-nucleotide dissociation stimulator (GDS). The interaction of RCC1 with ran probably plays an important role in the regulation of gene expression.
RCC1, known as PRP20 or SRM1 in yeast, pim1 in fission yeast and BJ1 in Drosophila, is a protein that contains seven tandem repeats of a domain of about 50 to 60 amino acids. As shown in the following schematic representation, the repeats make up the major part of the length of the protein. Outside the repeat region, there is just a small N-terminal domain of about 40 to 50 residues and, in the Drosophila protein only, a C-terminal domain of about 130 residues.
The RCC1-type of repeat is also found in the X-linked retinitis pigmentosa GTPase regulator. The RCC repeats form a beta-propeller structure. The precise function of the domain is unclear, but it may be involved in protein-protein interactions and may play a role in assembly or activity of multi-component complexes involved in transcriptional activation.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents the PHD (plant homeodomain) zinc finger domain, which is a C4HC3 zinc-finger-like motif found in nuclear proteins thought to be involved in chromatin-mediated transcriptional regulation. The PHD finger motif is reminiscent of, but distinct from, the C3HC4 type RING finger.
The function of this domain is not yet known but in analogy with the LIM domain it could be involved in protein-protein interaction and be important for the assembly or activity of multicomponent complexes involved in transcriptional activation or repression. Alternatively, the interactions could be intra-molecular and be important in maintaining the structural integrity of the protein. In similarity to the RING finger and the LIM domain, the PHD finger is thought to bind two zinc ions.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Synonym(s): Rsp5 or WWP domain
The WW domain is a short conserved region in a number of unrelated proteins, which folds as a stable, triple-stranded beta-sheet. This short domain of approximately 40 amino acids may be repeated up to four times in some proteins. The name WW or WWP derives from the presence of two signature tryptophan residues that are spaced 20-23 amino acids apart and are present in most WW domains known to date, as well as that of a conserved Pro. The WW domain binds to proteins with particular proline-motifs, [AP]-P-P-[AP]-Y, and/or phosphoserine- or phosphothreonine-containing motifs. It is frequently associated with other domains typical for proteins in signal transduction processes.
A large variety of proteins containing the WW domain are known. These include: dystrophin, a multidomain cytoskeletal protein; utrophin, a dystrophin-like protein of unknown function; vertebrate YAP protein, substrate of an unknown serine kinase; Mus musculus (Mouse) NEDD-4, involved in the embryonic development and differentiation of the central nervous system; Saccharomyces cerevisiae (Baker's yeast) RSP5, similar to NEDD-4 in its molecular organization; Rattus norvegicus (Rat) FE65, a transcription-factor activator expressed preferentially in liver; Nicotiana tabacum (Common tobacco) DB10 protein and others.
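The proline-motif and the tryptophan spacing described above can both be expressed as simple sequence checks. The following is a minimal sketch with hypothetical function names and made-up toy sequences. It assumes that "spaced 20-23 amino acids apart" means a difference of 20 to 23 between the two tryptophan positions; the text does not specify whether the count is inclusive, so that reading is an assumption.

```python
import re

# Ligand motif from the text: [AP]-P-P-[AP]-Y
PPXY = re.compile(r"[AP]PP[AP]Y")


def find_ww_ligand_motifs(seq):
    """Return (start, matched_text) for each [AP]-P-P-[AP]-Y occurrence."""
    return [(m.start(), m.group()) for m in PPXY.finditer(seq)]


def tryptophan_pairs(seq, min_gap=20, max_gap=23):
    """Return pairs of Trp positions whose index difference is in range.

    Assumption: the gap is measured as the difference of 0-based positions.
    """
    positions = [i for i, aa in enumerate(seq) if aa == "W"]
    return [
        (i, j)
        for i in positions
        for j in positions
        if min_gap <= j - i <= max_gap
    ]


# Hypothetical toy sequences, invented for illustration only.
print(find_ww_ligand_motifs("GTPPPAYLE"))      # [(2, 'PPPAY')]
print(tryptophan_pairs("W" + "A" * 21 + "W"))  # [(0, 22)]
```

Neither check is sufficient to call a WW domain or a genuine ligand; they only make the stated patterns concrete.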
The calponin homology domain (also known as CH-domain) is a superfamily of actin-binding domains found in both cytoskeletal proteins and signal transduction proteins. It comprises the following groups of actin-binding domains:
A comprehensive review of proteins containing this type of actin-binding domains is given in.
The CH domain is involved in actin binding in some members of the family. However, in calponins there is evidence that the CH domain is not involved in actin-binding activity. Most proteins have two copies of the CH domain; however, some proteins, such as calponin and the human vav proto-oncogene, have only a single copy. The structure of an example CH-domain has recently been solved.
UBA domains are a commonly occurring sequence motif of approximately 45 amino acid residues that are found in diverse proteins involved in the ubiquitin/proteasome pathway, DNA excision-repair, and cell signalling via protein kinases. The human homologue of yeast Rad23A is one example of a nucleotide excision-repair protein that contains both an internal and a C-terminal UBA domain. The solution structure of human Rad23A UBA(2) showed that the domain forms a compact three-helix bundle. Comparison of the structures of UBA(1) and UBA(2) reveals that both form very similar folds and have a conserved large hydrophobic surface patch which may be a common protein-interacting surface present in diverse UBA domains. Evidence that ubiquitin binds to UBA domains leads to the prediction that the hydrophobic surface patch of UBA domains interacts with the hydrophobic surface on the five-stranded beta-sheet of ubiquitin.
This domain is similar in sequence to the N-terminal domain of translation elongation factor EF1B (or EF-Ts) from bacteria, mitochondria and chloroplasts.
More information about EF1B (EF-Ts) proteins can be found at Protein of the Month: Elongation Factors.
The EH (for Eps15 Homology) domain is a protein-protein interaction module of approximately 95 residues which was originally identified as a repeated sequence present in three copies at the N-terminus of the tyrosine kinase substrates Eps15 and Eps15R. The EH domain was subsequently found in several proteins implicated in endocytosis, vesicle transport and signal transduction in organisms ranging from yeast to mammals. EH domains are present in one to three copies and they may include calcium-binding domains of the EF-hand type. Eps15 is divided into three domains: domain I contains signatures of a regulatory domain, including a candidate tyrosine phosphorylation site and EF-hand-type calcium-binding domains, domain II presents the characteristic heptad repeats of coiled-coil rod-like proteins, and domain III displays a repeated aspartic acid-proline-phenylalanine motif similar to a consensus sequence of several methylases.
EH domains have been shown to bind specifically but with moderate affinity to peptides containing short, unmodified motifs through predominantly hydrophobic interactions. The target motifs are divided into three classes: class I consists of the consensus Asn-Pro-Phe (NPF) sequence; class II consists of aromatic and hydrophobic di- and tripeptide motifs, including the Phe-Trp (FW), Trp-Trp (WW), and Ser-Trp-Gly (SWG) motifs; and class III contains the His-(Thr/Ser)-Phe motif (HTF/HSF). The structure of several EH domains has been solved by NMR spectroscopy. The fold consists of two helix-loop-helix motifs characteristic of EF-hand domains, connected by a short antiparallel beta-sheet. The target peptide is bound in a hydrophobic pocket between two alpha helices. Sequence analysis and structural data indicate that not all the EF-hands are capable of binding calcium because of substitutions of the calcium-liganding residues in the loop.
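The three classes of EH-domain target motif listed above map directly onto short patterns. The sketch below is illustrative only: the class definitions are taken from the text, while the function name and the example peptides are hypothetical. A pattern match says nothing about actual binding, which the text describes as being of moderate affinity.

```python
import re

# Motif classes as listed in the text.
EH_CLASSES = {
    "class I": re.compile(r"NPF"),         # Asn-Pro-Phe
    "class II": re.compile(r"FW|WW|SWG"),  # Phe-Trp, Trp-Trp, Ser-Trp-Gly
    "class III": re.compile(r"H[TS]F"),    # His-(Thr/Ser)-Phe
}


def classify_eh_targets(peptide):
    """Return the names of all motif classes found in the peptide."""
    return [name for name, pattern in EH_CLASSES.items() if pattern.search(peptide)]


# Hypothetical peptides, invented for illustration only.
print(classify_eh_targets("AANPFGG"))  # ['class I']
print(classify_eh_targets("GHSFWA"))   # ['class II', 'class III']
print(classify_eh_targets("AAAA"))     # []
```

The second peptide shows that classes can overlap within one sequence: HSF and FW share the phenylalanine.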
This domain is often implicated in the regulation of protein transport/sorting and membrane trafficking.
Messenger RNA translation initiation and cytoplasmic poly(A) tail shortening require the poly(A)-binding protein (PAB) in yeast. The PAB-dependent poly(A) ribonuclease (PAN) is organised into distinct domains containing repeated sequence elements.
Phosphatidylcholine-hydrolysing phospholipase D (PLD) isoforms are activated by ADP-ribosylation factors (ARFs). PLD produces phosphatidic acid from phosphatidylcholine, which may be essential for the formation of certain types of transport vesicles or may be constitutive vesicular transport to signal transduction pathways. PC-hydrolysing PLD is a homologue of cardiolipin synthase, phosphatidylserine synthase, bacterial PLDs, and viral proteins. Each of these appears to possess a domain duplication which is apparent by the presence of two motifs containing well-conserved histidine, lysine, and/or asparagine residues which may contribute to the active site aspartic acid. An Escherichia coli endonuclease (nuc) and similar proteins appear to be PLD homologues but possess only one of these motifs.
Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.
Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta).
This entry represents a conserved domain usually found near the C-terminus of EF1B-gamma chains, a peptide of 410-440 residues. The gamma chain appears to play a role in anchoring the EF1B complex to the beta and delta chains and to other cellular components.
More information about these proteins can be found at Protein of the Month: Elongation Factors.
MCM proteins are DNA-dependent ATPases required for the initiation of eukaryotic DNA replication. In eukaryotes there is a family of six proteins, MCM2 to MCM7. They were first identified in yeast where most of them have a direct role in the initiation of chromosomal DNA replication by interacting directly with autonomously replicating sequences (ARS). They were thus called minichromosome maintenance proteins, MCM proteins.
This family is also present in the archaebacteria in 1 to 4 copies. Methanocaldococcus jannaschii (Methanococcus jannaschii) has four members, MJ0363, MJ0961, MJ1489 and MJECL13.
The "MCM motif" contains Walker-A and Walker-B type nucleotide binding motifs. The diagnostic sequence defining the MCMs is IDEFDKM. Only Mcm2 (aka Cdc19 or Nda1) has been subjected to mutational analysis in this region, and most mutations abolish its activity. The presence of a putative ATP-binding domain implies that these proteins may be involved in an ATP-consuming step in the initiation of DNA replication in eukaryotes.
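As a small illustration of the diagnostic sequence mentioned above, the following sketch reports where IDEFDKM occurs in a protein sequence. It is an exact-string check only and assumes no tolerance for substitutions; family membership in practice relies on profile-based methods. The function name and the toy sequence are hypothetical.

```python
# Diagnostic sequence quoted in the text.
MCM_DIAGNOSTIC = "IDEFDKM"


def find_mcm_diagnostic(seq):
    """Return the 0-based position of IDEFDKM in seq, or -1 if absent."""
    return seq.find(MCM_DIAGNOSTIC)


# Hypothetical toy sequence, invented for illustration only.
toy = "AAGVCLIDEFDKMNDQ"
print(find_mcm_diagnostic(toy))        # 6
print(find_mcm_diagnostic("AAGVCLI"))  # -1
```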
The MCM proteins bind together in a large complex. Within this complex, individual subunits associate with different affinities, and there is a tightly associated core of Mcm4 (Cdc21), Mcm6 (Mis5) and Mcm7. This core complex in human MCMs has been associated with helicase activity in vitro, leading to the suggestion that the MCM proteins are the eukaryotic replicative helicase.
Schizosaccharomyces pombe (Fission yeast) MCMs, like those in metazoans, are found in the nucleus throughout the cell cycle. This is in contrast to Saccharomyces cerevisiae (Baker's yeast), in which MCM proteins move in and out of the nucleus during each cell cycle. The assembly of the MCM complex in S. pombe is required for MCM localisation, ensuring that only intact MCM complexes remain in the nucleus.
Guanylate kinase (GK) catalyzes the ATP-dependent phosphorylation of GMP into GDP. It is essential for recycling GMP and, indirectly, cGMP. In prokaryotes (such as Escherichia coli), lower eukaryotes (such as yeast) and in vertebrates, GK is a highly conserved monomeric protein of about 200 amino acids. GK has been shown to be structurally similar to protein A57R (or SalG2R) from various strains of Vaccinia virus.
Proteins containing one or more copies of the DHR domain, an SH3 domain as well as a C-terminal GK-like domain, are collectively termed MAGUKs (membrane-associated guanylate kinase homologs), and include Drosophila lethal(1)discs large-1 tumor suppressor protein (gene dlg1); mammalian tight junction protein Zo-1; a family of mammalian synaptic proteins that seem to interact with the cytoplasmic tail of NMDA receptor subunits (SAP90/PSD-95, CHAPSYN-110/PSD-93, SAP97/DLG1 and SAP102); vertebrate 55 kD erythrocyte membrane protein (p55); Caenorhabditis elegans protein lin-2; rat protein CASK; and human proteins DLG2 and DLG3. There is an ATP-binding site (P-loop) in the N-terminal section of GK, which is not conserved in the GK-like domain of the above proteins. However, these proteins retain the residues known, in GK, to be involved in the binding of GMP.
Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), and a ubiquitin ligase (E3), which work sequentially in a cascade. There are many different E3 ligases, which are responsible for the type of ubiquitin chain formed, the specificity of the target protein, and the regulation of the ubiquitinylation process. Ubiquitinylation is an important regulatory tool that controls the concentration of key signalling proteins, such as those involved in cell cycle control, as well as removing misfolded, damaged or mutant proteins that could be harmful to the cell. Several ubiquitin-like molecules have been discovered, such as Ufm1, SUMO1, NEDD8, Rad23, Elongin B and Parkin, the latter being involved in Parkinson's disease.
Ubiquitin is a protein of 76 amino acid residues, found in all eukaryotic cells, whose sequence is extremely well conserved from protozoans to vertebrates. Ubiquitin acts through its post-translational attachment (ubiquitinylation) to other proteins, where these modifications alter the function, location or trafficking of the protein, or target it for destruction by the 26S proteasome. The terminal glycine in the C-terminal 4-residue tail of ubiquitin can form an isopeptide bond with a lysine residue in the target protein, or with a lysine in another ubiquitin molecule to form a ubiquitin chain that is attached to a target protein. Ubiquitin has seven lysine residues, any one of which can be used to link ubiquitin molecules together, resulting in different structures that alter the target protein in different ways. It appears that Lys(11)-, Lys(29)- and Lys(48)-linked poly-ubiquitin chains target the protein to the proteasome for degradation, while mono-ubiquitinylated and Lys(6)- or Lys(63)-linked poly-ubiquitin chains signal reversible modifications in protein activity, location or trafficking. For example, Lys(63)-linked poly-ubiquitinylation is known to be involved in DNA damage tolerance, inflammatory response, protein trafficking and signal transduction through kinase activation. In addition, the length of the ubiquitin chain alters the fate of the target protein. Regulatory proteins such as transcription factors and histones are frequent targets of ubiquitinylation.
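The linkage-dependent outcomes summarised above can be written down as a small lookup, shown here as a hedged Python sketch. The function and constant names are hypothetical, and the mapping encodes only what the paragraph states ("it appears that..."); linkages the text does not assign are reported as such rather than guessed.

```python
from typing import Optional

# Linkage-to-outcome assignments exactly as summarised in the text above.
PROTEASOME_LINKAGES = {11, 29, 48}
REGULATORY_LINKAGES = {6, 63}


def predicted_outcome(linkage_lysine: Optional[int]) -> str:
    """Map a ubiquitin linkage to the outcome described in the text.

    Pass None for mono-ubiquitinylation (no ubiquitin-ubiquitin linkage).
    """
    if linkage_lysine is None or linkage_lysine in REGULATORY_LINKAGES:
        return "reversible regulation (activity, location or trafficking)"
    if linkage_lysine in PROTEASOME_LINKAGES:
        return "proteasomal degradation"
    return "not assigned in the text"


for lys in (48, 63, None, 27):
    print(lys, "->", predicted_outcome(lys))
```

The sketch ignores chain length, which the text notes also influences the fate of the target protein.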
Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPases) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation. The PTP superfamily can be divided into four subfamilies:
Based on their cellular localisation, PTPases are also classified as:
All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel beta-sheet with flanking alpha-helices containing a beta-loop-alpha-loop that encompasses the PTP signature motif. Functional diversity between PTPases is endowed by regulatory domains and subunits.
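The C(X)5R signature motif described above translates directly into a regular expression: a cysteine, any five residues, then an arginine. The Python sketch below is illustrative only; the function name and the toy fragment are hypothetical, and a sequence match alone does not establish that a protein is a catalytically active PTPase.

```python
import re
from typing import List, Tuple

# C(X)5R: Cys, five arbitrary residues, Arg. The lookahead lets
# overlapping candidate sites be reported as well.
PTP_SIGNATURE = re.compile(r"(?=(C.{5}R))")


def find_ptp_signature(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based start, matched residues) for each C(X)5R site."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in PTP_SIGNATURE.finditer(seq)]


# Hypothetical toy fragment used only to exercise the pattern.
toy_fragment = "AAHCSAGIGRSGT"
print(find_ptp_signature(toy_fragment))
```

Because the motif is short, chance matches are expected in unrelated proteins, so hits are best treated as candidates for further structural or profile-based checks.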
This entry represents dual specificity protein-tyrosine phosphatases. Ser/Thr and Tyr dual specificity phosphatases are a group of enzymes with both Ser/Thr and tyrosine specific protein phosphatase activity, able to remove serine/threonine- or tyrosine-bound phosphate groups from a wide range of phosphoproteins, including a number of enzymes which have been phosphorylated under the action of a kinase. Dual specificity protein phosphatases (DSPs) regulate mitogenic signal transduction and control the cell cycle. The crystal structure of a human DSP, vaccinia H1-related phosphatase (or VHR), has been determined at 2.1 angstrom resolution. A shallow active site pocket in VHR allows for the hydrolysis of phosphorylated serine, threonine, or tyrosine protein residues, whereas the deeper active site of protein tyrosine phosphatases (PTPs) restricts substrate specificity to only phosphotyrosine. Positively charged crevices near the active site may explain the enzyme's preference for substrates with two phosphorylated residues. The VHR structure defines a conserved structural scaffold for both DSPs and PTPs. A "recognition region" connecting helix alpha1 to strand beta1 may determine differences in substrate specificity between VHR, the PTPs, and other DSPs.
These proteins may also have inactive phosphatase domains, and dependent on the domain composition this loss of catalytic activity has different effects on protein function. Inactive single domain phosphatases can still specifically bind substrates, and protect against dephosphorylation, while the inactive domains of tandem phosphatases can be further subdivided into two classes. Those which bind phosphorylated tyrosine residues may recruit multi-phosphorylated substrates for the adjacent active domains and are more conserved, while the other class have accumulated several variable amino acid substitutions and have a complete loss of tyrosine binding capability. The second class shows a release of evolutionary constraint for the sites around the catalytic centre, which emphasises a difference in function from the first group. There is a region of higher conservation common to both classes, suggesting a new regulatory centre.
This entry includes proteins of two subfamilies: Ser/Thr and Tyr dual specificity protein phosphatase and tyrosine specific protein phosphatase. Both of these subfamilies may also have inactive phosphatase domains, and dependent on the domain composition this loss of catalytic activity has different effects on protein function. Inactive single domain phosphatases can still specifically bind substrates, and protect against dephosphorylation, while the inactive domains of tandem phosphatases can be further subdivided into two classes. Those which bind phosphorylated tyrosine residues may recruit multi-phosphorylated substrates for the adjacent active domains and are more conserved, while the other class have accumulated several variable amino acid substitutions and have a complete loss of tyrosine binding capability. The second class shows a release of evolutionary constraint for the sites around the catalytic centre, which emphasises a difference in function from the first group. There is a region of higher conservation common to both classes, suggesting a regulatory centre.
Ser/Thr and Tyr dual specificity phosphatases are a group of enzymes with both Ser/Thr and tyrosine specific protein phosphatase activity, able to remove serine/threonine- or tyrosine-bound phosphate groups from a wide range of phosphoproteins, including a number of enzymes which have been phosphorylated under the action of a kinase. Dual specificity protein phosphatases (DSPs) regulate mitogenic signal transduction and control the cell cycle. Tyrosine specific protein phosphatases catalyze the removal of a phosphate group attached to a tyrosine residue. They are also very important in the control of cell growth, proliferation, differentiation and transformation.
Synonym(s): Peptidylprolyl cis-trans isomerase
FKBP-type peptidylprolyl isomerases in vertebrates are receptors for the two immunosuppressants, FK506 and rapamycin. The drugs inhibit T cell proliferation by arresting two distinct cytoplasmic signal transmission pathways. Peptidylprolyl isomerases accelerate protein folding by catalysing the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides. These proteins are found in a variety of organisms.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents PARP (poly(ADP-ribose) polymerase) type zinc finger domains.
NAD(+) ADP-ribosyltransferase is a eukaryotic enzyme that catalyses the covalent attachment of ADP-ribose units from NAD(+) to various nuclear acceptor proteins. This post-translational modification of nuclear proteins is dependent on DNA. It appears to be involved in the regulation of various important cellular processes such as differentiation, proliferation and tumour transformation as well as in the regulation of the molecular events involved in the recovery of the cell from DNA damage. Structurally, NAD(+) ADP-ribosyltransferase consists of three distinct domains: an N-terminal zinc-dependent DNA-binding domain, a central automodification domain and a C-terminal NAD-binding domain. The DNA-binding region contains a pair of PARP-type zinc finger domains which have been shown to bind DNA in a zinc-dependent manner. The PARP-type zinc finger domains seem to bind specifically to single-stranded DNA and to act as a DNA nick sensor. DNA ligase III contains, in its N-terminal section, a single copy of a zinc finger highly similar to those of PARP.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Kinesin is a microtubule-associated force-producing protein that may play a role in organelle transport. The kinesin motor activity is directed toward the microtubule's plus end. Kinesin is an oligomeric complex composed of two heavy chains and two light chains. The maintenance of the quaternary structure does not require interchain disulphide bonds.
The heavy chain is composed of three structural domains: a large globular N-terminal domain which is responsible for the motor activity of kinesin (it is known to hydrolyse ATP and to bind and move on microtubules); a central alpha-helical coiled coil domain that mediates the heavy chain dimerisation; and a small globular C-terminal domain which interacts with other proteins (such as the kinesin light chains), vesicles and membranous organelles.
A number of proteins have been recently found that contain a domain similar to that of the kinesin 'motor' domain:
The kinesin motor domain is located in the N-terminal part of most of the above proteins, with the exception of KAR3, klpA, and ncd where it is located in the C-terminal section.
The kinesin motor domain contains about 330 amino acids. An ATP-binding motif of type A is found near position 80 to 90; the C-terminal half of the domain is involved in microtubule-binding.
Cullins are a family of hydrophobic proteins that act as scaffolds for ubiquitin ligases (E3). Cullins are found throughout eukaryotes. Humans express seven cullins (Cul1, 2, 3, 4A, 4B, 5 and 7), each forming part of a multi-subunit ubiquitin complex. Cullin-RING ubiquitin ligases (CRLs), such as Cul1 (SCF), play an essential role in targeting proteins for ubiquitin-mediated destruction; as such, they are diverse in terms of composition and function, regulating many different processes from glucose sensing and DNA replication to limb patterning and circadian rhythms. The catalytic core of CRLs consists of a RING protein and a cullin family member. For Cul1, the C-terminal cullin-homology domain binds the RING protein. The RING protein appears to function as a docking site for ubiquitin-conjugating enzymes (E2s). Other proteins contain a cullin-homology domain, such as the APC2 subunit of the anaphase-promoting complex/cyclosome and the p53 cytoplasmic anchor PARC; both APC2 and PARC have ubiquitin ligase activity. The N-terminal region of cullins is more variable, and is used to interact with specific adaptor proteins.
This entry represents the cullin homology region, which is composed of three domains: a 4-helical bundle domain, an alpha+beta domain, and a winged helix-like domain.
Cyclophilin is the major high-affinity binding protein in vertebrates for the immunosuppressive drug cyclosporin A (CSA), but is also found in other organisms. It exhibits a peptidyl-prolyl cis-trans isomerase activity (PPIase or rotamase). PPIase is an enzyme that accelerates protein folding by catalysing the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides. It is probable that CSA mediates some of its effects by forming a tight complex with cyclophilin that inhibits the phosphatase activity of calcineurin. Cyclophilin A is a cytosolic and highly abundant protein. The protein belongs to a family of isozymes, including cyclophilins B and C, and natural killer cell cyclophilin-related protein. Major isoforms have been found throughout the cell, including the ER, and some are even secreted. The sequences of the different forms of cyclophilin-type PPIases are well conserved.
Acyl carrier protein (ACP) is an essential cofactor in the synthesis of fatty acids by the fatty acid synthetase systems in bacteria and plants. In addition to fatty acid synthesis, ACP is also involved in many other reactions that require acyl transfer steps, such as the synthesis of polyketide antibiotics, biotin precursor, membrane-derived oligosaccharides, and activation of toxins, and functions as an essential cofactor in lipoylation of pyruvate and alpha-ketoglutarate dehydrogenase complexes. Phosphopantetheine (or pantetheine 4' phosphate) is the prosthetic group of acyl carrier proteins (ACP) in some multienzyme complexes where it serves as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups. Phosphopantetheine is attached to a serine residue in these proteins. The core structure of ACP consists of a four-helical bundle, where helix three is shorter than the others.
Several other proteins share structural homology with ACP, such as the bacterial apo-D-alanyl carrier protein, which facilitates the incorporation of D-alanine into lipoteichoic acid by a ligase, necessary for the growth and development of Gram-positive organisms; and the thioester domain of the bacterial peptide carrier protein (PCP) found within large modular non-ribosomal peptide synthetases, which are responsible for the synthesis of a variety of microbial bioactive peptides.
The prokaryotic heat shock protein DnaJ interacts with the chaperone hsp70-like DnaK protein. Structurally, the DnaJ protein consists of an N-terminal conserved domain (called 'J' domain) of about 70 amino acids, a glycine-rich region ('G' domain) of about 30 residues, a central domain containing four repeats of a CXXCXGXG motif ('CRR' domain) and a C-terminal region of 120 to 170 residues.
Such a structure is shown in the following schematic representation:
It is thought that the 'J' domain of DnaJ mediates the interaction with the dnaK protein and consists of four helices, the second of which has a charged surface that includes at least one pair of basic residues that are essential for interaction with the ATPase domain of Hsp70. The J- and CRR-domains are found in many prokaryotic and eukaryotic proteins, either together or separately. In yeast, J-domains have been classified into 3 groups; the class III proteins are functionally distinct and do not appear to act as molecular chaperones.
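As an illustration of the 'CRR' domain described above, the following Python sketch locates CXXCXGXG repeats with a regular expression and checks whether four copies are present. The function names and the toy sequence are hypothetical, and requiring exactly the consensus at every position is a simplifying assumption.

```python
import re
from typing import List

# CXXCXGXG: Cys, two arbitrary residues, Cys, any residue, Gly, any residue, Gly.
CRR_REPEAT = re.compile(r"C..C.G.G")


def find_crr_repeats(sequence: str) -> List[int]:
    """Return 1-based start positions of non-overlapping CXXCXGXG repeats."""
    return [m.start() + 1 for m in CRR_REPEAT.finditer(sequence.upper())]


def has_four_repeats(sequence: str) -> bool:
    """True if at least four repeats are found, as described for DnaJ."""
    return len(find_crr_repeats(sequence)) >= 4


# Hypothetical toy sequence: four consensus repeats separated by alanines.
toy_sequence = "AA" + "CPTCHGSGAAAA" * 4
print(find_crr_repeats(toy_sequence), has_four_repeats(toy_sequence))
```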
The HEAT repeat is a tandemly repeated, 37-47 amino acid long module occurring in a number of cytoplasmic proteins, including the four name-giving proteins huntingtin, elongation factor 3 (EF3), the 65 kDa alpha regulatory subunit of protein phosphatase 2A (PP2A) and the yeast PI3-kinase TOR1. Arrays of HEAT repeats consist of 3 to 36 units forming a rod-like helical structure and appear to function as protein-protein interaction surfaces. It has been noted that many HEAT repeat-containing proteins are involved in intracellular transport processes.
In the crystal structure of PP2A PR65/A, the HEAT repeats consist of pairs of antiparallel alpha helices, as predicted.
Diacylglycerol (DAG) is an important second messenger. Phorbol esters (PE) are analogues of DAG and potent tumour promoters that cause a variety of physiological changes when administered to both cells and tissues. DAG activates a family of serine/threonine protein kinases, collectively known as protein kinase C (PKC). Phorbol esters can directly stimulate PKC. The N-terminal region of PKC, known as C1, has been shown to bind PE and DAG in a phospholipid and zinc-dependent fashion. The C1 region contains one or two copies (depending on the isozyme of PKC) of a cysteine-rich domain, which is about 50 amino-acid residues long, and which is essential for DAG/PE-binding. The DAG/PE-binding domain binds two zinc ions; the ligands of these metal ions are probably the six cysteines and two histidines that are conserved in this domain.
WD-40 repeats (also known as WD or beta-transducin repeats) are short ~40 amino acid motifs, often terminating in a Trp-Asp (W-D) dipeptide. WD40 repeats usually assume a 7-8 bladed beta-propeller fold, but proteins have been found with 4 to 16 repeated units, which also form a circularised beta-propeller structure. WD-repeat proteins are a large family found in all eukaryotes and are implicated in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control and apoptosis. Repeated WD40 motifs act as a site for protein-protein interaction, and proteins containing WD40 repeats are known to serve as platforms for the assembly of protein complexes or mediators of transient interplay among other proteins. The specificity of the proteins is determined by the sequences outside the repeats themselves. Examples of such complexes are G proteins (beta subunit is a beta-propeller), TAFII transcription factor, and E3 ubiquitin ligase. In Arabidopsis spp., several WD40-containing proteins act as key regulators of plant-specific developmental events.
The K homology (KH) domain was first identified in the human heterogeneous nuclear ribonucleoprotein (hnRNP) K. It is a domain of around 70 amino acids that is present in a wide variety of quite diverse nucleic acid-binding proteins. It has been shown to bind RNA. Like many other RNA-binding motifs, KH motifs are found in one or multiple copies (14 copies in chicken vigilin) and, at least for hnRNP K (three copies) and FMR-1 (two copies), each motif is necessary for in vitro RNA binding activity, suggesting that they may function cooperatively or, in the case of single KH motif proteins (for example, Mer1p), independently.
According to structural analysis, the KH domain can be separated into two groups. The first group, or type-1, contains a beta-alpha-alpha-beta-beta-alpha structure, whereas in type-2 the last two beta-strands are located in the N-terminal part of the domain (alpha-beta-beta-alpha-alpha-beta). Sequence similarity between these two folds is limited to a short region (VIGXXGXXI) in the RNA binding motif. This motif is located between helices 1 and 2 in type-1 and between helices 2 and 3 in type-2. Proteins known to contain a type-1 KH domain include bacterial polyribonucleotide nucleotidyltransferases; vertebrate fragile X mental retardation protein 1 (FMR1); eukaryotic heterogeneous nuclear ribonucleoprotein K (hnRNP K), one of at least 20 major proteins that are part of hnRNP particles in mammalian cells; mammalian poly(rC) binding proteins; Artemia salina glycine-rich protein GRP33; yeast PAB1-binding protein 2 (PBP2); vertebrate vigilin; and human high-density lipoprotein binding protein (HDL-binding protein).
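The two KH topologies above differ only in the order of their secondary-structure elements, and in both the conserved motif sits between the two helices that are adjacent in that order. A minimal Python sketch of this bookkeeping follows; the names are hypothetical, and the input is assumed to be an already-determined list of "alpha"/"beta" elements rather than a prediction from sequence.

```python
from typing import Optional, Sequence, Tuple

# Element orders as given in the text.
KH_TOPOLOGIES = {
    "type-1": ("beta", "alpha", "alpha", "beta", "beta", "alpha"),
    "type-2": ("alpha", "beta", "beta", "alpha", "alpha", "beta"),
}


def classify_kh(elements: Sequence[str]) -> Optional[str]:
    """Return 'type-1' or 'type-2' if the element order matches, else None."""
    order = tuple(elements)
    for name, topology in KH_TOPOLOGIES.items():
        if order == topology:
            return name
    return None


def adjacent_helix_pair(elements: Sequence[str]) -> Optional[Tuple[int, int]]:
    """Return the numbers of the first two consecutive helices, if any."""
    helix_number = 0
    previous_was_helix = False
    for element in elements:
        if element == "alpha":
            helix_number += 1
            if previous_was_helix:
                return (helix_number - 1, helix_number)
            previous_was_helix = True
        else:
            previous_was_helix = False
    return None


for name, topology in KH_TOPOLOGIES.items():
    print(name, classify_kh(topology), adjacent_helix_pair(topology))
```

For the two topologies listed, the adjacent helix pairs come out as helices 1 and 2 for type-1 and helices 2 and 3 for type-2, matching the motif locations stated in the text.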
More information about these proteins can be found at Protein of the Month: RNA Exosomes.
The ankyrin repeat is one of the most common protein-protein interaction motifs in nature. Ankyrin repeats are tandemly repeated modules of about 33 amino acids. They occur in a large number of functionally diverse proteins mainly from eukaryotes. The few known examples from prokaryotes and viruses may be the result of horizontal gene transfers. The repeat has been found in proteins of diverse function such as transcriptional initiators, cell-cycle regulators, cytoskeletal proteins, ion transporters and signal transducers. The ankyrin fold appears to be defined by its structure rather than its function since there is no specific sequence or structure which is universally recognised by it.
The conserved fold of the ankyrin repeat unit is known from several crystal and solution structures. Each repeat folds into a helix-loop-helix structure with a beta-hairpin/loop region projecting out from the helices at a 90° angle. The repeats stack together to form an L-shaped structure.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents RING-type zinc finger domains. The RING-finger is a specialised type of Zn-finger of 40 to 60 residues that binds two atoms of zinc, and is probably involved in mediating protein-protein interactions. There are two different variants, the C3HC4-type and the C3H2C3-type, which are clearly related despite the different cysteine/histidine pattern. The latter type is sometimes referred to as 'RING-H2 finger'. The RING domain is a protein interaction domain that has been implicated in a range of diverse biological processes. E3 ubiquitin-protein ligase activity is intrinsic to the RING domain of c-Cbl and is likely to be a general function of this domain. E3 ubiquitin-protein ligases determine the substrate specificity for ubiquitylation and have been classified into HECT and RING-finger families. More recently, however, U-box proteins, which contain a domain (the U box) of about 70 amino acids that is conserved from yeast to humans, have been identified as a new type of E3. Various RING fingers also exhibit binding to E2 ubiquitin-conjugating enzymes (Ubc's).
Several 3D-structures for RING-fingers are known. The 3D structure of the zinc ligation system is unique to the RING domain and is referred to as the 'cross-brace' motif. The spacing of the cysteines in such a domain is C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)-C-x(2)-C-x(4 to 48)-C-x(2)-C. Metal ligand pairs one and three co-ordinate to bind one zinc ion, whilst pairs two and four bind the second, as illustrated in the following schematic representation:
Note that in the older literature, some RING-fingers are denoted as LIM-domains. The LIM-domain Zn-finger is a fundamentally different family, albeit with similar Cys-spacing.
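The cysteine/histidine spacing quoted above for the RING domain can be converted mechanically into a regular expression. The Python sketch below does this for the spacing string exactly as written in the text; the function name, the token grammar and the toy sequence are assumptions made for illustration. As the note on LIM-domains implies, a spacing match alone cannot distinguish a RING-finger from other zinc fingers with similar Cys-spacing.

```python
import re

# Spacing string copied from the text.
RING_SPACING = (
    "C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)-C-x(2)-C-x(4 to 48)-C-x(2)-C"
)


def spacing_to_regex(spacing: str) -> str:
    """Translate 'C-x(2)-C-x(9 to 39)-...' into a regular expression."""
    parts = []
    for token in spacing.split("-"):
        gap = re.fullmatch(r"x\((\d+)(?: to (\d+))?\)", token)
        if gap:
            low, high = gap.group(1), gap.group(2)
            # x(n) becomes .{n}; x(a to b) becomes .{a,b}
            parts.append(".{%s,%s}" % (low, high) if high else ".{%s}" % low)
        elif re.fullmatch(r"[A-Z]", token):
            parts.append(token)  # a fixed residue such as C or H
        else:
            raise ValueError("unrecognised token: %r" % token)
    return "".join(parts)


RING_REGEX = spacing_to_regex(RING_SPACING)
print(RING_REGEX)

# Hypothetical toy sequence built to satisfy the spacing; not a real protein.
toy_sequence = (
    "C" + "A" * 2 + "C" + "A" * 10 + "C" + "A" * 2 + "H" + "A" * 2
    + "C" + "A" * 2 + "C" + "A" * 10 + "C" + "A" * 2 + "C"
)
print(re.search(RING_REGEX, toy_sequence) is not None)
```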
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Thrombospondins are multimeric multidomain glycoproteins that function at cell surfaces and in the extracellular matrix milieu. They act as regulators of cell interactions in vertebrates. They are divided into two subfamilies, A and B, according to their overall molecular organisation. The subgroup A proteins TSP-1 and -2 contain an N-terminal domain, a VWFC domain, three TSP1 repeats, three EGF-like domains, TSP3 repeats and a C-terminal domain. They are assembled as trimers. The subgroup B thrombospondins, designated TSP-3, -4, and COMP (cartilage oligomeric matrix protein, also designated TSP-5), are distinct in that they contain unique N-terminal regions, lack the VWFC domain and TSP1 repeats, contain four copies of EGF-like domains, and are assembled as pentamers. EGF, TSP3 repeats and the C-terminal domain are thus the hallmark of a thrombospondin.
This repeat was first described in 1986 by Lawler and Hynes. It was found in the thrombospondin protein, where it is repeated 3 times. A number of proteins involved in the complement pathway (properdin, C6, C7, C8A, C8B, C9), as well as extracellular matrix proteins such as mindin, F-spondin and SCO-spondin, and even the circumsporozoite surface protein 2 and TRAP proteins of Plasmodium, are now known to contain one or more instances of this repeat. It has been implicated in cell-cell interaction, inhibition of angiogenesis and apoptosis.
The intron-exon organisation of the properdin gene confirms the hypothesis that the repeat might have evolved by a process involving exon shuffling. A study of properdin structure provides some information about the structure of the thrombospondin type I repeat.
Lipoxygenases are a class of iron-containing dioxygenases which catalyse the hydroperoxidation of lipids containing a cis,cis-1,4-pentadiene structure. They are common in plants, where they may be involved in a number of diverse aspects of plant physiology including growth and development, pest resistance, and senescence or responses to wounding. In mammals a number of lipoxygenase isozymes are involved in the metabolism of prostaglandins and leukotrienes. Sequence data is available for the following lipoxygenases:
The iron atom in lipoxygenases is bound by four ligands, three of which are histidine residues. Six histidines are conserved in all lipoxygenase sequences; five of them are found clustered in a stretch of 40 amino acids. This region contains two of the three iron-ligands; the other histidines have been shown to be important for the activity of lipoxygenases.
This entry represents a domain found in lipoxygenases and other enzymes. Known as the PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology) domain, it is found in a variety of membrane- or lipid-associated proteins. Structurally, this domain forms a beta-sandwich composed of two sheets of four strands each. The most highly conserved regions coincide with the beta-strands, with most of the highly conserved residues being buried within the protein. An exception to this is a lysine or arginine that occurs on the surface of the fifth beta-strand of the eukaryotic domains. In pancreatic lipase, the lysine in this position forms a salt bridge with the procolipase protein. The conservation of a charged surface residue may indicate the location of a conserved ligand-binding site. It is thought that this domain may mediate membrane attachment via other protein binding partners.
Calmodulin (CaM) is recognized as a major calcium sensor and orchestrator of regulatory events through its interaction with a diverse group of cellular proteins. Three classes of recognition motifs exist for many of the known CaM binding proteins: the IQ motif as a consensus for Ca2+-independent binding, and two related motifs for Ca2+-dependent binding, termed 1-8-14 and 1-5-10 based on the position of conserved hydrophobic residues.
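Because the Ca2+-dependent motifs are named after the positions of their conserved hydrophobic residues, a candidate site can be located by sliding a window along the sequence and testing only those positions. The Python sketch below does this for the 1-5-10 spacing; the function name, the toy peptide and, in particular, the set of residues counted as hydrophobic are assumptions for illustration, since the text does not define them.

```python
from typing import List, Sequence

# Assumption: residues treated as hydrophobic for this illustration.
HYDROPHOBIC = set("AFILMVWY")


def find_spaced_hydrophobics(
    sequence: str, offsets: Sequence[int] = (1, 5, 10)
) -> List[int]:
    """Return 1-based window starts where every listed position is hydrophobic.

    `offsets` are 1-based positions within the window, e.g. (1, 5, 10).
    """
    seq = sequence.upper()
    span = max(offsets)
    hits = []
    for start in range(len(seq) - span + 1):
        if all(seq[start + offset - 1] in HYDROPHOBIC for offset in offsets):
            hits.append(start + 1)
    return hits


# Hypothetical toy peptide with leucines at window positions 1, 5 and 10.
toy_peptide = "KKLKKKLKKKKLKK"
print(find_spaced_hydrophobics(toy_peptide))
```

Passing a different tuple of offsets adapts the same scan to another spacing. Published definitions of these motifs also consider features such as net charge, which this sketch does not model.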
The regulatory domain of scallop myosin is a three-chain protein complex that switches on this motor in response to Ca2+ binding. Side-chain interactions link the two light chains in tandem to adjacent segments of the heavy chain bearing the IQ-sequence motif. The Ca2+-binding site is a novel EF-hand motif on the essential light chain and is stabilized by linkages involving the heavy chain and both light chains, accounting for the requirement of all three chains for Ca2+ binding and regulation in the intact myosin molecule.
Many eukaryotic proteins containing one or more copies of a putative RNA-binding domain of about 90 amino acids are known to bind single-stranded RNAs. The largest group of single-stranded RNA-binding proteins is the eukaryotic RNA recognition motif (RRM) family, which contains an eight amino acid RNP-1 consensus sequence. RRM proteins have a variety of RNA binding preferences and functions, and include heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing (SR, U2AF, Sxl), protein components of small nuclear ribonucleoproteins (U1 and U2 snRNPs), and proteins that regulate RNA stability and translation (PABP, La, Hu). The heterodimeric splicing factor U2 snRNP auxiliary factor (U2AF) appears to have two RRM-like domains with specialised features for protein recognition. The motif also appears in a few single-stranded DNA-binding proteins.
The typical RRM consists of four anti-parallel beta-strands and two alpha-helices arranged in a beta-alpha-beta-beta-alpha-beta fold with side chains that stack with RNA bases. Specificity of RNA binding is determined by multiple contacts with surrounding amino acids. A third helix is present during RNA binding in some cases. The RRM is reviewed in a number of publications.
The sterile alpha motif (SAM) domain is a putative protein interaction module present in a wide variety of proteins involved in many biological processes. The SAM domain, which spans around 70 residues, is found in diverse eukaryotic organisms. SAM domains have been shown to homo- and hetero-oligomerise, forming multiple self-association architectures, and also to bind various non-SAM domain-containing proteins, albeit with low affinity. SAM domains also appear to possess the ability to bind RNA. Smaug, a protein that helps to establish a morphogen gradient in Drosophila embryos by repressing the translation of nanos (nos) mRNA, binds to the 3' untranslated region (UTR) of nos mRNA via two similar hairpin structures. The 3D crystal structure of the Smaug RNA-binding region shows a cluster of positively charged residues on the Smaug-SAM domain, which could be the RNA-binding surface. This electropositive potential is unique among all previously determined SAM-domain structures and is conserved among Smaug-SAM homologs. These results suggest that the SAM domain might have a primary role in RNA binding.
Structural analyses show that the SAM domain is arranged in a small five-helix bundle with two large interfaces. In the case of the SAM domain of EphB2, each of these interfaces is able to form dimers. The presence of these two distinct intermonomer binding surfaces suggests that SAM could form extended polymeric structures.
PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 µM. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated.
PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. The peptide ligand binds in an elongated surface groove as an anti-parallel beta-strand that interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.
This entry describes a family of small GTPase-activating proteins, for example ARF1-directed GTPase-activating protein and the cycle control GTPase-activating protein (GAP) GCS1, which is important for the regulation of the ADP ribosylation factor ARF, a member of the Ras superfamily of GTP-binding proteins. The GTP-bound form of ARF is essential for the maintenance of normal Golgi morphology; it participates in the recruitment of coat proteins, which are required for budding and fission of membranes. Before fusion with an acceptor compartment, the membrane must be uncoated. This step requires the hydrolysis of GTP associated with ARF. These proteins contain a characteristic zinc finger motif (Cys-x2-Cys-x(16,17)-Cys-x2-Cys) which displays some similarity to the C4-type GATA zinc finger. The ARFGAP domain displays no obvious similarity to other GAP proteins.
The 3D structure of the ARFGAP domain of the PYK2-associated protein beta has been solved. It consists of a three-stranded beta-sheet surrounded by 5 alpha helices. The domain is organised around a central zinc atom which is coordinated by 4 cysteines. The ARFGAP domain is clearly unrelated to the structures of other GAP proteins, which are exclusively helical. Classical GAP proteins accelerate GTPase activity by supplying an arginine finger to the active site. The crystal structure of ARFGAP bound to ARF revealed that the ARFGAP domain does not supply an arginine to the active site, which suggests a more indirect role for the ARFGAP domain in GTP hydrolysis.
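As a hedged illustration of how the ARFGAP zinc finger motif could be searched for in a sequence, the sketch below assumes the four-cysteine form Cys-x2-Cys-x(16,17)-Cys-x2-Cys, consistent with the four zinc-coordinating cysteines described above, and translates it into a regular expression. The function name and the toy sequence are hypothetical; a pattern match alone does not establish that a protein contains an ARFGAP domain.

```python
import re

# Cys-x2-Cys-x(16,17)-Cys-x2-Cys written as a regular expression;
# "." stands for any residue in one-letter code.
ARFGAP_ZNF = re.compile(r"C.{2}C.{16,17}C.{2}C")

def find_arfgap_znf(sequence):
    """Return (start, matched_text) for each non-overlapping motif match."""
    return [(m.start(), m.group()) for m in ARFGAP_ZNF.finditer(sequence.upper())]

# Hypothetical toy sequence, built only to exercise the pattern.
toy = "MA" + "CAAC" + "G" * 16 + "CAAC" + "KL"
print(find_arfgap_znf(toy))
```

Start positions are 0-based, following Python string indexing.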
The Rev protein of human immunodeficiency virus type 1 (HIV-1) facilitates nuclear export of unspliced and partly-spliced viral RNAs. Rev contains an RNA-binding domain and an effector domain; the latter is believed to interact with a cellular cofactor required for the Rev response and hence HIV-1 replication. Human Rev interacting protein (hRIP) specifically interacts with the Rev effector. The amino acid sequence of hRIP is characterised by an N-terminal, C-4 class zinc finger motif.
High mobility group (HMG or HMGB) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG1 (also called HMG-T in fish) and HMG2 are two highly related proteins that bind single-stranded DNA preferentially and unwind double-stranded DNA. Although they have no sequence specificity, they have a high affinity for bent or distorted DNA, and bend linear DNA. HMG1 and HMG2 contain two DNA-binding HMG-box domains (A and B) that show structural and functional differences, and have a long acidic C-terminal domain rich in aspartic and glutamic acid residues. The acidic tail modulates the affinity of the tandem HMG boxes in HMG1 and 2 for a variety of DNA targets. HMG1 and 2 appear to play important architectural roles in the assembly of nucleoprotein complexes in a variety of biological processes, for example V(D)J recombination, the initiation of transcription, and DNA repair.
The profile in this entry describing the HMG-domains is much more general than the signature. In addition to the HMG1 and HMG2 proteins, HMG-domains occur in single or multiple copies in the following protein classes: the SOX family of transcription factors; SRY sex determining region Y protein and related proteins; LEF1 lymphoid enhancer binding factor 1; SSRP recombination signal recognition protein; MTF1 mitochondrial transcription factor 1; UBF1/2 nucleolar transcription factors; Abf2 yeast ARS-binding factor; and Saccharomyces cerevisiae transcription factors Ixr1, Rox1, Nhp6a, Nhp6b and Spp41.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents B-box-type zinc finger domains, which are around 40 residues in length. B-box zinc fingers can be divided into two groups, where types 1 and 2 B-box domains differ in their consensus sequence and in the spacing of the 7-8 zinc-binding residues. Several proteins contain both types 1 and 2 B-boxes, suggesting some level of cooperativity between these two domains. B-box domains are found in over 1500 proteins from a variety of organisms. They are found in TRIM (tripartite motif) proteins that consist of an N-terminal RING finger (originally called an A-box), followed by 1-2 B-box domains and a coiled-coil domain (also called RBCC for Ring, B-box, Coiled-Coil). TRIM proteins contain a type 2 B-box domain, and may also contain a type 1 B-box. In proteins that do not contain RING or coiled-coil domains, the B-box domain is primarily type 2. Many type 2 B-box proteins are involved in ubiquitinylation. Proteins containing a B-box zinc finger domain include transcription factors, ribonucleoproteins and proto-oncoproteins; for example, MID1, MID2, TRIM9, TNL, TRIM36, TRIM63, TRIFIC, NCL1 and CONSTANS-like proteins.
The microtubule-associated E3 ligase MID1 contains a type 1 B-box zinc finger domain. MID1 specifically binds Alpha-4, which in turn recruits the catalytic subunit of phosphatase 2A (PP2Ac). This complex is required for targeting of PP2Ac for proteasome-mediated degradation. The MID1 B-box coordinates two zinc ions and adopts a beta/beta/alpha cross-brace structure similar to that of ZZ, PHD, RING and FYVE zinc fingers.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Guanylate cyclases catalyse the formation of cyclic GMP (cGMP) from GTP. cGMP acts as an intracellular messenger, activating cGMP-dependent kinases and regulating cGMP-sensitive ion channels. The role of cGMP as a second messenger in vascular smooth muscle relaxation and retinal photo-transduction is well established. Guanylate cyclase is found both in the soluble and particulate fractions of eukaryotic cells. The soluble and plasma membrane-bound forms differ in structure, regulation and other properties. Most currently known plasma membrane-bound forms are receptors for small polypeptides. The soluble forms of guanylate cyclase are cytoplasmic heterodimers having alpha and beta subunits.
In all characterised eukaryote guanylyl- and adenylyl cyclases, cyclic nucleotide synthesis is carried out by the conserved class III cyclase domain.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
The S1 domain was originally identified in ribosomal protein S1 but is found in a large number of RNA-associated proteins. The structure of the S1 RNA-binding domain from the Escherichia coli polynucleotide phosphorylase has been determined using NMR methods and consists of a five-stranded antiparallel beta barrel. Conserved residues on one face of the barrel and adjacent loops form the putative RNA-binding site.
The structure of the S1 domain is very similar to that of cold shock proteins. This suggests that they may both be derived from an ancient nucleic acid-binding protein.
More information about these proteins can be found at Protein of the Month: RNA Exosomes.
The post-translational attachment of ubiquitin to proteins (ubiquitinylation) alters the function, location or trafficking of a protein, or targets it to the 26S proteasome for degradation. Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), and a ubiquitin ligase (E3), which work sequentially in a cascade. The E1 enzyme mediates an ATP-dependent transfer of a thioester-linked ubiquitin molecule to a cysteine residue on the E2 enzyme. The E2 enzyme then either transfers the ubiquitin moiety directly to a substrate, or to an E3 ligase, which can also ubiquitinylate a substrate.
There are several different E2 enzymes (over 30 in humans), which are broadly grouped into four classes, all of which have a core catalytic domain (containing the active site cysteine), and some of which have short N- and C-terminal amino acid extensions: class I enzymes consist of just the catalytic core domain (UBC), class II possess a UBC and a C-terminal extension, class III possess a UBC and an N-terminal extension, and class IV possess a UBC and both N- and C-terminal extensions. These extensions appear to be important for some subfamily function, including E2 localisation and protein-protein interactions. In addition, there are proteins with an E2-like fold that are devoid of catalytic activity, but which appear to assist in poly-ubiquitin chain formation.
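The four-class scheme above reduces to two yes/no questions: does the protein extend beyond the UBC core at the N-terminus, and does it at the C-terminus? A minimal sketch follows, assuming 1-based inclusive coordinates for the UBC domain. The function name and the `min_extension` cut-off of 10 residues are illustrative assumptions, not values taken from the text.

```python
def e2_class(seq_length, ubc_start, ubc_end, min_extension=10):
    """Classify an E2 enzyme by which termini extend beyond the UBC core.

    Coordinates are 1-based and inclusive. min_extension is an arbitrary,
    illustrative cut-off for what counts as an "extension".
    """
    n_ext = (ubc_start - 1) >= min_extension          # residues before the UBC
    c_ext = (seq_length - ubc_end) >= min_extension   # residues after the UBC
    if n_ext and c_ext:
        return "IV"   # UBC plus both N- and C-terminal extensions
    if n_ext:
        return "III"  # UBC plus N-terminal extension
    if c_ext:
        return "II"   # UBC plus C-terminal extension
    return "I"        # catalytic core only

# Hypothetical domain coordinates, for illustration only.
print(e2_class(seq_length=150, ubc_start=3, ubc_end=148))
print(e2_class(seq_length=300, ubc_start=60, ubc_end=210))
```

In practice the cut-off would need to be chosen with care, since short flanking segments may be part of the core fold rather than a functional extension.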
This entry represents ZZ-type zinc finger domains, named because of their ability to bind two zinc ions. These domains contain 4-6 Cys residues that participate in zinc binding (plus additional Ser/His residues), including a Cys-X2-Cys motif found in other zinc finger domains. These zinc fingers are thought to be involved in protein-protein interactions. The structure of the ZZ domain shows that it belongs to the family of cross-brace zinc finger motifs that include the PHD, RING, and FYVE domains. ZZ-type zinc finger domains are found in:
Single copies of the ZZ zinc finger occur in the transcriptional adaptor/coactivator proteins P300, cAMP response element-binding protein (CREB)-binding protein (CBP) and ADA2. CBP provides several binding sites for transcriptional coactivators. The site on CBP/P300 that interacts with the tumour suppressor protein p53 and the oncoprotein E1A is a Cys-rich region that incorporates two zinc-binding motifs: ZZ-type and TAZ2-type. The ZZ-type zinc finger of CBP contains two twisted anti-parallel beta-sheets and a short alpha-helix, and binds two zinc ions. One zinc ion is coordinated by four cysteine residues via 2 Cys-X2-Cys motifs, and the second zinc ion via a third Cys-X-Cys motif and a His-X-His motif. The first zinc cluster is strictly conserved, whereas the second zinc cluster displays variability in the position of the two His residues.
In Arabidopsis thaliana (Mouse-ear cress), the hypersensitive to red and blue 1 (Hrb1) protein, which regulates both red and blue light responses, contains a ZZ-type zinc finger domain.
ZZ-type zinc finger domains have also been identified in the testis-specific E3 ubiquitin ligase MEX that promotes death receptor-induced apoptosis. MEX has four putative zinc finger domains: one ZZ-type, one SWIM-type and two RING-type. The region containing the ZZ-type and RING-type zinc fingers is required for interaction with UbcH5a and MEX self-association, whereas the SWIM domain was critical for MEX ubiquitination.
In addition, the Cys-rich domains of dystrophin, utrophin and an 87kDa post-synaptic protein contain a ZZ-type zinc finger with high sequence identity to P300/CBP ZZ-type zinc fingers. In dystrophin and utrophin, the ZZ-type zinc finger lies between a WW domain (flanked by an EF hand) and the C-terminal coiled-coil domain. Dystrophin is thought to act as a link between the actin cytoskeleton and the extracellular matrix, and perturbations of the dystrophin-associated complex, for example, between dystrophin and the transmembrane glycoprotein beta-dystroglycan, may lead to muscular dystrophy. Dystrophin and its autosomal homologue utrophin interact with beta-dystroglycan via their C-terminal regions, which are comprised of a WW domain, an EF hand domain and a ZZ-type zinc finger domain. The WW domain is the primary site of interaction between dystrophin or utrophin and dystroglycan, while the EF hand and ZZ-type zinc finger domains stabilise and strengthen this interaction.
The breast cancer type 2 susceptibility protein has a number of 39 amino acid repeats that are critical for binding to RAD51 (a key protein in DNA recombinational repair) and resistance to methyl methanesulphonate treatment. BRCA2 is a breast tumour suppressor with a potential function in the cellular response to DNA damage. At the cellular level, expression is regulated in a cell-cycle dependent manner and peak expression of BRCA2 mRNA is found in S phase, suggesting BRCA2 may participate in regulating cell proliferation. There are eight repeats in BRCA2 designated as BRC1 to BRC8. BRC1, BRC2, BRC3, BRC4, BRC7, and BRC8 are highly conserved and bind to Rad51, whereas BRC5 and BRC6 are less well conserved and do not bind to Rad51. It has been suggested that BRCA2 plays a role in positioning Rad51 at the site of DNA repair or in removing Rad51 from DNA once repair has been completed.
Although apparently functionally unrelated, intracellular TRAFs and extracellular meprins share a conserved region of about 180 residues, the meprin and TRAF homology (MATH) domain. Meprins are mammalian tissue-specific metalloendopeptidases of the astacin family implicated in developmental, normal and pathological processes by hydrolysing a variety of proteins. Various growth factors, cytokines, and extracellular matrix proteins are substrates for meprins. They are composed of five structural domains: an N-terminal endopeptidase domain, a MAM domain, a MATH domain, an EGF-like domain and a C-terminal transmembrane region. Meprin A and B form membrane-bound homotetramers, whereas homooligomers of meprin A are secreted. A proteolytic site adjacent to the MATH domain, present only in meprin A, allows the release of the protein from the membrane.
TRAF proteins were first isolated by their ability to interact with TNF receptors. They promote cell survival by the activation of downstream protein kinases and, finally, transcription factors of the NF-kB and AP-1 family. The TRAF proteins are composed of 3 structural domains: a RING finger in the N-terminal part of the protein, one to seven TRAF zinc fingers in the middle and the MATH domain in the C-terminal part. The MATH domain is necessary and sufficient for self-association and receptor interaction. From the structural analysis, two consensus sequences recognised by the TRAF domain have been defined: a major one, [PSAT]x[QE]E, and a minor one, PxQxxD.
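The two consensus sequences can be read as simple patterns: square brackets list the residues allowed at a position and x stands for any residue. A minimal sketch of scanning a peptide for both follows; the dictionary and function names and the example peptides are hypothetical, and a match indicates only a candidate site, not demonstrated TRAF binding.

```python
import re

# Consensus patterns translated to regular expressions:
# "x" (any residue) becomes "."; bracketed alternatives carry over directly.
TRAF_MOTIFS = {
    "major": r"[PSAT].[QE]E",
    "minor": r"P.Q..D",
}

def scan_traf_motifs(sequence):
    """Return {motif_name: [(start, matched_text), ...]} with 0-based starts."""
    seq = sequence.upper()
    hits = {}
    for name, pattern in TRAF_MOTIFS.items():
        # Wrapping the pattern in a lookahead also reports overlapping matches.
        lookahead = re.compile("(?=(" + pattern + "))")
        hits[name] = [(m.start(), m.group(1)) for m in lookahead.finditer(seq)]
    return hits

# Hypothetical peptides, chosen only to exercise each pattern.
print(scan_traf_motifs("GGPVQETGG"))
print(scan_traf_motifs("AAPTQLKDAA"))
```

Short patterns such as these occur by chance in many proteins, so hits would normally be filtered by context, for example location in a receptor cytoplasmic tail.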
The structure of the TRAF2 protein reveals a trimeric self-association of the MATH domain. The domain forms a new, eight-stranded antiparallel beta sandwich structure. A coiled-coil region adjacent to the MATH domain is also important for the trimerisation. The oligomerisation is essential for establishing appropriate connections to form signalling complexes with TNF receptor-1. The ligand binding surface of TRAF proteins is located in beta-strands 6 and 7.
The sterol-sensing domain (SSD) consists of approximately 180 amino acids organised into a cluster of five consecutive membrane-spanning domains and is found in proteins which have key roles in different aspects of cholesterol homeostasis or cholesterol-linked signalling such as sterol-regulated movement or the trafficking of specific cargoes. Examples of proteins containing SSDs include the Hedgehog signalling protein (Patched protein) from Drosophila; 3-hydroxy-3-methylglutaryl coenzyme A reductase (HMGCR), which is involved in the control of cholesterol biosynthesis; SREBP cleavage-activating protein (SCAP); the Niemann-Pick type C (NPC1) protein; and a number of bacterial drug resistance proteins.
The role of the SSD is still open to debate. The domain may either bind directly to sterols, sterol-modified proteins or proteins that change conformation in response to sterol levels, or trigger an intramolecular response to sterols.
C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets.
This entry represents the classical C2H2 type zinc finger domain.
This entry represents the CysCysHisCys (CCHC) type zinc finger domains, which have the sequence:
where X can be any amino acid, and the number indicates the number of residues. These 18-residue CCHC zinc finger domains are mainly found in the nucleocapsid protein of retroviruses, where they are required for viral genome packaging and for the early infection process. They are also found in eukaryotic proteins involved in RNA binding or single-stranded DNA binding.
Ribosomal protein S13 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S13 is known to be involved in binding fMet-tRNA and, hence, in the initiation of translation. It is a basic protein of 115 to 177 amino-acid residues. This family of ribosomal proteins is present in procaryotes and eukaryotes.
This domain belongs to a more diverse superfamily, including catalytic domain of the mRNA capping enzyme and NAD-dependent DNA ligase.
The recA gene product is a multifunctional enzyme that plays a role in homologous recombination, DNA repair and induction of the SOS response. In homologous recombination, the protein functions as a DNA-dependent ATPase, promoting synapsis, heteroduplex formation and strand exchange between homologous DNAs. RecA also acts as a protease cofactor that promotes autodigestion of the lexA product and phage repressors. The proteolytic inactivation of the lexA repressor by an activated form of recA may cause a derepression of the 20 or so genes involved in the SOS response, which regulates DNA repair, induced mutagenesis, delayed cell division and prophage induction in response to DNA damage.
RecA is a protein of about 350 amino-acid residues. Its sequence is very well conserved among eubacterial species. It is also found in the chloroplast of plants. RecA-like proteins are found in archaea and diverse eukaryotic organisms, like fission yeast, mouse or human. In the filament visualised by X-ray crystallography, β-strand 3, the loop C-terminal to β-strand 2, and alpha-helix D of the core domain form one surface that packs against alpha-helix A and β-strand 0 (the N-terminal domain) of an adjacent monomer during polymerisation. The core ATP-binding site domain is well conserved, with 14 invariant residues. It contains the nucleotide binding loop between β-strand 1 and alpha-helix C. The Escherichia coli sequence GPESSGKT matches the consensus sequence of amino acids (G/A)XXXXGK(T/S) for the Walker A box (also referred to as the P-loop) found in a number of nucleoside triphosphate (NTP)-binding proteins. Another nucleotide binding motif, the Walker B box, is found at β-strand 4 in the RecA structure. The Walker B box is characterised by four hydrophobic amino acids followed by an acidic residue (usually aspartate). Nucleotide specificity and additional ATP binding interactions are contributed by the amino acid residues at β-strand 2 and the loop C-terminal to that strand, all of which are greater than 90% conserved among bacterial RecA proteins.
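As a rough illustration of the two motifs described above, the Walker A consensus (G/A)XXXXGK(T/S) and the Walker B description (four hydrophobic residues followed by an acidic residue) can be written as regular expressions. This is only a sketch: the set of residues treated as hydrophobic, the helper name find_motif and the toy Walker B peptide are assumptions made for the example, not part of the entry.

```python
import re

# Walker A (P-loop) consensus as given in the text: (G/A)XXXXGK(T/S)
WALKER_A = re.compile(r"[GA].{4}GK[TS]")

# Walker B as described in the text: four hydrophobic residues, then an
# acidic residue. The hydrophobic set below is an assumption for illustration.
HYDROPHOBIC = "AVLIMFWY"
WALKER_B = re.compile(rf"[{HYDROPHOBIC}]{{4}}[DE]")


def find_motif(pattern, sequence):
    """Return (start, matched_text) for every non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(sequence)]


# The E. coli P-loop sequence quoted in the text.
walker_a_hits = find_motif(WALKER_A, "GPESSGKT")

# Hypothetical toy peptide, not a real RecA fragment.
walker_b_hits = find_motif(WALKER_B, "KGSLVIVDS")

print(walker_a_hits)
print(walker_b_hits)
```

Patterns this short will also match by chance in unrelated proteins, so a hit is best read as a prompt to check the structural context rather than as evidence of nucleotide binding.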
The exchange of macromolecules between the nucleus and cytoplasm takes place through nuclear pore complexes within the nuclear membrane. Active transport of large molecules through these pore complexes requires carrier proteins, called karyopherins (importins and exportins), which shuttle between the two compartments.
Members of the importin-beta (karyopherin-beta) family can bind and transport cargo by themselves, or can form heterodimers with importin-alpha. As part of a heterodimer, importin-beta mediates interactions with the pore complex, while importin-alpha acts as an adaptor protein to bind the nuclear localisation signal (NLS) on the cargo through the classical NLS import of proteins. Importin-beta is a helicoidal molecule constructed from 19 HEAT repeats. Many nuclear pore proteins contain FG sequence repeats that can bind to HEAT repeats within importins, which is important for importin-beta mediated transport.
Ran GTPase helps to control the unidirectional transfer of cargo. The cytoplasm contains primarily RanGDP and the nucleus RanGTP through the actions of RanGAP and RanGEF, respectively. In the nucleus, RanGTP binds to importin-beta within the importin/cargo complex, causing a conformational change in importin-beta that releases it from importin-alpha-bound cargo. As a result, the N-terminal auto-inhibitory region on importin-alpha is free to loop back and bind to the major NLS-binding site, causing the cargo to be released. There are additional release factors as well.
More information about these proteins can be found at Protein of the Month: Importins.
C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
A specific C2H2 Zn-finger is conserved in matrin and several RNA-binding proteins. The Zn-finger follows the general pattern C-x2-C-x(12,16)-H-x5-H, and is different from the 'classical' DNA-binding C2H2 Zn-finger.
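The spacing pattern C-x2-C-x(12,16)-H-x5-H quoted above translates directly into a regular expression, which makes the allowed spacer lengths easy to check. The sketch below is illustrative only: the function names are hypothetical, and the test peptides are synthetic alanine-filled strings rather than real matrin sequences.

```python
import re

# C-x2-C-x(12,16)-H-x5-H written as a regular expression:
# "x" is any residue, and x(12,16) is a spacer of 12 to 16 residues.
MATRIN_ZNF = re.compile(r"C.{2}C.{12,16}H.{5}H")


def build_finger(spacer_len):
    """Build a synthetic peptide with the given Cys-to-His spacer length."""
    return "C" + "A" * 2 + "C" + "A" * spacer_len + "H" + "A" * 5 + "H"


def has_matrin_finger(sequence):
    """Return True if the sequence contains the matrin-type spacing pattern."""
    return MATRIN_ZNF.search(sequence) is not None


for spacer in (11, 12, 16, 17):
    print(spacer, has_matrin_finger(build_finger(spacer)))
```

Spacers of 12 and 16 residues fall inside the pattern, while 11 and 17 fall outside it.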
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
The BRCT domain (named after the C-terminal domain of a breast cancer susceptibility protein) is found predominantly in proteins involved in cell cycle checkpoint functions responsive to DNA damage, for example in the breast cancer DNA-repair protein BRCA1. The domain is an approximately 100 amino acid tandem repeat, which appears to act as a phospho-protein binding domain.
A chitin biosynthesis protein from yeast also seems to belong to this group.
This entry represents the N-terminal domain of UmuC-like DNA repair proteins. In Escherichia coli, UV and many chemicals appear to cause mutagenesis by a process of translesion synthesis that requires DNA polymerase III and the SOS-regulated proteins UmuD, UmuC and RecA. This machinery allows replication to continue through DNA lesions, and thereby avoids lethal interruption of DNA replication after DNA damage. UmuC is a well conserved protein in prokaryotes, with a homologue in yeast species.
Proteins currently known to belong to this family are listed below:
The armadillo (Arm) repeat is an approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila melanogaster segment polarity gene armadillo involved in signal transduction through wingless. Animal Arm-repeat proteins function in various processes, including intracellular signalling and cytoskeletal regulation, and include such proteins as beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumour suppressor protein, and the nuclear transport factor importin-alpha, amongst others. A subset of these proteins is conserved across eukaryotic kingdoms. In higher plants, some Arm-repeat proteins function in intracellular signalling like their mammalian counterparts, while others have novel functions.
The 3-dimensional fold of an armadillo repeat is known from the crystal structure of beta-catenin, where the 12 repeats form a superhelix of alpha helices with three helices per unit. The cylindrical structure features a positively charged groove, which presumably interacts with the acidic surfaces of the known interaction partners of beta-catenin.
Ran is an evolutionarily conserved member of the Ras superfamily of small GTPases that regulates all receptor-mediated transport between the nucleus and the cytoplasm. Import receptors bind their cargos in the cytoplasm where the concentration of RanGTP is low and release their cargos in the nucleus where the concentration of RanGTP is high. Export receptors respond to RanGTP in the opposite manner.
Nuclear transport factor 2 (NTF2) is a homodimer of approximately 14 kDa subunits which stimulates efficient nuclear import of a cargo protein. NTF2 binds to both RanGDP and FxFG repeat-containing nucleoporins. NTF2 binds to RanGDP sufficiently strongly for the complex to remain intact during transport through NPCs, but the interaction between NTF2 and FxFG nucleoporins is much more transient, which would enable NTF2 to move through the NPC by hopping from one repeat to another.
NTF2 folds into a cone with a deep hydrophobic cavity, the opening of which is surrounded by several negatively charged residues. RanGDP binds to NTF2 by inserting a conserved phenylalanine residue into the hydrophobic pocket of NTF2 and making electrostatic interactions with the conserved negatively charged residues that surround the cavity.
This entry contains predominantly eukaryotic proteins. The following proteins contain a region similar to NTF2:
The FYVE zinc finger is named after four proteins that it has been found in: Fab1, YOTB/ZK632.12, Vac1, and EEA1. The FYVE finger has been shown to bind two Zn2+ ions. The FYVE finger has eight potential zinc coordinating cysteine positions. Many members of this family also include two histidines in a motif R+HHC+XCG, where + represents a charged residue and X any residue.
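The R+HHC+XCG motif can likewise be expressed as a regular expression once "+" is given a concrete meaning. The sketch below assumes that "charged" means D, E, K or R (histidine is deliberately left out, which is a choice made for this example and not stated in the entry); the function name and the two test peptides are hypothetical.

```python
import re

# Assumption for this sketch: "+" (charged residue) means D, E, K or R.
CHARGED = "DEKR"

# R+HHC+XCG, where + is a charged residue and X is any residue.
FYVE_MOTIF = re.compile(rf"R[{CHARGED}]HHC[{CHARGED}].CG")


def fyve_motif_positions(sequence):
    """Return the 0-based start position of every R+HHC+XCG match."""
    return [m.start() for m in FYVE_MOTIF.finditer(sequence)]


# Hypothetical toy peptides, not taken from any database entry.
with_motif = "MSRKHHCRACGLV"
without_motif = "MSRAHHCRACGLV"  # position "+" holds alanine, which is uncharged

print(fyve_motif_positions(with_motif))
print(fyve_motif_positions(without_motif))
```

Because many, but not all, family members carry the two histidines, the absence of a match does not rule out a FYVE finger.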
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
The VHS domain is a domain of ~140 residues, whose name is derived from its occurrence in VPS-27, Hrs and STAM. Based on regions surrounding the domain, VHS-proteins can be divided into 4 groups:
Resolution of the crystal structure of the VHS domain of Drosophila Hrs and human Tom1 revealed that it consists of eight helices arranged in a double-layer superhelix. The existence of conserved patches of residues on the domain surface suggests that VHS domains may be involved in protein-protein recognition and docking. Overall, sequence similarity is low (approx 25%) amongst domain family members.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.
AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.
GGAs (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) are a family of monomeric clathrin adaptor proteins that are conserved from yeasts to humans. GGAs regulate the clathrin-mediated transport of proteins (such as mannose 6-phosphate receptors) from the TGN to endosomes and lysosomes through interactions with TGN-sorting receptors, sometimes in conjunction with AP-1. GGAs bind cargo, membranes, clathrin and accessory factors. GGA1, GGA2 and GGA3 all contain a domain homologous to the ear domain of gamma-adaptin. GGAs are composed of a single polypeptide with four domains: an N-terminal VHS (Vps27p/Hrs/Stam) domain, a GAT (GGA and Tom1) domain, a hinge region, and a C-terminal GAE (gamma-adaptin ear) domain. The VHS domain is responsible for endocytosis and signal transduction, recognising transmembrane cargo through the ACLL sequence in the cytoplasmic domains of sorting receptors. The GAT domain (also found in Tom1 proteins) interacts with ARF (ADP-ribosylation factor) to regulate membrane trafficking, and with ubiquitin for receptor sorting. The hinge region contains a clathrin box for recognition and binding to clathrin, similar to that found in AP adaptins. The GAE domain is similar to the AP gamma-adaptin ear domain, and is responsible for the recruitment of accessory proteins that regulate clathrin-mediated endocytosis.
This entry represents a beta-sandwich structural motif found in the appendage (ear) domain of gamma1-adaptin from AP1 clathrin adaptor complex, and the homologous C-terminal GAE (gamma-adaptin ear) domain of GGA adaptor proteins. These domains have an immunoglobulin-like beta-sandwich fold containing 8 strands in 2 beta-sheets in a Greek key topology. This is a similar fold to that found in alpha- and beta-adaptins, but there is little sequence identity between them. The GAE domain is involved in the recruitment of accessory proteins, such as gamma-synergin, Rabaptin-5, Eps15 and cyclin G-associated kinase, which modulate the functions of GAE domain-containing proteins in membrane trafficking events. The binding site in GAE for accessory proteins is located in a shallow hydrophobic trough surrounded by charged (mainly basic) residues.
More information about these proteins can be found at Protein of the Month: Clathrin.
This is a domain of unknown function present in signalling proteins including dishevelled, Egl-10, and pleckstrin proteins. Segment polarity dishevelled protein is required to establish coherent arrays of polarized cells and segments in embryos, and plays a role in wingless signalling. Egl-10 regulates G-protein signalling in the central nervous system. Mammalian regulators of G-protein signalling also contain these domains, and regulate signal transduction by increasing the GTPase activity of G-protein alpha subunits, thereby driving them into their inactive GDP-bound form.
The B30.2-like domain is a conserved domain of 160-170 amino acids which is found in nuclear and cytoplasmic proteins, as well as transmembrane and secreted proteins. It was named after the B30-2 exon which maps within the Homo sapiens (Human) class I histocompatibility complex region and codes for a 166-amino-acid peptide similar to the C-terminal domain of human Sjoegren's syndrome nuclear antigen A/Ro (SS-A/Ro), ret finger protein (RFP), Xenopus laevis nuclear factor 7 (XNF7), and Bos taurus (Bovine) butyrophilin. The B30.2-like domain is found associated with different N-terminal domains: immunoglobulin domain in the case of butyrophilin, zinc-binding B-box domain in the case of RFP and SS-A/Ro and leucine zipper in the case of enterophilin. The function of the B30.2-like domain is not known, but the cytoplasmic B30.2-like domain of butyrophilin has been shown to interact with xanthine oxidase.
Other members of the family are transfer proteins, including a guanine nucleotide exchange factor that may function as an effector of RAC1, a phosphatidylinositol/phosphatidylcholine transfer protein that is required for the transport of secretory proteins from the Golgi complex, and an alpha-tocopherol transfer protein that enhances the transfer of the ligand between separate membranes.
The process of vesicular fusion with target membranes depends on a set of SNAREs (SNAP-Receptors), which are associated with the fusing membranes. Target SNAREs (t-SNAREs) are localised on the target membrane and belong to two different families, the syntaxin-like family and the SNAP-25 like family. One member of each family, together with a v-SNARE localised on the vesicular membrane, is required for fusion.
The Syntaxins are type-I transmembrane proteins that contain several regions with coiled-coil propensity in their cytosolic part, the SNARE motif. SNAP-25 is a protein consisting of two coiled-coil regions, which is associated with the membrane by lipid anchors. SNARE motifs assemble into parallel four helix bundles stabilised by the burial of the hydrophobic helix faces in the bundle core. Monomeric SNARE motifs are disordered, so this assembly reaction is accompanied by a dramatic increase in alpha-helical secondary structure. The parallel arrangement of SNARE motifs within complexes brings the transmembrane anchors, and the two membranes, into close proximity. Recently, it was shown that the two coiled-coil regions of SNAP-25 and one of the coiled-coil regions of the syntaxins are related. This domain is found in both Syntaxin and SNAP-25 families as well as in other proteins.
The many different actin cross-linking proteins share a common architecture, consisting of a globular actin-binding domain and an extended rod. Whereas their actin-binding domains consist of two calponin homology domains, their rods fall into three families.
The rod domain of the family including the Dictyostelium discoideum (Slime mould) gelation factor (ABP120) and human filamin (ABP280) is constructed from tandem repeats of a 100-residue motif that is glycine and proline rich. The gelation factor's rod contains 6 copies of the repeat, whereas filamin has a rod constructed from 24 repeats. The resolution of the 3D structure of rod repeats from the gelation factor has shown that they consist of a beta-sandwich, formed by two beta-sheets arranged in an immunoglobulin-like fold. Because conserved residues that form the core of the repeats are preserved in filamin, the repeat structure should be common to the members of the gelation factor/filamin family.
The head to tail homodimerisation is crucial to the function of the ABP120 and ABP280 proteins. This interaction involves a small portion at the distal end of the rod domains. For the gelation factor it has been shown that the carboxy-terminal repeat 6 dimerises through a double edge-to-edge extension of the beta-sheet and that repeat 5 contributes to dimerisation to some extent.
The PX (phox) domain occurs in a variety of eukaryotic proteins and has been implicated in highly diverse functions such as cell signalling, vesicular trafficking, protein sorting and lipid modification. PX domains are important phosphoinositide-binding modules that have varying lipid-binding specificities. The PX domain is approximately 120 residues long, and folds into a three-stranded beta-sheet followed by three alpha-helices and a proline-rich region that immediately precedes a membrane-interaction loop and spans approximately eight hydrophobic and polar residues. The PX domain of p47phox binds to the SH3 domain in the same protein. Phosphorylation of p47(phox), a cytoplasmic activator of the microbicidal phagocyte oxidase (phox), elicits interaction of p47(phox) with phosphoinositides. The protein phosphorylation-driven conformational change of p47(phox) enables its PX domain to bind to phosphoinositides, the interaction of which plays a crucial role in recruitment of p47(phox) from the cytoplasm to membranes and subsequent activation of the phagocyte oxidase. The lipid-binding activity of this protein is normally suppressed by intramolecular interaction of the PX domain with the C-terminal Src homology 3 (SH3) domain.
The PX domain is conserved from yeast to human. Multiple alignments of representative PX domain sequences show relatively little sequence conservation, although their structure appears to be highly conserved. Although phosphatidylinositol-3-phosphate (PtdIns(3)P) is the primary target of PX domains, binding to phosphatidic acid, phosphatidylinositol-3,4-bisphosphate (PtdIns(3,4)P2), phosphatidylinositol-3,5-bisphosphate (PtdIns(3,5)P2), phosphatidylinositol-4,5-bisphosphate (PtdIns(4,5)P2), and phosphatidylinositol-3,4,5-trisphosphate (PtdIns(3,4,5)P3) has been reported as well. The PX domain is also a protein-protein interaction domain.
Ran is an evolutionarily conserved member of the Ras superfamily that regulates all receptor-mediated transport between the nucleus and the cytoplasm. Ran Binding Protein 1 (RanBP1) has guanine nucleotide dissociation inhibitory (GDI) activity, specific for the GTP form of Ran, and also functions to stimulate Ran GTPase activating protein (GAP)-mediated GTP hydrolysis by Ran. RanBP1 thus helps either to keep the gradient of RanGTP across the nuclear envelope high (GDI activity) or to keep the cytoplasmic levels of RanGTP low (GAP cofactor).
All RanBP1 proteins contain an approx 150 amino acid residue Ran binding domain. RanBP1 binds directly to RanGTP with high affinity. There are four sites of contact between Ran and the Ran binding domain. One of these involves binding of the C-terminal segment of Ran to a groove on the Ran binding domain that is analogous to the surface utilised in the EVH1-peptide interaction. Nup358 contains four Ran binding domains. The structure of the first of these is known.
The "beige" mouse is established as an animal model of Chediak-Higashi Syndrome (CHS). The BEACH domain was described in the BEIGE protein (D1035670) and in the highly homologous CHS protein. It is also found in distantly related proteins, for example factors associated with neutral sphingomyelinase activation.
The BEACH domain is usually followed by a series of WD repeats. The function of the BEACH domain is unknown.
This entry represents the zinc finger domain found in RanBP2 proteins. Ran is an evolutionarily conserved member of the Ras superfamily that regulates all receptor-mediated transport between the nucleus and the cytoplasm. Ran binding protein 2 (RanBP2) is a 358-kDa nucleoporin located on the cytoplasmic side of the nuclear pore complex which plays a role in nuclear protein import. RanBP2 contains multiple zinc fingers which mediate binding to RanGDP.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Major sperm proteins (MSP) are central components in molecular interactions underlying sperm motility in Caenorhabditis elegans, whose sperm employ an amoeba-like crawling motion using an MSP-containing lamellipod, rather than the flagellar-based swimming motion associated with other sperm. These proteins oligomerise to form an extensive filament system that extends from sperm villipoda, along the leading edge of the pseudopod. About 30 MSP isoforms may exist in C. elegans.
MSPs form a fibrous network, whereby MSP dimers form helical subfilaments that coil around one another to produce filaments, which in turn form supercoils to produce bundles. The crystal structure of MSP from C. elegans reveals an immunoglobulin (Ig)-like seven-stranded beta sandwich fold.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.
This group of cysteine peptidases belongs to the MEROPS peptidase family C2 (calpain family, clan CA). A type example is calpain, which is an intracellular protease involved in many important cellular functions that are regulated by calcium. The protein is a complex of 2 polypeptide chains (light and heavy), with three known forms in mammals: a highly calcium-sensitive (i.e., micro-molar range) form known as mu-calpain, mu-CANP or calpain I; a form sensitive to calcium in the milli-molar range, known as m-calpain, m-CANP or calpain II; and a third form, known as p94, which is found in skeletal muscle only.
All forms have identical light but different heavy chains. Both mu- and m-calpain are heterodimers containing an identical 28-kDa subunit and an 80-kDa subunit that shares 55-65% sequence homology between the two proteases. The crystallographic structure of m-calpain reveals six "domains" in the 80-kDa subunit:
Domain 2 shows low levels of sequence similarity to papain; although the catalytic His has not been located by biochemical means, it is likely that calpain and papain are related.
Calpain-like mRNAs have been identified in other organisms including bacteria, but the molecules encoded by these mRNAs have not been isolated, so little is known about their properties. How calpain activity is regulated in these organisms is still unclear. In metazoans, the activity of calpain is controlled by a single proteinase inhibitor, calpastatin. The calpastatin gene can produce eight or more calpastatin polypeptides ranging from 17 to 85 kDa by use of different promoters and alternative splicing events. The physiological significance of these different calpastatins is unclear, although all bind to three different places on the calpain molecule; binding to at least two of the sites is Ca2+ dependent. The calpains ostensibly participate in a variety of cellular processes including remodelling of cytoskeletal/membrane attachments, different signal transduction pathways, and apoptosis. Deregulated calpain activity following loss of Ca2+ homeostasis results in tissue damage in response to events such as myocardial infarcts, stroke, and brain trauma.
Rhodanese, a sulphurtransferase involved in cyanide detoxification, shares an evolutionary relationship with a large family of proteins, including
Rhodanese has an internal duplication. This domain is found as a single copy in other proteins, including phosphatases and ubiquitin C-terminal hydrolases.
The FAS1 (fasciclin-like) domain is an extracellular module of about 140 amino acid residues. It has been suggested that the FAS1 domain represents an ancient cell adhesion domain common to plants and animals; related FAS1 domains are also found in bacteria.
The crystal structure of FAS1 domains 3 and 4 of fasciclin I from Drosophila melanogaster (Fruit fly) has been determined, revealing a novel domain fold consisting of a seven-stranded beta wedge and at least five alpha helices; two well-ordered N-acetylglucosamine groups attached to a conserved asparagine are located in the interface region between the two FAS1 domains. Fasciclin I is an insect neural cell adhesion molecule involved in axonal guidance that is attached to the membrane by a GPI anchor.
FAS1 domains are present in many secreted and membrane-anchored proteins. These proteins are usually GPI anchored and consist of: (i) a single FAS1 domain, (ii) a tandem array of FAS1 domains, or (iii) FAS1 domain(s) interspersed with other domains.
Proteins known to contain a FAS1 domain include:
The FAS1 domains of both human periostin and BIgH3 proteins were found to contain vitamin K-dependent gamma-carboxyglutamate residues. Gamma-carboxyglutamate residues are more commonly associated with GLA domains, where they occur through post-translational modification catalysed by the vitamin K-dependent enzyme gamma-glutamylcarboxylase.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents the DHHC-type zinc finger domain, which is also known as NEW1. The DHHC Zn-finger was first isolated in the Drosophila putative transcription factor DNZ1. The function of this domain is unknown, but it has been predicted to be involved in protein-protein or protein-DNA interactions.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
The D-galactoside binding lectin purified from sea urchin (Anthocidaris crassispina) eggs exists as a disulphide-linked homodimer of two subunits; the dimeric form is essential for hemagglutination activity. The sea urchin egg lectin (SUEL) forms a new class of lectins. Although SUEL was first isolated as a D-galactoside binding lectin, it was later shown to bind L-rhamnose preferentially. L-rhamnose and D-galactose share the same hydroxyl group orientation at C2 and C4 of the pyranose ring structure.
A cysteine-rich domain homologous to the SUEL protein has been identified in the following proteins:
Primary structure analysis has shown the presence of a similar domain in many carbohydrate-recognition proteins like plant and bacterial AB-toxins, glycosidases or proteases. This domain, known as the ricin B lectin domain, can be present in one or more copies and has been shown in some instances to bind simple sugars, such as galactose or lactose.
The ricin B lectin domain is composed of three homologous subdomains of 40 amino acids (alpha, beta and gamma) and a linker peptide of around 15 residues (lambda). It has been proposed that the ricin B lectin domain arose by gene triplication from a primitive 40-residue galactoside-binding peptide. The most characteristic, though not completely conserved, sequence feature is the presence of a Q-W pattern. Consequently, the ricin B lectin domain has also been referred to as the (QxW)3 domain and the three homologous regions as the QxW repeats. A disulphide bond is also conserved in some of the QxW repeats.
The 3D structure of the ricin B chain has shown that the three QxW repeats pack around a pseudo threefold axis that is stabilised by the lambda linker. The ricin B lectin domain has no major segments of alpha-helix or beta-sheet, but each of the QxW repeats contains an omega loop. An idealized omega loop is a compact, contiguous segment of polypeptide that traces a 'loop-shaped' path in three-dimensional space; the main chain resembles a Greek omega.
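The Q-x-W pattern described above can be located with a simple scan. The following is a minimal sketch in Python, assuming a one-letter amino acid sequence as input; the function name find_qxw and the toy sequence are hypothetical illustrations, not a real ricin B chain, and a pattern hit on its own is not evidence of a ricin B lectin domain, since the feature is not completely conserved.

```python
import re

# Q-x-W in one-letter code: glutamine, any residue, tryptophan.
# The lookahead lets overlapping occurrences be reported as well.
QXW = re.compile(r"(?=(Q.W))")

def find_qxw(sequence):
    """Return (1-based position, triplet) for every Q-x-W occurrence."""
    return [(m.start() + 1, m.group(1)) for m in QXW.finditer(sequence.upper())]

# Hypothetical toy sequence used only to exercise the function.
toy = "AAQLWGGSCQQWTTTQAWNN"
print(find_qxw(toy))  # [(3, 'QLW'), (10, 'QQW'), (16, 'QAW')]

# Approximate domain length implied by the text:
# three 40-residue subdomains plus a linker of around 15 residues.
print(3 * 40 + 15)  # 135
```

In practice, domain assignment relies on profile-based methods rather than a three-residue pattern; the scan is only useful for marking candidate repeat positions within a sequence already known to contain the domain.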
This group of cysteine peptidases belongs to the MEROPS peptidase family C19 (ubiquitin-specific protease family, clan CA). Families within the CA clan are loosely termed papain-like, as the protein fold of the peptidase unit resembles that of papain, the type example for clan CA. Predicted active site residues for members of this family and family C1 occur in the same order in the sequence: N/Q, C, H. The type example is human ubiquitin-specific protease 14.
Ubiquitin is highly conserved, commonly found conjugated to proteins in eukaryotic cells, where it may act as a marker for rapid degradation, or it may have a chaperone function in protein assembly. The ubiquitin is released by cleavage from the bound protein by a protease. A number of deubiquitinising proteases are known: all are activated by thiol compounds, and inhibited by thiol-blocking agents and ubiquitin aldehyde, and as such have the properties of cysteine proteases.
The deubiquitinising proteases can be split into 2 size ranges (20-30 kDa and 100-200 kDa): this family comprises the 100-200 kDa peptides, which include the Ubp1 ubiquitin peptidase from yeast. Only one conserved cysteine can be identified, along with two conserved histidines. The spacing between the cysteine and the second histidine is thought to be more representative of the cysteine/histidine spacing of a cysteine protease catalytic dyad.
The name HECT comes from 'Homologous to the E6-AP Carboxyl Terminus'. Proteins containing this domain at the C-terminus include ubiquitin-protein ligase, which regulates ubiquitination of CDC25. Ubiquitin-protein ligase accepts ubiquitin from an E2 ubiquitin-conjugating enzyme in the form of a thioester, and then directly transfers the ubiquitin to targeted substrates. A cysteine residue is required for ubiquitin-thiolester formation. Human thyroid receptor interacting protein 12, which also contains this domain, is a component of an ATP-dependent multisubunit protein that interacts with the ligand binding domain of the thyroid hormone receptor. It could be an E3 ubiquitin-protein ligase. Human ubiquitin-protein ligase E3A interacts with the E6 protein of the cancer-associated Human papillomavirus type 16 and Human papillomavirus type 18. The E6/E6-AP complex binds to and targets the P53 tumour-suppressor protein for ubiquitin-mediated proteolysis.
Synonym(s): Steroid 5-alpha-reductase
3-oxo-5-alpha-steroid 4-dehydrogenases catalyse the conversion of 3-oxo-5-alpha-steroid + acceptor to 3-oxo-delta(4)-steroid + reduced acceptor. The steroid 5-alpha-reductase enzyme is responsible for the formation of dihydrotestosterone; this hormone promotes the differentiation of male external genitalia and the prostate during foetal development. In humans, mutations in this enzyme can cause a form of male pseudohermaphroditism in which the external genitalia and prostate fail to develop normally. A related steroid reductase enzyme, DET2, is found in plants such as Arabidopsis. Mutations in this enzyme cause defects in light-regulated development. This domain is present in both type 1 and type 2 forms.
Cytoskeleton-associated proteins (CAP) are made of three distinct parts: an N-terminal section that is most probably globular and contains the CAP-Gly domain, a large central region predicted to be in an alpha-helical coiled-coil conformation and, finally, a short C-terminal globular domain. The CAP-Gly domain is a conserved, glycine-rich domain of about 42 residues found in some CAPs. Proteins known to contain this domain include restin (also known as cytoplasmic linker protein-170 or CLIP-170), a 160 kDa protein that is associated with intermediate filaments and links endocytic vesicles to microtubules; vertebrate dynactin (150 kDa dynein-associated polypeptide; DAP) and Drosophila glued, a major component of activator I; yeast protein BIK1, which seems to be required for the formation or stabilisation of microtubules during mitosis and for spindle pole body fusion during conjugation; yeast protein NIP100 (NIP80); human protein CKAP1/TFCB; Schizosaccharomyces pombe protein alp11 and Caenorhabditis elegans hypothetical protein F53F4.3. The latter proteins contain an N-terminal ubiquitin domain and a C-terminal CAP-Gly domain.
The crystal structure of the CAP-Gly domain of C. elegans F53F4.3 protein, solved by single wavelength sulphur-anomalous phasing, revealed a novel protein fold containing three beta-sheets. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove. Residues in the groove are highly conserved as measured from the information content of the aligned sequences. The C-terminal tail of another molecule in the crystal is bound in this groove.
The 3D structure of bovine cyt b5 is known, the fold belonging to the alpha+beta class, with 5 strands and 5 short helices forming a framework for supporting a central haem group. The cytochrome b5 domain is similar to that of a number of oxidoreductases, such as plant and fungal nitrate reductases, sulphite oxidase, yeast flavocytochrome b2 (L-lactate dehydrogenase) and plant cyt b5/acyl lipid desaturase fusion protein.
Neurotransmitter transport systems are integral to the release, re-uptake and recycling of neurotransmitters at synapses. High affinity transport proteins found in the plasma membrane of presynaptic nerve terminals and glial cells are responsible for the removal of released transmitters from the extracellular space, thereby terminating their actions. Plasma membrane neurotransmitter transporters fall into two structurally and mechanistically distinct families. The majority of the transporters constitute an extensive family of homologous proteins that derive energy from the co-transport of Na+ and Cl-, in order to transport neurotransmitter molecules into the cell against their concentration gradient. The family has a common structure of 12 presumed transmembrane helices and includes carriers for gamma-aminobutyric acid (GABA), noradrenaline/adrenaline, dopamine, serotonin, proline, glycine, choline, betaine and taurine. They are structurally distinct from the second, more restricted family of plasma membrane transporters, which are responsible for excitatory amino acid transport. The latter couple glutamate and aspartate uptake to the cotransport of Na+ and the counter-transport of K+, with no apparent dependence on Cl-. In addition, both of these transporter families are distinct from the vesicular neurotransmitter transporters.
Sequence analysis of the Na+/Cl- neurotransmitter superfamily reveals that it can be divided into four subfamilies, these being transporters for monoamines, the amino acids proline and glycine, GABA, and a group of orphan transporters.
This entry represents UBP-type zinc finger domains, which display some similarity with the Zn-binding domain of the insulinase family. The UBP-type zinc finger domain is found only in a small subfamily of ubiquitin C-terminal hydrolases (deubiquitinases or UBP). All members of this subfamily are isopeptidase-T, which are known to cleave isopeptide bonds between ubiquitin moieties.
Some of the proteins containing a UBP zinc finger include:
The recessive suppressor of secretory defect in yeast Golgi and yeast actin function belongs to this family. This protein may be involved in the coordination of the activities of the secretory pathway and the actin cytoskeleton.
Human synaptojanin which may be localised on coated endocytic intermediates in nerve terminals also belongs to this family.
The SET domain generally appears as one part of a larger multidomain protein. Three structures of very different proteins with distinct domain compositions have been described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical AdoMet-dependent methyltransferase fold. A tyrosine strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure.
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone target and the roles of the invariant residues in the SET domain in determining the methylation specificities.
The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by 4 cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils.
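The counts given above imply that some cysteines must be shared between zinc ions, because 3 zinc ions with 4 cysteine ligands each require 12 ligand positions while only 9 cysteines are available. The sketch below works through that arithmetic in Python. It assumes that each cysteine binds either one zinc ion (terminal) or two (bridging); that assumption and the function name bridging_cysteines are illustrative and are not stated in the text.

```python
def bridging_cysteines(n_cys, n_zinc, ligands_per_zinc=4):
    """Split cysteines into (terminal, bridging) counts.

    Assumption: every cysteine binds exactly one zinc (terminal)
    or exactly two (bridging), so that
        terminal + bridging     = n_cys
        terminal + 2 * bridging = n_zinc * ligands_per_zinc
    """
    bridging = n_zinc * ligands_per_zinc - n_cys
    terminal = n_cys - bridging
    if bridging < 0 or terminal < 0:
        raise ValueError("counts are inconsistent with the one-or-two-zinc assumption")
    return terminal, bridging

# Pre-SET domain: 9 cysteines, 3 zinc ions, 4 cysteine ligands per zinc.
print(bridging_cysteines(9, 3))  # (6, 3)
```

Under this assumption, three of the nine cysteines would each bridge two zinc ions, which is consistent with a triangular arrangement of the three metal centres.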
The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity.
The egg peptide speract receptor is a transmembrane glycoprotein. Other members of this family include the macrophage scavenger receptor type I (a membrane glycoprotein implicated in the pathologic deposition of cholesterol in arterial walls during atherogenesis), an enteropeptidase and T-cell surface glycoprotein CD5 (may act as a receptor in regulating T-cell proliferation).
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.
Phosphatidylinositol 3-kinase (PI3-kinase) is an enzyme that phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol ring. The three products of PI3-kinase (PI-3-P, PI-3,4-P(2) and PI-3,4,5-P(3)) function as secondary messengers in cell signalling. Phosphatidylinositol 4-kinase (PI4-kinase) is an enzyme that acts on phosphatidylinositol (PI) in the first committed step in the production of the secondary messenger inositol-1,4,5-trisphosphate. This domain is also present in a wide range of protein kinases, involved in diverse cellular functions, such as control of cell growth, regulation of cell cycle progression, a DNA damage checkpoint, recombination, and maintenance of telomere length. Despite significant homology to lipid kinases, no lipid kinase activity has been demonstrated for any of the PIK-related kinases.
The PI3- and PI4-kinases share a well conserved domain at their C-terminal section; this domain seems to be distantly related to the catalytic domain of protein kinases. The catalytic domain of PI3K has the typical bilobal structure that is seen in other ATP-dependent kinases, with a small N-terminal lobe and a large C-terminal lobe. The core of this domain is the most conserved region of the PI3Ks. The ATP cofactor binds in the crevice formed by the N- and C-terminal lobes; a loop between two strands provides a hydrophobic pocket for binding of the adenine moiety, and a lysine residue interacts with the alpha-phosphate. In contrast to protein kinases, the PI3K loop that interacts with the phosphates of the ATP, known as the glycine-rich or P-loop, contains no glycine residues. Instead, contact with the ATP -phosphate is maintained through the side chain of a conserved serine residue.
The tetratricopeptide repeat region (TPR) is a structural motif present in a wide range of proteins. It mediates protein-protein interactions and the assembly of multiprotein complexes. The TPR motif consists of 3-16 tandem repeats of 34 amino acid residues, although individual TPR motifs can be dispersed in the protein sequence. Sequence alignment of the TPR domains reveals a consensus sequence defined by a pattern of small and large amino acids. TPR motifs have been identified in various different organisms, ranging from bacteria to humans. Proteins containing TPRs are involved in a variety of biological processes, such as cell cycle regulation, transcriptional control, mitochondrial and peroxisomal protein transport, neurogenesis and protein folding.
The X-ray structure of a domain containing three TPRs from protein phosphatase 5 revealed that TPR adopts a helix-turn-helix arrangement, with adjacent TPR motifs packing in a parallel fashion, resulting in a spiral of repeating anti-parallel alpha-helices. The two helices are denoted helix A and helix B. The packing angle between helix A and helix B is ~24° within a single TPR and generates a right-handed superhelical shape. Helix A interacts with helix B and with helix A' of the next TPR. Two protein surfaces are generated: the inner concave surface is contributed to mainly by residues on helices A, and the other surface presents residues from both helices A and B.
WD-40 repeats (also known as WD or beta-transducin repeats) are short ~40 amino acid motifs, often terminating in a Trp-Asp (W-D) dipeptide. WD40 repeats usually assume a 7-8 bladed beta-propeller fold, but proteins have been found with 4 to 16 repeated units, which also form a circularised beta-propeller structure. WD-repeat proteins are a large family found in all eukaryotes and are implicated in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control and apoptosis. Repeated WD40 motifs act as a site for protein-protein interaction, and proteins containing WD40 repeats are known to serve as platforms for the assembly of protein complexes or mediators of transient interplay among other proteins. The specificity of the proteins is determined by the sequences outside the repeats themselves. Examples of such complexes are G proteins (beta subunit is a beta-propeller), TAFII transcription factor, and E3 ubiquitin ligase. In Arabidopsis spp., several WD40-containing proteins act as key regulators of plant-specific developmental events.
The ankyrin repeat is one of the most common protein-protein interaction motifs in nature. Ankyrin repeats are tandemly repeated modules of about 33 amino acids. They occur in a large number of functionally diverse proteins mainly from eukaryotes. The few known examples from prokaryotes and viruses may be the result of horizontal gene transfers. The repeat has been found in proteins of diverse function such as transcriptional initiators, cell-cycle regulators, cytoskeletal proteins, ion transporters and signal transducers. The ankyrin fold appears to be defined by its structure rather than its function since there is no specific sequence or structure which is universally recognised by it.
The conserved fold of the ankyrin repeat unit is known from several crystal and solution structures. Each repeat folds into a helix-loop-helix structure with a beta-hairpin/loop region projecting out from the helices at a 90° angle. The repeats stack together to form an L-shaped structure.
The Drosophila pumilio gene codes for an unusual protein that binds RNA through the Puf domain, which usually occurs as a tandem repeat of eight domains. The FBF-2 protein of Caenorhabditis elegans also has a Puf domain. Both proteins function as translational repressors in early embryonic development by binding sequences in the 3' UTR of target mRNAs. The same type of repetitive domain has been found in a number of other proteins from all eukaryotic kingdoms. The Puf proteins characterised to date have been reported to bind to 3'-untranslated region (UTR) sequences encompassing a so-called UGUR tetranucleotide motif and thereby to repress gene expression by affecting mRNA translation or stability.
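As an illustration of the UGUR motif mentioned above, the following minimal Python sketch scans a 3' UTR sequence for UGUR tetranucleotides, reading R as the standard IUPAC code for a purine (A or G). The function name find_ugur and the toy sequence are hypothetical, and a motif match alone does not show that a Puf protein binds the site.

```python
import re

# UGUR, with R = A or G (IUPAC purine code).
# The lookahead reports every start position, including overlapping ones.
UGUR = re.compile(r"(?=(UGU[AG]))")

def find_ugur(utr):
    """Return (1-based position, tetranucleotide) for each UGUR match.

    Accepts RNA or DNA letters; T is converted to U before scanning.
    """
    rna = utr.upper().replace("T", "U")
    return [(m.start() + 1, m.group(1)) for m in UGUR.finditer(rna)]

# Hypothetical toy 3' UTR fragment, not taken from a real transcript.
toy_utr = "AAUGUAUAUACCUGUGCC"
print(find_ugur(toy_utr))  # [(3, 'UGUA'), (13, 'UGUG')]
```

Characterised binding sites generally extend beyond the core tetranucleotide, so hits from such a scan are best treated as candidates for further inspection.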
In Saccharomyces cerevisiae (Baker's yeast), five proteins, termed Puf1p to Puf5p, bear six to eight Puf repeats. Puf3p binds nearly exclusively to cytoplasmic mRNAs that encode mitochondrial proteins; Puf1p and Puf2p interact preferentially with mRNAs encoding membrane-associated proteins; Puf4p preferentially binds mRNAs encoding nucleolar ribosomal RNA-processing factors; and Puf5p is associated with mRNAs encoding chromatin modifiers and components of the spindle pole body. This suggests the existence of an extensive network of RNA-protein interactions that coordinate the post-transcriptional fate of large sets of cytotopically and functionally related RNAs through each stage of their lifecycle.
The Drosophila tudor protein is encoded by a 'posterior group' gene, which when mutated disrupts normal abdominal segmentation and pole cell formation. Another Drosophila gene, homeless, is required for RNA localization during oogenesis. The tudor protein contains multiple repeats of a domain which is also found in homeless.
The tudor domain is found in many proteins that colocalise with ribonucleoprotein or single-strand DNA-associated complexes in the nucleus, in the mitochondrial membrane, or at kinetochores. It is not known whether the domain binds directly to RNA and ssDNA, or controls interactions with the nucleoprotein complexes. At least one tudor-containing protein, homeless, also contains a zinc finger typical of RNA-binding proteins.
The solution structure of the Tudor domain of human SMN revealed that the domain forms a strongly bent antiparallel beta-sheet with five strands forming a barrel-like fold. The structure exhibits a conserved negatively charged surface that interacts with the C-terminal Arg and Gly-rich tails of the spliceosomal Sm D1 and D3 proteins.
X-linked lissencephaly is a severe brain malformation affecting males. Recently it has been demonstrated that the doublecortin gene is implicated in this disorder. Doublecortin was found to bind to the microtubule cytoskeleton. In vivo and in vitro assays show that Doublecortin stabilizes microtubules and causes bundling. Doublecortin is a basic protein with an iso-electric point of 10, typical of microtubule-binding proteins. However, its sequence contains no known microtubule-binding domain(s).
The detailed sequence analysis of Doublecortin and Doublecortin-like proteins allowed the identification of an evolutionarily conserved Doublecortin (DC) domain. This domain is found in the N-terminus of proteins and consists of one or two tandemly repeated copies of a region of around 80 amino acids. It has been suggested that the first DC domain of Doublecortin binds tubulin and enhances microtubule polymerization.
In eukaryotes, glutathione S-transferases (GSTs) participate in the detoxification of reactive electrophilic compounds by catalysing their conjugation to glutathione. The GST domain is also found in S-crystallins from squid, and proteins with no known GST activity, such as eukaryotic elongation factors 1-gamma and the HSP26 family of stress-related proteins, which include auxin-regulated proteins in plants and stringent starvation proteins in Escherichia coli. The major lens polypeptide of Cephalopoda is also a GST.
Bacterial GSTs of known function often have a specific, growth-supporting role in biodegradative metabolism: epoxide ring opening and tetrachlorohydroquinone reductive dehalogenation are two examples of the reactions catalysed by these bacterial GSTs. Some regulatory proteins, like the stringent starvation proteins, also belong to the GST family. GST seems to be absent from Archaea, in which gamma-glutamylcysteine substitutes for glutathione as the major thiol.
Soluble GSTs activate glutathione (GSH) to GS-. In many GSTs, this is accomplished by a Tyr at H-bonding distance from the sulphur of GSH. These enzymes catalyse nucleophilic attack by reduced glutathione (GSH) on nonpolar compounds that contain an electrophilic carbon, nitrogen, or sulphur atom.
Glutathione S-transferases form homodimers, but in eukaryotes can also form heterodimers of the A1 and A2 or YC1 and YC2 subunits. The homodimeric enzymes display a conserved structural fold, with each monomer composed of two distinct domains. The N-terminal domain forms a thioredoxin-like fold that binds the glutathione moiety, while the C-terminal domain contains several hydrophobic alpha-helices that specifically bind hydrophobic substrates.
This entry represents the N-terminal domain of GST.
In eukaryotes, glutathione S-transferases (GSTs) participate in the detoxification of reactive electrophilic compounds by catalysing their conjugation to glutathione. The GST domain is also found in S-crystallins from squid, and proteins with no known GST activity, such as eukaryotic elongation factors 1-gamma and the HSP26 family of stress-related proteins, which include auxin-regulated proteins in plants and stringent starvation proteins in Escherichia coli. The major lens polypeptide of Cephalopoda is also a GST.
Bacterial GSTs of known function often have a specific, growth-supporting role in biodegradative metabolism: epoxide ring opening and tetrachlorohydroquinone reductive dehalogenation are two examples of the reactions catalysed by these bacterial GSTs. Some regulatory proteins, like the stringent starvation proteins, also belong to the GST family. GST seems to be absent from Archaea, in which gamma-glutamylcysteine substitutes for glutathione as the major thiol.
Soluble GSTs activate glutathione (GSH) to GS-. In many GSTs, this is accomplished by a Tyr at H-bonding distance from the sulphur of GSH. These enzymes catalyse nucleophilic attack by reduced glutathione (GSH) on nonpolar compounds that contain an electrophilic carbon, nitrogen, or sulphur atom.
Glutathione S-transferases form homodimers, but in eukaryotes can also form heterodimers of the A1 and A2 or YC1 and YC2 subunits. The homodimeric enzymes display a conserved structural fold, with each monomer composed of two distinct domains. The N-terminal domain forms a thioredoxin-like fold that binds the glutathione moiety, while the C-terminal domain contains several hydrophobic alpha-helices that specifically bind hydrophobic substrates.
This entry represents the C-terminal domain of glutathione S-transferases, and a number of redox-regulated chloride ion channel proteins.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.
This group of proteins contains cysteine peptidases belonging to MEROPS peptidase family C48 (Ulp1 endopeptidase family, clan CE). The protein fold of the peptidase domain for members of this family resembles that of adenain, the type example for clan CE. This group of sequences also contains a number of hypothetical proteins, which have not yet been characterised, and non-peptidase homologues. These are proteins that have either been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity of the peptidases in the family.
The Ulp1 endopeptidase family contains the deubiquitinating enzymes (DUB) that can de-conjugate ubiquitin or ubiquitin-like proteins from ubiquitin-conjugated proteins. They can be classified into three families according to sequence homology: ubiquitin carboxyl-terminal hydrolase (UCH), ubiquitin-specific processing protease (UBP), and ubiquitin-like protease (ULP), which is specific for de-conjugating ubiquitin-like proteins. In contrast to the UBP pathway, which is very redundant (16 UBP enzymes in yeast), there are few ubiquitin-like proteases (only one in yeast, Ulp1).
Ulp1 catalyses two critical functions in the SUMO/Smt3 pathway via its cysteine protease activity. Ulp1 processes the Smt3 C-terminal sequence (-GGATY) to its mature form (-GG), and it de-conjugates Smt3 from the lysine epsilon-amino group of the target protein.
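The C-terminal processing step described above can be illustrated with a minimal sketch that trims a sequence back to its last Gly-Gly motif. The function name and the longer test string are hypothetical and chosen only for illustration; the sketch models the sequence outcome of processing, not the enzyme's substrate recognition.

```python
def mature_c_terminus(precursor: str, motif: str = "GG") -> str:
    """Return the sequence truncated just after its last Gly-Gly motif.

    This mimics the outcome of C-terminal processing (-GGATY -> -GG).
    It is an illustration only and says nothing about enzyme specificity.
    """
    idx = precursor.rfind(motif)
    if idx == -1:
        raise ValueError("no Gly-Gly motif found in sequence")
    return precursor[: idx + len(motif)]


# The precursor tail quoted in the text.
print(mature_c_terminus("GGATY"))
```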
The crystal structure of yeast Ulp1 bound to Smt3 revealed that the catalytic and interaction interface is situated in a shallow and narrow cleft where conserved residues recognise the Gly-Gly motif at the C-terminal extremity of Smt3 protein. Ulp1 adopts a novel architecture despite some structural similarity with other cysteine proteases. The secondary structure is composed of seven alpha helices and seven beta strands. The catalytic domain includes the central alpha helix, beta-strands 4 to 6, and the catalytic triad (Cys-His-Asp). This profile is directed against the C-terminal part of ULP proteins that displays full proteolytic activity.
The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA binding domain found in diverse nuclear proteins involved in chromosomal organization, including in apoptosis. In yeast, SAP is found in the most distal N-terminal region of E3 SUMO-protein ligase SIZ1, where it is involved in nuclear localization.
This is a group of proteins found primarily in viruses, eukaryotes and in the pathogenic bacterium Chlamydia pneumoniae. In viruses they are annotated as replicase or RNA-dependent RNA polymerase. The eukaryotic sequences are related to the Ovarian Tumour (OTU) gene in Drosophila, cezanne deubiquitinating peptidase and tumor necrosis factor, alpha-induced protein 3 (MEROPS peptidase family C64) and otubain 1 and otubain 2 (MEROPS peptidase family C65).
None of these proteins has a known biochemical function but low sequence similarity with the polyprotein regions of arteriviruses, and conserved cysteine and histidine, and possibly the aspartate, residues suggests that those not yet recognised as peptidases could possess cysteine protease activity.
Inteins, or protein introns, are parts of protein sequences that are post-translationally excised, their flanking regions (exteins) being spliced together to yield an additional protein product. This process is believed to be self-catalysed, apparently initiating at the C-terminal splice junction, where a conserved asparagine residue mediates the nucleophilic attack of the peptide bond between it and its neighbouring residue. Most inteins consist of two domains: one is involved in autocatalytic splicing, and the other is an endonuclease that is important in the spread of inteins.
Inteins are between 134 and 608 amino acids long, and they are found in members of all three domains of life: eukaryotes, bacteria, and archaea, although most frequently in archaea. Inteins are found in proteins with diverse functions, including metabolic enzymes, DNA and RNA polymerases, proteases, ribonucleotide reductases, and the vacuolar-type ATPase. However, enzymes involved in DNA replication and repair appear to dominate. Inteins are found in conserved regions of conserved proteins and can be regarded as parasitic genetic elements.
In most cases the intein seems to be an endonuclease which belongs to MEROPS peptidase family C46. It has been proposed that the splicing initiates at the C-terminal splice junction. The delta-nitrogen group of a conserved asparagine residue makes a nucleophilic attack on the peptide bond that links this asparagine to the next residue. The next residue (a Cys, Ser or Thr) is then free to attack the peptide bond at the N-terminal splice junction by a transpeptidation reaction that releases the intein and creates a new peptide bond. Such a mechanism is briefly schematised in the following figures.
Inteins are difficult to identify from sequence data because they lie in the same reading frame as the spliced protein and they are characterised by only a few short conserved motifs: two of these are similar to the nonapeptide LAGLIDADG, which is diagnostic of certain homing endonucleases (mutation of one such motif causes loss of endonuclease activity, but not of the protein splicing function); another includes the C' splice site, mutations in which disable protein function.
The LCCL domain has been named after the best characterised proteins that were found to contain it, namely Limulus factor C, Coch-5b2 and Lgl1. It is a domain of about 100 amino acids whose C-terminal part contains a highly conserved histidine in a conserved motif YxxxSxxCxAAVHxGVI. The LCCL module is thought to be an autonomously folding domain that has been used for the construction of various modular proteins through exon-shuffling. It has been found in various metazoan proteins in association with complement B-type domains, C-type lectin domains, von Willebrand type A domains, CUB domains, discoidin lectin domains or CAP domains. It has been proposed that the LCCL domain could be involved in lipopolysaccharide (LPS) binding. Secondary structure prediction suggests that the LCCL domain contains six beta strands and two alpha helices.
Some proteins known to contain a LCCL domain include Limulus factor C, a LPS endotoxin-sensitive trypsin type serine protease which serves to protect the organism from bacterial infection; vertebrate cochlear protein cochlin or coch-5b2 (Cochlin is probably a secreted protein, mutations affecting the LCCL domain of coch-5b2 cause the deafness disorder DFNA9 in humans); and mammalian late gestation lung protein Lgl1, which contains two tandem copies of the LCCL domain.
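As a rough illustration, the conserved motif quoted above can be turned into a regular expression by treating each 'x' as "any residue". The sketch below assumes that reading of 'x'; the function name and the toy sequence are hypothetical, and an exact-match regular expression is far stricter than the profile-based methods normally used to detect domains.

```python
import re

# Motif as written in the text; 'x' is assumed to mean any amino acid.
LCCL_MOTIF = "YxxxSxxCxAAVHxGVI"
LCCL_REGEX = re.compile(LCCL_MOTIF.replace("x", "."))


def conserved_his_positions(sequence: str) -> list[int]:
    """Return 0-based positions of the conserved His in each motif match."""
    his_offset = LCCL_MOTIF.index("H")
    return [m.start() + his_offset for m in LCCL_REGEX.finditer(sequence)]


# Toy sequence (not a real protein) carrying one copy of the motif.
toy = "MK" + "YAAASAACAAAVHAGVI" + "LL"
print(conserved_his_positions(toy))
```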
The K homology (KH) domain was first identified in the human heterogeneous nuclear ribonucleoprotein (hnRNP) K. It is a domain of around 70 amino acids that is present in a wide variety of quite diverse nucleic acid-binding proteins. It has been shown to bind RNA. Like many other RNA-binding motifs, KH motifs are found in one or multiple copies (14 copies in chicken vigilin) and, at least for hnRNP K (three copies) and FMR-1 (two copies), each motif is necessary for in vitro RNA binding activity, suggesting that they may function cooperatively or, in the case of single KH motif proteins (for example, Mer1p), independently.
According to structural analysis, the KH domain can be separated into two groups. The first group, or type-1, contains a beta-alpha-alpha-beta-beta-alpha structure, whereas in type-2 the last two beta-sheets are located in the N-terminal part of the domain (alpha-beta-beta-alpha-alpha-beta). Sequence similarity between these two folds is limited to a short region (VIGXXGXXI) in the RNA-binding motif. This motif is located between helices 1 and 2 in type-1 and between helices 2 and 3 in type-2. Proteins known to contain a type-2 KH domain include the eukaryotic and prokaryotic S3 family of ribosomal proteins, and the prokaryotic GTP-binding protein, Era.
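The two topologies and the shared core region can be captured in a small lookup, sketched below. The names are hypothetical, 'X' is assumed to mean any residue, and the test sequence is a made-up string, so this is only a compact restatement of the description and not a KH domain predictor.

```python
import re

# Secondary-structure order for the two KH types, as given in the text.
KH_TOPOLOGIES = {
    "type-1": ("beta", "alpha", "alpha", "beta", "beta", "alpha"),
    "type-2": ("alpha", "beta", "beta", "alpha", "alpha", "beta"),
}

# Shared core region VIGXXGXXI, with X assumed to be any residue.
KH_CORE = re.compile("VIG..G..I")


def classify_kh(elements):
    """Return 'type-1' or 'type-2' for a matching element order, else None."""
    order = tuple(elements)
    for name, topology in KH_TOPOLOGIES.items():
        if order == topology:
            return name
    return None


def has_kh_core(sequence: str) -> bool:
    """True if the sequence contains an exact match to the core region."""
    return KH_CORE.search(sequence) is not None


print(classify_kh("beta-alpha-alpha-beta-beta-alpha".split("-")))
```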
The glycine-tyrosine-phenylalanine (GYF) domain is a domain of around 60 amino acids which contains a conserved GP[YF]xxxx[MV]xxWxxx[GN]YF motif. It was identified in the human intracellular protein termed CD2 binding protein 2 (CD2BP2), which binds to a site containing two tandem PPPGHR segments within the cytoplasmic region of CD2. Binding experiments and mutational analyses have demonstrated the critical importance of the GYF tripeptide in ligand binding. A GYF domain is also found in several other eukaryotic proteins of unknown function. It has been proposed that the GYF domain found in these proteins could also be involved in proline-rich sequence recognition. Resolution of the structure of the CD2BP2 GYF domain by NMR spectroscopy revealed a compact domain with a beta-beta-alpha-beta-beta topology, where the single alpha-helix is tilted away from the twisted, anti-parallel beta-sheet. The conserved residues of the GYF domain create a contiguous patch of predominantly hydrophobic nature which forms an integral part of the ligand-binding site. There is limited homology within the C-terminal 20-30 amino acids of various GYF domains, supporting the idea that this part of the domain is structurally but not functionally important.
Staphylococcus aureus nuclease (SNase) homologues, previously thought to be restricted to bacteria and archaea, are also found in eukaryotes. Staphylococcal nuclease has multidomain organization. The human cellular coactivator p100 contains four repeats, each of which is a SNase homologue. These repeats are unlikely to possess SNase-like activities as each lacks equivalent SNase catalytic residues, yet they may mediate p100's single-stranded DNA-binding function. A variety of proteins, including many that are still uncharacterised, belong to this group.
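The GYF motif mixes fixed residues, alternatives in square brackets and 'x' wildcards. A minimal sketch of a converter from that notation to a regular expression is given below; it assumes 'x' means any residue and that bracketed groups list allowed alternatives. The function name and the toy sequence are hypothetical.

```python
import re


def motif_to_regex(motif: str) -> str:
    """Convert a motif such as GP[YF]xxxx[MV] into a regex string.

    Assumptions: 'x' matches any residue, '[...]' lists alternatives,
    and every other character is a literal amino acid code.
    """
    parts = []
    i = 0
    while i < len(motif):
        ch = motif[i]
        if ch == "[":
            end = motif.index("]", i)  # copy the bracketed group unchanged
            parts.append(motif[i : end + 1])
            i = end + 1
        elif ch == "x":
            parts.append(".")
            i += 1
        else:
            parts.append(re.escape(ch))
            i += 1
    return "".join(parts)


GYF_MOTIF = "GP[YF]xxxx[MV]xxWxxx[GN]YF"
GYF_REGEX = re.compile(motif_to_regex(GYF_MOTIF))

# Toy sequence (not a real protein) built to satisfy the motif.
print(bool(GYF_REGEX.search("AAGPYAAAAMAAWAAAGYFAA")))
```

An exact-match pattern of this kind will miss divergent family members, which is why the motif is best read as a summary of conservation and not as a detection rule.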
The S1 domain of around 70 amino acids, originally identified in ribosomal protein S1, is found in a large number of RNA-associated proteins. It has been shown that S1 proteins bind RNA through their S1 domains with some degree of sequence specificity. This type of S1 domain is found in translation initiation factor 1.
The solution structure of one S1 RNA-binding domain from Escherichia coli polynucleotide phosphorylase has been determined. It displays some similarity with the cold shock domain (CSD). Both the S1 and the CSD domain consist of an antiparallel beta barrel of the same topology with 5 beta strands. This fold is also shared by many other proteins of unrelated function and is known as the OB fold. However, the S1 and CSD fold can be distinguished from the other OB folds by the presence of a short 3(10) helix at the end of strand 3. This unique feature is likely to form a part of the DNA/RNA-binding site.
More information about these proteins can be found at Protein of the Month: RNA Exosomes.
The Brix domain is found in a number of eukaryotic proteins including some from Saccharomyces cerevisiae and Homo sapiens, Arabidopsis thaliana Peter Pan-like protein and several hypothetical proteins.
There are six (one archaean and five eukaryotic) protein families which have a similar domain architecture with a central globular Brix domain. They have optional N-terminal and obligatory C-terminal segments, which both have charged low-complexity regions.
Proteins from the Imp4/Brix superfamily appear to be involved in ribosomal RNA processing, which is essential for the functioning of all cells. The N- and C-terminal halves of a member of the superfamily, Mil, show significant structural similarity to one another. This suggests an origin by means of an ancestral duplication. Both halves have the same fold as the anticodon-binding domain of class IIa aminoacyl-tRNA synthetases, with greater conservation seen in the N-terminal half. Structural evidence suggests that the Imp4/Brix superfamily proteins could bind single-stranded segments of RNA along a concave surface formed by the N-terminal half of their beta-sheet and a central alpha-helix.
The Wnt signalling pathway is conserved in various species from Caenorhabditis elegans to mammals, and plays important roles in development, cellular proliferation, and differentiation. The molecular mechanisms by which the Wnt signal regulates cellular functions are becoming increasingly well understood. Wnt stabilizes cytoplasmic beta-catenin, which stimulates the expression of genes including c-myc, c-jun, fra-1, and cyclin D1. Axin and its homologue Axil are components of the Wnt signalling pathway that negatively regulate this pathway. Other components of the Wnt signalling pathway, including Dvl, glycogen synthase kinase-3beta (GSK-3beta), beta-catenin, and adenomatous polyposis coli (APC), interact with Axin, and the phosphorylation and stability of beta-catenin are regulated in the Axin complex. Axil has similar functions to Axin. Thus, Axin and Axil act as scaffold proteins in the Wnt signalling pathway, thereby modulating the Wnt-dependent cellular functions.
Proteins that transport heavy metals in micro-organisms and mammals share similarities in their sequences and structures.
These proteins provide an important focus for research, some being involved in bacterial resistance to toxic metals, such as lead and cadmium, while others are involved in inherited human syndromes, such as Wilson's and Menkes diseases.
A conserved domain has been found in a number of these heavy metal transport or detoxification proteins. The domain, which has been termed Heavy-Metal-Associated (HMA), contains two conserved cysteines that are probably involved in metal binding.
Structure solution of the fourth HMA domain of the Menkes copper-transporting ATPase shows a well-defined structure comprising a four-stranded antiparallel beta-sheet and two alpha helices packed in an alpha-beta sandwich fold. This fold is common to other domains and is classified as "ferredoxin-like".
Cytochrome c oxidase is an oligomeric enzymatic complex which is a component of the respiratory chain and is involved in the transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane. The number of polypeptides in the complex ranges from 3-4 (prokaryotes) up to 13 (mammals).
Subunit 2 (CO II) transfers the electrons from cytochrome c to the catalytic subunit 1. It contains two adjacent transmembrane regions in its N-terminus and the major part of the protein is exposed to the periplasmic or to the mitochondrial intermembrane space, respectively. CO II provides the substrate-binding site and contains a copper centre called Cu(A), probably the primary acceptor in cytochrome c oxidase. An exception is the corresponding subunit of the cbb3-type oxidase which lacks the copper A redox-centre. Several bacterial CO II have a C-terminal extension that contains a covalently bound haem c.
The BSD domain is a domain of about 60 residues named after the BTF2-like transcription factors, Synapse-associated proteins and DOS2-like proteins in which it is found. It is also found in several hypothetical proteins. The BSD domain occurs in one or two copies in a variety of species ranging from primal protozoan to human. It can be found associated with other domains, such as the BTB domain or the U-box, in multidomain proteins. The function of the BSD domain is as yet unknown.
Secondary structure prediction indicates the presence of three predicted alpha helices, which probably form a three-helical bundle in small domains. The third predicted helix contains neighbouring phenylalanine and tryptophan residues - less common amino acids that are invariant in all the BSD domains identified and that are the most striking sequence features of the domain.
Some proteins known to contain one or two BSD domains are listed below:
VAMPs (and their homologues, the synaptobrevins) define a group of SNARE proteins that contain a C-terminal coiled-coil/SNARE domain, in combination with variable N-terminal domains that are used to classify VAMPs: those containing longin N-terminal domains (~150 aa) are referred to as longins, while those with shorter N-termini are referred to as brevins. Longins are the only type of VAMP protein found in all eukaryotes, suggesting that their longin domain is essential. The longin domain is thought to exert a regulatory function. Longin domains have been shown to share the same structural fold, a profilin-like globular domain consisting of a five-stranded antiparallel beta-sheet that is sandwiched by an alpha-helix on one side, and two alpha-helices on the other (beta(2)-alpha-beta(3)-alpha(2)).
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases.
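The class assignments listed above amount to a simple lookup, sketched here with standard three-letter amino acid codes. The function names are hypothetical, and the 2'/3' hydroxyl values simply restate the preference described in the text without capturing exceptions for individual enzymes.

```python
# Amino acid specificities by synthetase class, as listed in the text.
CLASS_I = {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
CLASS_II = {"Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"}


def synthetase_class(amino_acid: str) -> str:
    """Return 'I' or 'II' for a three-letter amino acid code."""
    if amino_acid in CLASS_I:
        return "I"
    if amino_acid in CLASS_II:
        return "II"
    raise ValueError(f"unknown amino acid code: {amino_acid}")


def preferred_hydroxyl(amino_acid: str) -> str:
    """Return the tRNA hydroxyl preferred for aminoacylation by class."""
    return "2'" if synthetase_class(amino_acid) == "I" else "3'"


print(synthetase_class("Ala"), preferred_hydroxyl("Ala"))
```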
Alanyl-tRNA synthetase is an alpha4 tetramer that belongs to class IIc.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases.
This entry recognises all class-II enzymes except for heterodimeric glycyl-tRNA synthetases and alanyl-tRNA synthetases.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents MYND-type zinc finger domains. The MYND domain (myeloid, Nervy, and DEAF-1) is present in a large group of proteins that includes RP-8 (PDCD2), Nervy, and predicted proteins from Drosophila, mammals, Caenorhabditis elegans, yeast, and plants. The MYND domain consists of a cluster of cysteine and histidine residues, arranged with an invariant spacing to form a potential zinc-binding motif. Mutating conserved cysteine residues in the DEAF-1 MYND domain does not abolish DNA binding, which suggests that the MYND domain might be involved in protein-protein interactions. Indeed, the MYND domain of ETO/MTG8 interacts directly with the N-CoR and SMRT co-repressors. Aberrant recruitment of co-repressor complexes and inappropriate transcriptional repression is believed to be a general mechanism of leukemogenesis caused by the t(8;21) translocations that fuse ETO with the acute myelogenous leukemia 1 (AML1) protein. ETO has been shown to be a co-repressor recruited by the promyelocytic leukemia zinc finger (PLZF) protein. A divergent MYND domain present in the adenovirus E1A binding protein BS69 was also shown to interact with N-CoR and mediate transcriptional repression. The current evidence suggests that the MYND motif in mammalian proteins constitutes a protein-protein interaction domain that functions as a co-repressor-recruiting interface.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
The GOLD (for Golgi dynamics) domain is a protein module found in several eukaryotic Golgi and lipid-traffic proteins. It is typically between 90 and 150 amino acids long. Most of the size difference observed in the GOLD-domain superfamily is traceable to a single large low-complexity insert that is seen in some versions of the domain. With the exception of the p24 proteins, which have a simple architecture with the GOLD domain as their only globular domain, all other GOLD-domain proteins contain additional conserved globular domains. In these proteins, the GOLD domain co-occurs with lipid-, sterol- or fatty acid-binding domains such as PH, CRAL-TRIO, FYVE, oxysterol-binding and acyl-CoA-binding domains, suggesting that these proteins may interact with membranes. The GOLD domain can also be found associated with a RUN domain, which may have a role in the interaction of various proteins with cytoskeletal filaments. The GOLD domain is predicted to mediate diverse protein-protein interactions. A secondary structure prediction for the GOLD domain reveals that it is likely to adopt a compact all-beta-fold structure with six to seven strands. Most of the sequence conservation is centred on the hydrophobic cores that support these predicted strands. The predicted secondary-structure elements and the size of the conserved core of the domain suggest that it may form a beta-sandwich fold with the strands arranged in two beta sheets stacked on each other.
Some proteins known to contain a GOLD domain are listed below:
This region is found in a number of histone lysine methyltransferases (HMTase), C-terminal to the SET domain; it is generally described as the post-SET domain.
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities.
The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These nine cysteines coordinate three zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together two long segments of random coils.
The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity.
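The zinc cluster stoichiometry implies that some cysteines are shared between zinc ions: three ions with four cysteine ligands each need twelve contacts, but only nine cysteines are available. The sketch below works through that count under the stated assumption that a shared (bridging) cysteine contacts exactly two zinc ions and that no other ligands are involved; the function name is hypothetical.

```python
def count_bridging_cysteines(n_cys: int, n_zinc: int, ligands_per_zinc: int = 4) -> int:
    """Return how many cysteines must bridge two zinc ions.

    Assumes each cysteine contacts either one zinc (terminal) or exactly
    two (bridging), and that cysteines are the only ligands.
    """
    contacts = n_zinc * ligands_per_zinc   # total zinc-sulphur contacts needed
    bridging = contacts - n_cys            # each bridging Cys adds one extra contact
    if bridging < 0 or bridging > n_cys:
        raise ValueError("stoichiometry is inconsistent with the assumptions")
    return bridging


bridging = count_bridging_cysteines(n_cys=9, n_zinc=3)
terminal = 9 - bridging
print(bridging, terminal)
```

Under these assumptions, three cysteines bridge and six are terminal, which is consistent with a triangular arrangement of the three zinc ions.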
Retroviral reverse transcriptase is synthesised as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The discovery of retroelements in the prokaryotes raises intriguing questions concerning their roles in bacteria and the origin and evolution of reverse transcriptases and whether the bacterial reverse transcriptases are older than eukaryotic reverse transcriptases.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein S5 is one of the proteins from the small ribosomal subunit, and is a protein of 166 to 254 amino-acid residues. In Escherichia coli, S5 is known to be important in the assembly and function of the 30S ribosomal subunit. Mutations in S5 have been shown to increase translational error frequencies. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, cyanelle, red algal chloroplast, archaeal and fungal mitochondrial S5; mammalian, Caenorhabditis elegans, Drosophila and plant S2; and yeast S4 (SUP44).
This entry represents the N-terminal domain of ribosomal protein S5, which has an alpha-beta(3)-alpha structure that folds into two layers, alpha/beta.
The S4 domain is a small domain consisting of 60-65 amino acid residues that was detected in the bacterial ribosomal protein S4, eukaryotic ribosomal S9, two families of pseudouridine synthases, a novel family of predicted RNA methylases, a yeast protein containing a pseudouridine synthetase and a deaminase domain, bacterial tyrosyl-tRNA synthetases, and a number of uncharacterised, small proteins that may be involved in translation regulation. The S4 domain probably mediates binding to RNA.
The PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was named after the proteins in which it was first found. PUA is a highly conserved RNA-binding motif found in a wide range of archaeal, bacterial and eukaryotic proteins, including enzymes that catalyse tRNA and rRNA post-transcriptional modifications, proteins involved in ribosome biogenesis and translation, as well as in enzymes involved in proline biosynthesis. The structures of several PUA-RNA complexes reveal a common RNA recognition surface, but also some versatility in the way in which the motif binds to RNA. PUA motifs are involved in dyskeratosis congenita and cancer, pointing to links between RNA metabolism and human diseases.
Synaptobrevin is an intrinsic membrane protein of small synaptic vesicles, specialised secretory organelles of neurons that actively accumulate neurotransmitters and participate in their calcium-dependent release by exocytosis. Vesicle function is mediated by proteins in their membranes, although the precise nature of the underlying protein-protein interactions is still uncertain. Synaptobrevin may play a role in the molecular events underlying neurotransmitter release and vesicle recycling and may be involved in the regulation of membrane flow in the nerve terminal, a process mediated by interaction with low molecular weight GTP-binding proteins. Synaptic vesicle-associated membrane proteins (VAMPs) from Torpedo californica (Pacific electric ray) and SNC1 from yeast are related to synaptobrevin.
ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.
ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain.
The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop (Walker A) and a magnesium-binding site (Walker B). Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the conserved signature motif (LSGGQ) specific to the ABC transporter; and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site.
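As a rough illustration of how the motifs above might be located in a protein sequence, the following Python sketch searches for the LSGGQ signature named in the text. The Walker A pattern used here, G-x(4)-G-K-[ST], is the commonly cited consensus and is an assumption, since the text does not spell it out. The function name and the toy sequence are hypothetical. Real ABC cassettes often carry variants of these motifs that an exact-match search would miss.

```python
import re

# Commonly cited Walker A (P-loop) consensus G-x(4)-G-K-[ST].
# This pattern is an assumption; it is not stated in the text above.
WALKER_A = re.compile(r"G.{4}GK[ST]")
# Signature motif exactly as given in the text.
SIGNATURE = re.compile(r"LSGGQ")


def find_abc_motifs(sequence):
    """Return 0-based start positions of exact, non-overlapping motif matches."""
    seq = sequence.upper()
    return {
        "walker_a": [m.start() for m in WALKER_A.finditer(seq)],
        "signature": [m.start() for m in SIGNATURE.finditer(seq)],
    }


# Made-up toy sequence for illustration only, not a real protein.
toy = "MAAGPSGSGKSTLLAAAALSGGQKQRAA"
hits = find_abc_motifs(toy)
```

A sequence-profile method would be more appropriate for real annotation; exact matching is used here only because it makes the motif definitions explicit.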
The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis.
The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).
On the basis of sequence similarities a family of related ATP-binding proteins has been characterised.
The proteins belonging to this family also contain one or two copies of the 'A' consensus sequence or the 'P-loop'.
The LisH motif is found in a large number of eukaryotic proteins from metazoa, fungi and plants that have a wide range of functions. The recently solved structure of the LisH domain in the N-terminal region of LIS1 showed it to be a novel dimerisation motif, and indicated that other structural elements are likely to play an important role in dimerisation.
A sequence motif, LisH, has been identified in the products of genes mutated in Miller-Dieker lissencephaly, Treacher Collins, oral-facial-digital type 1 and contiguous syndrome ocular albinism with late onset sensorineural deafness syndromes. An additional homologous motif was detected in a gene product fused to the fibroblast growth factor receptor type 1 in patients with an atypical stem cell myeloproliferative disorder. In total, over 100 eukaryotic intracellular proteins are shown to possess a LIS1 homology (LisH) motif, including several katanin p60 subunits, muskelin, tonneau, LEUNIG, Nopp140, aimless and numerous WD repeat-containing beta-propeller proteins.
It is suggested that LisH motifs contribute to the regulation of microtubule dynamics, either by mediating dimerization, or else by binding cytoplasmic dynein heavy chain or microtubules directly. The predicted secondary structure of LisH motifs, and their occurrence in homologues of Gbeta beta-propeller subunits, suggests that they are analogues of Ggamma subunits, and might associate with the periphery of beta-propeller domains.
The 33-residue LIS1 homology (LisH) motif is found in eukaryotic intracellular proteins involved in microtubule dynamics, cell migration, nucleokinesis and chromosome segregation. The LisH motif is likely to possess a conserved protein-binding function and it has been proposed that LisH motifs contribute to the regulation of microtubule dynamics, either by mediating dimerisation, or else by binding cytoplasmic dynein heavy chain or microtubules directly. The LisH motif is found associated with other domains, such as WD-40, SPRY, Kelch, AAA ATPase, RasGEF, or HEAT. The secondary structure of the LisH domain is predicted to be two alpha-helices.
The C-terminal to LisH (CTLH) motif is a predicted alpha-helical sequence of unknown function that is found adjacent to the LisH motif in a number of LisH-containing proteins but is absent in others (e.g. LIS1). The CTLH domain can also be found in the absence of the LisH motif.
This domain is found in a number of proteins including flavodoxin and nitric-oxide synthase. Flavodoxins are electron-transfer proteins that function in various electron transport systems. They bind one FMN molecule, which serves as a redox-active prosthetic group, and are functionally interchangeable with ferredoxins. They have been isolated from prokaryotes, cyanobacteria, and some eukaryotic algae. Nitric oxide synthase produces nitric oxide from L-arginine and NADPH. Nitric oxide acts as a messenger molecule in the body.
These proteins contain a conserved region found in the yeast YLR168C gene MSF1 product. The function of this protein is unknown, though it is thought to be involved in intra-mitochondrial protein sorting. GFP-tagged MSF1 localises to mitochondria and is required for wild-type respiratory growth. This region is also found in a number of other eukaryotic proteins. The PRELI/MSF1 domain is a eukaryotic protein module which occurs in stand-alone form in several proteins, including the human PRELI protein and the yeast MSF1 protein, and as an amino-terminal domain in an orthologous group of proteins typified by human SEC14L1, which is conserved in all animals. In this group of proteins, the PRELI/MSF1 domain co-occurs with the CRAL-TRIO and the GOLD domains. The PRELI/MSF1 domain is approximately 170 residues long and is predicted to assume a globular alpha + beta fold with six beta strands and four alpha helices. It has been suggested that the PRELI/MSF1 domain may have a function associated with cellular membranes.
The eukaryotic RWD domain is found in RING finger- and WD repeat-containing proteins and in a DEXDc-like helicase subfamily, and is related to the ubiquitin-conjugating enzyme domain.
The MIR domain is named after three of the proteins in which it occurs: protein Mannosyltransferase, Inositol 1,4,5-trisphosphate receptor (IP3R) and Ryanodine receptor (RyR). MIR domains have also been found in eukaryotic stromal cell-derived factor 2 (SDF-2) and in Chlamydia trachomatis protein CT153. The MIR domain may have a ligand transferase function. This domain has a closed beta-barrel structure with a hairpin triplet, and has an internal pseudo-threefold symmetry. The MIR motifs that make up the MIR domain consist of ~50 residues and are often found in multiple copies.
Inositol 1,4,5-trisphosphate (InsP3) is an intracellular second messenger that transduces growth factor and neurotransmitter signals. InsP3 mediates the release of Ca2+ from intracellular stores by binding to specific Ca2+ channel-coupled receptors. Ryanodine receptors are involved in communication between transverse-tubules and the sarcoplasmic reticulum of cardiac and skeletal muscle. The proteins function as Ca2+-release channels following depolarisation of transverse-tubules. The function is modulated by Ca2+, Mg2+, ATP and calmodulin. Deficiency in the ryanodine receptor may be the cause of malignant hyperthermia (MH) and of central core disease of muscle (CCD). Protein O-mannosyltransferases transfer mannose from DOL-P-mannose to Ser or Thr residues on proteins.
A variety of substrate carrier proteins that are involved in energy transfer are found in the inner mitochondrial membrane or integral to the membrane of other eukaryotic organelles such as the peroxisome. Such proteins include: ADP, ATP carrier protein (ADP/ATP translocase); 2-oxoglutarate/malate carrier protein; phosphate carrier protein; tricarboxylate transport protein (or citrate transport protein); Graves disease carrier protein; yeast mitochondrial proteins MRS3 and MRS4; yeast mitochondrial FAD carrier protein; and many others. Structurally, these proteins can consist of up to three tandem repeats of a domain of approximately 100 residues, each domain containing two transmembrane regions.
TLC is a protein domain with at least 5 transmembrane alpha-helices. Lag1p and Lac1p are essential for acyl-CoA-dependent ceramide synthesis, TRAM is a subunit of the translocon, and the CLN8 gene is mutated in Northern epilepsy syndrome. Proteins containing this domain may possess multiple functions such as lipid trafficking, metabolism, or sensing. Trh homologues possess additional homeobox domains.
ABC transporters minimally contain two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). In certain bacterial transporters, these regions are found on different polypeptides. The integral inner-membrane protein translocates the substrate across the membrane and also takes part in substrate recognition.
This entry is an ABC transporter integral membrane type 1 fused domain.
The SWIRM domain is a small alpha-helical domain of about 85 amino acid residues found in eukaryotic chromosomal proteins. It is named after the proteins SWI3, RSC8 and MOIRA in which it was first recognised. This domain is predicted to mediate protein-protein interactions in the assembly of chromatin-protein complexes. The SWIRM domain can be linked to different domains, such as the ZZ-type zinc finger, the Myb DNA-binding domain, the HORMA domain, the amino-oxidase domain, the chromo domain, and the JAB1/PAD1 domain.
The ENTH (Epsin N-terminal homology) domain is approximately 150 amino acids in length and is always found located at the N-termini of proteins. The domain forms a compact globular structure, composed of 9 alpha-helices connected by loops of varying length. The general topology is determined by three helical hairpins that are stacked consecutively with a right hand twist. An N-terminal helix folds back, forming a deep basic groove that forms the binding pocket for the Ins(1,4,5)P3 ligand. The ligand is coordinated by residues from surrounding alpha-helices and all three phosphates are multiply coordinated. The coordination of Ins(1,4,5)P3 suggests that ENTH is specific for particular head groups.
Proteins containing this domain have been found to bind PtdIns(4,5)P2 and PtdIns(1,4,5)P3, suggesting that the domain may be a membrane-interacting module. The main function of proteins containing this domain appears to be to act as accessory clathrin adaptors in endocytosis. Epsin is able to recruit clathrin and promote its polymerisation on a lipid monolayer, but may have additional roles in signalling and actin regulation. Epsin causes a strong degree of membrane curvature and tubulation, even fragmentation of membranes with a high PtdIns(4,5)P2 content. Epsin binding to membranes facilitates their deformation by insertion of the N-terminal helix into the outer leaflet of the bilayer, pushing the head groups apart. This would reduce the energy needed to curve the membrane into a vesicle, making it easier for the clathrin cage to fix and stabilise the curved membrane. This points to a pioneering role for epsin in vesicle budding as it provides both a driving force and a link between membrane invagination and clathrin polymerisation.
This is a large family of DNA-binding helix-turn-helix proteins that includes a bacterial plasmid copy control protein, bacterial methylases, various bacteriophage transcription control proteins and a vegetative specific protein from Dictyostelium discoideum (Slime mould).
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents the SWIM (SWI2/SNF2 and MuDR) zinc-binding domain, which is found in a variety of prokaryotic and eukaryotic proteins, such as mitogen-activated protein kinase kinase kinase 1 (or MEKK1). It is also found in the related protein MEX (MEKK1-related protein X), a testis-expressed protein that acts as an E3 ubiquitin ligase through the action of E2 ubiquitin-conjugating enzymes in the proteasome degradation pathway; the SWIM domain is critical for MEX ubiquitination. SWIM domains are also found in the homologous recombination protein Sws1, as well as in several hypothetical proteins.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
All organisms require reduced folate cofactors for the synthesis of a variety of metabolites. Most microorganisms must synthesise folate de novo because they lack the active transport system of higher vertebrate cells that allows these organisms to use dietary folates. Proteins containing this domain include dihydropteroate synthase as well as a group of methyltransferase enzymes including methyltetrahydrofolate:corrinoid iron-sulphur protein methyltransferase (MeTr), which catalyses a key step in the Wood-Ljungdahl pathway of carbon dioxide fixation.
Dihydropteroate synthase (DHPS) catalyses the condensation of 6-hydroxymethyl-7,8-dihydropteridine pyrophosphate to para-aminobenzoic acid to form 7,8-dihydropteroate. This is the second step in the three-step pathway leading from 6-hydroxymethyl-7,8-dihydropterin to 7,8-dihydrofolate. DHPS is the target of sulphonamides, which are substrate analogues that compete with para-aminobenzoic acid. Bacterial DHPS (gene sul or folP) is a protein of about 275 to 315 amino acid residues that is either chromosomally encoded or found on various antibiotic resistance plasmids. In the lower eukaryote Pneumocystis carinii, DHPS is the C-terminal domain of a multifunctional folate synthesis enzyme (gene fas).
The ATP-grasp superfamily currently includes 17 groups of enzymes, catalysing ATP-dependent ligation of a carboxylate-containing molecule to an amino or thiol group-containing molecule. They contribute predominantly to macromolecular synthesis. ATP hydrolysis is used to activate a substrate. For example, DD-ligase transfers phosphate from ATP to D-alanine in the first step of catalysis. In the second step the resulting acylphosphate is attacked by a second D-alanine to produce a DD dipeptide following phosphate elimination.
The ATP-grasp domain contains three conserved motifs, corresponding to the phosphate binding loop and the Mg(2+) binding site. The fold is characterised by two alpha-beta subdomains that grasp the ATP molecule between them. Each subdomain provides a variable loop that forms part of the active site, with regions from other domains also contributing to the active site, even though these other domains are not conserved between the various ATP-grasp enzymes.
Biotin-dependent carboxylase enzymes perform a two-step reaction. Enzyme-bound biotin is first carboxylated by bicarbonate and ATP, and the carboxyl group temporarily bound to biotin is subsequently transferred to an acceptor substrate such as pyruvate or acetyl-CoA. The first step is mediated by the BC domain common to all biotin-dependent carboxylases. The BC domain can be divided into three subdomains (N-terminal, central and C-terminal). The N-terminal region provides part of the active site; the central region corresponds to the ATP-grasp domain, which is common to many ATP-dependent enzymes involved in macromolecular synthesis. The ATP-grasp module directly binds the ATP molecule. The C-terminal subdomain is involved in dimer formation.
Several structures of the BC domain have been solved. The central module is splayed significantly away from the main body of the domain and is able to rotate by approximately 45 degrees upon nucleotide binding, thereby closing off the active site pocket.
Acetyl-coenzyme A carboxylase (ACC), a member of the biotin-dependent enzyme family, catalyses the formation of malonyl-coenzyme A (CoA) and regulates fatty acid biosynthesis and oxidation. Biotin-dependent carboxylase enzymes perform a two step reaction: enzyme-bound biotin is first carboxylated by bicarbonate and ATP and the carboxyl group temporarily bound to biotin is subsequently transferred to an acceptor substrate such as acetyl-CoA. The carboxyltransferase domain performs the second part of the reaction.
The N- and C-terminal regions of the carboxyltransferase domain share similar polypeptide backbone folds, with a central beta-beta-alpha superhelix. The CoA molecule is mostly associated with the N subdomain. In bacterial acetyl coenzyme A carboxylase the N and C subdomains are encoded by two different polypeptides.
This entry represents the N-terminal subdomain and contains the bacterial ACC beta-subunit.
The most abundant modification seen in structured RNAs (transfer, ribosomal, and splicing RNAs) is the isomerisation of uridine (U) to pseudouridine (5-ribosyluracil). Pseudouridine is made by a set of enzymes called pseudouridine synthases, which select specific U residues in a polynucleotide chain for isomerisation to pseudouridine. Pseudouridine synthases are ubiquitous, as putative synthase genes have been found in all genomes so far sequenced. TruD, a pseudouridine synthase in Escherichia coli, is responsible for modifying U13 in tRNA-Glu to pseudouridine. Homologs of truD have been identified in eubacteria, archaea, and eukarya. Because all of the organisms known to have pseudouridine 13 in their tRNAs also have a truD homolog, it is reasonable to infer that truD homologs in those organisms with tRNA pseudouridine 13 are the responsible synthases.
TruD folds into a V-shaped molecule with two distinct modules: a catalytic domain that differs in sequence but is structurally very similar to the catalytic domain of other pseudouridine synthases, and a TRUD domain of ~150 amino acids with an alpha/beta fold. The TRUD domain forms a compact fold that is tilted away from the catalytic domain to form a deep cleft in truD which is lined with basic residues from each domain. The TRUD domain is always associated with a truD-type catalytic domain and is not found on its own or attached to another type of protein as a separate module. Furthermore, there are no truD-type catalytic domains that lack the TRUD domain insert. The TRUD domain is characterised by two conserved sequence motifs that form a part of the hydrophobic core. The TRUD domain sequence in the truD family is also characterised by large insertions at several specific sites that are seen in many archaeal and eukaryotic homologs. The TRUD domain is likely to be involved in substrate recognition and may represent an RNA-binding module.
This group of polyamine biosynthetic enzymes, involved in the fifth (last) step in the biosynthesis of spermidine from arginine and methionine, includes spermidine synthase, spermine synthase and putrescine N-methyltransferase.
The Thermotoga maritima spermidine synthase monomer consists of two domains: an N-terminal domain composed of six beta-strands, and a Rossmann-like C-terminal domain. The larger C-terminal catalytic core domain consists of a seven-stranded beta-sheet flanked by nine alpha helices. This domain resembles a topology observed in a number of nucleotide and dinucleotide-binding enzymes, and in S-adenosyl-L-methionine (AdoMet)-dependent methyltransferases (MTases).
After cytochrome c is synthesized in the cytoplasm as apocytochrome c, it is transported through the outer mitochondrial membrane to the intermembrane space, where haem is covalently attached by thioether bonds to two cysteine residues located in the cytochrome c centre. Cytochrome c is required during oxidative phosphorylation as an electron shuttle between Complex III (cytochrome c reductase) and Complex IV (cytochrome c oxidase). In addition, cytochrome c is involved in apoptosis in more complex organisms such as Xenopus, rats and humans. Cellular stress can induce cytochrome c release from the mitochondrial membrane. In mammals, cytochrome c triggers the assembly of the apoptosome, consisting of cytochrome c, Apaf-1 and dATP, which activates caspase-9, leading to cell death. There are several different members of the cytochrome c family with different functional roles; for instance, cytochrome c549 is associated with photosystem II.
The known structures of c-type cytochromes have six different classes of fold. Of these, four are unique to c-type cytochromes. The consensus sequence for the cytochrome c centre is Cys-X-X-Cys-His, where the histidine residue is one of the two axial ligands of the haem iron. This arrangement is shared by all proteins known to belong to the cytochrome c family, which presently includes both mono-haem proteins and multi-haem proteins. This entry represents mono-haem cytochrome c proteins (excluding class II and f-type cytochromes), such as cytochromes c, c1, c2, c5, c555, c550 to c553, c556, and c6.
Cytochrome c-type centres are also found in the active sites of many enzymes, including cytochrome cd1-nitrite reductase (as the N-terminal haem c domain), quinoprotein alcohol dehydrogenase (as the C-terminal domain), quinohemoprotein amine dehydrogenase A chain (as domains 1 and 2), and the cytochrome bc1 complex (as the cytochrome bc1 domain).
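The Cys-X-X-Cys-His consensus described above for the cytochrome c centre can be searched for with a simple pattern match. The following is a minimal sketch, not a validated annotation method: the function name `find_haem_c_sites` and the short example fragment are assumptions introduced for illustration, and a match only flags a candidate haem-binding site.

```python
import re

# Cys-X-X-Cys-His, where X is any residue (one-letter codes).
# The lookahead lets overlapping candidate sites be reported as well.
CXXCH = re.compile(r"(?=(C[A-Z]{2}CH))")


def find_haem_c_sites(sequence):
    """Return (1-based start, matched pentapeptide) for each CXXCH site.

    Hypothetical helper for illustration; a hit is only a candidate site.
    """
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in CXXCH.finditer(seq)]


if __name__ == "__main__":
    # Short illustrative fragment (assumption, used only for demonstration).
    fragment = "MGDVEKGKKIFVQKCAQCHTVEK"
    for start, site in find_haem_c_sites(fragment):
        print(f"candidate haem c site {site} at residue {start}")
```

Because the consensus is short, matches are expected by chance in unrelated proteins, so a hit is better treated as a prompt for closer inspection than as evidence of family membership.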
Members of the recently discovered ARID (AT-rich interaction domain) family of DNA-binding proteins are found in fungi and invertebrate and vertebrate metazoans. ARID-encoding genes are involved in a variety of biological processes including embryonic development, cell lineage gene regulation and cell cycle control. Although the specific roles of this domain and of ARID-containing proteins in transcriptional regulation are yet to be elucidated, they include both positive and negative transcriptional regulation and a likely involvement in the modification of chromatin structure. The basic structure of the ARID domain appears to be a series of six alpha-helices separated by beta-strands, loops, or turns, but the structured region may extend to an additional helix at either or both ends of the basic six. Based on primary sequence homology, ARID proteins can be partitioned into three structural classes: Minimal ARID proteins that consist of a core domain formed by six alpha helices; ARID proteins that supplement the core domain with an N-terminal alpha-helix; and Extended-ARID proteins, which contain the core domain and additional alpha-helices at their N- and C-termini.
The human SWI-SNF complex protein p270 is an ARID family member with non-sequence-specific DNA binding activity. The ARID consensus and other structural features are common to both p270 and yeast SWI1, suggesting that p270 is a human counterpart of SWI1. The approximately 100-residue ARID sequence is present in a series of proteins strongly implicated in the regulation of cell growth, development, and tissue-specific gene expression. Although about a dozen ARID proteins can be identified from database searches, to date, only Bright (a regulator of B-cell-specific gene expression), dead ringer (a Drosophila melanogaster gene product required for normal development), and MRF-2 (which represses expression from the Cytomegalovirus enhancer) have been analyzed directly in regard to their DNA binding properties. Each binds preferentially to AT-rich sites. In contrast, p270 shows no sequence preference in its DNA binding activity, thereby demonstrating that AT-rich binding is not an intrinsic property of ARID domains and that ARID family proteins may be involved in a wider range of DNA interactions.
The PWI domain, named after a highly conserved PWI tri-peptide located within its N-terminal region, is a ~80 amino acid module, which is found either at the N-terminus or at the C-terminus of eukaryotic proteins involved in pre-mRNA processing. It is generally found in association with other domains such as RRM and RS. The PWI domain is an RNA/DNA-binding domain that has an equal preference for single- and double-stranded nucleic acids and is likely to have multiple important functions in pre-mRNA processing. Proteins containing this domain include the SR-related nuclear matrix protein of 160 kD (SRm160) splicing and 3'-end cleavage-stimulatory factor, and the mammalian splicing factor PRP3.
The PWI domain is a soluble, globular and independently folded domain which consists of a four-helix bundle, with structured N- and C-terminal elements.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents the zinc finger domain found in A20. A20 is an inhibitor of cell death that inhibits NF-kappaB activation via the tumour necrosis factor receptor associated factor pathway. The zinc finger domains appear to mediate self-association in A20. These fingers also mediate IL-1-induced NF-kappaB activation.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Named the YEATS family, after 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', this family also contains the GAS41 protein. All these proteins are thought to have a transcription stimulatory activity.
This entry represents the AN1-type zinc finger domain, which has a dimetal (zinc)-bound alpha/beta fold. This domain was first identified as a zinc finger at the C-terminus of AN1, a ubiquitin-like protein in Xenopus laevis. The AN1-type zinc finger contains six conserved cysteines and two histidines that could potentially coordinate two zinc atoms.
Certain stress-associated proteins (SAP) contain an AN1 domain, often in combination with A20 zinc finger domains (SAP8) or C2H2 domains (SAP16). For example, the human protein Znf216 has an A20 zinc finger at the N-terminus and an AN1 zinc finger at the C-terminus, acting to negatively regulate the NF-kappaB activation pathway and to interact with components of the immune response such as RIP, IKK-gamma and TRAF6. The interaction of Znf216 with IKK-gamma and RIP is mediated by the A20 zinc finger domain, while its interaction with TRAF6 is mediated by the AN1 zinc finger domain; therefore, both zinc finger domains are involved in regulating the immune response. The AN1 zinc finger domain is also found in proteins containing a ubiquitin-like domain, which are involved in the ubiquitination pathway. Proteins containing an AN1-type zinc finger include:
This entry represents MIZ-type zinc finger domains. Miz1 (Msx-interacting zinc finger) is a zinc finger-containing protein with homology to the yeast protein Nfi-1. Miz1 is a sequence-specific DNA-binding protein that can function as a positive-acting transcription factor. Miz1 binds to the homeobox protein Msx2, enhancing the specific DNA-binding ability of Msx2. Other proteins containing this domain include the human PIAS (protein inhibitor of activated STAT) family.
This entry represents a CW-type zinc finger motif, named for its conserved cysteine and tryptophan residues. It is predicted to be a highly specialised mononuclear four-cysteine (C4) zinc finger that plays a role in DNA binding and/or promoting protein-protein interactions in complicated eukaryotic processes, including the regulation of chromatin methylation status and early embryonic development. Weak homology to members of ... further supports these predictions. The domain is found exclusively in vertebrates, vertebrate-infecting parasites and higher plants.
The glycophorin-binding protein contains a tandem repeat. The repeated sequence determines the binding domain for an erythrocyte receptor binding protein of Plasmodium falciparum, the malarial parasite. Erythrocyte invasion by the malarial merozoite is a receptor-mediated process, an obligatory step in the development of the parasite. The P. falciparum protein binds to the erythrocyte receptor glycophorin.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.
AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, and also bind accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are several different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.
This entry represents the C-terminal domain of the mu subunit from various clathrin adaptors (AP1, AP2 and AP3). The C-terminal domain has an immunoglobulin-like beta-sandwich fold consisting of nine strands in two sheets with a Greek key topology, similar to that found in cytochrome f and certain transcription factors. The mu subunit regulates the coupling of clathrin lattices with particular membrane proteins by self-phosphorylation via a mechanism that is still unclear. The mu subunit possesses a highly conserved N-terminal domain of around 230 amino acids, which may be the region of interaction with other AP proteins; a linker region of between 10 and 42 amino acids; and a less well-conserved C-terminal domain of around 190 amino acids, which may be the site of specific interaction with the protein being transported in the vesicle.
More information about these proteins can be found at Protein of the Month: Clathrin.
This entry represents a probable zinc-binding motif that contains four cysteines and may chelate zinc, known as the DPH-type after the diphthamide (DPH) biosynthesis proteins in which it was first characterised, including the proteins DPH3 and DPH4. This domain is also found associated with the N-terminal domain of heat shock protein DnaJ.
Diphthamide is a unique post-translationally modified histidine residue found only in translation elongation factor 2 (eEF-2). It is conserved from archaea to humans and serves as the target for diphtheria toxin and Pseudomonas exotoxin A. These two toxins catalyse the transfer of ADP-ribose to diphthamide on eEF-2, thus inactivating eEF-2, halting cellular protein synthesis, and causing cell death. The biosynthesis of diphthamide is dependent on at least five proteins, DPH1 to -5, and a still unidentified amidating enzyme. DPH3 and DPH4 share a conserved region, which encodes a putative zinc finger, the DPH-type or CSL-type (after the conserved motif of the final cysteine) zinc finger. The function of this motif is unknown.
This entry represents the HIT-type zinc finger, which contains seven conserved cysteines and one histidine that can potentially coordinate two zinc atoms. It is named after the protein that originally defined the domain, the yeast HIT1 protein. The HIT-type zinc finger displays some sequence similarities to the MYND-type zinc finger. The function of this domain is unknown, but it is mainly found in nuclear proteins involved in gene regulation and chromatin remodelling. This domain is also found in the thyroid receptor interacting protein 3 (TRIP-3), which specifically interacts with the ligand-binding domain of the thyroid receptor.
The Histidine Triad (HIT) motif, His-phi-His-phi-His-phi-phi (where phi is a hydrophobic amino acid), was identified as being highly conserved in a variety of organisms. The crystal structure of rabbit Hint, purified as an adenosine and AMP-binding protein, showed that proteins in the HIT superfamily are conserved as nucleotide-binding proteins and that Hint homologues, which are found in all forms of life, are structurally related to Fhit homologues and GalT-related enzymes, which have more restricted phylogenetic profiles. Hint homologues, including rabbit Hint and yeast Hnt1, hydrolyse adenosine 5' monophosphoramide substrates such as AMP-NH2 and AMP-lysine to AMP plus the amine product, and function as positive regulators of Cdk7/Kin28 in vivo. Fhit homologues are diadenosine polyphosphate hydrolases and function as tumour suppressors in human and mouse, though the tumour-suppressing function of Fhit does not depend on ApppA hydrolysis. The third branch of the HIT superfamily, which includes GalT homologues, contains a related His-X-His-X-Gln motif and transfers nucleoside monophosphate moieties to phosphorylated second substrates rather than hydrolysing them.
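The two motifs described above, His-phi-His-phi-His-phi-phi and the related His-X-His-X-Gln of the GalT branch, can be written as pattern matches. This is a rough sketch under stated assumptions: the source does not say which residues count as hydrophobic, so the set in `PHI` is an assumption, and the names `HIT_MOTIF`, `GALT_MOTIF` and `scan` are hypothetical.

```python
import re

# ASSUMPTION: residues treated as hydrophobic (phi). The text does not
# define this set; adjust it to the convention you need.
PHI = "AILMFVWYC"

# His-phi-His-phi-His-phi-phi (histidine triad motif).
HIT_MOTIF = re.compile(rf"(?=(H[{PHI}]H[{PHI}]H[{PHI}][{PHI}]))")

# Related His-X-His-X-Gln motif of the GalT branch (X = any residue).
GALT_MOTIF = re.compile(r"(?=(H[A-Z]H[A-Z]Q))")


def scan(pattern, sequence):
    """Return (1-based start, matched segment) for each motif occurrence."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    # Invented toy sequences, used only to exercise the patterns.
    print(scan(HIT_MOTIF, "MKTHVHLHVLPRK"))
    print(scan(GALT_MOTIF, "AAHAHAQAA"))
```

A sequence match alone does not establish membership of the HIT superfamily; it only marks positions worth checking against structural or functional evidence.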
Ferredoxins are electron carrier proteins with an iron-sulphur cofactor that act in a wide variety of metabolic reactions. They can be divided into several subgroups depending upon the physiological nature of the iron-sulphur cluster(s) and according to sequence similarities.
This entry represents members of the 2Fe-2S ferredoxin family that have a general core structure consisting of beta(2)-alpha-beta(2), which includes putidaredoxin and terpredoxin, and adrenodoxin. They are proteins of around one hundred amino acids with four conserved cysteine residues to which the 2Fe-2S cluster is ligated. This conserved region is also found as a domain in various metabolic enzymes and in multidomain proteins, such as aldehyde oxidoreductase (N-terminal), xanthine oxidase (N-terminal), phthalate dioxygenase reductase (C-terminal), succinate dehydrogenase iron-sulphur protein (N-terminal), and methane monooxygenase reductase (N-terminal).
The contiguous gene deletion syndrome is characterised by Alport syndrome (A), mental retardation (M), midface hypoplasia (M), and elliptocytosis (E), as well as generalized hypoplasia and cardiac abnormalities. It is caused by a deletion in Xq22.3, comprising several genes including AMME chromosomal region gene 1 (AMMECR1), which encodes a protein with a nuclear location and presently unknown function. The C-terminal region of AMMECR1 (from residue 122 to 333) is well conserved, and homologues appear in species ranging from bacteria and archaea to eukaryotes. The high level of conservation of the AMMECR1 domain points to a basic cellular function, potentially in either the transcription, replication, repair or translation machinery.
The AMMECR1 domain contains a 6-amino-acid motif (LRGCIG) that might be functionally important since it is strikingly conserved throughout evolution. The AMMECR1 domain consists of two distinct subdomains of different sizes. The large subdomain, which contains both the N- and C-terminal regions, consists of five alpha-helices and five beta-strands. These five beta-strands form an antiparallel beta-sheet. The small subdomain consists of four alpha-helices and three beta-strands, and these beta-strands also form an antiparallel beta-sheet. The conserved 'LRGCIG' motif is located at beta(2) and its N-terminal loop, and most of the side chains of these residues point toward the interface of the two subdomains. The two subdomains are connected by only two loops, and the interaction between the two subdomains is not strong. Thus, these subdomains may move dynamically when the substrate enters the cleft. The size of the cleft suggests that the substrate is large, e.g., the substrate may be a nucleic acid or protein. However, the inner side of the cleft is not filled with positively charged residues, and therefore it is unlikely that negatively charged nucleic acids such as DNA or RNA interact at this site.
Snz1p is a highly conserved protein involved in growth arrest in Saccharomyces cerevisiae (Baker's yeast). Sor1 (singlet oxygen resistance) is essential in pyridoxine (vitamin B6) synthesis in Cercospora nicotianae and Aspergillus flavus. Pyridoxine quenches singlet oxygen at a rate comparable to that of vitamins C and E, two of the most highly efficient biological antioxidants, suggesting a previously unknown role for pyridoxine in active oxygen resistance.
Members of this family are involved in the pyridoxine biosynthetic pathway. The regulation of cellular growth and proliferation in response to environmental cues is critical for development and the maintenance of viability in all organisms. In unicellular organisms, such as the budding yeast Saccharomyces cerevisiae (Baker's yeast), growth and proliferation are regulated by nutrient availability.
This entry represents a zinc finger motif found in transcription factor IIS (TFIIS). In eukaryotes the initiation of transcription of protein-encoding genes by polymerase II (Pol II) is modulated by general and specific transcription factors. The general transcription factors operate through common promoter elements (such as the TATA box). At least eight different proteins associate to form the general transcription factors: TFIIA, -IIB, -IID, -IIE, -IIF, -IIG, -IIH and -IIS. During mRNA elongation, Pol II can encounter DNA sequences that cause reverse movement of the enzyme. Such backtracking involves extrusion of the RNA 3'-end into the pore, and can lead to transcriptional arrest. Escape from arrest requires cleavage of the extruded RNA with the help of TFIIS, which induces mRNA cleavage by enhancing the intrinsic nuclease activity of Pol II, allowing it to proceed past template-encoded pause sites. TFIIS extends from the polymerase surface via a pore to the internal active site. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.
TFIIS is a protein of about 300 amino acids. It contains three regions: a variable N-terminal domain not required for TFIIS activity; a conserved central domain required for Pol II binding; and a conserved C-terminal C4-type zinc finger essential for RNA cleavage. The zinc finger folds in a conformation termed a zinc ribbon characterised by a three-stranded antiparallel beta-sheet and two beta-hairpins. A backbone model for Pol II-TFIIS complex was obtained from X-ray analysis. It shows that a beta hairpin protrudes from the zinc finger and complements the pol II active site.
Some viral proteins also contain the TFIIS zinc ribbon C-terminal domain. The Vaccinia virus protein, unlike its eukaryotic homologue, is an integral RNA polymerase subunit rather than a readily separable transcription factor.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents a zinc finger motif found in transcription factor IIB (TFIIB). In eukaryotes the initiation of transcription of protein-encoding genes by the polymerase II complex (Pol II) is modulated by general and specific transcription factors. The general transcription factors operate through common promoter elements (such as the TATA box). At least seven different proteins associate to form the general transcription factors: TFIIA, -IIB, -IID, -IIE, -IIF, -IIG, and -IIH.
TFIIB and TFIID are responsible for promoter recognition and interaction with Pol II; together with Pol II, they form a minimal initiation complex capable of transcription under certain conditions. The TATA box of a Pol II promoter is bound in the initiation complex by the TBP subunit of TFIID, which bends the DNA around the C-terminal domain of TFIIB, whereas the N-terminal zinc finger of TFIIB interacts with Pol II.
The TFIIB zinc finger adopts a zinc ribbon fold characterised by two beta-hairpins forming two structurally similar zinc-binding sub-sites. The zinc finger contacts the Rpb1 subunit of Pol II through its dock domain, a conserved region of about 70 amino acids located close to the polymerase active site. In the Pol II complex this surface is located near the RNA exit groove. Interestingly, this sequence is best conserved in the three polymerases that utilise a TFIIB-like general transcription factor (Pol II, Pol III, and archaeal RNA polymerase) but not in Pol I.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Carbonic anhydrases (CAs) are zinc metalloenzymes which catalyse the reversible hydration of carbon dioxide to bicarbonate. CAs have essential roles in facilitating the transport of carbon dioxide and protons in the intracellular space, across biological membranes and in the layers of the extracellular space; they are also involved in many other processes, from respiration and photosynthesis in eukaryotes to cyanate degradation in prokaryotes. There are five known evolutionarily distinct CA families (alpha, beta, gamma, delta and epsilon) that have no significant sequence identity and have structurally distinct overall folds. Some CAs are membrane-bound, while others act in the cytosol; there are several related proteins that lack enzymatic activity. The active site of alpha-CAs is well described, consisting of a zinc ion coordinated through 3 histidine residues and a water molecule/hydroxide ion that acts as a potent nucleophile. The enzyme employs a two-step mechanism: in the first step, there is a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide; in the second step, the active site is regenerated by the ionisation of the zinc-bound water molecule and the removal of a proton from the active site. Beta- and gamma-CAs also employ a zinc hydroxide mechanism, although at least some beta-class enzymes do not have water directly coordinated to the metal ion.
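The two-step zinc hydroxide mechanism described above can be written out as reaction equations. This is a schematic sketch only: the symbol E stands for the enzyme, and the intermediate in which water displaces zinc-bound bicarbonate is the commonly accepted textbook form, stated here as an assumption rather than taken from this entry. The block assumes the `amsmath` package.

```latex
\begin{align*}
\text{Overall:}\quad
  & \mathrm{CO_2 + H_2O \rightleftharpoons HCO_3^- + H^+} \\
\text{Step 1 (nucleophilic attack):}\quad
  & \mathrm{E{\cdot}Zn^{2+}{\cdot}OH^- + CO_2 \rightleftharpoons E{\cdot}Zn^{2+}{\cdot}HCO_3^-} \\
  & \mathrm{E{\cdot}Zn^{2+}{\cdot}HCO_3^- + H_2O \rightleftharpoons E{\cdot}Zn^{2+}{\cdot}OH_2 + HCO_3^-} \\
\text{Step 2 (regeneration):}\quad
  & \mathrm{E{\cdot}Zn^{2+}{\cdot}OH_2 \rightleftharpoons E{\cdot}Zn^{2+}{\cdot}OH^- + H^+}
\end{align*}
```

Summing the three lines cancels the enzyme-bound species and recovers the overall hydration reaction.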
This entry represents alpha class carbonic anhydrases.
More information about these proteins can be found at Protein of the Month: Carbonic Anhydrase.
Protein prenylation is the posttranslational attachment of either a farnesyl group or a geranylgeranyl group via a thioether linkage (-C-S-C-) to a cysteine at or near the carboxyl terminus of the protein. Farnesyl and geranylgeranyl groups are polyisoprenes, unsaturated hydrocarbons with a multiple of five carbons; the chain is 15 carbons long in the farnesyl moiety and 20 carbons long in the geranylgeranyl moiety. There are three different protein prenyltransferases in humans: farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1) share the same motif (the CaaX box) around the cysteine in their substrates, and are thus called CaaX prenyltransferases, whereas geranylgeranyltransferase 2 (GGT2, also called Rab geranylgeranyltransferase) recognises a different motif and is thus called a non-CaaX prenyltransferase. Protein prenyltransferases are currently known only in eukaryotes, but they are widespread, being found in vertebrates, insects, nematodes, plants, fungi and protozoa, including several parasites.
Each protein consists of two subunits, alpha and beta; the alpha subunit of FT and GGT1 is encoded by the same gene, FNTA. The alpha subunit is thought to participate in a stable complex with the isoprenyl substrate; the beta subunit binds the peptide substrate. In the alpha subunits of both types of protein prenyltransferases, seven tetratricopeptide repeats are formed by pairs of helices that are stabilized by conserved intercalating residues. The alpha subunits of GGT2 in mammals and plants also have an immunoglobulin-like domain between the fifth and sixth tetratricopeptide repeat, as well as leucine-rich repeats at the carboxyl terminus. The functions of these additional domains in GGT2 are as yet undefined, but they are apparently not directly involved in the interaction with substrates and Rab escort proteins. The tetratricopeptide repeats of the alpha subunit form a right-handed superhelix, which embraces the (alpha-alpha)6 barrel of the beta subunit.
Nascent polypeptide-associated complex (NAC) is among the first ribosome-associated entities to bind the nascent polypeptide after peptide bond formation. The nascent polypeptide-associated complex (NAC) of yeast functions in the targeting process of ribosomes to the ER membrane. NAC may prevent binding of ribosome nascent chains (RNCs) without a signal sequence to yeast membranes.
The Macro or A1pp domain is a module of about 180 amino acids which can bind ADP-ribose, an NAD metabolite or related ligands. The domain was described originally in association with ADP-ribose 1''-phosphate (Appr-1''-P) processing activity (A1pp) of the yeast YBR022W protein. The domain is also called Macro domain as it is the C-terminal domain of mammalian core histone macro-H2A. Macro domain proteins can be found in eukaryotes, in (mostly pathogenic) bacteria, in archaea and in ssRNA viruses, such as coronaviruses, Rubella and Hepatitis E viruses. In vertebrates the domain occurs e.g. in histone macroH2A, in predicted poly-ADP-ribose polymerases (PARPs) and in B aggressive lymphoma (BAL) protein. The macro domain can be associated with catalytic domains, such as PARP, or sirtuin. The Macro domain can recognize ADP-ribose or in some cases poly-ADP-ribose, which can be involved in ADP-ribosylation reactions that occur in important processes, such as chromatin biology, DNA repair and transcription regulation. The human macroH2A1.1 Macro domain binds an NAD metabolite O-acetyl-ADP-ribose. The Macro domain has been suggested to play a regulatory role in ADP-ribosylation, which is involved in inter- and intracellular signaling, transcriptional regulation, DNA repair pathways and maintenance of genomic stability, telomere dynamics, cell differentiation and proliferation, and necrosis and apoptosis.
The 3D structure of the Macro domain has a mixed alpha/beta fold of a mixed beta sheet sandwiched between four helices. Several Macro domain-only proteins are shorter than the structure of AF1521 and lack either the first strand or the C-terminal helix 5. Well conserved residues form a hydrophobic cleft and cluster around the AF1521-ADP-ribose binding site.
The ELM2 (Egl-27 and MTA1 homology 2) domain is a small domain of unknown function. It is found in the MTA1 protein that is part of the NuRD complex. The domain is usually found to the N terminus of a myb-like DNA binding domain and a GATA binding domain. ELM2, in some instances, is also found associated with the ARID DNA binding domain. This suggests that ELM2 may also be involved in DNA binding, or perhaps is a protein-protein interaction domain.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
The N-end rule-based degradation signal, which targets a protein for ubiquitin-dependent proteolysis, comprises a destabilizing amino-terminal residue and a specific internal lysine residue. This entry describes a putative zinc finger in N-recognin, a recognition component of the N-end rule pathway.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Acylphosphatase is an enzyme of approximately 98 amino acid residues that specifically catalyses the hydrolysis of the carboxyl-phosphate bond of acylphosphates, its substrates including 1,3-diphosphoglycerate and carbamyl phosphate. The enzyme has a mainly beta-sheet structure with 2 short alpha-helical segments. It is distributed in a tissue-specific manner in a wide variety of species, although its physiological role is as yet unknown: it may, however, play a part in the regulation of the glycolytic pathway and pyrimidine biosynthesis. There are two known isozymes. One seems to be specific to muscular tissues; the other, called 'organ-common type', is found in many different tissues. There are also bacterial and archaebacterial hypothetical proteins that are highly similar to this enzyme and that probably possess the same activity.
These proteins include:
The ATP-cone is an evolutionarily mobile, ATP-binding regulatory domain which is found in a variety of proteins including ribonucleotide reductases, phosphoglycerate kinases and transcriptional regulators.
In ribonucleotide reductase protein R1 from Escherichia coli this domain is located at the N-terminus, and is composed mostly of helices. It forms part of the allosteric effector region and contains the general allosteric activity site in a cleft located at the tip of the N-terminal region. This site binds either ATP (activating) or dATP (inhibitory), with the base bound in a hydrophobic pocket and the phosphates bound to basic residues. Substrate binding to this site is thought to affect enzyme activity by altering the relative positions of the two subunits of ribonucleotide reductase.
The YrdC family of hypothetical proteins are widely distributed in eukaryotes and prokaryotes and occur as: (i) independent proteins, (ii) with C-terminal extensions, and (iii) as domains in larger proteins, some of which are implicated in regulation. The YrdC protein, which consists solely of this domain, forms an alpha/beta twisted open-sheet structure composed of seven alpha helices and seven beta strands. YrdC from Escherichia coli preferentially binds to double-stranded RNA and DNA. YrdC is predicted to be an rRNA maturation factor, as deletions in its gene lead to immature ribosomal 30S subunits and, consequently, fewer translating ribosomes. Therefore, YrdC may function by keeping an rRNA structure needed for proper processing of 16S rRNA, especially at lower temperatures. Sua5 is an example of a multi-domain protein that contains an N-terminal YrdC-like domain and a C-terminal Sua5 domain. Sua5 was identified in Saccharomyces cerevisiae (Baker's yeast) as a suppressor of a translation initiation defect in the cytochrome c gene and is required for normal growth in yeast; however its exact function remains unknown. HypF is involved in the synthesis of the active site of [NiFe]-hydrogenases.
Jumonji protein is required for neural tube formation in mice. There is evidence of domain swapping within the jumonji family of transcription factors. This domain is often associated with JmjC.
This entry contains:
Histone acetylation is carried out by a class of enzymes known as histone acetyltransferases (HATs), which catalyze the transfer of an acetyl group from acetyl-CoA to the lysine ε-amino groups on the N-terminal tails of histones. Early indication that HATs were involved in transcription came from the observation that in actively transcribed regions of chromatin, histones tend to be hyperacetylated, whereas in transcriptionally silent regions histones are hypoacetylated. The histone acetyltransferases are divided into five families. These include the Gcn5-related acetyltransferases (GNATs); the MYST (for 'MOZ, Ybf2/Sas3, Sas2 and Tip60')-related HATs; p300/CBP HATs; the general transcription factor HATs, which include the TFIID subunit TAF250; and the nuclear hormone-related HATs SRC1 and ACTR (SRC3). The GCN5-related N-acetyltransferase superfamily includes such enzymes as the histone acetyltransferases GCN5 and Hat1, the elongator complex subunit Elp3, the mediator-complex subunit Nut1, and Hpa2.
Many GNATs share several functional domains, including an N-terminal region of variable length, an acetyltransferase domain that encompasses the conserved sequence motifs described above, a region that interacts with the coactivator Ada2, and a C-terminal bromodomain that is believed to interact with acetyl-lysine residues. Members of the GNAT family are important for the regulation of cell growth and development. In mice, knockouts of Gcn5L are embryonic lethal. Yeast Gcn5 is needed for normal progression through the G2/M boundary and mitotic gene expression. The importance of GNATs is probably related to their role in transcription and DNA repair.
The yeast GCN5 (yGCN5) transcriptional coactivator functions as a histone acetyltransferase (HAT) to promote transcriptional activation. The crystal structure of the yeast histone acetyltransferase Hat1-acetyl coenzyme A (AcCoA) complex shows that Hat1 has an elongated, curved structure, and the AcCoA molecule is bound in a cleft on the concave surface of the protein, marking the active site of the enzyme. A channel of variable width and depth that runs across the protein is probably the binding site for the histone substrate. The central protein core associated with AcCoA binding appears to be structurally conserved among a superfamily of N-acetyltransferases, including yeast histone acetyltransferase 1 and Serratia marcescens aminoglycoside 3-N-acetyltransferase.
Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolyzing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate binding domain. Dimeric GrpE is the co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle.
Besides stimulating the ATPase activity of DnaK through its J-domain, DnaJ also associates with unfolded polypeptide chains and prevents their aggregation. Thus, DnaK and DnaJ may bind to one and the same polypeptide chain to form a ternary complex. The formation of a ternary complex may result in cis-interaction of the J-domain of DnaJ with the ATPase domain of DnaK. An unfolded polypeptide may enter the chaperone cycle by associating first either with ATP-liganded DnaK or with DnaJ. DnaK interacts with both the backbone and side chains of a peptide substrate; it thus shows binding polarity and admits only L-peptide segments. In contrast, DnaJ has been shown to bind both L- and D-peptides and is assumed to interact only with the side chains of the substrate.
Helicases have been classified in 5 superfamilies (SF1-SF5). All of the proteins bind ATP and, consequently, all of them carry the classical Walker A (phosphate-binding loop or P-loop) and Walker B (Mg2+-binding aspartic acid) motifs. For the two largest groups, commonly referred to as SF1 and SF2, a total of seven characteristic motifs has been identified. These two superfamilies encompass a large number of DNA and RNA helicases from archaea, eubacteria, eukaryotes and viruses that seem to be active as monomers or dimers. RNA and DNA helicases are considered to be enzymes that catalyse the separation of double-stranded nucleic acids in an energy-dependent manner.
The various structures of SF1 and SF2 helicases present a common core with two alpha-beta RecA-like domains. The structural homology with the RecA recombination protein covers the five contiguous parallel beta strands and the tandem alpha helices. ATP binds to the amino proximal alpha-beta domain, where the Walker A (motif I) and Walker B (motif II) are found. The N-terminal domain also contains motif III (S-A-T) which was proposed to participate in linking ATPase and helicase activities. The carboxy-terminal alpha-beta domain is structurally very similar to the proximal one even though it is bereft of an ATP-binding site, suggesting that it may have originally arisen through gene duplication of the first one.
Some members of helicase superfamilies 1 and 2 are listed below:
This entry represents the ATP-binding domain found within most SF1 and SF2 helicases.
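The Walker A (P-loop) motif mentioned above can be located in a sequence with a simple pattern search. The sketch below uses the widely quoted consensus G-x(4)-G-K-[ST], which is not stated in this entry and is introduced here as an assumption; the function name `find_walker_a` and the toy sequence are likewise hypothetical.

```python
import re

# Assumed consensus for Walker A (motif I): G-x(4)-G-K-[ST].
WALKER_A = re.compile(r"G[A-Z]{4}GK[ST]")


def find_walker_a(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched segment) for each non-overlapping hit."""
    seq = sequence.strip().upper()
    return [(m.start(), m.group()) for m in WALKER_A.finditer(seq)]


if __name__ == "__main__":
    toy = "MSTAGPPGSGKTALLV"  # hypothetical sequence, not a real helicase
    print(find_walker_a(toy))  # [(4, 'GPPGSGKT')]
```

A hit of this kind is only suggestive: many unrelated P-loop NTPases carry the same motif, so assignment to SF1 or SF2 also depends on the other characteristic motifs and the RecA-like fold.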
Helicases have been classified in 5 superfamilies (SF1-SF5). All of the proteins bind ATP and, consequently, all of them carry the classical Walker A (phosphate-binding loop or P-loop) and Walker B (Mg2+-binding aspartic acid) motifs. For the two largest groups, commonly referred to as SF1 and SF2, a total of seven characteristic motifs has been identified. These two superfamilies encompass a large number of DNA and RNA helicases from archaea, eubacteria, eukaryotes and viruses that seem to be active as monomers or dimers. RNA and DNA helicases are considered to be enzymes that catalyze the separation of double-stranded nucleic acids in an energy-dependent manner.
The various structures of SF1 and SF2 helicases present a common core with two alpha-beta RecA-like domains. The structural homology with the RecA recombination protein covers the five contiguous parallel beta strands and the tandem alpha helices. ATP binds to the amino proximal alpha-beta domain, where the Walker A (motif I) and Walker B (motif II) are found. The N-terminal domain also contains motif III (S-A-T) which was proposed to participate in linking ATPase and helicase activities. The carboxy-terminal alpha-beta domain is structurally very similar to the proximal one even though it is bereft of an ATP-binding site, suggesting that it may have originally arisen through gene duplication of the first one.
Some members of helicase superfamilies 1 and 2 are listed below:
This entry represents the ATP-binding domain found within bacterial DinG and eukaryotic Rad3 proteins, differing from other SF1 and SF2 helicases by the presence of a large insert after the Walker A motif.
The domain, which defines this group of proteins is found in a wide variety of helicases and helicase related proteins. It may be that this is not an autonomously folding unit, but an integral part of the helicase.
The eukaryotic translation initiation factor 4A (eIF4A) is a member of the DEA(D/H)-box RNA helicase family. This is a diverse group of proteins that couples an ATPase activity to RNA binding and unwinding. The structure of the carboxyl-terminal domain of eIF4A has been determined to 1.75 Å resolution; it has a parallel alpha-beta topology that superimposes, with minor variations, on the structures and conserved motifs of the equivalent domain in other, distantly related helicases.
RNA helicases from the DEAD-box family are found in almost all organisms and have important roles in RNA metabolism such as splicing, RNA transport, ribosome biogenesis, translation and RNA decay. They are enzymes that unwind double-stranded RNA molecules in an energy-dependent fashion through the hydrolysis of NTP. DEAD-box RNA helicases belong to superfamily 2 (SF2) of helicases. Like other SF1 and SF2 members, they contain seven conserved motifs which are characteristic of these two superfamilies. The DEAD-box is named after the amino acids of motif II or Walker B (Mg2+-binding aspartic acid). Besides these seven motifs, DEAD-box RNA helicases contain a conserved cluster of nine amino acids (the Q motif) with an invariant glutamine located N-terminally of motif I. An additional highly conserved but isolated aromatic residue is also found upstream of these nine residues. The Q motif is characteristic of and unique to the DEAD-box family of helicases. It is supposed to control ATP binding and hydrolysis, and therefore it represents a potential mechanism for regulating helicase activity.
Several structural analyses of DEAD-box RNA helicases have been reported. The Q motif is located in close proximity to motif I. The conserved glutamine and aromatic residues interact with the ADP molecule.
Some proteins known to contain a Q motif:
This entry represents a region stretching from the conserved aromatic residue to one amino acid after the glutamine of the Q motif.
Helicases have been classified in 5 superfamilies (SF1-SF5). All of the proteins bind ATP and, consequently, all of them carry the classical Walker A (phosphate-binding loop or P-loop) and Walker B (Mg2+-binding aspartic acid) motifs. For the two largest groups, commonly referred to as SF1 and SF2, a total of seven characteristic motifs have been identified which are distributed over two structural domains, an N-terminal ATP-binding domain and a C-terminal domain. UvrD-like DNA helicases belong to SF1, but they differ from classical SF1/SF2 by a large insertion in each domain. UvrD-like DNA helicases unwind DNA with a 3'-5' polarity.
Crystal structures of several UvrD-like DNA helicases have been solved. They are monomeric enzymes consisting of two domains with a common alpha-beta RecA-like core. The ATP-binding site is situated in a cleft between the N-terminus of the ATP-binding domain and the beginning of the C-terminal domain. The enzyme crystallizes in two different conformations (open and closed). The conformational difference between the two forms comprises a large rotation of the end of the C-terminal domain by approximately 130°. This "domain swiveling" was proposed to be an important aspect of the mechanism of the enzyme.
Some proteins that belong to the UvrD-like DNA helicase family are listed below:
This entry represents the ATP-binding domain found in UvrD-like helicases.
The hexameric helicase DnaB unwinds the DNA duplex at the Escherichia coli chromosome replication fork. Although the mechanism by which DnaB both couples ATP hydrolysis to translocation along DNA and denatures the duplex is unknown, a change in the quaternary structure of the protein involving dimerization of the N-terminal domain has been observed and may occur during the enzymatic cycle. This C-terminal domain contains an ATP-binding site and is therefore probably the site of ATP hydrolysis.
The helicase/SANT-associated (HSA) domain is a predicted DNA-binding domain of ~75 amino acids, which is found in the eukaryotic SRCAP/p400/DOM and SNF2/brahma families. While each family has the core sequences that define the HSA domain, they each also have additional sequences that distinguish these families from one another. For example, the sequence HWDY(L/C)EEEM(Q/V) is found in the SRCAP/p400/DOM family, whereas the sequence HQE(Y/F)LNSILQ is found in the SNF2/brahma family. In addition to the SANT and helicase domains, the HSA domain is also found in association with the bromo domain.
The exchange of macromolecules between the nucleus and cytoplasm takes place through nuclear pore complexes within the nuclear membrane. Active transport of large molecules through these pore complexes requires carrier proteins, called karyopherins (importins and exportins), which shuttle between the two compartments.
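The two family-specific sequences quoted above translate directly into regular expressions, with each parenthesised alternative such as (L/C) becoming a character class. The following sketch is illustrative only: the function name `classify_hsa`, the dictionary name and the toy sequences are assumptions, and exact matching is stricter than the sequence variation real family members are likely to show.

```python
import re

# Family-distinguishing sequences from the text, written as regexes.
HSA_FAMILY_MOTIFS = {
    "SRCAP/p400/DOM": re.compile(r"HWDY[LC]EEEM[QV]"),
    "SNF2/brahma": re.compile(r"HQE[YF]LNSILQ"),
}


def classify_hsa(sequence: str):
    """Return the family whose signature occurs in the sequence, else None."""
    seq = sequence.strip().upper()
    for family, motif in HSA_FAMILY_MOTIFS.items():
        if motif.search(seq):
            return family
    return None


if __name__ == "__main__":
    # Hypothetical toy sequences built around the quoted signatures.
    for toy in ("AAHWDYLEEEMQKK", "GGHQEFLNSILQTT", "MKTAYIAK"):
        print(toy, classify_hsa(toy))
```

In practice a profile-based search would be more tolerant of substitutions than these exact-match patterns.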
Members of the importin-alpha (karyopherin-alpha) family can form heterodimers with importin-beta. As part of a heterodimer, importin-beta mediates interactions with the pore complex, while importin-alpha acts as an adaptor protein to bind the nuclear localisation signal (NLS) on the cargo through the classical NLS import of proteins. Proteins can contain one (monopartite) or two (bipartite) NLS motifs. Importin-alpha contains several armadillo (ARM) repeats, which produce a curving structure with two NLS-binding sites, a major one close to the N-terminus and a minor one close to the C-terminus.
Ran GTPase helps to control the unidirectional transfer of cargo. The cytoplasm contains primarily RanGDP and the nucleus RanGTP through the actions of RanGAP and RanGEF, respectively. In the nucleus, RanGTP binds to importin-beta within the importin/cargo complex, causing a conformational change in importin-beta that releases it from importin-alpha-bound cargo. The N-terminal importin-beta-binding (IBB) domain of importin-alpha contains an auto-regulatory region that mimics the NLS motif. The release of importin-beta frees the auto-regulatory region on importin-alpha to loop back and bind to the major NLS-binding site, causing the cargo to be released.
This entry represents the N-terminal IBB domain of importin-alpha that contains the auto-regulatory region.
More information about these proteins can be found at Protein of the Month: Importins.
This domain, Associated With SET, of unknown function is found in eukaryotic proteins of unknown function. This domain, as the name suggests, is often found in association with the SET domain, suggesting a role in gene regulation by methylation of lysine residues in histones and other proteins.
Helicases have been classified in 5 superfamilies (SF1-SF5). All of the proteins bind ATP and, consequently, all of them carry the classical Walker A (phosphate-binding loop or P-loop) and Walker B (Mg2+-binding aspartic acid) motifs. For the two largest groups, commonly referred to as SF1 and SF2, a total of seven characteristic motifs have been identified which are distributed over two structural domains, an N-terminal ATP-binding domain and a C-terminal domain.
This entry represents the C-terminal domain.
UvrD-like DNA helicases belong to SF1, but they differ from classical SF1/SF2 by a large insertion in each domain. UvrD-like DNA helicases unwind DNA with a 3'-5' polarity. Crystal structures of several uvrD-like DNA helicases have been solved. They are monomeric enzymes consisting of two domains with a common alpha-beta RecA-like core. The ATP-binding site is situated in a cleft between the N-terminus of the ATP-binding domain and the beginning of the C-terminal domain. The enzyme crystallizes in two different conformations (open and closed). The conformational difference between the two forms comprises a large rotation of the end of the C-terminal domain by approximately 130°. This "domain swiveling" was proposed to be an important aspect of the mechanism of the enzyme.
Some proteins that belong to the uvrD-like DNA helicase family are listed below:
This family contains dephospho-CoA kinases, which catalyze the final step in CoA biosynthesis, the phosphorylation of the 3'-hydroxyl group of ribose using ATP as a phosphate donor.
The crystal structures of a number of the proteins in this entry have been determined, including the structure of the protein from Haemophilus influenzae to 2.0 Å resolution in a complex with ATP. The protein consists of three domains: the nucleotide-binding domain with a five-stranded parallel beta-sheet, the substrate-binding alpha-helical domain, and the lid domain formed by a pair of alpha-helices; the overall topology of the protein resembles the structures of other nucleotide kinases.
Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). Tubulin-tyrosine ligase (TTL) catalyses the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. The true physiological function of TTL has so far not been established. In normally cycling cells, the tyrosinated form of tubulin predominates. However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumour aggressiveness.
3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis.
Acyl-CoA-binding protein (ACBP) is a small (10 kDa) protein that binds medium- and long-chain acyl-CoA esters with very high affinity and may function as an intracellular carrier of acyl-CoA esters. ACBP is also known as diazepam binding inhibitor (DBI) or endozepine (EP) because of its ability to displace diazepam from the benzodiazepine (BZD) recognition site located on the GABA type A receptor. It is therefore possible that this protein also acts as a neuropeptide to modulate the action of the GABA receptor.
ACBP is a highly conserved protein of about 90 residues that is found in all four eukaryotic kingdoms, Animalia, Plantae, Fungi and Protista, and in some eubacterial species.
Although ACBP occurs as a completely independent protein, intact ACB domains have been identified in a number of large, multifunctional proteins in a variety of eukaryotic species. These include large membrane-associated proteins with N-terminal ACB domains, multifunctional enzymes with both ACB and peroxisomal enoyl-CoA Delta(3), Delta(2)-enoyl-CoA isomerase domains, and proteins with both an ACB domain and ankyrin repeats.
The ACB domain consists of four alpha-helices arranged in a bowl shape with a highly exposed acyl-CoA-binding site. The ligand is bound through specific interactions with residues on the protein, most notably several conserved positive charges that interact with the phosphate group on the adenosine-3'-phosphate moiety, and the acyl chain is sandwiched between the hydrophobic surfaces of CoA and the protein.
Other proteins containing an ACB domain include:
A group of microtubule-associated proteins called +TIPs (plus end tracking proteins), including EB1 (end-binding protein 1) family proteins, label growing microtubule ends specifically in diverse organisms and are implicated in spindle dynamics, chromosome segregation, and directing microtubules toward cortical sites. EB1 members have a bipartite composition: an N-terminal CH domain that mediates microtubule plus end localization and a C-terminal cargo-binding domain (EB1-C) that captures cell polarity determinants. The EB1-C domain comprises a unique EB1-like sequence motif that acts as a binding site for other +TIP proteins. It interacts with the carboxy terminus of the adenomatous polyposis coli (APC) tumor suppressor, a well conserved +TIP phosphoprotein with a pivotal function in cell cycle regulation. Another binding partner of the EB1-C domain is the well conserved +TIP protein dynactin, a component of the large cytoplasmic dynein/dynactin complex.
The ~80-residue EB1-C domain starts with a long smoothly curved helix (alpha1), which is followed by a hairpin connection leading to a short second helix (alpha2) running antiparallel to alpha1. The two parallel alpha1 helices of the EB1-C domain dimer wrap around each other in a slightly left-handed supercoil. The two alpha2 helices run antiparallel to helices alpha1 and form a similar fork in the opposite orientation and rotated by 90°. As a result, two helical segments from each monomer form a four-helix bundle. The side chains forming the hydrophobic core of this bundle are highly conserved.
Some proteins known to contain an EB1-C domain are listed below:
The actin-depolymerising factor homology (ADF-H) domain is an ~150-amino acid motif that is present in three phylogenetically distinct classes of eukaryotic actin-binding proteins:
Although these proteins are biochemically distinct and play different roles in actin dynamics, they all appear to use the ADF-H domain for their interactions with actin.
The ADF-H domain consists of a six-stranded mixed beta-sheet in which the four central strands (beta2-beta5) are anti-parallel and the two edge strands (beta1 and beta6) run parallel with the neighbouring strands. The sheet is surrounded by two alpha-helices on each side.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
Pirh2 is a eukaryotic ubiquitin protein ligase, which has been shown to promote p53 degradation in mammals. Pirh2 physically interacts with p53 and promotes ubiquitination of p53 independently of MDM2. Like MDM2, Pirh2 is thought to participate in an autoregulatory feedback loop that controls p53 function. Pirh2 proteins contain three distinct zinc fingers: the CHY-type, the CTCHY-type (which is C-terminal to the CHY-type zinc finger) and a RING finger. The CHY-type zinc finger has no currently known function.
As well as Pirh2, the CHY-type zinc finger is also found in the following proteins:
The solution structure of this zinc finger has been solved; the finger binds three zinc atoms, as shown in the following schematic representation:
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Glutamine amidotransferase (GATase) enzymes catalyse the removal of the ammonia group from glutamine and then transfer this group to a substrate to form a new carbon-nitrogen group. The GATase domain exists either as a separate polypeptidic subunit or as part of a larger polypeptide fused in different ways to a synthase domain. Two classes of GATase domains have been identified: class-I (also known as trpG-type or triad) and class-II (also known as purF-type or Ntn). Class-I (or type 1) GATase domains have been found in the following enzymes:
A large group of biosynthetic enzymes are able to catalyse the removal of the ammonia group from glutamine and then to transfer this group to a substrate to form a new carbon-nitrogen group. This catalytic activity is known as glutamine amidotransferase (GATase). The GATase domain exists either as a separate polypeptidic subunit or as part of a larger polypeptide fused in different ways to a synthase domain. On the basis of sequence similarities two classes of GATase domains have been identified: class-I (also known as trpG-type or triad) and class-II (also known as purF-type or Ntn). Class-II (or type 2) GATase domains have been found in the following enzymes:
The active site is formed by a cysteine present at the N-terminal extremity of the mature form of all these enzymes. Two other conserved residues, Asn and Gly, form an oxyanion hole for stabilisation of the formed tetrahedral intermediate. An insert of ~120 residues can occur between the conserved regions. In some class-II GATases (for example in Bacillus subtilis or chicken amidophosphoribosyltransferase) the enzyme is synthesised with a short propeptide which is cleaved off post-translationally by a proposed autocatalytic mechanism. Nuclear-encoded Fd-dependent gltS have a longer propeptide which may contain a chloroplast-targeting peptide in addition to the propeptide that is excised on enzyme activation.
The 3-D structure of the GATase type 2 domain forms a four layer alpha/beta/beta/alpha architecture which consists of a fold similar to the N-terminal nucleophile (Ntn) hydrolases. These have the capacity for nucleophilic attack and the possibility of autocatalytic processing. The N-terminal position and the folding of the catalytic Cys differ strongly from the Cys-His-Glu triad which forms the active site of GATases of type 1.
Vertebrate BCNT (named after Bucentaur) protein is found in the nucleus and cytosol. Gene duplication of the ancestral BCNT gene leads to the h-type BCNT or craniofacial development protein 1 (CFDP1) gene and the ruminant-specific p97BCNT or craniofacial development protein 2 (CFDP2) gene. The h-type BCNT proteins contain a highly conserved 82-amino acid region at the C-terminus (BCNT-C) that is not present in p97BCNT. Instead ruminant p97BCNT contains a region derived from the endonuclease domain of a retrotransposable element RTE-1. In addition to h-type BCNT proteins, a BCNT-C domain is also found in Drosophila YETI, a protein that binds to a microtubule-based motor kinesin-1, and the yeast SWR1-complex protein 5 (SWC5) or AOR1 (actin overexpression resistant 1), a component of the SWR1 chromatin remodeling complex.
Deubiquitinating enzymes (DUBs) form a large family of cysteine proteases that can deconjugate ubiquitin or ubiquitin-like proteins from ubiquitin-conjugated proteins. All DUBs contain a catalytic domain surrounded by one or more subdomains, some of which contribute to target recognition. The ~120-residue DUSP (domain present in ubiquitin-specific proteases) domain is one of these specific subdomains. Single or tandem DUSP domains are located both N- and C-terminal to the ubiquitin carboxyl-terminal hydrolase catalytic core domain.
The DUSP domain displays a tripod-like AB3 fold with a three-helix bundle and a three-stranded anti-parallel beta-sheet resembling the legs and seat of the tripod (see PDB:1W6V). Conserved residues are predominantly involved in hydrophobic packing interactions within the three alpha-helices. The most conserved DUSP residues, forming the PGPI motif, are flanked by two long loops that vary both in length and sequence. The PGPI motif packs against the three-helix bundle and is highly ordered.
The function of the DUSP domain is unknown, but it may play a role in protein/protein interaction or substrate recognition. This domain is associated with ubiquitin carboxyl-terminal hydrolase family 2 (MEROPS peptidase family C19). They are a family of 100 to 200 kDa peptides which includes the Ubp1 ubiquitin peptidase from yeast; others include:
The anaphase-promoting complex (APC) is a multi-subunit E3 protein ubiquitin ligase that is responsible for the metaphase to anaphase transition and the exit from mitosis. Anaphase is initiated when the APC triggers the destruction of securin, thereby allowing the protease, separase, to disrupt sister-chromatid cohesion. Securin ubiquitination by the APC is inhibited by cyclin-dependent kinase 1 (Cdk1)-dependent phosphorylation.
Forkhead Box M1 (FoxM1), which is a transcription factor that is over-expressed in many cancers, is degraded in late mitosis and early G1 phase by the APC/cyclosome (APC/C) E3 ubiquitin ligase. The APC/C targets mitotic cyclins for destruction in mitosis and G1 phase and is then inactivated at S phase. It thereby generates alternating states of high and low cyclin-Cdk activity, which is required for the alternation of mitosis and DNA replication.
APC from Schizosaccharomyces pombe and Saccharomyces cerevisiae was previously thought to have 11 subunits, but more sensitive techniques have identified 13 subunits in both yeasts.
One of the subunits of the APC that is required for ubiquitination activity is APC10, a one-domain protein homologous to a sequence element, termed the DOC domain, found in several hypothetical proteins that may also mediate ubiquitination reactions, because they contain combinations of RING finger, cullin or HECT domains.
The DOC domain consists of a beta-sandwich, in which a five-stranded antiparallel beta-sheet is packed on top of a three-stranded antiparallel beta-sheet, exhibiting a 'jellyroll' fold.
Proteins known to contain a DOC domain include:
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.
The AGC (cAMP-dependent, cGMP-dependent and protein kinase C) protein kinase family embraces a collection of protein kinases that display a high degree of sequence similarity within their respective kinase domains. AGC kinase proteins are characterised by three conserved phosphorylation sites that critically regulate their function. The first one is located in an activation loop in the centre of the kinase domain. The two other phosphorylation sites are located outside the kinase domain in a conserved region on its C-terminal side, the AGC-kinase C-terminal domain. These sites serve as phosphorylation-regulated switches to control both intra- and inter-molecular interactions. Without these priming phosphorylations, the kinases are catalytically inactive.
Several structures of the AGC-kinase C-terminal domain have been solved. The first phosphorylation site is located in a turn motif, the second one at the end of the domain in a hydrophobic pocket. In PKB the phosphorylated hydrophobic motif engages a hydrophobic groove within the N-lobe of the kinase domain which orders alpha helices close to the active site.
The ~60-residue RAP (an acronym for RNA-binding domain abundant in Apicomplexans) domain is found in various proteins in eukaryotes. It is particularly abundant in apicomplexans and might mediate a range of cellular functions through its potential interactions with RNA.
The RAP domain consists of multiple blocks of charged and aromatic residues and is predicted to be composed of alpha helical and beta strand structures. Two predicted loop regions that are dominated by glycine and tryptophan residues are found before and after the central beta sheet. Some proteins known to contain a RAP domain are listed below:
The RING finger is a well characterised zinc finger which coordinates two zinc atoms in a cross-braced manner. According to the pattern of cysteines and histidines, three different subfamilies of RING finger can be defined. The classical RING finger (RING-HC) has a histidine at the fourth coordinating position and a cysteine at the fifth. In the RING-H2 variant, both the fourth and fifth positions are occupied by histidines. The RING-CH, which is very similar to the classical RING finger, differs from both of these variants in that it has a Cys residue in the fourth position and a His in the fifth. Another difference between the RING-CH and the common RING variants is a somewhat longer peptide segment between the fourth and fifth zinc-coordinating residues. The RING-CH zinc finger thus has the same arrangement of cysteine and histidine (C4HC3) as the PHD zinc finger, but it contains features (spacing between the cysteines and the histidine) characteristic of the genuine RING finger (C3HC4). The RING-CH-type zinc finger is an E3 ligase domain mainly found in membrane-associated proteins.
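The three subfamilies described above differ only in the residues at the fourth and fifth zinc-coordinating positions, so the rule can be sketched as a small lookup. This is an illustrative sketch rather than a published classifier: the function name `classify_ring` and the convention of passing the eight zinc ligands as a string of one-letter codes in sequence order are assumptions made for the example.

```python
def classify_ring(ligands: str) -> str:
    """Classify a RING finger from its eight zinc ligands (one-letter codes).

    `ligands` lists the eight zinc-coordinating residues in sequence order,
    e.g. "CCCHCCCC". Only positions four and five are inspected, following
    the rule given in the text. Hypothetical helper, for illustration only.
    """
    ligands = ligands.upper()
    if len(ligands) != 8 or set(ligands) - {"C", "H"}:
        raise ValueError("expected eight ligands, each C or H")
    fourth, fifth = ligands[3], ligands[4]
    variants = {
        ("H", "C"): "RING-HC",  # classical RING finger, C3HC4
        ("H", "H"): "RING-H2",  # histidine at both positions, C3H2C3
        ("C", "H"): "RING-CH",  # same Cys/His arrangement as PHD, C4HC3
    }
    return variants.get((fourth, fifth), "unclassified")
```

Because RING-CH and PHD fingers share the C4HC3 arrangement, a ligand pattern alone cannot separate them; the spacing between ligands would also have to be checked, which this sketch does not attempt.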
The solution structure of the RING-CH-type zinc finger of the herpesvirus Mir1 protein has shown that it is an outlying relative of the cellular RING finger domain family, with its polypeptide backbone much more closely resembling that of RING domains than PHD domains. The only real difference between the classic and variant RING domains, other than the alteration of zinc ligands, is the loss of the small beta-sheet found in RING domains and the replacement of one strand of this sheet with a single turn of helix. Some proteins that contain a RING-CH-type zinc finger are listed below:
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
The myb family can be classified into three groups: the myb-type HTH domain, which binds DNA, the SANT domain, which is a protein-protein interaction module and the myb-like domain that can be involved in either of these functions.
The SANT domain is a motif of ~50 amino acids present in proteins involved in chromatin-remodelling and transcription regulation. This eukaryotic domain was identified in nuclear receptor co-repressors and named after switching-defective protein 3 (Swi3), adaptor 2 (Ada2), nuclear receptor co-repressor (N-CoR) and transcription factor (TF)IIIB. Although SANT domains show remarkable sequence and structural similarity to the DNA-binding helix-turn-helix (HTH) domain of the myb-like tandem repeat, their function is not DNA binding. Instead, SANT domains are protein-protein interaction modules and some can bind to histone tails (e.g. in Ada2 and SMRT). SANT domains are found in combination with other domains, such as the SWIRM domain, the ZZ-type zinc finger, the C2H2-type zinc finger, the GATA-type zinc finger, the MPN domain and the DEAH ATP-helicase domain. The SANT domain was proposed to function as a histone-interaction module that couples histone-tail binding to enzyme catalysis for the remodelling of nucleosomes.
The 3-dimensional structure of the SANT domain forms three alpha helices (see PDB:1OFC) similar to the DNA-binding myb-type HTH domain. Because of the strong resemblance, the SANT domain can also be detected as a myb-like "DNA-binding" domain. Most SANT domains have acidic amino acids at the start of helix 2 and in helix 3, while myb-like DNA-binding domains have more positively charged residues, in particular in their third 'recognition' helix. The bulky aromatic and hydrophobic residues in the centre of helix 3 that are incompatible with DNA contacts of myb-like DNA-binding domains form another distinguishing property of SANT domains.
The myb-type HTH domain is a DNA-binding, helix-turn-helix (HTH) domain of ~55 amino acids, typically occurring in a tandem repeat in eukaryotic transcription factors. The domain is named after the retroviral oncogene v-myb, and its cellular counterpart c-myb, which encode nuclear DNA-binding proteins that specifically recognize the sequence YAAC(G/T)G. Myb proteins contain three tandem repeats of 51 to 53 amino acids, termed R1, R2 and R3. This repeat region is involved in DNA-binding and R2 and R3 bind directly to the DNA major groove. The major part of the first repeat is missing in retroviral v-Myb sequences and in plant myb-related (R2R3) proteins. A single myb-type HTH DNA-binding domain occurs in TRF1 and TRF2. The 3D-structure of the myb-type HTH domain forms three alpha-helices. The second and third helices connected via a turn comprise the helix-turn-helix motif. Helix 3 is termed the recognition helix as it binds the DNA major groove, like in other HTHs.
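The recognition sequence YAAC(G/T)G quoted above can be turned into a simple pattern search. The sketch below assumes the standard IUPAC reading of Y as C or T; the function name `find_myb_sites` is hypothetical, and only the strand supplied is scanned, so a real analysis would also need to examine the reverse complement.

```python
import re

# Y = C or T (IUPAC); (G/T) = G or T. The lookahead lets overlapping
# candidate sites be reported rather than skipped.
MYB_SITE = re.compile(r"(?=([CT]AAC[GT]G))")


def find_myb_sites(dna):
    """Return (0-based start, matched hexamer) for each YAAC(G/T)G site.

    Hypothetical helper for illustration; scans the given strand only.
    """
    return [(m.start(), m.group(1)) for m in MYB_SITE.finditer(dna.upper())]


if __name__ == "__main__":
    for start, site in find_myb_sites("ggCAACGGttTAACTGaa"):
        print(start, site)
```

A sequence match only marks a candidate site; whether a Myb protein binds there depends on context that a pattern search cannot capture.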
There are multiple types of iron-sulphur clusters which are grouped into three main categories based on their atomic content: [2Fe-2S], [3Fe-4S], [4Fe-4S], and other hybrid or mixed metal types. Two general types of [2Fe-2S] clusters are known and they differ in their coordinating residues. The ferredoxin-type [2Fe-2S] clusters are coordinated to the protein by four cysteine residues. The Rieske-type [2Fe-2S] cluster is coordinated to its protein by two cysteine residues and two histidine residues.
The structure of several Rieske domains has been solved. It contains three layers of antiparallel beta sheets forming two beta sandwiches. Both beta sandwiches share the central sheet 2. The metal-binding site is at the top of the beta sandwich formed by the sheets 2 and 3. The Fe1 iron of the Rieske cluster is coordinated by two cysteines while the other iron Fe2 is coordinated by two histidines. Two inorganic sulphide ions bridge the two iron ions forming a flat, rhombic cluster.
Rieske-type iron-sulphur clusters are common to electron transfer chains of mitochondria and chloroplast and to non-haem iron oxygenase systems:
The polyadenylate-binding protein (PABP) has a conserved C-terminal domain (PABC), which is also found in the hyperplastic discs protein (HYD) family of ubiquitin ligases that contain HECT domains. PABP recognises the 3' mRNA poly(A) tail and plays an essential role in eukaryotic translation initiation and mRNA stabilisation/degradation. PABC domains of PABP are peptide-binding domains that mediate PABP homo-oligomerisation and protein-protein interactions. In mammals, the PABC domain of PABP functions to recruit several different translation factors to the mRNA poly(A) tail.
Transcription factor S-II (TFIIS) is a eukaryotic protein which induces mRNA cleavage by enhancing the intrinsic nuclease activity of RNA polymerase (Pol) II, past template-encoded pause sites. TFIIS shows DNA-binding activity only in the presence of RNA polymerase II. It is widely distributed, being found in mammals, Drosophila, yeast and the archaebacterium Sulfolobus acidocaldarius. S-II proteins have a relatively conserved C-terminal region but variable N-terminal region, and some members of this family are expressed in a tissue-specific manner.
TFIIS is a modular factor that comprises an N-terminal domain I, a central domain II, and a C-terminal domain III. The weakly conserved domain I forms a four-helix bundle and is not required for TFIIS activity. Domain II forms a three-helix bundle, and domain III adopts a zinc-ribbon fold with a thin protruding beta-hairpin. Domain II and the linker between domains II and III are required for Pol II binding, whereas domain III is essential for stimulation of RNA cleavage. TFIIS extends from the polymerase surface via a pore to the internal active site, spanning a distance of 100 Angstroms. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.
The TFIIS N-terminal domain is a compact four-helix bundle. The hydrophobic core residues of helices 2, 3, and 4 are well conserved among TFIIS domains, although helix 1 is less conserved.
TFIIS domain II is found in the central region of transcription elongation factor S-II and also occurs in several hypothetical proteins.
The ~100-residue ERV/ALR sulphydryl oxidase domain is a versatile module adapted for catalysis of disulphide bond formation in various organelles and biological settings. The ERV/ALR sulphydryl oxidase domain has a Cys-X-X-Cys dithiol/disulphide motif adjacent to a bound FAD cofactor, enabling transfer of electrons from thiol substrates to non-thiol electron acceptors. ERV/ALR family members differ in their N- or C-terminal extensions, which typically contain at least one additional disulphide bond, the hypothesised 'shuttle' disulphide. In yeast ERV1, a mitochondrial enzyme, the shuttle disulphide is N-terminal to the catalytic core; in yeast ERV2, present in the endoplasmic reticulum, it is C-terminal. The N- and C-terminal extensions can be entire domains, such as the thioredoxin-like domains, or short segments that do not seem to be distinct domains. Proteins of the ERV/ALR family are encoded by all eukaryotes and cytoplasmic DNA viruses (poxviruses, African swine fever virus, iridoviruses, and Paramecium bursaria Chlorella virus 1).
The ERV/ALR sulphydryl oxidase domain contains a four-helix bundle (helices alpha1-alpha4) and an additional single turn of helix (alpha5) packed perpendicular to the bundle. The FAD prosthetic group is housed at the mouth of the 4-helix bundle and communicates with the pair of juxtaposed cysteine residues that form the proximal redox active site.
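As a rough illustration of the Cys-X-X-Cys motif mentioned above, the sketch below lists every position in a protein sequence where two cysteines are separated by exactly two residues. The function name `find_cxxc` and the example sequence are made up for the illustration; a hit is only a candidate, since the motif alone says nothing about FAD binding or redox activity.

```python
import re

# C, any two residues, C. The lookahead reports overlapping motifs too.
CXXC = re.compile(r"(?=(C[A-Z]{2}C))")


def find_cxxc(protein):
    """Return 1-based positions of the first Cys of each Cys-X-X-Cys motif.

    Hypothetical helper for illustration; expects one-letter residue codes.
    """
    return [m.start() + 1 for m in CXXC.finditer(protein.upper())]


if __name__ == "__main__":
    # Invented peptide, not a real ERV/ALR sequence.
    print(find_cxxc("MKCAACGHCPLCW"))
```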
The C-CAP/cofactor C-like domain is present in several cytoskeleton-related proteins, which also contain a number of additional domains:
The cyclase-associated protein C-CAP/cofactor C-like domain binds G-actin and is responsible for oligomerisation of the entire CAP molecule, whereas the XRP2 C-CAP/cofactor C-like domain is required for binding of ADP ribosylation factor-like protein 3 (Arl3).
The central core of the C-CAP/cofactor C-like domain is composed of six coils of right-handed parallel beta-helices, termed coils 1-6, which form an elliptical barrel with a tightly packed interior. Each beta-helical coil is composed of three relatively short beta-strands, designated a-c, separated by sharp turns. Flanking the central beta-helical core is an N-terminal beta-strand, beta0, that packs antiparallel to the core, and strand beta7 packs antiparallel to the core near the C-terminal end of the parallel beta-helix.
Dihydrofolate reductase (DHFR) catalyses the NADPH-dependent reduction of dihydrofolate to tetrahydrofolate, an essential step in de novo synthesis both of glycine and of purines and deoxythymidine phosphate (the precursors of DNA synthesis), and important also in the conversion of deoxyuridine monophosphate to deoxythymidine monophosphate. Although DHFR is found ubiquitously in prokaryotes and eukaryotes, and is found in all dividing cells, maintaining levels of fully reduced folate coenzymes, the catabolic steps are still not well understood.
Bacterial species possess distinct DHFR enzymes (based on their pattern of binding diaminoheterocyclic molecules), but mammalian DHFRs are highly similar. The active site is situated in the N-terminal half of the sequence, which includes a conserved Pro-Trp dipeptide; the tryptophan has been shown to be involved in the binding of substrate by the enzyme. Its central role in DNA precursor synthesis, coupled with its inhibition by antagonists such as trimethoprim and methotrexate, which are used as anti-bacterial or anti-cancer agents, has made DHFR a target of anticancer chemotherapy. However, resistance has developed against some drugs, as a result of changes in DHFR itself.
Initiation of eukaryotic mRNA transcription requires melting of promoter DNA with the help of the general transcription factors TFIIE and TFIIH. In higher eukaryotes, the general transcription factor TFIIE consists of two subunits: the large alpha subunit and the small beta. TFIIE beta has been found to bind to the region where the promoter starts to open to be single-stranded upon transcription initiation by RNA polymerase II. The approximately 120-residue central core domain of TFIIE beta plays a role in double-stranded DNA binding of TFIIE.
The TFIIE beta central core DNA-binding domain consists of three helices with a beta hairpin at the C-terminus, resembling the winged helix proteins. It shows a novel double-stranded DNA-binding activity where the DNA-binding surface locates on the opposite side to the previously reported winged helix motif by forming a positively charged furrow.
Archaea contain a TFIIE homolog, called TFE, which corresponds to the N-terminal half of TFIIEalpha. It appears that archaeal TFE corresponds to the minimal essential region of eukaryotic TFIIEalpha. In archaea TFE contains an N-terminal, weakly conserved, helix-turn-helix (HTH) motif within a leucine-rich region and a C-terminal zinc ribbon. It has been proposed that the TFE/IIEalpha-type HTH domain acts as a bridging factor or adapter between the TATA box-binding protein, the polymerase, and possibly promoter DNA.
The TFE/IIEalpha-type HTH domain adopts a winged HTH (winged helix) fold, comprising three alpha-helices and three beta-strands in the canonical order alpha1-beta1-alpha2-alpha3-beta2-beta3. Conserved residues within helices alpha1-alpha3 form the tightly packed hydrophobic core of the winged helix domain. A specific feature of the structure is the extension of the canonical winged helix fold at the N and C termini by the additional helices alpha0 and alpha4, respectively. Hydrophobic residues from the additional helix alpha0 extend the hydrophobic core of the winged helix domain, and helix alpha0 is tightly packed against the canonical winged helix fold. Helix alpha4 comprises only one turn.
Thioredoxins are small disulphide-containing redox proteins that have been found in all the kingdoms of living organisms. Thioredoxin serves as a general protein disulphide oxidoreductase. It interacts with a broad range of proteins by a redox mechanism based on reversible oxidation of two cysteine thiol groups to a disulphide, accompanied by the transfer of two electrons and two protons. The net result is the covalent interconversion of a disulphide and a dithiol. In the NADPH-dependent protein disulphide reduction, thioredoxin reductase (TR) catalyses the reduction of oxidised thioredoxin (trx) by NADPH using FAD and its redox-active disulphide; reduced thioredoxin then directly reduces the disulphide in the substrate protein.
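The two-step NADPH-dependent reduction described above can be written as a pair of net reactions. This is a schematic sketch only: Trx-S2 and Trx-(SH)2 are shorthand introduced here for oxidised and reduced thioredoxin, Protein-S2 and Protein-(SH)2 for the substrate protein, and the FAD and redox-active disulphide intermediates of thioredoxin reductase are omitted.

```latex
\begin{align*}
\mathrm{Trx\text{-}S_2 + NADPH + H^+}
  &\xrightarrow{\text{thioredoxin reductase}}
  \mathrm{Trx\text{-}(SH)_2 + NADP^+} \\
\mathrm{Trx\text{-}(SH)_2 + Protein\text{-}S_2}
  &\longrightarrow
  \mathrm{Trx\text{-}S_2 + Protein\text{-}(SH)_2}
\end{align*}
```

In each step one disulphide is converted to a dithiol, which corresponds to the transfer of two electrons and two protons.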
Thioredoxin is present in prokaryotes and eukaryotes and the sequence around the redox-active disulphide bond is well conserved. All thioredoxins contain a cis-proline located in a loop preceding beta-strand 4, which makes contact with the active site cysteines, and is important for stability and function. Thioredoxin belongs to a structural family that includes glutaredoxin, glutathione peroxidase, bacterial protein disulphide isomerase DsbA, and the N-terminal domain of glutathione transferase. Thioredoxins have a beta-alpha unit preceding the motif common to all these proteins.
A number of eukaryotic proteins contain domains evolutionarily related to thioredoxin; most of them are protein disulphide isomerases (PDI). PDI is an endoplasmic reticulum multi-functional enzyme that catalyses the formation and rearrangement of disulphide bonds during protein folding. All PDIs contain two or three (ERp72) copies of the thioredoxin domain, each of which contributes to disulphide isomerase activity, but which are functionally non-equivalent. Moreover, PDI exhibits chaperone-like activity towards proteins that contain no disulphide bonds, i.e. behaving independently of its disulphide isomerase activity. The various forms of PDI which are currently known are:
Bacterial thiol:disulphide interchange proteins, which allow disulphide bond formation in some periplasmic proteins, also contain a thioredoxin domain. These proteins are:
This entry represents the thioredoxin domain and homologous domains in other proteins.
Glutaredoxins, also known as thioltransferases (disulphide reductases), are small proteins of approximately one hundred amino-acid residues which utilise glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system.
Glutaredoxin functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin, which functions in a similar way, glutaredoxin possesses an active centre disulphide bond. It exists in either a reduced or an oxidized form where the two cysteine residues are linked in an intramolecular disulphide bond.
Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed that Vaccinia virus protein O2L is most probably a glutaredoxin. Finally, it must be noted that Bacteriophage T4 thioredoxin also seems to be evolutionarily related. In position 5 of the pattern T4 thioredoxin has Val instead of Pro.
This entry represents Glutaredoxin.
Glutathione peroxidase (GSHPx) is an enzyme that catalyses the reduction of hydroperoxides by glutathione. Its main function is to protect against the damaging effect of endogenously formed hydroperoxides. In higher vertebrates, several forms of GSHPx are known, including a ubiquitous cytosolic form (GSHPx-1), a gastrointestinal cytosolic form (GSHPx-GI), a plasma secreted form (GSHPx-P), and an epididymal secretory form (GSHPx-EP). In addition to these characterised forms, the sequence of a protein of unknown function has been shown to be evolutionarily related to those of GSHPxs.
In filarial nematode parasites, the major soluble cuticular protein (gp29) is a secreted GSHPx, which may provide a mechanism of resistance to the immune reaction of the mammalian host by neutralising the products of the oxidative burst of leukocytes. The Escherichia coli protein btuE, a periplasmic protein involved in vitamin B12 transport, is evolutionarily related to GSHPxs, although the significance of this relationship is unclear. The structure of bovine seleno-glutathione peroxidase has been determined. The protein belongs to the alpha-beta class, with a 3-layer (aba) sandwich architecture. The catalytic site of GSHPx contains a conserved residue which is either a cysteine or, in many eukaryotic GSHPxs, a selenocysteine.
Cytochrome c oxidase is an oligomeric enzymatic complex which is a component of the respiratory chain complex and is involved in the transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane.
In eukaryotes, in addition to the three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there are a variable number of small polypeptide subunits. One of these subunits, which is known as Vb in mammals, V in Dictyostelium discoideum (Slime mold) and IV in yeast, binds a zinc atom. The sequence of subunit Vb is well conserved and includes three conserved cysteines that coordinate the zinc ion. Two of these cysteines are clustered in the C-terminal section of the subunit.
This entry represents the W2 domain (two invariant tryptophans) and is a region of ~165 amino acids which is found in the C-terminus of the following eIFs:
Translation initiation is a sophisticated, well regulated and highly coordinated cellular process in eukaryotes, in which at least 11 eukaryotic initiation factors (eIFs) are involved.
The W2 domain has a globular fold and is composed exclusively of alpha-helices. The structure can be divided into a structural C-terminal core onto which the two N-terminal helices are attached. The core contains two aromatic/acidic residue-rich regions (AA boxes), which are important for mediating protein-protein interactions.
The entry covers the entire W2 domain.
This entry represents the MI domain (named after MA-3 and eIF4G), a protein-protein interaction module of ~130 amino acids. It appears in several translation factors and is found in:
The MI domain consists of seven alpha-helices, which pack into a globular form. The packing arrangement consists of repeating pairs of antiparallel helices packed one upon the other such that a superhelical axis is generated perpendicular to the alpha-helical axes.
The MI domain has also been named MA3 domain.
CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes.
Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking, while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites.
Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations.
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).
This entry represents the PPR repeat.
Pentatricopeptide repeat (PPR) proteins are characterised by tandem repeats of a degenerate 35 amino acid motif. Most PPR proteins have roles in mitochondria or plastids. PPR repeats were discovered while screening Arabidopsis proteins for those predicted to be targeted to mitochondria or chloroplasts. Some of these proteins have been shown to play a role in post-transcriptional processes within organelles and they are thought to be sequence-specific RNA-binding proteins. Plant genomes have between one hundred and five hundred PPR genes per genome, whereas non-plant genomes encode two to six PPR proteins.
Although no PPR structures are yet known, the motif is predicted to fold into a helix-turn-helix structure similar to those found in the tetratricopeptide repeat (TPR) family.
The plant PPR protein family has been divided in two subfamilies on the basis of their motif content and organisation.
Examples of PPR repeat-containing proteins include PET309, which may be involved in RNA stabilisation, and crp1, which is involved in RNA processing. The repeat is associated with a predicted plant protein that has a domain organisation similar to the human BRCA1 protein.
Ferredoxins are a group of iron-sulphur proteins which mediate electron transfer in a wide variety of metabolic reactions. Ferredoxins can be divided into several subgroups depending upon the physiological nature of the iron-sulphur cluster(s). One of these subgroups is the 4Fe-4S ferredoxins, which are found in bacteria and which are thus often referred to as 'bacterial-type' ferredoxins. The structure of these proteins consists of the duplication of a domain of twenty-six amino acid residues; each of these domains contains four cysteine residues that bind to a 4Fe-4S centre.
Several structures of the 4Fe-4S ferredoxin domain have been determined. The clusters consist of two interleaved 4Fe- and 4S-tetrahedra forming a cubane-like structure, in such a way that the four iron and four sulphur atoms occupy the eight corners of a distorted cube. Each 4Fe-4S cluster is attached to the polypeptide chain by four covalent Fe-S bonds involving cysteine residues.
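The cubane-like arrangement can be illustrated with a small geometric sketch. It assumes an idealised, undistorted unit cube (the real cluster is distorted), and the assignment of iron to even-parity corners is an arbitrary labelling chosen here; the names `iron`, `sulphur` and `bonded_sulphurs` are illustrative only.

```python
from itertools import product

# Corners of an idealised unit cube (the real cluster is a distorted cube).
corners = list(product((0, 1), repeat=3))

# Corners whose coordinates sum to an even number form one tetrahedron;
# the remaining four corners form the second, interleaved tetrahedron.
iron = [c for c in corners if sum(c) % 2 == 0]
sulphur = [c for c in corners if sum(c) % 2 == 1]


def squared_distance(a, b):
    """Squared Euclidean distance between two cube corners."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bonded_sulphurs(fe):
    """Sulphur corners exactly one cube edge away from an iron corner."""
    return [s for s in sulphur if squared_distance(fe, s) == 1]


if __name__ == "__main__":
    for fe in iron:
        print("Fe", fe, "-> S", bonded_sulphurs(fe))
```

In this model every iron corner has three sulphur neighbours within the cluster; the cysteine Fe-S bonds that attach the cluster to the polypeptide chain are not modelled.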
A number of proteins have been found that include one or more 4Fe-4S binding domains similar to those of bacterial-type ferredoxins.
The pattern of cysteine residues in the iron-sulphur region is sufficient to detect this class of 4Fe-4S binding proteins. This entry represents the whole domain.
Note: In some bacterial ferredoxins, one of the two duplicated domains has lost one or more of the four conserved cysteines. The consequence of such variations is that these domains have either lost their iron-sulphur binding property or bind to a 3Fe-3S centre instead of a 4Fe-4S centre.
The EXS domain is named after ERD1/XPR1/SYG1. Proteins containing this motif include the C-terminal region of the SYG1 G-protein associated signal transduction protein from Saccharomyces cerevisiae, and sequences that are thought to be Murine leukemia virus (MLV) receptors (XPR1). The N-termini of these proteins often have an SPX domain.
While the N-terminal is thought to be involved in signal transduction, the role of the C-terminal is not known. This region of similarity contains several predicted transmembrane helices. This family also includes the ERD1 (ERD: ER retention defective) S. cerevisiae proteins. ERD1 proteins are involved in the localization of endogenous endoplasmic reticulum (ER) proteins. Erd1 null mutants secrete such proteins even though they possess the C-terminal HDEL ER lumen localization label sequence. In addition, null mutants also exhibit defects in the Golgi-dependent processing of several glycoproteins, which led to the suggestion that the sorting of luminal ER proteins actually occurs in the Golgi, with subsequent return of these proteins to the ER via 'salvage' vesicles.
The SPX domain is named after SYG1/Pho81/XPR1 proteins. This 180-residue domain is found at the amino terminus of a variety of proteins. In the yeast protein SYG1, the N-terminus directly binds to the G-protein beta subunit and inhibits transduction of the mating pheromone signal, suggesting that all the members of this family are involved in G-protein associated signal transduction. The C-termini of these proteins often have an EXS domain.
The N-termini of several proteins involved in the regulation of phosphate transport, including the putative phosphate level sensors PHO81 from Saccharomyces cerevisiae and NUC-2 from Neurospora crassa, are also members of this family. NUC-2 contains several ankyrin repeats.
Several members of this family are the XPR1 proteins: the xenotropic and polytropic retrovirus receptor confers susceptibility to infection with Murine leukemia virus (MLV). The similarity between SYG1, phosphate regulators and XPR1 sequences has been previously noted, as has the additional similarity to several predicted proteins, of unknown function, from Drosophila melanogaster, Arabidopsis thaliana, Caenorhabditis elegans, Schizosaccharomyces pombe, and Saccharomyces cerevisiae. In addition, given the similarities between XPR1 and SYG1 and phosphate regulatory proteins, it has been proposed that XPR1 might be involved in G-protein associated signal transduction and may itself function as a phosphate sensor.
This family is related to Hydroxyethylthiazole kinase and PfkB carbohydrate kinase, implying that it is also a carbohydrate kinase.
Several uncharacterised proteins have been shown to share regions of similarities, including yeast chromosome XI hypothetical protein YKL151c; Caenorhabditis elegans hypothetical protein R107.2; Escherichia coli hypothetical protein yjeF; Bacillus subtilis hypothetical protein yxkO; Helicobacter pylori hypothetical protein HP1363; Mycobacterium tuberculosis hypothetical protein MtCY77.05c; Mycobacterium leprae hypothetical protein B229_C2_201; Synechocystis sp. (strain PCC 6803) hypothetical protein sll1433; and Methanocaldococcus jannaschii (Methanococcus jannaschii) hypothetical protein MJ1586. These are proteins of about 30 to 40 kDa whose central region is well conserved.
Flavoenzymes have the ability to catalyse a wide range of biochemical reactions. They are involved in the dehydrogenation of a variety of metabolites, in electron transfer from and to redox centres, in light emission, and in the activation of oxygen for oxidation and hydroxylation reactions. About 1% of all eukaryotic and prokaryotic proteins are predicted to contain a flavin adenine dinucleotide (FAD)-binding domain.
According to structural similarities and conserved sequence motifs, FAD-binding domains have been grouped in three main families: (i) the ferredoxin reductase (FR)-type FAD-binding domain, (ii) the FAD-binding domains that adopt a Rossmann fold and (iii) the PCMH-type FAD-binding domain.
The FAD cofactor consists of adenosine monophosphate (AMP) linked to flavin mononucleotide (FMN) by a pyrophosphate bond. The AMP moiety is composed of the adenine ring bonded to a ribose that is linked to a phosphate group. The FMN moiety is composed of the isoalloxazine-flavin ring linked to a ribitol, which is connected to a phosphate group. The flavin functions mainly in a redox capacity, being able to take up two electrons from one substrate and release them two at a time to a substrate or coenzyme, or one at a time to an electron acceptor. The catalytic function of the FAD is concentrated in the isoalloxazine ring, whereas the ribityl phosphate and the AMP moiety mainly stabilise cofactor binding to protein residues.
The structural core of all FR family members is well conserved. The FAD-binding fold characteristic of the FR family is a cylindrical beta-domain with a flattened six-stranded antiparallel beta-barrel organised into two orthogonal sheets (B1-B2-B5 and B4-B3-B6) separated by one alpha-helix. The cylinder is open between strands B4 and B5, which makes space for the isoalloxazine and ribityl moieties of the FAD. One end of the cylinder is covered by the only helix of the domain, which is essential for the binding of the pyrophosphate groups of the FAD. The FR family contains two conserved motifs. One (R-x-Y-[ST]) is located in B4, where the invariant positively charged Arg residue forms hydrogen bonds to the negative pyrophosphate oxygen atom. The other conserved sequence motif is G-x(2)-[ST]-x(2)-L-x(5)-G-x(7)-P-x-G, which is part of H1-B6 and is known as the phosphate-binding motif.
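The two motifs are written in PROSITE-style pattern notation, which can be translated mechanically into regular expressions for scanning sequences. The following is a minimal sketch that handles only the elements used in these two patterns (single residues, `x`, bracketed alternatives and fixed repeat counts); the function name `prosite_to_regex` and the toy sequence are assumptions made for illustration, not part of any database tool.

```python
import re

# One pattern element: a residue, 'x' (any residue) or a [..] set,
# optionally followed by a fixed repeat count such as (5).
ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)\))?")


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, count = match.groups()
        token = "." if residue == "x" else residue
        parts.append(token + (f"{{{count}}}" if count else ""))
    return "".join(parts)


FR_B4_MOTIF = prosite_to_regex("R-x-Y-[ST]")
FR_PHOSPHATE_MOTIF = prosite_to_regex("G-x(2)-[ST]-x(2)-L-x(5)-G-x(7)-P-x-G")

if __name__ == "__main__":
    toy_sequence = "MKRAYSLL"  # made-up sequence, not a real protein
    hit = re.search(FR_B4_MOTIF, toy_sequence)
    print(FR_B4_MOTIF, FR_PHOSPHATE_MOTIF, hit.group() if hit else None)
```

A match to either expression is only a sequence-level hint; whether a protein truly belongs to the FR family depends on the whole domain, not on the motif alone.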
The YjeF N-terminal domains occur either as single proteins or fusions with other domains and are commonly associated with enzymes. In bacteria and archaea, YjeF N-terminal domains are often fused to a YjeF C-terminal domain with high structural homology to the members of a ribokinase-like superfamily and/or belong to operons that encode enzymes of diverse functions: pyridoxal phosphate biosynthetic protein PdxJ; phosphopantetheine-protein transferase; ATP/GTP hydrolase; and pyruvate-formate lyase 1-activating enzyme. In plants, the YjeF N-terminal domain is fused to a C-terminal putative pyridoxamine 5'-phosphate oxidase. In eukaryotes, proteins that consist of (Sm)-FDF-YjeF N-terminal domains may be involved in RNA processing.
The YjeF N-terminal domains represent a novel version of the Rossmann fold, one of the most common protein folds in nature observed in numerous enzyme families, that has acquired a set of catalytic residues and structural features that distinguish them from the conventional dehydrogenases. The YjeF N-terminal domain is comprised of a three-layer alpha-beta-alpha sandwich with a central beta-sheet surrounded by helices. The conservation of the acidic residues in the predicted active site of the YjeF N-terminal domains is reminiscent of the presence of such residues in the active sites of diverse hydrolases.
Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles, and regulate cyclin dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins, G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E.
Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi's sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus.
The cyclins in this entry are involved in the regulation of RNA polymerase II transcription. These proteins are highly evolutionarily conserved and can be found in species ranging from Arabidopsis thaliana (Mouse-ear cress) to Homo sapiens (Human).
Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles, and regulate cyclin dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins, G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E.
Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi's sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus.
The cyclins in this entry are involved in the regulation of RNA polymerase II transcription. Cyclin H and its associated cyclin dependent kinase, cdk 7, are components of the TFIIH complex that is involved in both transcription and DNA repair.
This subfamily of Cyclin H proteins is found in vertebrates ranging from Xenopus laevis (African clawed frog) to Homo sapiens (Human).
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.
This group of proteins is comprised entirely of phosphatidylinositol 3- and phosphatidylinositol 4-kinases. Phosphatidylinositol 3-kinase (PI3-kinase) is an enzyme that phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol ring. The three products of PI3-kinase, PI-3-P, PI-3,4-P(2) and PI-3,4,5-P(3), function as secondary messengers in cell signalling. Phosphatidylinositol 4-kinase (PI4-kinase) acts on phosphatidylinositol (PI) in the first committed step of the production of the secondary messenger inositol-1,4,5-trisphosphate. The PI3- and PI4-kinases share a well-conserved domain at their C-terminal section, which is distantly related to the catalytic domain of protein kinases. The catalytic domain of PI3K has a bilobal structure with a small N-terminal lobe and a large C-terminal lobe; this structure is often found in other ATP-dependent kinases. The core of this catalytic domain is the most conserved region of the PI3Ks. The ATP cofactor binds in the crevice formed by the N- and C-terminal lobes; a loop between two strands provides a hydrophobic pocket for binding of the adenine moiety, and a lysine residue interacts with the alpha-phosphate. In contrast to other protein kinases, the PI3K loop that interacts with the phosphates of the ATP, known as the glycine-rich loop or P-loop, contains no glycine residues. Instead, contact with the ATP -phosphate is maintained through the side chain of a conserved serine residue.
Synonym(s): PIK
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein L18ae forms part of the 60S ribosomal subunit. This family is found in eukaryotes. Rat ribosomal protein L18 is homologous to Xenopus laevis L14.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.
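The class assignments listed above can be captured in a small lookup table. This is an illustrative sketch only: it encodes exactly the lists given in the text, using three-letter amino acid codes, and the names `CLASS_I`, `CLASS_II`, `PREFERRED_HYDROXYL` and `synthetase_class` are introduced here for convenience.

```python
# Amino acid specificities of the two synthetase classes, as listed in the text.
CLASS_I = frozenset(
    {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
)
CLASS_II = frozenset(
    {"Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"}
)

# tRNA hydroxyl to which each class couples the aminoacyl group.
PREFERRED_HYDROXYL = {"I": "2'-hydroxyl", "II": "3'-hydroxyl"}


def synthetase_class(amino_acid):
    """Return 'I' or 'II' for a three-letter amino acid code."""
    if amino_acid in CLASS_I:
        return "I"
    if amino_acid in CLASS_II:
        return "II"
    raise KeyError(f"unknown amino acid code: {amino_acid!r}")


if __name__ == "__main__":
    for code in ("Trp", "Ser"):
        cls = synthetase_class(code)
        print(code, "-> class", cls, "/", PREFERRED_HYDROXYL[cls])
```

The subclasses a, b and c of class I are not encoded, because the text does not list which synthetase falls into which subclass.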
Tryptophanyl-tRNA synthetase is an alpha2 dimer that belongs to class Ib. The crystal structure of tryptophanyl-tRNA synthetase is known.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein L22e forms part of the 60S ribosomal subunit. This family is found in eukaryotes. Rattus norvegicus (Rat) L22 is related to ribosomal proteins from other eukaryotes and is identical in amino acid sequence to human EAP, the EBER 1 (Epstein-Barr virus (strain GD1) (HHV-4) (Human herpesvirus 4) encoded RNA) associated protein.
Phosphatidylserine decarboxylase is synthesized as a single-chain precursor. Generation of the pyruvoyl active site from a Ser is coupled to cleavage of a Gly-Ser bond between the larger (beta) and smaller (alpha) chains. It is an integral membrane protein.
The proteins in this entry are variously annotated as iron-sulphur cluster insertion protein or Fe/S biogenesis protein. They appear to be involved in Fe-S cluster biogenesis. This family includes IscA, HesB, YadR and YfhF-like proteins. The hesB gene is expressed only under nitrogen fixation conditions. IscA, an 11 kDa member of the hesB family of proteins, binds iron and [2Fe-2S] clusters, and participates in the biosynthesis of iron-sulphur proteins. IscA is able to bind at least 2 iron ions per dimer. Other members of this family include various hypothetical proteins that also contain the NifU-like domain, suggesting that they too are able to bind iron and are involved in Fe-S cluster biogenesis. Members of the HesB family are found in species as divergent as Homo sapiens (Human) and Haemophilus influenzae, suggesting that these proteins are involved in basic cellular functions.
This entry represents DNA mismatch repair proteins, such as MutL. The dimeric MutL protein has a key function in communicating mismatch recognition by MutS to downstream repair processes. Mismatch repair contributes to the overall fidelity of DNA replication by targeting mispaired bases that arise through replication errors, during homologous recombination, and as a result of DNA damage. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex.
Mismatch repair is one of five major DNA repair pathways, the others being homologous recombination repair, non-homologous end joining, nucleotide excision repair, and base excision repair. The mismatch repair system recognises and repairs mispaired or unpaired nucleotides that result from errors in DNA replication. Many proteins involved in the different repair processes also play a role in apoptosis when DNA damage is excessive, thereby helping to prevent carcinogenesis. The mismatch repair protein, Mlh1 (MutL homologue 1), has a dual role in DNA repair and apoptosis. Mlh1 acts as a heterodimer in conjunction with Pms1, Pms2 (post-meiotic segregation 1 and 2) or Mlh3 (MutL homologue 3), which function as adaptor proteins that link Msh (MutS homologue) heterodimers to the DNA repair machinery, resulting in excision and repair of the mispaired base.
The Post-Meiotic Segregation 2 (PMS2) protein is a component of the MMR (Mis-Match Repair) complex involved in DNA repair. In Homo sapiens (Human), PMS2 forms an alpha heterodimer with the MLH1 protein. The gene was originally identified as having a low chromosome segregation defect in meiosis in Saccharomyces cerevisiae (Baker's yeast), in which organism MLH1/PMS2 is a single gene. Germline mutations in the PMS2 gene have been shown to give rise to Turcot syndrome, which is the co-occurrence of a primary brain tumour and multiple colorectal adenomas. It has also been shown that, in families with a history of hereditary nonpolyposis colorectal cancer (HNPCC), the PMS2 gene carried large internal deletions. It was later shown that HNPCC patients with germline mutations in either PMS2 or MLH1 had a much higher rate of chromosomal mutations than control individuals.
Iron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S]. FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems.
The ISC system is conserved in eubacteria and eukaryotes (mitochondria), and has broad specificity, targeting general FeS proteins. It is encoded by the isc operon (iscRSUA-hscBA-fdx-iscX). IscS is a cysteine desulphurase, which obtains S from cysteine (converting it to alanine) and serves as a S donor for FeS cluster assembly. IscU and IscA act as scaffolds to accept S and Fe atoms, assembling clusters and transferring them to recipient apoproteins. HscA is a molecular chaperone and HscB is a co-chaperone. Fdx is a [2Fe-2S]-type ferredoxin. IscR is a transcription factor that regulates expression of the isc operon. IscX (also known as YfhJ) appears to interact with IscS and may function as an Fe donor during cluster assembly.
The SUF system is an alternative pathway to the ISC system that operates under iron starvation and oxidative stress. It is found in eubacteria, archaea and eukaryotes (plastids). The SUF system is encoded by the suf operon (sufABCDSE), and the six encoded proteins are arranged into two complexes (SufSE and SufBCD) and one protein (SufA). SufS is a pyridoxal-phosphate (PLP) protein displaying cysteine desulphurase activity. SufE acts as a scaffold protein that accepts S from SufS and donates it to SufA. SufC is an ATPase with an unorthodox ATP-binding cassette (ABC)-like component. No specific functions have been assigned to SufB and SufD. SufA is homologous to IscA, acting as a scaffold protein in which Fe and S atoms are assembled into [FeS] cluster forms, which can then easily be transferred to apoprotein targets.
In the NIF system, NifS and NifU are required for the formation of metalloclusters of nitrogenase in Azotobacter vinelandii, and other organisms, as well as in the maturation of other FeS proteins. Nitrogenase catalyses the fixation of nitrogen. It contains a complex cluster, the FeMo cofactor, which contains molybdenum, Fe and S. NifS is a cysteine desulphurase. NifU binds one Fe atom at its N-terminal, assembling an FeS cluster that is transferred to nitrogenase apoproteins. Nif proteins involved in the formation of FeS clusters can also be found in organisms that do not fix nitrogen.
This entry represents the N-terminal of NifU and homologous proteins. NifU contains two domains: an N-terminal and a C-terminal domain. These domains exist either together or on different polypeptides, both domains being found in organisms that do not fix nitrogen (e.g. yeast), so they have a broader significance in the cell than nitrogen fixation.
DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.
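As a rough illustration of what "a sequence complementary to the template DNA" means, the following minimal Python sketch maps each template base to the ribonucleotide paired opposite it. The function name and the example template are hypothetical, and the sketch ignores promoters, termination signals and everything else the polymerase actually does; it assumes the template is written 3' to 5' so that the product reads 5' to 3'.

```python
# Watson-Crick pairing from a DNA template base to the RNA base added opposite it.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA (written 5'->3') complementary to a DNA template written 3'->5'."""
    template = template_3to5.upper()
    unknown = set(template) - set(DNA_TO_RNA)
    if unknown:
        raise ValueError(f"unexpected bases in template: {sorted(unknown)}")
    return "".join(DNA_TO_RNA[base] for base in template)


# Hypothetical template strand, written 3'->5'.
print(transcribe("TACGGCATT"))
```

Writing the template 3' to 5' avoids an explicit reversal step; a template supplied 5' to 3' would need to be reversed before pairing.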
RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:
This is a family of single chain polymerases, which are evolutionarily related, and which are related to the T3/T7 bacteriophage polymerases.
Sodium proton exchangers (NHEs) constitute a large family of integral membrane protein transporters that are responsible for the counter-transport of protons and sodium ions across lipid bilayers. These proteins are found in organisms across all domains of life. In archaea, bacteria, yeast and plants, these exchangers provide increased salt tolerance by removing sodium in exchange for extracellular protons. In mammals they participate in the regulation of cell pH, volume, and intracellular sodium concentration, as well as in the reabsorption of NaCl across renal, intestinal, and other epithelia. Human NHE is also involved in heart disease, cell growth and in cell differentiation. The removal of intracellular protons in exchange for extracellular sodium effectively eliminates excess acid from actively metabolising cells. In mammalian cells, NHE activity is found in both the plasma membrane and inner mitochondrial membrane. To date, nine mammalian isoforms have been identified (designated NHE1-NHE9). These exchangers are highly regulated (glyco)phosphoproteins, which, based on their primary structure, appear to contain 10-12 membrane-spanning regions (M) at the N-terminus and a large cytoplasmic region at the C-terminus. The transmembrane regions M3-M12 share identity with other members of the family. The M6 and M7 regions are highly conserved. Thus, this is thought to be the region that is involved in the transport of sodium and hydrogen ions. The cytoplasmic region has little similarity throughout the family. There is some evidence that the exchangers may exist in the cell membrane as homodimers, but little is currently known about the mechanism of their antiport.
This entry represents a conserved region found in a number of cation/proton exchangers, including Na+/H+ exchangers, K+/H+ exchangers and Na+(K+,Li+,Rb+)/H+ exchangers.
This entry represents a conserved region found in putative Na+/H+ exchanger proteins from Apicomplexa.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. The L36E ribosomal family consists of mammalian, Caenorhabditis elegans and Drosophila L36, Candida albicans L39, and yeast YL39 ribosomal proteins.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II.
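The two-class division listed above can be written down as a small lookup table. The Python sketch below simply encodes the assignment given in this description, using three-letter amino acid codes; the names CLASS_I, CLASS_II and synthetase_class are illustrative and do not come from any database API, and the subclasses a, b and c are not modelled because the text does not list their members.

```python
# Class assignment exactly as listed in the description above (three-letter codes).
CLASS_I = frozenset(
    {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
)
CLASS_II = frozenset(
    {"Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"}
)


def synthetase_class(amino_acid: str) -> str:
    """Return "I" or "II" for the synthetase specific to the given amino acid."""
    code = amino_acid.capitalize()
    if code in CLASS_I:
        return "I"
    if code in CLASS_II:
        return "II"
    raise KeyError(f"not one of the 20 standard amino acids: {amino_acid!r}")


for aa in ("Glu", "Gln", "Ser"):
    print(aa, "-> class", synthetase_class(aa))
```

Keeping the two sets disjoint and checking that together they cover 20 residues is a cheap sanity check on the transcription of the lists.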
Glutamyl-tRNA synthetase is a class Ic synthetase and shows several similarities with glutaminyl-tRNA synthetase concerning structure and catalytic properties. It is an alpha2 dimer. To date one crystal structure of a glutamyl-tRNA synthetase (Thermus thermophilus) has been solved. The molecule has the form of a bent cylinder and consists of four domains. The N-terminal half (domains 1 and 2) contains the 'Rossmann fold' typical for class I synthetases and resembles the corresponding part of Escherichia coli GlnRS, whereas the C-terminal half exhibits a GluRS-specific structure.
Glutaminyl-tRNA synthetase is a class Ic synthetase and shows several similarities with glutamyl-tRNA synthetase concerning structure and catalytic properties. It is an alpha2 dimer. Glutaminyl-tRNA synthetase is a relatively rare synthetase, found in the cytosolic compartment of eukaryotes, in Escherichia coli and a number of other Gram-negative bacteria, and in Deinococcus radiodurans. In contrast, the pathway to Gln-tRNA in mitochondria, Archaea, Gram-positive bacteria, and a number of other lineages is by misacylation with Glu followed by transamidation to correct the aminoacylation to Gln.
Cytochrome c oxidase is an oligomeric enzymatic complex which is a component of the respiratory chain complex and is involved in the transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane.
In eukaryotes, in addition to the three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there are a variable number of small polypeptide subunits. One of these subunits, which is known as Vb in mammals, V in Dictyostelium discoideum (Slime mold) and IV in yeast, binds a zinc atom. The sequence of subunit Vb is well conserved and includes three conserved cysteines that coordinate the zinc ion. Two of these cysteines are clustered in the C-terminal section of the subunit.
The TATA-box binding protein (TBP) is required for the initiation of transcription by RNA polymerases I, II and III, from promoters with or without a TATA box. TBP associates with a host of factors, including the general transcription factors TFIIA, -B, -D, -E, and -H, to form huge multi-subunit pre-initiation complexes on the core promoter. Through its association with different transcription factors, TBP can initiate transcription from different RNA polymerases. There are several related TBPs, including TBP-like (TBPL) proteins.
The C-terminal core of TBP (~180 residues) is highly conserved and contains two 77-amino acid repeats that produce a saddle-shaped structure that straddles the DNA; this region binds to the TATA box and interacts with transcription factors and regulatory proteins. By contrast, the N-terminal region varies in both length and sequence.
Ubiquinol-cytochrome c reductase (bc1 complex or complex III) is an enzyme complex of bacterial and mitochondrial oxidative phosphorylation systems. It catalyses the oxidoreduction of the mobile redox components ubiquinol and cytochrome c, generating an electrochemical potential which is linked to ATP synthesis.
The complex consists of three subunits in most bacteria, and nine in mitochondria: both bacterial and mitochondrial complexes contain cytochrome b and cytochrome c1 subunits, and an iron-sulphur 'Rieske' subunit, which contains a high potential 2Fe-2S cluster. The mitochondrial form also includes six other subunits that do not possess redox centres. Plastoquinone-plastocyanin reductase (b6f complex), found in cyanobacteria and the chloroplasts of plants, catalyses the oxidoreduction of plastoquinol and cytochrome f. This complex, which is functionally similar to ubiquinol-cytochrome c reductase, comprises cytochrome b6, cytochrome f and Rieske subunits.
The Rieske subunit acts by binding either a ubiquinol or plastoquinol anion, transferring an electron to the 2Fe-2S cluster, then releasing the electron to the cytochrome c or cytochrome f haem iron. The 2Fe-2S cluster is bound in the highly conserved C-terminal region of the Rieske subunit.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins.
This entry represents the C subunit that is part of the V1 complex, and is localised to the interface between the V1 and V0 complexes. This subunit does not show any homology with F-ATPase subunits. The C subunit plays an essential role in controlling the assembly of V-ATPase, acting as a flexible stator that holds together the catalytic (V1) and membrane (V0) sectors of the enzyme. The release of subunit C from the ATPase complex results in the dissociation of the V1 and V0 subcomplexes, which is an important mechanism in controlling V-ATPase activity in cells.
Mre11 and Rad50 are two proteins required for DNA repair and meiosis-specific double-strand break formation in Saccharomyces cerevisiae. Mre11 by itself has 3' to 5' exonuclease activity that is increased when Mre11 is in a complex with Rad50.
These eukaryotic proteins contain one metallo-phosphoesterase domain followed by an Mre11 DNA-binding domain. S. cerevisiae Mre11 is required for DNA repair and meiosis-specific double-strand break (DSB) formation and has both 3' to 5' exonuclease activity (which increases when in complex with Rad50) and endonuclease activity. The N-terminal phosphoesterase domain is required for DSB repair, and the carboxyl-terminal dsDNA-binding domain is essential during meiosis for chromatin modification and DSB formation. Schizosaccharomyces pombe rad32 is required for repair of double strand breaks and recombination.
Pyridoxal phosphate (PLP) is the active form of vitamin B6 (pyridoxine or pyridoxal). PLP is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy.
PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the epsilon-amino group of an active site lysine residue on the enzyme. The alpha-amino group of the substrate displaces the lysine epsilon-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic.
Proteins in this entry occur in archaea, bacteria and eukaryotes. They are encoded by genes that are often co-transcribed with proline biosynthesis genes, although their function in vivo has not yet been demonstrated.
The structure of the yeast protein YBL036C has been determined to a resolution of 2.0 Å. Similar in structure to the N-terminal domains of alanine racemase and ornithine decarboxylase, it forms a TIM barrel fold which begins with a long N-terminal helix, rather than the classical beta strand found at the beginning of most other TIM barrels. Unlike alanine racemase and ornithine decarboxylase, which are two-domain dimeric proteins, the yeast protein is a single domain monomer. A pyridoxal 5'-phosphate cofactor is covalently bound towards the C-terminal end of the barrel, which is the usual active site in TIM-barrel folds. Some racemase activity was observed for this protein and it was suggested by the authors that it may function as a general racemase.
Glycerol kinase is a bacterial sugar kinase which catalyzes the Mg-ATP-dependent phosphorylation of glycerol to yield glycerol 3-phosphate. The enzyme from Escherichia coli is an allosteric regulatory enzyme whose activity is inhibited by fructose 1,6-bisphosphate (FBP) and by the glucose-specific phosphocarrier of the phosphoenolpyruvate:glycose phosphotransferase system, IIA(Glc). Structural studies suggest a nucleophilic in-line transfer mechanism for the ATP-dependent phosphorylation of glycerol by glycerol kinase.
Deoxyribodipyrimidine photolyase (DNA photolyase) is a DNA repair enzyme. It binds to UV-damaged DNA containing pyrimidine dimers and, upon absorbing a near-UV photon (300 to 500 nm), breaks the cyclobutane ring joining the two pyrimidines of the dimer. DNA photolyase requires two chromophore cofactors for its activity: a reduced FADH2 and either 5,10-methenyltetrahydrofolate (5,10-MTFH) or an oxidized 8-hydroxy-5-deazaflavin (8-HDF) derivative (F420). The folate or deazaflavin chromophore appears to function as an antenna, while the FADH2 chromophore is thought to be responsible for electron transfer. On the basis of sequence similarities DNA photolyases can be grouped into two classes.
The second class contains enzymes from Myxococcus xanthus, methanogenic archaebacteria, insects, fish and marsupial mammals. It is not yet known what second cofactor is bound to class 2 enzymes. There are a number of conserved sequence regions in all known class 2 DNA photolyases, especially in the C-terminal part.
This family of membrane proteins transports nucleotide sugars from the cytoplasm into Golgi vesicles. Members transport CMP-sialic acid, UDP-galactose and UDP-GlcNAc. This family has some but not complete overlap with the UDP-galactose transporter family.
Initiation factor 2 binds to Met-tRNA, GTP and the small ribosomal subunit. The eukaryotic translation initiation factor EIF-2B is a complex made up of five different subunits, alpha, beta, gamma, delta and epsilon, and catalyses the exchange of EIF-2-bound GDP for GTP. This family includes initiation factor 2B alpha, beta and delta subunits from eukaryotes, related proteins from archaebacteria and IF-2 from prokaryotes; it also contains a subfamily of proteins found in eukaryotes, archaea (e.g. Pyrococcus furiosus) and eubacteria such as Bacillus subtilis and Thermotoga maritima. Many of these proteins were initially annotated as putative translation initiation factors despite the fact that there is no evidence for the requirement of an IF2 recycling factor in prokaryotic translation initiation. Recently, one of these proteins from B. subtilis has been functionally characterised as a 5-methylthioribose-1-phosphate isomerase (MTNA). This enzyme participates in the methionine salvage pathway, catalysing the isomerisation of 5-methylthioribose-1-phosphate to 5-methylthioribulose-1-phosphate. The methionine salvage pathway leads to the synthesis of methionine from methylthioadenosine, the end product of spermidine and spermine anabolism in many species.
Inorganic pyrophosphatase (PPase) is the enzyme responsible for the hydrolysis of pyrophosphate (PPi) which is formed principally as the product of the many biosynthetic reactions that utilise ATP. All known PPases require the presence of divalent metal cations, with magnesium conferring the highest activity. Among other residues, a lysine has been postulated to be part of or close to the active site. PPases have been sequenced from bacteria such as Escherichia coli (homohexamer), Bacillus PS3 (Thermophilic bacterium PS-3) and Thermus thermophilus, from the archaebacteria Thermoplasma acidophilum, from fungi (homodimer), from a plant, and from bovine retina. In yeast, a mitochondrial isoform of PPase has been characterised which seems to be involved in energy production and whose activity is stimulated by uncouplers of ATP synthesis.
The sequences of PPases share some regions of similarities, among which is a region that contains three conserved aspartates that are involved in the binding of cations.
Synonym(s): Di-trans-poly-cis-undecaprenyl-diphosphate synthase, Undecaprenyl pyrophosphate synthetase, Undecaprenyl pyrophosphate synthase, UPP synthetase
Di-trans-poly-cis-decaprenylcistransferase (UPP synthetase) generates undecaprenyl pyrophosphate (UPP) from isopentenyl pyrophosphate (IPP). This bacterial enzyme is also found in archaebacteria and in a number of uncharacterised proteins including some from yeasts.
This entry also matches related enzymes that transfer alkyl groups, such as dehydrodolichyl diphosphate synthase.
Glutaredoxins, also known as thioltransferases (disulphide reductases), are small proteins of approximately one hundred amino-acid residues which utilise glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system.
Glutaredoxin functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin, which functions in a similar way, glutaredoxin possesses an active centre disulphide bond. It exists in either a reduced or an oxidized form where the two cysteine residues are linked in an intramolecular disulphide bond.
Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed that Vaccinia virus protein O2L is most probably a glutaredoxin. Finally, it must be noted that Bacteriophage T4 thioredoxin also seems to be evolutionarily related. In position 5 of the pattern T4 thioredoxin has Val instead of Pro.
This family groups a number of hypothetical proteins from different organisms which are related to glutaredoxin proteins.
All proteins in this family, for which functions are known, are single-stranded DNA-binding proteins that function in many processes including transcription, repair, replication and recombination. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University).
DNA is the biological information that instructs cells how to exist in an ordered fashion: accurate replication is thus one of the most important events in the life cycle of a cell. This function is performed by DNA-directed DNA polymerases, which add deoxynucleoside triphosphate (dNTP) residues to the 3'-end of the growing chain of DNA, using a complementary DNA chain as a template. Small RNA molecules are generally used as primers for chain elongation, although terminal proteins may also be used for the de novo synthesis of a DNA chain. Even though there are 2 different methods of priming, these are mediated by 2 very similar polymerase classes, A and B, with similar methods of chain elongation.
A number of DNA polymerases have been grouped under the designation of DNA polymerase family B. Six regions of similarity (numbered from I to VI) are found in all or a subset of the B family polymerases. The most conserved region (I) includes a conserved tetrapeptide with two aspartate residues. Its function is not yet known. However, it has been suggested that it may be involved in binding a magnesium ion. All sequences in the B family contain a characteristic DTDS motif, and possess many functional domains, including a 5'-3' elongation domain, a 3'-5' exonuclease domain, a DNA binding domain, and binding domains for both dNTPs and pyrophosphate.
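Because the DTDS tetrapeptide is described as characteristic of family B, a candidate sequence can be screened for it with a plain substring scan. The Python sketch below is only an illustration: the function name find_motif and the toy sequence are invented for the example, and the presence of DTDS alone would not establish family membership.

```python
def find_motif(sequence: str, motif: str = "DTDS") -> list[int]:
    """Return 1-based start positions of every occurrence of motif in sequence."""
    seq = sequence.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to residue number
        start = seq.find(motif, start + 1)  # step by one so overlaps are not missed
    return positions


# Toy sequence made up for illustration; not taken from any real polymerase.
toy_sequence = "MKLYGDTDSVFAAKDTDSLL"
print(find_motif(toy_sequence))
```

Positions are reported as 1-based residue numbers, which is the usual convention when discussing protein sequences.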
Delayed-early response (DER) gene products include growth progression factors and several unknown products of novel cDNAs. Murine and human cDNAs from one novel DER gene (DER12) have been characterised to identify its product and to examine its role in the growth response. Both sequences encode a hydrophobic 36 kDa protein that is predicted to contain 8 transmembrane (TM) domains. The protein has been localised to the nucleolus, where its concentration increases following mitogen stimulation.
Although the function of the protein is unknown, its identification as a nucleolar gene transcriptionally activated by growth factors implicates it as participating in the proliferative response. Sequence analysis reveals the protein to share a high degree of similarity with the C-terminal portion of equilibrative nucleoside transporters. These proteins are integral membrane proteins which enable the movement of hydrophilic nucleosides and nucleoside analogs down their concentration gradients across cell membranes. ENT family members have been identified in humans, mice, fish, tunicates, slime molds, and bacteria.
Thymidylate kinase (dTMP kinase) catalyzes the phosphorylation of thymidine 5'-monophosphate (dTMP) to form thymidine 5'-diphosphate (dTDP) in the presence of ATP and magnesium:
ATP + thymidine 5'-phosphate = ADP + thymidine 5'-diphosphate
Thymidylate kinase is a ubiquitous enzyme of about 25 kDa and is important in the dTTP synthesis pathway for DNA synthesis. Knowledge of the function of dTMP kinase in eukaryotes comes from the study of a cell cycle mutant, cdc8, in Saccharomyces cerevisiae. Structural and functional analyses suggest that the cDNA codes for authentic human dTMP kinase. The mRNA levels and enzyme activities corresponded to cell cycle progression and cell growth stages.
This family represents GDP-mannose 4,6-dehydratase, also known as GDP-D-mannose dehydratase. This enzyme converts GDP-mannose to GDP-4-dehydro-6-deoxy-D-mannose, the first of three steps for the conversion of GDP-mannose to GDP-fucose in animals, plants, and bacteria. In bacteria, GDP-L-fucose acts as a precursor of surface antigens such as the extracellular polysaccharide colanic acid of Escherichia coli. Excluded from this family are members of the clade that are poorly related because of highly derived (phylogenetically long-branch) sequences, e.g. Aneurinibacillus thermoaerophilus Gmd, described as a bifunctional GDP-mannose 4,6-dehydratase/GDP-6-deoxy-D-lyxo-4-hexulose reductase.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of mammalian, Trypanosoma brucei, Caenorhabditis elegans and fungal L44, and Haloarcula marismortui LA.
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified; these are grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans, and these structures indicate that some clans appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
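The relationship between the linear order of the triad and the clan can be illustrated with a minimal Python sketch. The function names and the lookup table are hypothetical helpers written for this illustration only; the table holds just the three orders quoted above, and the example positions use the familiar chymotrypsinogen numbering (His57, Asp102, Ser195).

```python
# Linear orders of the catalytic triad quoted in the text, keyed to clan.
# This table is illustrative and covers only the three clans named above.
TRIAD_ORDER_TO_CLAN = {
    "HDS": "PA",  # chymotrypsin clan
    "DHS": "SB",  # subtilisin clan
    "SDH": "SC",  # carboxypeptidase clan
}


def triad_order(positions):
    """Return the catalytic residues as a string ordered by sequence position.

    `positions` maps one-letter residue codes to 1-based sequence positions.
    """
    ordered = sorted(positions.items(), key=lambda item: item[1])
    return "".join(residue for residue, _ in ordered)


def clan_from_triad(positions):
    """Look up the clan for a triad; None if the order is not in the table."""
    return TRIAD_ORDER_TO_CLAN.get(triad_order(positions))


# Chymotrypsin-like arrangement: His57, Asp102, Ser195.
print(clan_from_triad({"H": 57, "D": 102, "S": 195}))
```

A lookup of this kind only reflects the linear order; as noted above, the geometric orientation of the residues is similar across families regardless of that order.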
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of serine peptidases belongs to the MEROPS peptidase family S14 (ClpP endopeptidase family, clan SK). ClpP is an ATP-dependent protease that cleaves a number of proteins, such as casein and albumin. It exists as a heterodimer of ATP-binding regulatory A and catalytic P subunits, both of which are required for effective levels of protease activity in the presence of ATP, although the P subunit alone does possess some catalytic activity. This family of sequences represents the P subunit.
Proteases highly similar to ClpP have been found to be encoded in the genomes of bacteria, metazoa and some viruses, and in the chloroplasts of plants. A number of the proteins in this family are classified as non-peptidase homologues, as they have been found experimentally to be without peptidase activity or lack amino acid residues that are believed to be essential for catalytic activity.
Methylpurine-DNA glycosylase is a base excision-repair protein. It is responsible for the hydrolysis of the deoxyribose N-glycosidic bond, excising 3-methyladenine and 3-methylguanine from damaged DNA. Its action is induced by alkylating chemotherapeutics, as well as deaminated and lipid peroxidation-induced purine adducts. MPG without an N-terminal extension excises hypoxanthine with one-third of the efficiency of full-length MPG under similar conditions, suggesting that its function may largely be attributable to the N-terminal extension.
Eukaryotic single-stranded RNA-binding proteins often contain one or more copies of a putative RNA-binding domain of about 90 amino acids. This is known as the putative RNA-binding region RNP-1 signature, or RNA recognition motif (RRM). RRMs are found in a variety of canonical RNA-binding proteins. These include heterogeneous nuclear ribonucleoproteins (hnRNPs), implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs), central players in mRNA splicing. The motif also appears in a few single-stranded DNA-binding proteins. The RRM structure consists of four strands and two helices arranged in an alpha/beta sandwich, with a third helix present during RNA binding in some cases. The biological role of proteins classified in this subfamily is unknown.
Eukaryotic single-stranded RNA-binding proteins often contain one or more copies of a putative RNA-binding domain of approximately 90 amino acids. This is known as the putative RNA-binding region RNP-1 signature, or RNA recognition motif (RRM). RRMs are found in a variety of canonical RNA-binding proteins. These include heterogeneous nuclear ribonucleoproteins (hnRNPs), implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs), central players in mRNA splicing. The motif also appears in a few single-stranded DNA-binding proteins. The RRM structure consists of four strands and two helices arranged in an alpha/beta sandwich, with a third helix present during RNA binding in some cases.
Bruno is an RNA-binding protein in Drosophila that acts as a translational repressor and is involved in multiple aspects of pattern formation in embryonic development. Bruno-like RNA-binding proteins, also referred to as CUG-BP, etr-like, or CELF proteins, display high homology to Bruno and contain multiple RRM domains. These types of RNA-binding proteins function in a wide variety of biological processes.
These sequences represent a subfamily of RNA splicing factors including the Pad-1 protein (Neurospora crassa), CAPER (mouse) and CC1.3 (human). All are characterised by an N-terminal arginine-rich, low complexity domain followed by three (or in the case of 4 human paralogs, two) RNA recognition domains. These splicing factors are closely related to the U2AF splicing factor family.
Thioredoxins are small disulphide-containing redox proteins that have been found in all the kingdoms of living organisms. Thioredoxin serves as a general protein disulphide oxidoreductase. It interacts with a broad range of proteins by a redox mechanism based on reversible oxidation of two cysteine thiol groups to a disulphide, accompanied by the transfer of two electrons and two protons. The net result is the covalent interconversion of a disulphide and a dithiol. In the NADPH-dependent protein disulphide reduction, thioredoxin reductase (TR) catalyses the reduction of oxidised thioredoxin (trx) by NADPH using FAD and its redox-active disulphide; reduced thioredoxin then directly reduces the disulphide in the substrate protein.
Thioredoxin is present in prokaryotes and eukaryotes and the sequence around the redox-active disulphide bond is well conserved. All thioredoxins contain a cis-proline located in a loop preceding beta-strand 4, which makes contact with the active site cysteines, and is important for stability and function. Thioredoxin belongs to a structural family that includes glutaredoxin, glutathione peroxidase, bacterial protein disulphide isomerase DsbA, and the N-terminal domain of glutathione transferase. Thioredoxins have a beta-alpha unit preceding the motif common to all these proteins.
A number of eukaryotic proteins contain domains evolutionarily related to thioredoxin, most of which are protein disulphide isomerases (PDI). PDI is an endoplasmic reticulum multi-functional enzyme that catalyses the formation and rearrangement of disulphide bonds during protein folding. All PDIs contain two or three (ERp72) copies of the thioredoxin domain, each of which contributes to disulphide isomerase activity, but which are functionally non-equivalent. Moreover, PDI exhibits chaperone-like activity towards proteins that contain no disulphide bonds, i.e. behaving independently of its disulphide isomerase activity. The various forms of PDI which are currently known are:
Bacterial proteins that act as thiol:disulphide interchange proteins, allowing disulphide bond formation in some periplasmic proteins, also contain a thioredoxin domain. These proteins are:
This entry represents the core thioredoxin domain.
A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. These proteins have 82 to 87 amino acids. The amino termini are all N alpha-acetylated. The N-terminal halves of the protein molecules are highly conserved in contrast to the carboxy-terminal parts.
Peptide deformylase (PDF) is an essential metalloenzyme required for the removal of the formyl group at the N-terminus of nascent polypeptide chains in eubacteria. The enzyme acts as a monomer and binds a single zinc ion, catalysing the reaction:
N-formyl-L-methionine + H2O = formate + methionyl peptide
Catalytic efficiency strongly depends on the identity of the bound metal.
The structure of these enzymes is known. PDF, a member of the zinc metalloproteases family, comprises an active core domain of 147 residues and a C-terminal tail of 21 residues. The 3D fold of the catalytic core has been determined by X-ray crystallography and NMR. Overall, the structure contains a series of anti-parallel beta-strands that surround two perpendicular alpha-helices. The C-terminal helix contains the characteristic HEXXH motif of metalloenzymes, which is crucial for activity. The helical arrangement, and the way the histidine residues bind the zinc ion, is reminiscent of other metalloproteases, such as thermolysin or metzincins. However, the arrangement of secondary and tertiary structures of PDF, and the positioning of its third zinc ligand (a cysteine residue), are quite different. These discrepancies, together with notable biochemical differences, suggest that PDF constitutes a new class of zinc-metalloproteases.
This homodimeric enzyme appears able to cleave any D-amino acid (and glycine, which does not have distinct D/L forms) from charged tRNA. The name reflects characterization with respect to D-Tyr on tRNA(Tyr) as established in the literature, but substrate specificity seems much broader.
This family contains the S24e ribosomal proteins from eukaryotes and archaebacteria. These proteins have 101 to 148 amino acids.
Ribosomal protein L27 is found in fungi, plants, algae and vertebrates. The family has a specific signature at the C terminus.
Synonym(s): Peptidylprolyl cis-trans isomerase
FKBP-type peptidylprolyl isomerases in vertebrates are receptors for the two immunosuppressants FK506 and rapamycin. The drugs inhibit T cell proliferation by arresting two distinct cytoplasmic signal transmission pathways. Peptidylprolyl isomerases accelerate protein folding by catalysing the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides. These proteins are found in a variety of organisms.
DNA primase synthesizes the RNA primers for the Okazaki fragments in lagging strand DNA synthesis. DNA primase is a heterodimer of large (p60) and small (p50) subunits in eukaryotes. This family represents sequences of the small subunit and the DNA primase sequences of the Archaea. No sequence similarity can be detected between the eukaryotic p50 and p60 subunits and the primases purified from bacteriophage and bacteria.
This entry represents the eukaryotic and archaeal proteins, and does not include viral proteins.
Ribosomal protein L28e forms part of the 60S ribosomal subunit. This family is found in eukaryotes. In rat there are 9 or 10 copies of the L28 gene. The L28 protein contains a possible internal duplication of 9 residues.
Proteins resident in the lumen of the endoplasmic reticulum (ER) contain a C-terminal tetrapeptide, commonly Lys-Asp-Glu-Leu (KDEL) in mammals and His-Asp-Glu-Leu (HDEL) in yeast (Saccharomyces cerevisiae), that acts as a signal for their retrieval from subsequent compartments of the secretory pathway. The receptor for this signal is a ~26 kDa Golgi membrane protein, initially identified as the ERD2 gene product in S. cerevisiae. The receptor molecule, known variously as the ER lumen protein retaining receptor or the 'KDEL receptor', is believed to cycle between the cis side of the Golgi apparatus and the ER. It has also been characterised in a number of other species, including plants, Plasmodium, Drosophila and mammals. In mammals, 2 highly related forms of the receptor are known.
The KDEL receptor is a highly hydrophobic protein of 220 residues; its sequence exhibits 7 hydrophobic regions, all of which have been suggested to traverse the membrane. More recently, however, it has been suggested that only 6 of these regions are transmembrane (TM), resulting in both N- and C-termini on the cytoplasmic side of the membrane.
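The C-terminal retrieval signal recognised by this receptor lends itself to a simple sequence check. The following Python sketch is only an illustration: the function name is hypothetical, the table holds just the two tetrapeptides named above, and the example sequences are invented rather than real proteins.

```python
# C-terminal tetrapeptides named in the text and the organisms they are
# associated with there. Other variants are deliberately not listed.
ER_RETRIEVAL_SIGNALS = {
    "KDEL": "mammals",
    "HDEL": "yeast (Saccharomyces cerevisiae)",
}


def er_retrieval_signal(protein_sequence):
    """Return the C-terminal tetrapeptide if it is KDEL or HDEL, else None.

    `protein_sequence` is a one-letter amino acid string; case and
    surrounding whitespace are ignored.
    """
    tail = protein_sequence.strip().upper()[-4:]
    return tail if tail in ER_RETRIEVAL_SIGNALS else None


# Invented example sequences, used only to exercise the check.
for seq in ("MKTAYIAKQRKDEL", "MSLLNPTGAHDEL", "MKTAYIAKQRKDELA"):
    print(seq, er_retrieval_signal(seq))
```

The third example carries KDEL internally rather than at the C-terminus, so it is not reported; the signal described above is positional.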
Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.
This group of cysteine peptidases belongs to the MEROPS peptidase family C12 (ubiquitin C-terminal hydrolase family, clan CA). Families within the CA clan are loosely termed papain-like, as the protein fold of the peptidase unit resembles that of papain, the type example for clan CA. The type example for family C12 is the human ubiquitin C-terminal hydrolase UCH-L1.
Ubiquitin is highly conserved, commonly found conjugated to proteins in eukaryotic cells, where it may act as a marker for rapid degradation, or it may have a chaperone function in protein assembly. The ubiquitin is released by cleavage from the bound protein by a protease. A number of deubiquitinising proteases are known: all are activated by thiol compounds, and inhibited by thiol-blocking agents and ubiquitin aldehyde, and as such have the properties of cysteine proteases.
The deubiquitinising proteases can be split into 2 size ranges (20-30 kDa and 100-200 kDa): this family comprises the 20-30 kDa peptides, which include the yeast yuh1. Yeast yuh1 protease is known to be active only against small ubiquitin conjugates, being inactive against conjugated beta-galactosidase. A mammalian homologue, UCH (ubiquitin conjugate hydrolase), is one of the most abundant proteins in the brain. Only one conserved cysteine can be identified, along with two conserved histidines. The spacing between the cysteine and the second histidine is thought to be more representative of the cysteine/histidine spacing of a cysteine protease catalytic dyad.
The actin filament system, a prominent part of the cytoskeleton in eukaryotic cells, is both a static structure and a dynamic network that can undergo rearrangements: it is thought to be involved in processes such as cell movement and phagocytosis, as well as muscle contraction.
The F-actin capping protein binds in a calcium-independent manner to the fast growing ends of actin filaments (barbed end) thereby blocking the exchange of subunits at these ends. Unlike gelsolin and severin this protein does not sever actin filaments. The F-actin capping protein is a heterodimer composed of two unrelated subunits: alpha and beta. Neither of the subunits shows sequence similarity to other filament-capping proteins.
The beta subunit is a protein of about 280 amino acid residues whose sequence is well conserved in eukaryotic species.
HDAs function in multi-subunit complexes, reversing the acetylation of histones by histone acetyltransferases, and are also believed to deacetylate general transcription factors such as TFIIF and sequence-specific transcription factors such as p53. Thus, HDAs contribute to the regulation of transcription, in particular transcriptional repression. At N-terminal tails of histones, removal of the acetyl group from the epsilon-amino group of a lysine side chain will restore its positive charge, which may stabilise the histone-DNA interaction and prevent activating transcription factors binding to promoter elements. HDAs play important roles in the cell cycle and differentiation, and their deregulation can contribute to the development of cancer.
S-AdoMet + tRNA = S-adenosyl-L-homocysteine + tRNA containing N2-methylguanine
The TRM1 gene of Saccharomyces cerevisiae is necessary for the N2,N2-dimethylguanosine modification of both mitochondrial and cytoplasmic tRNAs. The enzyme is found in both eukaryotes and archaea.
The Myb gene family became a topic of interest following the discovery of the v-Myb avian retroviral oncogene and its cellular homologue, c-Myb. Mammals, birds, and amphibians were all found to contain three different Myb-related genes. The vertebrate Myb proteins are nuclear and can bind to the same specific DNA sequences YAAC(G/T)G. The Myb domain is included in the SANT domain family, and is a conserved region consisting of three tandem repeats. A-Myb and c-Myb encode tissue-specific transcriptional activators. In contrast, B-Myb appears to be essential in all dividing cells and transcriptional activation appears unlikely to be its primary physiologic function. Only single Myb genes are present in invertebrates such as the sea urchin Strongylocentrotus purpuratus (Purple sea urchin) and Drosophila melanogaster, and these genes most closely resemble vertebrate B-Myb. A Myb-related transcription factor gene has been isolated from the cellular slime mold Dictyostelium discoideum (Slime mold). Myb repeat-containing transcription factors are highly represented in plants, with more than two hundred proteins represented in maize and over one hundred present in Arabidopsis thaliana (Mouse-ear cress). Other Myb-related proteins include essential components of the mRNA splicing machinery such as CDC5/CEF1.
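The consensus binding site YAAC(G/T)G quoted above can be written as a regular expression, reading Y as the standard IUPAC code for a pyrimidine (C or T). The Python sketch below is an illustration under that assumption; the function name and the test sequence are invented, and only the given strand is scanned (a fuller search would also scan the reverse complement).

```python
import re

# YAAC(G/T)G with Y expanded to C or T (IUPAC pyrimidine code).
MYB_SITE = re.compile(r"[CT]AAC[GT]G")


def find_myb_sites(dna):
    """Return (0-based start, matched site) pairs on the given strand only."""
    return [(m.start(), m.group()) for m in MYB_SITE.finditer(dna.upper())]


# Invented test sequence containing two sites that fit the consensus.
print(find_myb_sites("GGTAACGGTTCAACTGAA"))
# → [(2, 'TAACGG'), (10, 'CAACTG')]
```

A match to the consensus indicates only sequence compatibility with the motif, not demonstrated binding.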
Cyclase-associated proteins (CAPs) are highly conserved actin-binding proteins present in a wide range of organisms including yeast, fly, plants, and mammals. CAPs are multifunctional proteins that contain several structural domains. CAP is involved in species-specific signalling pathways. In Drosophila, CAP functions in Hedgehog-mediated eye development and in establishing oocyte polarity. In Dictyostelium (slime mold), CAP is involved in microfilament reorganisation near the plasma membrane in a PIP2-regulated manner and is required to perpetuate the cAMP relay signal to organise fruitbody formation. In plants, CAP is involved in plant signalling pathways required for co-ordinated organ expansion. In yeast, CAP is involved in adenylate cyclase activation, as well as in vesicle trafficking and endocytosis. In both yeast and mammals, CAPs appear to be involved in recycling G-actin monomers from ADF/cofilins for subsequent rounds of filament assembly. In mammals, there are two different CAPs (CAP1 and CAP2) that share 64% amino acid identity.
All CAPs appear to contain a C-terminal actin-binding domain that regulates actin remodelling in response to cellular signals and is required for normal cellular morphology, cell division, growth and locomotion in eukaryotes. CAP directly regulates actin filament dynamics and has been implicated in a number of complex developmental and morphological processes, including mRNA localisation and the establishment of cell polarity. Actin exists both as globular (G) (monomeric) actin subunits and assembled into filamentous (F) actin. In cells, actin cycles between these two forms. Proteins that bind F-actin often regulate F-actin assembly and its interaction with other proteins, while proteins that interact with G-actin often control the availability of unpolymerised actin. CAPs bind G-actin.
In addition to actin-binding, CAPs can have additional roles, and may act as bifunctional proteins. In Saccharomyces cerevisiae (Baker's yeast), CAP is a component of the adenylyl cyclase complex (Cyr1p) that serves as an effector of Ras during normal cell signalling. S. cerevisiae CAP functions to expose adenylate cyclase binding sites to Ras, thereby enabling adenylate cyclase to be activated by Ras regulatory signals. In Schizosaccharomyces pombe (Fission yeast), CAP is also required for adenylate cyclase activity, but not through the Ras pathway. In both organisms, the N-terminal domain is responsible for adenylate cyclase activation, but the S. cerevisiae and S. pombe N-termini cannot complement one another. Yeast CAPs are unique among the CAP family of proteins, because they are the only ones to directly interact with and activate adenylate cyclase. S. cerevisiae CAP has four major domains. In addition to the N-terminal adenylate cyclase-interacting domain and the C-terminal actin-binding domain, it possesses two other domains: a proline-rich domain that interacts with Src homology 3 (SH3) domains of specific proteins, and a domain that is responsible for CAP oligomerisation to form multimeric complexes (although oligomerisation appears to involve the N- and C-terminal domains as well). The proline-rich domain interacts with profilin, a protein that catalyses nucleotide exchange on G-actin monomers and promotes addition to barbed ends of filamentous F-actin. Since CAP can bind profilin via a proline-rich domain, and G-actin via a C-terminal domain, it has been suggested that a ternary G-actin/CAP/profilin complex could be formed.
This entry represents CAP proteins from various organisms.
The actin filament system, a prominent part of the cytoskeleton in eukaryotic cells, is both a static structure and a dynamic network that can undergo rearrangements: it is thought to be involved in processes such as cell movement and phagocytosis, as well as muscle contraction.
The F-actin capping protein binds in a calcium-independent manner to the fast growing ends of actin filaments (barbed end) thereby blocking the exchange of subunits at these ends. Unlike gelsolin and severin this protein does not sever actin filaments. The F-actin capping protein is a heterodimer composed of two unrelated subunits: alpha and beta. Neither of the subunits shows sequence similarity to other filament-capping proteins.
The alpha subunit is a protein of about 268 to 286 amino acid residues whose sequence is well conserved in eukaryotic species.
The 20S proteasome is a multicatalytic complex that is responsible for the non-lysosomal degradation of intracellular proteins. The proteasome is composed of a catalytic core that is regulated by protein complexes, which bind to the ends of the cylindrical core structure. One of these regulatory complexes is the PA28 activator complex (also known as the 11S regulator, or REG), a ring-shaped heptameric structure that enhances the peptidase activity of the core enzyme. Three REG subunits have been isolated: REGalpha, REGbeta and REGgamma. REGalpha and REGbeta preferentially form a heteromeric complex with alternating alpha and beta subunits. The structure of the human REGalpha subunit reveals a heptameric barrel-shaped assembly containing a central channel. The binding of REG is thought to create a pore through which substrates and products can pass.
Ubiquitin is a protein of 76 amino acid residues, found in all eukaryotic cells, whose sequence is extremely well conserved from protozoa to vertebrates. It is widely known as a post-translational tag used to signal a protein's hydrolytic destruction. Other functions for ubiquitin depend on its differential internal isopeptide linkages. In addition, several ubiquitin-like proteins have been discovered from genome-sequencing efforts, other structural studies, and genetic screens. These new data show that proteins with the ubiquitin domain are adaptable, transposable genetic elements, which have been appended to other genes and utilised for many different cellular functions, depending on the ubiquitin-like protein's identity, subcellular location, and method of covalent attachment. The post-translational ligation of proteins to members of the ubiquitin superfamily can signal many different fates for the target protein.
Ubiquitin is a globular protein, the last four C-terminal residues (Leu-Arg-Gly-Gly) extending from the compact structure to form a 'tail' important for its function. The latter is mediated by the covalent conjugation of ubiquitin to target proteins, by an isopeptide linkage between the C-terminal glycine and the epsilon amino group of lysine residues in the target proteins.
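The C-terminal 'tail' described above can be checked against a concrete sequence. This is a small sketch, not part of the entry itself: the 76-residue sequence below is the commonly cited human ubiquitin sequence, included here as an assumption for illustration, and the helper name c_terminal_tail is hypothetical.

```python
# Commonly cited human ubiquitin sequence (assumption: included for illustration only).
UBIQUITIN = (
    "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQ"
    "QRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"
)

# One-letter to three-letter codes for the residues that occur in the tail.
THREE_LETTER = {"L": "Leu", "R": "Arg", "G": "Gly"}

def c_terminal_tail(seq, n=4):
    """Return the last n residues of seq in three-letter notation, joined by hyphens."""
    return "-".join(THREE_LETTER[residue] for residue in seq[-n:])

print(len(UBIQUITIN))              # 76
print(c_terminal_tail(UBIQUITIN))  # Leu-Arg-Gly-Gly
# The final residue is the glycine that forms the isopeptide bond with a target lysine.
print(UBIQUITIN[-1] == "G")        # True
```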
Ubiquilin is a ubiquitin-like (UBL) protein with an N-terminal UBL domain and a C-terminal ubiquitin-associated (UBA) domain.
This family contains dephospho-CoA kinases, which catalyse the final step in CoA biosynthesis: the phosphorylation of the 3'-hydroxyl group of ribose using ATP as a phosphate donor.
The crystal structures of a number of the proteins in this entry have been determined, including the structure of the protein from Haemophilus influenzae at 2.0 Å resolution in a complex with ATP. The protein consists of three domains: the nucleotide-binding domain with a five-stranded parallel beta-sheet, the substrate-binding alpha-helical domain, and the lid domain formed by a pair of alpha-helices. The overall topology of the protein resembles the structures of other nucleotide kinases.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins.
This entry represents subunit H (also known as Vma13p) found in the V1 complex of V-ATPases. This subunit has a regulatory function, being responsible for activating ATPase activity and coupling ATPase activity to proton flow. The yeast enzyme contains five motifs similar to the HEAT or Armadillo repeats seen in the importins, and can be divided into two distinct domains: a large N-terminal domain consisting of stacked alpha helices, and a smaller C-terminal alpha-helical domain with a similar superhelical topology to an armadillo repeat.
More information about this protein can be found at Protein of the Month: ATP Synthases.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA, and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families includes mammalian ribosomal protein L6 (previously known as TAX-responsive enhancer element binding protein 107); Caenorhabditis elegans ribosomal protein L6 (R151.3); Saccharomyces cerevisiae (Baker's yeast) ribosomal protein YL16A/YL16B; and Mesembryanthemum crystallinum (Common ice plant) ribosomal protein YL16-like. These proteins have 175 (yeast) to 287 (mammalian) amino acids.
A number of eukaryotic and archaebacterial ribosomal proteins can be grouped in this family of ribosomal proteins, S17e. They include vertebrate, Drosophila and Neurospora crassa (crp-3) S17 proteins, as well as yeast S17a (RP51A) and S17b (RP51B) and archaebacterial S17e.
RER1 family proteins are involved in the retrieval of some endoplasmic reticulum membrane proteins from the early Golgi compartment. The C terminus of yeast Rer1p interacts with a coatomer complex.
The ribosomal proteins catalyse ribosome assembly and stabilise the rRNA, tuning the structure of the ribosome for optimal function. Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S17 is known to bind specifically to the 5' end of 16S ribosomal RNA in Escherichia coli (primary rRNA binding protein), and is thought to be involved in the recognition of termination codons. Experimental evidence has revealed that S17 has virtually no groups exposed on the ribosomal surface.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II.
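The class assignments listed above lend themselves to a simple lookup. The sketch below encodes only what the text states (class membership and the preferred hydroxyl site); the names CLASS_I, CLASS_II and synthetase_class are hypothetical, and the a/b/c subclasses are left out because the text does not give their membership.

```python
# Amino acid specificities of the two synthetase classes, as listed in the text.
CLASS_I = {
    "arginine", "cysteine", "glutamic acid", "glutamine", "isoleucine",
    "leucine", "methionine", "tyrosine", "tryptophan", "valine",
}
CLASS_II = {
    "alanine", "asparagine", "aspartic acid", "glycine", "histidine",
    "lysine", "phenylalanine", "proline", "serine", "threonine",
}

def synthetase_class(amino_acid):
    """Return (class, preferred tRNA hydroxyl) for the synthetase of the given amino acid."""
    name = amino_acid.strip().lower()
    if name in CLASS_I:
        return ("I", "2'-hydroxyl")
    if name in CLASS_II:
        return ("II", "3'-hydroxyl")
    raise ValueError(f"not one of the 20 listed amino acids: {amino_acid!r}")

print(synthetase_class("glycine"))   # ('II', "3'-hydroxyl")
print(synthetase_class("Arginine"))  # ('I', "2'-hydroxyl")
```

The two sets are disjoint and together cover all 20 amino acids, which gives a quick consistency check on the listing.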
In eubacteria, glycyl-tRNA synthetase is an alpha2/beta2 tetramer composed of two different subunits. In some eubacteria, and in archaea and eukaryota, glycyl-tRNA synthetase is an alpha2 dimer (this family). It belongs to class IIc and is one of the most complex synthetases. Most striking is the lack of similarity between the two types: divergence at the sequence level is so great that it is impossible to infer descent from common genes. The alpha and beta subunits also lack significant sequence similarity. However, they are translated from a single mRNA, and a single-chain glycyl-tRNA synthetase from Chlamydia trachomatis has been found to have significant similarity with both domains, suggesting divergence from a single polypeptide chain.
The sequence and crystal structure of the homodimeric glycyl-tRNA synthetase from Thermus thermophilus show that each monomer consists of an active site strongly resembling that of the aspartyl and seryl enzymes, a C-terminal anticodon recognition domain of 100 residues, and a third domain, unusually inserted between motifs 1 and 2, that almost certainly interacts with the acceptor arm of tRNA(Gly). The C-terminal domain has a novel five-stranded parallel-antiparallel beta-sheet structure with three surrounding helices. The active site residues most probably responsible for substrate recognition, in particular in the Gly binding pocket, can be identified by inference from aspartyl-tRNA synthetase due to the conserved nature of the class II active site.
This family includes ribosomal L4/L1 from bacteria, chloroplasts and mitochondria. The L4 protein from yeast has been shown to bind rRNA.
A number of eukaryotic and archaebacterial ribosomal proteins belong to the L34e family. These include vertebrate L34, mosquito L31, plant L34, yeast putative ribosomal protein YIL052c and archaebacterial L34e.
This protein has been shown in Saccharomyces cerevisiae (Baker's yeast) to be one of several required for the modification of a particular histidine residue of translation elongation factor 2 to diphthamide. This modified site can then become the target for ADP-ribosylation by diphtheria toxin.
A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of proteins of 56 to 96 amino-acid residues that share a highly conserved region located in the N-terminal part.
A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities.
The proteins in this family have from 64 to 78 amino acids and a highly conserved C-terminal region.
The chaperonins are 'helper' molecules required for correct folding and subsequent assembly of some proteins. They are required for normal cell growth, and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions. Type I chaperonins, present in eubacteria, mitochondria and chloroplasts, require the concerted action of two proteins, chaperonin 60 (cpn60) and chaperonin 10 (cpn10).
The 10 kDa chaperonin (cpn10, or groES in bacteria) exists as a ring-shaped oligomer of six to eight identical subunits, while the 60 kDa chaperonin (cpn60, or groEL in bacteria) forms a structure comprising two stacked rings, each ring containing seven identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The central cavity of the cylindrical cpn60 tetradecamer provides an isolated environment for protein folding, whilst cpn10 binds to cpn60 and synchronises the release of the folded protein in an Mg2+-ATP dependent manner. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60.
Escherichia coli GroES has also been shown to bind ATP cooperatively, and with an affinity comparable to that of GroEL. Each GroEL subunit contains three structurally distinct domains: an apical, an intermediate and an equatorial domain. The apical domain contains the binding sites for both GroES and the unfolded protein substrate. The equatorial domain contains the ATP-binding site and most of the oligomeric contacts. The intermediate domain links the apical and equatorial domains and transfers allosteric information between them. The GroEL oligomer is a tetradecamer, cylindrically shaped, that is organised in two heptameric rings stacked back to back. Each GroEL ring contains a central cavity, known as the 'Anfinsen cage', that provides an isolated environment for protein folding. The identical 10 kDa subunits of GroES form a dome-like heptameric oligomer in solution. ATP binding to GroES may be important in charging the seven subunits of the interacting GroEL ring with ATP, to facilitate cooperative ATP binding and hydrolysis for substrate protein release.
DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. In bacteria, the core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.
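The idea of polymerising ribonucleotides into a sequence complementary to the template can be illustrated with a few lines of code. This is a deliberately simplified sketch: it assumes the template strand is supplied in the 3' to 5' direction, so that the RNA comes out 5' to 3', and it ignores promoters, termination signals and processing. The function name transcribe and the example template are hypothetical.

```python
# Base-pairing rules used when copying a DNA template into RNA (U pairs with A).
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5):
    """Return the RNA (written 5' to 3') complementary to a DNA template read 3' to 5'."""
    try:
        return "".join(COMPLEMENT[base] for base in template_3_to_5.upper())
    except KeyError as err:
        raise ValueError(f"unexpected base in template: {err.args[0]!r}") from None

# Hypothetical template strand, written 3' to 5'.
print(transcribe("TACGGT"))  # AUGCCA
```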
RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:
A component of 14 to 18 kDa is shared by all three forms of eukaryotic RNA polymerases. It has been sequenced in budding yeast (gene RPB6 or RPO26), in fission yeast (gene rpb6 or rpo15), in human and in African swine fever virus (ASFV), and is evolutionarily related to archaeal subunit K (gene rpoK). The archaeal protein is colinear with the C-terminal part of the eukaryotic subunit.
This family includes eukaryotic translation initiation factor 6 (eIF6) as well as presumed archaeal homologues.
The assembly of 80S ribosomes requires joining of the 40S and 60S subunits, which is triggered by the formation of an initiation complex on the 40S subunit. This event is rate-limiting for translation, and depends on external stimuli and the status of the cell. Eukaryotic translation initiation factor 6 (eIF6) binds specifically to the free 60S ribosomal subunit and prevents its association with the 40S ribosomal subunit. Furthermore, eIF6 interacts in the cytoplasm with RACK1, a receptor for activated protein kinase C (PKC). RACK1 is a major component of translating ribosomes, which harbour significant amounts of PKC. Loading 60S subunits with eIF6 causes a dose-dependent translational block and impairment of 80S formation, both of which are reversed by expression of RACK1 and stimulation of PKC in vivo and in vitro. PKC stimulation leads to eIF6 phosphorylation and its release, promoting 80S ribosome formation. RACK1 thus provides a physical and functional link between PKC signalling and ribosome activation.
This family includes proteins such as Drosophila saliva, MtN3 involved in root nodule development and a protein involved in activation and expression of recombination activation genes (RAGs). Although the molecular function of these proteins is unknown, they are almost certainly transmembrane proteins. This family contains a region of two transmembrane helices that is found in two copies in most members of the family.
This entry represents a group of uncharacterised proteins that appear to be related to nodulin MtN3 and saliva related transmembrane protein.
This entry represents the RAG1 (recombination activating gene 1)-activating protein 1 homologue. Expression of the recombination activating genes (RAGs) involved in V(D)J recombination is regulated by the RAG1 gene activator (RGA) in mammals.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA, and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of mammalian ribosomal protein L24; yeast ribosomal protein L30A/B (Rp29) (YL21); Kluyveromyces lactis ribosomal protein L30; Arabidopsis thaliana ribosomal protein L24 homolog; Haloarcula marismortui ribosomal protein HL21/HL22; and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ1201. These proteins have 60 to 160 amino-acid residues.
Limited proteolysis of most large protein precursors is carried out in vivo by the subtilisin-like pro-protein convertases. Many important biological processes such as peptide hormone synthesis, viral protein processing and receptor maturation involve proteolytic processing by these enzymes. The subtilisin-serine protease (SRSP) family hormone and pro-protein convertases (furin, PC1/3, PC2, PC4, PACE4, PC5/6, and PC7/7/LPC) act within the secretory pathway to cleave polypeptide precursors at specific basic sites, generating their biologically active forms. Serum proteins, pro-hormones, receptors, zymogens, viral surface glycoproteins, bacterial toxins and others are activated by this route. The SRSPs share the same domain structure, including a signal peptide, the pro-peptide, the catalytic domain, the P/middle or homo B domain, and the C-terminus.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
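The two motif definitions above can be turned into a simple sequence scan. The following is a minimal sketch, not a MEROPS tool: the residue sets chosen for 'b' (uncharged) and 'c' (hydrophobic), the decision to exclude proline from every position of the stringent window, and the function name `find_motif` are all assumptions made for illustration.

```python
import re

# Residue classes are assumptions for this sketch, not MEROPS definitions.
UNCHARGED = "ACFGILMNQSTVWY"   # 'b': every residue except D, E, K, R, H and P
HYDROPHOBIC = "AVLIMFWY"       # 'c'

# Zero-width lookaheads allow overlapping matches to be reported.
HEXXH = re.compile(r"(?=HE..H)")
# 'abXHEbbHbc': 'a' and 'X' are taken as any residue except proline (assumption).
REFINED = re.compile(
    rf"(?=[^P][{UNCHARGED}][^P]HE[{UNCHARGED}]{{2}}H[{UNCHARGED}][{HYDROPHOBIC}])"
)


def find_motif(pattern, sequence):
    """Return the 1-based start position of every match of pattern in sequence."""
    return [m.start() + 1 for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    toy = "MKTVVAHELTHAVGG"  # hypothetical peptide, used only as an example
    print(find_motif(HEXXH, toy))    # positions of the loose HEXXH motif
    print(find_motif(REFINED, toy))  # positions of the stringent 10-residue window
```

Because the loose HEXXH pattern is common, a scan like this would normally be used only as a first filter; the stringent pattern reduces, but does not eliminate, false positives.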
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This entry contains proteins that belong to MEROPS peptidase family M24 (clan MG), which share a common structural fold, the "pita-bread" fold. The fold contains both alpha helices and an anti-parallel beta sheet within two structurally similar domains that are thought to be derived from an ancient gene duplication. The active site, where conserved, is located between the two domains. The fold is common to methionine aminopeptidase, aminopeptidase P, prolidase, agropine synthase and creatinase. Though many of these peptidases require a divalent cation, creatinase is not a metal-dependent enzyme.
The entry also contains proteins that have lost catalytic activity, for example Spt16, which is a component of the FACT complex. The crystal structure of the N-terminal domain of Spt16, determined to 2.1 Å, reveals an aminopeptidase P fold whose enzymatic activity has been lost. This fold binds directly to histones H3-H4 through an interaction with their globular core domains, as well as with their N-terminal tails.
The FACT complex is a stable heterodimer in Saccharomyces cerevisiae (Baker's yeast) comprising Spt16p and Pob3p. The complex plays a role in transcription initiation and promotes binding of TATA-binding protein (TBP) to a TATA box in chromatin; it also facilitates RNA Polymerase II transcription elongation through nucleosomes by destabilizing and then reassembling nucleosome structure.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of metallopeptidases belongs to MEROPS peptidase family M24 (clan MG), subfamily M24A.
Methionine aminopeptidase (MAP) is responsible for the removal of the amino-terminal (initiator) methionine from nascent eukaryotic cytosolic and cytoplasmic prokaryotic proteins if the penultimate amino acid is small and uncharged. All MAP studied to date are monomeric proteins that require cobalt ions for activity. Two subfamilies of MAP enzymes are known to exist. Although evolutionarily related, they share only a limited amount of sequence similarity, mostly clustered around the residues shown, in the Escherichia coli MAP, to be involved in cobalt-binding. The first family consists of enzymes from prokaryotes as well as eukaryotic MAP-1, while the second group is made up of archaeal MAP and eukaryotic MAP-2.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of metallopeptidases belongs to MEROPS peptidase family M24 (clan MG), subfamily M24A.
Methionine aminopeptidase (MAP) is responsible for the removal of the amino-terminal (initiator) methionine from nascent eukaryotic cytosolic and cytoplasmic prokaryotic proteins if the penultimate amino acid is small and uncharged. All MAP studied to date are monomeric proteins that require cobalt ions for activity. Two subfamilies of MAP enzymes are known to exist. Although evolutionarily related, they share only a limited amount of sequence similarity, mostly clustered around the residues shown, in the Escherichia coli MAP, to be involved in cobalt-binding. The first family consists of enzymes from prokaryotes as well as eukaryotic MAP-1, while the second group is made up of archaeal MAP and eukaryotic MAP-2 and includes proteins that do not seem to be MAP but are clearly evolutionarily related, such as mouse proliferation-associated protein 1 and fission yeast curved DNA-binding protein.
Proteins synthesised on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer. While clathrin mediates endocytic protein transport, and transport from ER to Golgi, coatomers primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins. For example, the coatomer COPI (coat protein complex I) is responsible for reverse transport of recycled proteins from Golgi and pre-Golgi compartments back to the ER, while COPII buds vesicles from the ER to the Golgi. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and budding from Golgi membranes. Activated small guanosine triphosphatases (GTPases) attract coat proteins to specific membrane export sites, thereby linking coatomers to export cargos. As coat proteins polymerise, vesicles are formed and budded from membrane-bound organelles. Coatomer complexes also influence Golgi structural integrity, as well as the processing, activity, and endocytic recycling of LDL receptors. In mammals, coatomer complexes can only be recruited by membranes associated with ADP-ribosylation factors (ARFs), which are small GTP-binding proteins. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta', gamma, delta, epsilon and zeta subunits.
This entry represents the epsilon subunit of the coatomer complex, which is involved in the regulation of intracellular protein trafficking between the endoplasmic reticulum and the Golgi complex.
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
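The relationship between the linear order of the catalytic residues and the clan can be expressed as a small lookup. This is an illustrative sketch only: the function names are hypothetical, the residue positions are supplied by the caller, and a matching order is a hint of clan membership rather than proof, since the text says only that the order "commonly" reflects clan relationships.

```python
# Linear order of the catalytic triad mapped to the clan named in the text.
CLAN_BY_ORDER = {
    "HDS": "PA",  # chymotrypsin clan
    "DHS": "SB",  # subtilisin clan
    "SDH": "SC",  # carboxypeptidase clan
}


def triad_order(positions):
    """Return the triad letters sorted by sequence position.

    positions maps each of 'H', 'D' and 'S' to its residue number.
    """
    if set(positions) != {"H", "D", "S"}:
        raise ValueError("expected exactly one position each for H, D and S")
    return "".join(sorted(positions, key=positions.get))


def clan_for_triad(positions):
    """Return the clan suggested by the triad order, or None if unrecognised."""
    return CLAN_BY_ORDER.get(triad_order(positions))


if __name__ == "__main__":
    # Commonly cited chymotrypsin numbering: His57, Asp102, Ser195.
    print(clan_for_triad({"H": 57, "D": 102, "S": 195}))
```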
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of serine peptidases belongs to MEROPS peptidase family S26 (signal peptidase I family, clan SF), subfamily S26B.
Eukaryotic microsomal signal peptidase is involved in the removal of signal peptides from secretory proteins as they pass into the endoplasmic reticulum lumen. The peptidase is more complex than its mitochondrial and bacterial counterparts, containing a number of subunits, ranging from two in the chicken oviduct peptidase, to five in the dog pancreas protein. They share sequence similarity with the bacterial leader peptidases (family S26A), although activity here is mediated by a serine/histidine dyad rather than a serine/lysine dyad. Archaeal signal peptidases also belong to this group.
Members of this family are involved in asparagine-linked protein glycosylation. In particular, dolichyl-diphosphooligosaccharide-protein glycosyltransferase (DDOST), also known as oligosaccharyltransferase, transfers the high-mannose sugar GlcNAc(2)-Man(9)-Glc(3) from a dolichol-linked donor to an asparagine acceptor in a consensus Asn-X-Ser/Thr motif. In most eukaryotes, the DDOST complex is composed of three subunits, which in humans are described as a 48 kDa subunit, ribophorin I, and ribophorin II. However, the yeast DDOST appears to consist of six subunits (alpha, beta, gamma, delta, epsilon, zeta). The yeast beta subunit is a 45 kDa polypeptide, previously discovered as the Wbp1 protein, with known sequence similarity to the human 48 kDa subunit and the other orthologues. This family includes the 48 kDa-like subunits from several eukaryotes; it also includes the yeast DDOST beta subunit Wbp1.
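The Asn-X-Ser/Thr consensus mentioned above is easy to locate in a protein sequence. The sketch below is a hedged illustration: it additionally excludes proline at the X position, which is a widely used convention but is not stated in the text, and the function name `find_sequons` is hypothetical. A match marks a potential acceptor site only; it does not show that the asparagine is actually glycosylated.

```python
import re

# Asn, then any residue except Pro (assumption beyond the text), then Ser or Thr.
# The lookahead allows overlapping sites to be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return the 1-based positions of Asn residues in Asn-X-Ser/Thr motifs."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


if __name__ == "__main__":
    toy = "MNGSANPTNVT"  # hypothetical peptide, used only as an example
    print(find_sequons(toy))
```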
Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) plays an important role in glycolysis and gluconeogenesis by reversibly catalysing the oxidation and phosphorylation of D-glyceraldehyde-3-phosphate to 1,3-diphospho-glycerate. The enzyme exists as a tetramer of identical subunits, each containing 2 conserved functional domains: an NAD-binding domain, and a highly conserved catalytic domain. The enzyme has been found to bind to actin and tropomyosin, and may thus have a role in cytoskeleton assembly. Alternatively, the cytoskeleton may provide a framework for precise positioning of the glycolytic enzymes, thus permitting efficient passage of metabolites from enzyme to enzyme.
GAPDH displays diverse non-glycolytic functions as well, its role depending upon its subcellular location. For instance, the translocation of GAPDH to the nucleus acts as a signalling mechanism for programmed cell death, or apoptosis. The accumulation of GAPDH within the nucleus is involved in the induction of apoptosis, where GAPDH functions in the activation of transcription. The presence of GAPDH is associated with the synthesis of pro-apoptotic proteins like BAX, c-JUN and GAPDH itself.
GAPDH has been implicated in certain neurological diseases: GAPDH is able to bind to the gene products from neurodegenerative disorders such as Huntington's disease, Alzheimer's disease, Parkinson's disease and Machado-Joseph disease through stretches encoded by their CAG repeats. Abnormal neuronal apoptosis is associated with these diseases. Propargylamines such as deprenyl increase neuronal survival by interfering with apoptosis signalling pathways via their binding to GAPDH, which decreases the synthesis of pro-apoptotic proteins.
This protein family is found in archaea and eukaryota. The human TFAR19 gene encodes a protein which shares significant homology with the corresponding proteins of species ranging from yeast to mice. TFAR19 exhibits a ubiquitous expression pattern and its expression is up-regulated in tumour cells undergoing apoptosis. TFAR19 may play a general role in the apoptotic process. Also included in this family is a DNA-binding protein from the archaeon Methanobacterium thermoautotrophicum.
This entry represents Spo11, a meiotic recombination protein found in eukaryotes, and subunit A of topoisomerase VI, a type IIB topoisomerase found predominantly in archaea. These two types of proteins share structural homology.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. They can be divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA. Topoisomerase VI is a type IIB enzyme that assembles as a heterotetramer, consisting of two A subunits required for DNA cleavage and two B subunits required for ATP hydrolysis. The B subunit is structurally similar to the ATPase domain of type IIA topoisomerases, but the A subunit is distinct, and instead shares homology with the Spo11 protein.
Spo11 is a meiosis-specific protein that is responsible for the initiation of recombination through the formation of DNA double-strand breaks by a type II DNA topoisomerase-like activity. Spo11 acts in conjunction with several other proteins, including Rec102 in yeast, to bring about meiotic recombination.
The Drosophila melanogaster protein pelota is proposed to act in protein translation. It can replace the budding yeast protein DOM34, and is closely related to a set of archaeal proteins. This family contains a proposed RNA binding motif, and is homologous to a family of peptide chain release factors. In Drosophila melanogaster it is required prior to the first meiotic division for spindle formation and nuclear envelope breakdown during spermatogenesis. It is also required for normal eye patterning and for mitotic divisions in the ovary. The meiotic defect in pelota mutants may be a complex result of a protein translation defect, as suggested in yeast by ribosomal protein RPS30A being a multicopy suppressor, and by an altered polyribosome profile in DOM34 mutants rescued by RPS30A.
Coronin is an actin-binding protein that belongs to the WD40-repeat family of proteins and contains 5 WD40 repeats. The WD40 motif is found in a multitude of eukaryotic proteins involved in a variety of cellular processes. Repeated WD40 motifs act as a site for protein-protein interaction, and proteins containing WD40 repeats are known to serve as platforms for the assembly of protein complexes or as mediators of transient interplay among other proteins. The final 40 amino acids are predicted to form a coiled-coil in a coronin homodimer.
Coronin has been found to localise to actin-rich regions of the cell and binds to F-actin with a Kd of 1-5 nM in yeast. It has also been shown to have actin bundling and nucleation activity, and can bind to microtubules in vivo. Coronin has also been shown to bind to and inhibit the Arp2/3 complex in yeast.
Diphthine synthase, also known as diphthamide biosynthesis S-adenosylmethionine-dependent methyltransferase, participates in the modification of a specific histidine residue in elongation factor 2 (EF-2) of eukaryotes and archaea to diphthamide. It is required for the methylation step in diphthamide biosynthesis. The protein was characterised in Saccharomyces cerevisiae and designated DPH5.
Class I aldolases catalyse carbon-carbon bond formation using a 'Schiff base' mechanism. This entry represents deoxyribose-phosphate aldolase, a widely distributed enzyme, which catalyses the following reversible reaction:
2-deoxy-D-ribose 5-phosphate = D-glyceraldehyde 3-phosphate + acetaldehyde
While the physiological role of this enzyme remains unknown in eukaryotes, in prokaryotes it is thought to function in the catabolism of deoxyribonucleotides.
In all studied structures, the deoxyribose-phosphate aldolase subunits adopt the classical eight-bladed TIM barrel fold. The oligomerisation state of the enzyme appears to depend on the living temperature of the organism: the Escherichia coli enzyme is a homodimer, while the enzymes from the thermophilic microorganisms Thermus thermophilus and Aeropyrum pernix are homotetramers. The degree of oligomerisation does not, however, appear to affect catalysis.
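As a quick sanity check on the aldolase reaction given above, the atoms on each side can be counted. This is a minimal sketch: the molecular formulas are those of the neutral, fully protonated compounds and come from standard chemistry rather than from the text, and `is_balanced` is a hypothetical helper name.

```python
from collections import Counter

# Molecular formulas of the neutral forms (assumption: not taken from the text).
DEOXYRIBOSE_5P = Counter({"C": 5, "H": 11, "O": 7, "P": 1})  # 2-deoxy-D-ribose 5-phosphate
G3P = Counter({"C": 3, "H": 7, "O": 6, "P": 1})              # D-glyceraldehyde 3-phosphate
ACETALDEHYDE = Counter({"C": 2, "H": 4, "O": 1})


def is_balanced(substrates, products):
    """Return True if both sides of a reaction contain the same atoms."""
    return sum(substrates, Counter()) == sum(products, Counter())


if __name__ == "__main__":
    print(is_balanced([DEOXYRIBOSE_5P], [G3P, ACETALDEHYDE]))
```

The reaction is a carbon-carbon bond cleavage (or formation, in reverse) with no cofactor or water consumed, which is why a single substrate balances against the two products.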
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.
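The class assignments listed above amount to a lookup table. The sketch below encodes exactly the two lists given in the text, using three-letter amino-acid codes; the names `CLASS_I`, `CLASS_II`, `PREFERRED_HYDROXYL` and `synthetase_class` are hypothetical, and the subclasses a, b and c are not modelled because the text does not list their members.

```python
# Amino acids whose synthetases belong to each class, as listed in the text.
CLASS_I = frozenset(
    {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
)
CLASS_II = frozenset(
    {"Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"}
)

# tRNA hydroxyl to which the aminoacyl group is preferentially coupled.
PREFERRED_HYDROXYL = {"I": "2'", "II": "3'"}


def synthetase_class(amino_acid):
    """Return 'I' or 'II' for a three-letter amino-acid code."""
    if amino_acid in CLASS_I:
        return "I"
    if amino_acid in CLASS_II:
        return "II"
    raise KeyError(f"unknown amino acid code: {amino_acid!r}")


if __name__ == "__main__":
    cls = synthetase_class("Cys")
    print(cls, PREFERRED_HYDROXYL[cls])
```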
Cysteinyl-tRNA synthetase is an alpha monomer and belongs to class Ia.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA, and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of:
These proteins have 87 to 110 amino-acid residues.
Secretion across the inner membrane in some Gram-negative bacteria occurs via the preprotein translocase pathway. Proteins are produced in the cytoplasm as precursors, and require a chaperone subunit to direct them to the translocase component. From there, the mature proteins are either targeted to the outer membrane, or remain as periplasmic proteins. The translocase protein subunits are encoded on the bacterial chromosome.
The translocase itself comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of the mature peptide into the periplasm (SecD and SecF). The chaperone protein SecB is a highly acidic homotetrameric protein that exists as a "dimer of dimers" in the bacterial cytoplasm. SecB maintains preproteins in an unfolded state after translation, and targets these to the peripheral membrane protein ATPase SecA for secretion. The structure of the Escherichia coli SecYEG assembly revealed a sandwich of two membranes interacting through the extensive cytoplasmic domains. Each membrane is composed of dimers of SecYEG. The monomeric complex contains 15 transmembrane helices.
The eubacterial secY protein interacts with the signal sequences of secretory proteins as well as with two other components of the protein translocation system: secA and secE. SecY is an integral plasma membrane protein of 419 to 492 amino acid residues that apparently contains 10 transmembrane (TM), 6 cytoplasmic and 5 periplasmic regions.
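The region counts quoted for SecY follow from simple alternation: each transmembrane segment flips the chain to the other side of the membrane, so 10 segments separate 11 non-membrane regions. The sketch below illustrates this under the assumption that the N-terminus is cytoplasmic; the function name `region_sides` is hypothetical and the code is a counting exercise, not a topology predictor.

```python
from collections import Counter

OPPOSITE = {"cytoplasmic": "periplasmic", "periplasmic": "cytoplasmic"}


def region_sides(n_tm, n_terminus_side="cytoplasmic"):
    """List the side of each non-membrane region, from N- to C-terminus.

    Every transmembrane segment moves the chain to the opposite side,
    so n_tm segments delimit n_tm + 1 regions.
    """
    sides = []
    side = n_terminus_side
    for _ in range(n_tm + 1):
        sides.append(side)
        side = OPPOSITE[side]
    return sides


if __name__ == "__main__":
    # 10 TM segments with a cytoplasmic N-terminus (assumption).
    print(Counter(region_sides(10)))
```

With an even number of transmembrane segments both termini end up on the same side, which is consistent with the 6 cytoplasmic and 5 periplasmic regions described above.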
Cytoplasmic regions 2 and 3, and TM domains 1, 2, 4, 5, 7 and 10 are well conserved: the conserved cytoplasmic regions are believed to interact with cytoplasmic secretion factors, while the TM domains may participate in protein export. Homologs of secY are found in archaebacteria. SecY is also encoded in the chloroplast genome of some algae where it could be involved in a prokaryotic-like protein export system across the two membranes of the chloroplast endoplasmic reticulum (CER) which is present in chromophyte and cryptophyte algae.
The ribosomal RNA large subunit methyltransferase J methylates the 23S rRNA. It specifically methylates the uridine in position 2552 of 23S rRNA in the 50S particle using S-adenosyl-L-methionine as a substrate. It was previously known as cell division protein ftsJ.
A number of uncharacterised hydrophilic proteins of about 30 kDa share regions of similarity. These include,
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA, and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Members of this family are large subunit ribosomal proteins which are found in the Eukaryota and Archaea. These proteins have 115 to 187 amino-acid residues. The family consists of:
Initiation factor 3 (IF-3) (gene infC) is one of the three factors required for the initiation of protein biosynthesis in bacteria. IF-3 is thought to function as a fidelity factor during the assembly of the ternary initiation complex, which consists of the 30S ribosomal subunit, the initiator tRNA and the messenger RNA. IF-3 is a basic protein that binds to the 30S ribosomal subunit. The chloroplast initiation factor IF-3(chl) is a protein that enhances the poly(A,U,G)-dependent binding of the initiator tRNA to chloroplast ribosomal 30S subunits; its central section is evolutionarily related to the sequence of bacterial IF-3.
Ribonuclease HII is involved in the degradation of the ribonucleotide moiety of RNA-DNA hybrid molecules, carrying out endonucleolytic cleavage to 5'-phosphomonoester. Proteins which belong to this family have been found in bacteria, archaea, and yeasts. This family also includes Ribonuclease HIII.
A number of eukaryotic and archaebacterial large subunit ribosomal proteins can be grouped on the basis of sequence similarities. These proteins have 87 to 128 amino-acid residues. This family consists of:
Ribosomal protein L38e forms part of the 60S ribosomal subunit. This family is found in eukaryotes.
L20 is a protein from the large (50S) subunit; in Escherichia coli it is known to bind directly to the 23S rRNA, and is required for ribosome assembly, but does not take part in protein synthesis. It belongs to a family of ribosomal proteins, including L20 from eubacteria, plant and algal chloroplasts and cyanelles.
The endoplasmic reticulum (ER) of the yeast Saccharomyces cerevisiae (Baker's yeast) contains a proteolytic system able to selectively degrade misfolded lumenal secretory proteins. For examination of the components involved in this degradation process, mutants were isolated. They could be divided into four complementation groups. The mutations led to stabilisation of two different substrates for this process, and the classes were called der for degradation in the ER. DER1 was cloned by complementation of the der1-2 mutation. The DER1 gene codes for a novel, hydrophobic protein that is localized to the ER. Deletion of DER1 abolished degradation of the substrate proteins, suggesting that the function of the Der1 protein may be specifically required for the degradation process associated with the ER. Interestingly this family seems distantly related to the Rhomboid family of membrane peptidases. This family may also mediate degradation of misfolded proteins.
Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPases) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation. The PTP superfamily can be divided into four subfamilies:
Based on their cellular localisation, PTPases are also classified as:
All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel beta-sheet with flanking alpha-helices containing a beta-loop-alpha-loop that encompasses the PTP signature motif. Functional diversity between PTPases is endowed by regulatory domains and subunits.
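The C(X)5R signature described above (a cysteine, any five residues, then an arginine) can be scanned for with a regular expression. The following is a minimal sketch in Python; the function name and the toy sequence are illustrative assumptions, not part of any database tooling. A hit is only a candidate site, since a seven-residue pattern will also occur by chance in unrelated proteins.

```python
import re

# C(X)5R: cysteine, any five residues, arginine.
# The lookahead lets overlapping candidates be reported as well.
PTP_SIGNATURE = re.compile(r"(?=(C.{5}R))")


def find_ptp_signature(sequence):
    """Return (1-based start, matched heptapeptide) for every C(X)5R hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in PTP_SIGNATURE.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real protein.
    toy = "MKVHCSAGIGRSGTF"
    for start, hit in find_ptp_signature(toy):
        print(start, hit)
```

Positions are reported 1-based, following the usual convention for residue numbering.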
This family includes the mammalian protein tyrosine phosphatase-like protein, PTPLA. A significant variation of PTPLA from other protein tyrosine phosphatases is the presence of proline instead of catalytic arginine at the active site. It is thought that PTPLA proteins have a role in the development, differentiation, and maintenance of a number of tissue types.
This family describes protoheme IX farnesyltransferase, also called haem O synthase, an enzyme that creates an intermediate in the biosynthesis of haem A. Prior to the description of its enzymatic function, this protein was often called a cytochrome o ubiquinol oxidase assembly factor.
This entry contains proteins from all branches of life. The molecular function of these proteins is unknown, but the human protein Memo (mediator of ErbB2-driven cell motility) is included in this family. It has been suggested that Memo controls cell migration by relaying extracellular chemotactic signals to the microtubule cytoskeleton.
The CCAAT-binding factor (CBF) is a mammalian transcription factor that binds to a CCAAT motif in the promoters of a wide variety of genes, including type I collagen and albumin. The factor is a heteromeric complex of A and B subunits, both of which are required for DNA-binding. The subunits can interact in the absence of DNA-binding, conserved regions in each being important in mediating this interaction.
The A subunit can be split into 3 domains on the basis of sequence similarity: a non-conserved N-terminal 'A domain'; a highly-conserved central 'B domain' involved in DNA-binding; and a C-terminal 'C domain', which contains a number of glutamine and acidic residues involved in protein-protein interactions. The A subunit shows striking similarity to the HAP3 subunit of the yeast CCAAT-binding heterotrimeric transcription factor. The Kluyveromyces lactis HAP3 protein has been predicted to contain a 4-cysteine zinc finger, which is thought to be present in similar HAP3 and CBF subunit A proteins, in which the third cysteine is replaced by a serine. This family also includes DNA topoisomerase II, which controls the topology of DNA by transient breaking of the strands and rejoining.
This family contains the Saccharomyces cerevisiae (Baker's yeast) HAM1 protein and other hypothetical archaeal, bacterial and Caenorhabditis elegans proteins. S. cerevisiae HAM1 protects against the mutagenic effects of the base analog 6-N-hydroxylaminopurine (HAP), which can be a natural product of monooxygenase activity on adenine. HAM1 protein protects the cell from HAP, either at the level of deoxynucleoside triphosphate or at the DNA level, by an as yet unidentified set of reactions.
Members of this family are helicases that catalyse ATP-dependent unwinding of double-stranded DNA to single-stranded DNA. The family includes both Rep and UvrD helicases. The Rep family helicases are composed of four structural domains. The Rep proteins function as dimers.
In Escherichia coli, UV and many chemicals appear to cause mutagenesis by a process of translesion synthesis that requires DNA polymerase III and the SOS-regulated proteins UmuD, UmuC and RecA. This machinery allows replication to continue through a DNA lesion, and therefore avoids lethal interruption of DNA replication after DNA damage. UmuC is a well conserved protein in prokaryotes, with a homologue in yeast species.
Proteins currently known to belong to this family are listed below:
Xeroderma pigmentosum (XP) is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. The skin cells of people with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-G is one of the rarest and most phenotypically heterogeneous forms of XP, showing anything from slight to extreme dysfunction in DNA excision repair. XP-G can be corrected by a 133 kDa nuclear protein, XPGC. XPGC is an acidic protein that confers normal UV resistance in expressing cells. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms. XPGC cleaves one strand of the duplex at the border with the single-stranded region.
XPG belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker's yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5'-3' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1, and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C-terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]). It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved.
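The conserved pentapeptide above is written in PROSITE-style notation, where hyphens separate positions and square brackets list the residues allowed at one position. As a rough illustration, such a pattern can be translated into a regular expression and searched for in a sequence. The sketch below is an assumption-laden helper (the name `prosite_to_regex` and the toy sequence are hypothetical) that handles only literal residues, bracketed alternatives, the `x` wildcard and fixed `(n)` repeats; it is not a full PROSITE parser.

```python
import re

# One pattern element: a residue, a bracketed set, or the wildcard x,
# optionally followed by a fixed repeat count such as (5).
_ELEMENT = re.compile(r"(\[[A-Z]+\]|[A-Zx])(?:\((\d+)\))?")


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        token, count = match.groups()
        if token == "x":
            token = "."  # x means any residue
        parts.append(token + (f"{{{count}}}" if count else ""))
    return "".join(parts)


# The conserved pentapeptide from the I-region.
PENTAPEPTIDE = re.compile(prosite_to_regex("E-A-[DE]-A-[QS]"))

if __name__ == "__main__":
    # Hypothetical toy sequence, not a real protein.
    toy = "GGEADAQLLEAKAQ"
    print([(m.start() + 1, m.group()) for m in PENTAPEPTIDE.finditer(toy)])
```

A five-residue match on its own is weak evidence of family membership; in practice it would be considered together with the surrounding 27-residue conserved core and the N-region.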
Members of this family catalyse the reduction of the 5,6-double bond of a uridine residue on tRNA. Dihydrouridine modification of tRNA is widely observed in prokaryotes and eukaryotes, and also in some archaea. Most dihydrouridines are found in the D loop of tRNAs. The role of dihydrouridine in tRNA is currently unknown, but it may increase conformational flexibility of the tRNA. It is likely that different family members have different substrate specificities, which may overlap. Dus 1 from Saccharomyces cerevisiae (Baker's yeast) acts on pre-tRNA-Phe, while Dus 2 acts on pre-tRNA-Tyr and pre-tRNA-Leu. Dus 1 is active as a single subunit, requiring NADPH or NADH, and is stimulated by the presence of FAD. Some family members may be targeted to the mitochondria and even have a role in mitochondria.
ATP + RNA 3'-terminal-phosphate = AMP + diphosphate + RNA terminal-2',3'-cyclic-phosphate
These enzymes might be responsible for production of the cyclic phosphate RNA ends that are known to be required by many RNA ligases in both prokaryotes and eukaryotes.
RNA cyclase is a protein of 36 to 42 kDa. The best conserved region is a glycine-rich stretch of residues located in the central part of the sequence, which is reminiscent of various ATP, GTP or AMP glycine-rich loops.
The crystal structure of RNA 3'-terminal phosphate cyclase shows that each molecule consists of two domains. The larger domain contains three repeats of a folding unit comprising two parallel alpha helices and a four-stranded beta sheet; this fold was previously identified in translation initiation factor 3 (IF3). The large domain is similar to one of the two domains of 5-enolpyruvylshikimate-3-phosphate synthase and UDP-N-acetylglucosamine enolpyruvyl transferase. The smaller domain uses a similar secondary structure element with different topology, observed in many other proteins such as thioredoxin. Although the active site of this enzyme could not be unambiguously assigned, it can be mapped to a region surrounding His309, an adenylate acceptor, in which a number of amino acids are highly conserved in the enzyme from different sources.
The movement of lipid and protein components between intracellular organelles requires the regulated interactions of many molecules. Vacuolar protein sorting-associated protein (Vps)5 is a yeast protein that is a subunit of a large multimeric complex, termed the retromer complex, involved in retrograde transport of proteins from endosomes to the trans-Golgi network. Sorting nexin (SNX) 1 and SNX2 are its mammalian orthologs.
To carry out its biological functions, Vps5 forms the retromer complex with at least four other proteins: Vps17, Vps26, Vps29, and Vps35. Vps35 contains a central region of weaker sequence similarity, thought to indicate the presence of at least three domains.
The PHO-4 family of transporters includes the phosphate-repressible phosphate permease (PHO-4) from Neurospora crassa which is probably a sodium-phosphate symporter. This family also includes the human leukemia virus receptor.
Ferrochelatase catalyses the last step in haem biosynthesis: the chelation of a ferrous ion to protoporphyrin IX, to form protohaem. In eukaryotic cells, it binds to the mitochondrial inner membrane with its active site on the matrix side of the membrane.
The X-ray structures of Bacillus subtilis and human ferrochelatase have been solved. The human enzyme exists as a homodimer. Each subunit contains one [Fe2S2] cluster. The monomer is folded into two similar domains, each with a four-stranded parallel beta-sheet flanked by an alpha-helix in a beta-alpha-beta motif that is reminiscent of the fold found in the periplasmic binding proteins. The topological similarity between the domains suggests that they have arisen from a gene duplication event. However, significant differences exist between the two domains, including an N-terminal section (residues 80-130) that forms part of the active site pocket, and a C-terminal extension (residues 390-423) that is involved in coordination of the [Fe2S2] cluster and in stabilisation of the homodimer. The [Fe2S2] cluster ligands are Cys196, Cys403, Cys406 and Cys411. Experiments with Co(II) binding show that His230 and Asp383 are part of the enzyme active site.
Ferrochelatase seems to have a structurally conserved core region that is common to the enzyme from bacteria, plants and mammals. Porphyrin binds in the identified cleft; this cleft also includes the metal-binding site of the enzyme. It is likely that the structure of the cleft region will have different conformations upon substrate binding and release.
GTP cyclohydrolase I catalyses the biosynthesis of formic acid and dihydroneopterin triphosphate from GTP. This reaction is the first step in the biosynthesis of tetrahydrofolate in prokaryotes, of tetrahydrobiopterin in vertebrates, and of pteridine-containing pigments in insects. Comparison of the sequences of the enzyme from bacterial and eukaryotic sources shows that the structure of this enzyme has been extremely well conserved throughout evolution.
This entry represents eukaryotic glutathione synthetase (GSS), a homodimeric enzyme that catalyses the conversion of gamma-L-glutamyl-L-cysteine and glycine to phosphate and glutathione in the presence of ATP. This is the second step in glutathione biosynthesis, the first step being catalysed by gamma-glutamylcysteine synthetase. In humans, defects in GSS are inherited in an autosomal recessive way and are the cause of severe metabolic acidosis, 5-oxoprolinuria, and increased rate of haemolysis and defective function of the central nervous system.
Folylpolyglutamate synthase (FPGS) is responsible for the addition of a polyglutamate tail to folate and folate derivatives. It is an ATP-dependent enzyme isolated from eukaryotic and bacterial sources, where it plays a key role in the retention of the intracellular folate pool. Its sequence is moderately conserved between prokaryotes (gene folC) and eukaryotes.
FPGS belongs to a protein family that contains a number of related peptidoglycan synthetases (Mur).
A crystal structure of the MgATP complex of the enzyme from Lactobacillus casei reveals that folylpolyglutamate synthetase is a modular protein consisting of two domains, one with a typical mononucleotide-binding fold and the other strikingly similar to the folate-binding enzyme dihydrofolate reductase. The active site of the enzyme is located in a large interdomain cleft adjacent to an ATP-binding P-loop motif. Opposite this site, in the C domain, a cavity likely to be the folate binding site has been identified, and inspection of this cavity and the surrounding protein structure suggests that the glutamate tail of the substrate may project into the active site. A further feature of the structure is a well defined Omega loop, which contributes both to the active site and to interdomain interactions.
Methionyl-tRNA formyltransferase transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA. The formyl group appears to play a dual role in the initiator identity of N-formylmethionyl-tRNA by promoting its recognition by IF2 and by impairing its binding to EF-Tu-GTP. This family also includes formyl tetrahydrofolate dehydrogenases, which produce formate from formyl-tetrahydrofolate. These enzymes contain an N-terminal domain in common with other formyl transferase enzymes. The C-terminal domain has an open beta-barrel fold.
Ribosomal protein L24 is one of the proteins from the large ribosomal subunit. In their mature form, these proteins have 103 to 150 amino-acid residues. This entry represents the archaeal and eukaryotic branch of these proteins, known as the L26 family.
This group of eukaryotic integral membrane proteins is evolutionarily related, but their exact function has not yet been clearly established. The proteins have from 290 to 435 amino acid residues. Structurally, they seem to be formed of three sections: an N-terminal region with two transmembrane domains, a central hydrophilic loop and a C-terminal region that contains from one to three transmembrane domains. Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis. Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. Yeast ELO3 affects plasma membrane H+-ATPase activity, and may act on a glucose-signalling pathway that controls the expression of several genes that are transcriptionally regulated by glucose, such as PMA1.
SKP1 (together with SKP2) was identified as an essential component of the cyclin A-CDK2 S phase kinase complex. It was found to bind several F-box containing proteins (e.g., Cdc4, Skp2, cyclin F) and to be involved in the ubiquitin protein degradation pathway. A yeast homologue of SKP1 (P52286) was identified in the centromere bound kinetochore complex and is also involved in the ubiquitin pathway. In Dictyostelium discoideum (Slime mold) FP21 was shown to be glycosylated in the cytosol and has homology to SKP1.
Ribosomal protein S7 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S7 is known to bind directly to part of the 3' end of 16S ribosomal RNA. It belongs to a family of ribosomal proteins which have been grouped on the basis of sequence similarities. The structure of S7 is known.
This family describes the members from the eukaryotic cytosol and the Archaea of the family that includes ribosomal protein S7 of bacteria and S5 of eukaryotes.
Characterised members of the Multi Antimicrobial Extrusion (MATE) family function as drug/sodium antiporters. These proteins mediate resistance to a wide range of cationic dyes, fluoroquinolones, aminoglycosides and other structurally diverse antibiotics and drugs. MATE proteins are found in bacteria, archaea and eukaryotes. These proteins are predicted to have 12 alpha-helical transmembrane regions; some of the animal proteins may have an additional C-terminal helix.
chorismate + L-glutamine = anthranilate + pyruvate + L-glutamate
The enzyme is a tetramer comprising 2 I and 2 II components: this entry is restricted to component I, which catalyses the formation of anthranilate using ammonia rather than glutamine, while component II provides glutamine amidotransferase activity.
This family of proteins represents HslU, a bacterial ClpX homolog, which is an ATPase and chaperone belonging to the AAA Clp/Hsp100 family and a component of the eubacterial proteasome.
ATP-dependent protease complexes are present in all three kingdoms of life, where they rid the cell of misfolded or damaged proteins and control the level of certain regulatory proteins. They include the proteasome in Eukaryotes, Archaea, and Actinomycetales and the HslVU (ClpQY, ClpXP) complex in other eubacteria. Genes homologous to eubacterial HslV (ClpQ) and HslU (ClpY, ClpX) have also been demonstrated to be present in the genome of trypanosomatid protozoa. They are expressed as precursors, with a propeptide that is removed to produce the active protease. The protease is probably located in the kinetoplast (mitochondrion). Phylogenetic analysis shows that HslV and HslU from trypanosomatids form a single clade with other eubacterial homologs.
Uracil-DNA glycosylase (UNG) is a DNA repair enzyme that excises uracil residues from DNA by cleaving the N-glycosylic bond. Uracil in DNA can arise as a result of mis-incorporation of dUMP residues by DNA polymerase or deamination of cytosine. The sequence of uracil-DNA glycosylase is extremely well conserved in bacteria and eukaryotes as well as in herpes viruses. More distantly related uracil-DNA glycosylases are also found in poxviruses. In eukaryotic cells, UNG activity is found in both the nucleus and the mitochondria. Human UNG1 protein is transported to both the mitochondria and the nucleus. The N-terminal 77 amino acids of UNG1 seem to be required for mitochondrial localisation, but the presence of a mitochondrial transit peptide has not been directly demonstrated. The most N-terminal conserved region contains an aspartic acid residue which has been proposed, based on X-ray structures, to act as a general base in the catalytic mechanism.
This is a family of methyltransferases, so called because they are responsible for the transfer of methyl groups between molecules. Despite its name, it does not occur solely in bacteria. This protein is essential in Escherichia coli and has been linked to peptidoglycan biosynthesis.
The 22 kDa peroxisomal membrane protein (PMP22) is a major component of peroxisomal membranes. PMP22 seems to be involved in pore-forming activity and may contribute to the unspecific permeability of the organelle membrane. PMP22 is synthesised on free cytosolic ribosomes and then directed to the peroxisome membrane by specific targeting information. Mpv17 is a closely related peroxisomal protein involved in the development of early-onset glomerulosclerosis.
A member of this family found in Saccharomyces cerevisiae (Baker's yeast) is an integral membrane protein of the inner mitochondrial membrane and has been suggested to play a role in mitochondrial function during heat shock.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of Xenopus S8, and mammalian, insect and yeast S7. These proteins have about 200 amino acids.
Nucleoside diphosphate kinases (NDK) are enzymes required for the synthesis of nucleoside triphosphates (NTP) other than ATP. They provide NTPs for nucleic acid synthesis, CTP for lipid synthesis, UTP for polysaccharide synthesis and GTP for protein elongation, signal transduction and microtubule polymerization.
In eukaryotes, there seems to be a small family of NDK isozymes each of which acts in a different subcellular compartment and/or has a distinct biological function. Eukaryotic NDK isozymes are hexamers of two highly related chains (A and B). By random association (A6, A5B...AB5, B6), these two kinds of chain form isoenzymes differing in their isoelectric point.
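The random association of A and B chains into hexamers can be illustrated with a short sketch. It enumerates the seven possible compositions (A6 through B6) and, under the simplifying assumption that chains assemble independently with a fixed fraction of A chains, gives the expected binomial share of each isoenzyme. The function names and the parameter `fraction_a` are hypothetical, and the independence assumption is made for illustration only.

```python
from math import comb


def label(count_a, count_b):
    """Build a composition label such as 'A5B' or 'A3B3'."""
    parts = []
    for chain, count in (("A", count_a), ("B", count_b)):
        if count == 1:
            parts.append(chain)
        elif count > 1:
            parts.append(f"{chain}{count}")
    return "".join(parts)


def composition_fractions(fraction_a=0.5, subunits=6):
    """Expected share of each composition, assuming independent (binomial) assembly.

    fraction_a is the assumed fraction of A chains in the pool (hypothetical parameter).
    """
    fraction_b = 1.0 - fraction_a
    return {
        label(subunits - k, k): comb(subunits, k)
        * fraction_a ** (subunits - k)
        * fraction_b ** k
        for k in range(subunits + 1)
    }


if __name__ == "__main__":
    for name, share in composition_fractions().items():
        print(f"{name}: {share:.4f}")
```

Because each B chain that replaces an A chain shifts the net charge of the hexamer, the seven compositions would be expected to differ in isoelectric point, as the text describes.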
NDKs are proteins of 17 kDa that act via a ping-pong mechanism in which a histidine residue is phosphorylated by transfer of the terminal phosphate group from ATP. In the presence of magnesium, the phosphoenzyme can transfer its phosphate group to any NDP, to produce an NTP.
NDK isozymes have been sequenced from prokaryotic and eukaryotic sources. It has also been shown that the Drosophila awd (abnormal wing discs) protein is a microtubule-associated NDK. Mammalian NDK is also known as metastasis inhibition factor nm23. The sequence of NDK has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. Our signature pattern contains this residue.
Proliferating cell nuclear antigen (PCNA), or cyclin, is a non-histone acidic nuclear protein that plays a key role in the control of eukaryotic DNA replication. It acts as a co-factor for DNA polymerase delta, which is responsible for leading strand DNA replication. The sequence of PCNA is well conserved between plants and animals, indicating a strong selective pressure for structure conservation, and suggesting that this type of DNA replication mechanism is conserved throughout eukaryotes. In Saccharomyces cerevisiae (Baker's yeast), POL30 is associated with polymerase III, the yeast analog of polymerase delta.
Homologues of PCNA have also been identified in the archaea (Euryarchaeota and Crenarchaeota) and in Paramecium bursaria Chlorella virus 1 (PBCV-1) and in nuclear polyhedrosis viruses.
Partially folded polypeptide chains, either newly made by ribosomes or emerging from mature proteins unfolded by stress, run the risk of aggregating with one another to the detriment of the organism. Folding of newly synthesised polypeptides in the crowded cellular environment requires the assistance of molecular chaperone proteins, such as the large bacterial chaperonins GroEL and GroES.
GroEL and GroES prevent aggregation by encapsulating individual chains within the so-called 'Anfinsen cage' provided by the GroEL-GroES complex, where they can fold in isolation from one another. GroEL consists of two heptameric rings of identical ATPase subunits stacked back to back, containing a cage in each ring. Each subunit consists of three domains. The equatorial domain contains the nucleotide binding site and is connected by a flexible intermediate domain with the apical domain. The latter presents several hydrophobic amino-acid side chains at the top of the ring, orientated towards the cavity of the cage. These side chains are involved in binding either a partially folded polypeptide chain or a single molecule of GroES.
The assembly of proteins has been thought to be the sole result of properties inherent in the primary sequence of polypeptides themselves. In some cases, however, structural information from other protein molecules is required for correct folding and subsequent assembly into oligomers. These 'helper' molecules are referred to as molecular chaperones, a subfamily of which are the chaperonins, which include 10 kDa and 60 kDa proteins. These are found in abundance in prokaryotes, chloroplasts and mitochondria. They are required for normal cell growth (as demonstrated by the fact that no temperature sensitive mutants for the chaperonin genes can be found in the temperature range 20 to 43 degrees centigrade), and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions.
The 10 kDa chaperonin (cpn10 - or groES in bacteria) exists as a ring-shaped oligomer of between 6 and 8 identical subunits, whereas the 60 kDa chaperonin (cpn60 - or groEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The cpn10 and cpn60 oligomers also require Mg2+-ATP in order to interact to form a functional complex, although the mechanism of this interaction is as yet unknown. This chaperonin complex is essential for the correct folding and assembly of polypeptides into oligomeric structures, of which the chaperonins themselves are not a part. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60.
The 60 kDa form of chaperonin is the immunodominant antigen of patients with Legionnaire's disease, and is thought to play a role in the protection of the Legionella bacteria from oxygen radicals within macrophages. This hypothesis is based on the finding that the cpn60 gene is upregulated in response to hydrogen peroxide, a source of oxygen radicals. Cpn60 has also been found to display strong antigenicity in many bacterial species, and has the potential for inducing immune protection against unrelated bacterial infections. The RuBisCO subunit binding protein (which has been implicated in the assembly of RuBisCO) and cpn60 have been found to be evolutionary homologues, the RuBisCO subunit binding protein having the C-terminal Gly-Gly-Met repeat found in all bacterial cpn60 sequences. Although the precise function of this repeat is unknown, it is thought to be important as it is also found in 70 kDa heat-shock proteins. The crystal structure of Escherichia coli GroEL has been resolved to 2.8 Å. The TCP-1 family of proteins act as molecular chaperones for tubulin, actin and probably some other proteins. They are weakly, but significantly, related to the cpn60/groEL chaperonin family.
Members of this eukaryotic family are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT theta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.
Members of this eukaryotic family are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT alpha chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.
Members of this eukaryotic family are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT zeta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.
Members of this eukaryotic family are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT eta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.
Members of this eukaryotic family are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT beta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.
The TCP-1 protein (Tailless Complex Polypeptide 1) was first identified in mice, where it is especially abundant in testis but present in all cell types. It has since been found and characterised in many other animal species, as well as in yeast, plants and protists. TCP-1 is a highly conserved protein of about 60 kDa (556 to 560 residues) which participates in a hetero-oligomeric 900 kDa double-torus shaped particle with 6 to 8 other different subunits. These subunits, the chaperonin containing TCP-1 (CCT) subunits beta, gamma, delta, epsilon, zeta and eta, are evolutionarily related to TCP-1 itself. The CCT is known to act as a molecular chaperone for tubulin, actin and probably some other proteins.
The TCP-1 family of proteins are weakly, but significantly, related to the cpn60/groEL chaperonin family.
Proteins in this entry consist exclusively of the CCT gamma chain from animals, plants, fungi, and other eukaryotes.
Members of this eukaryotic family are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT epsilon chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.
Members of this eukaryotic family are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT delta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.
The ureohydrolase superfamily includes arginase, agmatinase, formiminoglutamase and proclavaminate amidinohydrolase. These enzymes share a 3-layer alpha-beta-alpha structure, and play important roles in arginine/agmatine metabolism, the urea cycle, histidine degradation, and other pathways.
Arginase, which catalyses the conversion of arginine to urea and ornithine, is one of the five members of the urea cycle enzymes that convert ammonia to urea as the principal product of nitrogen excretion. There are several arginase isozymes that differ in catalytic, molecular and immunological properties. Deficiency in the liver isozyme leads to argininemia, which is usually associated with hyperammonemia.
Agmatinase hydrolyses agmatine to putrescine, the precursor for the biosynthesis of higher polyamines, spermidine and spermine. In addition, agmatine may play an important regulatory role in mammals.
Formiminoglutamase catalyses the fourth step in histidine degradation, acting to hydrolyse N-formimidoyl-L-glutamate to L-glutamate and formamide.
Proclavaminate amidinohydrolase is involved in clavulanic acid biosynthesis. Clavulanic acid acts as an inhibitor of a wide range of beta-lactamase enzymes that are used by various microorganisms to resist beta-lactam antibiotics. As a result, this enzyme improves the effectiveness of beta-lactam antibiotics.
L-Arginine is converted to nitric oxide and citrulline by the enzyme nitric oxide synthase, and to urea and ornithine by the enzyme arginase as a part of the hepatic urea cycle. Arginase is a manganese metalloenzyme containing a metal-activated hydroxide ion, a critical nucleophile in metalloenzymes that catalyze hydrolysis or hydration reactions. A hydrogen bond formed by the metal-bound hydroxide holds the enzyme in the proper orientation for catalysis; however, non-metal substrate-binding sites are also implicated in the enzyme mechanism. Regeneration of metal-bound hydroxide ion from a metal-bound water molecule requires proton transfer to bulk solvent mediated by a histidine proton shuttle residue.
Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.
MutS is a modular protein with a complex structure, and is composed of:
Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.
This entry represents the C-terminal region found in proteins in the MutS family of DNA mismatch repair proteins. The C-terminal region of MutS comprises the ATPase domain and the HTH (helix-turn-helix) domain, the latter being involved in dimer contacts. Yeast MSH3, bacterial proteins involved in DNA mismatch repair, and the predicted protein product of the Rep-3 gene of mouse share extensive sequence similarity. Human MSH is a mismatch binding protein and has been implicated in hereditary non-polyposis colorectal carcinoma (HNPCC).
Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.
MutS is a modular protein with a complex structure, and is composed of:
Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.
MSH6 is an ATPase that is part of the MSH2-MSH6 complex and has been shown in Homo sapiens (Human) to bind to mismatched DNA directly in the ADP-bound state. The MSH6 family has members from yeasts, plants, fish, and mammals. After DNA replication, a low level of replication errors exists, including base-pair mismatches. In bacteria, it was shown that MutS acts with MutL, MutH, and UvrD to correct these errors. In humans, it was shown that MSH2 and MSH6 are involved in the BASC complex (BRCA1-associated genome surveillance complex) with many other proteins including MLH1, RAD50, MRE11, NBS1, RFC1, RFC2, RFC4, BRCA1, ATM, and BLM. In humans, mutations in MSH6 have been shown to be associated with hereditary nonpolyposis colon cancer and endometrial cancer.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.
This entry describes the core region of type IA topoisomerases, which are highly conserved enzymes that are structurally distinct from type IB enzymes. The structures of both topoisomerases I and III have been elucidated, and consist of four domains that together form a toroidal molecule with a central hole that is large enough to accommodate single- and double-stranded DNA. It is believed that the domains transiently separate from one another to allow the entrance and exit of DNA strands.
Superoxide dismutases (SODs) catalyse the conversion of superoxide radicals to molecular oxygen and hydrogen peroxide. Their function is to destroy the radicals that are normally produced within cells and are toxic to biological systems. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one. This family includes both single metal-binding SODs and cambialistic SOD, which can bind either Mn or Fe. Fe/MnSODs are ubiquitous enzymes that are responsible for the majority of SOD activity in prokaryotes, fungi, blue-green algae and mitochondria. Fe/MnSODs are found as homodimers or homotetramers.
The structure of Fe/MnSODs can be divided into two domains, an alpha N-terminal domain and an alpha/beta C-terminal domain, connected by a loop. The structure of the N-terminal domain consists of two helices in an antiparallel hairpin, with a left-handed twist. The structure of the C-terminal domain is of the alpha/beta type, and consists of a three-stranded antiparallel beta-sheet in the order 213, along with four helices in the arrangement alpha/beta(2)/alpha/beta/alpha(2).
Phosphoglycerate kinase (PGK) is an enzyme that catalyses the reversible transfer of a phosphate group from 1,3-diphosphoglycerate to ADP, forming 3-phosphoglycerate and ATP. In the second step of the second phase of glycolysis, 1,3-diphosphoglycerate is converted to 3-phosphoglycerate, forming one molecule of ATP. In the reverse reaction, one molecule of ATP is consumed and one molecule of ADP is formed. This reaction is essential in most cells for the generation of ATP in aerobes, for fermentation in anaerobes and for carbon fixation in plants.
PGK is found in all living organisms and its sequence has been highly conserved throughout evolution. The enzyme exists as a monomer containing two nearly equal-sized domains that correspond to the N- and C-termini of the protein (the last 15 C-terminal residues loop back into the N-terminal domain). 3-phosphoglycerate (3-PG) binds to the N-terminal domain, while the nucleotide substrates, MgATP or MgADP, bind to the C-terminal domain of the enzyme. This extended two-domain structure is associated with large-scale 'hinge-bending' conformational changes, similar to those found in hexokinase. At the core of each domain is a 6-stranded parallel beta-sheet surrounded by alpha helices. Domain 1 has a parallel beta-sheet of six strands with an order of 342156, while domain 2 has a parallel beta-sheet of six strands with an order of 321456. Analysis of the reversible unfolding of yeast phosphoglycerate kinase leads to the conclusion that the two lobes are capable of folding independently, consistent with the presence of intermediates on the folding pathway with a single domain folded.
Phosphoglycerate kinase (PGK) deficiency is associated with haemolytic anaemia and mental disorders in man.
This entry represents the full PGK enzyme.
Adenosine deaminase catalyzes the hydrolytic deamination of adenosine into inosine, and AMP deaminase catalyzes the hydrolytic deamination of AMP into IMP. It has been shown that these two enzymes share three regions of sequence similarity; these regions are centred on residues which are proposed to play an important role in the catalytic mechanism of these two enzymes.
Histone H3 is one of the four histones, along with H2A, H2B and H4, which form the eukaryotic nucleosome octamer core; the nucleosome octamer winds ~146 DNA base pairs. It is a highly conserved protein of 135 amino acid residues.
Several proteins have been found to contain a C-terminal H3-like domain, including the mammalian centromeric protein CENP-A (which may act as a core histone necessary for the assembly of centromeres); yeast chromatin-associated protein CSE4; and Caenorhabditis elegans chromosome III proteins YL82_CAEEL and YMH3_CAEEL, whose function is unknown.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic, bacterial and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of:
These proteins, of the L30e family, have 82 to 114 amino-acid residues.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.
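The class assignments listed above can be captured in a small lookup table. The sketch below is illustrative only: the three-letter codes, the names `CLASS_I`, `CLASS_II` and `synthetase_class`, and the returned tuple layout are hypothetical choices, while the membership of each class and the preferred hydroxyl follow the text.

```python
# Amino acid specificities of the two synthetase classes, as listed in the text.
CLASS_I = frozenset(
    ["Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"]
)
CLASS_II = frozenset(
    ["Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"]
)


def synthetase_class(amino_acid):
    """Return (class, preferred tRNA hydroxyl) for a three-letter amino acid code."""
    if amino_acid in CLASS_I:
        return ("I", "2'-hydroxyl")
    if amino_acid in CLASS_II:
        return ("II", "3'-hydroxyl")
    raise ValueError(f"not one of the 20 standard amino acids: {amino_acid!r}")


if __name__ == "__main__":
    for code in ("Pro", "Glu", "Ser"):
        print(code, synthetase_class(code))
```

The table does not model the a, b and c subclasses of class I, since the text does not say which synthetase belongs to which subclass.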
Prolyl-tRNA synthetase is a class II tRNA synthetase and is matched by a broader signature that recognises the tRNA synthetases for Gly, His, Ser, and Pro. The prolyl-tRNA synthetases are divided into two widely divergent families. This family includes the archaeal enzyme, the Pro-specific domain of a human multifunctional tRNA ligase, and the enzyme from the spirochete Borrelia burgdorferi (Lyme disease spirochete). The other family includes enzymes from Escherichia coli, Bacillus subtilis, Synechocystis sp. (strain PCC 6308), and one of the two prolyl-tRNA synthetases of Saccharomyces cerevisiae (Baker's yeast).
Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including vitamin B12, haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin.
The first stage in tetrapyrrole synthesis is the synthesis of 5-aminolaevulinic acid (ALA) via two possible routes: (1) condensation of succinyl CoA and glycine (C4 pathway) using ALA synthase, or (2) decarboxylation of glutamate (C5 pathway) via three different enzymes: glutamyl-tRNA synthetase to charge a tRNA with glutamate, glutamyl-tRNA reductase to reduce glutamyl-tRNA to glutamate-1-semialdehyde (GSA), and GSA aminotransferase to catalyse a transamination reaction to produce ALA.
The second stage is to convert ALA to uroporphyrinogen III, the first macrocyclic tetrapyrrolic structure in the pathway. This is achieved by the action of three enzymes in one common pathway: porphobilinogen (PBG) synthase (or ALA dehydratase) to condense two ALA molecules to generate porphobilinogen; hydroxymethylbilane synthase (or PBG deaminase) to polymerise four PBG molecules into preuroporphyrinogen (tetrapyrrole structure); and uroporphyrinogen III synthase to link two pyrrole units together (rings A and D) to yield uroporphyrinogen III.
Uroporphyrinogen III is the first branch point of the pathway. To synthesise cobalamin (vitamin B12), sirohaem, and coenzyme F430, uroporphyrinogen III needs to be converted into precorrin-2 by the action of uroporphyrinogen III methyltransferase. To synthesise haem and chlorophyll, uroporphyrinogen III needs to be decarboxylated into coproporphyrinogen III by the action of uroporphyrinogen III decarboxylase.
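As a compact summary of the second stage and the first branch point described above, the steps can be written down as plain data. This is a minimal Python sketch; the structure and the names `SECOND_STAGE`, `BRANCHES`, `is_linear` and `branch_enzyme_for` are assumptions made for illustration, and only the enzymes and products named in the text are included.

```python
# Second stage of tetrapyrrole biosynthesis as (substrate, enzyme, product),
# transcribed from the text above.
SECOND_STAGE = [
    ("ALA", "porphobilinogen synthase", "porphobilinogen"),
    ("porphobilinogen", "hydroxymethylbilane synthase", "preuroporphyrinogen"),
    ("preuroporphyrinogen", "uroporphyrinogen III synthase", "uroporphyrinogen III"),
]

# First branch point: enzyme acting on uroporphyrinogen III mapped to
# (intermediate formed, end products reached through that branch).
BRANCHES = {
    "uroporphyrinogen III methyltransferase": (
        "precorrin-2", ["cobalamin", "sirohaem", "coenzyme F430"]),
    "uroporphyrinogen III decarboxylase": (
        "coproporphyrinogen III", ["haem", "chlorophyll"]),
}


def is_linear(steps):
    """Check that each step's product is the substrate of the next step."""
    return all(steps[i][2] == steps[i + 1][0] for i in range(len(steps) - 1))


def branch_enzyme_for(end_product):
    """Return the branch-point enzyme leading towards the given end product."""
    for enzyme, (_intermediate, products) in BRANCHES.items():
        if end_product in products:
            return enzyme
    raise KeyError(end_product)
```

For instance, `branch_enzyme_for("haem")` returns the decarboxylase branch, while `branch_enzyme_for("cobalamin")` returns the methyltransferase branch.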
This entry represents porphobilinogen (PBG) synthase (PBGS, or 5-aminolaevulinic acid dehydratase, or ALAD), which functions during the second stage of tetrapyrrole biosynthesis. This enzyme catalyses a Knorr-type condensation reaction between two molecules of ALA to generate porphobilinogen, the pyrrolic building block used in later steps. The structure of the enzyme is based on a TIM barrel topology made up of eight identical subunits, where each subunit binds to a metal ion that is essential for activity, usually zinc (in yeast, mammals and certain bacteria) or magnesium (in plants and other bacteria). A lysine has been implicated in the catalytic mechanism. The lack of PBGS enzyme causes a rare porphyric disorder known as ALAD porphyria, which appears to involve conformational changes in the enzyme.
Phosphoglucose isomerase (PGI) is a dimeric enzyme that catalyses the reversible isomerisation of glucose-6-phosphate and fructose-6-phosphate. PGI is involved in different pathways: in most higher organisms it is involved in glycolysis; in mammals it is involved in gluconeogenesis; in plants in carbohydrate biosynthesis; in some bacteria it provides a gateway for fructose into the Entner-Doudoroff pathway. The multifunctional protein PGI is also known as neuroleukin (a neurotrophic factor that mediates the differentiation of neurons), autocrine motility factor (a tumour-secreted cytokine that regulates cell motility), differentiation and maturation mediator and myofibril-bound serine proteinase inhibitor, and has different roles inside and outside the cell. In the cytoplasm, it catalyses the second step in glycolysis, while outside the cell it serves as a nerve growth factor and cytokine.
PGI from Bacillus stearothermophilus has an open twisted alpha/beta structural motif consisting of two globular domains and two protruding parts. It has been suggested that the top part of the large domain together with one of the protruding loops might participate in inducing the neurotrophic activity. The structure of rabbit muscle phosphoglucose isomerase complexed with various inhibitors shows that the enzyme is a dimer with two alpha/beta-sandwich domains in each subunit. The location of the bound D-gluconate 6-phosphate inhibitor leads to the identification of residues involved in substrate specificity. In addition, the positions of amino acid residues that are substituted in the genetic disease nonspherocytic hemolytic anemia suggest how these substitutions can result in altered catalysis or protein stability.
Histidyl-tRNA synthetase is an alpha2 dimer that belongs to class IIa. Every completed genome includes a histidyl-tRNA synthetase. Apparent second copies from Bacillus subtilis, Synechocystis sp. (strain PCC 6803), and Aquifex aeolicus are slightly shorter, more closely related to each other than to other hisS proteins, and not demonstrated to act as histidyl-tRNA synthetases. The regulatory protein kinase GCN2 of Saccharomyces cerevisiae (YDR283c), and related proteins from other species designated eIF-2 alpha kinase, have a domain closely related to histidyl-tRNA synthetase that may serve to detect and respond to uncharged tRNA(His), an indicator of amino acid starvation, but these regulatory proteins are not orthologous.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
This family describes the ribosomal protein of the eukaryotic cytosol and of Archaea, homologous to S2 of bacteria. It is designated typically as Sa in eukaryotes and Sa or S2 in the archaea.
A number of eukaryotic and archaebacterial ribosomal proteins have been grouped on the basis of sequence similarities. Ribosomal protein S6 is the major substrate of protein kinases in eukaryotic ribosomes and may play an important role in controlling cell growth and proliferation through the selective translation of particular classes of mRNA.
1L-myo-Inositol-1-phosphate synthase catalyzes the conversion of D-glucose 6-phosphate to 1L-myo-inositol-1-phosphate, the first committed step in the production of all inositol-containing compounds, including phospholipids, either directly or by salvage. The enzyme exists in a cytoplasmic form in a wide range of plants, animals, and fungi. It has also been detected in several bacteria, and a chloroplast form is observed in algae and higher plants. Inositol phosphates play an important role in signal transduction.
In Saccharomyces cerevisiae (Baker's yeast), the transcriptional regulation of the INO1 gene has been studied in detail and its expression is sensitive to the availability of phospholipid precursors as well as growth phase. The regulation of the structural gene encoding 1L-myo-inositol-1-phosphate synthase has also been analyzed at the transcriptional level in the aquatic angiosperm, Spirodela polyrrhiza (Giant duckweed) and the halophyte, Mesembryanthemum crystallinum (Common ice plant).
This ribosomal protein is found in archaebacteria and eukaryotes. Ribosomal protein L37 has a single zinc finger-like motif of the C2-C2 type.
Prokaryotes and eukaryotes respond to heat shock and other forms of environmental stress by inducing synthesis of heat-shock proteins (hsp). The 90 kDa heat shock protein, Hsp90, is one of the most abundant proteins in eukaryotic cells, comprising 1-2% of cellular proteins under non-stress conditions. Its contribution to various cellular processes including signal transduction, protein folding, protein degradation and morphological evolution has been extensively studied. The full functional activity of Hsp90 is gained in concert with other co-chaperones, playing an important role in the folding of newly synthesised proteins and stabilisation and refolding of denatured proteins after stress. Apart from its co-chaperones, Hsp90 binds to an array of client proteins, where the co-chaperone requirement varies and depends on the actual client.
The sequences of hsp90s show a distinctive domain structure, with a highly-conserved N-terminal domain separated from a conserved, acidic C-terminal domain by a highly-acidic, flexible linker region.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
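A scan for the basic HEXXH motif can be sketched with a regular expression over one-letter amino acid codes. The Python example below is a minimal illustration, not a validated metalloprotease classifier: the function name `find_hexxh` and the toy sequence in the usage note are hypothetical, and the more stringent 'abXHEbbHbc' form is not implemented because it would require assumptions about which residues count as uncharged or hydrophobic.

```python
import re

# HEXXH: His, Glu, any two residues, His (one-letter amino acid codes).
# The pattern sits inside a lookahead so overlapping occurrences are reported.
HEXXH = re.compile(r"(?=(HE[A-Z]{2}H))")


def find_hexxh(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched pentapeptide) for each HEXXH occurrence."""
    return [(m.start(), m.group(1)) for m in HEXXH.finditer(sequence.upper())]
```

On the made-up sequence `"AAHELGHAA"`, `find_hexxh` returns `[(2, "HELGH")]`. Because the motif is relatively common, a hit on its own should be read as a candidate site only, not as evidence of a metal-binding site.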
In the MEROPS database, peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of metallopeptidases belongs to the MEROPS peptidase family M1 (clan MA(E)), the type example being aminopeptidase N from Homo sapiens (Human). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA.
Membrane alanine aminopeptidase is part of the HEXXH+E group; it consists entirely of aminopeptidases, spread across a wide variety of species. Functional studies show that CD13/APN catalyzes the removal of single amino acids from the amino terminus of small peptides and probably plays a role in their final digestion; one family member (leukotriene-A4 hydrolase) is known to hydrolyse the epoxide leukotriene-A4 to form an inflammatory mediator. This hydrolase has been shown to have aminopeptidase activity, and the zinc ligands of the M1 family were identified by site-directed mutagenesis on this enzyme. CD13 participates in trimming peptides bound to MHC class II molecules and cleaves MIP-1 chemokine, which alters target cell specificity from basophils to eosinophils. CD13 also acts as a receptor for specific strains of RNA viruses (coronaviruses), which cause a relatively large percentage of upper respiratory tract infections.
CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).
The M1 family of zinc metallopeptidases contains a number of distinct, well-separated clades of proteins with aminopeptidase activity. Several are designated aminopeptidase N after the Escherichia coli enzyme, suggesting a similar activity profile.
This family of zinc metallopeptidases belongs to MEROPS peptidase family M1 (aminopeptidase N, clan MA); the majority are identified as alanyl aminopeptidases (proteobacteria) that are closely related to E. coli PepN and presumed to have a similar (not identical) function. Nearly all are found in proteobacteria, but members are also found in cyanobacteria, plants, and apicomplexan parasites. This family differs greatly in sequence from the family of aminopeptidases typified by Streptomyces lividans PepN and from the membrane-bound aminopeptidase N family in animals.
Phenylalanyl-tRNA synthetase is an alpha2/beta2 tetramer composed of two different subunits and belongs to class IIc. In eubacteria, the small subunit (pheS gene) can be designated as the beta (E. coli) or alpha subunit (nomenclature adopted in InterPro). Reciprocally, the large subunit (pheT gene) can be designated as alpha (E. coli) or beta. In the other kingdoms, including eukaryotes, the two subunits are of equivalent length and can be identified by specific signatures. The enzyme from Thermus thermophilus has an alpha2/beta2-type quaternary structure and is one of the most complicated members of the synthetase family. Identification of phenylalanyl-tRNA synthetase as a member of class II aaRSs was based only on sequence alignment of the small alpha-subunit with other synthetases.
Ribosomal protein L13 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L13 is known to be one of the early assembly proteins of the 50S ribosomal subunit.
Ribosomal protein L13 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L13 is known to be one of the early assembly proteins of the 50S ribosomal subunit. This entry represents ribosomal protein L13 from bacteria, mitochondria and chloroplasts.
Ribosomal protein L13 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L13 is known to be one of the early assembly proteins of the 50S ribosomal subunit. This model represents ribosomal protein of L13 from the Archaea and from the eukaryotic cytosol.
Thymidylate synthase catalyses the reaction:
5,10-methylenetetrahydrofolate + dUMP = dihydrofolate + dTMP
This reaction provides the sole de novo pathway for production of dTMP, and thymidylate synthase is the only enzyme in folate metabolism in which the 5,10-methylenetetrahydrofolate is oxidised during one-carbon transfer. The enzyme is essential for regulating the balanced supply of the four DNA precursors in normal DNA replication: defects in the enzyme activity affecting the regulation process cause various biological and genetic abnormalities, such as thymineless death. The enzyme is an important target for certain chemotherapeutic drugs. Thymidylate synthase is an enzyme of about 30 to 35 kDa in most species, except in protozoa and plants, where it exists as a bifunctional enzyme that includes a dihydrofolate reductase domain. A cysteine residue is involved in the catalytic mechanism (it covalently binds the 5,6-dihydro-dUMP intermediate). The sequence around the active site of this enzyme is conserved from phages to vertebrates.
CTP synthase is involved in pyrimidine ribonucleotide/ribonucleoside metabolism, catalysing the synthesis of CTP from UTP by amination of the pyrimidine ring at the 4-position. The enzyme exists as a dimer of identical chains that aggregates as a tetramer. The gene has been found roughly 500 bp upstream of enolase in both the beta (Nitrosomonas europaea) and gamma (Escherichia coli) subdivisions of the Proteobacteria.
Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including vitamin B12, haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin.
The first stage in tetrapyrrole synthesis is the synthesis of 5-aminoaevulinic acid ALA via two possible routes: (1) condensation of succinyl CoA and glycine (C4 pathway) using ALA synthase, or (2) decarboxylation of glutamate (C5 pathway) via three different enzymes, glutamyl-tRNA synthetase to charge a tRNA with glutamate, glutamyl-tRNA reductase to reduce glutamyl-tRNA to glutamate-1-semialdehyde (GSA), and GSA aminotransferase to catalyse a transamination reaction to produce ALA.
The second stage is to convert ALA to uroporphyrinogen III, the first macrocyclic tetrapyrrolic structure in the pathway. This is achieved by the action of three enzymes in one common pathway: porphobilinogen (PBG) synthase (or ALA dehydratase) to condense two ALA molecules to generate porphobilinogen; hydroxymethylbilane synthase (or PBG deaminase) to polymerise four PBG molecules into preuroporphyrinogen (tetrapyrrole structure); and uroporphyrinogen III synthase to link two pyrrole units together (rings A and D) to yield uroporphyrinogen III.
Uroporphyrinogen III is the first branch point of the pathway. To synthesise cobalamin (vitamin B12), sirohaem, and coenzyme F430, uroporphyrinogen III needs to be converted into precorrin-2 by the action of uroporphyrinogen III methyltransferase. To synthesise haem and chlorophyll, uroporphyrinogen III needs to be decarboxylated into coproporphyrinogen III by the action of uroporphyrinogen III decarboxylase.
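The branch point described above can be written down as a small lookup table. This is only an illustrative sketch: the names BRANCHES and branch_for are hypothetical, and the table holds nothing beyond the two branches, enzymes and end products named in the text.

```python
# Branches leaving uroporphyrinogen III, keyed by the first intermediate
# on each branch (contents taken from the description above).
BRANCHES = {
    "precorrin-2": {
        "enzyme": "uroporphyrinogen III methyltransferase",
        "end_products": ("cobalamin", "sirohaem", "coenzyme F430"),
    },
    "coproporphyrinogen III": {
        "enzyme": "uroporphyrinogen III decarboxylase",
        "end_products": ("haem", "chlorophyll"),
    },
}


def branch_for(end_product):
    """Return (first intermediate, enzyme) for the branch leading to end_product."""
    for intermediate, info in BRANCHES.items():
        if end_product in info["end_products"]:
            return intermediate, info["enzyme"]
    raise KeyError(f"no branch listed for {end_product!r}")


if __name__ == "__main__":
    print(branch_for("haem"))
    print(branch_for("cobalamin"))
```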
This entry represents hydroxymethylbilane synthase (or porphobilinogen deaminase), which functions during the second stage of tetrapyrrole biosynthesis. This enzyme catalyses the polymerisation of four PBG molecules into the tetrapyrrole structure, preuroporphyrinogen, with the concomitant release of four molecules of ammonia. This enzyme uses a unique dipyrromethane cofactor made from two molecules of PBG, which is covalently attached to a cysteine side chain. The tetrapyrrole product is synthesized in an ordered, sequential fashion, by initial attachment of the first pyrrole unit (ring A) to the cofactor, followed by subsequent additions of the remaining pyrrole units (rings B, C, D) to the growing pyrrole chain. The link between the pyrrole ring and the cofactor is broken once all the pyrroles have been added. This enzyme is folded into three distinct domains that enclose a single, large active site that makes use of an aspartic acid as its one essential catalytic residue, acting as a general acid/base during catalysis. A deficiency of hydroxymethylbilane synthase is implicated in the neuropathic disease, Acute Intermittent Porphyria (AIP).
A group of polyamine biosynthetic enzymes involved in the fifth (last) step in the biosynthesis of spermidine from arginine and methionine, which includes spermidine synthase, spermine synthase and putrescine N-methyltransferase.
The Thermotoga maritima spermidine synthase monomer consists of two domains: an N-terminal domain composed of six beta-strands, and a Rossmann-like C-terminal domain. The larger C-terminal catalytic core domain consists of a seven-stranded beta-sheet flanked by nine alpha helices. This domain resembles a topology observed in a number of nucleotide and dinucleotide-binding enzymes, and in S-adenosyl-L-methionine (AdoMet)-dependent methyltransferases (MTases).
Members of this family are integral membrane proteins that are found to increase tolerance to divalent metal ions such as cadmium, zinc, and cobalt. These proteins are considered to be efflux pumps that remove these ions from cells; however, others are implicated in ion uptake. The family has six predicted transmembrane domains. Members of the family are variable in length because of variably sized inserts, often containing low-complexity sequence.
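The stoichiometry of the second stage can be checked with a few lines of arithmetic. The sketch below uses only the ratios stated in the text (two ALA per PBG, four PBG per tetrapyrrole, four ammonia released per tetrapyrrole); the function name substrate_demand is hypothetical, and the two PBG units of the enzyme-bound cofactor are deliberately not counted as substrate.

```python
# Ratios taken from the pathway description; nothing here is measured data.
ALA_PER_PBG = 2           # PBG synthase condenses two ALA into one PBG
PBG_PER_TETRAPYRROLE = 4  # hydroxymethylbilane synthase polymerises four PBG
NH3_PER_PBG = 1           # four ammonia are released for four PBG added


def substrate_demand(n_tetrapyrroles):
    """Return ALA and PBG consumed, and ammonia released by the deaminase
    step, when n_tetrapyrroles molecules of preuroporphyrinogen are made."""
    if n_tetrapyrroles < 0:
        raise ValueError("n_tetrapyrroles must be non-negative")
    pbg = n_tetrapyrroles * PBG_PER_TETRAPYRROLE
    return {
        "ALA": pbg * ALA_PER_PBG,
        "PBG": pbg,
        "NH3": pbg * NH3_PER_PBG,
    }


if __name__ == "__main__":
    print(substrate_demand(1))
```

On these ratios, each preuroporphyrinogen molecule accounts for eight ALA molecules.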
Ribonucleotide reductase (RNR) catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs divide into three classes on the basis of their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, bacteriophage and viruses, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in anaerobic bacteria and bacteriophage, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes.
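The three-way classification above maps naturally onto a lookup table. The following is a minimal sketch whose names (RNR_CLASSES, classes_found_in) are hypothetical; the cofactors and organism groups are copied from the text and matched by exact string, so "bacteria" and "anaerobic bacteria" are treated as different labels.

```python
# RNR classes keyed by class name; contents transcribed from the description.
RNR_CLASSES = {
    "I": {
        "cofactor": "diiron-tyrosyl radical",
        "found_in": ("eukaryotes", "bacteria", "bacteriophage", "viruses"),
    },
    "II": {
        "cofactor": "coenzyme B12 (adenosylcobalamin)",
        "found_in": ("bacteria", "bacteriophage", "algae", "archaea"),
    },
    "III": {
        "cofactor": "FeS cluster and S-adenosylmethionine (glycyl radical)",
        "found_in": ("anaerobic bacteria", "bacteriophage"),
    },
}


def classes_found_in(organism_group):
    """List the RNR classes whose description names organism_group exactly."""
    return [
        name
        for name, info in RNR_CLASSES.items()
        if organism_group in info["found_in"]
    ]


if __name__ == "__main__":
    for group in ("eukaryotes", "archaea", "bacteriophage"):
        print(group, classes_found_in(group))
```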
Ribonucleotide reductase is an oligomeric enzyme composed of a large subunit (700 to 1000 residues) and a small subunit (300 to 400 residues) - class II RNRs are less complex, using the small molecule B12 in place of the small chain.
The reduction of ribonucleotides to deoxyribonucleotides involves the transfer of free radicals; the function of each metallocofactor is to generate an active site thiyl radical. This thiyl radical then initiates the nucleotide reduction process by hydrogen atom abstraction from the ribonucleotide. The radical-based reaction involves five cysteines: two of these are located at adjacent anti-parallel strands in a new type of ten-stranded alpha/beta-barrel; two others reside at the carboxyl end in a flexible arm; and the fifth, in a loop in the centre of the barrel, is positioned to initiate the radical reaction. There are several regions of similarity in the sequence of the large chain of prokaryotes, eukaryotes and viruses spread across 3 domains: an N-terminal domain common to the mammalian and bacterial enzymes; a C-terminal domain common to the mammalian and viral ribonucleotide reductases; and a central domain common to all three.
Protein-L-isoaspartate(D-aspartate) O-methyltransferase (PCMT) (which is also known as L-isoaspartyl protein carboxyl methyltransferase) is an enzyme that catalyses the transfer of a methyl group from S-adenosylmethionine to the free carboxyl groups of D-aspartyl or L-isoaspartyl residues in a variety of peptides and proteins. The enzyme does not act on normal L-aspartyl residues. L-isoaspartyl and D-aspartyl residues are the products of the spontaneous deamidation and/or isomerisation of normal L-aspartyl and L-asparaginyl residues in proteins. PCMT plays a role in the repair and/or degradation of these damaged proteins; the enzymatic methyl esterification of the abnormal residues can lead to their conversion to normal L-aspartyl residues. The SAM domain is present in most of these proteins.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families includes yeast S7 (YS6); archaeal S4e; and mammalian and plant cytoplasmic S4. Two highly similar isoforms of mammalian S4 exist, one coded by a gene on chromosome Y, and the other on chromosome X. These proteins have 233 to 264 amino acids.
Microtubules are polymers of tubulin, a dimer of two 55-kDa subunits, designated alpha and beta. Within the microtubule lattice, alpha-beta heterodimers associate in a head-to-tail fashion, giving rise to microtubule polarity. Fluorescent labelling studies have suggested that tubulin is oriented in microtubules with beta-tubulin toward the plus end.
For maximal rate and extent of polymerisation into microtubules, tubulin requires GTP. Two molecules of GTP are bound at different sites, termed N and E. At the E (Exchangeable) site, GTP is hydrolysed during incorporation into the microtubule. Close to the E site is an invariant region rich in glycine residues, which is found in both chains and is thought to control access of the nucleotide to its binding site.
Most species, excepting simple eukaryotes, express a variety of closely related alpha- and beta-isotypes. A third family member, gamma tubulin, has also been identified in a number of species. Gamma tubulin is found at microtubule-organising centres, such as the spindle poles or the centrosome, suggesting that it is involved in minus-end nucleation of microtubule assembly.
Most species, excepting simple eukaryotes, express a variety of closely related alpha- and beta-isotypes. A third family member, gamma tubulin, has also been identified in a number of species. British type familial amyloidosis is an autosomal dominant disease characterised by progressive dementia, spastic paralysis and ataxia. Amyloid deposits from the brain tissue of an individual who died with this disease have been characterised. Trypsin digestion and subsequent N-terminal sequence analysis yielded a number of short sequences, all of which are tryptic fragments of the C-termini of human alpha- and beta-tubulin. Consistent with the definition of amyloid, synthetic peptides based on the sequences of these fragments formed fibrils in vitro, suggesting that the C-termini of both alpha- and beta-tubulin are closely associated with the amyloid deposits of this type of amyloidosis. Several alpha-tubulin isotypes have been described, each distinguished by the presence of unique amino acid substitutions within the coding region. Most of these isotype-specific amino acids are clustered at the C-terminus. Patterns of developmental expression of the various alpha-tubulin isotypes have been studied. Results suggest that individual tubulin isotypes confer functional specificity on different kinds of microtubules.
Most species, excepting simple eukaryotes, express a variety of closely related alpha- and beta-isotypes. A third family member, gamma tubulin, has also been identified in a number of species. British type familial amyloidosis is an autosomal dominant disease characterised by progressive dementia, spastic paralysis and ataxia. Amyloid deposits from the brain tissue of an individual who died with this disease have been characterised. Trypsin digestion and subsequent N-terminal sequence analysis yielded a number of short sequences, all of which are tryptic fragments of the C-termini of human alpha- and beta-tubulin. Consistent with the definition of amyloid, synthetic peptides based on the sequences of these fragments formed fibrils in vitro, suggesting that the C-termini of both alpha- and beta-tubulin are closely associated with the amyloid deposits of this type of amyloidosis. The amino acid sequences encoded by beta tubulin genes have revealed a high level of overall similarity, but significant divergence between their C-termini. The pattern of expression of the beta-tubulin genes has been studied in several different human cell lines and has revealed varying levels of and differential expression in different cell lines. It appears that distinct human beta-tubulin isotypes are encoded by genes whose exon size and number has been conserved evolutionarily, but whose pattern of expression may be regulated either co-ordinately or uniquely.
Glutathione peroxidase (GSHPx) is an enzyme that catalyses the reduction of hydroperoxides by glutathione. Its main function is to protect against the damaging effect of endogenously formed hydroperoxides. In higher vertebrates, several forms of GSHPx are known, including a ubiquitous cytosolic form (GSHPx-1), a gastrointestinal cytosolic form (GSHPx-GI), a plasma secreted form (GSHPx-P), and an epididymal secretory form (GSHPx-EP). In addition to these characterised forms, the sequence of a protein of unknown function has been shown to be evolutionarily related to those of the GSHPxs.
In filarial nematode parasites, the major soluble cuticular protein (gp29) is a secreted GSHPx, which may provide a mechanism of resistance to the immune reaction of the mammalian host by neutralising the products of the oxidative burst of leukocytes. The Escherichia coli protein btuE, a periplasmic protein involved in vitamin B12 transport, is evolutionarily related to GSHPxs, although the significance of this relationship is unclear. The structure of bovine seleno-glutathione peroxidase has been determined. The protein belongs to the alpha-beta class, with a 3-layer (aba) sandwich architecture. The catalytic site of GSHPx contains a conserved residue which is either a cysteine or, in many eukaryotic GSHPxs, a selenocysteine.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
This family describes the ribosomal protein of the eukaryotic cytosol and of the Archaea, variously designated as L17, L22, and L23.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families includes mammalian, yeast, Chlamydomonas reinhardtii and Entamoeba histolytica S27, and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ0250. These proteins have from 62 to 87 amino acids. They contain, in their central section, a putative zinc-finger region of the type C-x(2)-C-x(14)-C-x(2)-C.
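The zinc-finger spacing C-x(2)-C-x(14)-C-x(2)-C is written in PROSITE-style pattern notation, which can be translated into a regular expression for scanning sequences. The sketch below is an assumption-laden illustration: prosite_to_regex is a hypothetical helper that handles only a small subset of the notation (literal residues, x, bracketed sets, braced exclusions and repeat counts), and the test sequence is made up for the example, not a real S27 sequence.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regex string.

    Supported elements: literal residues (C), any residue (x), allowed
    sets ([ST]), excluded sets ({P}), and repeats such as x(2) or x(2,4).
    """
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        residue, low, high = m.group(1), m.group(2), m.group(3)
        if residue == "x":
            residue = "."                        # any amino acid
        elif residue.startswith("{") and residue.endswith("}"):
            residue = "[^" + residue[1:-1] + "]"  # excluded residues
        if low and high:
            residue += "{" + low + "," + high + "}"
        elif low:
            residue += "{" + low + "}"
        parts.append(residue)
    return "".join(parts)


if __name__ == "__main__":
    regex = prosite_to_regex("C-x(2)-C-x(14)-C-x(2)-C")
    # Invented sequence with the four cysteines at the required spacing.
    toy_sequence = "MK" + "C" + "AA" + "C" + "G" * 14 + "C" + "PP" + "C" + "K"
    hit = re.search(regex, toy_sequence)
    print(regex, hit.span() if hit else None)
```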
Glutamate, leucine, phenylalanine and valine dehydrogenases are structurally and functionally related. They contain a Gly-rich region containing a conserved Lys residue, which has been implicated in the catalytic activity, in each case a reversible oxidative deamination reaction.
Glutamate dehydrogenases (GluDH) are enzymes that catalyse the NAD- and/or NADP-dependent reversible deamination of L-glutamate into alpha-ketoglutarate. GluDH isozymes are generally involved with either ammonia assimilation or glutamate catabolism. Two separate enzymes are present in yeasts: the NADP-dependent enzyme, which catalyses the amination of alpha-ketoglutarate to L-glutamate; and the NAD-dependent enzyme, which catalyses the reverse reaction. The latter form links the L-amino acids with the Krebs cycle, which provides a major pathway for metabolic interconversion of alpha-amino acids and alpha-keto acids.
Leucine dehydrogenase (LeuDH) is a NAD-dependent enzyme that catalyses the reversible deamination of leucine and several other aliphatic amino acids to their keto analogues. Each subunit of this octameric enzyme from Bacillus sphaericus contains 364 amino acids and folds into two domains, separated by a deep cleft. The nicotinamide ring of the NAD+ cofactor binds deep in this cleft, which is thought to close during the hydride transfer step of the catalytic cycle.
Phenylalanine dehydrogenase (PheDH) is an NAD-dependent enzyme that catalyses the reversible deamination of L-phenylalanine into phenylpyruvate.
Valine dehydrogenase (ValDH) is an NADP-dependent enzyme that catalyses the reversible deamination of L-valine into 3-methyl-2-oxobutanoate.
Neurotransmitter transport systems are integral to the release, re-uptake and recycling of neurotransmitters at synapses. High affinity transport proteins found in the plasma membrane of presynaptic nerve terminals and glial cells are responsible for the removal of released transmitters from the extracellular space, thereby terminating their actions. Plasma membrane neurotransmitter transporters fall into two structurally and mechanistically distinct families. The majority of the transporters constitute an extensive family of homologous proteins that derive energy from the co-transport of Na+ and Cl-, in order to transport neurotransmitter molecules into the cell against their concentration gradient. The family has a common structure of 12 presumed transmembrane helices and includes carriers for gamma-aminobutyric acid (GABA), noradrenaline/adrenaline, dopamine, serotonin, proline, glycine, choline, betaine and taurine. They are structurally distinct from the second more-restricted family of plasma membrane transporters, which are responsible for excitatory amino acid transport. The latter couple glutamate and aspartate uptake to the cotransport of Na+ and the counter-transport of K+, with no apparent dependence on Cl-. In addition, both of these transporter families are distinct from the vesicular neurotransmitter transporters.
Sequence analysis of the Na+/Cl- neurotransmitter superfamily reveals that it can be divided into four subfamilies, these being transporters for monoamines, the amino acids proline and glycine, GABA, and a group of orphan transporters.
In eukaryotes, transcription initiation by polymerase II is modulated by both general and specific transcription factors. The general factors (which include TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIG and TFIIH) operate through common promoter elements, such as the TATA box. Transcription factor IIB (TFIIB) is of central importance in transcription of class II genes. It associates with TFIID-TFIIA bound to DNA (the DA complex) to form a ternary TFIID-IIA-IIB (DAB) complex, which is recognised by RNA polymerase II. TFIIB comprises ~315-340 residues and contains an imperfect C-terminal repeat of a 75-residue domain that may contribute to the symmetry of the folded protein.
The post-translational attachment of ubiquitin to proteins (ubiquitinylation) alters the function, location or trafficking of a protein, or targets it to the 26S proteasome for degradation. Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), and a ubiquitin ligase (E3), which work sequentially in a cascade. The E1 enzyme mediates an ATP-dependent transfer of a thioester-linked ubiquitin molecule to a cysteine residue on the E2 enzyme. The E2 enzyme then transfers the ubiquitin moiety either directly to a substrate or to an E3 ligase, which can also ubiquitinylate a substrate.
There are several different E2 enzymes (over 30 in humans), which are broadly grouped into four classes, all of which have a core catalytic domain (containing the active site cysteine), and some of which have short N- and C-terminal amino acid extensions: class I enzymes consist of just the catalytic core domain (UBC), class II possess a UBC and a C-terminal extension, class III possess a UBC and an N-terminal extension, and class IV possess a UBC and both N- and C-terminal extensions. These extensions appear to be important for some subfamily function, including E2 localisation and protein-protein interactions. In addition, there are proteins with an E2-like fold that are devoid of catalytic activity, but which appear to assist in poly-ubiquitin chain formation.
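The four-class scheme above depends only on which extensions flank the catalytic core (UBC) domain, so it reduces to a two-flag decision. A minimal sketch, with the hypothetical function name e2_class and boolean inputs assumed to come from some upstream domain annotation:

```python
def e2_class(has_n_extension, has_c_extension):
    """Return the E2 class (I-IV) implied by the extensions around the UBC core.

    Class I: core only; class II: C-terminal extension; class III:
    N-terminal extension; class IV: both extensions.
    """
    if has_n_extension and has_c_extension:
        return "IV"
    if has_n_extension:
        return "III"
    if has_c_extension:
        return "II"
    return "I"


if __name__ == "__main__":
    for n_ext in (False, True):
        for c_ext in (False, True):
            print(n_ext, c_ext, e2_class(n_ext, c_ext))
```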
Ubiquitin-conjugating enzymes (UBC or E2 enzymes) catalyze the covalent attachment of ubiquitin to target proteins. An activated ubiquitin moiety is transferred from a ubiquitin-activating enzyme (E1) to E2, which later ligates ubiquitin directly to substrate proteins with or without the assistance of 'N-end' recognizing proteins (E3). A cysteine residue is required for ubiquitin-thiolester formation. There is a single conserved cysteine in UBCs, and the region around that residue is conserved in the sequence of known UBC isozymes. There are, however, exceptions: TSG101 is one of several UBC homologues that lack this active site cysteine. In most species there are many forms of UBC (at least 9 in yeast), which are implicated in diverse cellular functions.
The specificity of ubiquitination is conferred primarily by interactions of substrates with specific ubiquitin protein ligases (E3s) in association with ubiquitin conjugating enzymes (E2s).
Ubc12 is an E2 conjugating enzyme for RUB1, a ubiquitin-like protein displaying 53% amino acid identity to ubiquitin. It is evolutionarily conserved across species ranging from Arabidopsis thaliana (Mouse-ear cress) to Homo sapiens (Human).
Ubiquitin-conjugating enzymes (UBC or E2 enzymes) catalyze the covalent attachment of ubiquitin to target proteins. An activated ubiquitin moiety is transferred from a ubiquitin-activating enzyme (E1) to E2, which later ligates ubiquitin directly to substrate proteins with or without the assistance of 'N-end' recognizing proteins (E3). A cysteine residue is required for ubiquitin-thiolester formation. There is a single conserved cysteine in UBCs, and the region around that residue is conserved in the sequence of known UBC isozymes. There are, however, exceptions: TSG101 is one of several UBC homologues that lack this active site cysteine. In most species there are many forms of UBC (at least 9 in yeast), which are implicated in diverse cellular functions.
The specificity of ubiquitination is conferred primarily by interactions of substrates with specific ubiquitin protein ligases (E3s) in association with ubiquitin conjugating enzymes (E2s).
This entry includes 24 kDa E2 ubiquitin-conjugating enzymes from Arabidopsis thaliana (Mouse-ear cress), Schizosaccharomyces pombe (Fission yeast), Drosophila melanogaster and others.
Transketolase (TK) catalyzes the reversible transfer of a two-carbon ketol unit from xylulose 5-phosphate to an aldose acceptor, such as ribose 5-phosphate, to form sedoheptulose 7-phosphate and glyceraldehyde 3-phosphate. This enzyme, together with transaldolase, provides a link between the glycolytic and pentose-phosphate pathways. TK requires thiamine pyrophosphate as a cofactor.
This group includes two proteins from the yeast Saccharomyces cerevisiae (Baker's yeast) but excludes dihydroxyacetone synthases (formaldehyde transketolases) from various yeasts and the even more distant mammalian transketolases. Among the family of thiamine diphosphate-dependent enzymes that includes transketolases, dihydroxyacetone synthases, pyruvate dehydrogenase E1-beta subunits, and deoxyxylulose-5-phosphate synthases, mammalian and bacterial transketolases seem not to be orthologous.
Fructose-bisphosphate aldolase is a glycolytic enzyme that catalyses the reversible aldol cleavage or condensation of fructose-1,6-bisphosphate into dihydroxyacetone-phosphate and glyceraldehyde 3-phosphate. There are two classes of fructose-bisphosphate aldolases with different catalytic mechanisms: class I enzymes are found in animals, do not require a metal ion, and are characterised by the formation of a Schiff base intermediate between a highly conserved active site lysine and a substrate carbonyl group, while the class II enzymes are produced in bacteria and fungi, and require an active-site divalent metal ion. This entry represents the class I enzymes.
In vertebrates, three forms of this enzyme are found: aldolase A is expressed in muscle, aldolase B in liver, kidney, stomach and intestine, and aldolase C in brain, heart and ovary. The different isozymes have different catalytic functions: aldolases A and C are mainly involved in glycolysis, while aldolase B is involved in both glycolysis and gluconeogenesis. Defects in aldolase B result in hereditary fructose intolerance.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesizes ATP, and the V0 or A0 complex forms the membrane-spanning pore. The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases.
This entry represents the 116-kDa subunit (or subunit a) and subunit I found in the V0 or A0 complex of V- or A-ATPases, respectively. The 116-kDa subunit is a transmembrane glycoprotein required for the assembly and proton transport activity of the ATPase complex. Several isoforms of the 116-kDa subunit exist, providing a potential role in the differential targeting and regulation of the V-ATPase for specific organelles.
The MCM2-7 complex consists of six closely related proteins that are highly conserved throughout the eukaryotic kingdom. During late mitosis and G1, replication origins are 'licensed' for replication by loading the minichromosome maintenance (MCM) 2-7 proteins, forming a pre-replicative complex that is essential for initiating and elongating replication forks during S phase.
The components of the MCM2-7 complex in Homo sapiens (Human) are:
Studies in Xenopus eggs have shown that the 6 MCM proteins form hexamers, in which each class is present in equal stoichiometry. The initiation of DNA synthesis in eukaryotes requires the binding of the origin recognition complex (ORC), a complex of six subunits, to the autonomously replicating sequences (ARS) of replication origins, the recruitment of CDC6, and the binding of the MCM protein complex to the ARS to form the prereplicative complex (pre-RC). DNA synthesis is subsequently initiated by the activation of pre-RC by CDC7 and CDC28 protein kinases.
MCM proteins associate with chromatin during G1 phase and dissociate again during S phase, remaining unbound until the end of mitosis. Periodic chromatin association of the MCM complex ensures that DNA synthesis from replication origins is initiated only once during the cell cycle, avoiding over-replication of parts of the genome. Elongation of replication forks away from individual replication origins results in displacement of the MCM-containing complex from chromatin. Budding yeast MCM proteins are translocated in and out of the nucleus during each cell cycle. However, fission yeast MCMs, like those in metazoans, are constitutively nuclear.
The six classes of MCM protein together share a conserved 200 amino acid residue domain, while sequences within the same class show more extensive similarity outside this region. The conserved central domain is similar to the A motif of the Walker-type NTP-binding domain; it also shares similarity with ATPase domains of prokaryotic NtrC-related transcription regulators. The ATP-binding motif is thought to mediate ATP-dependent opening of double-stranded DNA at replication origins. In addition to the central region, MCM2, 4, 6 and 7 contain a zinc-finger-type motif thought to have a role in mediating protein-protein interactions. Moreover, a conserved alpha-helical structure in the C-terminal region has been noted; this comprises a conserved heptad repeat and a putative four-helix bundle. Most of the MCM proteins contain acidic regions, or alternately repeated clusters of acidic and basic residues.
In addition to its role as a replication factor, the MCM7 protein has DNA helicase activity when complexed as a hexamer (containing two molecules each of MCM4, MCM6 and MCM7), suggesting that this complex is involved in the initiation of DNA replication as a DNA-unwinding enzyme. The human MCM7 gene has been localised to chromosome 7q21.3-q22.1. Increased expression of MCM7 RNA and protein in MYCN-amplified neuroblastoma tumour and cell lines has been reported. Furthermore, The MCM7 protein has been shown to form complexes with the retinoblastoma protein. These findings suggest MCM7- directed DNA replication contributes to neoplastic transformation.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to the aminoacyl-tRNA, and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein S12 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S12 is known to be involved in the translation initiation step. It is a very basic protein of 120 to 150 amino-acid residues. S12 belongs to a family of ribosomal proteins which are grouped on the basis of sequence similarities. This protein is known typically as S12 in bacteria, S23 in eukaryotes and as either S12 or S23 in the Archaea.
Bacterial S12 molecules contain a conserved aspartic acid residue which undergoes a novel post-translational modification, beta-methylthiolation, to form the corresponding 3-methylthioaspartic acid.
Ribosomal protein S12 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S12 is known to be involved in the translation initiation step. It is a very basic protein of 120 to 150 amino-acid residues. S12 belongs to a family of ribosomal proteins which are grouped on the basis of sequence similarities. This family consists of ribosomal protein S12 from bacteria, mitochondria, and chloroplasts.
L6 is a protein from the large (50S) subunit. In Escherichia coli, it is located in the aminoacyl-tRNA binding site of the peptidyltransferase centre, and is known to bind directly to 23S rRNA. It belongs to a family of ribosomal proteins, including L6 from bacteria, cyanelles (structures that perform similar functions to chloroplasts, but have structural and biochemical characteristics of Cyanobacteria) and mitochondria; and L9 from mammals, Drosophila, plants and yeast. L6 comprises two almost identical folds, suggesting that it was derived by the duplication of an ancient RNA-binding protein gene. Analysis reveals several sites on the protein surface where interactions with other ribosome components may occur, the N-terminus being involved in protein-protein interactions and the C-terminus containing possible RNA-binding sites.
Glutamyl-tRNA(Gln) amidotransferase (Gat) provides a means of producing correctly charged Gln-tRNA(Gln) through the transamidation of mis-acylated Glu-tRNA(Gln) in organisms which lack glutaminyl-tRNA synthetase. The reaction takes place in the presence of glutamine and ATP through an activated gamma-phospho-Glu-tRNA(Gln). The enzyme is composed of three subunits: A (an amidase), B and C. It also exists in eukaryotes as a protein targeted to the mitochondria.
The heterotrimer GatABC is involved in converting Glu to Gln and/or Asp to Asn, when the amino acid is attached to the appropriate tRNA. In Lactobacillus, GatABC is responsible for producing tRNA(Gln). In Archaea, GatABC is responsible for producing tRNA(Asn), while GatDE is responsible for producing tRNA(Gln). In lineages that include Thermus, Chlamydia, or Acidithiobacillus, the GatABC complex catalyses the production of both tRNA(Gln) and tRNA(Asn).
This entry represents aspartyl/glutamyl-tRNA(Asn/Gln) amidotransferase subunit B and glutamyl-tRNA(Gln) amidotransferase subunit E.
Ribosomal protein L11 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L11 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, plant chloroplast, red algal chloroplast, cyanelle and archaebacterial L11; and mammalian, plant and yeast L12 (YL15). L11 is a protein of 140 to 165 amino-acid residues. In E. coli, the C-terminal half of L11 has been shown to be in an extended and loosely folded conformation and is likely to be buried within the ribosomal structure.
Ribosomal protein L11 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L11 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, chloroplast, cyanelle, and most mitochondrial forms of ribosomal protein L11. L11 is a protein of 140 to 165 amino-acid residues. In E. coli, the C-terminal half of L11 has been shown to be in an extended and loosely folded conformation and is likely to be buried within the ribosomal structure. This entry represents the bacterial, chloroplast and mitochondrial forms.
Protein phosphorylation plays a central role in the regulation of cell functions, causing the activation or inhibition of many enzymes involved in various biochemical pathways. Kinases and phosphatases are the enzymes responsible for this, and may themselves be subject to control through the action of hormones and growth factors. Serine/threonine (S/T) phosphatases catalyse the dephosphorylation of phosphoserine and phosphothreonine residues. In mammalian tissues four different types of protein phosphatase (PP) have been identified and are known as PP1, PP2A, PP2B and PP2C. Except for PP2C, these enzymes are evolutionarily related. The catalytic regions of the proteins are well conserved and have a slow mutation rate, suggesting that major changes in these regions are highly detrimental.
Protein phosphatase-1 (PP1) and protein phosphatase-2A (PP2A) have a broad specificity and there are two closely related isoforms of each, alpha and beta. PP2A is a trimeric enzyme that consists of a core composed of a catalytic subunit associated with a 65 kDa regulatory subunit and a third variable subunit. Protein phosphatase-2B (PP2B or calcineurin), a calcium-dependent enzyme whose activity is stimulated by calmodulin, is composed of two subunits: the catalytic A-subunit and the calcium-binding B-subunit. The specificity of PP2B is restricted. Other serine/threonine-specific protein phosphatases that have been characterised include mammalian phosphatase-X (PP-X) and Drosophila phosphatase-V (PP-V), which are closely related to, yet distinct from, PP2A; yeast phosphatase PPH3, which is similar to PP2A, but with different enzymatic properties; and Drosophila phosphatase-Y (PP-Y) and yeast phosphatases Z1 and Z2, which are closely related to, yet distinct from, PP1.
Aconitase (aconitate hydratase) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins, which function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also two forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3). Below is a description of some of the multi-functional activities associated with different aconitases.
This entry represents the core four domains that make up aconitase, as well as the structurally similar core domains of homoaconitase, 3-isopropylmalate dehydratase small and large subunits, 2-methylisocitrate dehydratase (AcnD), and iron regulatory protein 2 (IRP2).
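The two domain organisations described above (1-2-3-linker-4 and HEAT-4-1-2-3) can be compared with a small sketch. The following Python snippet is only an illustration: the domain labels are plain strings taken from the text, and the helper names `core_order` and `is_rotation` are hypothetical. It checks that the four core domains of AcnB appear in a cyclically permuted order relative to the AcnA-like enzymes.

```python
# Domain orders as given in the text; labels are illustrative strings only.
ACNA_LIKE = ["1", "2", "3", "linker", "4"]  # cAcn, mAcn and bacterial AcnA
ACNB = ["HEAT", "4", "1", "2", "3"]         # bacterial AcnB

# The four core domains shared by both organisations.
CORE = {"1", "2", "3", "4"}


def core_order(domains):
    """Keep only the core domains, preserving N- to C-terminal order."""
    return [d for d in domains if d in CORE]


def is_rotation(a, b):
    """Return True if list b is a cyclic rotation of list a."""
    return len(a) == len(b) and any(a[i:] + a[:i] == b for i in range(len(a)))


acna_core = core_order(ACNA_LIKE)  # ['1', '2', '3', '4']
acnb_core = core_order(ACNB)       # ['4', '1', '2', '3']
same_core_rotated = is_rotation(acna_core, acnb_core)
print(acna_core, acnb_core, same_core_rotated)
```

In this representation the 'swivel' domain (4) moves from the C-terminal end to a position ahead of domains 1-3, while the relative order of domains 1, 2 and 3 is unchanged.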
More information about these proteins can be found at Protein of the Month: Aconitase.
This entry represents several aconitase proteins, including bacterial aconitase A (AcnA), eukaryotic cytosolic aconitase (cAcn) and a few mitochondrial aconitases (mAcn) (but not the majority of mAcn enzymes). In addition, this entry represents the related proteins iron-regulatory protein 2 (IRP2) and Fe/S-dependent 2-methylisocitrate dehydratase (AcnD).
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesizes ATP, and the V0 or A0 complex forms the membrane-spanning pore. The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases.
This entry represents the D subunit found in V1 and A1 complexes of V- and A-ATPases, respectively. Subunit D appears to be located in the central stalk, whereas subunits E and G form part of the peripheral stalk connecting V1 and V0. This subunit is the most likely homologue to the gamma subunit of the F1 complex in F-ATPases, which undergoes rotation during ATP hydrolysis and serves an essential function in rotary catalysis.
More information about this protein can be found at Protein of the Month: ATP Synthases.
Translation initiation factor 5A (IF-5A) is reported to be involved in the first step of peptide bond formation in translation, to be involved in cell-cycle regulation and to be a cofactor for the Rev and Rex transactivator proteins of human immunodeficiency virus-1 and T-cell leukaemia virus I, respectively. IF-5A contains an unusual amino acid, hypusine (N-epsilon-(4-aminobutyl-2-hydroxy)lysine), that is required for its function. The first step in the post-translational modification of lysine to hypusine is catalyzed by the enzyme deoxyhypusine synthase, the structure of which has been reported.
The crystal structure of IF-5A from the archaeon Pyrobaculum aerophilum has been determined to 1.75 Å. Unmodified P. aerophilum IF-5A is found to be a beta structure with two domains and three separate hydrophobic cores. The lysine (Lys42) that is post-translationally modified by deoxyhypusine synthase is found at one end of the IF-5A molecule in a turn between beta strands beta4 and beta5; this lysine residue is freely solvent accessible. The C-terminal domain is found to be homologous to the cold-shock protein CspA of E. coli, which has a well characterised RNA-binding fold, suggesting that IF-5A is involved in RNA binding.
Sec1-like molecules have been implicated in a variety of eukaryotic vesicle transport processes including neurotransmitter release by exocytosis. They regulate vesicle transport by binding to a t-SNARE from the syntaxin family. This binding is thought to prevent formation of the SNARE complex, a protein complex required for membrane fusion. Whereas Sec1 molecules are essential for neurotransmitter release and other secretory events, their interaction with syntaxin molecules seems to represent a negative regulatory step in secretion.
Serine hydroxymethyltransferase (SHMT) is a pyridoxal phosphate (PLP) dependent enzyme and belongs to the aspartate aminotransferase superfamily (fold type I). The pyridoxal-P group is attached to a lysine residue around which the sequence is highly conserved in all forms of the enzyme. The enzyme carries out interconversion of serine and glycine using PLP as the cofactor. SHMT catalyses the transfer of a hydroxymethyl group from N5,N10-methylene tetrahydrofolate to glycine, resulting in the formation of serine and tetrahydrofolate. Both eukaryotic and prokaryotic SHMT enzymes form tight obligate homodimers and the mammalian enzyme forms a homotetramer. PLP dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalysed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into five major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), D-amino acid superfamily (fold type IV) and glycogen phosphorylase family (fold type V).
In vertebrates, glycine hydroxymethyltransferase exists in a cytoplasmic and a mitochondrial form whereas only one form is found in prokaryotes.
F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.
The ATPase F1 complex gamma subunit forms the central shaft that connects the F0 rotary motor to the F1 catalytic core. The gamma subunit functions as a rotary motor inside the cylinder formed by the alpha(3)beta(3) subunits in the F1 complex. The best-conserved region of the gamma subunit is its C-terminus, which seems to be essential for assembly and catalysis.
Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins.
The small ribosomal subunit protein S10 consists of about 100 amino acid residues. In Escherichia coli, S10 is involved in binding tRNA to the ribosome, and also operates as a transcriptional elongation factor. Experimental evidence has revealed that S10 has virtually no groups exposed on the ribosomal surface, and is one of the "split proteins": these are a discrete group that are selectively removed from 30S subunits under low salt conditions and are required for the formation of activated 30S reconstitution intermediate (RI*) particles. S10 belongs to a family of proteins that includes: bacterial S10; algal chloroplast S10; cyanelle S10; archaebacterial S10; Marchantia polymorpha and Prototheca wickerhamii mitochondrial S10; Arabidopsis thaliana mitochondrial S10 (nuclear encoded); vertebrate S20; plant S20; and yeast URP2.
Included in the family are one member each from Saccharomyces cerevisiae (Baker's yeast) and Schizosaccharomyces pombe (Fission yeast). These proteins lack an N-terminal mitochondrial transit peptide but contain additional sequence C-terminal to the ribosomal S10 protein region.
This entry describes a universal, mostly one-gene-per-genome GTP-binding protein that associates with ribosomal subunits and appears to play a role in ribosomal RNA maturation. Mutations in this gene are pleiotropic, but it appears that effects on cellular functions such as chromosome partition may be secondary to the effect on ribosome structure.
Spermidine + [eIF-5A]-lysine = 1,3-diaminopropane + [eIF-5A]-deoxyhypusine
The modified version of eIF-5A, and deoxyhypusine synthase (DS), are required for eukaryotic cell proliferation. The structure is known for this enzyme in complex with its NAD+ cofactor.
The natural resistance-associated macrophage protein (NRAMP) family consists of Nramp1, Nramp2, and yeast proteins Smf1 and Smf2. The NRAMP family is a novel family of functionally related proteins defined by a conserved hydrophobic core of ten transmembrane domains. Nramp1 is an integral membrane protein expressed exclusively in cells of the immune system and is recruited to the membrane of a phagosome upon phagocytosis. Nramp2 is a multiple divalent cation transporter for Fe2+, Mn2+ and Zn2+ amongst others. It is expressed at high levels in the intestine and is the major transferrin-independent iron uptake system in mammals. The yeast proteins Smf1 and Smf2 may also transport divalent cations.
The natural resistance of mice to infection with intracellular parasites is controlled by the Bcg locus, which modulates the cytostatic/cytocidal activity of phagocytes. Nramp1, the gene responsible, is expressed exclusively in macrophages and polymorphonuclear leukocytes, and encodes a polypeptide (natural resistance-associated macrophage protein) with features typical of integral membrane proteins. Other transporter proteins from a variety of sources also belong to this family.
Rab-like GTPases are key regulators of most if not all vesicular trafficking events between the various subcellular compartments within the eukaryotic cell. Rab-related proteins have been implicated in regulating the formation of vesicles at the donor membrane, as well as the movement, tethering and docking of vesicles, and their fusion with the acceptor membrane. The regulatory capacity of Rab-like proteins is dependent on their ability to cycle between GTP-bound active and GDP-bound inactive states. Activation of a Rab is coupled to its association with intracellular membranes, allowing it to recruit downstream effector proteins to the cytoplasmic surface of a subcellular compartment.
Rab11 is a ubiquitously expressed Rab protein that is involved in the endosomal recycling pathway in mammalian cells and has been shown to co-localise with Sec15. It also co-localises with the transferrin receptor on pericentriolar recycling endosomes (REs) and is involved in recycling of transferrin to the plasma membrane. Rab11 has also been implicated in apical recycling and transcytosis in Madin-Darby canine kidney cells and trans-Golgi network to plasma membrane trafficking via REs in baby hamster kidney cells.
Rab-like GTPases are key regulators of most if not all vesicular trafficking events between the various subcellular compartments within the eukaryotic cell. Rab-related proteins have been implicated in regulating the formation of vesicles at the donor membrane, as well as the movement, tethering and docking of vesicles, and their fusion with the acceptor membrane. The regulatory capacity of Rab-like proteins is dependent on their ability to cycle between GTP-bound active and GDP-bound inactive states. Activation of a Rab is coupled to its association with intracellular membranes, allowing it to recruit downstream effector proteins to the cytoplasmic surface of a subcellular compartment.
In higher plants, Rab18 is a drought-responsive gene involved in proline biosynthesis.
Rab-like GTPases are key regulators of most if not all vesicular trafficking events between the various subcellular compartments within the eukaryotic cell. Rab-related proteins have been implicated in regulating the formation of vesicles at the donor membrane, as well as the movement, tethering and docking of vesicles, and their fusion with the acceptor membrane. The regulatory capacity of Rab-like proteins is dependent on their ability to cycle between GTP-bound active and GDP-bound inactive states. Activation of a Rab is coupled to its association with intracellular membranes, allowing it to recruit downstream effector proteins to the cytoplasmic surface of a subcellular compartment.
Rab5 is a regulatory GTPase that is associated with the sorting endosome and participates in endosomal membrane fusion reactions. Recent experiments have provided insights into Rab5 function by demonstrating direct links between Rab5-interacting proteins and components of the membrane fusion apparatus. In addition, a realisation that Rab5 has additional functions in endosome biogenesis is emerging.
Rab-like GTPases are key regulators of most if not all vesicular trafficking events between the various subcellular compartments within the eukaryotic cell. Rab-related proteins have been implicated in regulating the formation of vesicles at the donor membrane, as well as the movement, tethering and docking of vesicles, and their fusion with the acceptor membrane. The regulatory capacity of Rab-like proteins is dependent on their ability to cycle between GTP-bound active and GDP-bound inactive states. Activation of a Rab is coupled to its association with intracellular membranes, allowing it to recruit downstream effector proteins to the cytoplasmic surface of a subcellular compartment.
Using antibodies to study in vivo trafficking, Rab6 was found to be in its GTP-bound conformation on the Golgi apparatus and transport intermediates, and the geometry of transport intermediates was modulated by Rab6 activity. Recent work showed that dynactin binds to Rab6 and shows Rab6-dependent recruitment to Golgi membranes. Other Golgi Rabs do not bind to dynactin and are unable to support its recruitment to membranes. Rab6 therefore functions as a specificity or tethering factor controlling the recruitment of dynactin to membranes.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
This family includes a number of eukaryotic and archaebacterial ribosomal proteins; mammalian S19, Drosophila S19, Ascaris lumbricoides S19g (ALEP-1) and S19s, yeast YS16 (RP55A and RP55B), Aspergillus S16 and Haloarcula marismortui HS12.
The small ADP ribosylation factor (Arf) GTP-binding proteins are major regulators of vesicle biogenesis in intracellular traffic. They are the founding members of a growing family that includes Arl (Arf-like), Arp (Arf-related proteins) and the remotely related Sar (Secretion-associated and Ras-related) proteins. Arf proteins cycle between inactive GDP-bound and active GTP-bound forms that bind selectively to effectors. The classical structural GDP/GTP switch is characterised by conformational changes at the so-called switch 1 and switch 2 regions, which bind tightly to the gamma-phosphate of GTP but poorly or not at all to the GDP nucleotide. Structural studies of Arf1 and Arf6 have revealed that although these proteins feature the switch 1 and 2 conformational changes, they depart from other small GTP-binding proteins in that they use an additional, unique switch to propagate structural information from one side of the protein to the other.
The GDP/GTP structural cycles of human Arf1 and Arf6 feature a unique conformational change that affects the beta2-beta3 strands connecting switch 1 and switch 2 (interswitch) and also the amphipathic helical N-terminus. In GDP-bound Arf1 and Arf6, the interswitch is retracted and forms a pocket to which the N-terminal helix binds, the latter serving as a molecular hasp to maintain the inactive conformation. In the GTP-bound form of these proteins, the interswitch undergoes a two-residue register shift that pulls switch 1 and switch 2 'up', restoring an active conformation that can bind GTP. In this conformation, the interswitch projects out of the protein and extrudes the N-terminal hasp by occluding its binding pocket.
The small ADP ribosylation factor (Arf) GTP-binding proteins are major regulators of vesicle biogenesis in intracellular traffic. They are the founding members of a growing family that includes Arl (Arf-like), Arp (Arf-related proteins) and the remotely related Sar (Secretion-associated and Ras-related) proteins. Arf proteins cycle between inactive GDP-bound and active GTP-bound forms that bind selectively to effectors. The classical structural GDP/GTP switch is characterised by conformational changes at the so-called switch 1 and switch 2 regions, which bind tightly to the gamma-phosphate of GTP but poorly or not at all to the GDP nucleotide. Structural studies of Arf1 and Arf6 have revealed that although these proteins feature the switch 1 and 2 conformational changes, they depart from other small GTP-binding proteins in that they use an additional, unique switch to propagate structural information from one side of the protein to the other.
The GDP/GTP structural cycles of human Arf1 and Arf6 feature a unique conformational change that affects the beta2-beta3 strands connecting switch 1 and switch 2 (interswitch) and also the amphipathic helical N-terminus. In GDP-bound Arf1 and Arf6, the interswitch is retracted and forms a pocket to which the N-terminal helix binds, the latter serving as a molecular hasp to maintain the inactive conformation. In the GTP-bound form of these proteins, the interswitch undergoes a two-residue register shift that pulls switch 1 and switch 2 'up', restoring an active conformation that can bind GTP. In this conformation, the interswitch projects out of the protein and extrudes the N-terminal hasp by occluding its binding pocket.
The SAR1 protein, first identified in budding yeast, is a 21 kDa GTP-binding protein involved in vesicular transport between the endoplasmic reticulum and the Golgi. It takes part in the formation of secretory vesicles by binding to an ER type II membrane protein, Sec12p. It is evolutionarily conserved and seems to be present in all eukaryotes.
SAR1 is generally included in the RAS 'superfamily' of small GTP-binding proteins, but it is only slightly related to other RAS proteins. It also differs from RAS proteins in that it lacks cysteine residues at the C terminus and is therefore not subject to prenylation. SAR1 is slightly related to ARFs.
Beta-ketoacyl-ACP synthase (KAS) is the enzyme that catalyses the condensation of malonyl-ACP with the growing fatty acid chain. It is found as a component of a number of enzymatic systems, including fatty acid synthetase (FAS), which catalyses the formation of long-chain fatty acids from acetyl-CoA, malonyl-CoA and NADPH; the multi-functional 6-methylsalicylic acid synthase (MSAS) from Penicillium patulum, which is involved in the biosynthesis of a polyketide antibiotic; polyketide antibiotic synthase enzyme systems; Emericella nidulans multifunctional protein Wa, which is involved in the biosynthesis of conidial green pigment; Rhizobium nodulation protein nodE, which probably acts as a beta-ketoacyl synthase in the synthesis of the nodulation Nod factor fatty acyl chain; and yeast mitochondrial protein CEM1. The condensation reaction is a two-step process: first the acyl component of an activated acyl primer is transferred to a cysteine residue of the enzyme, and it is then condensed with an activated malonyl donor with the concomitant release of carbon dioxide.
This is a family of glycine cleavage H-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. A lipoyl group is attached to a completely conserved lysine residue. The H protein shuttles the methylamine group of glycine from the P protein to the T protein.
Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPases) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation. The PTP superfamily can be divided into four subfamilies:
Based on their cellular localisation, PTPases are also classified as:
All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel beta-sheet with flanking alpha-helices containing a beta-loop-alpha-loop that encompasses the PTP signature motif. Functional diversity between PTPases is endowed by regulatory domains and subunits.
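The C(X)5R signature described above can be read as a simple sequence pattern: a cysteine, any five residues, then an arginine. The following is a minimal sketch of how such a pattern might be located in a one-letter amino acid sequence. The function name and the example sequence are hypothetical and chosen only for illustration; a match indicates a candidate motif, not a confirmed PTPase active site.

```python
import re

# C(X)5R: cysteine, any five residues, arginine.
# A lookahead is used so that overlapping candidates are all reported.
PTP_SIGNATURE = re.compile(r"(?=(C[A-Z]{5}R))")


def find_ptp_signature(sequence):
    """Return (0-based start, matched text) for every C(X)5R candidate."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in PTP_SIGNATURE.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical example sequence, used only to exercise the pattern.
    print(find_ptp_signature("MKVHCSAGIGRSGTF"))
```

Because the pattern is short, chance matches are expected in unrelated proteins, so a hit of this kind would normally be combined with other evidence such as the surrounding fold.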
This entry represents the low molecular weight (LMW) protein-tyrosine phosphatases (or acid phosphatase), which act on tyrosine phosphorylated proteins, low-MW aryl phosphates and natural and synthetic acyl phosphates. The structure of a LMW PTPase has been solved by X-ray crystallography and is found to form a single structural domain. It belongs to the alpha/beta class, with 6 alpha-helices and 4 beta-strands forming a 3-layer alpha-beta-alpha sandwich architecture.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
The ribosomal protein L13e is widely found in vertebrates, Drosophila melanogaster, plants, yeast and others.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A variety of ribosomal L10e proteins from eukaryotes, including plants, and from archaea can be grouped together. This family consists of vertebrate L10 (QM), plant L10, Caenorhabditis elegans L10, yeast L10 (QSR1) and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ0543.
This family of proteins includes rRNA adenine dimethylases (e.g. KsgA) and the erythromycin resistance methylases (Erm).
The bacterial enzyme KsgA catalyses the transfer of a total of four methyl groups from S-adenosyl-L-methionine (S-AdoMet) to two adjacent adenosine bases in 16S rRNA. This enzyme and the resulting modified adenosine bases appear to be conserved in all species of eubacteria, eukaryotes, and archaea, and in eukaryotic organelles. Bacterial resistance to the aminoglycoside antibiotic kasugamycin involves inactivation of KsgA and resulting loss of the dimethylations, with modest consequences to the overall fitness of the organism. In contrast, the yeast ortholog, Dim1, is essential. In Saccharomyces cerevisiae (Baker's yeast), and presumably in other eukaryotes, the enzyme performs a vital role in pre-rRNA processing in addition to its methylating activity. The best conserved region in these enzymes is located in the N-terminal section and corresponds to a region that is probably involved in S-adenosyl methionine (SAM) binding.
The crystal structure of KsgA from Escherichia coli has been solved to a resolution of 2.1 Å. It bears a strong similarity to the crystal structure of ErmC' from Bacillus stearothermophilus and a lesser similarity to the yeast mitochondrial transcription factor, sc-mtTFB.
The Erm family of RNA methyltransferases, which methylate a single adenosine base in 23S rRNA, confer resistance to the MLS-B group of antibiotics. Despite their sequence similarity, the two enzyme families have strikingly different levels of regulation that remain to be elucidated. Other orthologs of this family include the yeast and Homo sapiens (Human) mitochondrial transcription factors (MTF1 and h-mtTFB respectively), which are nuclear encoded. Human-mtTFB is able to stimulate transcription in vitro independently of its S-adenosylmethionine binding and rRNA methyltransferase activity.
The aldo-keto reductase family includes a number of related monomeric NADPH-dependent oxidoreductases, such as aldehyde reductase, aldose reductase, prostaglandin F synthase, xylose reductase, rho crystallin, and many others. All possess a similar structure, with a beta-alpha-beta fold characteristic of nucleotide binding proteins. The fold comprises a parallel beta-8/alpha-8-barrel, which contains a novel NADP-binding motif. The binding site is located in a large, deep, elliptical pocket in the C-terminal end of the beta sheet, the substrate being bound in an extended conformation. The hydrophobic nature of the pocket favours aromatic and apolar substrates over highly polar ones.
Binding of the NADPH coenzyme causes a massive conformational change, reorienting a loop, effectively locking the coenzyme in place. This binding is more similar to FAD- than to NAD(P)-binding oxidoreductases.
Some proteins of this entry contain a K+ ion channel beta chain regulatory domain; these are reported to have oxidoreductase activity.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
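The difference between the loose HEXXH motif and the more stringent 'abXHEbbHbc' definition can be illustrated with two sequence patterns. The sketch below is a rough illustration under stated assumptions: the residue sets used for 'uncharged' (all standard residues except D, E, K and R) and 'hydrophobic' (A, V, L, I, M, F, W, Y) are choices made here, not taken from the text; position 'a' is left unconstrained because valine or threonine is only the most frequent case; and the proline exclusion is not encoded. The function names and test sequences are hypothetical.

```python
import re

# Assumed residue classes (illustrative choices, not from the source text).
UNCHARGED = "ACFGHILMNPQSTVWY"   # every standard residue except D, E, K, R
HYDROPHOBIC = "AVLIMFWY"

# Loose motif: H-E-X-X-H.
HEXXH = re.compile(r"(?=(HE[A-Z]{2}H))")

# Stringent motif: a-b-X-H-E-b-b-H-b-c, with 'a' and 'X' unconstrained.
STRINGENT = re.compile(
    rf"(?=([A-Z][{UNCHARGED}][A-Z]HE[{UNCHARGED}]{{2}}H[{UNCHARGED}][{HYDROPHOBIC}]))"
)


def find_hexxh(sequence):
    """Return (0-based start, match) for every HEXXH occurrence."""
    return [(m.start(), m.group(1)) for m in HEXXH.finditer(sequence.upper())]


def find_stringent(sequence):
    """Return (0-based start, match) for every 'abXHEbbHbc' occurrence."""
    return [(m.start(), m.group(1)) for m in STRINGENT.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical sequences: the second has a charged Lys at a 'b' position.
    for seq in ("GDVVAHELTHAVTD", "GDVVAHEKTHAVTD"):
        print(seq, find_hexxh(seq), find_stringent(seq))
```

In this sketch the second sequence still matches HEXXH but fails the stringent pattern, which shows how the extended definition filters out some of the chance occurrences of the shorter motif.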
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of metallopeptidases belongs to MEROPS peptidase family M22 (clan MK). The Pasteurella haemolytica secreted O-sialoglycoprotein endopeptidase Gcp (glycoprotease) cleaves only proteins that are heavily sialylated, in particular those with sialylated serine and threonine residues. It does not cleave unglycosylated proteins, desialylated glycoproteins or glycoproteins that are only N-glycosylated.
In some organisms, the O-sialoglycoprotein endopeptidase domain is fused to the serine/threonine protein kinase domain STYKS.
Citrate synthase is a member of a small family of enzymes that can directly form a carbon-carbon bond without the presence of metal ion cofactors. It catalyses the first reaction in the Krebs' cycle, namely the conversion of oxaloacetate and acetyl-coenzyme A into citrate and coenzyme A. This reaction is important for energy generation and for carbon assimilation. The reaction proceeds via a non-covalently bound citryl-coenzyme A intermediate in a 2-step process (aldol-Claisen condensation followed by the hydrolysis of citryl-CoA).
Citrate synthase enzymes are found in two distinct structural types: type I enzymes (found in eukaryotes, Gram-positive bacteria and archaea) form homodimers and have shorter sequences than type II enzymes, which are found in Gram-negative bacteria and are hexameric in structure. In both types, the monomer is composed of two domains: a large alpha-helical domain consisting of two structural repeats, where the second repeat is interrupted by a small alpha-helical domain. The cleft between these domains forms the active site, where both citrate and acetyl-coenzyme A bind. The enzyme undergoes a conformational change upon binding of the oxaloacetate ligand, whereby the active site cleft closes over in order to form the acetyl-CoA binding site. The energy required for domain closure comes from the interaction of the enzyme with the substrate. Type II enzymes possess an extra N-terminal beta-sheet domain, and some type II enzymes are allosterically inhibited by NADH.
This entry represents types I and II citrate synthase enzymes, as well as the related enzymes 2-methylcitrate synthase and ATP citrate synthase. 2-methylcitrate synthase catalyses the conversion of oxaloacetate and propanoyl-CoA into (2R,3S)-2-hydroxybutane-1,2,3-tricarboxylate and coenzyme A. This enzyme is induced during bacterial growth on propionate, while type II hexameric citrate synthase is constitutive. ATP citrate synthase (also known as ATP citrate lyase) catalyses the MgATP-dependent, CoA-dependent cleavage of citrate into oxaloacetate and acetyl-CoA, a key step in the reductive tricarboxylic acid pathway of CO2 assimilation used by a variety of autotrophic bacteria and archaea to fix carbon dioxide. ATP citrate synthase is composed of two distinct subunits. In eukaryotes, ATP citrate synthase is a homotetramer of a single large polypeptide, and is used to produce cytosolic acetyl-CoA from mitochondrially produced citrate.
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.
Casein kinase, a ubiquitous, well-conserved protein kinase involved in cell metabolism and differentiation, is characterised by its preference for Ser or Thr in acidic stretches of amino acids. The enzyme is a tetramer of 2 alpha- and 2 beta-subunits. However, some species (e.g., mammals) possess 2 related forms of the alpha-subunit (alpha and alpha'), while others (e.g., fungi) possess 2 related beta-subunits (beta and beta'). The alpha-subunit is the catalytic unit and contains regions characteristic of serine/threonine protein kinases. The beta-subunit is believed to be regulatory, possessing an N-terminal auto-phosphorylation site, an internal acidic domain, and a potential metal-binding motif. The beta-subunit is a highly conserved protein of about 25 kDa that contains, in its central section, a cysteine-rich motif, CX(n)C, that could be involved in binding a metal such as zinc. The mammalian beta-subunit gene promoter shares common features with those of other mammalian protein kinases and is closely related to the promoter of the regulatory subunit of cAMP-dependent protein kinase.
Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.
Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta).
This entry represents EF-Tu (EF1A) proteins found primarily in bacteria, mitochondria and chloroplasts.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.
AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface .
This entry represents the small sigma subunit of clathrin adaptor AP1, which is also known as the AP19 subunit. The small sigma subunit of AP proteins has been characterised in several species. The sigma subunit plays a role in protein sorting in the late-Golgi/trans-Golgi network (TGN) and/or endosomes.
More information about these proteins can be found at Protein of the Month: Clathrin.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein S8 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S8 is known to bind directly to 16S ribosomal RNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups eubacterial, algal and plant chloroplast, cyanelle, archaebacterial and Marchantia polymorpha mitochondrial S8; mammalian and plant S15A; and yeast S22 (S24) ribosomal proteins.
Ribosomal protein S11 plays an essential role in selecting the correct tRNA in protein biosynthesis. It is located on the large lobe of the small ribosomal subunit. On the basis of sequence similarities, S11 belongs to a family of bacterial, archaeal and eukaryotic ribosomal proteins.
Ribosomal protein L14 is one of the proteins from the large ribosomal subunit. In eubacteria, L14 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins, which have been grouped on the basis of sequence similarities. Based on amino-acid sequence homology, it is predicted that ribosomal protein L14 is a member of a recently identified family of structurally related RNA-binding proteins. L14 is a protein of 119 to 137 amino-acid residues.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while in class II reactions the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine and threonine belong to class II.
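The class assignments listed above lend themselves to a simple lookup. The sketch below is illustrative only: the names `CLASS_I`, `CLASS_II`, `synthetase_class` and `preferred_hydroxyl` are hypothetical, and the data are taken directly from the lists in the preceding paragraph (subclasses a, b and c are not modelled because the text does not enumerate them).

```python
# Amino acid specificities per synthetase class, as listed in the text above.
CLASS_I = frozenset({
    "arginine", "cysteine", "glutamic acid", "glutamine", "isoleucine",
    "leucine", "methionine", "tyrosine", "tryptophan", "valine",
})
CLASS_II = frozenset({
    "alanine", "asparagine", "aspartic acid", "glycine", "histidine",
    "lysine", "phenylalanine", "proline", "serine", "threonine",
})


def synthetase_class(amino_acid: str) -> str:
    """Return "I" or "II" for the synthetase specific to the given amino acid."""
    name = amino_acid.strip().lower()
    if name in CLASS_I:
        return "I"
    if name in CLASS_II:
        return "II"
    raise ValueError(f"not one of the 20 listed amino acids: {amino_acid!r}")


def preferred_hydroxyl(amino_acid: str) -> str:
    """Return the tRNA hydroxyl to which the aminoacyl group is coupled.

    Class I couples to the 2'-hydroxyl; class II prefers the 3'-hydroxyl.
    """
    return "2'" if synthetase_class(amino_acid) == "I" else "3'"


if __name__ == "__main__":
    for aa in ("tyrosine", "serine"):
        print(aa, synthetase_class(aa), preferred_hydroxyl(aa))
```

The two sets are disjoint and together cover the 20 standard amino acids, so every valid input maps to exactly one class.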
Tyrosyl-tRNA synthetase is an alpha2 dimer that belongs to class Ib. Studies on tyrosyl-tRNA synthetase provide the first kinetic evidence that the 'KMSKS' motif plays a role in the initial binding of tRNA(Tyr) to tyrosyl-tRNA synthetase.
Seryl-tRNA synthetase exists as a monomer and belongs to class IIa.
A number of nucleoside diphosphate and triphosphate hydrolases as well as some yet uncharacterised proteins have been found to belong to the same family. The uncharacterised proteins all seem to be membrane-bound.
CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).
The mechanism of REP-1-mediated membrane association of Rab5 is similar to that mediated by Rab GDP dissociation inhibitor (GDI). REP-1 and Rab GDI also share other functional properties, including the ability to inhibit the release of GDP and to remove Rab proteins from membranes.
The crystal structure of the bovine alpha-isoform of Rab GDI has been determined to a resolution of 1.81 Å. The protein is composed of two main structural units: a large complex multi-sheet domain I, and a smaller alpha-helical domain II.
The structural organisation of domain I is closely related to FAD-containing monooxygenases and oxidases. Conserved regions common to GDI and the choroideraemia gene product, which delivers Rab to catalytic subunits of Rab geranylgeranyltransferase II, are clustered on one face of the domain. The two most conserved regions form a compact structure at the apex of the molecule; site-directed mutagenesis has shown these regions to play a critical role in the binding of Rab proteins.
GidA is a tRNA modification enzyme found in bacteria and mitochondria. Though the precise molecular function of these proteins is not known, GidA is involved in the 5-carboxymethylaminomethyl modification of the wobble uridine base in some tRNAs. Sequence variations in the human mitochondrial protein may influence the severity of aminoglycoside-induced deafness.
This entry represents GidA and related proteins, such as Gid, whose functions are not known.
There are four different enzymes that share a similar catalytic mechanism, which involves the phosphorylation by ATP (or GTP) of a specific histidine residue in the active site. These enzymes are: ATP-citrate lyase, the primary enzyme responsible for the synthesis of cytosolic acetyl-CoA in many tissues, which catalyzes the formation of acetyl-CoA and oxaloacetate from citrate and CoA with the concomitant hydrolysis of ATP to ADP and phosphate (ATP-citrate lyase is a tetramer of identical subunits); succinyl-CoA ligase (GDP-forming), a mitochondrial enzyme that catalyzes the substrate-level phosphorylation step of the tricarboxylic acid cycle: the formation of succinyl-CoA from succinate with a concomitant hydrolysis of GTP to GDP and phosphate (this enzyme is a dimer composed of an alpha and a beta subunit); succinyl-CoA ligase (ADP-forming), a bacterial enzyme that during aerobic metabolism functions in the citric acid cycle, coupling the hydrolysis of succinyl-CoA to the synthesis of ATP, and that can also function in the other direction for anabolic purposes (this enzyme is a tetramer composed of two alpha and two beta subunits); and malate-CoA ligase (malyl-CoA synthetase), a bacterial enzyme that forms malyl-CoA from malate and CoA with the concomitant hydrolysis of ATP to ADP and phosphate (malate-CoA ligase is composed of two different subunits).
Pyruvate kinase (PK) catalyses the final step in glycolysis, the conversion of phosphoenolpyruvate to pyruvate with concomitant phosphorylation of ADP to ATP:
ADP + phosphoenolpyruvate = ATP + pyruvate
The enzyme, which is found in all living organisms, requires both magnesium and potassium ions for its activity. In vertebrates, there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart and brain), and M2 (early foetal tissue). In plants, PK exists as cytoplasmic and plastid isozymes. Most bacteria and lower eukaryotes have one form, although certain bacteria, such as Escherichia coli, have two isozymes. All isozymes appear to be tetramers of identical subunits of ~500 residues.
PK helps control the rate of glycolysis, along with phosphofructokinase and hexokinase. PK possesses allosteric sites for numerous effectors, yet the isozymes respond differently, in keeping with their different tissue distributions. The activity of L-type (liver) PK is increased by fructose-1,6-bisphosphate (F1,6BP) and lowered by ATP and alanine (gluconeogenic precursor), therefore when glucose levels are high, glycolysis is promoted, and when levels are low, gluconeogenesis is promoted. L-type PK is also hormonally regulated, being activated by insulin and inhibited by glucagon, which covalently modifies the PK enzyme. M1-type (muscle, brain) PK is inhibited by ATP, but F1,6BP and alanine have no effect, which correlates with the function of muscle and brain, as opposed to the liver.
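The isozyme-specific responses described above can be summarised as a small table. This is only a sketch of the statements in the preceding paragraph: the names `PK_EFFECTORS` and `effect_on` are hypothetical, and effectors or isozymes not mentioned in the text (for example R and M2) are deliberately left out rather than guessed.

```python
from typing import Dict

# +1 = activity increased, -1 = activity lowered, 0 = no effect,
# exactly as stated in the text for L-type and M1-type pyruvate kinase.
PK_EFFECTORS: Dict[str, Dict[str, int]] = {
    "L": {
        "fructose-1,6-bisphosphate": +1,
        "ATP": -1,
        "alanine": -1,
        "insulin": +1,
        "glucagon": -1,
    },
    "M1": {
        "ATP": -1,
        "fructose-1,6-bisphosphate": 0,
        "alanine": 0,
    },
}

_LABELS = {+1: "activated", -1: "inhibited", 0: "no effect"}


def effect_on(isozyme: str, effector: str) -> str:
    """Describe how an effector changes the activity of a PK isozyme.

    Raises KeyError if the text does not describe that combination.
    """
    return _LABELS[PK_EFFECTORS[isozyme][effector]]


if __name__ == "__main__":
    for iso in PK_EFFECTORS:
        for eff in PK_EFFECTORS[iso]:
            print(f"{iso}-type PK + {eff}: {effect_on(iso, eff)}")
```

Keeping undescribed combinations out of the table means a lookup fails loudly instead of silently implying "no effect".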
The structures of several pyruvate kinases from various organisms have been determined. The protein comprises three to four domains: a small N-terminal helical domain (absent in bacterial PK), a beta/alpha-barrel domain, a beta-barrel domain (inserted within the beta/alpha-barrel domain), and a 3-layer alpha/beta/alpha sandwich domain.
The Escherichia coli Hsp40 DnaJ and Hsp70 DnaK cooperate in the binding of proteins at intermediate stages of folding, assembly, and translocation across membranes. Binding of protein substrates to the DnaK C-terminal domain is controlled by ATP-binding and hydrolysis in the N-terminal ATPase domain. The interaction of DnaJ with DnaK is mediated at least in part by the highly conserved N-terminal J-domain of DnaJ. The J-domain interaction is localised to the ATPase domain of DnaK and is likely to be dominated by electrostatic interactions. J-domain may tether DnaK to DnaJ-bound substrates, which DnaK then binds with its C-terminal peptide-binding domain. The peptide-binding domain of DnaJ is comprised of a beta sandwich made up of 6 beta-strands divided into 2 sheets.
Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP-binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolysing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate-binding domain. Dimeric GrpE is the co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle.
Besides stimulating the ATPase activity of DnaK through its J-domain, DnaJ also associates with unfolded polypeptide chains and prevents their aggregation. Thus, DnaK and DnaJ may bind to one and the same polypeptide chain to form a ternary complex. The formation of a ternary complex may result in cis-interaction of the J-domain of DnaJ with the ATPase domain of DnaK. An unfolded polypeptide may enter the chaperone cycle by associating first either with ATP-liganded DnaK or with DnaJ. DnaK interacts with both the backbone and side chains of a peptide substrate; it thus shows binding polarity and admits only L-peptide segments. In contrast, DnaJ has been shown to bind both L- and D-peptides and is assumed to interact only with the side chains of the substrate.
Isocitrate dehydrogenase (IDH) is an important enzyme of carbohydrate metabolism which catalyzes the oxidative decarboxylation of isocitrate into alpha-ketoglutarate. IDH is either dependent on NAD+ or on NADP+. In eukaryotes there are at least three isozymes of IDH: two are located in the mitochondrial matrix (one NAD+-dependent, the other NADP+-dependent), while the third one (also NADP+-dependent) is cytoplasmic. In Escherichia coli the activity of a NADP+-dependent form of the enzyme is controlled by the phosphorylation of a serine residue; the phosphorylated form of IDH is completely inactivated.
This group defines the eukaryotic NADP-dependent isocitrate dehydrogenases, including the cytosolic, mitochondrial and chloroplast enzymes, but it also matches a small number of bacterial proteins.
Aminotransferases share certain mechanistic features with other pyridoxal-phosphate dependent enzymes, such as the covalent binding of the pyridoxal-phosphate group to a lysine residue. On the basis of sequence similarity, these various enzymes can be grouped into subfamilies.
One of these, called class-IV, currently consists of proteins of about 270 to 415 amino-acid residues that share a few regions of sequence similarity. Surprisingly, the best conserved region does not include the lysine residue to which the pyridoxal-phosphate group is known to be attached in ilvE, but is located some 40 residues on the C-terminal side of the pyridoxal-phosphate-lysine. The D-amino acid transferases (D-AATs), which are among the members of this entry, are required by bacteria to catalyse the synthesis of D-glutamic acid and D-alanine, which are essential constituents of the bacterial cell wall and are the building blocks for other D-amino acids. Despite the difference in the structure of the substrates, D-AATs and L-AATs have strong similarity.
A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of proteins that have from 220 to 250 amino acids.
Ribosomal protein S4 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S4 is known to bind directly to 16S ribosomal RNA. Mutations in S4 have been shown to increase translational error frequencies. S4 is a protein of 171 to 205 amino-acid residues (except for NAM9, which is much larger). The crystal structure of a bacterial S4 protein revealed a two-domain molecule. The first domain is composed of four helices in the known structure. The second domain is inserted in the middle of the first one and displays some structural homology with the ETS DNA-binding domain. This family includes small ribosomal subunit S4 from prokaryotes and S9 from animals.
A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. The small ribosomal subunit protein S12 contains 130-150 amino acid residues, and is thought to be involved in the translation initiation step. This family consists of eukaryotic S12 ribosomal proteins, including those from vertebrates, Trypanosoma brucei, Caenorhabditis elegans, Drosophila and Saccharomyces cerevisiae (Baker's yeast).
Adenylosuccinate synthetase plays an important role in purine biosynthesis, by catalysing the GTP-dependent conversion of IMP and aspartic acid to adenylosuccinate, the first committed step in the synthesis of AMP. Adenylosuccinate synthetase has been characterised from various sources ranging from Escherichia coli (gene purA) to vertebrate tissues. In vertebrates, two isozymes are present: one involved in purine biosynthesis and the other in the purine nucleotide cycle.
The crystal structure of adenylosuccinate synthetase from E. coli reveals that the dominant structural element of each monomer of the homodimer is a central beta-sheet of 10 strands. The first nine strands of the sheet are mutually parallel with right-handed crossover connections between the strands. The 10th strand is antiparallel with respect to the first nine strands. In addition, the enzyme has two antiparallel beta-sheets, composed of two strands and three strands respectively, 11 alpha-helices and two short 3/10-helices. Further, it has been suggested that the similarities in the GTP-binding domains of the synthetase and the p21ras protein are an example of convergent evolution of two distinct families of GTP-binding proteins. Comparison of the structures of adenylosuccinate synthetase from Triticum aestivum and Arabidopsis thaliana with the known structures from E. coli reveals that the overall fold is very similar to that of the E. coli protein.
A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of:
These proteins have about 200 amino acid residues.
It is thought that nucleosome assembly proteins (NAPs) act as histone chaperones, shuttling both core and linker histones from their site of synthesis in the cytoplasm to the nucleus. The proteins may be involved in regulating gene expression and therefore cellular differentiation.
The centrosomal protein c-Nap1, also known as Cep250, has been implicated in the cell-cycle-regulated cohesion of microtubule-organizing centres. This 281 kDa protein consists mainly of domains predicted to form coiled-coil structures. The C-terminal region defines a novel histone-binding domain that is responsible for targeting CNAP1, and possibly condensin, to mitotic chromosomes. During interphase, c-Nap1 localizes to the proximal ends of both parental centrioles, but it dissociates from these structures at the onset of mitosis. Re-association with centrioles then occurs in late telophase or at the very beginning of G1 phase, when daughter cells are still connected by post-mitotic bridges. Electron microscopic studies performed on isolated centrosomes suggest that a proteinaceous linker connects parental centrioles, and c-Nap1 may be part of a linker structure that ensures the cohesion of duplicated centrosomes during interphase but is dismantled upon centrosome separation at the onset of mitosis.
L-aspartate + 2-oxoglutarate = oxaloacetate + L-glutamate. Aminotransferases share certain mechanistic features with other pyridoxal-phosphate-dependent enzymes, such as the covalent binding of the pyridoxal-phosphate group to a lysine residue. This family includes some aromatic-amino-acid aminotransferases too.
The small subunit ribosomal proteins can be categorised as: primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S19 contains 88-144 amino acid residues. In Escherichia coli, S19 is known to form a complex with S13 that binds strongly to 16S ribosomal RNA. Experimental evidence has revealed that S19 is moderately exposed on the ribosomal surface, and is designated a secondary rRNA binding protein. S19 belongs to a family of ribosomal proteins that includes: eubacterial S19; algal and plant chloroplast S19; cyanelle S19; archaebacterial S19; plant mitochondrial S19; and eukaryotic S15 ('rig' protein).
This family represents eukaryotic ribosomal protein S15 and its archaeal equivalent. It excludes bacterial and organellar ribosomal protein S19. The nomenclature for the archaeal members is unresolved and given variously as S19 (after the more distant bacterial homologs) or S15.
Dynein is a multisubunit microtubule-dependent motor enzyme that acts as the force generating protein of eukaryotic cilia and flagella. The cytoplasmic isoform of dynein acts as a motor for the intracellular retrograde motility of vesicles and organelles along microtubules.
Dynein is composed of a number of ATP-binding large subunits, intermediate size subunits and small subunits. Among the small subunits, there is a family of highly conserved proteins which make up this family.
Both type 1 (DLC1) and 2 (DLC2) dynein light chains have a similar two-layer alpha-beta core structure consisting of beta-alpha(2)-beta-X-beta(2).
Amidase signature (AS) enzymes are a large group of hydrolytic enzymes that contain a conserved stretch of approximately 130 amino acids known as the AS sequence. They are widespread, being found in both prokaryotes and eukaryotes. AS enzymes catalyse the hydrolysis of amide bonds (CO-NH2), although the family has diverged widely with regard to substrate specificity and function. Nonetheless, these enzymes maintain a core alpha/beta/alpha structure, where the topologies of the N- and C-terminal halves are similar. AS enzymes characteristically have a highly conserved C-terminal region rich in serine and glycine residues, but devoid of aspartic acid and histidine residues, and therefore differ from classical serine hydrolases. These enzymes possess a unique, highly conserved Ser-Ser-Lys catalytic triad used for amide hydrolysis, although the catalytic mechanism for acyl-enzyme intermediate formation can differ between enzymes.
Examples of AS enzymes include:
A variety of substrate carrier proteins that are involved in energy transfer are found in the inner mitochondrial membrane or integral to the membrane of other eukaryotic organelles such as the peroxisome. Such proteins include: ADP, ATP carrier protein (ADP/ATP translocase); 2-oxoglutarate/malate carrier protein; phosphate carrier protein; tricarboxylate transport protein (or citrate transport protein); Graves disease carrier protein; yeast mitochondrial proteins MRS3 and MRS4; yeast mitochondrial FAD carrier protein; and many others. Structurally, these proteins can consist of up to three tandem repeats of a domain of approximately 100 residues, each domain containing two transmembrane regions.
Enolase (2-phospho-D-glycerate hydrolase) is an essential glycolytic enzyme that catalyses the interconversion of 2-phosphoglycerate and phosphoenolpyruvate. In vertebrates, there are 3 different, tissue-specific isoenzymes, designated alpha, beta and gamma. Alpha is present in most tissues, beta is localised in muscle tissue, and gamma is found only in nervous tissue. The functional enzyme exists as a dimer of any 2 isoforms. In immature organs and in adult liver, it is usually an alpha homodimer; in adult skeletal muscle, a beta homodimer; and in adult neurons, a gamma homodimer. In developing muscle, it is usually an alpha/beta heterodimer, and in the developing nervous system, an alpha/gamma heterodimer. The tissue-specific forms display minor kinetic differences. Tau-crystallin, one of the major lens proteins in some fish, reptiles and birds, has been shown to be evolutionarily related to enolase.
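The statement that the functional enzyme is a dimer of any two isoforms can be made concrete with a short enumeration. The following is a minimal sketch in Python; the function name `enolase_dimers` is hypothetical, and the enumeration is purely combinatorial: it says nothing about which dimers actually occur in a given tissue.

```python
from itertools import combinations_with_replacement

# The three vertebrate isoenzymes named in the text.
ISOFORMS = ("alpha", "beta", "gamma")

def enolase_dimers(isoforms=ISOFORMS):
    """List every unordered pairing of two isoforms (homo- and heterodimers).

    Hypothetical helper for illustration only; it counts combinations and
    does not model tissue distribution or kinetics.
    """
    return list(combinations_with_replacement(isoforms, 2))

dimers = enolase_dimers()
homodimers = [d for d in dimers if d[0] == d[1]]
heterodimers = [d for d in dimers if d[0] != d[1]]
```

Of the pairings this produces, the text assigns a tissue to the three homodimers and to the alpha/beta and alpha/gamma heterodimers only.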
Neuron-specific enolase is released in a variety of neurological diseases, such as multiple sclerosis and after seizures or acute stroke. Several tumour cells have also been found positive for neuron-specific enolase. Beta-enolase deficiency is associated with glycogenosis type XIII defect.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.
This family represents subunits called delta in bacterial and chloroplast ATPase, or OSCP (oligomycin sensitivity conferral protein) in mitochondrial ATPase (note that in mitochondria there is a different delta subunit). The OSCP/delta subunit appears to be part of the peripheral stalk that holds the F1 complex alpha3beta3 catalytic core stationary against the torque of the rotating central stalk, and links subunit A of the F0 complex with the F1 complex. In mitochondria, the peripheral stalk consists of OSCP, as well as F0 components F6, B and D. In bacteria and chloroplasts the peripheral stalks have different subunit compositions: delta and two copies of F0 component B (bacteria), or delta and F0 components B and B' (chloroplasts).
More information about this protein can be found at Protein of the Month: ATP Synthases.
This entry represents a family defined on the basis of sequence similarity. Most of these proteins are not yet characterised, but those that are include
Most members of this family are phosphoglycerate mutase. This enzyme interconverts 2-phosphoglycerate and 3-phosphoglycerate.
2-phospho-D-glycerate + 2,3-diphosphoglycerate = 3-phospho-D-glycerate + 2,3-diphosphoglycerate. The enzyme is transiently phosphorylated on an active site histidine by 2,3-diphosphoglycerate, which is both substrate and product. Some members of this family have phosphoglycerate mutase as a minor activity and act primarily as a bisphosphoglycerate mutase, interconverting 2,3-diphosphoglycerate and 1,3-diphosphoglycerate.
Ribose 5-phosphate isomerase, also known as phosphoriboisomerase, catalyses the conversion of D-ribose 5-phosphate to D-ribulose 5-phosphate in the non-oxidative branch of the pentose phosphate pathway. The pentose phosphate pathway is a target for chemotherapy against Chagas disease. This family of enzymes is coded for by two genes and is found in many taxa except the viruses. It is a highly conserved enzyme.
Actin is a ubiquitous protein involved in the formation of filaments that are major components of the cytoskeleton. These filaments interact with myosin to produce a sliding effect, which is the basis of muscular contraction and many aspects of cell motility, including cytokinesis. Each actin protomer binds one molecule of ATP and has one high affinity site for either calcium or magnesium ions, as well as several low affinity sites. Actin exists as a monomer in low salt concentrations, but filaments form rapidly as salt concentration rises, with the consequent hydrolysis of ATP. Actin from many sources forms a tight complex with deoxyribonuclease (DNase I) although the significance of this is still unknown. The formation of this complex results in the inhibition of DNase I activity, and actin loses its ability to polymerise. It has been shown that an ATPase domain of actin shares similarity with ATPase domains of hexokinase and hsp70 proteins.
In vertebrates there are three groups of actin isoforms: alpha, beta and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. In plants there are many isoforms which are probably involved in a variety of functions such as cytoplasmic streaming, cell shape determination, tip growth, graviperception, cell wall deposition, etc.
Recently some divergent actin-like proteins have been identified in several species. These proteins include centractin (actin-RPV) from mammals, fungi (yeast ACT5, Neurospora crassa ro-4) and Pneumocystis carinii, which seems to be a component of a multi-subunit centrosomal complex involved in microtubule-based vesicle motility (this subfamily is known as ARP1); the ARP2 subfamily, which includes chicken ACTL, Saccharomyces cerevisiae ACT2, Drosophila melanogaster 14D and Caenorhabditis elegans actC; the ARP3 subfamily, which includes actin 2 from mammals, Drosophila 66B, yeast ACT4 and Schizosaccharomyces pombe act2; and the ARP4 subfamily, which includes yeast ACT3 and Drosophila 13E.
P-ATPases (sometimes known as E1-E2 ATPases) are found in bacteria and in a number of eukaryotic plasma membranes and organelles. P-ATPases function to transport a variety of different compounds, including ions and phospholipids, across a membrane using ATP hydrolysis for energy. There are many different classes of P-ATPases, each of which transports a specific type of ion: H+, Na+, K+, Mg2+, Ca2+, Ag+ and Ag2+, Zn2+, Co2+, Pb2+, Ni2+, Cd2+, Cu+ and Cu2+. P-ATPases can be composed of one or two polypeptides, and can usually assume two main conformations called E1 and E2.
This entry represents several classes of P-type ATPases, including those that transport K+, Mg2+, Cd2+, Cu2+, Zn2+, Na+, Ca2+, Na+/K+, and H+/K+. These P-ATPases are found in both prokaryotes and eukaryotes.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II.
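The class assignment described above amounts to a simple lookup. The following is a minimal sketch in Python, assuming three-letter amino acid codes as keys; the names `CLASS_I`, `CLASS_II`, `synthetase_class` and `preferred_hydroxyl` are hypothetical and introduced only for illustration. The sets follow the lists given in the paragraph above and do not capture the a/b/c subclasses.

```python
# Amino acids whose synthetases the text assigns to each class
# (three-letter codes are an assumption made for this sketch).
CLASS_I = {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
CLASS_II = {"Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"}

def synthetase_class(amino_acid):
    """Return 'I' or 'II' for the synthetase specific to the given amino acid."""
    if amino_acid in CLASS_I:
        return "I"
    if amino_acid in CLASS_II:
        return "II"
    raise ValueError(f"unknown amino acid code: {amino_acid!r}")

def preferred_hydroxyl(amino_acid):
    """Return the tRNA hydroxyl the text gives for that synthetase class."""
    return "2'" if synthetase_class(amino_acid) == "I" else "3'"
```

For example, `synthetase_class("Val")` gives `"I"`, consistent with valyl-tRNA synthetase belonging to class Ia.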
Valyl-tRNA synthetase is an alpha monomer that belongs to class Ia.
Leucyl tRNA synthetase is an alpha monomer that belongs to class Ia. There are two different families of leucyl-tRNA synthetases. This family includes the eubacterial and mitochondrial synthetases. The crystal structure of leucyl-tRNA synthetase from the hyperthermophile Thermus thermophilus has an overall architecture that is similar to that of isoleucyl-tRNA synthetase, except that the putative editing domain is inserted at a different position in the primary structure. This feature is unique to prokaryote-like leucyl-tRNA synthetases, as is the presence of a novel additional flexibly inserted domain.
Tyrosyl-tRNA synthetase is an alpha2 dimer that belongs to class Ib. Studies on tyrosyl-tRNA synthetase provide the first kinetic evidence that the 'KMSKS' motif plays a role in the initial binding of tRNA(Tyr) to tyrosyl-tRNA synthetase.
Isoleucyl-tRNA synthetase is an alpha monomer that belongs to class Ia. The enzyme, isoleucyl-transfer RNA synthetase, activates not only the cognate substrate L-isoleucine but also the minimally distinct L-valine in the first, aminoacylation step. Then, in a second, "editing" step, the synthetase itself rapidly hydrolyses only the valylated products as shown from the crystal structures.
Macrophage migration inhibitory factor (MIF) is a key regulatory cytokine within innate and adaptive immune responses, capable of promoting and modulating the magnitude of the response. MIF is released from T-cells and macrophages, and acts within the neuroendocrine system. MIF is capable of tautomerase activity, although its biological function has not been fully characterised. It is induced by glucocorticoid and is capable of overriding the anti-inflammatory actions of glucocorticoid. MIF regulates cytokine secretion and the expression of receptors involved in the immune response. It can be taken up into target cells in order to interact with intracellular signalling molecules, inhibiting p53 function, and/or activating components of the mitogen-activated protein kinase and Jun-activation domain-binding protein-1 (Jab-1). MIF has been linked to various inflammatory diseases, such as rheumatoid arthritis and atherosclerosis.
The MIF homologue D-dopachrome tautomerase is involved in detoxification through the conversion of dopaminechrome (and possibly norepinephrinechrome), the toxic quinone product of the neurotransmitter dopamine (and norepinephrine), to an indole derivative that can serve as a precursor to neuromelanin.
Arginyl-tRNA synthetase has been crystallized and preliminary X-ray crystallographic analysis of yeast arginyl-tRNA synthetase-yeast tRNAArg complexes is available.
Ambler recognised four classes of cytC.
Class I includes the low-spin soluble cytC of mitochondria and bacteria, with the haem-attachment site towards the N-terminus, and the sixth ligand provided by a methionine residue about 40 residues further on towards the C-terminus. On the basis of sequence similarity, class I cytC were further subdivided into five classes, IA to IE. Class IB includes the eukaryotic mitochondrial cyt C and prokaryotic 'short' cyt C2 exemplified by Rhodopila globiformis cyt C2; Class IA includes 'long' cyt C2, such as Rhodospirillum rubrum cyt C2 and Aquaspirillum itersonii cyt C-550, which have several extra loops by comparison with Class IB cyt C.
The 3D structures of a considerable number of class IA and IB cytC have been determined. The proteins consist of 3-6 alpha-helices; the three most conserved 'core' helices form a 'basket' around the haem group, with one haem edge exposed to the solvent. Most class I cytC have conserved aromatic residues clustered around the haem and axial ligands.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
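The difference between the loose HEXXH motif and the more stringent 'abXHEbbHbc' definition can be illustrated with regular expressions. The following is a minimal sketch in Python under stated assumptions: the residue classes `UNCHARGED` and `HYDROPHOBIC` are one possible reading of "uncharged" and "hydrophobic" and are not defined by the text; proline is excluded from every position of the strict pattern; position 'a' is left open because valine or threonine is only "most often" found there. The name `find_motifs` and the test sequences are hypothetical.

```python
import re

# Loose motif: His-Glu-X-X-His, where X is any residue.
HEXXH = re.compile(r"HE..H")

# Assumed residue classes (one-letter codes); not taken from the text.
UNCHARGED = "ACFGILMNQSTVWY"        # excludes D, E, K, R, H and P
HYDROPHOBIC = "ACFILMVWY"
ANY_BUT_PRO = "ACDEFGHIKLMNQRSTVWY"  # any standard residue except proline

# Stricter motif 'abXHEbbHbc' built from the assumed classes above.
STRICT = re.compile(
    f"[{ANY_BUT_PRO}][{UNCHARGED}][{ANY_BUT_PRO}]HE"
    f"[{UNCHARGED}]{{2}}H[{UNCHARGED}][{HYDROPHOBIC}]"
)

def find_motifs(seq, pattern):
    """Return 0-based start positions of all matches, including overlapping ones."""
    lookahead = re.compile(f"(?={pattern.pattern})")
    return [m.start() for m in lookahead.finditer(seq.upper())]

# Illustrative toy sequences, not real protein records.
toy = "GGVVAHELTHAVGG"
toy_with_proline = "GGVVAHEPTHAVGG"
loose_hits = find_motifs(toy, HEXXH)
strict_hits = find_motifs(toy, STRICT)
strict_hits_pro = find_motifs(toy_with_proline, STRICT)
```

The strict pattern starts three residues upstream of the loose one, so a hit reported at position n by `STRICT` corresponds to a hit at n + 3 by `HEXXH`.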
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
The majority of members of this family are zinc-dependent exopeptidases belonging to MEROPS peptidase family M17 (leucyl aminopeptidase, clan MF). This family excludes pepB aminopeptidases, which are also members of MEROPS family M17.
Leucyl aminopeptidases (LAPs) selectively release N-terminal amino acid residues from polypeptides and proteins; in general they are involved in the processing, catabolism and degradation of intracellular proteins. Leucyl aminopeptidase forms a homohexamer containing two trimers stacked on top of one another. Each monomer binds two zinc ions. The zinc-binding and catalytic sites are located within the C-terminal catalytic domain. Leucine aminopeptidase has been shown to be identical with prolyl aminopeptidase in mammals.
Interestingly, members of this group are also implicated in transcriptional regulation and are thought to combine catalytic and regulatory properties. The N-terminal domain of these proteins has been shown in Escherichia coli PepA to function as a DNA-binding protein in Xer site-specific recombination and in transcriptional control of the carAB operon. It is not well conserved and in some members can be found only by PSI-BLAST (after 4-6 iterations). It is not clear if the DNA binding function is preserved in all or even in most of the members.
S-adenosylmethionine synthetase (MAT) is the enzyme that catalyzes the formation of S-adenosylmethionine (AdoMet) from methionine and ATP. AdoMet is an important methyl donor for transmethylation and is also the propylamino donor in polyamine biosynthesis.
In bacteria there is a single isoform of AdoMet synthetase (gene metK); there are two in budding yeast (genes SAM1 and SAM2) and in mammals, while in plants there is generally a multigene family.
The sequence of AdoMet synthetase is highly conserved throughout isozymes and species. The active sites of both the Escherichia coli and rat liver MAT reside between two subunits, with contributions from side chains of residues from both subunits, resulting in a dimer as the minimal catalytic entity. The side chains that contribute to the ligand binding sites are conserved between the two proteins. In the structures of complexes with the E. coli enzyme, the phosphate groups have the same positions in the (PPi plus Pi) complex and the (ADP plus Pi) complex, and are located at the bottom of a deep cavity with the adenosyl group nearer the entrance.
Aminotransferases share certain mechanistic features with other pyridoxalphosphate-dependent enzymes, such as the covalent binding of the pyridoxalphosphate group to a lysine residue. On the basis of sequence similarity, these various enzymes can be grouped into subfamilies. One of these, called class-III, includes acetylornithine aminotransferase, which catalyzes the transfer of an amino group from acetylornithine to alpha-ketoglutarate, yielding N-acetyl-glutamic-5-semi-aldehyde and glutamic acid; ornithine aminotransferase, which catalyzes the transfer of an amino group from ornithine to alpha-ketoglutarate, yielding glutamic-5-semi-aldehyde and glutamic acid; omega-amino acid--pyruvate aminotransferase, which catalyzes transamination between a variety of omega-amino acids, mono- and diamines, and pyruvate; 4-aminobutyrate aminotransferase (GABA transaminase), which catalyzes the transfer of an amino group from GABA to alpha-ketoglutarate, yielding succinate semialdehyde and glutamic acid; DAPA aminotransferase, a bacterial enzyme (bioA), which catalyzes an intermediate step in the biosynthesis of biotin, the transamination of 7-keto-8-aminopelargonic acid to form 7,8-diaminopelargonic acid; 2,2-dialkylglycine decarboxylase, a Burkholderia cepacia (Pseudomonas cepacia) enzyme (dgdA) that catalyzes the decarboxylating amino transfer of 2,2-dialkylglycine and pyruvate to dialkyl ketone, alanine and carbon dioxide; glutamate-1-semialdehyde aminotransferase (GSA); Bacillus subtilis aminotransferases yhxA and yodT; Haemophilus influenzae aminotransferase HI0949; and Caenorhabditis elegans aminotransferase T01B11.2.
Ornithine aminotransferase catalyses the conversion of L-ornithine and a 2-oxo acid to L-glutamate 5-semialdehyde and an L-amino acid. This enzyme is found in low-GC bacteria, where it is responsible for the fourth step in arginine biosynthesis, and in the mitochondrial matrix of eukaryotes, where it controls L-ornithine levels in tissues. In human hereditary ornithine aminotransferase deficiency, elevated intraocular concentrations of ornithine are responsible for gyrate atrophy, which affects the CNS and peripheral nervous system.
Mammalian translationally controlled tumour protein (TCTP) (or P23) is a protein which has been found to be preferentially synthesised in cells during the early growth phase of some types of tumour, but which is also expressed in normal cells. The physiological function of TCTP is still not known. It was first identified as a histamine-releasing factor, acting in IgE-dependent allergic reactions. In addition, TCTP has been shown to bind to tubulin in the cytoskeleton, has a high affinity for calcium, is the binding target for the antimalarial compound artemisinin, and is induced in vitamin D-dependent apoptosis. TCTP production is thought to be controlled at the translational as well as the transcriptional level.
TCTP is a hydrophilic protein of 18 to 20 Kd. TCTPs do not share significant sequence similarity with any other class of proteins. Recently, the structure of TCTP was determined and exhibited significant structural similarity to the human protein Mss4, which is a guanine nucleotide-free chaperone of the Rab protein. Close homologues have been found in plants, earthworm, Caenorhabditis elegans (F52H2.11), Hydra, Saccharomyces cerevisiae (YKL056c) and Schizosaccharomyces pombe (SpAC1F12.02c).
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein L5 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L5 is known to be involved in binding 5S RNA to the large ribosomal subunit. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities.
L5 is a protein of about 180 amino-acid residues.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.
This group of cysteine peptidases belongs to the MEROPS peptidase family C13 (legumain family, clan CD). A type example is legumain from Canavalia ensiformis (Jack bean, Horse bean). The blood fluke parasite Schistosoma mansoni has two cysteine proteases in its digestive tract, one a cathepsin B-like protease, the other termed hemoglobinase. The latter has been hard to purify free of cathepsin B, and forms expressed in Escherichia coli prove to be inactive, suggesting that hemoglobinase may act in association with cathepsin B. Plant vacuolar processing enzyme and legumain from legumes have been shown to have sequence and functional similarity to hemoglobinase. The catalytic residues of the family are currently unknown, but sequence alignments reveal one totally conserved cysteine and two totally conserved histidines.
The signal recognition particle (SRP) is a multimeric protein, which along with its cognate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.
This entry represents the 14 kDa SRP14 component. Both SRP9 and SRP14 have the same (beta)-alpha-beta(3)-alpha fold. The heterodimer has pseudo two-fold symmetry and is saddle-like, consisting of a curved six-stranded beta-sheet that has four helices packed on the convex side and an exposed concave surface lined with positively charged residues. The SRP9/SRP14 heterodimer is essential for SRP RNA binding, mediating the pausing of synthesis of ribosome-associated nascent polypeptides that have been engaged by the targeting domain of SRP.
The cytochrome bd type terminal oxidases catalyse quinol-dependent, Na+-independent oxygen uptake. Members of this family are integral membrane proteins and contain a protohaem IX centre, B558.
Cytochrome bd may play an important role in microaerobic nitrogen fixation in the enteric bacterium Klebsiella pneumoniae, where it is expressed under all conditions that permit diazotrophy.
The 14 kDa (or VI) subunit of the complex is not directly involved in electron transfer, but has a role in assembly of the complex.
This family contains the predominant bacterial/eukaryotic adenylyltransferases for nicotinamide-nucleotide and for the deamido form, nicotinate nucleotide. Nicotinamide-nucleotide adenylyltransferase synthesizes NAD by the salvage pathway, while nicotinate-nucleotide adenylyltransferase synthesizes the immediate precursor of NAD by the de novo pathway.
This family includes the yeast and human ASF1 protein. These proteins have histone chaperone activity. ASF1 participates in both the replication-dependent and replication-independent pathways. The three-dimensional structure has been determined as a compact immunoglobulin-like beta sandwich fold topped by three helical linkers.
Allantoicase (also known as allantoate amidinohydrolase) is involved in purine degradation, facilitating the utilization of purines as secondary nitrogen sources under nitrogen-limiting conditions. While purine degradation converges to uric acid in all vertebrates, its further degradation varies from species to species. Uric acid is excreted by birds, reptiles, and some mammals that do not have a functional uricase gene, whereas other mammals produce allantoin. Amphibians and microorganisms produce ammonia and carbon dioxide using the uricolytic pathway. Allantoicase performs the second step in this pathway, catalyzing the conversion of allantoate into ureidoglycolate and urea.
allantoate + H(2)O = (S)-ureidoglycolate + urea
The structure of allantoicase is best described as being composed of two repeats (the allantoicase repeats: AR1 and AR2), which are connected by a flexible linker. The crystal structure, solved at 2.4 Å resolution, reveals that AR1 has a very similar fold to AR2, both repeats being jelly-roll motifs, composed of four-stranded and five-stranded antiparallel beta-sheets. Each jelly-roll motif has two conserved surface patches that probably constitute the active site.
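As a quick sanity check on the reaction above, the atom balance can be sketched in a few lines of Python. The molecular formulas below are an assumption added for illustration (neutral, fully protonated forms; they are not stated in the entry), and the helper names are hypothetical.

```python
from collections import Counter

# Assumed neutral (fully protonated) molecular formulas; illustrative only,
# not taken from the entry text.
FORMULAS = {
    "allantoate": {"C": 4, "H": 8, "N": 4, "O": 4},
    "H2O": {"H": 2, "O": 1},
    "(S)-ureidoglycolate": {"C": 3, "H": 6, "N": 2, "O": 4},
    "urea": {"C": 1, "H": 4, "N": 2, "O": 1},
}


def atom_totals(species):
    """Sum the atom counts over a list of species names."""
    totals = Counter()
    for name in species:
        totals.update(FORMULAS[name])  # Counter.update adds counts
    return totals


def is_balanced(substrates, products):
    """Return True when both sides of the reaction carry the same atoms."""
    return atom_totals(substrates) == atom_totals(products)


substrates = ["allantoate", "H2O"]
products = ["(S)-ureidoglycolate", "urea"]
print(is_balanced(substrates, products))
```

Under these assumed formulas both sides sum to the same atom counts, which is consistent with a simple hydrolysis that releases one urea per allantoate.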
Thioredoxins are small disulphide-containing redox proteins that have been found in all the kingdoms of living organisms. Thioredoxin serves as a general protein disulphide oxidoreductase. It interacts with a broad range of proteins by a redox mechanism based on reversible oxidation of 2 cysteine thiol groups to a disulphide, accompanied by the transfer of 2 electrons and 2 protons. The net result is the covalent interconversion of a disulphide and a dithiol.
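The redox mechanism described above can be written out as a half-reaction and a net exchange. This is only a schematic sketch: the labels Trx and P (for thioredoxin and a generic substrate protein) are notation introduced here for illustration.

```latex
% Half-reaction: two cysteine thiols are oxidised to one disulphide
\mathrm{Trx{-}(SH)_2} \;\rightleftharpoons\; \mathrm{Trx{-}S_2} + 2\,e^{-} + 2\,\mathrm{H^{+}}

% Net result: interconversion of a disulphide and a dithiol
\mathrm{Trx{-}(SH)_2} + \mathrm{P{-}S_2} \;\rightleftharpoons\; \mathrm{Trx{-}S_2} + \mathrm{P{-}(SH)_2}
```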
Compared to human thioredoxin, human U5 snRNP-specific protein U5-15kD contains 37 additional residues that may cause structural changes which most likely form putative binding sites for other spliceosomal proteins or RNA. Although U5-15kD apparently lacks protein disulphide isomerase activity, it is strictly required for pre-mRNA splicing.
Many eukaryotes possess polynucleotide kinase phosphatase (PNKP), a bifunctional enzyme with 5'-kinase and 3'-phosphatase activities provided by two non-overlapping catalytic domains. These proteins catalyse the dephosphorylation of DNA 3'-phosphates. It is believed that this activity is important for the repair of single-strand breaks in DNA caused by radiation or oxidative damage. Mammalian polynucleotide kinase phosphatase (PNKP) is a key component of both the base excision repair (BER) and nonhomologous end-joining (NHEJ) DNA repair pathways. PNKP creates 5'-phosphate/3'-hydroxyl termini, which are a necessary prerequisite for ligation during repair. PNKP is recruited to repair complexes through interactions between its N-terminal FHA domain and phosphorylated components of either pathway.
Synonym(s): PNKP, PNK
SKIP (SKI-interacting protein) is an essential spliceosomal component and transcriptional coregulator, which may provide regulatory coupling of transcription initiation and splicing. SKIP was identified in a yeast 2-hybrid screen, where it was shown to interact with both the cellular and viral forms of SKI through the highly conserved region on SKIP known as the SNW domain. SKIP is now known to interact with a number of other proteins as well. SKIP potentiates the activity of important transcription factors, such as vitamin D receptor, CBF1 (RBP-Jkappa), Smad2/3, and MyoD. It works with Ski in overcoming pRb-mediated cell cycle arrest, and it is targeted by the viral transactivators EBNA2 and E7.
Mutations in the nucleotide excision repair (NER) pathway can cause the xeroderma pigmentosum skin cancer predisposition syndrome. NER lesions are limited to one DNA strand, but otherwise they are chemically and structurally diverse, being caused by a wide variety of genotoxic chemicals and ultraviolet radiation. The xeroderma pigmentosum C (XPC) protein has a central role in initiating global-genome NER by recognizing the lesion and recruiting downstream factors.
In NER in eukaryotes, DNA is incised on both sides of the lesion, resulting in the removal of a fragment ~25-30 nucleotides long. This is followed by repair synthesis and ligation. This reaction, in yeast, requires the damage binding factors Rad14, RPA, and the Rad4-Rad23 complex, the transcription factor TFIIH which contains the two DNA helicases Rad3 and Rad25, essential for creating a bubble structure, and the two endonucleases, the Rad1-Rad10 complex and Rad2, which incise the damaged DNA strand on the 5'- and 3'-side of the lesion, respectively.
The crystal structure of the yeast XPC orthologue Rad4 bound to DNA containing a cyclobutane pyrimidine dimer lesion has been determined. The structure shows that Rad4 inserts a beta-hairpin through the DNA duplex, causing the two damaged base pairs to flip out of the double helix. The expelled nucleotides of the undamaged strand are recognized by Rad4, whereas the two cyclobutane pyrimidine dimer-linked nucleotides become disordered. This indicates that the lesions recognised by Rad4/XPC thermodynamically destabilize the double helix in a manner that facilitates the flipping-out of two base pairs.
Homologues of all the above mentioned yeast genes, except for RAD7, RAD16, and MMS19, have been identified in humans, and mutations in these human genes affect NER in a similar fashion as they do in yeast, with the exception of XPC, the human counterpart of yeast RAD4. Deletion of RAD4 causes the same high level of UV sensitivity as do mutations in the other class 1 genes, and rad4 mutants are completely defective in incision. By contrast, XPC is required for the repair of nontranscribed regions of the genome but not for the repair of the transcribed DNA strand.
This entry describes proteins of unknown function.
This family is involved in biogenesis of respiratory and photosynthetic systems. In yeast the SCO1 protein is specifically required for a post-translational step in the accumulation of subunits 1 and 2 of cytochrome c oxidase (COX-I and COX-II). It is a mitochondrion-associated cytochrome c oxidase assembly factor.
The purple nonsulphur photosynthetic eubacterium Rhodobacter capsulatus is a versatile organism that can obtain cellular energy by several means, including the capture of light energy for photosynthesis as well as the use of light-independent respiration, in which molecular oxygen serves as a terminal electron acceptor. The SenC protein is required for optimal cytochrome c oxidase activity in aerobically grown R. capsulatus cells and is involved in the induction of structural polypeptides of the light-harvesting and reaction centre complexes.
Alg14 is involved in dolichol-linked oligosaccharide biosynthesis and anchors the catalytic subunit Alg13 to the ER membrane.
ATPase family gene 1 (AFG1) ATPase is a 377 amino acid putative protein with an ATPase motif typical of the protein family including SEC18p, PAS1, CDC48-VCP and TBP. AFG1 also has substantial homology to these proteins outside the ATPase domain. This family of proteins contains a P-loop motif.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described.
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.
This group of sequences contains aspartic endopeptidases belonging to MEROPS peptidase family A22 (presenilin family, clan AD), subfamily A22B.
The peptidases were originally classified by hierarchical homology to the most conserved member, IMPAS 1. They are also known as signal peptide peptidase (SPP). They belong to the I-CliP family of peptidases. SPP cleaves remnant signal peptides left behind in the membrane by the action of signal peptidase and also plays key roles in immune surveillance and the maturation of certain viral proteins. SPPs do not require cofactors, as demonstrated by expression in bacteria and purification of a proteolytically active form. The C-terminal region defines the functional domain, which is in itself sufficient for proteolytic activity.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein L16 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L16 is known to bind directly to the 23S rRNA and to be located at the A site of the peptidyltransferase centre. L16 is a protein of 133 to 185 amino-acid residues.
This is a family of eukaryotic proteins, many of which are believed to be involved in cell adhesion. Members are involved in gastrulation and also in metastasis formation and the progression of cancer. Experimental evidence suggests that these proteins are transmembrane and possibly glycoproteins.
The movement of lipid and protein components between intracellular organelles requires the regulated interactions of many molecules. Vacuolar protein sorting-associated protein (Vps)5 is a yeast protein that is a subunit of a large multimeric complex, termed the retromer complex, involved in retrograde transport of proteins from endosomes to the trans-Golgi network. Sorting nexin (SNX) 1 and SNX2 are its mammalian orthologs.
To carry out its biological functions, Vps5 forms the retromer complex with at least four other proteins: Vps17, Vps26, Vps29, and Vps35. This family of Vps26-proteins also contains Down syndrome critical region 3/A.
Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). Tubulin-tyrosine ligase (TTL) catalyses the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. The true physiological function of TTL has so far not been established. In normally cycling cells, the tyrosinated form of tubulin predominates. However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumour aggressiveness.
3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis.
Rcd1 (Required for cell differentiation 1)-like proteins are found among a wide range of organisms. Rcd1 was initially identified as an essential factor in nitrogen starvation-invoked differentiation in fission yeast. This results largely from a defect in nitrogen starvation-invoked induction of ste11+, a key transcription factor gene required for the onset of sexual development. It is one of the most conserved proteins in eukaryotes, and its mammalian homologue is expressed in a variety of differentiating tissues. The mammalian Rcd1 is a novel transcriptional cofactor and is critical for retinoic acid-induced differentiation of F9 mouse teratocarcinoma cells, at least in part, via forming complexes with retinoic acid receptor and activating transcription factor-2 (ATF-2). Two of the members in this family have been characterised as being involved in regulation of Ste11-regulated sex genes.
This family includes members from a wide variety of eukaryotes. It includes the TB2/DP1 (deleted in polyposis) protein which in human is deleted in severe forms of familial adenomatous polyposis, an autosomal dominant oncological inherited disease.
The family also includes the plant protein of known similarity to TB2/DP1, the HVA22 abscisic acid-induced protein (e.g. Q07764), which is thought to be a regulatory protein.
This family features sequences that are similar to a region of hypothetical yeast gene product N2227. This is thought to be expressed during meiosis and may be involved in the defence response to stressful conditions.
This family consists of several LUC7 protein homologues that are restricted to eukaryotes. LUC7 has been shown to be a U1 snRNA associated protein with a role in splice site recognition. The entry contains human and mouse LUC7 like (LUC7L) proteins and human cisplatin resistance-associated overexpressed protein (CROP).
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.
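The relationship between linear triad order and clan can be illustrated with a small lookup, sketched here in Python. The three orders come from the paragraph above; the function names are hypothetical, and the residue numbers in the usage lines are the commonly cited numbering for chymotrypsin and subtilisin, added as an assumption since the entry does not give them. The order alone is only suggestive of a clan, not proof of membership.

```python
# Linear order of the catalytic triad (N- to C-terminus) for the three
# clans named in the text above.
CLAN_BY_ORDER = {"HDS": "PA", "DHS": "SB", "SDH": "SC"}


def triad_order(positions):
    """Return the one-letter residue codes sorted by sequence position.

    positions maps a one-letter code ('S', 'D', 'H') to its residue number.
    """
    return "".join(sorted(positions, key=positions.get))


def guess_clan(positions):
    """Map a triad order to a clan, or None if the order is not listed."""
    return CLAN_BY_ORDER.get(triad_order(positions))


# Assumed residue numbering, for illustration only.
chymotrypsin = {"H": 57, "D": 102, "S": 195}
subtilisin = {"D": 32, "H": 64, "S": 221}

print(triad_order(chymotrypsin), guess_clan(chymotrypsin))
print(triad_order(subtilisin), guess_clan(subtilisin))
```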
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of serine peptidases belongs to MEROPS peptidase family S26 (signal peptidase I family, clan SF), subfamily S26A.
At least 3 eubacterial leader peptidases are known: murein prelipoprotein peptidase, which cleaves the leader peptide from a component of the bacterial outer membrane; type IV prepilin leader peptidase; and the serine-dependent leader peptidase 1, which has the more general role of cleaving the leader peptide from a variety of secreted proteins and proteins directed to the periplasm and periplasmic membrane. Leader peptidase 1 is similar to the eukaryotic signal peptidase, although the bacterial protein is monomeric, while the eukaryotic protein is multimeric.
Mitochondria contain a similar two-subunit serine protease that removes leader peptides from nuclear- and mitochondrial-encoded proteins, which localise in the inner mitochondrial space. The catalytic residues of a number of these peptidases have been identified as a serine/lysine dyad.
Intracellular proteins, including short-lived proteins such as cyclin, Mos, Myc, p53, NF-kappaB, and IkappaB, are degraded by the ubiquitin-proteasome system. The 26S proteasome is a self-compartmentalising protease responsible for the regulated degradation of intracellular proteins in eukaryotes. This giant intracellular protease is formed by several subunits arranged into two 19S polar caps, where protein recognition and ATP-dependent unfolding occur, flanking a 20S central barrel-shaped structure with an inner proteolytic chamber. This overall structure is highly conserved among eukaryotes and is essential for cell viability. Proteins targeted to the 26S proteasome are conjugated with a polyubiquitin chain by an enzymatic cascade before delivery to the 26S proteasome for degradation into oligopeptides.
The 19S component is divided into a "base" subunit containing six ATPases (Rpt proteins) and two non-ATPases (Rpn1, Rpn2), and a "lid" subunit composed of eight stoichiometric proteins (Rpn3, Rpn5, Rpn6, Rpn7, Rpn8, Rpn9, Rpn11, Rpn12). Additional non-essential and species-specific proteins may also be present. The 19S unit performs several essential functions including binding the specific protein substrates, unfolding them, cleaving the attached ubiquitin chains, opening the 20S subunit, and driving the unfolded polypeptide into the proteolytic chamber for degradation. The 26S proteasome and 19S regulator are of medical interest due to their involvement in burn rehabilitation.
This entry represents Rpn12 (also often annotated as 26S proteasome non-ATPase regulatory subunit 8). This protein has been shown to be important for the transition from metaphase to anaphase and the activation of Cdc28p kinase in yeast.
ArgRIII has been demonstrated to be an inositol polyphosphate kinase which catalyses the reaction:
ATP + 1D-myo-inositol 1,4,5-trisphosphate = ADP + 1D-myo-inositol 1,3,4,5-tetrakisphosphate.
Sedlin is a 140 amino-acid protein with a putative role in endoplasmic reticulum-to-Golgi transport. Several missense mutations and deletion mutations in the SEDL gene, which result in protein truncation by frame shift, are responsible for spondyloepiphyseal dysplasia tarda, a progressive skeletal disorder (OMIM:313400).
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.
This group of cysteine peptidases belongs to MEROPS peptidase family C1, sub-family C1A (papain family, clan CA). It includes proteins classed as non-peptidase homologues. These have either been shown experimentally to lack peptidase activity or lack one or more of the active site residues.
The papain family has a wide variety of activities, including broad-range (papain) and narrow-range endo-peptidases, aminopeptidases, dipeptidyl peptidases and enzymes with both exo- and endo-peptidase activity. Members of the papain family are widespread, found in baculovirus, eubacteria, yeast, and practically all protozoa, plants and mammals. The proteins are typically lysosomal or secreted, and proteolytic cleavage of the propeptide is required for enzyme activation, although bleomycin hydrolase is cytosolic in fungi and mammals. Papain-like cysteine proteinases are essentially synthesised as inactive proenzymes (zymogens) with N-terminal propeptide regions. The activation process of these enzymes includes the removal of propeptide regions. The propeptide regions serve a variety of functions in vivo and in vitro. The pro-region is required for the proper folding of the newly synthesised enzyme, the inactivation of the peptidase domain and stabilisation of the enzyme against denaturing at neutral to alkaline pH conditions. Amino acid residues within the pro-region mediate their membrane association, and play a role in the transport of the proenzyme to lysosomes. Among the most notable features of propeptides is their ability to inhibit the activity of their cognate enzymes and that certain propeptides exhibit high selectivity for inhibition of the peptidases from which they originate.
The catalytic residues of papain are Cys-25 and His-159, other important residues being Gln-19, which helps form the 'oxyanion hole', and Asn-175, which orientates the imidazole ring of His-159.
This family of proteins is required for the insertion of integral membrane proteins into cellular membranes. Many of these integral membrane proteins are associated with respiratory chain complexes, for example a large number of members of this family play an essential role in the activity and assembly of cytochrome c oxidase.
Stage III sporulation protein J (SP3J) is a probable lipoprotein, rich in basic and hydrophobic amino acids. Mutations in the protein abolish the transcription of prespore-specific genes transcribed by the sigma G form of RNA polymerase. SP3J could be involved in a signal transduction pathway coupling gene expression in the prespore to events in the mother cell, or it may be necessary for essential metabolic interactions between the two cells. The protein shows a high degree of similarity to Bacillus subtilis YQJG, to yeast OXA1 and also to bacterial 60 kDa inner-membrane proteins.
Paf1 is an RNA polymerase II-associated protein in yeast, which defines a complex that is distinct from the Srb/Mediator holoenzyme. The Paf1 complex, which also contains Cdc73, Ctr9, Hpr1, Ccr4, Rtf1 and Leo1, is required for full expression of a subset of yeast genes, particularly those responsive to signals from the Pkc1/MAP kinase cascade. The complex appears to play an essential role in RNA elongation.
This is a family of eukaryotic ER membrane proteins that are involved in the synthesis of glycosylphosphatidylinositol (GPI), a glycolipid that anchors many proteins to the eukaryotic cell surface. Proteins in this family are involved in transferring the second mannose in the biosynthetic pathway of GPI.
The redox active metal copper is an essential cofactor in critical biological processes such as respiration, iron transport, oxidative stress protection, hormone production, and pigmentation. A widely conserved family of high-affinity copper transport proteins (Ctr proteins) mediates copper uptake at the plasma membrane. A series of clustered methionine residues in the hydrophilic extracellular domain, and an MXXXM motif in the second transmembrane domain, are important for copper uptake. These methionines probably coordinate copper during the process of metal transport.
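The MXXXM motif mentioned above (a methionine, any three residues, then a second methionine) can be located in a protein sequence with a simple pattern scan. The following is a minimal sketch, not part of the original annotation: the function name `find_mxxxm` and the example fragment are hypothetical, and the scan does not test whether a match actually lies in the second transmembrane domain.

```python
import re

# M, any three residues, M. The lookahead lets overlapping matches be reported.
MXXXM = re.compile(r"(?=(M[A-Z]{3}M))")


def find_mxxxm(sequence):
    """Return (1-based start position, matched text) for each MXXXM occurrence."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in MXXXM.finditer(seq)]


# Hypothetical sequence fragment, invented purely for illustration.
example = "LLAMGSVMAYFF"
print(find_mxxxm(example))
```

A real analysis would restrict the scan to the predicted transmembrane segment, which this sketch leaves to the caller.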
OPA3 deficiency causes type III 3-methylglutaconic aciduria (MGA) in humans. This disease manifests with early bilateral optic atrophy, spasticity, extrapyramidal dysfunction, ataxia, and cognitive deficits, but normal longevity.
This family consists of several optic atrophy 3 (OPA3) proteins and related proteins from other eukaryotic species; their function is unknown.
This is a family of proteins of unknown function.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. One of these families, the S26E family, includes mammalian S26; Octopus S26; Drosophila S26 (DS31); plant cytoplasmic S26; and fungal S26. These proteins have 114 to 127 amino acids.
The complex organic chemistry involved in the transformation of GTP to tetrahydrobiopterin is catalysed by only three enzymes: GTP cyclohydrolase I, 6-pyruvoyltetrahydropterin synthase and sepiapterin reductase. Tetrahydrobiopterin is the cofactor for several aromatic amino acid monooxygenases and the nitric oxide synthases. 6-Pyruvoyl tetrahydropterin synthase (PTPS) is a Zn-dependent metalloprotein that transforms dihydroneopterin triphosphate into 6-pyruvoyltetrahydropterin in the presence of Mg(II); its crystal structure is known.
The enzyme is a homohexamer, composed of a dimer of trimers. A transition metal binding site formed by the three histidine residues 23, 48 and 50 is present in each subunit, and bound Zn(II) is responsible for the enzymatic activity. Site-directed mutagenesis of each of these three histidine residues results in a complete loss of metal binding and enzymatic activity.
The function of the bacterial branch of the sequence lineage appears not to have been established.
DCoH is the dimerisation cofactor of hepatocyte nuclear factor 1 (HNF-1) that functions as both a transcriptional coactivator and a pterin dehydratase. X-ray crystallographic studies have shown that the ligand binds at four sites per tetrameric enzyme, with little apparent conformational change in the protein.
The U2 small nuclear ribonucleoprotein auxiliary factor (U2AF) is a heterodimeric splicing factor composed of a large and a small subunit. The large U2AF subunit recognises the intronic polypyrimidine tract, a sequence located adjacent to the 3' splice site that serves as an important signal for both constitutive and regulated pre-mRNA splicing. The small subunit interacts with the 3' splice site dinucleotide AG and is essential for regulated splicing. The subunits shuttle continuously between the nucleus and the cytoplasm via a mechanism that involves carrier receptors and is independent of binding to mRNA. Both subunits contain an arginine/serine-rich (RS) domain, which acts as a nuclear localisation signal. Furthermore, the presence of an RS domain on either subunit is sufficient to trigger the nucleocytoplasmic import of the heterodimeric complex.
The human form of the U2 auxiliary factor small subunit, hU2AF35, contains a degenerate RNA recognition motif (RRM) and a C-terminal RS domain. The murine form has been shown to be genomically imprinted with monoallelic expression from the paternal allele. However, this is not the case in humans.
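The two sequence signals described for U2AF, a polypyrimidine tract recognised by the large subunit and the adjacent AG dinucleotide contacted by the small subunit, can be illustrated with a sliding-window scan. This is only a sketch under stated assumptions: the function name `candidate_3prime_sites`, the 15-nucleotide window and the 0.8 pyrimidine threshold are hypothetical choices made for illustration and are not taken from this entry, and the example sequence is invented.

```python
def candidate_3prime_sites(rna, window=15, min_pyrimidine=0.8):
    """Return (0-based index of the A in AG, upstream pyrimidine fraction).

    An AG is reported only when the `window` nucleotides immediately
    upstream of it are at least `min_pyrimidine` C or U. Both parameters
    are illustrative assumptions, not measured values.
    """
    seq = rna.upper().replace("T", "U")  # accept DNA or RNA alphabets
    hits = []
    for i in range(window, len(seq) - 1):
        if seq[i:i + 2] != "AG":
            continue
        upstream = seq[i - window:i]
        fraction = sum(base in "CU" for base in upstream) / window
        if fraction >= min_pyrimidine:
            hits.append((i, fraction))
    return hits


# Invented sequence: a 15-nt pyrimidine run directly followed by AG.
example = "GGA" + "UCUUCCUUUCUCUUC" + "AG" + "GU"
print(candidate_3prime_sites(example))
```

The scan is purely compositional; it says nothing about whether U2AF actually binds a given site.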
Members of this family are essential for 40S ribosomal biogenesis. They play a role in the methylation reaction of pre-rRNA processing. The structure of EMG1 has revealed that it is a novel member of the superfamily of alpha/beta knot fold methyltransferases.
This entry is for the ribosomal protein S30.
Proteins in this entry are found in archaea, bacteria and eukaryotes. Their function is unknown, but alignment shows several conserved polar residues which are potential catalytic residues. The structure of one of these proteins has been determined and shows homology to heat shock protein 33, which is a chaperone protein that inhibits the aggregation of partially denatured proteins.
MAT1 (menage a trois 1) is a RING finger protein with a characteristic C3HC4 motif located in the N-terminal domain. MAT1 stabilises the cyclin H-CDK7 complex to form a functional CDK-activating kinase (CAK) enzymatic complex which then goes on to activate many of the CDK enzymes intimately involved in the cell cycle. CDK7 forms a stable complex with cyclin H and MAT1 in vivo only when phosphorylated on either one of two residues (Ser164 or Thr170) in its T-loop. The requirement for MAT1 for the activation of CAK can be by-passed by the phosphorylation of CDK7 on the T-loop. The two mechanisms for CDK7 complex stabilisation and activation (MAT1 addition and T-loop phosphorylation), which can operate independently in vitro, actually cooperate under physiological conditions to maintain complex integrity. With prolonged exposure to elevated temperature, dissociation to monomeric subunits occurs in vivo when CDK7 is dephosphorylated, even in the presence of MAT1.
The Cyclin H-MAT1-CDK7 complex also forms part of TFIIH, a multiprotein complex required for both transcription and DNA repair.
This is a small family of mainly hypothetical proteins of unknown function.
Transcription factor IIA (TFIIA) is one of several factors that form part of a transcription pre-initiation complex along with RNA polymerase II, the TATA-box-binding protein (TBP) and TBP-associated factors, on the TATA-box sequence upstream of the initiation start site. After initiation, some components of the pre-initiation complex (including TFIIA) remain attached and re-initiate a subsequent round of transcription. TFIIA binds to TBP to stabilise TBP binding to the TATA element. TFIIA also inhibits the cytokine HMGB1 (high mobility group 1 protein) binding to TBP, and can dissociate HMGB1 already bound to TBP/TATA-box.
Human and Drosophila TFIIA have three subunits: two large subunits, LN/alpha and LC/beta, derived from the same gene, and a small subunit, S/gamma. Yeast TFIIA has two subunits: a large TOA1 subunit that shows sequence similarity to the N-terminal of LN/alpha and the C-terminal of LC/beta, and a small subunit, TOA2 that is highly homologous with S/gamma. The conserved regions of the large and small subunits of TFIIA combine to form two domains: a four-helix bundle (helical domain) composed of two helices from each of the N-terminal regions of TOA1 and TOA2 in yeast; and a beta-barrel (beta-barrel domain) composed of beta-sheets from the C-terminal regions of TOA1 and TOA2.
This entry represents the precursor that yields both the alpha and beta subunits of TFIIA. The TFIIA heterotrimer is an essential general transcription initiation factor for the expression of genes transcribed by RNA polymerase II.
The Thg1 protein from Saccharomyces cerevisiae (Baker's yeast) is responsible for adding a GMP residue to the 5' end of tRNA His.
This entry represents a subunit specific to RNA Pol III, the tRNA-specific polymerase. The C34 subunit of Saccharomyces cerevisiae RNA Pol III is part of a subcomplex of three subunits which have no counterpart in the other two nuclear RNA polymerases. This subunit interacts with TFIIIB70 and therefore participates in Pol III recruitment.
This entry also includes some homologous archaeal proteins of unknown function.
Trophinin and tastin form a cell adhesion molecule complex that potentially mediates an initial attachment of the blastocyst to uterine epithelial cells at the time of implantation. Trophinin and tastin bind to an intermediary cytoplasmic protein called bystin. Bystin may be involved in implantation and trophoblast invasion because bystin is found with trophinin and tastin in the cells at human implantation sites and also in the intermediate trophoblasts at the invasion front in the placenta from early pregnancy. This family also includes the Saccharomyces cerevisiae protein ENP1. ENP1 is an essential protein in S. cerevisiae and is localised in the nucleus. It is thought that ENP1 plays a direct role in the early steps of rRNA processing, as enp1-defective S. cerevisiae cannot synthesise 20S pre-rRNA and hence 18S rRNA, which leads to reduced formation of 40S ribosomal subunits.
The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.
This entry represents the 9 kDa SRP9 component. Both SRP9 and SRP14 have the same (beta)-alpha-beta(3)-alpha fold. The heterodimer has pseudo two-fold symmetry and is saddle-like, consisting of a curved six-stranded beta-sheet that has four helices packed on the convex side and an exposed concave surface lined with positively charged residues. The SRP9/SRP14 heterodimer is essential for SRP RNA binding, mediating the pausing of synthesis of ribosome associated nascent polypeptides that have been engaged by the targeting domain of SRP.
The biotin operon of Escherichia coli contains 5 structural genes involved in the synthesis of biotin. Transcription of the operon is regulated via one of these proteins, the biotin ligase BirA. BirA is an asymmetric protein with 3 specific domains - an N-terminal DNA-binding domain, a central catalytic domain and a C-terminal domain of unknown function. The ligase reaction intermediate, biotinyl-5'-AMP, is the co-repressor that triggers DNA binding by BirA. The alpha-helical N-terminal domain of the BirA protein has the helix-turn-helix structure of DNA-binding proteins with a central DNA recognition helix. BirA undergoes several conformational changes related to repressor function, and the N-terminal DNA-binding domain is connected to the rest of the molecule through a hinge that allows relocation of the domains during the reaction. Biotin-binding causes a large structural change thought to facilitate ATP-binding.
Two repressor molecules form the operator-repressor complex, with dimer formation occurring simultaneously with DNA binding. DNA-binding may also cause a conformational change which allows this co-operative interaction. In the dimer structure, the beta-sheets in the central domain of each monomer are arranged side-by-side to form a single, seamless beta-sheet.
The apparent orthologs among the eukaryotes are larger proteins that contain a domain with high sequence homology to BirA.
A large ribonuclear protein complex is required for the processing of the small-ribosomal-subunit rRNA - the small-subunit (SSU) processome. This preribosomal complex contains the U3 snoRNA and at least 40 proteins, which have the following properties:
There appears to be a linkage between polymerase I transcription and the formation of the SSU processome, as some, but not all, of the SSU processome components are required for pre-rRNA transcription initiation. These SSU processome components have been termed t-Utps. They form a pre-complex with pre-18S rRNA in the absence of snoRNA U3 and other SSU processome components. It has been proposed that the t-Utp complex proteins are both rDNA- and rRNA-binding proteins that are involved in the initiation of pre-18S rRNA transcription, initially binding to rDNA and then associating with the 5' end of the nascent pre-18S rRNA. The t-Utp complex forms the nucleus around which the rest of the SSU processome components, including snoRNA U3, assemble. Electron microscopy suggests that the SSU processome may correspond to the terminal knobs visualised at the 5' ends of nascent 18S rRNA.
This entry contains Utp11, a large ribonuclear protein that associates with snoRNA U3.
The S25 ribosomal protein is a component of the 40S ribosomal subunit.
This family consists of several eukaryotic transcription initiation Spt4 proteins and some related archaeal sequences. Three transcription-elongation factors Spt4, Spt5, and Spt6 are conserved among eukaryotes and are essential for transcription via the modulation of chromatin structure. Spt4 and Spt5 are tightly associated in a complex, while the physical association of the Spt4-Spt5 complex with Spt6 is considerably weaker. It has been demonstrated that Spt4, Spt5, and Spt6 play roles in transcription elongation in both yeast and humans including a role in activation by Tat. It is known that Spt4, Spt5, and Spt6 are general transcription-elongation factors, controlling transcription both positively and negatively in important regulatory and developmental roles.
The members of this family are all hypothetical eukaryotic proteins of unknown function. One member is described as being an adipocyte-specific protein, but no evidence of this was found.
Ribosomal protein L24 is one of the proteins from the large ribosomal subunit. In their mature form, these proteins have 103 to 150 amino-acid residues. This domain is found in L24 and L26 ribosomal proteins.
This family consists of several Rab5-interacting protein (RIP5 or Rab5ip) sequences. The ras-related GTPase rab5 is rate-limiting for homotypic early endosome fusion. Rab5ip represents a novel rab5 interacting protein that may function on endocytic vesicles as a receptor for rab5-GDP and participate in the activation of rab5.
Ribosomal protein S16 is one of the proteins from the small ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:
This family is defined by a C-terminal region of approximately 500 residues, which occurs in several hypothetical eukaryotic proteins of unknown function.
This entry represents ribosomal protein L15 and homologues found in bacteria, chloroplasts and mitochondria.
The anaphase-promoting complex (APC) is a multi-subunit E3 protein ubiquitin ligase that is responsible for the metaphase to anaphase transition and the exit from mitosis. Anaphase is initiated when the APC triggers the destruction of securin, thereby allowing the protease, separase, to disrupt sister-chromatid cohesion. Securin ubiquitination by the APC is inhibited by cyclin-dependent kinase 1 (Cdk1)-dependent phosphorylation.
Forkhead Box M1 (FoxM1), which is a transcription factor that is over-expressed in many cancers, is degraded in late mitosis and early G1 phase by the APC/cyclosome (APC/C) E3 ubiquitin ligase. The APC/C targets mitotic cyclins for destruction in mitosis and G1 phase and is then inactivated at S phase. It thereby generates alternating states of high and low cyclin-Cdk activity, which is required for the alternation of mitosis and DNA replication.
APC from Schizosaccharomyces pombe and Saccharomyces cerevisiae was previously thought to have 11 subunits, but more sensitive techniques have identified 13 subunits in both yeasts.
One of the subunits of the APC that is required for ubiquitination activity is APC10, a one-domain protein homologous to a sequence element, termed the DOC domain, found in several hypothetical proteins that may also mediate ubiquitination reactions, because they contain combinations of either RING finger, cullin or HECT domains.
The DOC domain consists of a beta-sandwich, in which a five-stranded antiparallel beta-sheet is packed on top of a three-stranded antiparallel beta-sheet, exhibiting a 'jellyroll' fold.
Proteins known to contain a DOC domain include:
This is a small family of proteins of unknown function.
Glycosylphosphatidylinositol (GPI) represents an important anchoring molecule for cell surface proteins. The first step in its synthesis is the transfer of N-acetylglucosamine (GlcNAc) from UDP-N-acetylglucosamine to phosphatidylinositol (PI). This step involves the products of three genes in yeast (GPI1, GPI2 and GPI3) and four genes in mammals (GPI1, PIG A, PIG H and PIG C).
A number of the members of this family have been characterised as probable N-acetylglucosaminyl-phosphatidylinositol de-N-acetylases, which catalyse the second step in glycosylphosphatidylinositol (GPI) biosynthesis.
Isy1 protein is important in the optimisation of splicing.
This family consists of several eukaryotic translation initiation factor 3 subunit 12 (eIF-3 p25) proteins. Eukaryotic initiation factor 3 (eIF3) is a multisubunit complex that is required for binding of mRNA to 40 S ribosomal subunits, stabilisation of ternary complex binding to 40 S subunits, and dissociation of 40 and 60 S subunits.
Members of this family have been called SAND proteins although these proteins do not contain a SAND domain. In Saccharomyces cerevisiae a protein complex of Mon1 and Ccz1 functions with the small GTPase Ypt7 to mediate vesicle trafficking to the vacuole. The Mon1/Ccz1 complex is conserved in eukaryotic evolution and members of this family (previously known as DUF254) are distant homologues to domains of known structure that assemble into cargo vesicle adapter (AP) complexes.
This family consists of several eukaryotic Sin3 associated polypeptide p18 (SAP18) sequences. SAP18 is known to be a component of the Sin3-containing complex, which is responsible for the repression of transcription via the modification of histone polypeptides. SAP18 is also present in the ASAP complex which is thought to be involved in the regulation of splicing during the execution of programmed cell death.
This is a family of eukaryotic proteins with undetermined function.
Phf5 is a member of a novel murine multigene family that is highly conserved during evolution and belongs to the superfamily of PHD-finger proteins. At least one example, from Mus musculus (Mouse), may act as a chromatin-associated protein. The Schizosaccharomyces pombe (Fission yeast) ini1 gene is essential, required for splicing. It is localised in the nucleus, but not detected in the nucleolus and can be complemented by human ini1. The proteins of this family contain five CXXC motifs.
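The statement that proteins of this family contain five CXXC motifs (cysteine, any two residues, cysteine) can be checked against a candidate sequence by counting occurrences. The sketch below is an illustration only: the helper name `count_cxxc` and the test string are hypothetical, and a count of five is a necessary rather than sufficient hint of family membership.

```python
import re

# C, any two residues, C. The lookahead allows overlapping motifs to be counted.
CXXC = re.compile(r"(?=(C[A-Z]{2}C))")


def count_cxxc(sequence):
    """Count CXXC occurrences (overlaps included) in a protein sequence."""
    return len(CXXC.findall(sequence.upper()))


def has_five_cxxc(sequence):
    """True when the sequence carries at least five CXXC motifs."""
    return count_cxxc(sequence) >= 5


# Invented string with three back-to-back motifs, for illustration only.
print(count_cxxc("ACAACGGCTTC"))
```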
Many eukaryotic proteins are anchored to the cell surface via glycosylphosphatidylinositol (GPI), which is posttranslationally attached to the C terminus by GPI transamidase. The mammalian GPI transamidase is a complex of at least four subunits, GPI8, GAA1, PIG-S, and PIG-T. PIG-U is thought to represent a fifth subunit in this complex and may be involved in the recognition of either the GPI attachment signal or the lipid portion of GPI.
This family consists of several eukaryotic ATP11 proteins. The expression of functional F1-ATPase requires two proteins which are encoded by the ATP11 and ATP12 genes. Atp11p is a molecular chaperone of the mitochondrial matrix that participates in the biogenesis pathway to form F1, which is the catalytic unit of ATP synthase. It binds to the free beta subunits of F1, which prevents the beta subunit from associating with itself in non-productive complexes. It also allows for the formation of an (alpha beta)3 hexamer.
This family consists of several microsomal signal peptidase 12 kDa subunit proteins. Translocation of polypeptide chains across the endoplasmic reticulum (ER) membrane is triggered by signal sequences. Subsequently, signal recognition particle interacts with its membrane receptor and the ribosome-bound nascent chain is targeted to the ER where it is transferred into a protein-conducting channel. At some point, a second signal sequence recognition event takes place in the membrane and translocation of the nascent chain through the membrane occurs. The signal sequence of most secretory and membrane proteins is cleaved off at this stage. Cleavage occurs by the signal peptidase complex (SPC) as soon as the lumenal domain of the translocating polypeptide is large enough to expose its cleavage site to the enzyme. The signal peptidase complex is possibly also involved in proteolytic events in the ER membrane other than the processing of the signal sequence, for example the further digestion of the cleaved signal peptide or the degradation of membrane proteins. Mammalian signal peptidase is a complex of five different polypeptide chains. This family represents the 12 kDa subunit (SPC12).
Proteins in this entry are involved in cell cycle progression and pre-mRNA splicing.
This is a eukaryotic protein of unknown function.
GPI (glycosylphosphatidylinositol) transamidase is a multiprotein complex required for a terminal step in attaching the GPI anchor to proteins. Gpi16, Gpi8 and Gaa1 form a sub-complex of the GPI transamidase.
This is a family of eukaryotic membrane proteins. It was previously annotated as including a putative receptor for human cytomegalovirus gH but this has since been disputed. Analysis of the mouse Tapt1 protein (transmembrane anterior posterior transformation 1) has shown it to be involved in patterning of the vertebrate axial skeleton.
This entry represents tRNA pseudouridine synthase D (TruD) proteins, which appear to be responsible for synthesis of pseudouridine from uracil-13 in transfer RNAs. They are hydrophilic proteins of 39 to 77 kDa, and homologues are found in bacteria, archaea, and eukarya.
This family consists of several conserved hypothetical proteins from both eukaryotes and prokaryotes. The function of members of this family is unknown, but they are predicted to be SAM-dependent methyltransferases.
This entry describes proteins of unknown function. Structures for two of these proteins, YggU from Escherichia coli and MTH637 from the archaeon Methanobacterium thermoautotrophicum, have been determined; they have a core 2-layer alpha/beta structure consisting of beta(2)-loop-alpha-beta(2)-alpha.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins.
The small ribosomal subunit protein S18 is known to be involved in binding the aminoacyl-tRNA complex in Escherichia coli, and appears to be situated at the tRNA A-site. Experimental evidence has revealed that S18 is well exposed on the surface of the E. coli ribosome, and is a secondary rRNA binding protein. S18 belongs to a family of ribosomal proteins that includes: eubacterial S18; metazoan mitochondrial S18; algal and plant chloroplast S18; and cyanelle S18.
This family represents a conserved region approximately 100 residues long within mammalian hepatocellular carcinoma-associated antigen 59 and similar proteins. Family members are found in a variety of eukaryotes, mainly as hypothetical proteins.
This family consists of several hypothetical eukaryotic proteins of unknown function.
In transfer RNA many different modified nucleosides are found, especially in the anticodon region. tRNA (guanine-N1-)-methyltransferase is one of several nucleases operating together with the tRNA-modifying enzymes before the formation of the mature tRNA. It catalyses the reaction:
S-adenosyl-L-methionine + tRNA -> S-adenosyl-L-homocysteine + tRNA containing N1-methylguanine
The enzyme methylates guanosine (G) to N1-methylguanine (1-methylguanosine, m1G) at position 37 of tRNAs that read CUN (leucine), CCN (proline), and CGG (arginine) codons. The presence of m1G improves the cellular growth rate and the polypeptide step time and also prevents the tRNA from shifting the reading frame.
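The codon families named above use N as shorthand for any of the four RNA bases. As a small illustration, the following Python sketch expands such a pattern into its concrete codons. The function name `expand_codon` and the lookup table are illustrative assumptions, not part of any annotation pipeline, and only the single ambiguity code N is handled.

```python
from itertools import product

# Minimal lookup: each plain base maps to itself, N maps to any RNA base.
# (Assumption: only the ambiguity code N is needed for these patterns.)
BASES = {"A": "A", "C": "C", "G": "G", "U": "U", "N": "ACGU"}

def expand_codon(pattern):
    """List every concrete codon matched by a pattern such as 'CUN'."""
    choices = (BASES[ch] for ch in pattern.upper())
    return ["".join(bases) for bases in product(*choices)]

# The three codon groups mentioned in the text.
for pattern in ("CUN", "CCN", "CGG"):
    print(pattern, expand_codon(pattern))
```

Under this expansion CUN and CCN each cover four codons, while CGG is a single codon, so the tRNAs described read nine codons in total.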
The mechanism of the trmD3-induced frameshift involving mutant tRNA(Pro) and tRNA(Leu) species has been investigated. It has been suggested that the conformation of the anticodon loop may be a major determining element for the formation of m1G37 in vivo.
Family member HYNA is the product of a novel gene expressed in human liver cancer tissue.
This entry represents a group of leucine carboxymethyltransferases which methylate the carboxyl group of leucine residues to form alpha-leucine ester residues. It includes LCTM1, which regulates the activity of serine/threonine phosphatase 2A (PP2A) through methylation of the C-terminal leucine residue of the catalytic subunit of PP2A. This affects the heteromultimeric composition of PP2A, which in turn affects protein recognition and substrate specificity. Like many other methyltransferases, LCTM1 uses S-adenosylmethionine (SAM) as the methyl donor. LCTM1 contains the common SAM-dependent methyltransferase core fold, with various insertions and additions creating a specific PP2A binding site. This entry also contains LCTM2, a homologue of LCTM1 which is not necessary for PP2A methylation and whose function is not clear.
This entry describes proteins of unknown function.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described.
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.
This group of aspartic peptidases belongs to MEROPS peptidase family A1 (pepsin family, clan AA). The type example is pepsin A from Homo sapiens (Human).
More than 70 aspartic peptidases, all from eukaryotic organisms, have been identified. These include pepsins, cathepsins, and renins. The enzymes are synthesised with signal peptides, and the proenzymes are secreted or passed into the lysosomal/endosomal system, where acidification leads to autocatalytic activation.
Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions. Crystallography has shown the active site to form a groove across the junction of the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors within the active site. Specificity is determined by several hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap. Cysteine residues are well conserved within the pepsin family, pepsin itself containing three disulphide loops. The first loop is found in all but the fungal enzymes, and is usually around five residues in length, but is longer in barrierpepsin and candidapepsin; the second loop is also small and found only in the animal enzymes; and the third loop is the largest, found in all members of the family, except for the cysteine-free polyporopepsin. The loops are spread unequally throughout the two lobes, suggesting that they formed after the initial gene duplication and fusion event.
This family does not include the retroviral or retrotransposon aspartic proteases, which are much smaller and appear to be homologous to the single domain aspartic proteases.
Ribosomal protein L2 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L2 is known to bind to the 23S rRNA and to have peptidyltransferase activity. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities.
The protein L2 is found in all ribosomes and is one of the best conserved proteins of this mega-dalton complex. L2 is elongated, exposing one end of the protein to the surface of the intersubunit interface of the 50S subunit. It is essential for the association of the ribosomal subunits and might participate in the binding and translocation of the tRNAs. This entry represents bacterial, chloroplast and mitochondrial forms.
Phosphofructokinase (PFK) is ~300 amino acids in length, and structural studies of the bacterial enzyme have shown it comprises two similar (alpha/beta) lobes: one involved in ATP binding and the other housing both the substrate-binding site and the allosteric site (a regulatory binding site distinct from the active site, but that affects enzyme activity). The identical tetramer subunits adopt 2 different conformations: in a 'closed' state, the bound magnesium ion bridges the phosphoryl groups of the enzyme products (ADP and fructose-1,6-bisphosphate); and in an 'open' state, the magnesium ion binds only the ADP, as the 2 products are now further apart. These conformations are thought to be successive stages of a reaction pathway that requires subunit closure to bring the 2 molecules sufficiently close to react.
Deficiency in PFK leads to glycogenosis type VII (Tarui's disease), an autosomal recessive disorder characterised by severe nausea, vomiting, muscle cramps and myoglobinuria in response to bursts of intense or vigorous exercise. Sufferers are usually able to lead a reasonably ordinary life by learning to adjust activity levels.
The ATP-dependent DNA helicase RecQ is involved in genome maintenance. All homologues tested to date unwind paired DNA, translocating in a 3' to 5' direction, and several have a preference for forked or 4-way DNA structures (e.g. Holliday junctions) or for G-quartet DNA. The yeast protein, Sgs1, is present in numerous foci that coincide with sites of de novo DNA synthesis, such as the replication fork, and protein levels peak during S-phase.
A model has been proposed for Sgs1p action in the S-phase checkpoint response, both as a 'sensor' for damage during replication and a 'resolvase' for structures that arise at paused forks, such as the four-way 'chickenfoot' structure. The action of Sgs1p may serve to maintain the proper amount and integrity of ss DNA that is necessary for the binding of RPA (replication protein A, the eukaryotic ss DNA-binding protein)-DNA pol complexes. Sgs1p would thus function by detecting (or resolving) aberrant DNA structures, and would thus contribute to the full activation of the DNA-dependent protein kinase, Mec1p and the effector kinase, Rad53p. Its ability to bind both the large subunit of RPA and the RecA-like protein Rad51p places it in a unique position to resolve inappropriate fork structures that can occur when either the leading or lagging strand synthesis is stalled. Thus, RecQ helicases integrate checkpoint activation and checkpoint response.
Ribosomal protein S5 is one of the proteins from the small ribosomal subunit, and is a protein of 166 to 254 amino-acid residues. In Escherichia coli, S5 is known to be important in the assembly and function of the 30S ribosomal subunit. Mutations in S5 have been shown to increase translational error frequencies. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, cyanelle, red algal chloroplast, archaeal and fungal mitochondrial S5; mammalian, Caenorhabditis elegans, Drosophila and plant S2; and yeast S4 (SUP44).
This group represents a DNA-damage repair protein, BRCA1. Germline mutations in the tumour-suppressor gene BRCA1 account for about 50% of familial breast cancer. The protein contains the BRCT C-terminal domain, an approximately 100 amino acid tandem repeat, which appears to act as a phospho-protein binding domain.
Regulated exocytosis of neurotransmitters and hormones, as well as intracellular traffic, requires fusion of two lipid bilayers. SNARE proteins are thought to form a protein bridge, the SNARE complex, between an incoming vesicle and the acceptor compartment. SNARE proteins contribute to the specificity of membrane fusion, implying that the mechanisms by which SNAREs are targeted to subcellular compartments are important for specific docking and fusion of vesicles. This mechanism involves a family of conserved proteins, members of which appear to function at all sites of constitutive and regulated secretion in eukaryotes. Among them are 2 types of cytosolic protein, NSF (N-ethyl-maleimide-sensitive protein) and the SNAPs (alpha-, beta- and gamma-soluble NSF attachment proteins). The yeast vesicular fusion protein, sec17, a cytoplasmic peripheral membrane protein involved in vesicular transport between the endoplasmic reticulum and the Golgi apparatus, shows a high degree of sequence similarity to the alpha-SNAP family.
SNAP-25 and its non-neuronal homologue Syndet/SNAP-23 are synthesized as soluble proteins in the cytosol. Both SNAP-25 and Syndet/SNAP-23 are palmitoylated at cysteine residues clustered in a loop between two N- and C-terminal coils, and palmitoylation is essential for membrane binding and plasma membrane targeting. The C-terminal and the N-terminal helices of SNAP-25 are each targeted to the plasma membrane by two distinct cysteine-rich domains and appear to regulate the availability of SNAP to form complexes with SNARE.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.
This family represents subunits called delta (in mitochondrial ATPase) or epsilon (in bacteria or chloroplast ATPase). The interaction site of subunit C of the F0 complex with the delta or epsilon subunit of the F1 complex may be important for connecting the rotor of F1 (gamma subunit) to the rotor of F0 (C subunit). In bacterial species, the delta subunit is the equivalent of the oligomycin-sensitive subunit (OSCP) in metazoans. The C-terminal domain of the epsilon subunit appears to act as an inhibitor of ATPase activity.
More information about this protein can be found at Protein of the Month: ATP Synthases.
Protein phosphatase 2C (PP2C) is one of the four major classes of mammalian serine/threonine specific protein phosphatases. PP2C is a monomeric enzyme of about 42 kDa that shows broad substrate specificity and is dependent on divalent cations (mainly manganese and magnesium) for its activity. The exact physiological role is still unclear. Three isozymes are currently known in mammals: PP2C-alpha, -beta and -gamma. In yeast, there are at least four PP2C homologs: phosphatase PTC1, which has weak tyrosine phosphatase activity in addition to its activity on serines, phosphatases PTC2 and PTC3, and hypothetical protein YBR125c. Isozymes of PP2C are also known from Arabidopsis thaliana (Mouse-ear cress) (ABI1, PPH1), Caenorhabditis elegans (FEM-2, F42G9.1, T23F11.1), Leishmania chagasi and Paramecium tetraurelia. In A. thaliana, the kinase associated protein phosphatase (KAPP) is an enzyme that dephosphorylates the Ser/Thr receptor-like kinase RLK5 and contains a C-terminal PP2C domain.
PP2C does not seem to be evolutionarily related to the main family of serine/threonine phosphatases: PP1, PP2A and PP2B. However, it is significantly similar to the catalytic subunit of pyruvate dehydrogenase phosphatase (PDPC), which catalyzes dephosphorylation and concomitant reactivation of the alpha subunit of the E1 component of the pyruvate dehydrogenase complex. PDPC is a mitochondrial enzyme and, like PP2C, is magnesium-dependent.
V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins.
This entry represents subunit F found in the V1 complex of V-ATPases in eukaryotes. Subunit F is a 16 kDa protein that is required for the assembly and activity of V-ATPase, and has a potential role in the differential targeting and regulation of the enzyme for specific organelles. This subunit is not necessary for the rotation of the ATPase V1 rotor, but it does promote catalysis.
All DNA replication initiation is driven by a single conserved eukaryotic initiator complex termed the origin recognition complex (ORC). The ORC is a six-protein complex. This entry represents subunit 2, which binds the origin of replication. It plays a role in chromosome replication and mating type transcriptional silencing.
This group represents a translation initiation factor eIF-3b, which binds to the 40S ribosome and promotes the binding of methionyl-tRNAi and mRNA. eIF-3 is composed of at least 12 different subunits.
Ribosomal protein L17 is one of the proteins from the large ribosomal subunit. Bacterial L17 is a protein of 120 to 130 amino-acid residues while yeast YmL8 is twice as large (238 residues). The N-terminal half of YmL8 is colinear with the sequence of L17 from Escherichia coli.
Arv1 is a transmembrane protein, with potential zinc-binding motifs, that mediates sterol homeostasis. Its action is important in lipid homeostasis, which prevents free sterol toxicity. Arv1 contains a homology domain (AHD), which consists of an N-terminal cysteine-rich subdomain with a putative zinc-binding motif, followed by a C-terminal subdomain of 33 amino acids. The C-terminal subdomain of the AHD is critical for the protein's function. In yeast, Arv1p is important for the delivery of an early glycosylphosphatidylinositol GPI intermediate, GlcN-acylPI, to the first mannosyltransferase of GPI synthesis in the ER lumen. It is important for the traffic of sterol in yeast and in humans. In eukaryotic cells, it may fuction in the sphingolipid metabolic pathway as a transporter of ceramides between the ER and Golgi.
Ubiquitin-related modifier 1 (Urm1) is a ubiquitin-related protein that modifies proteins in the yeast ubiquitin-like urmylation pathway. Structural comparisons and phylogenetic analysis of the ubiquitin superfamily have indicated that Urm1 has the most conserved structural and sequence features of the common ancestor of the entire superfamily.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.
This entry represents the C-terminal region of the alpha subunit in the F1 complex of F-ATPases. In F-ATPases, there are three copies each of the alpha and beta subunits that form the catalytic core of the F1 complex, while the remaining F1 subunits (gamma, delta, epsilon) form part of the stalks. There is a substrate-binding site on each of the alpha and beta subunits, those on the beta subunits being catalytic, while those on the alpha subunits are regulatory. The alpha subunit contains a highly conserved adenine-specific non-catalytic nucleotide-binding domain, with a conserved amino acid sequence of Gly-X-X-X-X-Gly-Lys. The alpha and beta subunits form a cylinder that is attached to the central stalk. The alpha/beta subunits undergo a sequence of conformational changes leading to the formation of ATP from ADP. These changes are induced by the rotation of the gamma subunit, which is itself driven by the movement of protons through the F0 complex C subunit.
More information about these proteins can be found at Protein of the Month: ATP Synthases.
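The conserved Gly-X-X-X-X-Gly-Lys sequence mentioned for the alpha subunit can be searched for in a protein sequence with a simple pattern match. The following is a minimal sketch, assuming one-letter amino-acid code and that X stands for any single residue; the function name `find_nucleotide_binding_motifs` and the toy sequence are hypothetical and are not taken from any real alpha-subunit entry.

```python
import re

# Gly-X-X-X-X-Gly-Lys in one-letter code: G, any four residues, then G and K.
# The lookahead lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=(G.{4}GK))")


def find_nucleotide_binding_motifs(sequence):
    """Return (1-based start position, matched text) for each GxxxxGK hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(seq)]


# Hypothetical toy sequence used only to exercise the function.
toy_sequence = "MKLAGDRTTGKTALVI"
print(find_nucleotide_binding_motifs(toy_sequence))  # [(5, 'GDRTTGK')]
```

A hit only shows that the sequence pattern is present; it does not by itself establish that the site binds nucleotide.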
This entry represents the beta subunit found in the F1 complex of F-ATPases. In F-ATPases, there are three copies each of the alpha and beta subunits that form the catalytic core of the F1 complex, while the remaining F1 subunits (gamma, delta, epsilon) form part of the stalks. There is a substrate-binding site on each of the alpha and beta subunits, those on the beta subunits being catalytic, while those on the alpha subunits are regulatory. The alpha and beta subunits form a cylinder that is attached to the central stalk. The alpha/beta subunits undergo a sequence of conformational changes leading to the formation of ATP from ADP. These changes are induced by the rotation of the gamma subunit, which is itself driven by the movement of protons through the F0 complex C subunit.
More information about this protein can be found at Protein of the Month: ATP Synthases.
This family contains a number of eukaryotic D123 proteins approximately 330 residues long. It has been shown that mutated variants of D123 exhibit temperature-dependent differences in their degradation rate.
This entry represents the C terminus (approximately 300 residues) of eukaryotic micro-fibrillar-associated protein 1, which is a component of elastin-associated microfibrils in the extracellular matrix.
This family consists of several ribosome-associated membrane protein RAMP4 (or SERP1) sequences. Stabilisation of membrane proteins in response to stress involves the concerted action of a rescue unit in the ER membrane composed of SERP1/RAMP4, other components of the translocon, and molecular chaperones in the ER.
Ribosomal protein L19 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L19 is known to be located at the 30S-50S ribosomal subunit interface and may play a role in the structure and function of the aminoacyl-tRNA binding site. It belongs to a family of ribosomal proteins, including L19 from bacteria and the chloroplasts of red algae.
L19 is a protein of 120 to 130 amino-acid residues.
L27 is a protein from the large (50S) subunit; it is essential for ribosome function, but its exact role is unclear. It belongs to a family of ribosomal proteins, examples of which are found in bacteria, chloroplasts of plants and red algae and the mitochondria of fungi (e.g. MRP7 from yeast mitochondria). The schematic relationship between these groups of proteins is shown below.
This family consists of several eukaryotic TBP-1 interacting protein (TBPIP) sequences. TBP-1 has been demonstrated to interact with the human immunodeficiency virus type 1 (HIV-1) viral protein Tat and then to modulate the essential replication process of HIV. In addition, TBP-1 has been shown to be a component of the 26S proteasome, a basic multiprotein complex that degrades ubiquitinated proteins in an ATP-dependent fashion. Human TBPIP interacts with human TBP-1 and then modulates the inhibitory action of human TBP-1 on HIV-Tat-mediated transactivation.
These probable G-protein coupled receptors were identified as part of large genome screens. Members of this group are from insects, invertebrates and vertebrates; however, the function that they may serve is unknown.
This entry describes proteins of unknown function.
The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.
This entry represents the SRP19 subunit. The SRP19 protein is unstructured but forms a compact core domain and two extended RNA-binding loops upon binding the signal recognition particle (SRP) RNA.
This is a family of eukaryotic ribosomal biogenesis regulatory proteins.
This family consists of several uncharacterised eukaryotic proteins.
The 14-3-3 proteins are a large family of approximately 30 kDa acidic proteins which exist primarily as homo- and heterodimers within all eukaryotic cells. There is a high degree of sequence identity and conservation between all the 14-3-3 isotypes, particularly in the regions which form the dimer interface or line the central ligand binding channel of the dimeric molecule. Each 14-3-3 protein sequence can be roughly divided into three sections: a divergent amino terminus, the conserved core region and a divergent carboxyl terminus. The conserved middle core region of the 14-3-3s encodes an amphipathic groove that forms the main functional domain, a cradle for interacting with client proteins. The monomer consists of nine helices organised in an antiparallel manner, forming an L-shaped structure. The interior of the L-structure is composed of four helices: H3 and H5, which contain many charged and polar amino acids, and H7 and H9, which contain hydrophobic amino acids. These four helices form the concave amphipathic groove that interacts with target peptides.
14-3-3 proteins mainly bind proteins containing phosphothreonine or phosphoserine motifs; however, exceptions to this rule do exist. Extensive investigation of the 14-3-3 binding site of the mammalian serine/threonine kinase Raf-1 has produced a consensus sequence for 14-3-3-binding, RSxpSxP (in the single-letter amino-acid code, where x denotes any amino acid and p indicates that the next residue is phosphorylated). 14-3-3 proteins appear to affect intracellular signalling in one of three ways: by direct regulation of the catalytic activity of the bound protein; by regulating interactions between the bound protein and other molecules in the cell, through sequestration or modification; or by controlling the subcellular localisation of the bound ligand. Proteins appear to initially bind to a single dominant site and then subsequently to many, much weaker secondary interaction sites. The 14-3-3 dimer is capable of changing the conformation of its bound ligand whilst itself undergoing minimal structural alteration.
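The RSxpSxP consensus can be turned into a simple scan for candidate sites in an unmodified sequence. The sketch below assumes that, in a plain one-letter sequence, the consensus reads R-S-x-S-x-P and that the second serine is the residue that would carry the phosphate; the function name `candidate_14_3_3_sites` and the toy fragment are hypothetical. A match is only a candidate, since binding depends on the serine actually being phosphorylated and, as noted above, exceptions to the consensus exist.

```python
import re

# RSxpSxP on an unmodified sequence: R, S, any residue, S (phospho-acceptor),
# any residue, P. The lookahead also reports overlapping candidates.
CANDIDATE = re.compile(r"(?=(RS.S.P))")


def candidate_14_3_3_sites(sequence):
    """Return (1-based position of the phosphorylatable Ser, matched text)."""
    seq = sequence.upper()
    # The phospho-acceptor serine is the 4th residue of the match.
    return [(m.start() + 4, m.group(1)) for m in CANDIDATE.finditer(seq)]


# Hypothetical toy fragment used only to exercise the function.
print(candidate_14_3_3_sites("AAQRSTSTPNV"))  # [(7, 'RSTSTP')]
```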
Two different types of thiolase are found both in eukaryotes and in prokaryotes: acetoacetyl-CoA thiolase and 3-ketoacyl-CoA thiolase. 3-ketoacyl-CoA thiolase (also called thiolase I) has a broad chain-length specificity for its substrates and is involved in degradative pathways such as fatty acid beta-oxidation. Acetoacetyl-CoA thiolase (also called thiolase II) is specific for the thiolysis of acetoacetyl-CoA and involved in biosynthetic pathways such as poly beta-hydroxybutyrate synthesis or steroid biogenesis.
In eukaryotes, there are two forms of 3-ketoacyl-CoA thiolase: one located in the mitochondrion and the other in peroxisomes.
There are two conserved cysteine residues important for thiolase activity. The first, located in the N-terminal section of the enzymes, is involved in the formation of an acyl-enzyme intermediate; the second, located at the C-terminal extremity, is the active site base involved in deprotonation in the condensation reaction.
Mammalian nonspecific lipid-transfer protein (nsL-TP) (also known as sterol carrier protein 2) is a protein which seems to exist in two different forms: a 14 kDa protein (SCP-2) and a larger 58 kDa protein (SCP-x). The former is found in the cytoplasm or the mitochondria and is involved in lipid transport; the latter is found in peroxisomes. The C-terminal part of SCP-x is identical to SCP-2 while the N-terminal portion is evolutionarily related to thiolases.
Carbonic anhydrases (CA) are zinc metalloenzymes which catalyse the reversible hydration of carbon dioxide to bicarbonate. CAs have essential roles in facilitating the transport of carbon dioxide and protons in the intracellular space, across biological membranes and in the layers of the extracellular space; they are also involved in many other processes, from respiration and photosynthesis in eukaryotes to cyanate degradation in prokaryotes. There are five known evolutionarily distinct CA families (alpha, beta, gamma, delta and epsilon) that have no significant sequence identity and have structurally distinct overall folds. Some CAs are membrane-bound, while others act in the cytosol; there are several related proteins that lack enzymatic activity. The active site of alpha-CAs is well described, consisting of a zinc ion coordinated through 3 histidine residues and a water molecule/hydroxide ion that acts as a potent nucleophile. The enzyme employs a two-step mechanism: in the first step, there is a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide; in the second step, the active site is regenerated by the ionisation of the zinc-bound water molecule and the removal of a proton from the active site. Beta- and gamma-CAs also employ a zinc hydroxide mechanism, although at least some beta-class enzymes do not have water directly coordinated to the metal ion.
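The two-step mechanism described above can be written out as a reaction scheme. This is a sketch in which E denotes the enzyme; the displacement of bound bicarbonate by water (second line) is not spelled out in the text and is included here as the usual textbook assumption that links the two steps.

```latex
\begin{align*}
\text{Step 1:}\quad
  & \mathrm{E{-}Zn^{2+}{-}OH^{-} + CO_2 \;\rightleftharpoons\; E{-}Zn^{2+}{-}HCO_3^{-}} \\
  & \mathrm{E{-}Zn^{2+}{-}HCO_3^{-} + H_2O \;\rightleftharpoons\; E{-}Zn^{2+}{-}OH_2 + HCO_3^{-}} \\
\text{Step 2:}\quad
  & \mathrm{E{-}Zn^{2+}{-}OH_2 \;\rightleftharpoons\; E{-}Zn^{2+}{-}OH^{-} + H^{+}} \\
\text{Net:}\quad
  & \mathrm{CO_2 + H_2O \;\rightleftharpoons\; HCO_3^{-} + H^{+}}
\end{align*}
```

Summing the three lines cancels every enzyme-bound species, leaving the reversible hydration of carbon dioxide to bicarbonate and a proton.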
This entry represents alpha class carbonic anhydrases.
More information about these proteins can be found at Protein of the Month: Carbonic Anhydrase.
A number of transmembrane (TM) channel proteins can be grouped together on the basis of sequence similarities.
These include:
MIP family proteins are thought to contain 6 TM domains. Sequence analysis suggests that the proteins may have arisen through tandem, intragenic duplication from an ancestral protein that contained 3 TM domains.
Some of the proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins. Aquaporin-CHIP (Aquaporin 1) belongs to the Colton blood group system and is associated with Co(a/b) antigen.
Iron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S]. FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems.
The ISC system is conserved in eubacteria and eukaryotes (mitochondria), and has broad specificity, targeting general FeS proteins. It is encoded by the isc operon (iscRSUA-hscBA-fdx-iscX). IscS is a cysteine desulphurase, which obtains S from cysteine (converting it to alanine) and serves as a S donor for FeS cluster assembly. IscU and IscA act as scaffolds to accept S and Fe atoms, assembling clusters and transferring them to recipient apoproteins. HscA is a molecular chaperone and HscB is a co-chaperone. Fdx is a [2Fe-2S]-type ferredoxin. IscR is a transcription factor that regulates expression of the isc operon. IscX (also known as YfhJ) appears to interact with IscS and may function as an Fe donor during cluster assembly.
The SUF system is an alternative pathway to the ISC system that operates under iron starvation and oxidative stress. It is found in eubacteria, archaea and eukaryotes (plastids). The SUF system is encoded by the suf operon (sufABCDSE), and the six encoded proteins are arranged into two complexes (SufSE and SufBCD) and one protein (SufA). SufS is a pyridoxal-phosphate (PLP) protein displaying cysteine desulphurase activity. SufE acts as a scaffold protein that accepts S from SufS and donates it to SufA. SufC is an ATPase with an unorthodox ATP-binding cassette (ABC)-like component. No specific functions have been assigned to SufB and SufD. SufA is homologous to IscA, acting as a scaffold protein in which Fe and S atoms are assembled into [FeS] cluster forms, which can then easily be transferred to apoprotein targets.
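The operon names above use a compact shorthand, in which iscRSUA stands for the genes iscR, iscS, iscU and iscA. The following minimal sketch expands such shorthand into individual gene names. It assumes the convention of a three-letter lowercase prefix followed by zero or more capital letters, with hyphens separating groups; `expand_operon` is a hypothetical helper, not part of any library.

```python
import re


def expand_operon(shorthand):
    """Expand operon shorthand such as 'iscRSUA-hscBA' into gene names."""
    genes = []
    for token in shorthand.split("-"):
        match = re.fullmatch(r"([a-z]{3})([A-Z]*)", token)
        if match is None:
            raise ValueError(f"unrecognised operon token: {token!r}")
        prefix, letters = match.groups()
        if letters:
            # One gene per capital letter, e.g. iscRS -> iscR, iscS.
            genes.extend(prefix + letter for letter in letters)
        else:
            # A bare three-letter name such as fdx is a gene on its own.
            genes.append(prefix)
    return genes


print(expand_operon("iscRSUA-hscBA-fdx-iscX"))
print(expand_operon("sufABCDSE"))
```

The second call yields six names, matching the six encoded proteins described for the suf operon.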
In the NIF system, NifS and NifU are required for the formation of metalloclusters of nitrogenase in Azotobacter vinelandii and other organisms, as well as in the maturation of other FeS proteins. Nitrogenase catalyses the fixation of nitrogen. It contains a complex cluster, the FeMo cofactor, which contains molybdenum, Fe and S. NifS is a cysteine desulphurase. NifU binds one Fe atom at its N-terminus, assembling an FeS cluster that is transferred to nitrogenase apoproteins. Nif proteins involved in the formation of FeS clusters can also be found in organisms that do not fix nitrogen.
This entry represents SufC, which acts as an ATPase in the SUF system. SufC belongs to the ATP-binding cassette transporter family but is no longer thought to be part of a transporter. The complex is reported as cytosolic or associated with the membrane.
Members of this family are spindle pole body (SPB) components such as Spc97, Spc98 and gamma-tubulin. The SPB functions as the microtubule-organising centre in yeast, with the microtubule cytoskeleton playing an essential role in chromosome segregation, cellular organisation and vesicle trafficking in eukaryotic cells. In most cells, the centrosome is the primary microtubule-organising centre that nucleates and organises microtubules. Gamma-tubulin localises to centrosomes and is required for microtubule nucleation. In Saccharomyces cerevisiae, gamma-tubulin forms a stable complex with Spc97 and Spc98.
This family consists of several hypothetical eukaryotic proteins of unknown function.
Little is known of the function of hsp70 proteins. Some evidence suggests that the constitutive members have a role in the disassembly of clathrin cages, and may also participate in the post-translational transmembrane targeting of proteins to cellular organelles. No specific activities or associations have been found for the inducible members, although it has been suggested that they may accept incoming precursor proteins, keep them unfolded, then pass them on to the hsp60/hsp10 (cpn60/cpn10) complex for folding and assembly.
DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. In bacteria, the core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.
RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise.
This protein appears to be specific to the largest subunit of RNA polymerase III.
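The template-directed synthesis described for RNA polymerases, in which ribonucleotides are polymerised into a sequence complementary to the template DNA and the product grows 5' to 3', can be illustrated with a short sketch. It assumes the template strand is supplied written 3' to 5', so that reading it left to right gives the transcript 5' to 3'; `transcribe` is a hypothetical helper and ignores promoters and termination sequences.

```python
# Base-pairing rules for DNA template -> RNA product (U replaces T in RNA).
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Return the RNA (5'->3') complementary to a template written 3'->5'."""
    try:
        return "".join(COMPLEMENT[base] for base in template_3_to_5.upper())
    except KeyError as err:
        raise ValueError(f"not a DNA base: {err.args[0]!r}") from None


# Template 3'-TACGGT-5' gives transcript 5'-AUGCCA-3'.
print(transcribe("TACGGT"))  # AUGCCA
```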
This entry contains enoyl-[acyl-carrier-protein] reductases. They are components of the type II (dissociable) fatty acid synthase system and catalyse the terminal reaction in the fatty acid elongation cycle.
They belong to the short-chain dehydrogenases/reductases (SDR) domain superfamily and are therefore related to other members of that superfamily.
Most SDRs contain two subdomains. The N-terminal subdomain binds the coenzyme, and the C-terminal subdomain binds the substrate, determines the substrate specificity and contains amino acids involved in catalysis. Despite low sequence similarity, all SDR structures display highly similar alpha/beta folding patterns with a central beta-sheet, typical of the Rossmann fold.
Crystal structures of these proteins have been extensively studied.
(This information was partially derived from the PFAM database)
Heat shock factor binding protein 1 (HSBP1) appears to be a negative regulator of the heat shock response.
Hexokinase is an important enzyme that catalyses the ATP-dependent conversion of aldo- and keto-hexose sugars to the hexose-6-phosphate (H6P). The enzyme can catalyse this reaction on glucose, fructose, sorbitol and glucosamine, and as such is the first step in a number of metabolic pathways. The addition of a phosphate group to the sugar acts to trap it in a cell, since the negatively charged phosphate cannot easily traverse the plasma membrane.
The enzyme is widely distributed in eukaryotes. There are three isozymes of hexokinase in yeast (PI, PII and glucokinase): isozymes PI and PII phosphorylate both aldo- and keto-sugars; glucokinase is specific for aldo-hexoses. All three isozymes contain two domains. Structural studies of yeast hexokinase reveal a well-defined catalytic pocket that binds ATP and hexose, allowing easy transfer of the phosphate from ATP to the sugar. Vertebrates contain four hexokinase isozymes, designated I to IV, where types I to III contain a duplication of the two-domain yeast-type hexokinases. Both the N- and C-terminal halves bind hexose and H6P, though in types I and III only the C-terminal half supports catalysis, while both halves support catalysis in type II. The N-terminal half is the regulatory region. Type IV hexokinase is similar to the yeast enzyme in containing only the two domains, and is sometimes incorrectly referred to as glucokinase.
The different vertebrate isozymes differ in their catalysis, localisation and regulation, thereby contributing to the different patterns of glucose metabolism in different tissues. Whereas types I to III can phosphorylate a variety of hexose sugars and are inhibited by glucose-6-phosphate (G6P), type IV is specific for glucose and shows no G6P inhibition. The type I enzyme may have a catabolic function, producing H6P for energy production in glycolysis; it is bound to the mitochondrial membrane, which enables the coordination of glycolysis with the TCA cycle. The type II and III enzymes may have anabolic functions, providing H6P for glycogen or lipid synthesis. The type IV enzyme is found in the liver and pancreatic beta-cells, where it is controlled by insulin (activation) and glucagon (inhibition). In pancreatic beta-cells, the type IV enzyme acts as a glucose sensor to modify insulin secretion. Mutations in type IV hexokinase have been associated with diabetes mellitus.
S14 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S14 is known to be required for the assembly of 30S particles and may also be responsible for determining the conformation of 16S rRNA at the A site. It belongs to a family of ribosomal proteins that includes bacterial, algal and plant chloroplast, yeast mitochondrial, cyanelle and archaeal (Methanococcus vannielii) S14 proteins, as well as yeast mitochondrial MRP2, yeast YS29A/B and mammalian S29.
A number of eukaryotic and archaebacterial large subunit ribosomal proteins can be grouped on the basis of sequence similarities. These proteins are very basic. About 50 residues long, they are the smallest proteins of eukaryotic-type ribosomes.
Glycosylphosphatidylinositol (GPI) is a conserved post-translational modification to anchor cell surface proteins to plasma membrane in eukaryotes. GWT1 is involved in GPI anchor biosynthesis; it is required for inositol acylation in yeast.
These proteins include Ypi1, a novel Saccharomyces cerevisiae type 1 protein phosphatase inhibitor and ppp1r11/hcgv, annotated as having protein phosphatase inhibitor activity.
This family includes proteins from pathogenic and non-pathogenic bacteria, Homo sapiens (Human) and Drosophila melanogaster (Fruit fly). In the pathogen Bacillus cereus, it has been shown to function as a channel-forming cytolysin. The human protein is expressed preferentially in mature macrophages, consistent with a cytolytic role.
DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.
RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:
This protein appears to be specific to DNA-directed RNA polymerases, subunit 2.
2-methyl-4-amino-5-hydroxymethylpyrimidine diphosphate + 4-methyl-5-(2-phosphonooxyethyl)-thiazole = pyrophosphate + thiamin monophosphate. Hydroxyethylthiazole kinase expression is regulated at the mRNA level by intracellular thiamin pyrophosphate.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents the Zim17-type zinc finger motif thought to bind zinc. This domain is found in a number of eukaryotic proteins and is named after a short C-terminal motif of D(N/H)L. The domain is found in proteins having a novel zinc-finger essential for protein import into mitochondria.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
This domain is present in sequences representing dihydropteroate synthase, the enzyme that catalyzes the second to last step in folic acid biosynthesis.
Dihydropteroate synthase (DHPS) catalyses the condensation of 6-hydroxymethyl-7,8-dihydropteridine pyrophosphate with para-aminobenzoic acid to form 7,8-dihydropteroate. This is the second step in the three-step pathway leading from 6-hydroxymethyl-7,8-dihydropterin to 7,8-dihydrofolate. DHPS is the target of sulphonamides, which are substrate analogues that compete with para-aminobenzoic acid. Bacterial DHPS (gene sul or folP) is a protein of about 275 to 315 amino acid residues that is either chromosomally encoded or found on various antibiotic resistance plasmids. In the lower eukaryote Pneumocystis carinii, DHPS is the C-terminal domain of a multifunctional folate synthesis enzyme (gene fas).
This family consists of several eukaryotic splicing factor 3B subunit 5 (SF3b5) proteins. SF3b5 is a 10 kDa subunit of the splicing factor SF3b. SF3b associates with the splicing factor SF3a and a 12S RNA unit to form the U2 small nuclear ribonucleoprotein complex. SF3b5 and SF3b14b are also thought to facilitate the interaction of U2 with the branch site. Also included in this entry is RDS3 complex subunit 10, another protein involved in mRNA splicing.
The L21E family contains proteins from a number of eukaryotic and archaebacterial organisms, including mammalian L2, Entamoeba histolytica L21, Caenorhabditis elegans L21 (C14B9.7), Saccharomyces cerevisiae (Baker's yeast) L21E (URP1) and Haloarcula marismortui HL31.
The ribosome recycling factor or ribosome release factor (RRF) dissociates ribosomes from mRNA after termination of translation, and is essential for bacterial growth. Thus ribosomes are 'recycled' and ready for another round of protein synthesis.
This entry represents uroporphyrinogen decarboxylase (HemE), which catalyzes the fifth step in the haem biosynthetic pathway, converting uroporphyrinogen III to coproporphyrinogen III by decarboxylating the four acetate side chains of the substrate. This step takes the pathway toward protoporphyrin IX, a common precursor of both haem and chlorophyll, rather than toward precorrin 2 and its products.
This activity is essential in all organisms, and subnormal activity of URO-D leads to the most common form of porphyria in humans, porphyria cutanea tarda (PCT).
Triosephosphate isomerase (TIM) is the glycolytic enzyme that catalyses the reversible interconversion of glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. TIM plays an important role in several metabolic pathways and is essential for efficient energy production. It is a dimer of identical subunits, each of which is made up of about 250 amino-acid residues. A glutamic acid residue is involved in the catalytic mechanism. The sequence around the active site residue is perfectly conserved in all known TIMs. Deficiencies in TIM are associated with haemolytic anaemia coupled with a progressive, severe neurological disorder.
This entry represents the N-terminal region (approximately 8 residues) of the eukaryotic mitochondrial 39-S ribosomal protein L47 (MRP-L47). Mitochondrial ribosomal proteins (MRPs) are the counterparts of the cytoplasmic ribosomal proteins, in that they fulfil similar functions in protein biosynthesis. However, they are distinct in number, features and primary structure.
This family represents a conserved region within eukaryotic lung seven transmembrane receptors and related proteins.
The following phosphorylases belong to the same family:
It should be noted that mammalian and some bacterial PNP as well as eukaryotic MTA phosphorylase belong to a different family of phosphorylases.
Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolysing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate binding domain. In prokaryotes, dimeric GrpE is the co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle.
The X-ray crystal structure of GrpE in complex with the ATPase domain of DnaK revealed that GrpE is an asymmetric homodimer, bent in a manner that favours extensive contacts with only one DnaK ATPase monomer. GrpE does not actively compete for the atomic positions occupied by the nucleotide. GrpE and ADP mutually reduce one another's affinity for DnaK 200-fold, and ATP instantly dissociates GrpE from DnaK.
Tctex-1 is a dynein light chain. Dynein translocates rhodopsin-bearing vesicles along microtubules and it has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. An efficient vectorial transport system is required to deliver large numbers of newly synthesized rhodopsin molecules (~10^7 molecules per day per photoreceptor) to the base of the outer segment of the photoreceptor; Tctex-1 may well play a role in this process. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit the interaction between Tctex-1 and rhodopsin, which may be the molecular basis of retinitis pigmentosa.
In the mouse, the chromosomal location and pattern of expression of Tctex-1 make it a candidate for involvement in male sterility.
Ribosomal protein L21 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L21 is known to bind to the 23S rRNA in the presence of L20. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:
Bacterial L21 is a protein of about 100 amino-acid residues; the mature form of the spinach chloroplast L21 has 200 residues.
Ribosomal protein L9 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L9 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities.
The crystal structure of Bacillus stearothermophilus L9 shows the 149-residue protein comprises two globular domains connected by a rigid linker. Each domain contains an rRNA binding site, and the protein functions as a structural protein in the large subunit of the ribosome. The C-terminal domain consists of two loops, an alpha-helix and a three-stranded mixed parallel, anti-parallel beta-sheet packed against the central alpha-helix. The long central alpha-helix is exposed to solvent in the middle and participates in the hydrophobic cores of the two domains at both ends.
This family consists of several eukaryotic proteins, which are homologues of the yeast MED7 protein. Activation of gene transcription in metazoans is a multistep process that is triggered by factors that recognise transcriptional enhancer sites in DNA. These factors work with co-activators such as MED7 to direct transcriptional initiation by the RNA polymerase II apparatus.
DNA damaging agents such as the anti-tumour drugs bleomycin and neocarzinostatin or those that generate oxygen radicals produce a variety of lesions in DNA. Amongst these is base-loss which forms apurinic/apyrimidinic (AP) sites or strand breaks with atypical 3' termini. DNA repair at the AP sites is initiated by specific endonuclease cleavage of the phosphodiester backbone. Such endonucleases are also generally capable of removing blocking groups from the 3' terminus of DNA strand breaks.
AP endonucleases can be classified into two families based on sequence similarity. Family 2 groups the enzymes listed below.
Escherichia coli endonuclease IV and its S. cerevisiae homologue Apn1 have been shown to be transition metalloproteins that require zinc and manganese for activity.
The folding pathway of tubulins includes highly specific interactions with a series of cofactors (A, B, C, D and E) after they are released from the eukaryotic chaperonin CCT. Cofactors A and D capture and stabilise tubulin in a quasi-native conformation. Cofactor E binds to the cofactor D-tubulin complex, and interaction with cofactor C then causes the release of tubulin polypeptides in the native state. This family is the tubulin-specific chaperone A.
Ribosomal protein S9 is one of the proteins from the small ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, algal chloroplast, cyanelle and archaeal S9 proteins, and mammalian, plant and yeast mitochondrial ribosomal S9 proteins.
Members of this family are components of the mitotic spindle. It has been shown that Nuf2 from yeast is part of a complex called the Ndc80p complex. This complex is thought to bind to the microtubules of the spindle. An Arabidopsis protein not previously identified as a member has been included in this family. The match is not strong, but in common with other members of this family it contains a coiled-coil C-terminal to this region.
Eukaryotic translation initiation factor 1A (eIF-1A) (formerly known as eIF-4C) is a protein that seems to be required for maximal rate of protein biosynthesis. It enhances ribosome dissociation into subunits and stabilizes the binding of the initiator Met-tRNA to 40S ribosomal subunits. The archaea possess an eIF-1A homolog.
This family consists of several hypothetical eukaryotic proteins of unknown function.
This is a family of hypothetical proteins. A number of the sequence records state they are transmembrane proteins or putative permeases. It is not clear what source suggested that these proteins might be permeases and this information should be treated with caution.
This is a family of eukaryotic proteins with unknown function.
Maf1 is a negative regulator of RNA polymerase III. It targets the initiation factor TFIIIB.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.
This entry includes the asparagine, aspartic acid, lysine, and pyrrolysyl tRNA synthetases. Pyrrolysine is a lysine derivative with a bulky pyrroline ring.
Lysyl-tRNA synthetase is an alpha 2 homodimer, with representatives in both class I and class II. In eubacteria and eukaryota lysyl-tRNA synthetases belong to class II, in the same family as aspartyl tRNA synthetase. The class Ic lysyl-tRNA synthetase family is present in archaea and some eubacteria. Moreover, in some eubacteria there is a gene X, which is similar to a part of lysyl-tRNA synthetase from class II. Lysyl-tRNA synthetase is duplicated in some species; in Escherichia coli, for example, there is a constitutive gene (lysS) and an induced one (lysU). Lysine is activated by being attached to the alpha-phosphate of AMP before being transferred to the cognate tRNA. The refined crystal structures give "snapshots" of the active site corresponding to key steps in the aminoacylation reaction and provide the structural framework for understanding the mechanism of lysine activation. The active site of LysU is shaped to position the substrates for the nucleophilic attack of the lysine carboxylate on the ATP alpha-phosphate. No residues are directly involved in catalysis, but a number of highly conserved amino acids and three metal ions coordinate the substrates and stabilise the pentavalent transition state. A loop close to the catalytic pocket, disordered in the lysine-bound structure, becomes ordered upon adenine binding.
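The class assignments described above can be captured as a small lookup. The following is a minimal sketch in Python, not part of the original annotation: the set names and the function `synthetase_classes` are hypothetical, and the membership lists are copied from the text, with lysine flagged as occurring in both classes because a class I lysyl-tRNA synthetase is reported in archaea and some eubacteria.

```python
# Class membership as listed in the text above (amino acids keyed by name).
CLASS_I = frozenset({
    "arginine", "cysteine", "glutamic acid", "glutamine", "isoleucine",
    "leucine", "methionine", "tyrosine", "tryptophan", "valine",
})
CLASS_II = frozenset({
    "alanine", "asparagine", "aspartic acid", "glycine", "histidine",
    "lysine", "phenylalanine", "proline", "serine", "threonine",
})
# Lysyl-tRNA synthetase also exists as a class I enzyme in archaea and
# some eubacteria, so lysine is treated as a dual-class case here.
DUAL_CLASS = frozenset({"lysine"})


def synthetase_classes(amino_acid: str) -> set[str]:
    """Return the synthetase class(es) reported for an amino acid."""
    name = amino_acid.strip().lower()
    classes: set[str] = set()
    if name in CLASS_I or name in DUAL_CLASS:
        classes.add("I")
    if name in CLASS_II:
        classes.add("II")
    if not classes:
        raise KeyError(f"no class assignment listed for {amino_acid!r}")
    return classes


if __name__ == "__main__":
    for aa in ("valine", "serine", "lysine"):
        print(aa, sorted(synthetase_classes(aa)))
```

Returning a set rather than a single label keeps the lysine exception explicit instead of forcing it into one class.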
Choline kinase (ATP:choline phosphotransferase) belongs to the choline/ethanolamine kinase family.
Ethanolamine and choline are major membrane phospholipids, in the form of glycerophosphoethanolamine and glycerophosphocholine. Ethanolamine is also a component of the glycosylphosphatidylinositol (GPI) anchor, which is necessary for cell-surface protein attachment. The de novo synthesis of these phospholipids begins with the creation of phosphoethanolamine and phosphocholine by ethanolamine and choline kinases in the first step of the CDP-ethanolamine pathway. There are two putative choline/ethanolamine kinases (C/EKs) in the Trypanosoma brucei genome.
Ethanolamine kinase has no choline kinase activity and its activity is inhibited by ADP. Inositol supplementation represses ethanolamine kinase, decreasing the incorporation of ethanolamine into the CDP-ethanolamine pathway and into phosphatidylethanolamine and phosphatidylcholine.
This entry appears to represent a novel family of basic helix-loop-helix (bHLH) proteins that control differentiation and development of a variety of organs.
Human Nulp1 is a basic helix-loop-helix protein expressed broadly during early embryonic organogenesis. Overexpression of human Nulp1 in COS-7 cells inhibits the transcriptional activity of serum response factor (SRF), suggesting that Nulp1 may act as a novel bHLH transcriptional repressor in the SRF signalling pathway to mediate cellular functions.
Riboflavin is converted into catalytically active cofactors (FAD and FMN) by the actions of riboflavin kinase, which converts it into FMN, and FAD synthetase, which adenylates FMN to FAD. Eukaryotes usually have two separate enzymes, while most prokaryotes have a single bifunctional protein that can carry out both catalyses, although exceptions occur in both cases. While eukaryotic monofunctional riboflavin kinase is orthologous to the bifunctional prokaryotic enzyme, the monofunctional FAD synthetase differs from its prokaryotic counterpart, and is instead related to the PAPS-reductase family. The bacterial FAD synthetase that is part of the bifunctional enzyme has remote similarity to nucleotidyl transferases and, hence, it may be involved in the adenylylation reaction of FAD synthetases.
This entry represents riboflavin kinase, which occurs as part of a bifunctional enzyme or a stand-alone enzyme.
Members of this family are mannosyltransferase enzymes. At least some members are localised in the endoplasmic reticulum and are involved in GPI anchor biosynthesis. In yeast, SMP3 (YOR149C) has been implicated in plasmid stability.
Cytochrome c oxidase is an oligomeric enzymatic complex which is a component of the respiratory chain and is involved in the transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane. The number of polypeptides in the complex ranges from 3-4 (prokaryotes) up to 13 (mammals).
Subunit 2 (CO II) transfers the electrons from cytochrome c to the catalytic subunit 1. It contains two adjacent transmembrane regions in its N-terminus and the major part of the protein is exposed to the periplasmic or to the mitochondrial intermembrane space, respectively. CO II provides the substrate-binding site and contains a copper centre called Cu(A), probably the primary acceptor in cytochrome c oxidase. An exception is the corresponding subunit of the cbb3-type oxidase which lacks the copper A redox-centre. Several bacterial CO II have a C-terminal extension that contains a covalently bound haem c.
These sequences represent dihydrolipoamide dehydrogenase, a flavoprotein that acts in a number of ways. It is the E3 component of dehydrogenase complexes for pyruvate, 2-oxoglutarate, 2-oxoisovalerate, and acetoin. It can also serve as the L protein of the glycine cleavage system. This family includes a few members known to have distinct functions (ferric leghemoglobin reductase and NADH:ferredoxin oxidoreductase) but that may be predicted by homology to act as dihydrolipoamide dehydrogenase as well. The motif GGXCXXXGCXP near the N-terminus contains a redox-active disulphide.
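The N-terminal motif GGXCXXXGCXP can be expressed as a simple search pattern. The following is a minimal Python sketch, assuming one-letter amino acid codes and that X stands for any single residue; the function name `find_redox_motif` and the test fragment are hypothetical illustrations, not part of any database tool.

```python
import re

# GGXCXXXGCXP with X read as "any residue" (an assumption of this sketch).
# The lookahead allows overlapping occurrences to be reported too.
REDOX_MOTIF = re.compile(r"(?=GG[A-Z]C[A-Z]{3}GC[A-Z]P)")

def find_redox_motif(sequence):
    """Return the 0-based start positions of every motif occurrence."""
    return [m.start() for m in REDOX_MOTIF.finditer(sequence.upper())]

# Illustrative fragment only; not taken from a specific database record.
print(find_redox_motif("MSTGGTCLNVGCIPSK"))
```

A match only shows that the sequence pattern is present; it does not by itself establish that the two cysteines form a redox-active disulphide.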
This homodimeric, FAD-containing member of the pyridine nucleotide disulphide oxidoreductase family contains a C-terminal motif Cys-SeCys-Gly, where SeCys is selenocysteine encoded by TGA (in some sequence reports interpreted as a stop codon). In some members of this subfamily, Cys-SeCys-Gly is replaced by Cys-Cys-Gly. The reach of the selenium atom at the C-terminal arm of the protein is proposed to allow broad substrate specificity.
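Because TGA normally reads as a stop codon, the C-terminal Cys-SeCys-Gly motif is easy to miss in a coding sequence. The sketch below, in Python, checks the last four codons of a coding DNA sequence for either variant of the motif. It assumes an in-frame sequence that includes its final stop codon; the function name `c_terminal_motif` and the test sequences are hypothetical.

```python
CYS_CODONS = {"TGT", "TGC"}
GLY_CODONS = {"GGT", "GGC", "GGA", "GGG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}

def c_terminal_motif(cds):
    """Classify the C-terminal tripeptide of an in-frame coding sequence.

    Returns "Cys-SeCys-Gly", "Cys-Cys-Gly", or None if neither is present.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0 or len(cds) < 12:
        return None
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    first, second, third, stop = codons[-4:]
    if first not in CYS_CODONS or third not in GLY_CODONS or stop not in STOP_CODONS:
        return None
    if second == "TGA":
        # An in-frame TGA between Cys and Gly is read here as selenocysteine.
        return "Cys-SeCys-Gly"
    if second in CYS_CODONS:
        return "Cys-Cys-Gly"
    return None

print(c_terminal_motif("ATGGCTTGCTGAGGTTAA"))
```

This is only a positional check; recoding of TGA in a real transcript depends on additional signals that the sketch does not model.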
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
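The link between the linear order of the triad residues and the clan can be sketched as a small lookup. This is a minimal Python illustration that covers only the three orders named above; the function names and the residue positions in the usage line are hypothetical, and a real clan assignment rests on structural evidence rather than residue order alone.

```python
# Linear order of the catalytic triad -> clan, as listed in the text.
TRIAD_ORDER_TO_CLAN = {"HDS": "PA", "DHS": "SB", "SDH": "SC"}

def triad_order(positions):
    """positions maps 'H', 'D' and 'S' to their sequence positions."""
    return "".join(sorted(positions, key=positions.get))

def clan_from_triad(positions):
    """Return the clan for a known triad order, or None otherwise."""
    return TRIAD_ORDER_TO_CLAN.get(triad_order(positions))

# Illustrative positions: His before Asp before Ser.
print(clan_from_triad({"H": 57, "D": 102, "S": 195}))
```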
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of proteins contains serine peptidases belonging to the MEROPS peptidase family S54 (Rhomboid, clan S-). They are integral membrane proteins related to the Drosophila melanogaster (Fruit fly) rhomboid protein. Members of this family are found in archaea, bacteria and eukaryotes.
The D. melanogaster rhomboid protease cleaves type-1 transmembrane domains using a catalytic triad composed of serine, histidine and asparagine contributed by different transmembrane domains. It cleaves the transmembrane proteins Spitz, Gurken and Keren within their transmembrane domains to release a soluble TGFalpha-like growth factor. Cleavage occurs in the Golgi, following translocation of the substrates from the endoplasmic reticulum membrane by Star, another transmembrane protein. The growth factors are then able to activate the epidermal growth factor receptor.
Few substrates of mammalian rhomboid homologues have been determined, but rhomboid-like protein 2 (MEROPS S54.002) has been shown to cleave ephrin B3. Parasite-encoded rhomboid enzymes are also important for invasion of host cells by Toxoplasma and the malaria parasite.
In Saccharomyces cerevisiae (Baker's yeast) the Pcp1 (MDM37) protein (MEROPS S54.007) is a mitochondrial endopeptidase required for the activation of cytochrome c peroxidase and for the processing of the mitochondrial dynamin-like protein Mgm1. Mutations in Pcp1 result in cells that have fragmented mitochondria with very few short tubules.
The DegP/Htr family in Prokaryota, including cyanobacteria from which chloroplasts derive, consists of three serine-type endopeptidases: DegP (also named HtrA), DegQ (also named HhoA) and DegS (also named HtrH or HhoB). Consistent with the prokaryotic origins of chloroplasts, an Arabidopsis thaliana (Mouse-ear cress) DegP2 gene encoding a novel chloroplast homologue of the prokaryotic trypsin-type Deg/Htr serine proteases has been identified.
DegP is essential for bacterial survival at temperatures above 42 degrees C and for digesting misfolded protein in the periplasm. Mature DegP from Escherichia coli has 448 residues, of which His105, Asp135, and Ser210 form the catalytic triad. The protein has an N-terminal sequence typical of a leader peptide. Structural analysis indicates that bacterial HtrA is a serine protease belonging to the family of cage-forming proteases and that only unfolded polypeptides can be threaded in extended conformation into the cage to access the proteolytic sites. Disulphide bonds of partially unfolded substrates impede protein breakdown and represent a conformational constraint for entering the inner cavity. This preference for unfolded polypeptides might be the reason for the increased proteolytic activity at higher temperatures.
The DegP/Htr family shares a modular architecture composed of an N-terminal segment believed to have regulatory functions, a conserved trypsin-like protease domain, and one or two PDZ domains, which mediate specific protein-protein interactions and bind preferentially to the C-terminal three to four residues of the target protein. DegP belongs to the trypsin clan SA. SA-type proteases have a two-domain structure with each domain forming a six-stranded barrel. The active site cleft is located at the interface of the two perpendicularly arranged barrel domains. The active site is constructed by several loops located at the C-terminal side of both barrel domains. The functional unit of DegP appears to be a trimer, which is stabilized exclusively by residues of the protease domains. The basic trimer has a funnel-like shape with the protease domains located at its top and the PDZ domains protruding to the outside. Once substrates have been bound, they have to be delivered into the interior of the funnel and the proteolytic sites. In contrast to other protease-chaperone systems, ATP does not drive binding and release of substrates.
The degQ and degS genes of E. coli encode proteins of 455 and 355 residues that are homologues of the DegP protease. Purified DegQ protein has the properties of a serine endopeptidase, and is processed by the removal of a 27-residue N-terminal signal sequence. Deletion studies suggest that DegQ, like DegP, functions as a periplasmic protease in vivo.
This entry represents a set of known and suspected serine proteases related to DegP2 from Arabidopsis thaliana. DegP2 is a serine protease that performs the primary cleavage of the photodamaged D1 protein in plant photosystem II.
This domain is involved in pre-rRNA processing. It has been shown to be required either for nucleolar retention or correct assembly of the box C/D snoRNP in Saccharomyces cerevisiae.
The Histidine Triad (HIT) motif, His-phi-His-phi-His-phi-phi (phi, a hydrophobic amino acid) was identified as being highly conserved in a variety of organisms. Crystal structure of rabbit Hint, purified as an adenosine and AMP-binding protein, showed that proteins in the HIT superfamily are conserved as nucleotide-binding proteins and that Hint homologues, which are found in all forms of life, are structurally related to Fhit homologues and GalT-related enzymes, which have more restricted phylogenetic profiles. Hint homologues including rabbit Hint and yeast Hnt1 hydrolyse adenosine 5' monophosphoramide substrates such as AMP-NH2 and AMP-lysine to AMP plus the amine product and function as positive regulators of Cdk7/Kin28 in vivo. Fhit homologues are diadenosine polyphosphate hydrolases and function as tumour suppressors in human and mouse though the tumour suppressing function of Fhit does not depend on ApppA hydrolysis. The third branch of the HIT superfamily, which includes GalT homologues, contains a related His-X-His-X-Gln motif and transfers nucleoside monophosphate moieties to phosphorylated second substrates rather than hydrolysing them.
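The HIT motif His-phi-His-phi-His-phi-phi lends itself to a pattern search. The sketch below is a minimal Python illustration; the set of residues counted as hydrophobic (A, V, I, L, M, F, W, Y) is an assumption made here, since the text does not define phi precisely, and the function name `find_hit_motif` and the test sequence are hypothetical.

```python
import re

# Residues treated as hydrophobic ("phi"); this choice is an assumption.
PHI = "[AVILMFWY]"

# His-phi-His-phi-His-phi-phi, wrapped in a lookahead so that
# overlapping occurrences are also found.
HIT_MOTIF = re.compile(f"(?=H{PHI}H{PHI}H{PHI}{PHI})")

def find_hit_motif(sequence):
    """Return the 0-based start positions of every HIT motif occurrence."""
    return [m.start() for m in HIT_MOTIF.finditer(sequence.upper())]

# Illustrative fragment only.
print(find_hit_motif("GGHVHLHVLGR"))
```

Widening or narrowing the `PHI` set changes which sequences match, so the result should be read as a candidate list rather than a family assignment.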
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
The genomic structure and sequence of the human ribosomal protein L7a has been determined. The gene contains 8 exons and 7 introns, encompassing 3179 bp. The human gene resembles other mammalian ribosomal protein genes in so far as it contains a short first exon, a short 5' untranslated leader and its transcriptional start sites at C residues embedded in a poly-pyrimidine tract.
The sequence of a gene for ribosomal protein L4 of Saccharomyces cerevisiae (Baker's yeast) has also been determined, which, unlike most of its other ribosomal protein genes, has no intron. The single open reading frame is highly similar to mammalian ribosomal protein L7a.
There appear to be two genes for L4, both of which are active. Yeast cells containing a disruption of the L4-1 gene form smaller colonies than either wild-type or disrupted L4-2 strains. Disruption of both L4 genes is lethal, probably resulting from an inability of the organism to produce functional ribosomes.
Several other ribosomal proteins have been found to share sequence similarity with L7a, including yeast NHP2, Bacillus subtilis hypothetical protein ylxQ, Haloarcula marismortui (Halobacterium marismortui) Hs6, and Methanocaldococcus jannaschii MJ1203.
This InterPro entry focuses on regions that characterise the ribosomal L7A proteins but distinguish them from the rest of the HMG-like family.
Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.
EF1A (also known as EF-1alpha or EF-Tu) is a G-protein. It forms a ternary complex of EF1A-GTP-aminoacyl-tRNA. The binding of aminoacyl-tRNA stimulates GTP hydrolysis by EF1A, causing a conformational change in EF1A that causes EF1A-GDP to detach from the ribosome, leaving the aminoacyl-tRNA attached at the A-site. Only the cognate aminoacyl-tRNA can induce the required conformational change in EF1A through its tight anticodon-codon binding. EF1A-GDP is returned to its active state, EF1A-GTP, through the action of another elongation factor, EF1B (also known as EF-Ts or EF-1beta/gamma/delta).
This entry represents EF1A (or EF-Tu) proteins found primarily in bacteria, mitochondria and chloroplasts. Eukaryotic and archaeal EF1A are excluded from this entry. When bound to GTP, EF-Tu can form a complex with any (correctly) aminoacylated tRNA except those for initiation and for selenocysteine, in which case EF-Tu is replaced by other factors.
More information about these proteins can be found at Protein of the Month: Elongation Factors.
Initiation factor 2 (IF-2) is one of the three factors required for the initiation of protein biosynthesis in bacteria. IF-2 promotes the GTP-dependent binding of the initiator tRNA to the small subunit of the ribosome. IF-2 is a protein of about 70 to 95 kDa that contains a central GTP-binding domain flanked by a highly variable N-terminal domain and a more conserved C-terminal domain. Some members of this group undergo protein self-splicing that involves a post-translational excision of the intein followed by peptide ligation.
The function of IF-2 in facilitating the proper binding of initiator methionyl-tRNA to the ribosomal P site appears to be universally conserved, with an IF-2 homologue (aIF-2) present in the archaeon Methanopyrus kandleri.
Members of this family are related to the pre-mRNA splicing factor PRP38 from yeast; therefore, all the members of this family could be involved in splicing. This conserved region could be involved in RNA binding. The putative domain is about 180 amino acids in length. PRP38 is a unique component of the U4/U6.U5 tri-small nuclear ribonucleoprotein (snRNP) particle and is necessary for an essential step late in spliceosome maturation.
2-oxoglutarate dehydrogenase is a key enzyme in the TCA cycle, converting 2-oxoglutarate, coenzyme A and NAD(+) to succinyl-CoA, NADH and carbon dioxide. The activity of this enzyme is tightly regulated and it is a major determinant of the metabolic flux through the TCA cycle. This enzyme is composed of multiple copies of three different subunits: 2-oxoglutarate dehydrogenase (E1), dihydrolipoamide succinyltransferase (E2) and lipoamide dehydrogenase (E3), which is often shared with similar enzymes such as pyruvate dehydrogenase. The E2 component forms a large multimeric core which binds the peripheral E1 and E3 subunits. The substrate is transferred between the active sites of the different subunits by a lipoyl moiety, bound to a lysine residue from the E2 polypeptide.
This entry represents the E1 subunit of 2-oxoglutarate dehydrogenase. It catalyses the decarboxylation of this compound in a thiamine pyrophosphate-dependent manner, transferring the resultant succinyl group onto the lipoyl moiety bound to the E2 subunit. The E1 ortholog from Corynebacterium glutamicum (Brevibacterium flavum) is unusual in having an N-terminal extension that resembles the E2 component of 2-oxoglutarate dehydrogenase enzyme.
Named the YEATS family, after 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', this family also contains the GAS41 protein. All these proteins are thought to have a transcription stimulatory activity.
Genes that negatively regulate proliferation inside the cell are of considerable interest because of the implications in processes such as development and cancer. Prohibitin, a novel cytoplasmic anti-proliferative protein widely expressed in a variety of tissues, inhibits DNA synthesis. Studies have suggested that prohibitin may be a suppressor gene and is associated with tumour development and/or progression of at least some breast cancers. Sequence comparisons suggest that the prohibitin gene is an analogue of Cc, a Drosophila melanogaster gene that is vital for normal development.
Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified. This sequence motif represents one beta-sheet blade, and several of these repeats can associate to form a beta-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein, creating a 6-bladed beta-propeller. The motif is also found in mouse protein MIPP and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin, and in galactose oxidase from the fungus Dactylium dendroides. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded anti-parallel beta-sheet motif that forms the repeat unit in a super-barrel structural fold.
The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase.
This entry represents Kelch-related domains, including the BTB (broad-complex, tramtrack and bric a brac) domain, which defines a family of proteins involved in diverse biological processes. BTB proteins are divided into subgroups depending on what domain lies at the C-terminus. BTB-Kelch proteins have Kelch repeats that form a beta-propeller that can interact with actin filaments.
This family of proteins of unknown function contains a subset of Bax inhibitor-1 proteins.
Acyl-CoA-binding protein (ACBP) is a small (10 kDa) protein that binds medium- and long-chain acyl-CoA esters with very high affinity and may function as an intracellular carrier of acyl-CoA esters. ACBP is also known as diazepam binding inhibitor (DBI) or endozepine (EP) because of its ability to displace diazepam from the benzodiazepine (BZD) recognition site located on the GABA type A receptor. It is therefore possible that this protein also acts as a neuropeptide to modulate the action of the GABA receptor.
ACBP is a highly conserved protein of about 90 residues that is found in all four eukaryotic kingdoms, Animalia, Plantae, Fungi and Protista, and in some eubacterial species.
Although ACBP occurs as a completely independent protein, intact ACB domains have been identified in a number of large, multifunctional proteins in a variety of eukaryotic species. These include large membrane-associated proteins with N-terminal ACB domains, multifunctional enzymes with both ACB and peroxisomal enoyl-CoA Delta(3), Delta(2)-enoyl-CoA isomerase domains, and proteins with both an ACB domain and ankyrin repeats.
The ACB domain consists of four alpha-helices arranged in a bowl shape with a highly exposed acyl-CoA-binding site. The ligand is bound through specific interactions with residues on the protein, most notably several conserved positive charges that interact with the phosphate group on the adenosine-3'-phosphate moiety, and the acyl chain is sandwiched between the hydrophobic surfaces of CoA and the protein.
Other proteins containing an ACB domain include:
Adenylate kinase (ADK) catalyses the reversible reaction AMP + MgATP = ADP + MgADP, an essential reaction for many processes in living cells. Two ADK isozymes have been identified in mammalian cells. These specifically bind AMP and favour binding to ATP over other nucleotide triphosphates (AK1 is cytosolic and AK2 is located in the mitochondria). A third ADK has been identified in bovine heart and human cells; this is a mitochondrial GTP:AMP phosphotransferase, also specific for the phosphorylation of AMP, but it can only use GTP or ITP as a substrate. ADK has also been identified in different bacterial species and in yeast. Two further enzymes are known to be related to the ADK family, i.e. yeast uridine monophosphokinase and slime mold UMP-CMP kinase. Within the ADK family there are several conserved regions, including the ATP-binding domains. One of the most conserved areas includes an Arg residue, whose modification inactivates the enzyme, together with an Asp that resides in the catalytic cleft of the enzyme and participates in a salt bridge.
Ribonucleotide reductase catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides:
2'-deoxyribonucleoside diphosphate + oxidized thioredoxin + H2O = ribonucleoside diphosphate + reduced thioredoxin
It provides the precursors necessary for DNA synthesis. RNRs divide into three classes on the basis of their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, bacteriophage and viruses, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in anaerobic bacteria and bacteriophage, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes.
Ribonucleotide reductase is an oligomeric enzyme composed of a large subunit (700 to 1000 residues) and a small subunit (300 to 400 residues) - class II RNRs are less complex, using the small molecule B12 in place of the small chain. The small chain binds two iron atoms (three Glu, one Asp, and two His are involved in metal binding) and contains an active site tyrosine radical. The regions of the sequence that contain the metal-binding residues and the active site tyrosine are conserved in ribonucleotide reductase small chain from prokaryotes, eukaryotes and viruses. We have selected one of these regions as a signature pattern. It contains the active site residue as well as a glutamate and a histidine involved in the binding of iron.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
This family consists of ribosomal protein L5 from eukaryotes. The ribosomal 5S RNA is the only known rRNA species to bind a ribosomal protein before its assembly into the ribosomal subunits. In eukaryotes, the 5S rRNA molecule binds one protein species, a 34-kDa protein which has been implicated in the intracellular transport of 5S rRNA.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
The L32e family consists of proteins that have 135 to 240 amino-acid residues.
This entry represents tRNA (guanine-N-7) methyltransferase, which catalyses the formation of N(7)-methylguanine at position 46 (m7G46) in tRNA. Capping of the pre-mRNA 5' end by addition of a monomethylated guanosine cap (m(7)G) is an essential modification, and the earliest, in the biogenesis of mRNA. The reaction is catalysed by three enzymes: triphosphatase, guanylyltransferase, and tRNA (guanine-N-7) methyltransferase.
CutA1 is a widespread protein of about 12 kDa found in bacteria, plants, and animals, including humans. The protein was originally identified in a gene locus of Escherichia coli called cutA, which is involved in divalent metal tolerance. The cutA locus consists of two operons, one containing a single gene encoding a cytoplasmic protein, CutA1, and the other composed of two genes encoding 50-kDa (CutA2) and 24-kDa (CutA3) inner membrane proteins. Molecular genetics studies on the E. coli cutA locus showed that some mutations lead to copper sensitivity due to its increased uptake. However, the specific function of CutA1 in E. coli is still unknown.
A possible role has been proposed for mammalian CutA1 in the anchoring of the enzyme acetylcholinesterase (AChE) in neuronal cell membranes. CutA1 does not directly interact with AChE, but the CutA1 gene is widely expressed in different regions of the brain with an expression pattern that parallels that of AChE. In addition, CutA1 co-purified with AChE from human caudate nucleus. CutA1, thus, might provide an intriguing link between copper tolerance in bacteria and a complex process in the brain of the most evolved organisms.
Both rat and E. coli CutA1 have been crystallised. Both proteins are trimeric in the crystals and in solution through an inter-subunit beta-sheet formation. Each monomer exhibits the same overall structure, adopting a ferredoxin-like fold made of an alpha-beta sandwich with antiparallel beta-sheet and containing an additional short strand and a C-terminal helix. In the beta-sheet, alternate strands are connected by helices with positive crossovers, resulting in a double beta-alpha-beta motif where the antiparallel beta-sheet packs against antiparallel alpha-helices. The C-terminal helix packs orthogonal to the N terminus.
The strong structural similarity of CutA1 with PII proteins might point to a role for CutA1 in signalling through allosteric communication between monomers. CutA1 may be involved in the tuning of a disulphide bond cascade in bacteria and mammals, acting as the PII proteins do in the nitrogen signal cascade in bacteria and plants.
O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in 'clans'.
Glycoside hydrolase family 35 comprises enzymes with only one known activity: beta-galactosidase.
Mammalian beta-galactosidase is a lysosomal enzyme (gene GLB1) which cleaves the terminal galactose from gangliosides, glycoproteins, and glycosaminoglycans and whose deficiency is the cause of the genetic disease Gm(1) gangliosidosis (Morquio disease type B).
Glucose-6-phosphate dehydrogenase (G6PDH) is a ubiquitous protein, present in bacteria and all eukaryotic cell types. The enzyme catalyses the first step in the pentose pathway, i.e. the conversion of glucose-6-phosphate to gluconolactone 6-phosphate in the presence of NADP, producing NADPH. The ubiquitous expression of the enzyme gives it a major role in the production of NADPH for the many NADPH-mediated reductive processes in all cells. Deficiency of G6PDH is a common genetic abnormality affecting millions of people worldwide. Many sequence variants, most caused by single point mutations, are known, exhibiting a wide variety of phenotypes.
This entry represents type 2 phosphatidic acid phosphatase (PAP2) enzymes, such as phosphatidylglycerophosphatase B from Escherichia coli. PAP2 enzymes have a core structure consisting of a 5-helical bundle, where the beginning of the third helix binds the cofactor. PAP2 enzymes catalyse the dephosphorylation of phosphatidate, yielding diacylglycerol and inorganic phosphate. In eukaryotic cells, PAP activity has a central role in the synthesis of phospholipids and triacylglycerol through its product diacylglycerol, and it also generates and/or degrades lipid-signalling molecules that are related to phosphatidate.
Other related enzymes have a similar core structure, including haloperoxidases such as bromoperoxidase (contains one core bundle, but forms a dimer) and chloroperoxidases (contain two core bundles arranged as in other family dimers), bacitracin transport permease from Bacillus licheniformis, and glucose-6-phosphatase from rat. The vanadium-dependent haloperoxidases exclusively catalyse the oxidation of halides, and act as histidine phosphatases, using histidine for the nucleophilic attack in the first step of the reaction. Amino acid residues involved in binding phosphate/vanadate are conserved between the two families, supporting a proposal that vanadium passes through a tetrahedral intermediate during the reaction mechanism.
Calmodulin (CaM) is recognized as a major calcium sensor and orchestrator of regulatory events through its interaction with a diverse group of cellular proteins. Three classes of recognition motifs exist for many of the known CaM-binding proteins: the IQ motif as a consensus for Ca2+-independent binding, and two related motifs for Ca2+-dependent binding, termed 1-8-14 and 1-5-10 based on the position of conserved hydrophobic residues.
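The positional naming of the Ca2+-dependent motifs can be illustrated with a minimal sketch that scans a sequence for windows in which residues 1, 5 and 10 are all hydrophobic. The hydrophobic alphabet, the function name `find_spaced_hydrophobic` and the test peptide are assumptions made for illustration only; real CaM-binding motif definitions carry further constraints (for example on net charge) that are not modelled here.

```python
# Assumed hydrophobic alphabet; published motif definitions vary.
HYDROPHOBIC = set("FILVWM")


def find_spaced_hydrophobic(seq, positions=(1, 5, 10)):
    """Return 1-based start positions of windows in which every listed
    position (counted from the window start) holds a hydrophobic residue."""
    seq = seq.upper()
    span = max(positions)
    hits = []
    for start in range(len(seq) - span + 1):
        if all(seq[start + p - 1] in HYDROPHOBIC for p in positions):
            hits.append(start + 1)
    return hits


# Synthetic peptide (not a real CaM target): L, I and L sit at
# positions 1, 5 and 10 of the window that starts at residue 2.
print(find_spaced_hydrophobic("KLAAAIKKAALK"))
```

Other spacings can be tried by passing a different `positions` tuple.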
The regulatory domain of scallop myosin is a three-chain protein complex that switches on this motor in response to Ca2+ binding. Side-chain interactions link the two light chains in tandem to adjacent segments of the heavy chain bearing the IQ-sequence motif. The Ca2+-binding site is a novel EF-hand motif on the essential light chain and is stabilized by linkages involving the heavy chain and both light chains, accounting for the requirement of all three chains for Ca2+ binding and regulation in the intact myosin molecule.
The Drosophila pumilio gene codes for an unusual protein that binds RNA through the Puf domain, which usually occurs as a tandem repeat of eight domains. The FBF-2 protein of Caenorhabditis elegans also has a Puf domain. Both proteins function as translational repressors in early embryonic development by binding sequences in the 3' UTR of target mRNAs. The same type of repetitive domain has been found in a number of other proteins from all eukaryotic kingdoms. The Puf proteins characterised to date have been reported to bind to 3'-untranslated region (UTR) sequences encompassing a so-called UGUR tetranucleotide motif and thereby to repress gene expression by affecting mRNA translation or stability.
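As a rough illustration of how candidate Puf-binding sites might be located, the sketch below scans a 3' UTR for the UGUR tetranucleotide, taking R as a purine (A or G) under the IUPAC nucleotide code. The function name `find_ugur` and the example sequence are hypothetical, and a sequence match alone does not establish binding.

```python
import re

# UGUR with R = A or G; the lookahead lets overlapping sites be reported.
UGUR = re.compile(r"(?=UGU[AG])")


def find_ugur(utr):
    """Return 1-based start positions of UGUR motifs in a UTR sequence.
    DNA input (T) is converted to RNA (U) before scanning."""
    rna = utr.upper().replace("T", "U")
    return [m.start() + 1 for m in UGUR.finditer(rna)]


# Synthetic UTR fragment containing UGUA and UGUG.
print(find_ugur("AAUGUAAAUGUGCC"))
```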
In Saccharomyces cerevisiae (Baker's yeast), five proteins, termed Puf1p to Puf5p, bear six to eight Puf repeats. Puf3p binds nearly exclusively to cytoplasmic mRNAs that encode mitochondrial proteins; Puf1p and Puf2p interact preferentially with mRNAs encoding membrane-associated proteins; Puf4p preferentially binds mRNAs encoding nucleolar ribosomal RNA-processing factors; and Puf5p is associated with mRNAs encoding chromatin modifiers and components of the spindle pole body. This suggests the existence of an extensive network of RNA-protein interactions that coordinate the post-transcriptional fate of large sets of cytotopically and functionally related RNAs through each stage of their lifecycle.
The EH (for Eps15 Homology) domain is a protein-protein interaction module of approximately 95 residues which was originally identified as a repeated sequence present in three copies at the N-terminus of the tyrosine kinase substrates Eps15 and Eps15R. The EH domain was subsequently found in several proteins implicated in endocytosis, vesicle transport and signal transduction in organisms ranging from yeast to mammals. EH domains are present in one to three copies and they may include calcium-binding domains of the EF-hand type. Eps15 is divided into three domains: domain I contains signatures of a regulatory domain, including a candidate tyrosine phosphorylation site and EF-hand-type calcium-binding domains; domain II presents the characteristic heptad repeats of coiled-coil rod-like proteins; and domain III displays a repeated aspartic acid-proline-phenylalanine motif similar to a consensus sequence of several methylases.
EH domains have been shown to bind specifically but with moderate affinity to peptides containing short, unmodified motifs through predominantly hydrophobic interactions. The target motifs are divided into three classes: class I consists of the consensus Asn-Pro-Phe (NPF) sequence; class II consists of aromatic and hydrophobic di- and tripeptide motifs, including the Phe-Trp (FW), Trp-Trp (WW), and Ser-Trp-Gly (SWG) motifs; and class III contains the His-(Thr/Ser)-Phe motif (HTF/HSF). The structure of several EH domains has been solved by NMR spectroscopy. The fold consists of two helix-loop-helix motifs characteristic of EF-hand domains, connected by a short antiparallel beta-sheet. The target peptide is bound in a hydrophobic pocket between two alpha helices. Sequence analysis and structural data indicate that not all the EF-hands are capable of binding calcium because of substitutions of the calcium-liganding residues in the loop.
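The three classes of EH-domain target motif described above can be expressed as simple patterns. The following is a minimal sketch, assuming one-letter amino acid codes and plain substring matching; the function name `eh_motif_classes` and the example peptides are hypothetical, and a match indicates only a candidate motif, not demonstrated binding.

```python
import re

# One pattern per target-motif class, as listed in the text.
EH_CLASSES = {
    "I": re.compile(r"NPF"),          # Asn-Pro-Phe
    "II": re.compile(r"FW|WW|SWG"),   # Phe-Trp, Trp-Trp, Ser-Trp-Gly
    "III": re.compile(r"H[TS]F"),     # His-(Thr/Ser)-Phe
}


def eh_motif_classes(peptide):
    """Return the motif classes (in order I, II, III) found in a peptide."""
    pep = peptide.upper()
    return [name for name, pattern in EH_CLASSES.items() if pattern.search(pep)]


# Synthetic peptides, for illustration only.
print(eh_motif_classes("GSNPFLA"))
print(eh_motif_classes("AAHSFAASWG"))
```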
This domain is often implicated in the regulation of protein transport/sorting and membrane trafficking. Messenger RNA translation initiation and cytoplasmic poly(A) tail shortening require the poly(A)-binding protein (PAB) in yeast. The PAB-dependent poly(A) ribonuclease (PAN) is organised into distinct domains containing repeated sequence elements.
The tetratrico peptide repeat region (TPR) is a structural motif present in a wide range of proteins. It mediates protein-protein interactions and the assembly of multiprotein complexes. The TPR motif consists of 3-16 tandem repeats of 34 amino acid residues, although individual TPR motifs can be dispersed in the protein sequence. Sequence alignment of the TPR domains reveals a consensus sequence defined by a pattern of small and large amino acids. TPR motifs have been identified in various different organisms, ranging from bacteria to humans. Proteins containing TPRs are involved in a variety of biological processes, such as cell cycle regulation, transcriptional control, mitochondrial and peroxisomal protein transport, neurogenesis and protein folding.
The X-ray structure of a domain containing three TPRs from protein phosphatase 5 revealed that TPR adopts a helix-turn-helix arrangement, with adjacent TPR motifs packing in a parallel fashion, resulting in a spiral of repeating anti-parallel alpha-helices. The two helices are denoted helix A and helix B. The packing angle between helix A and helix B is ~24° within a single TPR and generates a right-handed superhelical shape. Helix A interacts with helix B and with helix A' of the next TPR. Two protein surfaces are generated: the inner concave surface is contributed to mainly by residues on helices A, and the other surface presents residues from both helices A and B.
Sushi domains, also known as complement control protein (CCP) modules or short consensus repeats (SCR), exist in a wide variety of complement and adhesion proteins. The structure of this domain is known; it is based on a beta-sandwich arrangement, with one face made up of three beta-strands hydrogen-bonded to form a triple-stranded region at its centre and the other face formed from two separate beta-strands.
CD21 (also called C3d receptor, CR2, Epstein Barr virus receptor or EBV-R) is the receptor for EBV and for C3d, C3dg and iC3b. Complement components may activate B cells through CD21. CD21 is part of a large signal-transduction complex that also involves CD19, CD81, and Leu13.
Some of the proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Complement decay-accelerating factor (Antigen CD55) belongs to the Cromer blood group system and is associated with Cr(a), Dr(a), Es(a), Tc(a/b/c), Wd(a), WES(a/b), IFC and UMC antigens. Complement receptor type 1 (C3b/C4b receptor) (Antigen CD35) belongs to the Knops blood group system and is associated with Kn(a/b), McC(a), Sl(a) and Yk(a) antigens.
CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).
The calponin homology domain (also known as CH-domain) is a superfamily of actin-binding domains found in both cytoskeletal proteins and signal transduction proteins. It comprises the following groups of actin-binding domains:
A comprehensive review of proteins containing this type of actin-binding domains is given in.
The CH domain is involved in actin binding in some members of the family. However, in calponins there is evidence that the CH domain is not involved in actin-binding activity. Most proteins have two copies of the CH domain; however, some proteins such as calponin and the human vav proto-oncogene have only a single copy. The structure of an example CH domain has been solved.
Guanylate cyclases catalyse the formation of cyclic GMP (cGMP) from GTP. cGMP acts as an intracellular messenger, activating cGMP-dependent kinases and regulating cGMP-sensitive ion channels. The role of cGMP as a second messenger in vascular smooth muscle relaxation and retinal photo-transduction is well established. Guanylate cyclase is found both in the soluble and particulate fractions of eukaryotic cells. The soluble and plasma membrane-bound forms differ in structure, regulation and other properties. Most currently known plasma membrane-bound forms are receptors for small polypeptides. The soluble forms of guanylate cyclase are cytoplasmic heterodimers having alpha and beta subunits.
In all characterised eukaryote guanylyl- and adenylyl cyclases, cyclic nucleotide synthesis is carried out by the conserved class III cyclase domain.
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.
Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The DAG kinase domain is assumed to be an accessory domain. Upon cell stimulation, DAG kinase converts DAG into phosphatidate, initiating the resynthesis of phosphatidylinositols and attenuating protein kinase C activity. It catalyses the reaction: ATP + 1,2-diacylglycerol = ADP + 1,2-diacylglycerol 3-phosphate. The enzyme is stimulated by calcium and phosphatidylserine and phosphorylated by protein kinase C. This domain is always associated with
Membrane transport between compartments in eukaryotic cells requires proteins that allow the budding and scission of nascent cargo vesicles from one compartment and their targeting and fusion with another. Dynamins are large GTPases that belong to a protein superfamily that, in eukaryotic cells, includes classical dynamins, dynamin-like proteins, OPA1, Mx proteins, mitofusins and guanylate-binding proteins/atlastins, and are involved in the scission of a wide range of vesicles and organelles. They play a role in many processes including budding of transport vesicles, division of organelles, cytokinesis and pathogen resistance.
The minimal distinguishing architectural features that are common to all dynamins and are distinct from other GTPases are the structure of the large GTPase domain (about 300 amino acids) and the presence of two additional domains: the middle domain and the GTPase effector domain (GED), which are involved in oligomerization and regulation of the GTPase activity.
This entry represents the GTPase domain, containing the GTP-binding motifs that are needed for guanine-nucleotide binding and hydrolysis. The conservation of these motifs is absolute except for the final motif in guanylate-binding proteins. The GTPase catalytic activity can be stimulated by oligomerisation of the protein, which is mediated by interactions between the GTPase domain, the middle domain and the GED.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
The FYVE zinc finger is named after four proteins that it has been found in: Fab1, YOTB/ZK632.12, Vac1, and EEA1. The FYVE finger has been shown to bind two zinc ions. The FYVE finger has eight potential zinc coordinating cysteine positions. Many members of this family also include two histidines in a motif R+HHC+XCG, where + represents a charged residue and X any residue. FYVE-type domains are divided into two known classes: FYVE domains that specifically bind to phosphatidylinositol 3-phosphate in lipid bilayers and FYVE-related domains of undetermined function. Those that bind to phosphatidylinositol 3-phosphate are often found in proteins targeted to lipid membranes that are involved in regulating membrane traffic. Most FYVE domains target proteins to endosomes by binding specifically to phosphatidylinositol-3-phosphate at the membrane surface. By contrast, the CARP2 FYVE-like domain is not optimized to bind to phosphoinositides or insert into lipid bilayers. FYVE domains are distinguished from other zinc fingers by three signature sequences: an N-terminal WxxD motif, a basic R(R/K)HHCR patch, and a C-terminal RVC motif.
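The three FYVE signature sequences named above lend themselves to a simple ordered pattern scan. The sketch below is illustrative only: it assumes that 'x' means any single residue, searches for the signatures in N- to C-terminal order, and uses a made-up sequence rather than a real FYVE domain. The function name `fyve_signatures` is hypothetical.

```python
import re

# Signatures in N- to C-terminal order, as listed in the text.
SIGNATURES = [
    ("WxxD", re.compile(r"W..D")),
    ("R(R/K)HHCR", re.compile(r"R[RK]HHCR")),
    ("RVC", re.compile(r"RVC")),
]


def fyve_signatures(seq):
    """Map each signature name to its 1-based start position, or None.
    Each search resumes after the previous match to enforce the order."""
    seq = seq.upper()
    pos = 0
    found = {}
    for name, pattern in SIGNATURES:
        match = pattern.search(seq, pos)
        found[name] = match.start() + 1 if match else None
        if match:
            pos = match.end()
    return found


# Synthetic sequence containing all three signatures in order.
print(fyve_signatures("AAWAADGGGRKHHCRAAAARVCAA"))
```

A sequence in which any value is None lacks at least one signature in the expected order and, under these assumptions, would not be flagged as FYVE-like.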
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
This entry represents a domain found in guanylate kinase and in L-type calcium channel.
Guanylate kinase (GK) catalyzes the ATP-dependent phosphorylation of GMP into GDP. It is essential for recycling GMP and, indirectly, cGMP. In prokaryotes (such as Escherichia coli), lower eukaryotes (such as yeast) and vertebrates, GK is a highly conserved monomeric protein of about 200 amino acids. GK has been shown to be structurally similar to protein A57R (or SalG2R) from various strains of Vaccinia virus.
L-type calcium channels are formed from different alpha-1 subunit isoforms that determine the pharmacological properties of the channel, since they form the drug-binding domain. Other properties, such as gating voltage-dependence, G protein modulation and kinase susceptibility, are influenced by alpha-2, delta and beta subunits.
PAC motifs occur C-terminal to a subset of all known PAS motifs. The PAC motif is proposed to contribute to the PAS domain fold.
The 14-3-3 proteins are a large family of approximately 30 kDa acidic proteins which exist primarily as homo- and heterodimers within all eukaryotic cells. There is a high degree of sequence identity and conservation between all the 14-3-3 isotypes, particularly in the regions which form the dimer interface or line the central ligand-binding channel of the dimeric molecule. Each 14-3-3 protein sequence can be roughly divided into three sections: a divergent amino terminus, the conserved core region and a divergent carboxyl terminus. The conserved middle core region of the 14-3-3s encodes an amphipathic groove that forms the main functional domain, a cradle for interacting with client proteins. The monomer consists of nine helices organised in an antiparallel manner, forming an L-shaped structure. The interior of the L-structure is composed of four helices: H3 and H5, which contain many charged and polar amino acids, and H7 and H9, which contain hydrophobic amino acids. These four helices form the concave amphipathic groove that interacts with target peptides.
14-3-3 proteins mainly bind proteins containing phosphothreonine or phosphoserine motifs; however, exceptions to this rule do exist. Extensive investigation of the 14-3-3 binding site of the mammalian serine/threonine kinase Raf-1 has produced a consensus sequence for 14-3-3 binding, RSxpSxP (in the single-letter amino-acid code, where x denotes any amino acid and p indicates that the next residue is phosphorylated). 14-3-3 proteins appear to effect intracellular signalling in one of three ways: by direct regulation of the catalytic activity of the bound protein; by regulating interactions between the bound protein and other molecules in the cell through sequestration or modification; or by controlling the subcellular localisation of the bound ligand. Proteins appear to initially bind to a single dominant site and then subsequently to many, much weaker secondary interaction sites. The 14-3-3 dimer is capable of changing the conformation of its bound ligand whilst itself undergoing minimal structural alteration.
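Because a plain protein sequence carries no phosphorylation information, the RSxpSxP consensus can only be used to flag candidate sites whose second serine would need to be phosphorylated. A minimal sketch of such a scan follows; the function name `candidate_1433_sites` and the example sequence are hypothetical, and a hit is not evidence of actual 14-3-3 binding.

```python
import re

# R-S-x-S-x-P, where the second serine is the residue that would have to
# be phosphorylated. The lookahead reports overlapping candidates too.
CONSENSUS = re.compile(r"(?=(RS.S.P))")


def candidate_1433_sites(seq):
    """Return (motif_start, phosphoserine_position, matched_text) tuples,
    using 1-based positions, for every RSxSxP match in the sequence."""
    seq = seq.upper()
    return [(m.start() + 1, m.start() + 4, m.group(1))
            for m in CONSENSUS.finditer(seq)]


# Synthetic sequence: motif RSTSAP starts at residue 3, and the serine
# that would carry the phosphate is residue 6.
print(candidate_1433_sites("MKRSTSAPLL"))
```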
The actin-depolymerising factor homology (ADF-H) domain is an ~150-amino acid motif that is present in three phylogenetically distinct classes of eukaryotic actin-binding proteins:
Although these proteins are biochemically distinct and play different roles in actin dynamics, they all appear to use the ADF-H domain for their interactions with actin.
The ADF-H domain consists of a six-stranded mixed beta-sheet in which the four central strands (beta2-beta5) are anti-parallel and the two edge strands (beta1 and beta6) run parallel with the neighbouring strands. The sheet is surrounded by two alpha-helices on each side.
This entry describes a family of small GTPase-activating proteins, for example ARF1-directed GTPase-activating protein and the cycle control GTPase-activating protein (GAP) GCS1, which is important for the regulation of the ADP ribosylation factor ARF, a member of the Ras superfamily of GTP-binding proteins. The GTP-bound form of ARF is essential for the maintenance of normal Golgi morphology; it participates in the recruitment of coat proteins that are required for budding and fission of membranes. Before fusion with an acceptor compartment, the membrane must be uncoated. This step requires the hydrolysis of GTP associated with ARF. These proteins contain a characteristic zinc finger motif (Cys-x2-Cys-x(16,17)-Cys-x2-Cys) which displays some similarity to the C4-type GATA zinc finger. The ARFGAP domain displays no obvious similarity to other GAP proteins.
The 3D structure of the ARFGAP domain of the PYK2-associated protein beta has been solved. It consists of a three-stranded beta-sheet surrounded by 5 alpha helices. The domain is organised around a central zinc atom which is coordinated by 4 cysteines. The ARFGAP domain is clearly unrelated to the other GAP protein structures, which are exclusively helical. Classical GAP proteins accelerate GTPase activity by supplying an arginine finger to the active site. The crystal structure of ARFGAP bound to ARF revealed that the ARFGAP domain does not supply an arginine to the active site, which suggests a more indirect role of the ARFGAP domain in GTP hydrolysis.
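The zinc-coordinating arrangement can be sketched as a pattern scan. The example below assumes the four-cysteine spacing C-x2-C-x(16,17)-C-x2-C, which is consistent with the four cysteines that coordinate the zinc atom; the function name `find_c4_fingers` and the test sequences are hypothetical and synthetic.

```python
import re

# Assumed four-cysteine spacing: C-x2-C-x(16,17)-C-x2-C.
C4_FINGER = re.compile(r"C.{2}C(.{16,17})C.{2}C")


def find_c4_fingers(seq):
    """Return, for each match, the 1-based positions of its four cysteines."""
    seq = seq.upper()
    hits = []
    for m in C4_FINGER.finditer(seq):
        start = m.start()           # 0-based index of the first cysteine
        gap = len(m.group(1))       # 16 or 17 spacer residues
        hits.append([start + 1, start + 4, start + 5 + gap, start + 8 + gap])
    return hits


# Synthetic sequence with a 16-residue spacer between the cysteine pairs.
example = "MM" + "CAAC" + "A" * 16 + "CAAC" + "MM"
print(find_c4_fingers(example))
```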
The Rev protein of human immunodeficiency virus type 1 (HIV-1) facilitates nuclear export of unspliced and partly-spliced viral RNAs. Rev contains an RNA-binding domain and an effector domain; the latter is believed to interact with a cellular cofactor required for the Rev response and hence HIV-1 replication. Human Rev interacting protein (hRIP) specifically interacts with the Rev effector. The amino acid sequence of hRIP is characterised by an N-terminal, C-4 class zinc finger motif.
Diacylglycerol (DAG) is an important second messenger. Phorbol esters (PE) are analogues of DAG and potent tumour promoters that cause a variety of physiological changes when administered to both cells and tissues. DAG activates a family of serine/threonine protein kinases, collectively known as protein kinase C (PKC). Phorbol esters can directly stimulate PKC. The N-terminal region of PKC, known as C1, has been shown to bind PE and DAG in a phospholipid and zinc-dependent fashion. The C1 region contains one or two copies (depending on the isozyme of PKC) of a cysteine-rich domain, which is about 50 amino-acid residues long, and which is essential for DAG/PE-binding. The DAG/PE-binding domain binds two zinc ions; the ligands of these metal ions are probably the six cysteines and two histidines that are conserved in this domain.
CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes.
Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking, while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites.
Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern. The crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations.
In humans, mutations in conserved residues within CBS domains cause a variety of hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).
The name HECT comes from 'Homologous to the E6-AP Carboxyl Terminus'. Proteins containing this domain at the C-terminus include ubiquitin-protein ligase, which regulates ubiquitination of CDC25. Ubiquitin-protein ligase accepts ubiquitin from an E2 ubiquitin-conjugating enzyme in the form of a thioester, and then directly transfers the ubiquitin to targeted substrates. A cysteine residue is required for ubiquitin-thioester formation. Human thyroid receptor interacting protein 12, which also contains this domain, is a component of an ATP-dependent multisubunit protein that interacts with the ligand-binding domain of the thyroid hormone receptor. It could be an E3 ubiquitin-protein ligase. Human ubiquitin-protein ligase E3A interacts with the E6 protein of the cancer-associated Human papillomavirus type 16 and Human papillomavirus type 18. The E6/E6-AP complex binds to and targets the p53 tumour-suppressor protein for ubiquitin-mediated proteolysis.
This domain is found in diverse proteins homologous to inositol monophosphatase. These proteins are Mg2+-dependent/Li+-sensitive phosphatases that catalyse a variety of reactions.
Kinesin is a microtubule-associated force-producing protein that may play a role in organelle transport. The kinesin motor activity is directed toward the microtubule's plus end. Kinesin is an oligomeric complex composed of two heavy chains and two light chains. The maintenance of the quaternary structure does not require interchain disulphide bonds.
The heavy chain is composed of three structural domains: a large globular N-terminal domain which is responsible for the motor activity of kinesin (it is known to hydrolyse ATP, and to bind and move on microtubules); a central alpha-helical coiled-coil domain that mediates the heavy chain dimerisation; and a small globular C-terminal domain which interacts with other proteins (such as the kinesin light chains), vesicles and membranous organelles.
A number of proteins have been recently found that contain a domain similar to that of the kinesin 'motor' domain:
The kinesin motor domain is located in the N-terminal part of most of the above proteins, with the exception of KAR3, klpA, and ncd where it is located in the C-terminal section.
The kinesin motor domain contains about 330 amino acids. An ATP-binding motif of type A is found near positions 80 to 90, and the C-terminal half of the domain is involved in microtubule binding.
The AGC (cAMP-dependent, cGMP-dependent and protein kinase C) protein kinase family embraces a collection of protein kinases that display a high degree of sequence similarity within their respective kinase domains. AGC kinase proteins are characterised by three conserved phosphorylation sites that critically regulate their function. The first one is located in an activation loop in the centre of the kinase domain. The two other phosphorylation sites are located outside the kinase domain in a conserved region on its C-terminal side, the AGC-kinase C-terminal domain. These sites serve as phosphorylation-regulated switches to control both intra- and inter-molecular interactions. Without these priming phosphorylations, the kinases are catalytically inactive.
Several structures of the AGC-kinase C-terminal domain have been solved. The first phosphorylation site is located in a turn motif, the second one at the end of the domain in a hydrophobic pocket. In PKB, the phosphorylated hydrophobic motif engages a hydrophobic groove within the N-lobe of the kinase domain, which orders alpha helices close to the active site.
Phosphatidylinositol 3-kinase (PI3-kinase) is an enzyme that phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol ring. The role of the accessory domain of phosphoinositide 3-kinase (PI3-kinase) is unclear. It may be involved in substrate presentation.
Phosphatidylinositol 3-kinase (PI3-kinase) is an enzyme that phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol ring. The three products of PI3-kinase - PI-3-P, PI-3,4-P(2) and PI-3,4,5-P(3) - function as secondary messengers in cell signalling. Phosphatidylinositol 4-kinase (PI4-kinase) is an enzyme that acts on phosphatidylinositol (PI) in the first committed step in the production of the secondary messenger inositol-1,4,5-trisphosphate. This domain is also present in a wide range of protein kinases involved in diverse cellular functions, such as control of cell growth, regulation of cell cycle progression, a DNA damage checkpoint, recombination, and maintenance of telomere length. Despite significant homology to lipid kinases, no lipid kinase activity has been demonstrated for any of the PIK-related kinases.
The PI3- and PI4-kinases share a well conserved domain at their C-terminal section; this domain seems to be distantly related to the catalytic domain of protein kinases. The catalytic domain of PI3K has the typical bilobal structure that is seen in other ATP-dependent kinases, with a small N-terminal lobe and a large C-terminal lobe. The core of this domain is the most conserved region of the PI3Ks. The ATP cofactor binds in the crevice formed by the N- and C-terminal lobes; a loop between two strands provides a hydrophobic pocket for binding of the adenine moiety, and a lysine residue interacts with the alpha-phosphate. In contrast to protein kinases, the PI3K loop that interacts with the phosphates of the ATP, known as the glycine-rich or P-loop, contains no glycine residues. Instead, contact with the ATP -phosphate is maintained through the side chain of a conserved serine residue.
Phosphatidylinositol-specific phospholipase C (PI-PLC), a eukaryotic intracellular enzyme, plays an important role in signal transduction processes. It catalyzes the hydrolysis of 1-phosphatidyl-D-myo-inositol-4,5-bisphosphate into the second messenger molecules diacylglycerol and inositol-1,4,5-trisphosphate. This catalytic process is tightly regulated by reversible phosphorylation and binding of regulatory proteins.
In mammals, there are at least 6 different isoforms of PI-PLC; they differ in their domain structure, their regulation, and their tissue distribution. Lower eukaryotes also possess multiple isoforms of PI-PLC.
All eukaryotic PI-PLCs contain two regions of homology, sometimes referred to as 'X-box' and 'Y-box'. The order of these two regions is always the same (NH2-X-Y-COOH), but the spacing is variable. In most isoforms, the distance between these two regions is only 50-100 residues, but in the gamma isoforms one PH domain, two SH2 domains, and one SH3 domain are inserted between the two PLC-specific domains. The two conserved regions have been shown to be important for the catalytic activity. C-terminal to the Y-box, there is a C2 domain possibly involved in Ca-dependent membrane attachment.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents the AN1-type zinc finger domain, which has a dimetal (zinc)-bound alpha/beta fold. This domain was first identified as a zinc finger at the C-terminus of AN1, a ubiquitin-like protein in Xenopus laevis. The AN1-type zinc finger contains six conserved cysteines and two histidines that could potentially coordinate 2 zinc atoms.
Certain stress-associated proteins (SAP) contain an AN1 domain, often in combination with A20 zinc finger domains (SAP8) or C2H2 domains (SAP16). For example, the human protein Znf216 has an A20 zinc-finger at the N-terminus and an AN1 zinc-finger at the C-terminus, acting to negatively regulate the NFkappaB activation pathway and to interact with components of the immune response like RIP, IKKgamma and TRAF6. The interaction of Znf216 with IKKgamma and RIP is mediated by the A20 zinc-finger domain, while its interaction with TRAF6 is mediated by the AN1 zinc-finger domain; therefore, both zinc-finger domains are involved in regulating the immune response. The AN1 zinc finger domain is also found in proteins containing a ubiquitin-like domain, which are involved in the ubiquitination pathway. Proteins containing an AN1-type zinc finger include:
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Phosphatidylcholine-hydrolysing phospholipase D (PLD) isoforms are activated by ADP-ribosylation factors (ARFs). PLD produces phosphatidic acid from phosphatidylcholine, which may be essential for the formation of certain types of transport vesicles or may be constitutive vesicular transport to signal transduction pathways. PC-hydrolysing PLD is a homologue of cardiolipin synthase, phosphatidylserine synthase, bacterial PLDs, and viral proteins. Each of these appears to possess a domain duplication which is apparent by the presence of two motifs containing well-conserved histidine, lysine, and/or asparagine residues which may contribute to the active site aspartic acid. An Escherichia coli endonuclease (nuc) and similar proteins appear to be PLD homologues but possess only one of these motifs.
Protein phosphorylation plays a central role in the regulation of cell functions, causing the activation or inhibition of many enzymes involved in various biochemical pathways. Kinases and phosphatases are the enzymes responsible for this, and may themselves be subject to control through the action of hormones and growth factors. Serine/threonine (S/T) phosphatases catalyse the dephosphorylation of phosphoserine and phosphothreonine residues. In mammalian tissues four different types of protein phosphatase (PP) have been identified and are known as PP1, PP2A, PP2B and PP2C. Except for PP2C, these enzymes are evolutionarily related. The catalytic regions of the proteins are well conserved and have a slow mutation rate, suggesting that major changes in these regions are highly detrimental.
Protein phosphatase-1 (PP1) and protein phosphatase-2A (PP2A) have a broad specificity and there are two closely related isoforms of each, alpha and beta. PP2A is a trimeric enzyme that consists of a core composed of a catalytic subunit associated with a 65 kDa regulatory subunit and a third variable subunit. Protein phosphatase-2B (PP2B or calcineurin), a calcium-dependent enzyme whose activity is stimulated by calmodulin, is composed of two subunits: the catalytic A-subunit and the calcium-binding B-subunit. The specificity of PP2B is restricted. Other serine/threonine specific protein phosphatases that have been characterised include mammalian phosphatase-X (PP-X) and Drosophila phosphatase-V (PP-V), which are closely related to, yet distinct from, PP2A; yeast phosphatase PPH3, which is similar to PP2A but has different enzymatic properties; and Drosophila phosphatase-Y (PP-Y) and yeast phosphatases Z1 and Z2, which are closely related to, yet distinct from, PP1.
Ran is an evolutionarily conserved member of the Ras superfamily that regulates all receptor-mediated transport between the nucleus and the cytoplasm. Ran Binding Protein 1 (RanBP1) has guanine nucleotide dissociation inhibitory (GDI) activity specific for the GTP form of Ran, and also stimulates Ran GTPase activating protein (GAP)-mediated GTP hydrolysis by Ran. RanBP1 thereby helps to keep the RanGTP gradient across the nuclear envelope high (GDI activity) or the cytoplasmic levels of RanGTP low (GAP cofactor).
All RanBP1 proteins contain an approx 150 amino acid residue Ran binding domain. RanBP1 binds directly to RanGTP with high affinity. There are four sites of contact between Ran and the Ran binding domain. One of these involves binding of the C-terminal segment of Ran to a groove on the Ran binding domain that is analogous to the surface utilised in the EVH1-peptide interaction. Nup358 contains four Ran binding domains. The structure of the first of these is known.
UBA domains are a commonly occurring sequence motif of approximately 45 amino acid residues that are found in diverse proteins involved in the ubiquitin/proteasome pathway, DNA excision-repair, and cell signalling via protein kinases. The human homologue of yeast Rad23A is one example of a nucleotide excision-repair protein that contains both an internal and a C-terminal UBA domain. The solution structure of human Rad23A UBA(2) showed that the domain forms a compact three-helix bundle. Comparison of the structures of UBA(1) and UBA(2) reveals that both form very similar folds and have a conserved large hydrophobic surface patch which may be a common protein-interacting surface present in diverse UBA domains. Evidence that ubiquitin binds to UBA domains leads to the prediction that the hydrophobic surface patch of UBA domains interacts with the hydrophobic surface on the five-stranded beta-sheet of ubiquitin.
This domain is similar in sequence to the N-terminal domain of translation elongation factor EF1B (or EF-Ts) from bacteria, mitochondria and chloroplasts.
More information about EF1B (EF-Ts) proteins can be found at Protein of the Month: Elongation Factors.
Ras proteins are small GTPases that regulate cell growth, proliferation and differentiation. The different Ras isoforms (H-ras, N-ras and K-ras) generate distinct signal outputs, despite interacting with a common set of activators and effectors. Ras is activated by guanine nucleotide exchange factors (GEFs) that release GDP and allow GTP binding. Many RasGEFs have been identified. These are sequestered in the cytosol until activation by growth factors triggers recruitment to the plasma membrane or Golgi, where the GEF colocalizes with Ras. Active GTP-bound Ras interacts with several effector proteins: among the best characterised are the Raf kinases, phosphatidylinositol 3-kinase (PI3K), RalGEFs and NORE/MST1.
Ras proteins are synthesized as cytosolic precursors that undergo post-translational processing to be able to associate with cell membranes. First, protein farnesyl transferase, a cytosolic enzyme, attaches a farnesyl group to the cysteine residue of the CAAX motif. Second, the farnesylated CAAX sequence targets Ras to the cytosolic surface of the ER where an endopeptidase removes the AAX tripeptide. Third, the alpha-carboxyl group on the now carboxy-terminal farnesylcysteine is methylated by isoprenylcysteine carboxyl methyltransferase. Finally, after methylation, Ras proteins take one of two routes to the cell surface, which is dictated by a second targeting signal that is located immediately amino-terminal to the farnesylated cysteine. N-ras and H-ras are expressed stably on the plasma membrane, on Golgi in transfected cells, and at least transiently on the ER. Ras has also been visualized on endosomes.
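The three CAAX-processing steps described above can be sketched as a small function. This is only an illustration: the function name, the example sequence and the wording of the step labels are hypothetical, and the check is limited to the presence of a cysteine at the CAAX position.

```python
# Toy model of the CAAX-processing steps described in the text.
# The function name, example sequence and step labels are illustrative assumptions.

def process_caax(seq: str) -> list[str]:
    """Return the ordered processing steps for a protein ending in a CAAX box."""
    if len(seq) < 4 or seq[-4] != "C":
        raise ValueError("no cysteine at the CAAX position (fourth residue from the C-terminus)")
    cys_pos = len(seq) - 3   # 1-based position of the CAAX cysteine
    aax = seq[-3:]           # tripeptide removed by the endopeptidase
    return [
        f"farnesylate Cys{cys_pos} (protein farnesyl transferase, cytosol)",
        f"remove AAX tripeptide '{aax}' (endopeptidase, cytosolic surface of the ER)",
        f"methylate alpha-carboxyl of farnesyl-Cys{cys_pos} "
        "(isoprenylcysteine carboxyl methyltransferase)",
    ]

if __name__ == "__main__":
    for step in process_caax("MTEYKCVLS"):  # hypothetical toy sequence
        print(step)
```

The later trafficking step (the choice between the two routes to the cell surface) depends on a second targeting signal and is deliberately left out of this sketch.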
Small GTPases are involved in intracellular cell signalling processes. The Ras family includes a large number of small GTPases. Members of the Rho subfamily of Ras-like small GTPases include Cdc42 and Rac, as well as Rho isoforms.
The crystal structures of several members of this entry have been determined: Rnd3/RhoE, RhoA and Cdc42.
Small GTPases are involved in intracellular cell signalling processes. The Ras family includes a large number of small GTPases. Members of the Rab GTPases subfamily have been implicated in vesicle trafficking.
The crystal structures of several members of this entry have been determined:
Ran (or TC4) is an evolutionarily conserved member of the Ras superfamily of small GTPases that regulates all receptor-mediated transport between the nucleus and the cytoplasm. Ran has been implicated in a large number of processes, including nucleocytoplasmic transport, RNA synthesis, processing and export, and cell cycle checkpoint control. Ran plays a crucial role in both import/export pathways and determines the directionality of nuclear transport. Import receptors (importins) bind their cargos in the cytoplasm where the concentration of RanGTP is low (due to action of RanGAP), and release their cargos in the nucleus where the concentration of RanGTP is high (due to action of RanGEF). Export receptors (exportins) respond to RanGTP in the opposite manner. Furthermore, it has been shown that nuclear transport factor 2 (NTF2) stimulates efficient nuclear import of a cargo protein. NTF2 binds specifically to RanGDP and to the FXFG repeat containing nucleoporins.
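The directionality rule above (importins hold cargo where RanGTP is low, exportins where it is high) can be written as a minimal sketch. The numeric RanGTP levels and the 0.5 threshold are arbitrary illustrative values, not measurements, and the function name is hypothetical.

```python
# Minimal sketch of RanGTP-dependent transport directionality.
# The levels and threshold below are arbitrary assumptions used only for illustration.

RAN_GTP_LEVEL = {
    "cytoplasm": 0.1,  # kept low by RanGAP
    "nucleus": 0.9,    # kept high by RanGEF
}

def cargo_bound(receptor: str, compartment: str, threshold: float = 0.5) -> bool:
    """Return True if the receptor is expected to hold its cargo in this compartment."""
    ran_gtp_high = RAN_GTP_LEVEL[compartment] > threshold
    if receptor == "importin":
        return not ran_gtp_high  # binds cargo at low RanGTP, releases at high RanGTP
    if receptor == "exportin":
        return ran_gtp_high      # responds to RanGTP in the opposite manner
    raise ValueError(f"unknown receptor type: {receptor}")

if __name__ == "__main__":
    for receptor in ("importin", "exportin"):
        for compartment in ("cytoplasm", "nucleus"):
            print(receptor, compartment, cargo_bound(receptor, compartment))
```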
Ran is generally included in the RAS 'superfamily' of small GTP-binding proteins, but it is only slightly related to the other RAS proteins. It also differs from RAS proteins in that it lacks cysteine residues at its C-terminus and is therefore not subject to prenylation. Instead, Ran has an acidic C-terminus. It is, however, similar to RAS family members in requiring a specific guanine nucleotide exchange factor (GEF) and a specific GTPase activating protein (GAP) as stimulators of overall GTPase activity.
Ran consists of a core domain that is structurally similar to the GTP-binding domains of other small GTPases but, in addition, Ran has a C-terminal extension consisting of an unstructured linker and a 16 residue alpha-helix that is located opposite the "Switch I" region in the RanGDP structure. Three regions of Ran change conformation depending on the nucleotide bound: the Switch I and II regions, which interact with the bound nucleotide, and the C-terminal extension. In RanGDP, the C-terminal extension contacts the core of the protein, while in RanGTP, the extension extends away from the core, most likely due to a steric clash between the Switch I region and the linker part of the C-terminal extension. This suggests that the C-terminal extension in RanGDP is crucial for shielding residues in the core domain and preventing the switch regions from adopting a GTP-like form. This prevents binding of transport factors to RanGDP that would otherwise lead to uncoordinated interaction between importin beta-like proteins and cellular factors.
More information about these proteins can be found at Protein of the Month: Importins.
The small ADP ribosylation factor (Arf) GTP-binding proteins are major regulators of vesicle biogenesis in intracellular traffic. They are the founding members of a growing family that includes Arl (Arf-like), Arp (Arf-related proteins) and the remotely related Sar (Secretion-associated and Ras-related) proteins. Arf proteins cycle between inactive GDP-bound and active GTP-bound forms that bind selectively to effectors. The classical structural GDP/GTP switch is characterised by conformational changes at the so-called switch 1 and switch 2 regions, which bind tightly to the gamma-phosphate of GTP but poorly or not at all to the GDP nucleotide. Structural studies of Arf1 and Arf6 have revealed that although these proteins feature the switch 1 and 2 conformational changes, they depart from other small GTP-binding proteins in that they use an additional, unique switch to propagate structural information from one side of the protein to the other.
The GDP/GTP structural cycles of human Arf1 and Arf6 feature a unique conformational change that affects the beta2-beta3 strands connecting switch 1 and switch 2 (interswitch) and also the amphipathic helical N-terminus. In GDP-bound Arf1 and Arf6, the interswitch is retracted and forms a pocket to which the N-terminal helix binds, the latter serving as a molecular hasp to maintain the inactive conformation. In the GTP-bound form of these proteins, the interswitch undergoes a two-residue register shift that pulls switch 1 and switch 2 'up', restoring an active conformation that can bind GTP. In this conformation, the interswitch projects out of the protein and extrudes the N-terminal hasp by occluding its binding pocket.
ADP-ribosylation factors (ARF) are 20 kDa GTP-binding proteins involved in protein trafficking. They may modulate vesicle budding and uncoating within the Golgi apparatus. ARFs also act as allosteric activators of cholera toxin ADP-ribosyltransferase activity. They are evolutionarily conserved and present in all eukaryotes. At least six forms of ARF are present in mammals and three in budding yeast. The ARF family also includes proteins highly related to ARFs but which lack the cholera toxin cofactor activity; they are collectively known as ARLs (ARF-like). The ARFs are N-terminally myristoylated (the ARLs have not yet been shown to be modified in such a fashion).
The small ADP ribosylation factor (Arf) GTP-binding proteins are major regulators of vesicle biogenesis in intracellular traffic. They are the founding members of a growing family that includes Arl (Arf-like), Arp (Arf-related proteins) and the remotely related Sar (Secretion-associated and Ras-related) proteins. Arf proteins cycle between inactive GDP-bound and active GTP-bound forms that bind selectively to effectors. The classical structural GDP/GTP switch is characterised by conformational changes at the so-called switch 1 and switch 2 regions, which bind tightly to the gamma-phosphate of GTP but poorly or not at all to the GDP nucleotide. Structural studies of Arf1 and Arf6 have revealed that although these proteins feature the switch 1 and 2 conformational changes, they depart from other small GTP-binding proteins in that they use an additional, unique switch to propagate structural information from one side of the protein to the other.
The GDP/GTP structural cycles of human Arf1 and Arf6 feature a unique conformational change that affects the beta2-beta3 strands connecting switch 1 and switch 2 (interswitch) and also the amphipathic helical N-terminus. In GDP-bound Arf1 and Arf6, the interswitch is retracted and forms a pocket to which the N-terminal helix binds, the latter serving as a molecular hasp to maintain the inactive conformation. In the GTP-bound form of these proteins, the interswitch undergoes a two-residue register shift that pulls switch 1 and switch 2 'up', restoring an active conformation that can bind GTP. In this conformation, the interswitch projects out of the protein and extrudes the N-terminal hasp by occluding its binding pocket.
The SAR1 protein, first identified in budding yeast, is a 21 kDa GTP-binding protein involved in vesicular transport between the endoplasmic reticulum and the Golgi. It takes part in the formation of secretory vesicles by binding to an ER type II membrane protein, Sec12p. It is evolutionarily conserved and seems to be present in all eukaryotes.
SAR1 is generally included in the RAS 'superfamily' of small GTP-binding proteins, but it is only slightly related to other RAS proteins. It also differs from RAS proteins in that it lacks cysteine residues at the C terminus and is therefore not subject to prenylation. SAR1 is slightly related to ARFs.
Epidermal growth factors and transforming growth factors belong to a general class of proteins that share a repeat pattern involving a number of conserved Cys residues. Growth factors are involved in cell recognition and division. The repeating pattern, especially of cysteines (the so-called EGF repeat), is thought to be important to the 3D structure of the proteins, and hence to their recognition by receptors and other molecules. The type 1 EGF signature includes six conserved cysteines believed to be involved in disulphide bond formation. The EGF motif is found frequently in nature, particularly in extracellular proteins.
Cullins are a family of hydrophobic proteins that act as scaffolds for ubiquitin ligases (E3). Cullins are found throughout eukaryotes. Humans express seven cullins (Cul1, 2, 3, 4A, 4B, 5 and 7), each forming part of a multi-subunit ubiquitin complex. Cullin-RING ubiquitin ligases (CRLs), such as Cul1 (SCF), play an essential role in targeting proteins for ubiquitin-mediated destruction; as such, they are diverse in terms of composition and function, regulating many different processes from glucose sensing and DNA replication to limb patterning and circadian rhythms. The catalytic core of CRLs consists of a RING protein and a cullin family member. For Cul1, the C-terminal cullin-homology domain binds the RING protein. The RING protein appears to function as a docking site for ubiquitin-conjugating enzymes (E2s). Other proteins contain a cullin-homology domain, such as the APC2 subunit of the anaphase-promoting complex/cyclosome and the p53 cytoplasmic anchor PARC; both APC2 and PARC have ubiquitin ligase activity. The N-terminal region of cullins is more variable, and is used to interact with specific adaptor proteins.
This entry represents the cullin homology region, which is composed of three domains: a 4-helical bundle domain, an alpha+beta domain, and a winged helix-like domain.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents RING-type zinc finger domains. The RING-finger is a specialised type of Zn-finger of 40 to 60 residues that binds two atoms of zinc, and is probably involved in mediating protein-protein interactions. There are two different variants, the C3HC4-type and the C3H2C3-type, which are clearly related despite the different cysteine/histidine pattern. The latter type is sometimes referred to as 'RING-H2 finger'. The RING domain is a protein interaction domain that has been implicated in a range of diverse biological processes. E3 ubiquitin-protein ligase activity is intrinsic to the RING domain of c-Cbl and is likely to be a general function of this domain. E3 ubiquitin-protein ligases determine the substrate specificity for ubiquitylation and have been classified into HECT and RING-finger families. More recently, however, U-box proteins, which contain a domain (the U box) of about 70 amino acids that is conserved from yeast to humans, have been identified as a new type of E3. Various RING fingers also exhibit binding to E2 ubiquitin-conjugating enzymes (Ubc's).
Several 3D-structures for RING-fingers are known. The 3D structure of the zinc ligation system is unique to the RING domain and is referred to as the 'cross-brace' motif. The spacing of the cysteines in such a domain is C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)-C-x(2)-C-x(4 to 48)-C-x(2)-C. Metal ligand pairs one and three co-ordinate to bind one zinc ion, whilst pairs two and four bind the second, as illustrated in the following schematic representation:
Note that in the older literature, some RING-fingers are denoted as LIM-domains. The LIM-domain Zn-finger is a fundamentally different family, albeit with similar Cys-spacing.
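The cysteine/histidine spacing quoted above for the RING domain translates directly into a regular expression, and the 'cross-brace' arrangement (ligand pairs one and three on the first zinc, pairs two and four on the second) can be read off the match. The following is a sketch only: the function name and the toy sequence are hypothetical, and a regex hit on a real protein would not by itself establish a functional RING finger.

```python
import re

# Spacing from the text: C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)-C-x(2)-C-x(4 to 48)-C-x(2)-C
# Each metal ligand is captured in its own group so its position can be reported.
RING_PATTERN = re.compile(
    r"(C).{2}(C).{9,39}(C).{1,3}(H).{2,3}(C).{2}(C).{4,48}(C).{2}(C)"
)

def ring_zinc_sites(seq: str):
    """Return 1-based ligand positions grouped by zinc ion, or None if no match."""
    m = RING_PATTERN.search(seq)
    if m is None:
        return None
    ligands = [m.start(i) + 1 for i in range(1, 9)]     # eight ligands, in sequence order
    pairs = [ligands[i:i + 2] for i in range(0, 8, 2)]  # four consecutive ligand pairs
    return {
        "zinc_1": pairs[0] + pairs[2],  # pairs one and three
        "zinc_2": pairs[1] + pairs[3],  # pairs two and four
    }

# Hypothetical toy sequence built to satisfy the spacing; alanine is used as filler.
TOY = ("C" + "AA" + "C" + "A" * 10 + "C" + "A" + "H" + "AA"
       + "C" + "AA" + "C" + "A" * 5 + "C" + "AA" + "C")

if __name__ == "__main__":
    print(ring_zinc_sites(TOY))
```

The interleaving of the two ligand sets along the sequence is what distinguishes the cross-brace arrangement from a tandem arrangement in which consecutive pairs would share a zinc ion.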
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
The armadillo (Arm) repeat is an approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila melanogaster segment polarity gene armadillo involved in signal transduction through wingless. Animal Arm-repeat proteins function in various processes, including intracellular signalling and cytoskeletal regulation, and include such proteins as beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumour suppressor protein, and the nuclear transport factor importin-alpha, amongst others. A subset of these proteins is conserved across eukaryotic kingdoms. In higher plants, some Arm-repeat proteins function in intracellular signalling like their mammalian counterparts, while others have novel functions.
The 3-dimensional fold of an armadillo repeat is known from the crystal structure of beta-catenin, where the 12 repeats form a superhelix of alpha helices with three helices per unit. The cylindrical structure features a positively charged groove, which presumably interacts with the acidic surfaces of the known interaction partners of beta-catenin.
Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPases) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation. The PTP superfamily can be divided into four subfamilies:
Based on their cellular localisation, PTPases are also classified as:
All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel beta-sheet with flanking alpha-helices containing a beta-loop-alpha-loop that encompasses the PTP signature motif. Functional diversity between PTPases is endowed by regulatory domains and subunits.
This entry represents dual specificity protein-tyrosine phosphatases. Ser/Thr and Tyr dual specificity phosphatases are a group of enzymes with both Ser/Thr-specific and tyrosine-specific protein phosphatase activity, able to remove a serine/threonine- or tyrosine-bound phosphate group from a wide range of phosphoproteins, including a number of enzymes which have been phosphorylated under the action of a kinase. Dual specificity protein phosphatases (DSPs) regulate mitogenic signal transduction and control the cell cycle. The crystal structure of a human DSP, vaccinia H1-related phosphatase (or VHR), has been determined at 2.1 angstrom resolution. A shallow active site pocket in VHR allows for the hydrolysis of phosphorylated serine, threonine, or tyrosine protein residues, whereas the deeper active site of protein tyrosine phosphatases (PTPs) restricts substrate specificity to only phosphotyrosine. Positively charged crevices near the active site may explain the enzyme's preference for substrates with two phosphorylated residues. The VHR structure defines a conserved structural scaffold for both DSPs and PTPs. A "recognition region" connecting helix alpha1 to strand beta1 may determine differences in substrate specificity between VHR, the PTPs, and other DSPs.
These proteins may also have inactive phosphatase domains, and depending on the domain composition this loss of catalytic activity has different effects on protein function. Inactive single domain phosphatases can still specifically bind substrates and protect against dephosphorylation, while the inactive domains of tandem phosphatases can be further subdivided into two classes. Those which bind phosphorylated tyrosine residues may recruit multi-phosphorylated substrates for the adjacent active domains and are more conserved, while the other class has accumulated several variable amino acid substitutions and has completely lost tyrosine binding capability. The second class shows a release of evolutionary constraint for the sites around the catalytic centre, which emphasises a difference in function from the first group. There is a region of higher conservation common to both classes, suggesting a new regulatory centre.
The egg peptide speract receptor is a transmembrane glycoprotein. Other members of this family include the macrophage scavenger receptor type I (a membrane glycoprotein implicated in the pathologic deposition of cholesterol in arterial walls during atherogenesis), an enteropeptidase and T-cell surface glycoprotein CD5 (may act as a receptor in regulating T-cell proliferation).
Thrombospondins are multimeric multidomain glycoproteins that function at cell surfaces and in the extracellular matrix milieu. They act as regulators of cell interactions in vertebrates. They are divided into two subfamilies, A and B, according to their overall molecular organisation. The subgroup A proteins TSP-1 and -2 contain an N-terminal domain, a VWFC domain, three TSP1 repeats, three EGF-like domains, TSP3 repeats and a C-terminal domain. They are assembled as trimers. The subgroup B thrombospondins, designated TSP-3, -4, and COMP (cartilage oligomeric matrix protein, also designated TSP-5), are distinct in that they contain unique N-terminal regions, lack the VWFC domain and TSP1 repeats, contain four copies of EGF-like domains, and are assembled as pentamers. EGF, TSP3 repeats and the C-terminal domain are thus the hallmark of a thrombospondin.
This repeat was first described in 1986 by Lawler and Hynes. It was found in the thrombospondin protein, where it is repeated 3 times. A number of proteins involved in the complement pathway (properdin, C6, C7, C8A, C8B, C9), as well as extracellular matrix proteins such as mindin, F-spondin and SCO-spondin, and even the circumsporozoite surface protein 2 and TRAP proteins of Plasmodium, are now known to contain one or more instances of this repeat. It has been implicated in cell-cell interaction, inhibition of angiogenesis and apoptosis.
The intron-exon organisation of the properdin gene confirms the hypothesis that the repeat might have evolved by a process involving exon shuffling. A study of properdin structure provides some information about the structure of the thrombospondin type I repeat.
The post-translational attachment of ubiquitin to proteins (ubiquitinylation) alters the function, location or trafficking of a protein, or targets it to the 26S proteasome for degradation. Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), and a ubiquitin ligase (E3), which work sequentially in a cascade. The E1 enzyme mediates an ATP-dependent transfer of a thioester-linked ubiquitin molecule to a cysteine residue on the E2 enzyme. The E2 enzyme then either transfers the ubiquitin moiety directly to a substrate, or to an E3 ligase, which can also ubiquitinylate a substrate.
There are several different E2 enzymes (over 30 in humans), which are broadly grouped into four classes, all of which have a core catalytic domain (containing the active site cysteine), and some of which have short N- and C-terminal amino acid extensions: class I enzymes consist of just the catalytic core domain (UBC), class II possess a UBC and a C-terminal extension, class III possess a UBC and an N-terminal extension, and class IV possess a UBC and both N- and C-terminal extensions. These extensions appear to be important for some subfamily function, including E2 localisation and protein-protein interactions. In addition, there are proteins with an E2-like fold that are devoid of catalytic activity, but which appear to assist in poly-ubiquitin chain formation.
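The four-class grouping described above depends only on which extensions flank the catalytic core, so it can be written as a small decision function. The following is a minimal sketch; the function name and its boolean parameters are illustrative assumptions, not part of any established tool.

```python
def classify_e2(has_n_extension: bool, has_c_extension: bool) -> str:
    """Return the E2 class (I-IV) implied by the extensions around the UBC core.

    Hypothetical helper: the inputs simply state whether an N-terminal and/or
    C-terminal extension is present in addition to the catalytic core domain.
    """
    if has_n_extension and has_c_extension:
        return "IV"   # UBC plus both N- and C-terminal extensions
    if has_n_extension:
        return "III"  # UBC plus an N-terminal extension
    if has_c_extension:
        return "II"   # UBC plus a C-terminal extension
    return "I"        # catalytic core domain only


if __name__ == "__main__":
    for n_ext in (False, True):
        for c_ext in (False, True):
            print(n_ext, c_ext, classify_e2(n_ext, c_ext))
```

Proteins with an E2-like fold but no catalytic activity fall outside this scheme and are not handled by the sketch.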
Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), and a ubiquitin ligase (E3), which work sequentially in a cascade. There are many different E3 ligases, which are responsible for the type of ubiquitin chain formed, the specificity of the target protein, and the regulation of the ubiquitinylation process. Ubiquitinylation is an important regulatory tool that controls the concentration of key signalling proteins, such as those involved in cell cycle control, as well as removing misfolded, damaged or mutant proteins that could be harmful to the cell. Several ubiquitin-like molecules have been discovered, such as Ufm1, SUMO1, NEDD8, Rad23, Elongin B and Parkin, the latter being involved in Parkinson's disease.
Ubiquitin is a protein of 76 amino acid residues, found in all eukaryotic cells, whose sequence is extremely well conserved from protozoa to vertebrates. Ubiquitin acts through its post-translational attachment (ubiquitinylation) to other proteins, where these modifications alter the function, location or trafficking of the protein, or target it for destruction by the 26S proteasome. The terminal glycine in the C-terminal 4-residue tail of ubiquitin can form an isopeptide bond with a lysine residue in the target protein, or with a lysine in another ubiquitin molecule to form a ubiquitin chain that attaches itself to a target protein. Ubiquitin has seven lysine residues, any one of which can be used to link ubiquitin molecules together, resulting in different structures that alter the target protein in different ways. It appears that Lys(11)-, Lys(29)- and Lys(48)-linked poly-ubiquitin chains target the protein to the proteasome for degradation, while mono-ubiquitinylated and Lys(6)- or Lys(63)-linked poly-ubiquitin chains signal reversible modifications in protein activity, location or trafficking. For example, Lys(63)-linked poly-ubiquitinylation is known to be involved in DNA damage tolerance, inflammatory response, protein trafficking and signal transduction through kinase activation. In addition, the length of the ubiquitin chain alters the fate of the target protein. Regulatory proteins such as transcription factors and histones are frequent targets of ubiquitinylation.
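The relationship between linkage type and the apparent fate of the target protein can be summarised as a lookup. This is a minimal sketch that encodes only the linkages named in the paragraph above; the function name and the returned labels are illustrative assumptions, and chain length, which also affects the outcome, is deliberately ignored.

```python
from typing import Optional

# Linkages the text associates with proteasomal degradation.
DEGRADATION_LINKAGES = {11, 29, 48}
# Linkages the text associates with reversible regulatory modification.
REGULATORY_LINKAGES = {6, 63}


def predicted_fate(linkage: Optional[int]) -> str:
    """Map a ubiquitin linkage to the fate suggested in the text.

    `linkage` is the lysine position used to join ubiquitin molecules,
    or None for mono-ubiquitinylation. Hypothetical helper for illustration.
    """
    if linkage is None:
        return "regulatory"      # mono-ubiquitinylated target
    if linkage in DEGRADATION_LINKAGES:
        return "degradation"     # targeted to the proteasome
    if linkage in REGULATORY_LINKAGES:
        return "regulatory"      # activity, location or trafficking
    return "not stated"          # linkage not covered by the text


if __name__ == "__main__":
    for k in (None, 6, 11, 29, 48, 63, 27):
        print(k, predicted_fate(k))
```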
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases.
Tyrosine phosphorylating activity was originally detected in two viral transforming proteins, but many retroviral transforming proteins and their cellular counterparts have since been shown to possess such activity. The growth factor receptors, which are activated by ligand binding, and the insulin-related peptide receptor, are also family members.
Eukaryotic protein kinases are enzymes that belong to a very extensive family of proteins which share a conserved catalytic core common to both serine/threonine and tyrosine protein kinases. There are a number of conserved regions in the catalytic domain of protein kinases. In the N-terminal extremity of the catalytic domain there is a glycine-rich stretch of residues in the vicinity of a lysine residue, which has been shown to be involved in ATP binding. In the central part of the catalytic domain there is a conserved aspartic acid residue which is important for the catalytic activity of the enzyme.
PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 µM. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated.
PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.
This group of cysteine peptidases belongs to the MEROPS peptidase family C2 (calpain family, clan CA). A type example is calpain, which is an intracellular protease involved in many important cellular functions that are regulated by calcium. The protein is a complex of 2 polypeptide chains (light and heavy), with three known forms in mammals: a highly calcium-sensitive (i.e., micro-molar range) form known as mu-calpain, mu-CANP or calpain I; a form sensitive to calcium in the milli-molar range, known as m-calpain, m-CANP or calpain II; and a third form, known as p94, which is found in skeletal muscle only.
All forms have identical light but different heavy chains. Both mu- and m-calpain are heterodimers containing an identical 28-kDa subunit and an 80-kDa subunit that shares 55-65% sequence homology between the two proteases. The crystallographic structure of m-calpain reveals six "domains" in the 80-kDa subunit:
Domain 2 shows low levels of sequence similarity to papain; although the catalytic His has not been located by biochemical means, it is likely that calpain and papain are related.
Calpain-like mRNAs have been identified in other organisms including bacteria, but the molecules encoded by these mRNAs have not been isolated, so little is known about their properties. How calpain activity is regulated in these organisms is still unclear. In metazoans, the activity of calpain is controlled by a single proteinase inhibitor, calpastatin. The calpastatin gene can produce eight or more calpastatin polypeptides ranging from 17 to 85 kDa by use of different promoters and alternative splicing events. The physiological significance of these different calpastatins is unclear, although all bind to three different places on the calpain molecule; binding to at least two of the sites is Ca2+ dependent. The calpains ostensibly participate in a variety of cellular processes including remodelling of cytoskeletal/membrane attachments, different signal transduction pathways, and apoptosis. Deregulated calpain activity following loss of Ca2+ homeostasis results in tissue damage in response to events such as myocardial infarcts, stroke, and brain trauma.
Members of this family are found in proteasome regulatory subunits, eukaryotic initiation factor 3 (eIF3) subunits and regulators of transcription factors. This family is also known as the MPN domain and PAD-1-like domain. It has been shown that this domain occurs in prokaryotes.
Mov34 proteins act as the regulatory subunit of the 26S proteasome, which is involved in the ATP-dependent degradation of ubiquitinated proteins. The function of this domain is unclear, but it is found in the N-terminus of the proteasome regulatory subunits, eukaryotic initiation factor 3 (eIF3) subunits and regulators of transcription factors.
A number of the proteins associated with this family belong to MEROPS peptidase family M67 (clan M-). This includes the Poh1 peptidase of Saccharomyces cerevisiae (Baker's yeast) which is a component of the 19S proteasome regulatory particle.
The 'pleckstrin homology' (PH) domain is a domain of about 100 residues that occurs in a wide range of proteins involved in intracellular signalling or as constituents of the cytoskeleton.
The function of this domain is not clear; several putative functions have been suggested:
It is possible that different PH domains have totally different ligand requirements.
The 3D structure of several PH domains has been determined. All known cases have a common structure consisting of two perpendicular anti-parallel beta sheets, followed by a C-terminal amphipathic helix. The loops connecting the beta-strands differ greatly in length, making the PH domain relatively difficult to detect. There are no totally invariant residues within the PH domain.
Proteins reported to contain one or more PH domains belong to the following families:
Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.
The baculovirus inhibitor of apoptosis protein repeat (BIR) is a domain of tandem repeats separated by a variable length linker that seems to confer cell death-preventing activity. The BIR domains characterise the Inhibitor of Apoptosis (IAP) family of proteins (MEROPS proteinase inhibitor family I32, clan IV) that suppress apoptosis by interacting with and inhibiting the enzymatic activity of both initiator and effector caspases (MEROPS peptidase family C14). Several distinct mammalian IAPs, including XIAP, c-IAP1, c-IAP2, and ML-IAP, have been identified, and they all exhibit antiapoptotic activity in cell culture. The functional unit in each IAP protein is the baculoviral IAP repeat (BIR), which contains approximately 80 amino acids folded around a zinc atom. Most mammalian IAPs have more than one BIR domain, with the different BIR domains performing distinct functions. For example, in XIAP, the third BIR domain (BIR3) potently inhibits the catalytic activity of caspase-9, whereas the linker sequences immediately preceding the second BIR domain (BIR2) selectively target caspase-3 or -7.
The first-recognised members of MEROPS inhibitor family I32 were viral proteins that inhibited the apoptosis of infected cells: Cp-IAP from Cydia pomonella granulosis virus (CpGV) and Op-IAP from Orgyia pseudotsugata multicapsid polyhedrosis virus (OpMNPV). The discovery of homologous proteins in mammals followed soon after with the recognition that mutations in the gene for neuronal apoptosis inhibitory protein (NAIP) underlie spinal muscular atrophy. The inhibitors in family I32 all possess one or more 80-residue domains known as BIR (baculovirus inhibitor repeat) domains and have accordingly been termed 'BIR-containing' or 'BIRC' proteins as well as IAP proteins.
The mechanism of inhibition of caspases by the IAP proteins is complex, and reactive site residues cannot yet be identified with any confidence. Despite the conservation of the BIR or IAP (inhibitor of apoptosis) domains throughout the family it seems clear that other parts of the molecules also make essential contributions to inhibitory activity.
Homologs of most components in the mammalian apoptotic pathway have been identified in fruit flies. The Drosophila Apaf-1, known as Dapaf-1, HAC-1 or Dark, shares significant sequence similarity with its mammalian counterpart, and is critically important for the activation of the Drosophila initiator caspase Dronc. Dronc, in turn, cleaves and activates the effector caspase DrICE. The Drosophila IAP, DIAP1, binds to and inactivates both DrICE and Dronc through its BIR1 and BIR2 domains. During apoptosis, the anti-death function of DIAP1 is countered by at least four pro-apoptotic proteins, Reaper, Hid, Grim, and Sickle, through direct physical interactions. These four proteins represent the functional homologs of the mammalian protein Smac, and they all share a conserved IAP-binding motif at their N termini. The three proteins Reaper, Hid, and Grim are collectively referred to as the RHG proteins.
Both XIAP and DIAP1 contain a RING domain at their C termini, and can act as an E3 ubiquitin ligase. Indeed, both XIAP and DIAP1 have been shown to promote self-ubiquitination and degradation as well as to negatively regulate the target caspases. Nonetheless, important differences exist between XIAP and DIAP1. The primary function of XIAP is thought to be the inhibition of the catalytic activities of caspases; to what extent the ubiquitinating activity of XIAP contributes to its function remains unclear. For DIAP1, however, the ubiquitinating activity appears to be essential for its function.
Recently a Drosophila p53 protein has been identified that mediates apoptosis via a novel pathway involving the activation of the Reaper gene and subsequent inhibition of the inhibitors of apoptosis (IAPs). CIAP1, a major mammalian homolog of Drosophila IAPs, is irreversibly inhibited (cleaved) during p53-dependent apoptosis and this cleavage is mediated by a serine protease. Serine protease inhibitors that block CIAP1 cleavage inhibit p53-dependent apoptosis. Furthermore, activation of the p53 protein increases the transcription of the HTRA2 gene, which encodes a serine protease that interacts with CIAP1 and potentiates apoptosis. Therefore mammalian p53 protein activates apoptosis through a novel pathway functionally similar to that in Drosophila, which involves HTRA2 and subsequent inhibition of CIAP1 by cleavage.
The 3D structure of the C2 domain of synaptotagmin has been reported; the domain forms an eight-stranded beta sandwich constructed around a conserved 4-stranded motif, designated a C2 key. Calcium binds in a cup-shaped depression formed by the N- and C-terminal loops of the C2-key motif. Structural analyses of several C2 domains have shown them to consist of similar tertiary structures in which three Ca2+-binding loops are located at the end of an 8-stranded antiparallel beta sandwich.
The forkhead-associated (FHA) domain is a phosphopeptide recognition domain found in many regulatory proteins. It displays specificity for phosphothreonine-containing epitopes but will also recognise phosphotyrosine with relatively high affinity. It spans approximately 80-100 amino acid residues folded into an 11-stranded beta sandwich, which sometimes contains small helical insertions between the loops connecting the strands.
To date, genes encoding FHA-containing proteins have been identified in eubacterial and eukaryotic but not archaeal genomes. The domain is present in a diverse range of proteins, such as kinases, phosphatases, kinesins, transcription factors, RNA-binding proteins and metabolic enzymes which partake in many different cellular processes - DNA repair, signal transduction, vesicular transport and protein degradation are just a few examples.
Muscle contraction is caused by sliding between the thick and thin filaments of the myofibril. Myosin is a major component of thick filaments and exists as a hexamer of 2 heavy chains, 2 alkali light chains, and 2 regulatory light chains. The heavy chain can be subdivided into the N-terminal globular head and the C-terminal coiled-coil rod-like tail, although some forms have a globular region in their C-terminal. There are many cell-specific isoforms of myosin heavy chains, coded for by a multi-gene family. Myosin interacts with actin to convert chemical energy, in the form of ATP, to mechanical energy. The 3-D structure of the head portion of myosin has been determined and a model for actin-myosin complex has been constructed.
The globular head is well conserved, with some highly conserved regions possibly relating to functional and structural domains. The rod-like tail starts with an invariant proline residue, and contains many repeats of a 28-residue region, interrupted at 4 regularly spaced points known as skip residues. Although the sequence of the tail is not well conserved, its chemical character is, with hydrophobic, charged and skip residues occurring in a highly ordered and repeated fashion.
The ankyrin repeat is one of the most common protein-protein interaction motifs in nature. Ankyrin repeats are tandemly repeated modules of about 33 amino acids. They occur in a large number of functionally diverse proteins mainly from eukaryotes. The few known examples from prokaryotes and viruses may be the result of horizontal gene transfers. The repeat has been found in proteins of diverse function such as transcriptional initiators, cell-cycle regulators, cytoskeletal proteins, ion transporters and signal transducers. The ankyrin fold appears to be defined by its structure rather than its function since there is no specific sequence or structure which is universally recognised by it.
The conserved fold of the ankyrin repeat unit is known from several crystal and solution structures. Each repeat folds into a helix-loop-helix structure with a beta-hairpin/loop region projecting out from the helices at a 90° angle. The repeats stack together to form an L-shaped structure.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents the PHD (plant homeodomain) zinc finger domain, which is a C4HC3 zinc-finger-like motif found in nuclear proteins thought to be involved in chromatin-mediated transcriptional regulation. The PHD finger motif is reminiscent of, but distinct from, the C3HC4 type RING finger.
The function of this domain is not yet known but in analogy with the LIM domain it could be involved in protein-protein interaction and be important for the assembly or activity of multicomponent complexes involved in transcriptional activation or repression. Alternatively, the interactions could be intra-molecular and be important in maintaining the structural integrity of the protein. In similarity to the RING finger and the LIM domain, the PHD finger is thought to bind two zinc ions.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
This entry represents the zinc finger domain found in A20. A20 is an inhibitor of cell death that inhibits NF-kappaB activation via the tumour necrosis factor receptor associated factor pathway. The zinc finger domains appear to mediate self-association in A20. These fingers also mediate IL-1-induced NF-kappa B activation.
Actin is a ubiquitous protein involved in the formation of filaments that are major components of the cytoskeleton. These filaments interact with myosin to produce a sliding effect, which is the basis of muscular contraction and many aspects of cell motility, including cytokinesis. Each actin protomer binds one molecule of ATP and has one high affinity site for either calcium or magnesium ions, as well as several low affinity sites. Actin exists as a monomer in low salt concentrations, but filaments form rapidly as salt concentration rises, with the consequent hydrolysis of ATP. Actin from many sources forms a tight complex with deoxyribonuclease (DNase I) although the significance of this is still unknown. The formation of this complex results in the inhibition of DNase I activity, and actin loses its ability to polymerise. It has been shown that an ATPase domain of actin shares similarity with ATPase domains of hexokinase and hsp70 proteins.
In vertebrates there are three groups of actin isoforms: alpha, beta and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. In plants there are many isoforms which are probably involved in a variety of functions such as cytoplasmic streaming, cell shape determination, tip growth, graviperception, cell wall deposition, etc.
Recently some divergent actin-like proteins have been identified in several species. These proteins include centractin (actin-RPV) from mammals, fungi (yeast ACT5, Neurospora crassa ro-4) and Pneumocystis carinii, which seems to be a component of a multi-subunit centrosomal complex involved in microtubule-based vesicle motility (this subfamily is known as ARP1); the ARP2 subfamily, which includes chicken ACTL, Saccharomyces cerevisiae ACT2, Drosophila melanogaster 14D and Caenorhabditis elegans actC; the ARP3 subfamily, which includes actin 2 from mammals, Drosophila 66B, yeast ACT4 and Schizosaccharomyces pombe act2; and the ARP4 subfamily, which includes yeast ACT3 and Drosophila 13E.
The prokaryotic heat shock protein DnaJ interacts with the chaperone hsp70-like DnaK protein. Structurally, the DnaJ protein consists of an N-terminal conserved domain (called 'J' domain) of about 70 amino acids, a glycine-rich region ('G' domain) of about 30 residues, a central domain containing four repeats of a CXXCXGXG motif ('CRR' domain) and a C-terminal region of 120 to 170 residues.
Such a structure is shown in the following schematic representation:
It is thought that the 'J' domain of DnaJ mediates the interaction with the DnaK protein and consists of four helices, the second of which has a charged surface that includes at least one pair of basic residues that are essential for interaction with the ATPase domain of Hsp70. The J- and CRR-domains are found in many prokaryotic and eukaryotic proteins, either together or separately. In yeast, J-domains have been classified into 3 groups; the class III proteins are functionally distinct and do not appear to act as molecular chaperones.
The helix-hairpin-helix (HhH) motif is a domain of around 20 amino acids present in prokaryotic and eukaryotic non-sequence-specific DNA binding proteins. The HhH motif is similar to, but distinct from, the helix-turn-helix (HtH) and the helix-loop-helix (HLH) motifs. All three motifs have two helices (H1 and H2) connected by a short turn. DNA-binding proteins with a HhH structural motif are involved in non-sequence-specific DNA binding that occurs via the formation of hydrogen bonds between protein backbone nitrogens and DNA phosphate groups. These HhH motifs are observed in DNA repair enzymes and in DNA polymerases. By contrast, proteins with a HtH motif bind DNA in a sequence-specific manner through the binding of H2 with the major groove; these proteins are primarily gene regulatory proteins. DNA-binding proteins with the HLH structural motif are transcriptional regulatory proteins and are principally related to a wide array of developmental processes.
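The CXXCXGXG repeat of the central 'CRR' domain of DnaJ is a simple positional pattern, where X stands for any residue, so candidate repeats can be located with a regular expression. The following is a minimal sketch; the function name and the example sequence are made up for illustration and do not come from any real protein.

```python
import re

# CXXCXGXG with X = any residue. The lookahead makes the search report
# overlapping candidates as well, should they occur.
CRR_MOTIF = re.compile(r"(?=C..C.G.G)")


def find_crr_repeats(sequence: str) -> list:
    """Return 0-based start positions of CXXCXGXG matches in a protein sequence."""
    return [m.start() for m in CRR_MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Toy sequence (hypothetical) containing two copies of the motif.
    toy = "AAACPTCNGSGAAACKKCDGEGAA"
    print(find_crr_repeats(toy))
```

A real DnaJ sequence would be expected to yield four hits in its central domain, but a plain pattern match of this kind can also report chance occurrences elsewhere in a sequence.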
Examples of proteins that contain a HhH motif include the 5'-exonuclease domains of prokaryotic DNA polymerases, the eukaryotic/prokaryotic RAD2 family of 5'-3' exonucleases such as T4 RNase H and T5, eukaryotic 5' endonucleases such as FEN-1 (Flap), and some viral exonucleases.
This entry represents UBP-type zinc finger domains, which display some similarity with the Zn-binding domain of the insulinase family. The UBP-type zinc finger domain is found only in a small subfamily of ubiquitin C-terminal hydrolases (deubiquitinases or UBP). All members of this subfamily are isopeptidase-T, which are known to cleave isopeptide bonds between ubiquitin moieties.
Some of the proteins containing a UBP zinc finger include:
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
This entry represents ZZ-type zinc finger domains, named because of their ability to bind two zinc ions. These domains contain 4-6 Cys residues that participate in zinc binding (plus additional Ser/His residues), including a Cys-X2-Cys motif found in other zinc finger domains. These zinc fingers are thought to be involved in protein-protein interactions. The structure of the ZZ domain shows that it belongs to the family of cross-brace zinc finger motifs that include the PHD, RING, and FYVE domains. ZZ-type zinc finger domains are found in:
Single copies of the ZZ zinc finger occur in the transcriptional adaptor/coactivator protein P300, in cAMP response element-binding protein (CREB)-binding protein (CBP) and in ADA2. CBP provides several binding sites for transcriptional coactivators. The site at which the tumour suppressor protein p53 and the oncoprotein E1A interact with CBP/P300 is a Cys-rich region that incorporates two zinc-binding motifs: ZZ-type and TAZ2-type. The ZZ-type zinc finger of CBP contains two twisted anti-parallel beta-sheets and a short alpha-helix, and binds two zinc ions. One zinc ion is coordinated by four cysteine residues via two Cys-X2-Cys motifs, and the second zinc ion via a third Cys-X-Cys motif and a His-X-His motif. The first zinc cluster is strictly conserved, whereas the second zinc cluster displays variability in the position of the two His residues.
In Arabidopsis thaliana (Mouse-ear cress), the hypersensitive to red and blue 1 (Hrb1) protein, which regulates both red and blue light responses, contains a ZZ-type zinc finger domain.
ZZ-type zinc finger domains have also been identified in the testis-specific E3 ubiquitin ligase MEX that promotes death receptor-induced apoptosis. MEX has four putative zinc finger domains: one ZZ-type, one SWIM-type and two RING-type. The region containing the ZZ-type and RING-type zinc fingers is required for interaction with UbcH5a and for MEX self-association, whereas the SWIM domain is critical for MEX ubiquitination.
In addition, the Cys-rich domains of dystrophin, utrophin and an 87 kDa post-synaptic protein contain a ZZ-type zinc finger with high sequence identity to P300/CBP ZZ-type zinc fingers. In dystrophin and utrophin, the ZZ-type zinc finger lies between a WW domain (flanked by an EF hand) and the C-terminal coiled-coil domain. Dystrophin is thought to act as a link between the actin cytoskeleton and the extracellular matrix, and perturbations of the dystrophin-associated complex, for example, between dystrophin and the transmembrane glycoprotein beta-dystroglycan, may lead to muscular dystrophy. Dystrophin and its autosomal homologue utrophin interact with beta-dystroglycan via their C-terminal regions, which comprise a WW domain, an EF hand domain and a ZZ-type zinc finger domain. The WW domain is the primary site of interaction between dystrophin or utrophin and dystroglycan, while the EF hand and ZZ-type zinc finger domains stabilise and strengthen this interaction.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
The BRCT domain (named after the C-terminal domain of a breast cancer susceptibility protein) is found predominantly in proteins involved in cell cycle checkpoint functions responsive to DNA damage, for example in the breast cancer DNA-repair protein BRCA1. The domain is an approximately 100 amino acid tandem repeat, which appears to act as a phospho-protein binding domain.
A chitin biosynthesis protein from yeast also seems to belong to this group.
The precise function of the domain is unclear, but it may be involved in protein-protein interactions and may play a role in assembly or activity of multi-component complexes involved in transcriptional activation.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.
Clathrin is a trimer composed of three heavy chains and three light chains, each monomer projecting outwards like a leg; this three-legged structure is known as a triskelion. The heavy chains form the legs, their N-terminal beta-propeller regions extending outwards, while their C-terminal alpha-alpha-superhelical regions form the central hub of the triskelion. Peptide motifs can bind between the beta-propeller blades. The light chains appear to have a regulatory role, and may help orient the assembly and disassembly of clathrin coats as they interact with hsc70 uncoating ATPase. Clathrin triskelia self-polymerise into a curved lattice by twisting individual legs together. The clathrin lattice forms around a vesicle as it buds from the TGN, plasma membrane or endosomes, acting to stabilise the vesicle and facilitate the budding process. The multiple blades created when the triskelia polymerise are involved in multiple protein interactions, enabling the recruitment of different cargo adaptors and membrane attachment proteins.
This entry represents the 7-fold alpha-alpha-superhelical ARM-type repeat found at the C-terminus of clathrin heavy chains and in VPS (vacuolar protein sorting-associated) proteins. In clathrin heavy chains, the C-terminal 7-fold ARM-type repeats interact to form the central hub of the triskelion. VPS proteins are required for vacuolar assembly and vacuolar trafficking, and contain one clathrin-type repeat.
More information about these proteins can be found at Protein of the Month: Clathrin.
This entry represents the dynamin GTPase effector domain, which is found in proteins related to dynamin.
Dynamin is a GTP-hydrolysing protein that is an essential participant in clathrin-mediated endocytosis by cells. It self-assembles into 'collars' in vivo at the necks of invaginated coated pits; the self-assembly of dynamin being coordinated by the GTPase domain. Mutation studies indicate that dynamin functions as a molecular regulator of receptor-mediated endocytosis.
The PWI domain, named after a highly conserved PWI tri-peptide located within its N-terminal region, is a ~80 amino acid module, which is found either at the N-terminus or at the C-terminus of eukaryotic proteins involved in pre-mRNA processing. It is generally found in association with other domains such as RRM and RS. The PWI domain is an RNA/DNA-binding domain that has an equal preference for single- and double-stranded nucleic acids and is likely to have multiple important functions in pre-mRNA processing. Proteins containing this domain include the SR-related nuclear matrix protein of 160 kDa (SRm160) splicing and 3'-end cleavage-stimulatory factor, and the mammalian splicing factor PRP3.
The PWI domain is a soluble, globular and independently folded domain which consists of a four-helix bundle, with structured N- and C-terminal elements.
The PX (phox) domain occurs in a variety of eukaryotic proteins and has been implicated in highly diverse functions such as cell signalling, vesicular trafficking, protein sorting and lipid modification. PX domains are important phosphoinositide-binding modules that have varying lipid-binding specificities. The PX domain is approximately 120 residues long, and folds into a three-stranded beta-sheet followed by three alpha-helices and a proline-rich region that immediately precedes a membrane-interaction loop and spans approximately eight hydrophobic and polar residues. The PX domain of p47(phox) binds to the SH3 domain in the same protein. Phosphorylation of p47(phox), a cytoplasmic activator of the microbicidal phagocyte oxidase (phox), elicits interaction of p47(phox) with phosphoinositides. The protein phosphorylation-driven conformational change of p47(phox) enables its PX domain to bind to phosphoinositides, an interaction that plays a crucial role in recruitment of p47(phox) from the cytoplasm to membranes and subsequent activation of the phagocyte oxidase. The lipid-binding activity of this protein is normally suppressed by intramolecular interaction of the PX domain with the C-terminal Src homology 3 (SH3) domain.
The PX domain is conserved from yeast to human. Multiple alignments of representative PX domain sequences show relatively little sequence conservation, but the structure appears to be highly conserved. Although phosphatidylinositol-3-phosphate (PtdIns(3)P) is the primary target of PX domains, binding to phosphatidic acid, phosphatidylinositol-3,4-bisphosphate (PtdIns(3,4)P2), phosphatidylinositol-3,5-bisphosphate (PtdIns(3,5)P2), phosphatidylinositol-4,5-bisphosphate (PtdIns(4,5)P2), and phosphatidylinositol-3,4,5-trisphosphate (PtdIns(3,4,5)P3) has been reported as well. The PX domain is also a protein-protein interaction domain.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
The S1 domain was originally identified in ribosomal protein S1 but is found in a large number of RNA-associated proteins. The structure of the S1 RNA-binding domain from the Escherichia coli polynucleotide phosphorylase has been determined using NMR methods and consists of a five-stranded antiparallel beta barrel. Conserved residues on one face of the barrel and adjacent loops form the putative RNA-binding site.
The structure of the S1 domain is very similar to that of cold shock proteins. This suggests that they may both be derived from an ancient nucleic acid-binding protein.
More information about these proteins can be found at Protein of the Month: RNA Exosomes.
The SET domain generally appears as one part of a larger multidomain protein. Three structures of very different proteins with distinct domain compositions have been described: Neurospora crassa DIM-5, a member of the Su(var) family of histone lysine methyltransferases (HKMTs), which methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although electron density maps revealed the location of the AdoMet or AdoHcy cofactor in all three studies, the SET domain bears no similarity at all to the canonical AdoMet-dependent methyltransferase fold. A tyrosine that is strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, the Rubisco LSMT structure shows that AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. SET7/9, by comparison, is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination, and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure.
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines and in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone target and the roles of the invariant residues in the SET domain in determining the methylation specificities.
The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These nine cysteines coordinate three zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together two long segments of random coil.
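The counts given for the pre-SET cluster imply that some cysteines must be shared between zinc ions. The following is a minimal sketch of that arithmetic in Python, under the assumption (not stated in the text) that each cysteine contacts either one zinc ion or two; the variable names are illustrative only.

```python
# Counts stated in the text.
n_cys = 9            # invariant cysteines in the pre-SET domain
n_zn = 3             # zinc ions in the triangular cluster
ligands_per_zn = 4   # tetrahedral coordination: four cysteines per zinc

# Total zinc-cysteine contacts required by the cluster.
contacts = n_zn * ligands_per_zn

# ASSUMPTION: a cysteine binds one zinc (terminal) or two (bridging), so
#   terminal + bridging     = n_cys
#   terminal + 2 * bridging = contacts
bridging = contacts - n_cys
terminal = n_cys - bridging

print(contacts, bridging, terminal)
```

Under that assumption, twelve contacts are supplied by nine cysteines only if three of them bridge two zinc ions each, which is consistent with a triangular arrangement.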
The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity.
Staphylococcus aureus nuclease (SNase) homologues, previously thought to be restricted to bacteria and archaea, are also found in eukaryotes. Staphylococcal nuclease has a multidomain organization. The human cellular coactivator p100 contains four repeats, each of which is a SNase homologue. These repeats are unlikely to possess SNase-like activities as each lacks equivalent SNase catalytic residues, yet they may mediate p100's single-stranded DNA-binding function. A variety of proteins, including many that are still uncharacterised, belong to this group.
WD-40 repeats (also known as WD or beta-transducin repeats) are short ~40 amino acid motifs, often terminating in a Trp-Asp (W-D) dipeptide. WD40 repeats usually assume a 7-8 bladed beta-propeller fold, but proteins have been found with 4 to 16 repeated units, which also form a circularised beta-propeller structure. WD-repeat proteins are a large family found in all eukaryotes and are implicated in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control and apoptosis. Repeated WD40 motifs act as a site for protein-protein interaction, and proteins containing WD40 repeats are known to serve as platforms for the assembly of protein complexes or mediators of transient interplay among other proteins. The specificity of the proteins is determined by the sequences outside the repeats themselves. Examples of such complexes are G proteins (beta subunit is a beta-propeller), TAFII transcription factor, and E3 ubiquitin ligase. In Arabidopsis spp., several WD40-containing proteins act as key regulators of plant-specific developmental events.
The K homology (KH) domain was first identified in the human heterogeneous nuclear ribonucleoprotein (hnRNP) K. An evolutionarily conserved sequence of around 70 amino acids, the KH domain is present in a wide variety of nucleic acid-binding proteins. The KH domain binds RNA, and can function in RNA recognition. It is found in multiple copies in several proteins, where the copies can function cooperatively or independently. For example, in the AU-rich element RNA-binding protein KSRP, which has 4 KH domains, KH domains 3 and 4 behave as independent binding modules to interact with different regions of the AU-rich RNA targets. The solution structures of the first KH domain of FMR1 and of the C-terminal KH domain of hnRNP K, determined by nuclear magnetic resonance (NMR), revealed a beta-alpha-alpha-beta-beta-alpha structure. Proteins containing KH domains include:
More information about these proteins can be found at Protein of the Month: RNA Exosomes.
This domain is found in protein phosphatase 2C, as well as in other proteins, e.g. [pyruvate dehydrogenase (lipoamide)]-phosphatase and adenylate cyclase.
Protein phosphatase 2C (PP2C) is one of the four major classes of mammalian serine/threonine specific protein phosphatases. PP2C is a monomeric enzyme of about 42 kDa which shows broad substrate specificity and is dependent on divalent cations (mainly manganese and magnesium) for its activity. Its exact physiological role is still unclear. Three isozymes are currently known in mammals: PP2C-alpha, -beta and -gamma. In yeast, there are at least four PP2C homologs: phosphatase PTC1, which has weak tyrosine phosphatase activity in addition to its activity on serines, phosphatases PTC2 and PTC3, and hypothetical protein YBR125c. Isozymes of PP2C are also known from Arabidopsis thaliana (ABI1, PPH1), Caenorhabditis elegans (FEM-2, F42G9.1, T23F11.1), Leishmania chagasi and Paramecium tetraurelia. In A. thaliana, the kinase associated protein phosphatase (KAPP) is an enzyme that dephosphorylates the Ser/Thr receptor-like kinase RLK5 and which contains a C-terminal PP2C domain.
PP2C does not seem to be evolutionarily related to the main family of serine/threonine phosphatases: PP1, PP2A and PP2B. However, it is significantly similar to the catalytic subunit of pyruvate dehydrogenase phosphatase (PDPC), which catalyzes dephosphorylation and concomitant reactivation of the alpha subunit of the E1 component of the pyruvate dehydrogenase complex. PDPC is a mitochondrial enzyme and, like PP2C, is magnesium-dependent.
The Drosophila tudor protein is encoded by a 'posterior group' gene, which, when mutated, disrupts normal abdominal segmentation and pole cell formation. Another Drosophila gene, homeless, is required for RNA localization during oogenesis. The tudor protein contains multiple repeats of a domain which is also found in homeless.
The tudor domain is found in many proteins that colocalise with ribonucleoprotein or single-strand DNA-associated complexes in the nucleus, in the mitochondrial membrane, or at kinetochores. It is not known whether the domain binds directly to RNA and ssDNA, or controls interactions with the nucleoprotein complexes. At least one tudor-containing protein, homeless, also contains a zinc finger typical of RNA-binding proteins.
The solution structure of the Tudor domain of human SMN revealed that the domain forms a strongly bent antiparallel beta-sheet with five strands forming a barrel-like fold. The structure exhibits a conserved negatively charged surface that interacts with the C-terminal Arg and Gly-rich tails of the spliceosomal Sm D1 and D3 proteins.
This entry represents B-box-type zinc finger domains, which are around 40 residues in length. B-box zinc fingers can be divided into two groups, where types 1 and 2 B-box domains differ in their consensus sequence and in the spacing of the 7-8 zinc-binding residues. Several proteins contain both types 1 and 2 B-boxes, suggesting some level of cooperativity between these two domains. B-box domains are found in over 1500 proteins from a variety of organisms. They are found in TRIM (tripartite motif) proteins that consist of an N-terminal RING finger (originally called an A-box), followed by 1-2 B-box domains and a coiled-coil domain (also called RBCC for Ring, B-box, Coiled-Coil). TRIM proteins contain a type 2 B-box domain, and may also contain a type 1 B-box. In proteins that do not contain RING or coiled-coil domains, the B-box domain is primarily type 2. Many type 2 B-box proteins are involved in ubiquitinylation. Proteins containing a B-box zinc finger domain include transcription factors, ribonucleoproteins and proto-oncoproteins; for example, MID1, MID2, TRIM9, TNL, TRIM36, TRIM63, TRIFIC, NCL1 and CONSTANS-like proteins.
The microtubule-associated E3 ligase MID1 contains a type 1 B-box zinc finger domain. MID1 specifically binds Alpha-4, which in turn recruits the catalytic subunit of phosphatase 2A (PP2Ac). This complex is required for targeting of PP2Ac for proteasome-mediated degradation. The MID1 B-box coordinates two zinc ions and adopts a beta/beta/alpha cross-brace structure similar to that of ZZ, PHD, RING and FYVE zinc fingers.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
This entry represents the CysCysHisCys (CCHC) type zinc finger domains, which have the sequence:
where X can be any amino acid, and the number indicates the number of residues. These 18-residue CCHC zinc finger domains are mainly found in the nucleocapsid proteins of retroviruses, where they are required for viral genome packaging and for the early infection process. They are also found in eukaryotic proteins involved in RNA binding or single-stranded DNA binding.
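A spacing pattern of this kind maps directly onto a regular expression. The sketch below is illustrative only: the entry text does not reproduce the pattern, so the spacing C-X2-C-X4-H-X4-C used here is an assumption based on the commonly quoted CCHC "zinc knuckle" consensus, and the function name and test sequence are hypothetical.

```python
import re

# ASSUMPTION: C-X2-C-X4-H-X4-C is the commonly quoted CCHC spacing;
# the source text does not give the actual pattern.
CCHC = re.compile(r"C.{2}C.{4}H.{4}C")

def find_cchc(seq):
    """Return (start, matched_text) for each non-overlapping CCHC-like hit."""
    return [(m.start(), m.group()) for m in CCHC.finditer(seq.upper())]

# Hypothetical toy sequence, not a real protein.
toy = "MKAAACFNCGKEGHIARNCRAAA"
print(find_cchc(toy))
```

A match only shows that the four zinc-ligand positions have the expected spacing; it says nothing about whether the region actually binds zinc.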
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
MCM proteins are DNA-dependent ATPases required for the initiation of eukaryotic DNA replication. In eukaryotes there is a family of six proteins, MCM2 to MCM7. They were first identified in yeast where most of them have a direct role in the initiation of chromosomal DNA replication by interacting directly with autonomously replicating sequences (ARS). They were thus called minichromosome maintenance proteins, MCM proteins.
This family is also present in the archaebacteria in 1 to 4 copies. Methanocaldococcus jannaschii (Methanococcus jannaschii) has four members, MJ0363, MJ0961, MJ1489 and MJECL13.
The "MCM motif" contains Walker-A and Walker-B type nucleotide binding motifs. The diagnostic sequence defining the MCMs is IDEFDKM. Only Mcm2 (aka Cdc19 or Nda1) has been subjected to mutational analysis in this region, and most mutations abolish its activity. The presence of a putative ATP-binding domain implies that these proteins may be involved in an ATP-consuming step in the initiation of DNA replication in eukaryotes.
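Because the diagnostic sequence is a fixed heptapeptide, a candidate protein can be screened for it with a simple sliding window. The following is a minimal sketch; the function name, the mismatch tolerance and the toy sequence are assumptions added for illustration and do not come from the entry.

```python
MCM_DIAGNOSTIC = "IDEFDKM"  # diagnostic sequence quoted in the text

def scan_motif(seq, motif=MCM_DIAGNOSTIC, max_mismatches=0):
    """Return (position, mismatches) for each window close enough to motif."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        # Count positions where the window differs from the motif.
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits

# Hypothetical toy sequence with one exact and one near-exact copy.
toy = "GSLCIDEFDKMNEQAACIDEYDKMS"
print(scan_motif(toy))                     # exact matches only
print(scan_motif(toy, max_mismatches=1))   # also allow one substitution
```

Allowing a mismatch is a design choice for picking up diverged homologues; how many substitutions are tolerable in practice is not addressed by the text.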
The MCM proteins bind together in a large complex. Within this complex, individual subunits associate with different affinities, and there is a tightly associated core of Mcm4 (Cdc21), Mcm6 (Mis5) and Mcm7. This core complex in human MCMs has been associated with helicase activity in vitro, leading to the suggestion that the MCM proteins are the eukaryotic replicative helicase.
Schizosaccharomyces pombe (Fission yeast) MCMs, like those in metazoans, are found in the nucleus throughout the cell cycle. This is in contrast to Saccharomyces cerevisiae (Baker's yeast), in which MCM proteins move in and out of the nucleus during each cell cycle. The assembly of the MCM complex in S. pombe is required for MCM localisation, ensuring that only intact MCM complexes remain in the nucleus.
C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents zinc finger domains resembling the C2H2-type.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
This entry represents C-x8-C-x5-C-x3-H (CCCH) type zinc finger (Znf) domains. Proteins containing CCCH Znf domains include Znf proteins from eukaryotes involved in cell cycle or growth phase-related regulation, e.g. human TIS11B (butyrate response factor 1), a probable regulatory protein involved in regulating the response to growth factors, and the mouse TTP growth factor-inducible nuclear protein, which has the same function. Another protein containing this domain is the human splicing factor U2AF 35 kD subunit, which plays a critical role in both constitutive and enhancer-dependent splicing by mediating essential protein-protein interactions and protein-RNA interactions required for 3' splice site selection. It has been shown that different CCCH-type Znf proteins interact with the 3'-untranslated region of various mRNAs. This type of Znf is very often present in two copies.
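The spacing notation C-x8-C-x5-C-x3-H can be read as a simple sequence pattern: a cysteine, eight residues of any kind, a cysteine, five residues, a cysteine, three residues, then a histidine. The following is a minimal sketch of that reading as a regular expression. The function name find_ccch and the toy sequence are hypothetical and for illustration only; a plain pattern match of this kind is far less sensitive than the profile-based methods normally used to detect domains.

```python
import re

# C-x8-C-x5-C-x3-H: "x" is any residue, the number is the spacer length.
CCCH_PATTERN = re.compile(r"C[A-Z]{8}C[A-Z]{5}C[A-Z]{3}H")


def find_ccch(sequence):
    """Return (start, matched_text) for each non-overlapping CCCH match."""
    seq = sequence.upper()
    return [(m.start(), m.group()) for m in CCCH_PATTERN.finditer(seq)]


# Synthetic sequence constructed for illustration (not a real protein).
toy = "MA" + "C" + "A" * 8 + "C" + "G" * 5 + "C" + "S" * 3 + "H" + "KK"
hits = find_ccch(toy)
print(hits)
```

Because the entry notes that this type of Znf is very often present in two copies, a caller would typically expect two hits per protein rather than one.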
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
The PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was named after the proteins in which it was first found. PUA is a highly conserved RNA-binding motif found in a wide range of archaeal, bacterial and eukaryotic proteins, including enzymes that catalyse tRNA and rRNA post-transcriptional modifications, proteins involved in ribosome biogenesis and translation, as well as in enzymes involved in proline biosynthesis. The structures of several PUA-RNA complexes reveal a common RNA recognition surface, but also some versatility in the way in which the motif binds to RNA. PUA motifs are involved in dyskeratosis congenita and cancer, pointing to links between RNA metabolism and human diseases.
Many eukaryotic proteins containing one or more copies of a putative RNA-binding domain of about 90 amino acids are known to bind single-stranded RNAs. The largest group of single strand RNA-binding proteins is the eukaryotic RNA recognition motif (RRM) family that contains an eight amino acid RNP-1 consensus sequence. RRM proteins have a variety of RNA binding preferences and functions, and include heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing (SR, U2AF, Sxl), protein components of small nuclear ribonucleoproteins (U1 and U2 snRNPs), and proteins that regulate RNA stability and translation (PABP, La, Hu). The RRM in heterodimeric splicing factor U2 snRNP auxiliary factor (U2AF) appears to have two RRM-like domains with specialised features for protein recognition. The motif also appears in a few single stranded DNA binding proteins.
The typical RRM consists of four anti-parallel beta-strands and two alpha-helices arranged in a beta-alpha-beta-beta-alpha-beta fold with side chains that stack with RNA bases. Specificity of RNA binding is determined by multiple contacts with surrounding amino acids. A third helix is present during RNA binding in some cases. The RRM is reviewed in a number of publications.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
The S4 domain is a small domain consisting of 60-65 amino acid residues that was detected in the bacterial ribosomal protein S4, eukaryotic ribosomal S9, two families of pseudouridine synthases, a novel family of predicted RNA methylases, a yeast protein containing a pseudouridine synthetase and a deaminase domain, bacterial tyrosyl-tRNA synthetases, and a number of uncharacterised, small proteins that may be involved in translation regulation. The S4 domain probably mediates binding to RNA.
Glutamate synthase (GltS) is a key enzyme in the early stages of the assimilation of ammonia in bacteria, yeasts, and plants. In bacteria, L-glutamate is involved in osmoregulation, is the precursor for other amino acids, and can be the precursor for haem biosynthesis. In plants, GltS is especially essential in the reassimilation of ammonia released by photorespiration. On the basis of the amino acid sequence and the nature of the electron donor, three different classes of GltS can be defined as follows: 1) ferredoxin-dependent GltS (Fd-GltS), 2) NADPH-dependent GltS (NADPH-GltS), and 3) NADH-dependent GltS (properties of the three classes have been reviewed extensively). The enzyme is a complex iron-sulphur flavoprotein catalysing the reductive transfer of the amido nitrogen from L-glutamine to 2-oxoglutarate to form two molecules of L-glutamate via intramolecular channelling of ammonia from the amidotransferase domain to the FMN-binding domain.
Reaction of amidotransferase domain:
L-glutamine + H2O = L-glutamate + NH3
Reactions of FMN-binding domain:
2-oxoglutarate + NH3 = 2-iminoglutarate + H2O
2e + FMNox = FMNred
2-iminoglutarate + FMNred = L-glutamate + FMNox
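The four partial reactions listed above can be summed to check the statement that one L-glutamine and one 2-oxoglutarate yield two molecules of L-glutamate. The sketch below does this bookkeeping on the named species only, exactly as they are written in the list (protons are not tracked because the list does not show them). The function name net_reaction and the data layout are hypothetical.

```python
from collections import Counter

# Each reaction is (reactants, products), copied from the list in the text.
REACTIONS = [
    (["L-glutamine", "H2O"], ["L-glutamate", "NH3"]),
    (["2-oxoglutarate", "NH3"], ["2-iminoglutarate", "H2O"]),
    (["e-", "e-", "FMNox"], ["FMNred"]),
    (["2-iminoglutarate", "FMNred"], ["L-glutamate", "FMNox"]),
]


def net_reaction(reactions):
    """Sum the reactions and return (consumed, produced) species counts."""
    balance = Counter()
    for reactants, products in reactions:
        balance.subtract(reactants)  # reactants count negatively
        balance.update(products)     # products count positively
    consumed = {s: -n for s, n in balance.items() if n < 0}
    produced = {s: n for s, n in balance.items() if n > 0}
    return consumed, produced


consumed, produced = net_reaction(REACTIONS)
print("consumed:", consumed)
print("produced:", produced)
```

Intermediates that appear on both sides (NH3, H2O, 2-iminoglutarate and both FMN states) cancel, which is consistent with ammonia being channelled between the two domains rather than released.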
The 3-D structure of ribonuclease inhibitor, a protein containing 15 LRRs, has been determined, revealing LRRs to be a new class of alpha/beta fold. LRRs form elongated non-globular structures and are often flanked by cysteine rich domains. This subtype is found in ribonuclease inhibitors.
This entry represents a most populated subfamily of leucine-rich repeats.
AAA ATPases (ATPases Associated with diverse cellular Activities) form a large protein family and play a number of roles in the cell including cell-cycle regulation, proteolysis and protein disaggregation, organelle biogenesis and intracellular transport. Some of them function as molecular chaperones, subunits of proteolytic complexes or independent proteases (FtsH, Lon). They also act as DNA helicases and transcription factors.
AAA ATPases belong to the AAA+ superfamily of ring-shaped P-loop NTPases, which act via the energy-dependent unfolding of macromolecules. There are six major clades of AAA domains (proteasome subunits, metalloproteases, domains D1 and D2 of ATPases with two AAA domains, the MSP1/katanin/spastin group and BCS1 and its homologues), as well as a number of deeply branching minor clades.
They form oligomeric assemblies (often hexamers) with a ring-shaped structure and a central pore. These proteins produce a molecular motor that couples ATP binding and hydrolysis to changes in conformational states that act upon a target substrate, either translocating or remodelling it.
They are found in all living organisms and share a highly conserved AAA domain called the AAA module. This domain is responsible for ATP binding and hydrolysis. It contains 200-250 residues, including two classical motifs, Walker A (GX4GKT) and Walker B (HyDE).
More information about these proteins can be found at Protein of the Month: AAA ATPases. This entry represents the core domain of the AAA+ ATPases.
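The Walker A consensus quoted above, GX4GKT, can be scanned for with a short pattern match. The sketch below is illustrative only. It takes the consensus literally as written in this entry, and it reads "Hy" in the Walker B shorthand HyDE as a single hydrophobic residue, which is an assumption about the notation rather than something the entry states. The function name walker_sites and the toy sequence are hypothetical.

```python
import re

# Walker A as written in the text: G, any four residues, then G-K-T.
WALKER_A = re.compile(r"G[A-Z]{4}GKT")

# Walker B "HyDE". ASSUMPTION: "Hy" is read here as one hydrophobic
# residue (chosen from A, V, I, L, M, F, W) followed by D-E.
WALKER_B = re.compile(r"[AVILMFW]DE")


def walker_sites(sequence):
    """Return start positions (0-based) of candidate Walker A and B motifs."""
    seq = sequence.upper()
    return {
        "walker_a": [m.start() for m in WALKER_A.finditer(seq)],
        "walker_b": [m.start() for m in WALKER_B.finditer(seq)],
    }


# Synthetic sequence constructed for illustration (not a real protein).
toy = "MS" + "GPPGTGKT" + "LLAR" + "IDE" + "K"
sites = walker_sites(toy)
print(sites)
```

A three-residue Walker B pattern like this one will match by chance in many unrelated sequences, so hits are only meaningful when a Walker A candidate lies upstream within the same 200-250 residue module.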
AT hooks are DNA-binding motifs with a preference for A/T rich regions. These motifs are found in a variety of proteins, including the high mobility group (HMG) proteins, in DNA-binding proteins from plants and in hBRG1 protein, a central ATPase of the human switching/sucrose non-fermenting (SWI/SNF) remodeling complex.
High mobility group (HMG) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG-I and HMG-Y (HMGA) are proteins of about 100 amino acid residues which are produced by the alternative splicing of a single gene. HMG-I/Y proteins bind preferentially to the minor groove of AT-rich regions in double-stranded DNA in a non-sequence specific manner. It is suggested that these proteins could function in nucleosome phasing and in the 3' end processing of mRNA transcripts. They are also involved in the transcription regulation of genes containing, or in close proximity to, AT-rich regions.
Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles, and regulate cyclin dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins, G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E.
Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi's sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus.
This domain is also found in transcription factor IIB (TFIIB) and the retinoblastoma protein.
The HAT (Half A TPR) repeat has a repetitive pattern characterised by three aromatic residues with a conserved spacing. They are structurally and sequentially similar to TPRs (tetratricopeptide repeats), though they lack the highly conserved alanine and glycine residues found in TPRs. The number of HAT repeats found in different proteins varies between 9 and 12. HAT-repeat-containing proteins appear to be components of macromolecular complexes that are required for RNA processing. The repeats may be involved in protein-protein interactions. The HAT motif has striking structural similarities to HEAT repeats, being of a similar length and consisting of two short helices connected by a loop domain, as in HEAT repeats.
This domain is found in several ATP-binding proteins, for example histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
The process of vesicular fusion with target membranes depends on a set of SNAREs (SNAP-Receptors), which are associated with the fusing membranes. Target SNAREs (t-SNAREs) are localised on the target membrane and belong to two different families, the syntaxin-like family and the SNAP-25 like family. One member of each family, together with a v-SNARE localised on the vesicular membrane, are required for fusion.
The syntaxins are type-I transmembrane proteins that contain several regions with coiled-coil propensity in their cytosolic part, the SNARE motif. SNAP-25 is a protein consisting of two coiled-coil regions, which is associated with the membrane by lipid anchors. SNARE motifs assemble into parallel four-helix bundles stabilised by the burial of their hydrophobic helix faces in the bundle core. Monomeric SNARE motifs are disordered, so this assembly reaction is accompanied by a dramatic increase in alpha-helical secondary structure. The parallel arrangement of SNARE motifs within complexes brings the transmembrane anchors, and the two membranes, into close proximity. Recently, it was shown that the two coiled-coil regions of SNAP-25 and one of the coiled-coil regions of the syntaxins are related. This domain is found in both syntaxin and SNAP-25 families as well as in other proteins.
High mobility group (HMG or HMGB) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG1 (also called HMG-T in fish) and HMG2 are two highly related proteins that bind single-stranded DNA preferentially and unwind double-stranded DNA. Although they have no sequence specificity, they have a high affinity for bent or distorted DNA, and bend linear DNA. HMG1 and HMG2 contain two DNA-binding HMG-box domains (A and B) that show structural and functional differences, and have a long acidic C-terminal domain rich in aspartic and glutamic acid residues. The acidic tail modulates the affinity of the tandem HMG boxes in HMG1 and 2 for a variety of DNA targets. HMG1 and 2 appear to play important architectural roles in the assembly of nucleoprotein complexes in a variety of biological processes, for example V(D)J recombination, the initiation of transcription, and DNA repair.
The profile in this entry describing the HMG-domains is much more general than the signature. In addition to the HMG1 and HMG2 proteins, HMG-domains occur in single or multiple copies in the following protein classes; the SOX family of transcription factors; SRY sex determining region Y protein and related proteins; LEF1 lymphoid enhancer binding factor 1; SSRP recombination signal recognition protein; MTF1 mitochondrial transcription factor 1; UBF1/2 nucleolar transcription factors; Abf2 yeast ARS-binding factor; and Saccharomyces cerevisiae transcription factors Ixr1, Rox1, Nhp6a, Nhp6b and Spp41.
Histone H3 is one of the four histones, along with H2A, H2B and H4, which form the eukaryotic nucleosome octamer core; the nucleosome octamer winds ~146 DNA base-pairs. It is a highly conserved protein of 135 amino acid residues.
Several proteins have been found to contain a C-terminal H3-like domain, including the mammalian centromeric protein CENP-A (which may act as a core histone necessary for the assembly of centromeres); yeast chromatin-associated protein CSE4; and Caenorhabditis elegans chromosome III proteins YL82_CAEEL and YMH3_CAEEL, whose function is unknown.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils.
Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions (differs between eukaryotic and bacterial enzymes), domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA.
Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.
This entry represents subunit B (gyrB and parE) of bacterial gyrase and topoisomerase IV, and the equivalent N-terminal region in eukaryotic topoisomerase II composed of a single polypeptide. This subunit has ATPase and subunit interaction capacity.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
This entry represents subunit A (gyrA and parC) of bacterial gyrase and topoisomerase IV, and the equivalent C-terminal region in eukaryotic topoisomerase II composed of a single polypeptide. This subunit has DNA-binding capacity.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.
This entry represents the C-terminal region of DNA topoisomerase I enzymes from eukaryotes (type IB enzymes). This region covers both the catalytic core and the DNA-binding domains.
Human topoisomerase I has been shown to be inhibited by camptothecin (CPT), a plant alkaloid with antitumour activity. The crystal structures of human topoisomerase I comprising the core and carboxyl-terminal domains in covalent and noncovalent complexes with 22-base pair DNA duplexes reveal an enzyme that "clamps" around essentially B-form DNA. The core domain and the first eight residues of the carboxyl-terminal domain of the enzyme, including the active-site nucleophile tyrosine-723, share significant structural similarity with the bacteriophage family of DNA integrases. A binding mode for the anticancer drug camptothecin has been proposed on the basis of chemical and biochemical information combined with the three-dimensional structures of topoisomerase I-DNA complexes.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
This entry describes domain 2 found in type IA topoisomerases, which may be an extension of the Toprim domain. The structures of bacterial topoisomerases I and III have been shown to consist of four domains that together form a toroidal structure with a central hole large enough to accommodate single- and double-stranded DNA. The N-terminal Toprim domain together with domain 3 forms the active site of the enzyme, while domains 2 and 4 form a single-strand DNA-binding groove. The Toprim domain forms a compact Rossmann fold that coordinates the Mg2+ ion.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
This entry describes the DNA-binding domain (domain 3) found in type IA topoisomerases. The structures of bacterial topoisomerases I and III have been shown to consist of four domains that together form a toroidal structure with a central hole large enough to accommodate single- and double-stranded DNA. The N-terminal Toprim domain together with domain 3 (beta-barrel) forms the active site of the enzyme, while domains 2 and 4 (both winged-helix-like) form a single-strand DNA-binding groove. All topoisomerases cleave DNA by forming a transient phosphotyrosine bond; in type IA topoisomerases, the active site tyrosine is in domain 3.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents a zinc finger motif found in transcription factor IIS (TFIIS). In eukaryotes the initiation of transcription of protein-encoding genes by polymerase II (Pol II) is modulated by general and specific transcription factors. The general transcription factors operate through common promoter elements (such as the TATA box). At least eight different proteins associate to form the general transcription factors: TFIIA, -IIB, -IID, -IIE, -IIF, -IIG, -IIH and -IIS. During mRNA elongation, Pol II can encounter DNA sequences that cause reverse movement of the enzyme. Such backtracking involves extrusion of the RNA 3'-end into the pore, and can lead to transcriptional arrest. Escape from arrest requires cleavage of the extruded RNA with the help of TFIIS, which induces mRNA cleavage by enhancing the intrinsic nuclease activity of Pol II, allowing transcription to proceed past template-encoded pause sites. TFIIS extends from the polymerase surface via a pore to the internal active site. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.
TFIIS is a protein of about 300 amino acids. It contains three regions: a variable N-terminal domain not required for TFIIS activity; a conserved central domain required for Pol II binding; and a conserved C-terminal C4-type zinc finger essential for RNA cleavage. The zinc finger folds in a conformation termed a zinc ribbon, characterised by a three-stranded antiparallel beta-sheet and two beta-hairpins. A backbone model for the Pol II-TFIIS complex was obtained from X-ray analysis. It shows that a beta-hairpin protrudes from the zinc finger and complements the Pol II active site.
Some viral proteins also contain the TFIIS zinc ribbon C-terminal domain. The Vaccinia virus protein, unlike its eukaryotic homologue, is an integral RNA polymerase subunit rather than a readily separable transcription factor.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
This motif occurs C-terminal to leucine-rich repeats in "sds22-like" and "typical" LRR-containing proteins. Examples from the metazoa are described as either "Acidic leucine-rich nuclear phosphoprotein 32 family member A" or have been characterised as U2A', the protein that interacts with U2B'', facilitating the interaction with U2 snRNA. U2A' is required for spliceosome assembly and the efficient addition of U2 snRNP onto the pre-mRNA. The crystal structure of the spliceosomal U2B''-U2A' protein complex bound to a fragment of U2 small nuclear RNA has been described.
Rhodanese, a sulphurtransferase involved in cyanide detoxification, shares an evolutionary relationship with a large family of proteins.
Rhodanese has an internal duplication. This domain is found as a single copy in other proteins, including phosphatases and ubiquitin C-terminal hydrolases.
C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents U1-type zinc finger domains, a family of C2H2-type zinc fingers present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
The sterile alpha motif (SAM) domain is a putative protein interaction module present in a wide variety of proteins involved in many biological processes. The SAM domain, which spans around 70 residues, is found in diverse eukaryotic organisms. SAM domains have been shown to homo- and hetero-oligomerise, forming multiple self-association architectures, and also to bind various non-SAM domain-containing proteins, albeit with low affinity. SAM domains also appear to possess the ability to bind RNA. Smaug, a protein that helps to establish a morphogen gradient in Drosophila embryos by repressing the translation of nanos (nos) mRNA, binds to the 3' untranslated region (UTR) of nos mRNA via two similar hairpin structures. The 3D crystal structure of the Smaug RNA-binding region shows a cluster of positively charged residues on the Smaug-SAM domain, which could be the RNA-binding surface. This electropositive potential is unique among all previously determined SAM-domain structures and is conserved among Smaug-SAM homologs. These results suggest that the SAM domain might have a primary role in RNA binding.
Structural analyses show that the SAM domain is arranged in a small five-helix bundle with two large interfaces. In the case of the SAM domain of EphB2, each of these interfaces is able to form dimers. The presence of these two distinct inter-monomer binding surfaces suggests that SAM could form extended polymeric structures.
Synonym(s): Rsp5 or WWP domain
The WW domain is a short conserved region in a number of unrelated proteins, which folds as a stable, triple-stranded beta-sheet. This short domain of approximately 40 amino acids may be repeated up to four times in some proteins. The name WW or WWP derives from the presence of two signature tryptophan residues that are spaced 20-23 amino acids apart and are present in most WW domains known to date, as well as that of a conserved Pro. The WW domain binds to proteins with particular proline motifs, [AP]-P-P-[AP]-Y, and/or phosphoserine- or phosphothreonine-containing motifs. It is frequently associated with other domains typical of proteins in signal transduction processes.
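The two sequence features described above can be illustrated with a minimal Python sketch. It assumes that "spaced 20-23 amino acids apart" means 20 to 23 intervening residues; the function names and the toy sequences are hypothetical and are not taken from any real protein or library.

```python
import re

# Ligand motif from the text, [AP]-P-P-[AP]-Y, as a regular expression.
LIGAND_MOTIF = re.compile(r"[AP]PP[AP]Y")


def find_ligand_motifs(sequence):
    """Return (start, matched_text) for every proline-rich ligand motif."""
    return [(m.start(), m.group()) for m in LIGAND_MOTIF.finditer(sequence)]


def find_tryptophan_pairs(sequence, min_gap=20, max_gap=23):
    """Return (i, j) index pairs of Trp residues with min_gap..max_gap
    residues between them (assumed reading of "20-23 amino acids apart")."""
    positions = [i for i, aa in enumerate(sequence) if aa == "W"]
    return [
        (i, j)
        for i in positions
        for j in positions
        if min_gap <= j - i - 1 <= max_gap
    ]


# Toy sequences, invented purely for illustration.
toy_ligand = "GGPPPPYGG"
toy_domain = "W" + "A" * 21 + "W"

ligand_hits = find_ligand_motifs(toy_ligand)
trp_pairs = find_tryptophan_pairs(toy_domain)
```

A pair of suitably spaced tryptophans is only a crude hint; it does not by itself establish that a region folds as a WW domain.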
A large variety of proteins containing the WW domain are known. These include: dystrophin, a multidomain cytoskeletal protein; utrophin, a dystrophin-like protein of unknown function; vertebrate YAP protein, substrate of an unknown serine kinase; Mus musculus (Mouse) NEDD-4, involved in the embryonic development and differentiation of the central nervous system; Saccharomyces cerevisiae (Baker's yeast) RSP5, similar to NEDD-4 in its molecular organization; Rattus norvegicus (Rat) FE65, a transcription-factor activator expressed preferentially in liver; Nicotiana tabacum (Common tobacco) DB10 protein; and others.
This domain is responsible for the 3'-5' exonuclease proofreading activity of Escherichia coli DNA polymerase I (polI) and other enzymes; it catalyses the hydrolysis of unpaired or mismatched nucleotides. This domain consists of the amino-terminal half of the Klenow fragment in E. coli polI. It is also found in the Werner syndrome helicase (WRN), focus forming activity 1 protein (FFA-1) and ribonuclease D (RNase D).
The N-terminal and internal 5'-3' exonuclease domains are commonly found together, and are most often associated with 5' to 3' nuclease activities. The XPG protein signatures are never found outside the '53EXO' domains. The latter are found in more diverse proteins. The number of amino acids that separate the two 53EXO domains, and the presence of accompanying motifs, allow the diagnosis of several protein families.
In the eubacterial type A DNA-polymerases, the N-terminal and internal domains are separated by a few amino acids, usually four. The pattern DNA_POLYMERASE_A is always present towards the C-terminus. Several eukaryotic structure-dependent endonucleases and exonucleases have the 53EXO domains separated by 24 to 27 amino acids, and the XPG protein signatures are always present. In several proteins from Herpesviridae, the two 53EXO domains are separated by 50 to 120 amino acids. These proteins are implicated in the inhibition of the expression of the host genes. Eukaryotic DNA repair proteins with 600 to 700 amino acids between the 53EXO domains all carry the XPG protein signatures.
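The spacer-length rules above can be summarised as a small lookup. The following is a rough Python sketch rather than a validated classifier: the function name and labels are hypothetical, the upper bound of 10 residues for "a few amino acids" is an assumption, and the accompanying signatures (DNA_POLYMERASE_A, XPG) that the text also relies on are deliberately ignored.

```python
# (min_spacer, max_spacer) -> family label, following the ranges in the text.
SPACER_RULES = [
    # "a few amino acids, usually four": the upper bound of 10 is an assumption.
    ((1, 10), "eubacterial type A DNA polymerase"),
    ((24, 27), "eukaryotic structure-dependent endo/exonuclease"),
    ((50, 120), "Herpesviridae host-expression inhibitor"),
    ((600, 700), "eukaryotic DNA repair protein"),
]


def classify_by_spacer(spacer_length):
    """Suggest a protein family from the number of residues separating
    the N-terminal and internal 53EXO domains."""
    for (low, high), label in SPACER_RULES:
        if low <= spacer_length <= high:
            return label
    return "unclassified"


# Example spacer lengths chosen to fall inside and outside the stated ranges.
examples = {n: classify_by_spacer(n) for n in (4, 25, 80, 650, 300)}
```

Spacer lengths that fall between the stated ranges return "unclassified", since the text gives no rule for them.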
Endonuclease III is a DNA repair enzyme which removes a number of damaged pyrimidines from DNA via its glycosylase activity and also cleaves the phosphodiester backbone at apurinic/apyrimidinic sites via a beta-elimination mechanism. The structurally related DNA glycosylase MutY recognises and excises the mutational intermediate 8-oxoguanine-adenine mispair. The 3-D structures of Escherichia coli endonuclease III and catalytic domain of MutY have been determined. The structures contain two all-alpha domains: a sequence-continuous, six-helix domain (residues 22-132) and a Greek-key, four-helix domain formed by one N-terminal and three C-terminal helices (residues 1-21 and 133-211) together with the [Fe4S4] cluster. The cluster is bound entirely within the C-terminal loop by four cysteine residues with a ligation pattern Cys-(Xaa)6-Cys-(Xaa)2-Cys-(Xaa)5-Cys which is distinct from all other known Fe4S4 proteins. This structural motif is referred to as a [Fe4S4] cluster loop (FCL). Two DNA-binding motifs have been proposed, one at either end of the interdomain groove: the helix-hairpin-helix (HhH) and FCL motifs. The primary role of the iron-sulphur cluster appears to involve positioning conserved basic residues for interaction with the DNA phosphate backbone by forming the loop of the FCL motif.
The HhH-GPD domain gets its name from its hallmark helix-hairpin-helix and Gly/Pro rich loop followed by a conserved aspartate. This domain is found in a diverse range of structurally related DNA repair proteins that include endonuclease III and DNA glycosylase MutY, an A/G-specific adenine glycosylase. Both of these enzymes have a C-terminal iron-sulphur cluster loop (FCL). The methyl-CpG-binding protein (MBD4) also contains a related domain that is a thymine DNA glycosylase. The family also includes DNA-3-methyladenine glycosylase II, 8-oxoguanine DNA glycosylases and other members of the AlkA family.
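The cysteine ligation pattern Cys-(Xaa)6-Cys-(Xaa)2-Cys-(Xaa)5-Cys translates directly into a regular expression. The sketch below is illustrative only: the function name is hypothetical, Xaa is taken to mean any residue, and the toy sequences are invented rather than drawn from endonuclease III or MutY.

```python
import re

# Cys-(Xaa)6-Cys-(Xaa)2-Cys-(Xaa)5-Cys, with Xaa assumed to be any residue.
FCL_PATTERN = re.compile(r"C.{6}C.{2}C.{5}C")


def find_fcl_candidates(sequence):
    """Return (start, matched_text) for each stretch with FCL-like Cys spacing."""
    return [(m.start(), m.group()) for m in FCL_PATTERN.finditer(sequence)]


# Toy sequences, invented for illustration.
toy_with_fcl = "MK" + "C" + "A" * 6 + "C" + "GG" + "C" + "S" * 5 + "C" + "LE"
toy_wrong_spacing = "MK" + "C" + "A" * 5 + "C" + "GG" + "C" + "S" * 5 + "C" + "LE"

fcl_hits = find_fcl_candidates(toy_with_fcl)
no_hits = find_fcl_candidates(toy_wrong_spacing)
```

A match reports only that the four cysteines have the stated spacing; whether they actually ligate an [Fe4S4] cluster cannot be inferred from the sequence alone.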
DNA-directed DNA polymerases are the key enzymes catalysing the accurate replication of DNA. They require either a small RNA molecule or a protein as a primer for the de novo synthesis of a DNA chain. A number of polymerases belong to this family.
Xeroderma pigmentosum (XP) is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. The skin cells of people with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-G is one of the rarest and most phenotypically heterogeneous forms of XP, showing anything from slight to extreme dysfunction in DNA excision repair. XP-G can be corrected by a 133 kDa nuclear protein, XPGC. XPGC is an acidic protein that confers normal UV resistance in expressing cells. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms. XPGC cleaves one strand of the duplex at the border with the single-stranded region.
XPG belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker's yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5'-3' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1, and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C-terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]). It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved.
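The conserved pentapeptide is written in a PROSITE-like notation, E-A-[DE]-A-[QS]. As a hedged illustration, the Python sketch below converts this simple subset of the notation (hyphen-separated residues, bracketed alternatives, and `x` for any residue) into a regular expression. The helper name is hypothetical, repeat counts and exclusion classes are not handled, and the test sequences are invented.

```python
import re


def pattern_to_regex(pattern):
    """Convert a simple hyphen-separated motif such as 'E-A-[DE]-A-[QS]'
    into a compiled regular expression. Only single residues, bracketed
    alternatives and 'x' (any residue) are supported in this sketch."""
    parts = []
    for element in pattern.split("-"):
        parts.append("." if element.lower() == "x" else element)
    return re.compile("".join(parts))


PENTAPEPTIDE = pattern_to_regex("E-A-[DE]-A-[QS]")

# Invented toy sequences: the first contains the motif, the second does not.
hit = PENTAPEPTIDE.search("GGEADAQKK")
miss = PENTAPEPTIDE.search("GGEAKAQKK")
```

Here `hit.start()` gives the zero-based position of the motif in the toy sequence, and `miss` is `None` because lysine is not permitted at the third position.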
Xeroderma pigmentosum (XP) is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. The skin cells of people with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-G is one of the rarest and most phenotypically heterogeneous forms of XP, showing anything from slight to extreme dysfunction in DNA excision repair. XP-G can be corrected by a 133 kDa nuclear protein, XPGC. XPGC is an acidic protein that confers normal UV resistance in expressing cells. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms. XPGC cleaves one strand of the duplex at the border with the single-stranded region.
XPG belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker's yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5'-3' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1, and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C-terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]). It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved.
This entry represents the N terminal of XPG.
DNA is the biological information that instructs cells how to exist in an ordered fashion: accurate replication is thus one of the most important events in the life cycle of a cell. This function is performed by DNA-directed DNA polymerases, which add deoxynucleotide triphosphate (dNTP) residues to the 3'-end of the growing chain of DNA, using a complementary DNA chain as a template. Small RNA molecules are generally used as primers for chain elongation, although terminal proteins may also be used for the de novo synthesis of a DNA chain. Even though there are 2 different methods of priming, these are mediated by 2 very similar polymerase classes, A and B, with similar methods of chain elongation.
A number of DNA polymerases have been grouped under the designation of DNA polymerase family B. Six regions of similarity (numbered from I to VI) are found in all or a subset of the B family polymerases. The most conserved region (I) includes a conserved tetrapeptide with two aspartate residues. Its function is not yet known. However, it has been suggested that it may be involved in binding a magnesium ion. All sequences in the B family contain a characteristic DTDS motif, and possess many functional domains, including a 5'-3' elongation domain, a 3'-5' exonuclease domain, a DNA binding domain, and binding domains for both dNTPs and pyrophosphate.
This entry is found in DEAD and DEAH box helicases. Helicases are involved in unwinding nucleic acids. The DEAD box helicases are involved in various aspects of RNA metabolism, including nuclear transcription, pre-mRNA splicing, ribosome biogenesis, nucleocytoplasmic transport, translation, RNA decay and organellar gene expression.
This domain of unknown function is found in the Xeroderma pigmentosum group D (XPD) proteins which belong to a family of ATP-dependent helicases characterised by a 'D-E-A-H' motif. This resembles the 'D-E-A-D-box' of other known helicases, which represents a special version of the B motif of ATP-binding proteins. In XPD, His replaces the second Asp. The DEAD box helicases are involved in various aspects of RNA metabolism, including nuclear transcription, pre-mRNA splicing, ribosome biogenesis, nucleocytoplasmic transport, translation, RNA decay and organellar gene expression.
The domain that defines this group of proteins is found in a wide variety of helicases and helicase-related proteins. It may be that this is not an autonomously folding unit, but an integral part of the helicase.
The eukaryotic translation initiation factor 4A (eIF4A) is a member of the DEA(D/H)-box RNA helicase family. This is a diverse group of proteins that couples an ATPase activity to RNA binding and unwinding. The structure of the carboxyl-terminal domain of eIF4A has been determined to 1.75 Å resolution; it has a parallel alpha-beta topology that superimposes, with minor variations, on the structures and conserved motifs of the equivalent domain in other, distantly related helicases.
This domain of unknown function is found at the C-terminal of some ATP-dependent helicases characterised by a 'D-E-A-H' motif. This resembles the 'D-E-A-D-box' of other known helicases, a special version of the B motif of ATP-binding proteins; however, His replaces the second Asp. The DEAD box helicases are involved in various aspects of RNA metabolism, including nuclear transcription, pre-mRNA splicing, ribosome biogenesis, nucleocytoplasmic transport, translation, RNA decay and organellar gene expression.
The toprim (topoisomerase-primase) domain is a conserved region from DnaG primases, topoisomerases, OLD family nucleases and RecR/M DNA repair proteins. The fold of the TOPRIM domain resembles a Rossmann-like nucleotide binding fold, with a central beta-sheet formed by 4 parallel beta-strands flanked by 3 alpha-helices. Only 5 residues are conserved across all TOPRIM domains: 2 of these are glycines, which may play a structural role, and the other 3 are acidic residues that are present in 2 conserved sequence motifs. These may have a metal binding function.
The TOPRIM domain may form a shallow groove on these molecules and play a role in the binding of double-helical DNA/RNA hybrids.
Formin homology (FH) proteins play a crucial role in the reorganisation of the actin cytoskeleton, which mediates various functions of the cell cortex including motility, adhesion, and cytokinesis. Formins are multidomain proteins that interact with diverse signalling molecules and cytoskeletal proteins, although some formins have been assigned functions within the nucleus. Formins are characterised by the presence of three FH domains (FH1, FH2 and FH3), although members of the formin family do not necessarily contain all three domains. The proline-rich FH1 domain mediates interactions with a variety of proteins, including the actin-binding protein profilin, SH3 (Src homology 3) domain proteins, and WW domain proteins. The FH2 domain is required for the self-association of formin proteins through the ability of FH2 domains to directly bind each other, and may also act to inhibit actin polymerisation. The FH3 domain is less well conserved and may be important for determining intracellular localisation of formin family proteins. In addition, some formins can contain a GTPase-binding domain (GBD) required for binding to Rho small GTPases, and a C-terminal conserved Dia-autoregulatory domain (DAD).
This entry represents the FH2 domain, which was shown by X-ray crystallography to have an elongated, crescent shape containing three helical subdomains. At its C terminus is the DRF autoregulatory region.
Syntaxins A and B are nervous system-specific proteins implicated in the docking of synaptic vesicles with the presynaptic plasma membrane. Syntaxins are a family of receptors for intracellular transport vesicles. Each target membrane may be identified by a specific member of the syntaxin family. Members of the syntaxin family have a size ranging from 30 kDa to 40 kDa; a C-terminal extremity that is highly hydrophobic and anchors the protein on the cytoplasmic surface of cellular membranes; and a central, well-conserved region that seems to be in a coiled-coil conformation.
Quality control of intracellular proteins is essential for cellular homeostasis. Molecular chaperones recognise and contribute to the refolding of misfolded or unfolded proteins, whereas the ubiquitin-proteasome system mediates the degradation of such abnormal proteins. Ubiquitin-protein ligases (E3s) determine the substrate specificity for ubiquitylation and have been classified into HECT and RING-finger families. More recently, however, U-box proteins, which contain a domain (the U box) of about 70 amino acids that is conserved from yeast to humans, have been identified as a new type of E3.
Members of the U-box family of proteins constitute a class of ubiquitin-protein ligases (E3s) distinct from the HECT-type and RING finger-containing E3 families. Using yeast two-hybrid technology, all mammalian U-box proteins have been reported to interact with molecular chaperones or co-chaperones, including Hsp90, Hsp70, DnaJc7, EKN1, CRN, and VCP. This suggests that the function of U box-type E3s is to mediate the degradation of unfolded or misfolded proteins in conjunction with molecular chaperones as receptors that recognise such abnormal proteins.
Unlike the RING finger domain that is stabilised by Zn2+ ions coordinated by the cysteines and a histidine, the U-box scaffold is probably stabilised by a system of salt-bridges and hydrogen bonds. The charged and polar residues that participate in this network of bonds are more strongly conserved in the U-box proteins than in classic RING fingers, which supports their role in maintaining the stability of the U box. Thus, the U box appears to have evolved from a RING finger domain by appropriation of a new set of residues required to stabilise its structure, concomitant with the loss of the original, metal-chelating residues.
This region is found in a number of histone lysine methyltransferases (HMTase), C-terminal to the SET domain; it is generally described as the post-SET domain.
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone target and the roles of the invariant residues in the SET domain in determining the methylation specificities.
The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by 4 cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils.
The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity.
SKP1 (together with SKP2) was identified as an essential component of the cyclin A-CDK2 S phase kinase complex. It was found to bind several F-box containing proteins (e.g., Cdc4, Skp2, cyclin F) and to be involved in the ubiquitin protein degradation pathway. A yeast homologue of SKP1 (P52286) was identified in the centromere bound kinetochore complex and is also involved in the ubiquitin pathway. In Dictyostelium discoideum (Slime mold) FP21 was shown to be glycosylated in the cytosol and has homology to SKP1.
The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA binding domain found in diverse nuclear proteins involved in chromosomal organization, including in apoptosis. In yeast, SAP is found in the most distal N-terminal region of E3 SUMO-protein ligase SIZ1, where it is involved in nuclear localization.
Other members of the family are transfer proteins that include guanine nucleotide exchange factor that may function as an effector of RAC1, phosphatidylinositol/phosphatidylcholine transfer protein that is required for the transport of secretory proteins from the Golgi complex, and alpha-tocopherol transfer protein that enhances the transfer of the ligand between separate membranes.
The polyadenylate-binding protein (PABP) has a conserved C-terminal domain (PABC), which is also found in the hyperplastic discs protein (HYD) family of ubiquitin ligases that contain HECT domains. PABP recognises the 3' mRNA poly(A) tail and plays an essential role in eukaryotic translation initiation and mRNA stabilisation/degradation. PABC domains of PABP are peptide-binding domains that mediate PABP homo-oligomerisation and protein-protein interactions. In mammals, the PABC domain of PABP functions to recruit several different translation factors to the mRNA poly(A) tail.
DNA damaging agents such as the anti-tumour drugs bleomycin and neocarzinostatin or those that generate oxygen radicals produce a variety of lesions in DNA. Amongst these is base-loss which forms apurinic/apyrimidinic (AP) sites or strand breaks with atypical 3' termini. DNA repair at the AP sites is initiated by specific endonuclease cleavage of the phosphodiester backbone. Such endonucleases are also generally capable of removing blocking groups from the 3' terminus of DNA strand breaks.
AP endonucleases can be classified into two families based on sequence similarity. Family 2 groups the enzymes listed below.
Escherichia coli endonuclease IV and its S. cerevisiae homologue Apn1 have been shown to be transition metalloproteins that require zinc and manganese for activity.
Endonuclease III is a DNA repair enzyme which removes a number of damaged pyrimidines from DNA via its glycosylase activity and also cleaves the phosphodiester backbone at apurinic / apyrimidinic sites via a beta-elimination mechanism. The structurally related DNA glycosylase MutY recognises and excises the mutational intermediate 8-oxoguanine-adenine mispair. The 3-D structures of Escherichia coli endonuclease III and catalytic domain of MutY have been determined. The structures contain two all-alpha domains: a sequence-continuous, six-helix domain (residues 22-132) and a Greek-key, four-helix domain formed by one N-terminal and three C-terminal helices (residues 1-21 and 133-211) together with the [Fe4S4] cluster. The cluster is bound entirely within the C-terminal loop by four cysteine residues with a ligation pattern Cys-(Xaa)6-Cys-(Xaa)2-Cys-(Xaa)5-Cys which is distinct from all other known Fe4S4 proteins. This structural motif is referred to as a [Fe4S4] cluster loop (FCL). Two DNA-binding motifs have been proposed, one at either end of the interdomain groove: the helix-hairpin-helix (HhH) and FCL motifs. The primary role of the iron-sulphur cluster appears to involve positioning conserved basic residues for interaction with the DNA phosphate backbone by forming the loop of the FCL motif.
The iron-sulphur cluster loop (FCL) is also found in DNA-(apurinic or apyrimidinic site) lyase, a subfamily of endonuclease III. The enzyme has both apurinic and apyrimidinic endonuclease activity and a DNA N-glycosylase activity. It cuts damaged DNA at cytosines, thymines and guanines, and acts on the damaged strand 5' of the damaged site. The enzyme binds a 4Fe-4S cluster which is not important for the catalytic activity, but is probably involved in the alignment of the enzyme along the DNA strand.
This is a large family of DNA-binding helix-turn-helix proteins that includes a bacterial plasmid copy control protein, bacterial methylases, various bacteriophage transcription control proteins and a vegetative-specific protein from Dictyostelium discoideum (Slime mould).
Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.
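To make the notion of a mispaired base concrete, the sketch below scans an aligned duplex for positions that are not Watson-Crick pairs. It is an illustration under stated assumptions, not a model of MutS: the function name is hypothetical, and the second strand is assumed to be written 3'-to-5' so that position i of one string pairs with position i of the other.

```python
# Canonical Watson-Crick partners for DNA bases.
WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_mismatches(top, bottom):
    """Return (1-based position, top base, bottom base) for each mispair.

    `top` is read 5'->3'; `bottom` is assumed to be written 3'->5' so the
    two strings are aligned position by position.
    """
    if len(top) != len(bottom):
        raise ValueError("strands must have the same length")
    pairs = zip(top.upper(), bottom.upper())
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(pairs)
        if WATSON_CRICK.get(a) != b
    ]


# Toy duplex with a single G:T mispair at position 5.
mispairs = find_mismatches("ATGCGT", "TACGTA")
```

Insertion/deletion loops, which the text also mentions, are not covered by this position-by-position comparison.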
MutS is a modular protein with a complex structure, and is composed of:
Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.
This entry represents the core domain (domain 3) found in proteins of the MutS family. The core domain of MutS adopts a multi-helical structure comprised of two subdomains, which are interrupted by the clamp domain. Two of the helices in the core domain comprise the levers that extend towards the DNA.
Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.
MutS is a modular protein with a complex structure, and is composed of:
Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.
This entry represents the C-terminal region found in proteins in the MutS family of DNA mismatch repair proteins. The C-terminal region of MutS comprises the ATPase domain and the HTH (helix-turn-helix) domain, the latter being involved in dimer contacts. Yeast MSH3, bacterial proteins involved in DNA mismatch repair, and the predicted protein product of the Rep-3 gene of mouse share extensive sequence similarity. Human MSH has been implicated in hereditary non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein.
This entry represents an MIF4G-like domain. MIF4G domains share a common structure but can differ in sequence. This entry is designated "type 3", and is found in nuclear cap-binding proteins, eIF4G, and UPF2.
The MIF4G domain is a structural motif with an ARM (Armadillo) repeat-type fold, consisting of a 2-layer alpha/alpha right-handed superhelix. Proteins usually contain two or more structurally similar MIF4G domains connected by unstructured linkers. MIF4G domains are found in several proteins involved in RNA metabolism, including eIF4G (eukaryotic initiation factor 4-gamma), eIF-2b (translation initiation factor), UPF2 (regulator of nonsense transcripts 2), and nuclear cap-binding proteins (CBP80, CBC1, NCBP1), although the sequence identity between them may be low.
The nuclear cap-binding complex (CBC) is a heterodimer. Human CBC consists of a large CBP80 subunit and a small CBP20 subunit, the latter being critical for cap binding. CBP80 contains three MIF4G domains connected with long linkers, while CBP20 has an RNP (ribonucleoprotein)-type domain that associates with domains 2 and 3 of CBP80. The complex binds to 5'-cap of eukaryotic RNA polymerase II transcripts, such as mRNA and U snRNA. The binding is important for several mRNA nuclear maturation steps and for nonsense-mediated decay. It is also essential for nuclear export of U snRNAs in metazoans.
Eukaryotic translation initiation factor 4 gamma (eIF4G) plays a critical role in protein expression, and is at the centre of a complex regulatory network. Together with the cap-binding protein eIF4E, it recruits the small ribosomal subunit to the 5'-end of mRNA and promotes the assembly of a functional translation initiation complex, which scans along the mRNA to the translation start codon. The activity of eIF4G in translation initiation could be regulated through intra- and inter-protein interactions involving the ARM repeats. In eIF4G, the MIF4G domain binds eIF4A, eIF3, RNA and DNA.
Nonsense-mediated mRNA decay (NMD) in eukaryotes involves UPF1, UPF2 and UPF3 to accelerate the decay rate of two unique classes of transcripts: (1) nonsense mRNAs that arise through errors in gene expression, and (2) naturally occurring transcripts that lack coding errors but have built-in features that target them for accelerated decay (error-free mRNAs). NMD can trigger decay during any round of translation and can target CBC-bound or eIF-4E-bound transcripts. UPF2 contains MIF4G domains, while UPF3 contains an RNP domain.
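The first class of NMD targets, nonsense mRNAs, carries an in-frame stop codon ahead of the normal end of the coding sequence. The sketch below flags such a premature stop in an open reading frame. The function names are hypothetical, the input is assumed to start at the start codon and be in frame, and the code does not model how UPF1, UPF2 and UPF3 recognise these transcripts.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def first_stop_codon_index(orf):
    """Return the 0-based codon index of the first in-frame stop, or None."""
    rna = orf.upper().replace("T", "U")  # accept DNA or RNA spelling
    for i in range(0, len(rna) - 2, 3):
        if rna[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


def has_premature_stop(orf):
    """True if an in-frame stop codon occurs before the final codon."""
    idx = first_stop_codon_index(orf)
    n_codons = len(orf) // 3
    return idx is not None and idx < n_codons - 1


normal = "AUGGCUUGGUAA"    # AUG GCU UGG UAA: stop only at the end
nonsense = "AUGGCUUGAUAA"  # AUG GCU UGA UAA: UGG -> UGA nonsense change
```

The second class described above, error-free transcripts with built-in decay features, would not be detected by a check of this kind.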
This entry represents the MI domain (named after MA-3 and eIF4G), a protein-protein interaction module of ~130 amino acids. It appears in several translation factors and is found in:
The MI domain consists of seven alpha-helices, which pack into a globular form. The packing arrangement consists of repeating pairs of antiparallel helices packed one upon the other such that a superhelical axis is generated perpendicular to the alpha-helical axes.
The MI domain has also been named MA3 domain.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents the zinc finger domain found in RanBP2 proteins. Ran is an evolutionarily conserved member of the Ras superfamily that regulates all receptor-mediated transport between the nucleus and the cytoplasm. Ran binding protein 2 (RanBP2) is a 358-kDa nucleoporin located on the cytoplasmic side of the nuclear pore complex which plays a role in nuclear protein import. RanBP2 contains multiple zinc fingers which mediate binding to RanGDP.
The SEP domain is named after Saccharomyces cerevisiae Shp1, Drosophila melanogaster eyes closed gene (eyc), and vertebrate p47. In p47, the SEP domain has been shown to bind to and inhibit the cysteine protease cathepsin L. Most SEP domains are succeeded closely by a UBX domain.
This domain has a 2-layer beta(3)-alpha(2)-beta fold, and is present in a number of other proteins as well, including FAF1 (Fas-associated factor 1) and undulin 2. Many of these proteins also contain the UBX domain C-terminal to the FAF domain. This domain is found in many eukaryotic proteins.
The FAS1 (fasciclin-like) domain is an extracellular module of about 140 amino acid residues. It has been suggested that the FAS1 domain represents an ancient cell adhesion domain common to plants and animals; related FAS1 domains are also found in bacteria.
The crystal structure of FAS1 domains 3 and 4 of fasciclin I from Drosophila melanogaster (Fruit fly) has been determined, revealing a novel domain fold consisting of a seven-stranded beta wedge and at least five alpha helices; two well-ordered N-acetylglucosamine groups attached to a conserved asparagine are located in the interface region between the two FAS1 domains. Fasciclin I is an insect neural cell adhesion molecule involved in axonal guidance that is attached to the membrane by a GPI anchor.
FAS1 domains are present in many secreted and membrane-anchored proteins. These proteins are usually GPI anchored and consist of: (i) a single FAS1 domain, (ii) a tandem array of FAS1 domains, or (iii) FAS1 domain(s) interspersed with other domains.
Proteins known to contain a FAS1 domain include:
The FAS1 domains of both human periostin and BIgH3 proteins were found to contain vitamin K-dependent gamma-carboxyglutamate residues. Gamma-carboxyglutamate residues are more commonly associated with GLA domains, where they occur through post-translational modification catalysed by the vitamin K-dependent enzyme gamma-glutamylcarboxylase.
This entry contains:
Nucleoside diphosphate kinases (NDK) are enzymes required for the synthesis of nucleoside triphosphates (NTP) other than ATP. They provide NTPs for nucleic acid synthesis, CTP for lipid synthesis, UTP for polysaccharide synthesis and GTP for protein elongation, signal transduction and microtubule polymerization.
In eukaryotes, there seems to be a small family of NDK isozymes each of which acts in a different subcellular compartment and/or has a distinct biological function. Eukaryotic NDK isozymes are hexamers of two highly related chains (A and B). By random association (A6, A5B...AB5, B6), these two kinds of chain form isoenzymes differing in their isoelectric point.
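The series of hexamers named above (A6, A5B ... AB5, B6) can be enumerated directly. The sketch below lists the seven possible subunit compositions and, purely as an illustrative assumption not stated in the source, estimates their relative abundance with a binomial model in which each of the six positions is filled independently with chain B at probability `p_b`. All function names are hypothetical.

```python
from math import comb


def hexamer_compositions(n_subunits=6):
    """Return (number of A chains, number of B chains) for every composition."""
    return [(n_subunits - b, b) for b in range(n_subunits + 1)]


def label(a, b):
    """Format a composition as in the text, e.g. (5, 1) -> 'A5B'."""
    def part(chain, n):
        if n == 0:
            return ""
        return chain if n == 1 else f"{chain}{n}"
    return part("A", a) + part("B", b)


def binomial_fraction(b, n_subunits=6, p_b=0.5):
    """Assumed fraction of hexamers with exactly b B chains (binomial model)."""
    return comb(n_subunits, b) * p_b**b * (1 - p_b) ** (n_subunits - b)


labels = [label(a, b) for a, b in hexamer_compositions()]
fractions = [binomial_fraction(b) for _, b in hexamer_compositions()]
```

Because each composition carries a different mix of A and B chains, each would be expected to have its own isoelectric point, which is the property the text uses to distinguish the isoenzymes.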
NDK are proteins of 17 kDa that act via a ping-pong mechanism in which a histidine residue is phosphorylated by transfer of the terminal phosphate group from ATP. In the presence of magnesium, the phosphoenzyme can transfer its phosphate group to any NDP, to produce an NTP.
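The ping-pong mechanism described above can be written as two half-reactions. This is a schematic only: E-His denotes the enzyme with its catalytic histidine and E-His~P the phosphohistidine intermediate, both of which are notation introduced here for illustration.

```latex
\begin{align}
\mathrm{ATP} + \mathrm{E\text{-}His} &\rightleftharpoons \mathrm{ADP} + \mathrm{E\text{-}His{\sim}P} \\
\mathrm{E\text{-}His{\sim}P} + \mathrm{NDP} &\rightleftharpoons \mathrm{E\text{-}His} + \mathrm{NTP}
\end{align}
```

Summing the two steps gives the net reaction ATP + NDP = ADP + NTP, with the enzyme regenerated unchanged.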
NDK isozymes have been sequenced from prokaryotic and eukaryotic sources. It has also been shown that the Drosophila awd (abnormal wing discs) protein is a microtubule-associated NDK. Mammalian NDK is also known as metastasis inhibition factor nm23. The sequence of NDK has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. Our signature pattern contains this residue.
This family contains acyltransferases involved in phospholipid biosynthesis and other proteins of unknown function. This domain is found in tafazzins, defects in which are the cause of Barth syndrome, a severe inherited disorder which is often fatal in childhood and is characterised by cardiac and skeletal abnormalities. Phospholipid/glycerol acyltransferase is not found in the viruses or the archaea and is under-represented in the bacteria. Bacterial glycerol-phosphate acyltransferases are involved in membrane biogenesis since they use fatty acid chains to form the first membrane phospholipids.
These proteins contain a short bi-helical repeat that is related to HEAT. Cyanobacteria and red algae harvest light energy using macromolecular complexes known as phycobilisomes (PBS), peripherally attached to the photosynthetic membrane. The major components of PBS are the phycobiliproteins. These heterodimeric proteins are covalently attached to phycobilins: open-chain tetrapyrrole chromophores, which function as the photosynthetic light-harvesting pigments. Phycobiliproteins differ in sequence and in the nature and number of phycobilins attached to each of their subunits. These proteins include the lyase enzymes that specifically attach particular phycobilins to apophycobiliprotein subunits. The most comprehensively studied of these is the CpcE/F lyase, which attaches phycocyanobilin (PCB) to the alpha subunit of apophycocyanin. Similarly, MpeU/V attaches phycoerythrobilin to phycoerythrin II, while CpeY/Z is thought to be involved in phycoerythrobilin (PEB) attachment to phycoerythrin (PE) I (PEs I and II differ in sequence and in the number of attached molecules of PEB: PE I has five, PE II has six).
All the reactions of the above lyases involve an apoprotein cysteine SH addition to a terminal delta 3,3'-double bond. Such a reaction is not possible in the case of phycoviolobilin (PVB), the phycobilin of alpha-phycoerythrocyanin (alpha-PEC). It is thought that in this case, PCB, not PVB, is first added to apo-alpha-PEC, and is then isomerized to PVB. The addition reaction has been shown to occur in the presence of either of the components of alpha-PEC-PVB lyase PecE or PecF (or both). The isomerisation reaction occurs only when both PecE and PecF components are present, i.e. the PecE/F phycobiliprotein lyase is also a phycobilin isomerase. Another member of this family is the NblB protein, whose similarity to the phycobiliprotein lyases was previously noted. This constitutively expressed protein is not known to have any lyase activity. It is thought to be involved in the coordination of PBS degradation with environmental nutrient limitation. It has been suggested that the similarity of NblB to the phycobiliprotein lyases is due to the ability to bind tetrapyrrole phycobilins via the common repeated motif.
The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. It is normally about 70 amino acids in length. It is thought to be an intracellular protein-binding or lipid-binding signalling domain, which has an important function in membrane-associated processes. Mutations in the GRAM domain of myotubularins cause a muscle disease, which suggests that the domain is essential for the full function of the enzyme. Myotubularin-related proteins are a large subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids.
This domain of unknown function is found in FBox and BRCT domain containing plant proteins.
PUG is a domain in protein kinases, N-glycanases and other nuclear proteins found in eukaryotes.
PSP is a proline-rich domain of unknown function found in spliceosome associated proteins.
RPR is a domain of unknown function present in proteins which are involved in regulation of nuclear pre-mRNA.
TLDc is a domain of unknown function, restricted to eukaryotes, and commonly found in TBC and LysM domain containing proteins.
The RWD domain is a eukaryotic domain found in RING finger- and WD repeat-containing proteins and in a DEXDc-like helicase subfamily; it is related to the ubiquitin-conjugating enzyme domain.
Yeast Vps10p is a receptor for sorting and transport of the soluble vacuolar hydrolase carboxypeptidase Y to the lysosome-like vacuole. In mammalian cells, proteins containing this domain are involved in the transport of lipoproteins and sorting of endosomal proteins. They may also act as receptors for some neuropeptides.
The N terminus of murine brain SorCS contains two putative cleavage sites for the convertase furin which mark the beginning of the VPS10 domain, which is followed by a module of imperfect leucine-rich repeats and a transmembrane domain. The short intracellular C-terminus contains consensus signals for rapid internalization. The identified putative binding motifs for SH2 and SH3 domains are unique in the family of VPS10 domain receptors. SorCS is predominantly expressed in brain, but also in heart, liver, and kidney. SorCS transcripts detected by in situ hybridization in the murine central nervous system point to a neuronal expression.
This domain (also known as the Brl domain) was named after the yeast Sec63 (or NPL1) protein in which it was found. This protein is required for assembly of functional endoplasmic reticulum translocons. Other yeast proteins containing this domain include pre-mRNA splicing helicase BRR2, HFM1 protein and putative helicases.
Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified. This sequence motif represents one beta-sheet blade, and several of these repeats can associate to form a beta-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein, creating a 6-bladed beta-propeller. The motif is also found in mouse protein MIPP and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin, and in galactose oxidase from the fungus Dactylium dendroides. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded anti-parallel beta-sheet motif that forms the repeat unit in a super-barrel structural fold.
The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase.
This entry represents a type of kelch sequence motif that comprises one beta-sheet blade.
O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in 'clans'.
Members of this family belong to the chitinase class II group, which includes chitinase, chitodextrinase and the killer toxin of Kluyveromyces lactis (Yeast) (Candida sphaerica), and all belong to glycoside hydrolase family 18. The chitinases hydrolyse chitin oligosaccharides.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.
This group of proteins belongs to the peptidase family C1, sub-family C1A (papain family, clan CA). It includes proteins classed as non-peptidase homologs. These have either been shown experimentally to lack peptidase activity or lack one or more of the active site residues.
The papain family has a wide variety of activities, including broad-range (papain) and narrow-range endo-peptidases, aminopeptidases, dipeptidyl peptidases and enzymes with both exo- and endo-peptidase activity. Members of the papain family are widespread, found in baculovirus, eubacteria, yeast, and practically all protozoa, plants and mammals. The proteins are typically lysosomal or secreted, and proteolytic cleavage of the propeptide is required for enzyme activation, although bleomycin hydrolase is cytosolic in fungi and mammals. Papain-like cysteine proteinases are essentially synthesised as inactive proenzymes (zymogens) with N-terminal propeptide regions. The activation process of these enzymes includes the removal of propeptide regions. The propeptide regions serve a variety of functions in vivo and in vitro. The pro-region is required for the proper folding of the newly synthesised enzyme, the inactivation of the peptidase domain and stabilisation of the enzyme against denaturing at neutral to alkaline pH conditions. Amino acid residues within the pro-region mediate their membrane association, and play a role in the transport of the proenzyme to lysosomes. Among the most notable features of propeptides is their ability to inhibit the activity of their cognate enzymes and that certain propeptides exhibit high selectivity for inhibition of the peptidases from which they originate.
The catalytic residues of papain are Cys-25 and His-159, other important residues being Gln-19, which helps form the 'oxyanion hole', and Asn-175, which orientates the imidazole ring of His-159.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents a cysteine-rich (C6HC) zinc finger domain that is present in Triad1, and which is conserved in other proteins encoded by various eukaryotes. The C6HC consensus pattern is:
The C6HC zinc finger motif is the fourth family member of the zinc-binding RING, LIM, and LAP/PHD fingers. Strikingly, in most of the proteins the C6HC domain is flanked by two RING finger structures. The novel C6HC motif has been called DRIL (double RING finger linked). The strong conservation of the larger tripartite TRIAD (two RING fingers and DRIL) structure indicates that the three subdomains are functionally linked and identifies a novel class of proteins.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein L11 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L11 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, plant chloroplast, red algal chloroplast, cyanelle and archaebacterial L11; and mammalian, plant and yeast L12 (YL15). L11 is a protein of 140 to 165 amino-acid residues. In E. coli, the C-terminal half of L11 has been shown to be in an extended and loosely folded conformation and is likely to be buried within the ribosomal structure.
This family of proteins includes rRNA adenine dimethylases (e.g. KsgA) and the erythromycin resistance methylases (Erm).
The bacterial enzyme KsgA catalyses the transfer of a total of four methyl groups from S-adenosyl-L-methionine (S-AdoMet) to two adjacent adenosine bases in 16S rRNA. This enzyme and the resulting modified adenosine bases appear to be conserved in all species of eubacteria, eukaryotes, and archaea, and in eukaryotic organelles. Bacterial resistance to the aminoglycoside antibiotic kasugamycin involves inactivation of KsgA and resulting loss of the dimethylations, with modest consequences to the overall fitness of the organism. In contrast, the yeast ortholog, Dim1, is essential. In Saccharomyces cerevisiae (Baker's yeast), and presumably in other eukaryotes, the enzyme performs a vital role in pre-rRNA processing in addition to its methylating activity. The best conserved region in these enzymes is located in the N-terminal section and corresponds to a region that is probably involved in S-adenosyl methionine (SAM) binding.
The crystal structure of KsgA from Escherichia coli has been solved to a resolution of 2.1 Å. It bears a strong similarity to the crystal structure of ErmC' from Bacillus stearothermophilus and a lesser similarity to the yeast mitochondrial transcription factor, sc-mtTFB.
The Erm family of RNA methyltransferases, which methylate a single adenosine base in 23S rRNA, confers resistance to the MLS-B group of antibiotics. Despite their sequence similarity, the two enzyme families have strikingly different levels of regulation that remain to be elucidated. Other orthologs of this family include the yeast and Homo sapiens (Human) mitochondrial transcription factors (MTF1 and h-mtTFB respectively), which are nuclear encoded. Human-mtTFB is able to stimulate transcription in vitro independently of its S-adenosylmethionine binding and rRNA methyltransferase activity.
This family is found in Lsm (like-Sm) proteins, which have a core structure consisting of an open beta-barrel with an SH3-like topology.
Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Other snRNPs, such as U7 snRNP, can contain different Lsm proteins.
Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus, suggesting a more general role for Lsm proteins. Archaeal Lsm proteins are likely to represent the ancestral Lsm domain.
Eukaryotic translation initiation factor 1A (eIF-1A) (formerly known as eIF-4C) is a protein that seems to be required for maximal rate of protein biosynthesis. It enhances ribosome dissociation into subunits and stabilizes the binding of the initiator Met-tRNA to 40S ribosomal subunits. The archaea possess an eIF-1A homolog.
This family includes eukaryotic translation initiation factor 6 (eIF6) as well as presumed archaeal homologues.
The assembly of 80S ribosomes requires joining of the 40S and 60S subunits, which is triggered by the formation of an initiation complex on the 40S subunit. This event is rate-limiting for translation, and depends on external stimuli and the status of the cell. Eukaryotic translation initiation factor 6 (eIF6) binds specifically to the free 60S ribosomal subunit and prevents its association with the 40S ribosomal subunit. Furthermore, eIF6 interacts in the cytoplasm with RACK1, a receptor for activated protein kinase C (PKC). RACK1 is a major component of translating ribosomes, which harbour significant amounts of PKC. Loading 60S subunits with eIF6 causes a dose-dependent translational block and impairment of 80S formation, which are reversed by expression of RACK1 and stimulation of PKC in vivo and in vitro. PKC stimulation leads to eIF6 phosphorylation and its release, promoting 80S ribosome formation. RACK1 provides a physical and functional link between PKC signalling and ribosome activation.
DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The bacterial core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.
RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise.
A major role in the regulation of eukaryotic protein-coding genes is played by the gene-specific transcriptional regulators, which recruit the RNA polymerase II holoenzyme to the specific promoter. The Rpb4 and Rpb7 subunits of yeast RNA polymerase II form a heterodimeric complex essential for promoter-directed transcription initiation. The Rpb4-Rpb7 complex is not required for stable recruitment of polymerase to active preinitiation complexes, suggesting that Rpb4-Rpb7 mediates an essential step subsequent to promoter binding.
This entry represents a domain present in DNA-directed RNA polymerase II subunit, Rpb4.
The core of the bacterial RNA polymerase (RNAP) consists of four subunits, two alpha, a beta and a beta', which are conserved from bacteria to mammals. The alpha subunit (RpoA) initiates RNAP assembly by dimerising to form a platform on which the beta subunits can interact, and plays a direct role in promoter recognition. In eukaryotes, RNA polymerase (RNAP) II is responsible for all mRNA synthesis. RNAP-II consists of 12 subunits, where subunits Rpb3 and Rpb11 form a heterodimer that is functionally analogous to the bacterial RpoA homodimer. Archaeal RNAP closely resembles eukaryotic RNAP-II, and is composed of 12 subunits, of which D and L form a heterodimer resembling the Rpb3/Rpb11 and RpoA/RpoA dimers.
The bacterial RpoA, eukaryotic Rpb3 and archaeal D subunits share sequence and structural motifs, and can be placed into a single family. These subunits also have unique sequence motifs, especially at their C-terminal ends, which are involved in promoter specificity, for example the CTD of the bacterial RNAP alpha subunit.
The task of transcribing nuclear genes is shared between three RNA polymerases in eukaryotes: RNA polymerase (pol) I synthesizes the large rRNA, pol II synthesizes mRNA and pol III synthesizes tRNA and 5S rRNA. Pol I transcription is localised to discrete sites called nucleoli; these can be likened to ribosome factories, in which rRNA is synthesised by pol I in the fibrillar centres and then processed and assembled into ribosomes in the surrounding granular regions. Prokaryotes, in contrast, possess a single RNA polymerase, with transcription being controlled by the particular sigma factor interacting with the catalytic core.
This entry describes an N-terminal conserved region which can be found in the largest subunits of prokaryotic and eukaryotic RNA polymerases.
The LisH motif is found in a large number of eukaryotic proteins from metazoa, fungi and plants that have a wide range of functions. The recently solved structure of the LisH domain in the N-terminal region of LIS1 showed it to be a novel dimerisation motif, and indicated that other structural elements are likely to play an important role in dimerisation.
A sequence motif, LisH, has been identified in the products of genes mutated in Miller-Dieker lissencephaly, Treacher Collins, oral-facial-digital type 1 and contiguous syndrome ocular albinism with late onset sensorineural deafness syndromes. An additional homologous motif was detected in a gene product fused to the fibroblast growth factor receptor type 1 in patients with an atypical stem cell myeloproliferative disorder. In total, over 100 eukaryotic intracellular proteins are shown to possess a LIS1 homology (LisH) motif, including several katanin p60 subunits, muskelin, tonneau, LEUNIG, Nopp140, aimless and numerous WD repeat-containing beta-propeller proteins.
It is suggested that LisH motifs contribute to the regulation of microtubule dynamics, either by mediating dimerization, or else by binding cytoplasmic dynein heavy chain or microtubules directly. The predicted secondary structure of LisH motifs, and their occurrence in homologues of Gbeta beta-propeller subunits, suggests that they are analogues of Ggamma subunits, and might associate with the periphery of beta-propeller domains.
The 33-residue LIS1 homology (LisH) motif is found in eukaryotic intracellular proteins involved in microtubule dynamics, cell migration, nucleokinesis and chromosome segregation. The LisH motif is likely to possess a conserved protein-binding function and it has been proposed that LisH motifs contribute to the regulation of microtubule dynamics, either by mediating dimerization, or else by binding cytoplasmic dynein heavy chain or microtubules directly. The LisH motif is found associated with other domains, such as WD-40, SPRY, Kelch, AAA ATPase, RasGEF, or HEAT. The secondary structure of the LisH domain is predicted to be two alpha-helices.
Some proteins known to contain a LisH motif are listed below:
The C-terminal to LisH (CTLH) motif is a predicted alpha-helical sequence of unknown function that is found adjacent to the LisH motif in a number of these proteins but is absent in others (e.g. LIS1). The CTLH domain can also be found in the absence of the LisH motif, like in:
PINc describes a large group of domains which are predicted to play a role in nucleotide-binding, potentially being found in RNases.
PINc domains in nematode SMG-5 and yeast NMD4p are predicted to be involved in the posttranscriptional gene silencing pathway known as RNA interference (RNAi). In an early step in RNAi, the initiating dsRNA is cleaved into small interfering RNAs (siRNAs), 21-23 nucleotides long, by the enzyme Dicer. After processing by Dicer, siRNAs associate with a multicomponent complex called the RNA-induced silencing complex that recognises and cleaves the cognate message.
Sel1-like repeats are tetratricopeptide repeat sequences originally identified in a Caenorhabditis elegans receptor molecule which is a key negative regulator of the Notch pathway. Mammalian homologues have since been identified although these mainly pancreatic proteins have yet to have a function assigned.
This entry represents the CARP motif, which occurs as a tandem repeat in the C-terminal region of many cyclase-associated proteins (CAPs), as well as in tubulin binding cofactor C and the X-linked retinitis pigmentosa 2 protein (RP2). CARP-containing proteins appear to have a role in cell signalling.
Cyclase-associated proteins (CAPs) are highly conserved monomeric actin-binding proteins present in a wide range of organisms including yeast, fly, plants, and mammals. CAPs are multifunctional proteins that contain several structural domains. CAP is involved in species-specific signalling pathways. Only yeast CAPs are involved in adenylate cyclase activation. The C-terminal domain of CAP proteins is responsible for G-actin-binding that regulates actin remodelling in response to cellular signals and is required for normal cellular morphology, cell division, growth and locomotion in eukaryotes.
Tubulin binding cofactor C (or tubulin-specific chaperone C) (TBCC) is a folding cofactor that participates in tubulin biogenesis along with the other tubulin folding cofactors A (TBCA), B (TBCB), E (TBCE) and D (TBCD), as well as the GTP-binding protein Arl2.
Retinitis pigmentosa (RP) comprises a large group of heterogeneous diseases that result in progressive retinal degeneration. Human X-linked retinitis pigmentosa 2 protein (RP2) consists of an N-terminal beta-helix and a C-terminal ferredoxin-like alpha/beta domain. RP2 is a specific effector protein of the GTP-binding protein Arl3. The Arl3 protein is a member of the Arf (ADP ribosylation factor) subfamily of Ras-related proteins. The beta-helix domain of RP2 is required for the RP2 interaction with Arl3. The CARP motif is found in the N-terminal beta-helix domain of RP2 proteins.
This repeated motif of unknown function has been found between the transmembrane helices of cystinosin, yeast ERS1 and mannose-P-dolichol utilization defect 1. The positioning of this repeat suggests that it may be associated with the glycosylation machinery.
Deubiquitinating enzymes (DUB) form a large family of cysteine proteases that can deconjugate ubiquitin or ubiquitin-like proteins from ubiquitin-conjugated proteins. All DUBs contain a catalytic domain surrounded by one or more subdomains, some of which contribute to target recognition. The ~120-residue DUSP (domain present in ubiquitin-specific proteases) domain is one of these specific subdomains. Single or tandem DUSP domains are located both N- and C-terminal to the ubiquitin carboxyl-terminal hydrolase catalytic core domain.
The DUSP domain displays a tripod-like AB3 fold with a three-helix bundle and a three-stranded anti-parallel beta-sheet resembling the legs and seat of the tripod (see PDB:1W6V). Conserved residues are predominantly involved in hydrophobic packing interactions within the three alpha-helices. The most conserved DUSP residues, forming the PGPI motif, are flanked by two long loops that vary both in length and sequence. The PGPI motif packs against the three-helix bundle and is highly ordered.
The function of the DUSP domain is unknown but it may play a role in protein/protein interaction or substrate recognition. This domain is associated with ubiquitin carboxyl-terminal hydrolase family 2 (MEROPS peptidase family C19). They are a family of 100 to 200 kDa peptides which includes the Ubp1 ubiquitin peptidase from yeast; others include:
Mammalian prolyl 4-hydroxylase alpha catalyses the posttranslational formation of 4-hydroxyproline in -Xaa-Pro-Gly- sequences in collagens and other proteins. Prokaryotic enzymes might catalyse hydroxylation of antibiotic peptides. These are 2-oxoglutarate-dependent dioxygenases, requiring 2-oxoglutarate and dioxygen as cosubstrates and ferrous iron as a cofactor.
This entry represents iron-sulphur domain containing proteins that have a CDGSH sequence motif (although the Ser residue can also be an Ala or Thr), and is found in proteins from a wide range of organisms with the exception of fungi. Proteins carrying this domain include ferredoxin-dependent glutamate synthase. CDGSH-type domains are also found in the iron-containing outer membrane protein mitoNEET. MitoNEET contains the conserved sequence C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H, a defining feature of CDGSH domains, and is likely involved in iron binding.
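The consensus given above can be written as a regular expression for scanning protein sequences. The following is a minimal sketch only: the function name and the test peptide are hypothetical, X is assumed to mean any of the 20 standard amino acids, and a pattern hit is not by itself evidence of a functional CDGSH domain.

```python
import re

# X in the consensus is assumed to be any of the 20 standard amino acids.
AA = "[ACDEFGHIKLMNPQRSTVWY]"

# Transcription of C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H into a regex.
CDGSH_PATTERN = re.compile(
    f"C{AA}C{AA}{{2}}[ST]{AA}{{3}}P{AA}CDG[SAT]H"
)


def find_cdgsh_motifs(sequence):
    """Return (start, matched_text) for each non-overlapping pattern hit."""
    return [
        (m.start(), m.group())
        for m in CDGSH_PATTERN.finditer(sequence.upper())
    ]


# Synthetic peptide built to satisfy the pattern (not a real protein sequence).
print(find_cdgsh_motifs("MKCACAASAAAPACDGSHLL"))
```

Because the third position of the CDGSH core is written as [SAT], the same pattern also accepts the Ala and Thr variants mentioned in the text.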
The tertiary structures of pectate lyases and rhamnogalacturonase A show a stack of parallel beta strands that are coiled into a large helix. Each coil of the helix represents a structural repeat that, in some homologues, can be recognised from sequence information alone. Conservation of asparagines might be connected with asparagine-ladders that contribute to the stability of the fold. Proteins containing these repeats most often are enzymes with polysaccharide substrates.
This entry represents the LPS-induced tumour necrosis factor alpha factor (LITAF), which is induced in mammalian cells following treatment with lipopolysaccharide. The LITAF domain is a possible membrane-associated motif which contains an N-terminal CXXC knuckle followed by a long (25 amino acid) hydrophobic region and a C-terminal (H)XCXXC knuckle. Both of these knuckles are highly characteristic of Zn2+ binding domains, and the N-terminal region of one LITAF domain-containing protein is thought to bind the intracellular molecule Nedd4, which suggests that the hydrophobic region does not span the membrane. It may instead insert into the membrane, bringing together the N- and C-terminal CXXC knuckles to form a compact Zn2+ binding structure.
The retroviral oncogene v-myb, and its cellular counterpart c-myb, encode nuclear DNA-binding proteins. These belong to the SANT domain family that specifically recognise the sequence YAAC(G/T)G. In myb, one of the most conserved regions consisting of three tandem repeats has been shown to be involved in DNA-binding.
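As an illustration, the recognition sequence YAAC(G/T)G can be searched for on both strands of a DNA sequence. This sketch assumes that Y is the IUPAC code for a pyrimidine (C or T); the function names are hypothetical, and a sequence match does not demonstrate binding in vivo.

```python
import re

# Y is taken as the IUPAC pyrimidine code (C or T). The lookahead lets
# overlapping sites be reported.
MYB_SITE = re.compile(r"(?=([CT]AAC[GT]G))")
SITE_LENGTH = 6


def reverse_complement(dna):
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_myb_sites(dna):
    """Return (start, strand, site) tuples; start is on the forward strand."""
    dna = dna.upper()
    hits = [(m.start(), "+", m.group(1)) for m in MYB_SITE.finditer(dna)]
    rc = reverse_complement(dna)
    for m in MYB_SITE.finditer(rc):
        # Convert the reverse-strand offset back to a forward-strand start.
        hits.append((len(dna) - m.start() - SITE_LENGTH, "-", m.group(1)))
    return hits


print(find_myb_sites("GGTAACGGTT"))
```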
TLC is a protein domain with at least 5 transmembrane alpha-helices. Lag1p and Lac1p are essential for acyl-CoA-dependent ceramide synthesis, TRAM is a subunit of the translocon, and the CLN8 gene is mutated in Northern epilepsy syndrome. Proteins containing this domain may possess multiple functions such as lipid trafficking, metabolism, or sensing. Trh homologues possess additional homeobox domains.
The Ubiquitin Interacting Motif (UIM), or 'LALAL-motif', is a stretch of about 20 amino acid residues, which was first described in the 26S proteasome subunit PSD4/RPN-10 that is known to recognise ubiquitin. In addition, the UIM is found, often in tandem or triplet arrays, in a variety of proteins either involved in ubiquitination and ubiquitin metabolism, or known to interact with ubiquitin-like modifiers. Among the UIM proteins are two different subgroups of the UBP (ubiquitin carboxy-terminal hydrolase) family of deubiquitinating enzymes, one F-box protein, one family of HECT-containing ubiquitin-ligases (E3s) from plants, and several proteins containing ubiquitin-associated UBA and/or UBX domains. In most of these proteins, the UIM occurs in multiple copies and in association with other domains such as UBA, UBX, ENTH, EH, VHS, SH3, HECT, VWFA, EF-hand calcium-binding, WD-40, F-box, LIM, protein kinase, ankyrin, PX, phosphatidylinositol 3- and 4-kinase, C2, OTU, dnaJ, RING-finger or FYVE-finger. UIMs have been shown to bind ubiquitin and to serve as a specific targeting signal important for monoubiquitination. Thus, UIMs may have several functions in ubiquitin metabolism each of which may require different numbers of UIMs.
The UIM is unlikely to form an independent folding domain. Instead, based on the spacing of the conserved residues, the motif probably forms a short alpha-helix that can be embedded into different protein folds. Some proteins known to contain a UIM are listed below:
This describes a heat shock chaperonin-binding motif found in the stress-inducible phosphoprotein STI1. Both N- and C-termini of STI1 are capable of binding heat shock proteins and the domain is found both singly and duplicated in other proteins.
This domain is found in MoaA, NifB, PqqE, coproporphyrinogen III oxidase, biotin synthase and MiaB families, and includes a representative in the eukaryotic elongator subunit, Elp-3. Some members of the family are methyltransferases.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described.
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.
This group of aspartic peptidases belong to MEROPS peptidase family A22 (presenilin family, clan AD).
SPP and potential eukaryotic homologs represent a family of aspartic proteases that promote intramembrane proteolysis to release biologically important peptides. Signal peptide peptidase (SPP) catalyses intramembrane proteolysis of some signal peptides after they have been cleaved from a preprotein. In humans, SPP activity is required to generate signal sequence-derived human lymphocyte antigen-E epitopes that are recognised by the immune system, and are required in the processing of the hepatitis C virus core protein.
This is a family of uncharacterised proteins which includes Escherichia coli SprT. The majority of members contain the metallopeptidase zinc binding signature, which has an HExxH motif; however, there is no evidence for them being metallopeptidases.
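A simple scan for the HExxH signature might look like the sketch below. The function name is hypothetical and x is assumed to be any residue letter; as noted above, finding the motif is not evidence of metallopeptidase activity.

```python
import re

# HExxH: His, Glu, any two residues, His. The lookahead reports overlapping
# occurrences as well.
HEXXH = re.compile(r"(?=(HE[A-Z]{2}H))")


def hexxh_positions(sequence):
    """Return the 0-based start index of every HExxH occurrence."""
    return [m.start() for m in HEXXH.finditer(sequence.upper())]


# Synthetic test strings, not real protein sequences.
print(hexxh_positions("AAHEAAHGG"))
print(hexxh_positions("HEHEHAH"))
```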
This family currently contains one sequence of known function, human mitochondrial transcription termination factor (mTERF), a multizipper protein that binds to DNA as a monomer, with evidence pointing to intramolecular leucine zipper interactions. The precursors contain a mitochondrial targeting sequence, and the mature mTERF exhibits three leucine zippers, of which one is bipartite, and two widely spaced basic domains. Both basic domains and the three leucine zipper motifs are necessary for DNA binding. The leucine zippers are not implicated in a dimerisation role as in other leucine zippers.
The rest of the family consists of hypothetical proteins none of which have any functional information.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and the bacterial transcription antitermination protein NusG.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
The RING finger is a well characterised zinc finger which coordinates two zinc atoms in a cross-braced manner. According to the pattern of cysteines and histidines, three different subfamilies of RING finger can be defined. The classical RING finger (RING-HC) has a histidine at the fourth coordinating position and a cysteine at the fifth. In the RING-H2 variant, both the fourth and fifth positions are occupied by histidines. The RING-CH, which is very similar to the classical RING finger, differs from both of these variants in that it has a Cys residue in the fourth position and a His in the fifth. Another difference between the RING-CH and the common RING variants is a somewhat longer peptide segment between the fourth and fifth zinc-coordinating residues. The RING-CH zinc finger thus has the same arrangement of cysteine and histidine (C4HC3) as the PHD zinc finger, but it contains features (spacing between the cysteines and the histidine) characteristic of the genuine RING-finger (C3HC4). The RING-CH-type is an E3 ligase mainly found in proteins associated with membranes.
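The rule for distinguishing the three subfamilies by the fourth and fifth zinc-coordinating residues can be sketched as a small lookup. The function name and the input convention (a string of the eight zinc ligands in sequence order) are assumptions made for illustration; the sketch ignores the spacing differences that separate RING-CH from PHD fingers.

```python
def classify_ring_finger(ligands):
    """Classify a RING finger from its eight zinc ligands, in sequence order.

    `ligands` is a string of eight letters, each 'C' or 'H', for example
    'CCCHCCCC' for the classical C3HC4 arrangement.
    """
    ligands = ligands.upper()
    if len(ligands) != 8 or set(ligands) - {"C", "H"}:
        raise ValueError("expected eight residues, each C or H")
    fourth, fifth = ligands[3], ligands[4]
    subfamilies = {
        ("H", "C"): "RING-HC",  # classical: His at position 4, Cys at 5
        ("H", "H"): "RING-H2",  # His at both positions
        ("C", "H"): "RING-CH",  # Cys at position 4, His at 5 (C4HC3)
    }
    return subfamilies.get((fourth, fifth), "unclassified")


for pattern in ("CCCHCCCC", "CCCHHCCC", "CCCCHCCC"):
    print(pattern, classify_ring_finger(pattern))
```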
The solution structure of the RING-CH-type zinc finger of the herpesvirus Mir1 protein has shown that it is an outlying relative of the cellular RING finger domain family, with its polypeptide backbone much more closely resembling that of RING domains than PHD domains. The only real difference between the classic and variant RING domains, other than the alteration of zinc ligands, is the loss of the small beta-sheet found in RING domains and the replacement of one strand of this sheet with a single turn of helix. Some proteins that contain a RING-CH-type zinc finger are listed below:
The BSD domain is a domain of about 60 residues named after the BTF2-like transcription factors, Synapse-associated proteins and DOS2-like proteins in which it is found. It is also found in several hypothetical proteins. The BSD domain occurs in one or two copies in a variety of species ranging from primal protozoan to human. It can be found associated with other domains such as the BTB domain or the U-box in multidomain proteins. The function of the BSD domain is not yet known.
Secondary structure prediction indicates the presence of three predicted alpha helices, which probably form a three-helical bundle in small domains. The third predicted helix contains neighbouring phenylalanine and tryptophan residues - less common amino acids that are invariant in all the BSD domains identified and that are the most striking sequence features of the domain.
Some proteins known to contain one or two BSD domains are listed below:
The PAM domain (PCI/PINT associated module) is found in a number of proteins that form multiprotein complexes, e.g. the Sac3-Thp1 complex, the regulatory subunit of the 26S proteasome and the COP-9 signalosome. The domain is present in a single copy and has an alpha-helical fold. It is thought to play a role in protein binding.
The CR, or CT11-RanBPM, domain is a protein-protein interaction domain present in crown eukaryotes (plants, animals, fungi).
The PA14 domain forms an insert in bacterial beta-glucosidases, other glycosidases, glycosyltransferases, proteases, amidases, yeast adhesins and bacterial toxins, including anthrax protective antigen (PA). The domain also occurs in a Dictyostelium pre-spore cell-inducing factor Psi and in fibrocystin, the mammalian protein whose mutation leads to polycystic kidney and hepatic disease. The crystal structure of PA shows that this domain (named PA14 after its location in the PA20 pro-peptide) has a beta-barrel structure. The PA14 domain sequence suggests a binding function, rather than a catalytic role. The PA14 domain distribution is compatible with carbohydrate binding.
This domain is found in Saccharomyces cerevisiae (Baker's yeast) protein SMP2, proteins with an N-terminal lipin domain and phosphatidylinositol transfer proteins. SMP2 is involved in plasmid maintenance and respiration. Lipin proteins are involved in adipose tissue development and insulin resistance.
This domain is the central domain of AARP2 (asparagine and aspartate rich protein 2). It is weakly similar to the GTP-binding domain of elongation factor Tu. PfAARP2 is an antigen from Plasmodium falciparum of 150 kDa, which is encoded by a unique gene on chromosome 1. The central region of Pfaarp2 contains blocks of repetitions encoding asparagine and aspartate residues.
Adenylosuccinate synthetase plays an important role in purine biosynthesis, by catalysing the GTP-dependent conversion of IMP and aspartic acid to AMP. Adenylosuccinate synthetase has been characterised from various sources ranging from Escherichia coli (gene purA) to vertebrate tissues. In vertebrates, two isozymes are present: one involved in purine biosynthesis and the other in the purine nucleotide cycle.
The crystal structure of adenylosuccinate synthetase from E. coli reveals that the dominant structural element of each monomer of the homodimer is a central beta-sheet of 10 strands. The first nine strands of the sheet are mutually parallel with right-handed crossover connections between the strands. The 10th strand is antiparallel with respect to the first nine strands. In addition, the enzyme has two antiparallel beta-sheets, composed of two strands and three strands respectively, 11 alpha-helices and two short 3/10-helices. Further, it has been suggested that the similarities in the GTP-binding domains of the synthetase and the p21ras protein are an example of convergent evolution of two distinct families of GTP-binding proteins. Comparison of the structures of adenylosuccinate synthetase from Triticum aestivum and Arabidopsis thaliana with the known structures from E. coli reveals that the overall fold is very similar to that of the E. coli protein.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.
AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.
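The heterotetramer composition described above can be written down as a small data structure. This is only an illustrative sketch: the class name `APComplex` and its field names are hypothetical, and the subunit names are taken directly from the AP1 and AP2 examples in the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class APComplex:
    """Hypothetical record for one heterotetrameric AP complex."""
    name: str
    large_subunits: Tuple[str, str]  # the two adaptins
    medium_subunit: str              # mu
    small_subunit: str               # sigma

    def subunits(self) -> Tuple[str, ...]:
        # Two large subunits, then the medium and the small subunit.
        return (*self.large_subunits, self.medium_subunit, self.small_subunit)


# Subunit names as given in the text for AP1 and AP2.
AP1 = APComplex("AP1", ("gamma-1-adaptin", "beta-1-adaptin"), "mu-1", "sigma-1")
AP2 = APComplex("AP2", ("alpha-adaptin", "beta-2-adaptin"), "mu-2", "sigma-2")

if __name__ == "__main__":
    for complex_ in (AP1, AP2):
        print(complex_.name, "->", ", ".join(complex_.subunits()))
```

AP3 and AP4 are left out because the text does not list their subunits.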
GGAs (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) are a family of monomeric clathrin adaptor proteins that are conserved from yeasts to humans. GGAs regulate the clathrin-mediated transport of proteins (such as mannose 6-phosphate receptors) from the TGN to endosomes and lysosomes through interactions with TGN-sorting receptors, sometimes in conjunction with AP-1. GGAs bind cargo, membranes, clathrin and accessory factors. GGA1, GGA2 and GGA3 all contain a domain homologous to the ear domain of gamma-adaptin. GGAs are composed of a single polypeptide with four domains: an N-terminal VHS (Vps27p/Hrs/Stam) domain, a GAT (GGA and Tom1) domain, a hinge region, and a C-terminal GAE (gamma-adaptin ear) domain. The VHS domain is responsible for endocytosis and signal transduction, recognising transmembrane cargo through the ACLL sequence in the cytoplasmic domains of sorting receptors. The GAT domain (also found in Tom1 proteins) interacts with ARF (ADP-ribosylation factor) to regulate membrane trafficking, and with ubiquitin for receptor sorting. The hinge region contains a clathrin box for recognition and binding to clathrin, similar to that found in AP adaptins. The GAE domain is similar to the AP gamma-adaptin ear domain, and is responsible for the recruitment of accessory proteins that regulate clathrin-mediated endocytosis.
This entry represents a beta-sandwich structural motif found in the appendage (ear) domain of alpha-, beta- and gamma-adaptin from AP clathrin adaptor complexes, and the GAE (gamma-adaptin ear) domain of GGA adaptor proteins. These domains have an immunoglobulin-like beta-sandwich fold containing 7 or 8 strands in 2 beta-sheets in a Greek key topology. Although these domains share a similar fold, there is little sequence identity between the alpha/beta-adaptins and gamma-adaptin/GAE.
A novel antigen of Plasmodium falciparum has been cloned that contains a hydrophobic domain typical of an integral membrane protein. The antigen is designated apical membrane antigen 1 (AMA-1) by virtue of appearing to be located in the apical complex. AMA-1 appears to be transported to the merozoite surface close to the time of schizont rupture.
The 66 kDa merozoite surface antigen (PK66) of Plasmodium knowlesi, a simian malaria parasite, possesses vaccine-related properties believed to originate from a receptor-like role in parasite invasion of erythrocytes. The sequence of PK66 is conserved throughout Plasmodium, and shows high similarity to P. falciparum AMA-1. Following schizont rupture, the distribution of PK66 changes in a coordinate manner associated with merozoite invasion. Prior to rupture, the protein is concentrated at the apical end, following which it distributes itself entirely across the surface of the free merozoite. Immunofluorescence studies suggest that, during invasion, PK66 is excluded from the erythrocyte at, and behind, the invasion interface.
The beta subunit of archaeal and eukaryotic translation initiation factor 2 (IF2beta) and the N-terminal domain of translation initiation factor 5 (IF5) show significant sequence homology. Archaeal IF2beta contains two independent structural domains: an N-terminal mixed alpha/beta core domain (topological similarity to the common core of ribosomal proteins L23 and L15e), and a C-terminal domain consisting of a zinc-binding C4 finger. Archaeal IF2beta is a ribosome-dependent GTPase that stimulates the binding of initiator Met-tRNA(i)(Met) to the ribosomes, even in the absence of other factors. The C-terminal domain of eukaryotic IF5 is involved in the formation of the multi-factor complex (MFC), an important intermediate for the 43S pre-initiation complex assembly. IF5 interacts directly with IF1, IF2beta and IF3c, which together with IF2-bound Met-tRNA(i)(Met) form the MFC.
This entry represents the N-terminal alpha/beta domain found in IF2beta and IF5.
Members of this entry adopt a structure consisting of four alpha helices, arranged in an array. They bind specifically and directly to the xeroderma pigmentosum group C protein (XPC) to initiate nucleotide excision repair.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine and threonine belong to class II.
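The class assignment listed above amounts to a lookup table. The following minimal sketch encodes it; the function name `synthetase_class` is hypothetical, and the sets contain only the amino acids named in the paragraph (the subclasses a, b and c are not modelled because the text does not list their members).

```python
# Amino acid specificities per synthetase class, as listed in the text.
CLASS_I = frozenset({
    "arginine", "cysteine", "glutamic acid", "glutamine", "isoleucine",
    "leucine", "methionine", "tyrosine", "tryptophan", "valine",
})
CLASS_II = frozenset({
    "alanine", "asparagine", "aspartic acid", "glycine", "histidine",
    "lysine", "phenylalanine", "proline", "serine", "threonine",
})


def synthetase_class(amino_acid: str) -> str:
    """Return "I" or "II" for the synthetase specific to the given amino acid."""
    name = amino_acid.strip().lower()
    if name in CLASS_I:
        return "I"
    if name in CLASS_II:
        return "II"
    raise ValueError(f"no class assignment listed for {amino_acid!r}")


if __name__ == "__main__":
    for aa in ("valine", "serine", "Glutamic acid"):
        print(aa, "-> class", synthetase_class(aa))
```

Names are matched case-insensitively against the full amino acid name; one- or three-letter codes would need an extra mapping that is not part of this sketch.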
Formin homology (FH) proteins play a crucial role in the reorganization of the actin cytoskeleton, which mediates various functions of the cell cortex including motility, adhesion, and cytokinesis. Formins are multidomain proteins that interact with diverse signalling molecules and cytoskeletal proteins, although some formins have been assigned functions within the nucleus. Formins are characterised by the presence of three FH domains (FH1, FH2 and FH3), although members of the formin family do not necessarily contain all three domains. The proline-rich FH1 domain mediates interactions with a variety of proteins, including the actin-binding protein profilin, SH3 (Src homology 3) domain proteins, and WW domain proteins. The FH2 domain is required for the self-association of formin proteins through the ability of FH2 domains to directly bind each other, and may also act to inhibit actin polymerisation. The FH3 domain is less well conserved and may be important for determining intracellular localisation of formin family proteins. In addition, some formins can contain a GTPase-binding domain (GBD) required for binding to Rho small GTPases, and a C-terminal conserved Dia-autoregulatory domain (DAD).
This entry represents the FH2 domain, which was shown by X-ray crystallography to have an elongated, crescent shape containing three helical subdomains.
This family includes the yeast and human ASF1 proteins. These proteins have histone chaperone activity. ASF1 participates in both the replication-dependent and replication-independent pathways. The three-dimensional structure has been determined as a compact immunoglobulin-like beta sandwich fold topped by three helical linkers.
This entry represents the p29 subunit (also known as Rpp29 or Pop4) of the related ribonucleoproteins ribonuclease (RNase) P and RNase MRP, which can be found in both eukaryotes and archaea. The structure of the RNase P subunit, Rpp29, from Methanobacterium thermoautotrophicum has been determined. Mth Rpp29 is a member of the oligonucleotide/oligosaccharide binding fold family. It contains a structured beta-barrel core and unstructured N- and C-terminal extensions bearing several highly conserved amino acid residues that could be involved in RNA contacts in the protein-RNA complex. Rpp29 catalyses the endonucleolytic cleavage of RNA, removing 5'-extranucleotides from tRNA precursor. It interacts with the Rpp25 and Pop5 subunits.
RNase P is a ubiquitous ribonucleoprotein enzyme primarily responsible for cleaving the 5' leader sequence during maturation of tRNAs in all three domains of life. In eubacteria, this enzyme is made up of two subunits: a large RNA (approximately 120 kDa) responsible for mediating catalysis, and a small protein cofactor (approximately 15 kDa) that modulates substrate recognition and is required for efficient in vivo catalysis. In contrast, multiple proteins are associated with eukaryotic and archaeal RNase P, and these proteins exhibit no recognizable homology to the conserved bacterial protein subunit. In reconstitution experiments with recombinantly expressed and purified protein subunits, Mth Rpp29, a homologue of the Rpp29 protein subunit from eukaryotic RNase P, is an essential protein component of the archaeal holoenzyme. In Saccharomyces cerevisiae (Baker's yeast), RNase P consists of 9 protein subunits (Pop1, Pop3-8, Rpr2 and Rpp1), while in humans there are 10 subunits (Rpp14, 20, 21, 25, 29, 30, 38, 40, hPop1, 5).
RNase MRP (mitochondrial RNA processing) is an rRNA processing enzyme that cleaves a specific site within precursor rRNA to generate the mature 5'-end of 5.8S rRNA. RNase MRP also cleaves primers for mitochondrial DNA replication and CLB2 mRNA. In yeast, RNase MRP possesses one putatively catalytic RNA and at least 9 protein subunits (Pop1, Pop3-Pop8, Rpp1, Snm1 and Rmp1), and is highly related to RNase P.
Members of this family adopt a structure consisting of a small globular all-beta-domain, with a three-stranded beta-sheet and a contiguous beta-hairpin. They bind to Mago alpha-helices via extensive electrostatic interactions and at a beta2-beta3 loop via hydrophobic interactions.
This entry represents the C-terminal domain found in DNA/pantothenate metabolism flavoproteins, which affects synthesis of DNA and pantothenate metabolism. These proteins contain ATP, phosphopantothenate, and cysteine binding sites. The structure of this domain has been determined in human phosphopantothenoylcysteine (PPC) synthetase and as the PPC synthase domain (CoaB) from the Escherichia coli coenzyme A bifunctional protein CoaBC. This domain adopts a 3-layer alpha/beta/alpha fold with mixed beta-sheets, which topologically resembles a combination of Rossmann-like and ribokinase-like folds. The structure of these proteins predicts a ping pong mechanism with initial formation of an acyladenylate intermediate, followed by release of pyrophosphate and attack by cysteine to form the final products PPC and AMP.
The Obg family comprises a group of ancient P-loop small G proteins (GTPases) belonging to the TRAFAC (for translation factors) class and can be subdivided into several distinct protein subfamilies. OBG GTPases have been found in both prokaryotes and eukaryotes. The structure of the OBG GTPase from Thermus thermophilus has been determined.
This entry represents a C-terminal domain found in certain OBG GTPases. This domain contains a four-stranded beta sheet and three alpha helices flanked by an additional beta strand. It is predominantly found in the bacterial GTP-binding protein Obg, and is functionally uncharacterised.
The SEP domain is named after Saccharomyces cerevisiae Shp1, Drosophila melanogaster eyes closed gene (eyc), and vertebrate p47. In p47, the SEP domain has been shown to bind to and inhibit the cysteine protease cathepsin L. Most SEP domains are succeeded closely by a UBX domain.
This domain has a 2-layer beta(3)-alpha(2)-beta fold, and is present in a number of other proteins as well, including FAF1 (Fas-associated factor 1) and undulin 2. Many of these proteins also contain the UBX domain C-terminal to the FAF domain. This domain is found in many eukaryotic proteins.
This domain is predominantly found in the protein 'Activator of Hsp90 ATPase'. It adopts a secondary structure consisting of an N-terminal alpha-helix leading into a four-stranded meandering antiparallel beta-sheet, followed by a C-terminal alpha-helix. The two helices are packed together, with the beta-sheet curving around them. Proteins containing this domain bind to the molecular chaperone HSP82 and stimulate its ATPase activity.
This family is the protein translocase SEC61 complex gamma subunit of the archaeal and eukaryotic type. It does not hit bacterial SecE proteins. Sec61 is required for protein translocation in the endoplasmic reticulum.
The Sec61 complex (eukaryotes) or SecY complex (prokaryotes) forms a conserved heterotrimeric integral membrane protein complex and forms a protein-conducting channel that allows polypeptides to be transferred across (or integrated into) the endoplasmic reticulum (eukaryotes) or across the cytoplasmic membrane (prokaryotes). This complex is itself a part of a larger translocase heterotrimeric complex composed of alpha, beta and gamma subunits.
The channel is a passive conduit for polypeptides. It therefore has to associate with other components that provide a driving force. The partner proteins in bacteria and eukaryotes differ. In bacteria, the translocase complex comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of the mature peptide into the periplasm (SecD and SecF). The SecA ATPase interacts dynamically with the SecYEG integral membrane components to drive the transmembrane movement of newly synthesized preproteins. In yeast (and probably in all eukaryotes), the full translocase comprises another membrane protein subcomplex (the tetrameric Sec62/63p complex), and the lumenal protein BiP, a member of the Hsp70 family of ATPases. BiP promotes translocation by acting as a molecular ratchet, preventing the polypeptide chain from sliding back into the cytosol.
This entry represents the major facilitator superfamily (MFS) domain, which consists of twelve transmembrane helices (with some variations). MFS proteins are the largest group of secondary membrane transporters in the cell. Among the different families of transporters, only two occur ubiquitously in all classifications of organisms; these are the ATP-Binding Cassette (ABC) superfamily and the Major Facilitator Superfamily (MFS). The MFS transporters are single-polypeptide secondary carriers capable only of transporting small solutes in response to chemiosmotic ion gradients. The MFS family contains members that function as uniporters, symporters or antiporters. In addition, their solute specificities are diverse.
This domain can be found in the glycerol-3-phosphate transporter from Escherichia coli, which transports glycerol-3-phosphate into the cytoplasm and inorganic phosphate into the periplasm. The E. coli proton/sugar transporter lactose permease (LacY) also carries this domain, and acts to couple lactose and H+ translocation.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesizes ATP, and the V0 or A0 complex forms the membrane-spanning pore. The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases.
This entry represents subunit C from the A0 complex of A-ATPases, and subunits C and D from the V0 complex of V-ATPases, all of which are involved in the translocation of protons across a membrane. There is more than one type of D subunit in V-ATPases, where the D1 subunit is ubiquitous, while the D2 subunit has limited tissue expressivity, possibly to account for differential functions, targeting or regulation of V-ATPase activity.
Secretion across the inner membrane in some Gram-negative bacteria occurs via the preprotein translocase pathway. Proteins are produced in the cytoplasm as precursors, and require a chaperone subunit to direct them to the translocase component. From there, the mature proteins are either targeted to the outer membrane, or remain as periplasmic proteins. The translocase protein subunits are encoded on the bacterial chromosome.
The translocase itself comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of the mature peptide into the periplasm (SecD and SecF). The chaperone protein SecB is a highly acidic homotetrameric protein that exists as a "dimer of dimers" in the bacterial cytoplasm. SecB maintains preproteins in an unfolded state after translation, and targets these to the peripheral membrane protein ATPase SecA for secretion. The structure of the Escherichia coli SecYEG assembly revealed a sandwich of two membranes interacting through the extensive cytoplasmic domains. Each membrane is composed of dimers of SecYEG. The monomeric complex contains 15 transmembrane helices.
The eubacterial secY protein interacts with the signal sequences of secretory proteins as well as with two other components of the protein translocation system: secA and secE. SecY is an integral plasma membrane protein of 419 to 492 amino acid residues that apparently contains 10 transmembrane (TM), 6 cytoplasmic and 5 periplasmic regions.
Cytoplasmic regions 2 and 3, and TM domains 1, 2, 4, 5, 7 and 10 are well conserved: the conserved cytoplasmic regions are believed to interact with cytoplasmic secretion factors, while the TM domains may participate in protein export. Homologs of secY are found in archaebacteria. SecY is also encoded in the chloroplast genome of some algae where it could be involved in a prokaryotic-like protein export system across the two membranes of the chloroplast endoplasmic reticulum (CER) which is present in chromophyte and cryptophyte algae.
A variety of substrate carrier proteins that are involved in energy transfer are found in the inner mitochondrial membrane or integral to the membrane of other eukaryotic organelles such as the peroxisome. Such proteins include: ADP, ATP carrier protein (ADP/ATP translocase); 2-oxoglutarate/malate carrier protein; phosphate carrier protein; tricarboxylate transport protein (or citrate transport protein); Graves disease carrier protein; yeast mitochondrial proteins MRS3 and MRS4; yeast mitochondrial FAD carrier protein; and many others. Structurally, these proteins can consist of up to three tandem repeats of a domain of approximately 100 residues, each domain containing two transmembrane regions.
This entry represents a group of uncharacterised small proteins found in both eukaryotes and prokaryotes, including NMA1147 from Neisseria meningitidis and YgfY from Escherichia coli. YgfY may be involved in transcriptional regulation. The structure of these proteins consists of a complex bundle of five alpha-helices, which is composed of an up-down 3-helix bundle plus an orthogonal 2-helix bundle.
In the Escherichia coli cytosol, a fraction of the newly synthesised proteins requires the activity of molecular chaperones for folding to the native state. The major chaperones implicated in this folding process are the ribosome-associated Trigger Factor (TF), and the DnaK and GroEL chaperones with their respective co-chaperones. Trigger Factor is an ATP-independent chaperone and displays chaperone and peptidyl-prolyl-cis-trans-isomerase (PPIase) activities in vitro. It is composed of at least three domains, an N-terminal domain which mediates association with the large ribosomal subunit, a central substrate binding and PPIase domain with homology to FKBP proteins, and a C-terminal domain of unknown function. The positioning of TF at the peptide exit channel, together with its ability to interact with nascent chains as short as 57 residues renders TF a prime candidate for being the first chaperone that binds to the nascent polypeptide chains.
This entry represents the C-terminal domain of bacterial trigger factor proteins, which has a multi-helical structure consisting of an irregular array of long and short helices. This domain is structurally similar to the peptide-binding domain of the bacterial porin chaperone SurA.
Glycolipid transfer protein (GLTP) is a cytosolic protein that catalyses the intermembrane transfer of glycolipids such as glycosphingolipids, glyceroglycolipids, and possibly glucosylceramides, but not of phospholipids. GLTP has a multi-helical structure consisting of two layers of orthogonally packed helices.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
L27 is a protein from the large (50S) subunit; it is essential for ribosome function, but its exact role is unclear. It belongs to a family of ribosomal proteins, examples of which are found in bacteria, chloroplasts of plants and red algae and the mitochondria of fungi (e.g. MRP7 from yeast mitochondria). The schematic relationship between these groups of proteins is shown below.
The methyltransferase TYW3 (tRNA-yW-synthesising protein 3) has been identified in yeast to be involved in wybutosine (yW) biosynthesis. yW is a complexly modified guanosine residue that contains a tricyclic base and is found at the 3'-position adjacent to the anticodon of phenylalanine tRNA. TYW3 is an N-4 methylase that methylates yW-86 to yield yW-72 in an Ado-Met-dependent manner.
This entry contains uncharacterised proteins. Those with structural information consist of two domains: an all-alpha domain with a 3-helical bundle fold, and an alpha-beta domain in 3 layers, alpha/beta/alpha.
ATP-NAD kinases catalyse the phosphorylation of NAD to NADP utilizing ATP and other nucleoside triphosphates as well as inorganic polyphosphate as a source of phosphorus. ATP-NAD kinase contains two domains, where domain 1 has an alpha/beta topology that is related in structure to the N-terminal of phosphofructokinase, and domain 2 has an atypical beta-sandwich topology made of four structural repeats of beta(3) units.
The alpha-helical ferredoxin domain contains two Fe4-S4 clusters, typical of bacterial ferredoxin. Iron-sulphur proteins play an important role in electron transfer processes and in various enzymatic reactions. In eukaryotes, the mitochondria are the major site of Fe-S cluster biosynthesis in the cell, used for the assembly of mitochondrial and non-mitochondrial Fe-S proteins. The alpha-helical ferredoxin domain is present in several proteins involved in redox reactions, including the C-terminal of the respiratory proteins succinate dehydrogenase (SQR) in bacteria/mitochondria, and fumarate reductase (QFR) in bacteria. SQR is analogous to the mitochondrial respiratory complex II, and is involved in the electron transport pathway from succinate as a donor to the acceptor ubiquinone. SQR helps prevent the formation of reactive oxygen species and is used during aerobic respiration, whereas QFR does not and, consequently, is used to catalyse the final step of anaerobic respiration using the acceptor fumarate.
The alpha-helical ferredoxin domain is also present in the N-terminal of the cytosolic protein dihydropyrimidine dehydrogenase (DPD), which catalyses the NADPH-dependent, rate-limiting step in pyrimidine degradation, converting pyrimidines to 5,6-dihydro compounds. DPD catalysis involves electron transfer from NADPH to the substrate via the Fe4-S4 centre and FAD. In mammals, this pathway produces the neurotransmitter beta-alanine.
Ribosomal protein L29 is one of the proteins from the large ribosomal subunit. L29 belongs to a family of ribosomal proteins of 63 to 138 amino-acid residues which, on the basis of sequence similarities, groups:
The prokaryotic heat shock protein DnaJ interacts with the chaperone hsp70-like DnaK protein. Structurally, the DnaJ protein consists of an N-terminal conserved domain (called 'J' domain) of about 70 amino acids, a glycine-rich region ('G' domain) of about 30 residues, a central domain containing four repeats of a CXXCXGXG motif ('CRR' domain) and a C-terminal region of 120 to 170 residues.
Such a structure is shown in the following schematic representation:
It is thought that the 'J' domain of DnaJ mediates the interaction with the DnaK protein and consists of four helices, the second of which has a charged surface that includes at least one pair of basic residues that are essential for interaction with the ATPase domain of Hsp70. The J- and CRR-domains are found in many prokaryotic and eukaryotic proteins, either together or separately. In yeast, J-domains have been classified into 3 groups; the class III proteins are functionally distinct and do not appear to act as molecular chaperones.
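The CXXCXGXG repeat of the 'CRR' domain mentioned above can be located in a sequence with a simple pattern search. The sketch below assumes that X stands for any single residue and uses Python's standard `re` module; the function name `find_crr_repeats` and the toy sequence are hypothetical and are not taken from a real DnaJ protein.

```python
import re

# CXXCXGXG with X treated as "any one residue" (an assumption of this sketch).
CRR_MOTIF = re.compile(r"C..C.G.G")


def find_crr_repeats(sequence: str):
    """Return (0-based start, matched text) for each non-overlapping motif hit."""
    seq = sequence.upper()
    return [(m.start(), m.group()) for m in CRR_MOTIF.finditer(seq)]


if __name__ == "__main__":
    # Made-up peptide containing two motif instances, for illustration only.
    toy = "MACPTCHGSGAAACSKCNGRGK"
    for start, hit in find_crr_repeats(toy):
        print(start, hit)
```

A match only shows that the residue pattern is present; it does not establish that the region is a genuine CRR domain, which is defined by four such repeats in the central region of the protein.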
The Prefoldin/GimC family of proteins is found in eukaryotes and archaea. Prefoldin is part of a molecular chaperone system that promotes the correct folding of nascent polypeptide chains. Prefoldin/GimC interacts with the nascent chain to stabilise it prior to its folding within the central cavity of a chaperonin. Prefoldin/GimC is a hexamer consisting of two types of subunits, alpha and beta. Archaeal prefoldin contains one type of alpha and one type of beta subunit, while eukaryotic prefoldin/GimC contains two different but related alpha subunits and four related beta subunits.
This entry represents an alpha-helical tRNA-binding arm found in class I and II aminoacyl-tRNA synthetase enzymes, as well as in the methicillin resistance protein FemA.
The tRNA-binding arm domain is conserved between class I and class II aminoacyl-tRNA synthetase enzymes, consisting of two alpha helices in an antiparallel hairpin with a left-handed twist. The appended tRNA-binding domains recognize a small number of nucleotides that are conserved specifically in each cognate tRNA species for the discrimination between the cognate and noncognate tRNAs. These nucleotides are called identity elements, and constitute the identity set. The tRNA-binding arm occurs as the C-terminal domain in some class I enzymes, such as valyl-tRNA synthetase, and as the N-terminal domain in some class II enzymes, such as phenylalanyl-tRNA synthetase.
The methicillin resistance protein, FemA (factors essential for methicillin resistance), contains a probable tRNA-binding arm that is similar in structure to those found in tRNA synthetases. In FemA, the tRNA-binding arm is inserted into the C-terminal NAT-like domain, and is thought to bind tRNA-glycine. FemA, along with FemB and FemX, plays a vital role in peptidoglycan biosynthesis specific to Staphylococci.
Superoxide dismutases (SODs) catalyse the conversion of superoxide radicals to molecular oxygen and hydrogen peroxide. Their function is to destroy the radicals that are normally produced within cells and are toxic to biological systems. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one. This family includes both single metal-binding SODs and cambialistic SOD, which can bind either Mn or Fe. Fe/MnSODs are ubiquitous enzymes that are responsible for the majority of SOD activity in prokaryotes, fungi, blue-green algae and mitochondria. Fe/MnSODs are found as homodimers or homotetramers.
The structure of Fe/MnSODs can be divided into two domains, an alpha N-terminal domain and an alpha/beta C-terminal domain, connected by a loop. The structure of the N-terminal domain consists of two helices in an antiparallel hairpin, with a left-handed twist. The structure of the C-terminal domain is of the alpha/beta type, and consists of a three-stranded antiparallel beta-sheet in the order 213, along with four helices in the arrangement alpha/beta(2)/alpha/beta/alpha(2).
After cytochrome c is synthesized in the cytoplasm as apocytochrome c, it is transported through the outer mitochondrial membrane to the intermembrane space, where haem is covalently attached by thioether bonds to two cysteine residues located in the cytochrome c centre. Cytochrome c is required during oxidative phosphorylation as an electron shuttle between Complex III (cytochrome c reductase) and Complex IV (cytochrome c oxidase). In addition, cytochrome c is involved in apoptosis in more complex organisms such as Xenopus, rats and humans. Cellular stress can induce cytochrome c release from the mitochondrial membrane. In mammals, cytochrome c triggers the assembly of the apoptosome, consisting of cytochrome c, Apaf-1 and dATP, which activates caspase-9, leading to cell death. There are several different members of the cytochrome c family with different functional roles; for instance, cytochrome c549 is associated with photosystem II.
The known structures of c-type cytochromes have six different classes of fold. Of these, four are unique to c-type cytochromes. The consensus sequence for the cytochrome c centre is Cys-X-X-Cys-His, where the histidine residue is one of the two axial ligands of the haem iron. This arrangement is shared by all proteins known to belong to the cytochrome c family, which presently includes both mono-haem proteins and multi-haem proteins. This entry represents mono-haem cytochrome c proteins (excluding class II and f-type cytochromes), such as cytochromes c, c1, c2, c5, c555, c550 to c553, c556, and c6.
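The Cys-X-X-Cys-His consensus described above lends itself to a short pattern search. This is a sketch under stated assumptions: the function name `find_haem_c_sites` and the toy fragment are hypothetical, X is taken to be any single residue, and a match indicates only a candidate haem-binding site, not a confirmed one.

```python
import re

# Cys-X-X-Cys-His in one-letter code: C, any two residues, C, H.
# A lookahead is used so that overlapping candidates are not skipped.
HAEM_C_PATTERN = re.compile(r"(?=C..CH)")


def find_haem_c_sites(sequence: str) -> list[dict]:
    """Locate candidate CXXCH sites in a protein sequence.

    Each hit records the 0-based positions of the two cysteines (the
    residues that carry the haem) and of the histidine (an axial ligand
    of the haem iron).
    """
    hits = []
    for m in HAEM_C_PATTERN.finditer(sequence.upper()):
        start = m.start()
        hits.append({
            "cysteines": (start, start + 3),
            "histidine": start + 4,
        })
    return hits


# Hypothetical toy fragment used only to exercise the function.
toy_fragment = "AAKCAQCHTVE"

if __name__ == "__main__":
    for hit in find_haem_c_sites(toy_fragment):
        print(hit)
```

Counting the hits gives a rough way to separate candidate mono-haem sequences (one site) from candidate multi-haem sequences (several sites).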
Cytochrome c-type centres are also found in the active sites of many enzymes, including cytochrome cd1-nitrite reductase (as the N-terminal haem c domain), quinoprotein alcohol dehydrogenase (as the C-terminal domain), quinohemoprotein amine dehydrogenase A chain (as domains 1 and 2), and the cytochrome bc1 complex (as the cytochrome bc1 domain).
Homeodomain proteins are transcription factors that share a related DNA binding homeodomain. The homeodomain was first identified in a number of Drosophila homeotic and segmentation proteins, but is now known to be well conserved in many other animals, including vertebrates. The domain binds DNA through a helix-turn-helix (HTH) structure. The HTH motif is characterised by two alpha-helices, which make intimate contacts with the DNA and are joined by a short turn. The second helix binds to DNA via a number of hydrogen bonds and hydrophobic interactions, which occur between specific side chains and the exposed bases and thymine methyl groups within the major groove of the DNA. The first helix helps to stabilise the structure. Many proteins contain homeodomains, including Drosophila Engrailed, yeast mating type proteins, hepatocyte nuclear factor 1a and HOX proteins.
The homeodomain motif is very similar in sequence and structure to domains in a wide range of DNA-binding proteins, including recombinases, Myb proteins, GARP response regulators, human telomeric proteins (hTRF1), paired domain proteins (PAX), yeast RAP1, centromere-binding proteins CENP-B and ABP-1, transcriptional regulators (TyrR), AraC-type transcriptional activators, and tetracycline repressor-like proteins (TetR, QacR, YcdC).
Members of the recently discovered ARID (AT-rich interaction domain) family of DNA-binding proteins are found in fungi and invertebrate and vertebrate metazoans. ARID-encoding genes are involved in a variety of biological processes including embryonic development, cell lineage gene regulation and cell cycle control. Although the specific roles of this domain and of ARID-containing proteins in transcriptional regulation are yet to be elucidated, they include both positive and negative transcriptional regulation and a likely involvement in the modification of chromatin structure. The basic structure of the ARID domain appears to be a series of six alpha-helices separated by beta-strands, loops, or turns, but the structured region may extend to an additional helix at either or both ends of the basic six. Based on primary sequence homology, they can be partitioned into three structural classes: Minimal ARID proteins that consist of a core domain formed by six alpha helices; ARID proteins that supplement the core domain with an N-terminal alpha-helix; and Extended-ARID proteins, which contain the core domain and additional alpha-helices at their N- and C-termini.
The human SWI-SNF complex protein p270 is an ARID family member with non-sequence-specific DNA binding activity. The ARID consensus and other structural features are common to both p270 and yeast SWI1, suggesting that p270 is a human counterpart of SWI1. The approximately 100-residue ARID sequence is present in a series of proteins strongly implicated in the regulation of cell growth, development, and tissue-specific gene expression. Although about a dozen ARID proteins can be identified from database searches, to date, only Bright (a regulator of B-cell-specific gene expression), dead ringer (a Drosophila melanogaster gene product required for normal development), and MRF-2 (which represses expression from the Cytomegalovirus enhancer) have been analyzed directly in regard to their DNA binding properties. Each binds preferentially to AT-rich sites. In contrast, p270 shows no sequence preference in its DNA binding activity, thereby demonstrating that AT-rich binding is not an intrinsic property of ARID domains and that ARID family proteins may be involved in a wider range of DNA interactions.
Ribosomal protein L11 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L11 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, plant chloroplast, red algal chloroplast, cyanelle and archaebacterial L11; and mammalian, plant and yeast L12 (YL15). L11 is a protein of 140 to 165 amino-acid residues. In E. coli, the C-terminal half of L11 has been shown to be in an extended and loosely folded conformation and is likely to be buried within the ribosomal structure.
Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins.
The small ribosomal subunit protein S18 is known to be involved in binding the aminoacyl-tRNA complex in Escherichia coli, and appears to be situated at the tRNA A-site. Experimental evidence has revealed that S18 is well exposed on the surface of the E. coli ribosome, and is a secondary rRNA binding protein. S18 belongs to a family of ribosomal proteins that includes: eubacterial S18; metazoan mitochondrial S18; algal and plant chloroplast S18; and cyanelle S18.
UBA domains are a commonly occurring sequence motif of approximately 45 amino acid residues that are found in diverse proteins involved in the ubiquitin/proteasome pathway, DNA excision-repair, and cell signalling via protein kinases. HHR23A, the human homologue of yeast Rad23A, is a nucleotide excision-repair protein that contains both an internal and a C-terminal UBA domain. The fold of the UBA domain consists of a compact three-helical bundle with a right-handed twist, and has a conserved hydrophobic surface patch for protein-protein interactions. UBA-like domains can be found in other proteins as well, such as the TS-N domain in the elongation factor Ts (EF-Ts), which catalyses the recycling of the GTPase EF-Tu required for the binding of aminoacyl-tRNA to the ribosomal A site; and the C-terminal domain of TAP/NXF1, which functions in nuclear export through the interaction of its UBA-like domain with FG nucleoporins.
This entry contains the phosphatidylinositol transfer protein, Sec14p, which catalyses the exchange of phosphatidylinositol and phosphatidylcholine between membrane bilayers in vitro. Other related proteins include the retinaldehyde/retinal-binding proteins, which are functional components of the visual cycle, guanine nucleotide exchange factor, and alpha-tocopherol transfer protein, which enhances the transfer of ligand between separate membranes, as well as several hypothetical proteins.
Transcription factor S-II (TFIIS) is a eukaryotic protein which induces mRNA cleavage by enhancing the intrinsic nuclease activity of RNA polymerase (Pol) II, past template-encoded pause sites. TFIIS shows DNA-binding activity only in the presence of RNA polymerase II. It is widely distributed, being found in mammals, Drosophila, yeast and in the archaebacterium Sulfolobus acidocaldarius. S-II proteins have a relatively conserved C-terminal region but variable N-terminal region, and some members of this family are expressed in a tissue-specific manner.
TFIIS is a modular factor that comprises an N-terminal domain I, a central domain II, and a C-terminal domain III. The weakly conserved domain I forms a four-helix bundle and is not required for TFIIS activity. Domain II forms a three-helix bundle, and domain III adopts a zinc-ribbon fold with a thin protruding beta-hairpin. Domain II and the linker between domains II and III are required for Pol II binding, whereas domain III is essential for stimulation of RNA cleavage. TFIIS extends from the polymerase surface via a pore to the internal active site, spanning a distance of 100 Angstroms. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.
This domain is found in the central region of transcription elongation factor S-II and in several hypothetical proteins.
Ribosomal protein S13 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S13 is known to be involved in binding fMet-tRNA and, hence, in the initiation of translation. S13 contains three helices and a beta-hairpin in the core of the protein, which form a helix-two turns-helix (H2TH) motif, and a non-globular C-terminal extension.
This H2TH motif can be found in other proteins as well. In the DNA repair protein, MutM (formamidopyrimidine DNA glycosylase; Fpg), the middle domain contains the H2TH motif. MutM is a trifunctional DNA base excision repair enzyme that removes a wide range of oxidatively damaged bases (N-glycosylase activity) and cleaves both the 3'- and 5'-phosphodiester bonds of the resulting apurinic/apyrimidinic site (AP lyase activity). Other repair enzymes, such as E. coli Endonuclease VIII that excises oxidized pyrimidines from DNA, also contain a DNA-binding H2TH motif within the middle domain. The H2TH domains of these repair proteins are only peripherally involved in binding DNA; their primary function may be simply to position the N-terminal lobe and C-terminal zinc finger domain of the glycosylases for interactions with DNA.
The middle domain of topoisomerase IV-B subunit contains an H2TH motif that is structurally related to the DNA repair proteins. Although the H2TH domain appears to be retained in all archaeal and plant type IIB topoisomerases identified to date, it has no known function and has not been observed in other topoisomerase families.
This protein family is found in archaea and eukaryota. The human TFAR19 encodes a protein which shares significant homology to the corresponding proteins of species ranging from yeast to mice. TFAR19 exhibits a ubiquitous expression pattern and its expression is up-regulated in the tumour cells undergoing apoptosis. TFAR19 may play a general role in the apoptotic process. Also included in this family is a DNA-binding protein from the archaea, Methanobacterium thermoautotrophicum.
A putative DNA-binding domain with a conserved structure is found in several different protein families. The core structure of the domain consists of a three-helical fold that is architecturally similar to that of the "winged-helix" fold, but is topologically distinct. Representatives of this domain can be found in domains B1 and B5 from the beta subunit of phenylalanine-tRNA synthetases, the C-terminal region of the DNA/RPA-binding domain of the DNA excision repair factor XPA, the N-terminal domain of the transcriptional activators BmrR and MtaN, the most conserved domain of the retinal development protein Dachshund, and the DNA-binding domain of the gpNU1 subunit from the bacteriophage lambda viral packing protein terminase.
This entry represents a domain with a spectrin-repeat-like fold consisting of three helices in a closed bundle with a left-handed twist. This domain is found in the succinate dehydrogenase/fumarate reductase oxidoreductase family of proteins, such as:
The folding pathway of tubulins includes highly specific interactions with a series of cofactors (A, B, C, D and E) after they are released from the eukaryotic chaperonin CCT. Cofactors A and D capture and stabilise tubulin in a quasi-native conformation. Cofactor E binds to the cofactor D-tubulin complex, and interaction with cofactor C then causes the release of tubulin polypeptides in the native state. This family is the tubulin-specific chaperone A.
Acyl-CoA-binding protein (ACBP) is a small (10 kDa) protein that binds medium- and long-chain acyl-CoA esters with very high affinity and may function as an intracellular carrier of acyl-CoA esters. ACBP is also known as diazepam binding inhibitor (DBI) or endozepine (EP) because of its ability to displace diazepam from the benzodiazepine (BZD) recognition site located on the GABA type A receptor. It is therefore possible that this protein also acts as a neuropeptide to modulate the action of the GABA receptor.
ACBP is a highly conserved protein of about 90 residues that is found in all four eukaryotic kingdoms, Animalia, Plantae, Fungi and Protista, and in some eubacterial species.
Although ACBP occurs as a completely independent protein, intact ACB domains have been identified in a number of large, multifunctional proteins in a variety of eukaryotic species. These include large membrane-associated proteins with N-terminal ACB domains, multifunctional enzymes with both ACB and peroxisomal enoyl-CoA Delta(3), Delta(2)-enoyl-CoA isomerase domains, and proteins with both an ACB domain and ankyrin repeats.
The ACB domain consists of four alpha-helices arranged in a bowl shape with a highly exposed acyl-CoA-binding site. The ligand is bound through specific interactions with residues on the protein, most notably several conserved positive charges that interact with the phosphate group on the adenosine-3'-phosphate moiety, and the acyl chain is sandwiched between the hydrophobic surfaces of CoA and the protein.
Other proteins containing an ACB domain include:
The RNA-binding domains of the ribosomal protein S15 and the influenza virus non-structural protein NS1 share the same structural fold, consisting of three helices in an irregular array. S15 is one of 21 proteins in the small, bacterial 30S ribosomal subunit, and is required for assembly of the subunit through its binding to 16S rRNA. The multifunctional glutamyl-prolyl-tRNA synthetase (EPRS) contains three tandem repeats linking two catalytic domains, all three of which contribute to RNA-binding; the second repeated element bears structural resemblance to the S15/NS1 RNA-binding domain.
High mobility group (HMG) box domains are involved in binding DNA, and may be involved in protein-protein interactions as well. The structure of the HMG-box domain consists of three helices in an irregular array. HMG-box domains are found in one or more copies in HMG-box proteins, which form a large, diverse family involved in the regulation of DNA-dependent processes such as transcription, replication, and strand repair, all of which require the bending and unwinding of chromatin. Many of these proteins are regulators of gene expression. HMG-box proteins are found in a variety of eukaryotic organisms, and can be broadly divided into two groups, based on sequence-dependent and sequence-independent DNA recognition; the former usually contain one HMG-box motif, while the latter can contain multiple HMG-box motifs.
HMG-box domains can be found in single or multiple copies in the following protein classes: HMG1 and HMG2 non-histone components of chromatin; SRY (sex determining region Y protein) involved in differential gonadogenesis; the SOX family of transcription factors; sequence-specific LEF1 (lymphoid enhancer binding factor 1) and TCF-1 (T-cell factor 1) involved in regulation of organogenesis and thymocyte differentiation; structure-specific recognition protein SSRP involved in transcription and replication; MTF1 mitochondrial transcription factor; nucleolar transcription factors UBF 1/2 (upstream binding factor) involved in transcription by RNA polymerase I; Abf2 yeast ARS-binding factor; yeast transcription factors Ixr1, Rox1, Nhp6b and Spp41; mating type proteins (MAT) involved in the sexual reproduction of fungi; and the YABBY plant-specific transcription factors.
Histones mediate DNA organisation and play a dominant role in regulating eukaryotic transcription. The histone-fold consists of a core of three helices, where the long middle helix is flanked at each end by shorter ones. Proteins displaying this structure include the nucleosome core histones, which form octamers composed of two copies of each of the four histones, H2A, H2B, H3 and H4; archaeal histone, which possesses only the core domain part of eukaryotic histone; and the TATA-box binding protein (TBP)-associated factors (TAF), where the histone fold is a common motif for mediating TAF-TAF interactions. TAF proteins include TAF(II)18 and TAF(II)28, which form a heterodimer, TAF(II)42 and TAF(II)62, which form a heterotetramer similar to (H3-H4)2, and the negative cofactor 2 (NC2) alpha and beta chains, which form a heterodimer. The TAF proteins are a component of transcription factor IID (TFIID), along with the TBP protein. TFIID forms part of the pre-initiation complex on core promoter elements required for RNA polymerase II-dependent transcription. The TAF subunits of TFIID mediate transcriptional activation of subsets of eukaryotic genes. The NC2 complex mediates the inhibition of TATA-dependent transcription through interactions with TBP.
The 20S proteasome is a multicatalytic complex that is responsible for the non-lysosomal degradation of intracellular proteins. The proteasome is composed of a catalytic core that is regulated by protein complexes, which bind to the ends of the cylindrical core structure. One of these regulatory complexes is the PA28 activator complex (also known as the 11S regulator, or REG), a ring-shaped hexameric structure that enhances the peptidase activity of the core enzyme. Three REG subunits have been isolated, REGalpha, REGbeta and REGgamma. REGalpha and REGbeta preferentially form a heteromeric complex with alternating alpha and beta subunits. The structure of the human REGalpha subunit reveals a heptameric barrel-shaped assembly containing a central channel. The binding of REG is thought to create a pore through which substrates and products can pass.
Ferritin is one of the major non-haem iron storage proteins in animals, plants, and microorganisms. Ferritin is a multisubunit protein with a hollow interior, which contains a mineral core of hydrated ferric oxide, thereby ensuring its solubility in an aqueous environment. Each subunit consists of a closed, four-helical bundle with a left-handed twist and one crossover connection.
This family contains ferritin and other ferritin-like proteins such as bacterioferritin (cytochrome b1) that binds haem between two subunits, non-haem ferritin, dodecameric ferritin homologue (DPS) that binds to and protects DNA, and the N-terminal domain of rubrerythrin that is found in many air-sensitive bacteria and archaea. In addition, ribonucleotide reductase-like proteins show a similar structure to the ferritin-like fold; these di-iron carboxylate proteins constitute a diverse class of non-haem iron enzymes performing a multitude of redox reactions. This family includes the alpha and beta subunits of methane monooxygenase hydroxylase, delta 9-stearoyl-acyl carrier protein desaturase and manganese catalase (T-catalase).
The twenty aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA (tRNA) molecule in a highly specific two-step reaction. All of these proteins fall into one of two classes comprised of ten enzymes each: class 1 (Arg, Cys, Glu, Gln, Ile, Leu, Met, Tyr, Trp and Val) and class 2 (Ala, Asn, Asp, Gly, His, Lys, Phe, Pro, Ser and Thr). Class 1 enzymes are mostly monomeric, and contain a characteristic Rossmann fold that binds the tRNA acceptor stem from the minor groove side, using two highly conserved sequences. In contrast, class 2 enzymes share an anti-parallel beta-sheet formation that binds to the major groove side of the acceptor stem. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c. Class 1a enzymes (Arg, Cys, Ile, Leu, Met, Val) possess an RNA-binding domain with an alpha-helix-bundle fold; the binding of the anticodon of tRNA to the RNA-binding domain induces a conformation change in the catalytic domain of the enzyme.
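The two-class partition of the synthetases can be written down as a small lookup table and checked for consistency. This is an illustrative sketch only: the names `CLASS_1`, `CLASS_2`, `SUBCLASS_1A`, `synthetase_class` and `check_partition` are hypothetical, and Asp is counted in class 2 so that each class has ten members.

```python
# Amino acid specificities of the aminoacyl-tRNA synthetases, by class.
CLASS_1 = {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
CLASS_2 = {"Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"}

# Subclass 1a: class 1 enzymes with an alpha-helix-bundle RNA-binding domain.
SUBCLASS_1A = {"Arg", "Cys", "Ile", "Leu", "Met", "Val"}


def synthetase_class(amino_acid: str) -> int:
    """Return 1 or 2 for the class of the synthetase charging this amino acid."""
    if amino_acid in CLASS_1:
        return 1
    if amino_acid in CLASS_2:
        return 2
    raise ValueError(f"unknown amino acid code: {amino_acid!r}")


def check_partition() -> bool:
    """Check that the two classes have ten members each, do not overlap,
    cover twenty amino acids, and that subclass 1a lies within class 1."""
    return (
        len(CLASS_1) == 10
        and len(CLASS_2) == 10
        and CLASS_1.isdisjoint(CLASS_2)
        and len(CLASS_1 | CLASS_2) == 20
        and SUBCLASS_1A <= CLASS_1
    )


if __name__ == "__main__":
    print("partition consistent:", check_partition())
    for aa in ("Val", "Phe"):
        print(aa, "-> class", synthetase_class(aa))
```

Sets are used rather than lists because membership and overlap checks are the only operations needed here.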
Acyl carrier protein (ACP) is an essential cofactor in the synthesis of fatty acids by the fatty acid synthetase systems in bacteria and plants. In addition to fatty acid synthesis, ACP is also involved in many other reactions that require acyl transfer steps, such as the synthesis of polyketide antibiotics, biotin precursor, membrane-derived oligosaccharides, and activation of toxins, and functions as an essential cofactor in lipoylation of pyruvate and alpha-ketoglutarate dehydrogenase complexes. Phosphopantetheine (or pantetheine 4' phosphate) is the prosthetic group of acyl carrier proteins (ACP) in some multienzyme complexes where it serves as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups. Phosphopantetheine is attached to a serine residue in these proteins. The core structure of ACP consists of a four-helical bundle, where helix three is shorter than the others.
Several other proteins share structural homology with ACP, such as the bacterial apo-D-alanyl carrier protein, which facilitates the incorporation of D-alanine into lipoteichoic acid by a ligase, necessary for the growth and development of Gram-positive organisms; and the thioester domain of the bacterial peptide carrier protein (PCP) found within large modular non-ribosomal peptide synthetases, which are responsible for the synthesis of a variety of microbial bioactive peptides.
The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.
This entry represents the N-terminal helical bundle domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain, which binds both the 7S RNA and the signal sequence. The extreme C-terminal region is glycine-rich, of lower complexity, and poorly conserved between species.
These proteins include Escherichia coli and Bacillus subtilis ffh protein (P48), which seems to be the prokaryotic counterpart of SRP54; signal recognition particle receptor alpha subunit (docking protein), an integral membrane GTP-binding protein which ensures, in conjunction with SRP, the correct targeting of nascent secretory proteins to the endoplasmic reticulum membrane; bacterial FtsY protein, which is believed to play a similar role to that of the docking protein in eukaryotes; the pilA protein from Neisseria gonorrhoeae, the homolog of ftsY; and bacterial flagellar biosynthesis protein flhF.
The precise function of the domain is unclear, but it may be involved in protein-protein interactions and may play a role in assembly or activity of multi-component complexes involved in transcriptional activation.
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleoside triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases.
In the absence of cAMP, Protein Kinase A (PKA) exists as an equimolar tetramer of regulatory (R) and catalytic (C) subunits. In addition to its role as an inhibitor of the C subunit, the R subunit anchors the holoenzyme to specific intracellular locations and prevents the C subunit from entering the nucleus. All R subunits have a conserved domain structure consisting of the N-terminal dimerization domain, inhibitory region, cAMP-binding domain A and cAMP-binding domain B. R subunits interact with C subunits primarily through the inhibitory site. The cAMP-binding domains show extensive sequence similarity and bind cAMP cooperatively.
Two types of regulatory (R) subunits exist - types I and II - which differ in molecular weight, sequence, autophosphorylation capability, cellular location and tissue distribution. Types I and II were further sub-divided into alpha and beta subtypes, based mainly on sequence similarity. This entry represents types I-alpha, I-beta, II-alpha and II-beta regulatory subunits of PKA proteins. These subunits contain the dimerisation interface and binding site for A-kinase-anchoring proteins (AKAPs).
Bacteriophage lambda C1 repressor controls the expression of viral genes as part of the lysogeny/lytic growth switch. C1 is essential for maintaining lysogeny, where the phage replicates non-disruptively along with the host. If the host cell is threatened, then lytic growth is induced. The Lambda C1 repressor consists of two domains connected by a linker: an N-terminal DNA-binding domain that also mediates interactions with RNA polymerase, and a C-terminal dimerisation domain. The DNA-binding domain consists of four helices in a closed folded leaf motif. Several different phage repressors from different helix-turn-helix families contain DNA-binding domains that adopt a similar topology. These include the Lambda Cro repressor, Bacteriophage 434 C1 and Cro repressors, P22 C2 repressor, and Bacteriophage Mu Ner protein.
The DNA-binding domain of Bacillus subtilis spore inhibition repressor SinR is identical to that of phage repressors. SinR represses sporulation, which only occurs in response to adverse conditions. This provides a possible evolutionary link between the two adaptive responses of bacterial sporulation and prophage induction.
Other DNA-binding domains also display similar structural folds to that of Lambda C1. These include bacterial regulators such as the purine repressor (PurR), the lactose repressor (LacR) and the fructose repressor (FruR), each of which has an N-terminal DNA-binding domain that exhibits a fold similar to that of Lambda C1, except that they lack the first helix. POU-specific domains found in transcription factors such as Oct-1, Pit-1 and Hepatocyte nuclear factor 1a (LFB1/HNF1) display four-helical fold DNA-binding domains similar to that of Lambda C1. The N-terminal domain of cyanase has an alpha-helix bundle motif similar to Lambda C1, but it probably does not bind DNA. Cyanase is an enzyme found in bacteria and plants that catalyses the reaction of cyanate with bicarbonate to produce ammonia and carbon dioxide in response to extracellular cyanate.
The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.
This entry represents the M domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7S RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich, lower in complexity and poorly conserved between species.
These proteins include Escherichia coli and Bacillus subtilis ffh protein (P48), which seems to be the prokaryotic counterpart of SRP54; signal recognition particle receptor alpha subunit (docking protein), an integral membrane GTP-binding protein which ensures, in conjunction with SRP, the correct targeting of nascent secretory proteins to the endoplasmic reticulum membrane; bacterial FtsY protein, which is believed to play a similar role to that of the docking protein in eukaryotes; the pilA protein from Neisseria gonorrhoeae, the homolog of ftsY; and bacterial flagellar biosynthesis protein flhF.
This entry represents the calponin-homology (CH) domain, a superfamily of actin-binding domains found in cytoskeletal proteins (which contain two CH domains in tandem repeat), in regulatory proteins from muscle, and in signal transduction proteins. This domain has a core structure consisting of a 4-helical bundle. This domain is found in:
The SWI/SNF family of complexes, which are conserved from yeast to humans, are ATP-dependent chromatin-remodelling proteins that facilitate transcription activation. The mammalian complexes are made up of 9-12 proteins called BAFs (BRG1-associated factors). The BAF60 family have at least three members: BAF60a, which is ubiquitous, BAF60b and BAF60c, which are expressed in muscle and pancreatic tissues, respectively. BAF60b is present in alternative forms of the SWI/SNF complex, including complex B (SWIB), which lacks BAF60a. The SWIB domain is a conserved region found within the BAF60b proteins, and can be found fused to the C-terminus of DNA topoisomerase in Chlamydia.
MDM2 is an oncoprotein that acts as a cellular inhibitor of the p53 tumour suppressor by binding to the transactivation domain of p53 and suppressing its ability to activate transcription. p53 acts in response to DNA damage, inducing cell cycle arrest and apoptosis. Inactivation of p53 is a common occurrence in neoplastic transformations. The core of MDM2 folds into an open bundle of four helices, which is capped by two small 3-stranded beta-sheets. It consists of a duplication of two structural repeats. MDM2 has a deep hydrophobic cleft on which the p53 alpha-helix binds; p53 residues involved in transactivation are buried deep within the cleft of MDM2, thereby concealing the p53 transactivation domain.
The SWIB and MDM2 domains are homologous and share a common fold.
In eukaryotes, glutathione S-transferases (GSTs) participate in the detoxification of reactive electrophilic compounds by catalysing their conjugation to glutathione. GST is found as a domain in S-crystallins from squid, and in proteins with no known GST activity, such as eukaryotic elongation factors 1-gamma and the HSP26 family of stress-related proteins, which include auxin-regulated proteins in plants and stringent starvation proteins in Escherichia coli. The major lens polypeptide of cephalopods is also a GST. Bacterial GSTs of known function often have a specific, growth-supporting role in biodegradative metabolism: epoxide ring opening and tetrachlorohydroquinone reductive dehalogenation are two examples of the reactions catalysed by these bacterial GSTs. Some regulatory proteins, like the stringent starvation proteins, also belong to the GST family. GST seems to be absent from Archaea, in which gamma-glutamylcysteine substitutes for glutathione as the major thiol.
Glutathione S-transferases form homodimers, but in eukaryotes can also form heterodimers of the A1 and A2 or YC1 and YC2 subunits. The homodimeric enzymes display a conserved structural fold. Each monomer is composed of a distinct N-terminal sub-domain, which adopts the thioredoxin fold, and a C-terminal all-helical sub-domain, which adopts a 4-helical bundle fold. This entry is the C-terminal domain.
Glutaredoxin 2 (Grx2), a glutathione-dependent disulphide oxidoreductase, is structurally similar to GSTs, even though the two lack any sequence similarity. Grx2 is also composed of N- and C-terminal subdomains. It is thought that the primary function of Grx2 is to catalyse reversible glutathionylation of proteins with glutathione in cellular redox regulation, including the response to oxidative stress. Grx2 is dissimilar to other glutaredoxins apart from containing the conserved active site residues.
Soluble N-ethylmaleimide-sensitive factor attachment protein receptor (SNARE) proteins are a family of membrane-associated proteins characterised by an alpha-helical coiled-coil domain called the SNARE motif. These proteins are classified as v-SNAREs and t-SNAREs based on their localisation on vesicle or target membrane; another classification scheme defines R-SNAREs and Q-SNAREs, based on the conserved arginine or glutamine residue in the centre of the SNARE motif. SNAREs are localised to distinct membrane compartments of the secretory and endocytic trafficking pathways, and contribute to the specificity of intracellular membrane fusion processes.
The t-SNARE domain consists of a 4-helical bundle with a coiled-coil twist. The SNARE motif contributes to the fusion of two membranes. SNARE motifs fall into four classes: homologues of syntaxin 1a (t-SNARE), VAMP-2 (v-SNARE), and the N- and C-terminal SNARE motifs of SNAP-25. It is thought that one member from each class interacts to form a SNARE complex.
The SNARE motif represented in this entry is found in the N-terminal domains of certain syntaxin family members: syntaxin 1a, which is required for neurotransmitter release; syntaxin 6, which is found in endosomal transport vesicles; yeast Sso1p; and Vam3p, a yeast syntaxin essential for vacuolar fusion. The SNARE motifs in these proteins share structural similarity, despite having a low level of sequence similarity.
Transcription factor S-II (TFIIS) is a eukaryotic protein which induces mRNA cleavage by enhancing the intrinsic nuclease activity of RNA polymerase (Pol) II, past template-encoded pause sites. TFIIS shows DNA-binding activity only in the presence of RNA polymerase II. It is widely distributed, being found in mammals, Drosophila, yeast and in the archaebacterium Sulfolobus acidocaldarius. S-II proteins have a relatively conserved C-terminal region but a variable N-terminal region, and some members of this family are expressed in a tissue-specific manner.
TFIIS is a modular factor that comprises an N-terminal domain I, a central domain II, and a C-terminal domain III. The weakly conserved domain I forms a four-helix bundle and is not required for TFIIS activity. Domain II forms a three-helix bundle, and domain III adopts a zinc-ribbon fold with a thin protruding beta-hairpin. Domain II and the linker between domains II and III are required for Pol II binding, whereas domain III is essential for stimulation of RNA cleavage. TFIIS extends from the polymerase surface via a pore to the internal active site, spanning a distance of 100 Angstroms. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.
This entry represents the conserved N-terminal domain found in the transcription elongation factors TFIIS, elongin A and CRSP70. The N-terminal domain in these transcription factors is conserved from yeast to man, and has a 4-helical bundle fold with a left-handed twist within a left-handed superhelix. Elongin A is a mammalian transcription elongation factor that forms the active subunit of the Elongin complex, which stimulates the rate of elongation by RNA polymerase II by suppressing the transient pausing of the polymerase at many sites along the DNA template. CRSP70 is an essential subunit of the CRSP complex, which is required for the activity of the enhancer-binding protein Sp1.
Cytochrome c oxidase is an oligomeric enzymatic complex that is a component of the respiratory chain complex and is involved in the transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane.
In eukaryotes, in addition to the three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there are a variable number of small polypeptide subunits. One of these subunits is the potentially haem-binding subunit, VIb, which is encoded in the nucleus.
Integration host factor (IHF) is a small heterodimeric protein that binds the minor groove of DNA in a sequence-specific manner and induces a large bend. This bending stabilises distinct DNA conformations that are required during several bacterial processes, such as recombination, transposition, replication and transcription. The core structure of IHF consists of a partly opened 4-helical bundle that is capped with a beta-sheet.
Prokaryotic protein HU and the bacteriophage SPO1 transcription factor TF1 are closely related to IHF. These proteins are collectively referred to as type II DNA-binding proteins (DBPII), forming a group of basic, dimeric proteins found in all bacteria that are able to bind DNA to induce and stabilise DNA bending. HU plays a structural role in replication initiation, transcription regulation, site-specific recombination, and the compaction of the bacterial genome. TF1 is essential for viral multiplication.
The DNA-binding domain of the TraM protein, an essential component of the DNA transfer machinery of the conjugative resistance plasmid R1, appears to have a similar structure to DBPII.
Sterile alpha motif (SAM) domains are known to be involved in diverse protein-protein interactions, associating with both SAM-containing and non-SAM-containing proteins. SAM domains exhibit a conserved structure, consisting of a 4-5-helical bundle of two orthogonally packed alpha-hairpins. However, SAM domains display a diversity of function, being involved in interactions with proteins, DNA and RNA. The name sterile alpha motif arose from its presence in proteins that are essential for yeast sexual differentiation. The SAM domain has had various names, including SPM, PTN (pointed), SEP (yeast sterility, Ets-related, PcG proteins), NCR (N-terminal conserved region) and HLH (helix-loop-helix) domain, all of which are related and can be classified as SAM domains.
SAM domains occur in eukaryotic and in some bacterial proteins. Structures have been determined for several proteins that contain SAM domains, including Ets-1 transcription factor, which plays a role in the development and invasion of tumour cells by regulating the expression of matrix-degrading proteases; Etv6 transcription factor, gene rearrangements of which have been demonstrated in several malignancies; EphA4 receptor tyrosine kinase, which is believed to be important for the correct localization of a motoneuron pool to a specific position in the spinal cord; EphB2 receptor, which is involved in spine morphogenesis via intersectin, Cdc42 and N-Wasp; p73, a p53 homologue involved in neuronal development; and polyhomeotic, which is a member of the Polycomb group of genes (Pc-G) required for the maintenance of the spatial expression pattern of homeotic genes.
In prokaryotes, RuvA, RuvB, and RuvC process the universal DNA intermediate of homologous recombination, termed the Holliday junction. The tetrameric DNA helicase RuvA specifically binds to the Holliday junction and facilitates the isomerization of the junction from the stacked folded configuration to the square-planar structure. In the RuvA tetramer, each subunit consists of three domains, I, II and III, where I and II form the major core that is responsible for Holliday junction binding and for the base pair rearrangements of the Holliday junction executed at the crossover point, whereas domain III regulates branch migration through direct contact with RuvB. Domain 2 has a SAM (sterile alpha motif)-like alpha bundle fold that occurs as a duplication containing two helix-hairpin-helix (HhH) motifs.
The C-terminal domain (CTD) of the excision repair protein UvrC shows structural similarity to RuvA domain 2. The CTD of UvrC is essential for 5' incision in the prokaryotic nucleotide excision repair process, and acts to mediate structure-specific binding to single-stranded-double-stranded junction DNA.
Domain 3 of NAD+-dependent DNA ligase consists of a duplication of two RuvA-like domains (four HhH motifs), and also contains a zinc-finger subdomain. DNA ligases catalyze the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilizing either ATP or NAD+ as a cofactor.
This entry represents an alpha-helical bundle domain, which has a SAM domain-like fold. This compact domain consists of a 4-5 helical bundle of two orthogonally packed alpha-hairpins, and contains one classic and one pseudo HhH (helix-hairpin-helix) motif. This domain is found at the N-terminus of the DNA repair protein Rad1, at the C-terminus of the transcription elongation protein NusA, and at the C-terminus of the hypothetical protein AF1548.
Human Rad51 protein is a homologue of Escherichia coli RecA protein, and functions in DNA repair and recombination. In higher eukaryotes, Rad51 protein is essential for cell viability. The N-terminal region of Rad51 is highly conserved among eukaryotic Rad51 proteins but is absent from RecA, suggesting a Rad51-specific function for this region. The N-terminal domain is involved in interactions with DNA and proteins; DNA binding may be regulated via phosphorylation within the N-terminal domain.
NusA (N utilisation substance A) from E. coli is an essential transcription factor that associates with the RNA polymerase (RNAP) core enzyme, where it modulates transcriptional pausing, termination and anti-termination. The C-terminal region of NusA consists of two repeat units, and is responsible for the interaction of NusA with the C-terminal region of RNAP, and for its interaction with protein N from phage lambda during anti-termination.
Mammalian DNA polymerase beta (polB) is a 39-kDa protein with both nucleotidyltransferase and 5'-deoxyribose phosphodiesterase activities, playing a role in both excision repair and meiosis. polB has a modular organisation with an 8-kDa N-terminal domain (NTD) connected to the 31-kDa C-terminal domain by a protease-hypersensitive hinge region. The NTD acts as a single-stranded DNA binding domain, interacting most efficiently with the 5'-phosphate of the downstream primer of the gapped DNA. This interaction is mediated by a helix-hairpin-helix motif (HhH), which is also found in several other DNA repair enzymes. The residue threonine 79 (T79), which is located within the NTD, was identified as being critical to polB function, even though it makes no contact with either DNA template or dNTP substrate; T79 is located between two HhH motifs, and acts as a hinge residue that is important for positioning the DNA within the active site.
The catalytic core (residues 148-242) of murine terminal deoxynucleotidyl transferase (TdT) displays a structural fold that is similar to polB, and shares a common two-metal ion mechanism of nucleotidyl transfer with polB. TdT elongates DNA strands in a template-independent manner, and belongs to the pol X family of polymerases. TdT has only been found in vertebrates, where it is highly conserved. TdT brings additional diversity in the immune repertoire by adding nucleotides, called N regions, to the V(D)J recombination junction sites of immunoglobulin and T-cell receptor genes.
The helix-hairpin-helix (HhH) motif is a domain of around 20 amino acids present in prokaryotic and eukaryotic non-sequence-specific DNA binding proteins. The HhH motif is similar to, but distinct from, the helix-turn-helix (HtH) and the helix-loop-helix (HLH) motifs. All three motifs have two helices (H1 and H2) connected by a short turn. DNA-binding proteins with a HhH structural motif are involved in non-sequence-specific DNA binding that occurs via the formation of hydrogen bonds between protein backbone nitrogens and DNA phosphate groups. These HhH motifs are observed in DNA repair enzymes and in DNA polymerases. By contrast, proteins with a HtH motif bind DNA in a sequence-specific manner through the binding of H2 with the major groove; these proteins are primarily gene regulatory proteins. DNA-binding proteins with the HLH structural motif are transcriptional regulatory proteins and are principally related to a wide array of developmental processes.
Examples of proteins that contain a HhH motif include the 5'-exonuclease domains of prokaryotic DNA polymerases, the eukaryotic/prokaryotic RAD2 family of 5'-3' exonucleases such as T4 RNase H and T5, eukaryotic 5' endonucleases such as FEN-1 (Flap), and some viral exonucleases.
The HRDC (helicase and RNaseD C-terminal) domain is comprised of two orthogonally packed alpha-hairpin subdomains, and is involved in interactions with DNA and protein.
The HRDC (helicase and RNaseD C-terminal) domain is found at the C terminus of many RecQ helicases, including the human Werner and Bloom syndrome proteins. RecQ helicases have been shown to unwind DNA in an ATP-dependent manner. The structure of the HRDC domain consists of a 4-5 helical bundle of two orthogonally packed alpha-hairpins, and as such it resembles auxiliary domains in bacterial DNA helicases and other proteins that interact with nucleic acids. A positively charged region on the surface of the HRDC domain is able to interact with DNA.
The HRDC domain is also present in eukaryotic and archaeal RNA polymerase II subunit RPB4, the N-terminal region of which forms a heterodimerisation alpha-hairpin.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
This entry represents the alpha and beta subunits found in the F1, V1, and A1 complexes of F-, V- and A-ATPases, respectively (sometimes called the A and B subunits in V- and A-ATPases). The F-ATPases (or F1F0-ATPases), V-ATPases (or V1V0-ATPases) and A-ATPases (or A1A0-ATPases) are composed of two linked complexes: the F1, V1 or A1 complex contains the catalytic core that synthesizes/hydrolyses ATP, and the F0, V0 or A0 complex forms the membrane-spanning pore. The F-, V- and A-ATPases all contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis.
In F-ATPases, there are three copies each of the alpha and beta subunits that form the catalytic core of the F1 complex, while the remaining F1 subunits (gamma, delta, epsilon) form part of the stalks. There is a substrate-binding site on each of the alpha and beta subunits, those on the beta subunits being catalytic, while those on the alpha subunits are regulatory. The alpha and beta subunits form a cylinder that is attached to the central stalk. The alpha/beta subunits undergo a sequence of conformational changes leading to the formation of ATP from ADP. These changes are induced by the rotation of the gamma subunit, which is itself driven by the movement of protons through the F0 complex C subunit.
In V- and A-ATPases, the alpha/A and beta/B subunits of the V1 or A1 complex are homologous to the alpha and beta subunits in the F1 complex of F-ATPases, except that the alpha subunit is catalytic and the beta subunit is regulatory.
The alpha/A and beta/B subunits can each be divided into three regions, or domains, centred around the ATP-binding pocket, and based on structure and function, where the central region is the nucleotide-binding domain. This entry represents the C-terminal domain of the alpha/A/beta/B subunits, which forms a left-handed superhelix composed of 4-5 individual helices. The C-terminal domain can vary between the alpha and beta subunits, and between different ATPases.
More information about this protein can be found at Protein of the Month: ATP Synthases.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.
This family represents subunits called delta in bacterial and chloroplast ATPase, or OSCP (oligomycin sensitivity conferral protein) in mitochondrial ATPase (note that in mitochondria there is a different delta subunit). The OSCP/delta subunit appears to be part of the peripheral stalk that holds the F1 complex alpha3beta3 catalytic core stationary against the torque of the rotating central stalk, and links subunit A of the F0 complex with the F1 complex. In mitochondria, the peripheral stalk consists of OSCP, as well as F0 components F6, B and D. In bacteria and chloroplasts the peripheral stalks have different subunit compositions: delta and two copies of F0 component B (bacteria), or delta and F0 components B and B' (chloroplasts).
The splicing factor Prp18 is required for the second step of pre-mRNA splicing. Prp18 appears to be primarily associated with the U5 snRNP.
The structure of a large fragment of the Saccharomyces cerevisiae Prp18 is known. This fragment is fully active in yeast splicing in vitro and includes the sequences of Prp18 that have been evolutionarily conserved. The core structure consists of five alpha-helices that adopt a novel fold. The most highly conserved region of Prp18, a nearly invariant stretch of 19 aa, forms part of a loop between two alpha-helices and may interact with the U5 small nuclear ribonucleoprotein particles.
Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles, and regulate cyclin dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins, G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E.
Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi's sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus.
This domain is also found as the core domain in transcription factor IIB (TFIIB) and in the retinoblastoma tumour suppressor.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein S7 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S7 is known to bind directly to part of the 3' end of 16S ribosomal RNA. It belongs to a family of ribosomal proteins which have been grouped on the basis of sequence similarities. The structure of S7 is known.
The Escherichia coli DNA polymerase III gamma complex clamp loader assembles the ring-shaped beta sliding clamp onto DNA. The core polymerase is tethered to the template by beta, enabling processive replication of the genome. The E. coli clamp loader complex contains five different subunits, but clamp loading requires only three of these: gamma, delta and delta'. Three gamma subunits, and one each of delta and delta', are arranged in a circle. Each subunit adopts the same chain topology, and folds into three domains. However, the relative orientation of these domains is different for each subunit. The carboxy-terminal domains provide the major subunit contacts of the pentamer, although other intersubunit contacts are present. The amino-terminal domains do not form a continuous circle. These domains are arranged in a highly asymmetric fashion, and appear to dangle under the carboxy-terminal pentamer 'umbrella'.
Carbamoyl phosphate synthase (CPSase) is a heterodimeric enzyme composed of a small and a large subunit (with the exception of CPSase III, see below). CPSase catalyses the synthesis of carbamoyl phosphate from bicarbonate, ATP and glutamine or ammonia, and represents the first committed step in pyrimidine and arginine biosynthesis in prokaryotes and eukaryotes, and in the urea cycle in most terrestrial vertebrates. CPSase has three active sites, one in the small subunit and two in the large subunit. The small subunit contains the glutamine binding site and catalyses the hydrolysis of glutamine to glutamate and ammonia. The large subunit has two homologous carboxy phosphate domains, both of which have ATP-binding sites; however, the N-terminal carboxy phosphate domain catalyses the phosphorylation of bicarbonate, while the C-terminal domain catalyses the phosphorylation of the carbamate intermediate. The carboxy phosphate domain found duplicated in the large subunit of CPSase is also present as a single copy in the biotin-dependent enzymes acetyl-CoA carboxylase (ACC), propionyl-CoA carboxylase (PCCase), pyruvate carboxylase (PC) and urea carboxylase.
Most prokaryotes carry one form of CPSase that participates in both arginine and pyrimidine biosynthesis; however, certain bacteria can have separate forms. The large subunit in bacterial CPSase has four structural domains: the carboxy phosphate domain 1, the oligomerisation domain, the carbamoyl phosphate domain 2 and the allosteric domain. CPSase heterodimers from Escherichia coli contain two molecular tunnels: an ammonia tunnel and a carbamate tunnel. These inter-domain tunnels connect the three distinct active sites, and function as conduits for the transport of unstable reaction intermediates (ammonia and carbamate) between successive active sites. The catalytic mechanism of CPSase involves the diffusion of carbamate through the interior of the enzyme from the site of synthesis within the N-terminal domain of the large subunit to the site of phosphorylation within the C-terminal domain.
Eukaryotes have two distinct forms of CPSase: a mitochondrial enzyme (CPSase I) that participates in both arginine biosynthesis and the urea cycle; and a cytosolic enzyme (CPSase II) involved in pyrimidine biosynthesis. CPSase II occurs as part of a multi-enzyme complex along with aspartate transcarbamoylase and dihydroorotase; this complex is referred to as the CAD protein. The hepatic expression of CPSase is transcriptionally regulated by glucocorticoids and/or cAMP. There is a third form of the enzyme, CPSase III, found in fish, which uses glutamine as a nitrogen source instead of ammonia. CPSase III is closely related to CPSase I, and is composed of a single polypeptide that may have arisen from gene fusion of the glutaminase and synthetase domains.
This entry represents the oligomerisation domain found in the large subunit of carbamoyl phosphate synthases as well as in certain other carboxy phosphate domain-containing enzymes.
This entry represents the ribosomal protein L19 from eukaryotes, as well as L19e from archaea. L19/L19e is absent in bacteria. L19/L19e is part of the large ribosomal subunit, whose structure has been determined in a number of eukaryotic and archaeal species. L19/L19e is a multi-helical protein consisting of two different 3-helical domains connected by a long, partly helical linker.
DNA glycosylases act to repair oxidative damage in DNA. These proteins are redundant, as there are several different types of DNA glycosylases that are able to compensate for one another. Examples include the endonuclease III subfamily, the mismatch glycosylases subfamily, the 3-methyladenine DNA glycosylases I subfamily, and the DNA repair glycosylases subfamily.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.
Structurally, an alpha-helix-bundle anticodon-binding domain characterises the class Ia synthetases, whereas the class Ib synthetases, GlnRS and GluRS, have distinct anticodon-binding domains. The Rossmann-fold and anticodon-binding domains are connected by a beta-alpha-alpha-beta-alpha topology ('SC fold') domain that contains the class I specific KMSKS motif.
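The class assignment described above can be captured as a small lookup table. The following is a minimal sketch, not part of the original entry: the three-letter amino acid codes, the constant names and the helper `synthetase_class` are illustrative assumptions, and the sets simply transcribe the two lists given in the text.

```python
# Class assignment of the 20 aminoacyl-tRNA synthetases, transcribed from the text.
# Three-letter codes and all names below are illustrative assumptions.
CLASS_I = {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
CLASS_II = {"Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"}

# tRNA hydroxyl to which the aminoacyl group is preferentially coupled, per class.
PREFERRED_HYDROXYL = {"I": "2'-OH", "II": "3'-OH"}


def synthetase_class(amino_acid):
    """Return 'I' or 'II' for a three-letter amino acid code (hypothetical helper)."""
    code = amino_acid.strip().capitalize()
    if code in CLASS_I:
        return "I"
    if code in CLASS_II:
        return "II"
    raise ValueError(f"unknown amino acid code: {amino_acid!r}")


if __name__ == "__main__":
    for aa in ("Trp", "Ser"):
        cls = synthetase_class(aa)
        print(aa, "-> class", cls, "preferred site", PREFERRED_HYDROXYL[cls])
```

The two sets are disjoint and together cover all 20 standard amino acids, which mirrors the statement that the 20 synthetases fall into exactly two classes.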
Ribonucleotide reductase (RNR), of which R1 is the large subunit, is an essential enzyme required for DNA replication and DNA repair. In both Escherichia coli and higher organisms, the enzyme consists of two non-identical subunits, a dimer of an 85-kDa protein, R1, and a dimer of a 45-kDa protein, R2. Both subunits are essential for RNR enzyme activity: R1 contains, in the substrate binding site, the reducing active cysteine pair, and R2 provides a catalytically essential organic radical. R1 is able to bind and reduce the four common ribonucleoside diphosphates. Substrate specificity is determined by nucleoside triphosphates binding to a protein site different from the active site and acting as allosteric effectors. Thus the presence of ATP makes the enzyme reduce CDP and UDP, dGTP favours ADP reduction and dTTP favours GDP reduction. dATP is a general inhibitor. This provides a mechanism for a balanced enzymatic production of building blocks for DNA synthesis.
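The allosteric specificity rules just described amount to a mapping from effector to favoured substrates. The sketch below encodes only what the text states; the names `SPECIFICITY`, `GENERAL_INHIBITOR` and `favoured_substrates` are hypothetical, and the model ignores concentrations and kinetics entirely.

```python
from typing import Tuple

# Effector bound at the specificity site -> substrates whose reduction is favoured.
# Transcribed from the text; the names are illustrative assumptions.
SPECIFICITY = {
    "ATP": ("CDP", "UDP"),
    "dGTP": ("ADP",),
    "dTTP": ("GDP",),
}

# dATP is described in the text as a general inhibitor.
GENERAL_INHIBITOR = "dATP"


def favoured_substrates(effector: str) -> Tuple[str, ...]:
    """Return the substrates favoured by an effector; empty tuple for the inhibitor."""
    if effector == GENERAL_INHIBITOR:
        return ()
    if effector not in SPECIFICITY:
        raise ValueError(f"effector not described in the text: {effector!r}")
    return SPECIFICITY[effector]


if __name__ == "__main__":
    for eff in ("ATP", "dGTP", "dTTP", "dATP"):
        print(eff, "->", favoured_substrates(eff) or "inhibition")
```

Taken together, the three activating effectors cover all four ribonucleoside diphosphates, which is the basis for the balanced production of DNA building blocks mentioned in the text.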
This entry represents a multi-helical domain composed of two all-alpha subdomains that is found as the C-terminal domain in cryptochrome proteins, as well as at the C-terminal of DNA photolyase, where it acts as a FAD-binding domain (the N-terminal of DNA photolyase binds a light-harvesting cofactor).
Photolyases and cryptochromes are related flavoproteins that bind FAD. Photolyases harness the energy of blue light to repair DNA damage by removing pyrimidine dimers. Cryptochromes (CRY1 and CRY2) are blue light photoreceptors that mediate blue light-induced gene expression.
DNA photolyases are DNA repair enzymes that repair mismatched pyrimidine dimers induced by exposure to ultra-violet light. They bind to UV-damaged DNA containing pyrimidine dimers and, upon absorbing a near-UV photon (300 to 500 nm), they catalyse dimer splitting, breaking the cyclobutane ring joining the two pyrimidines of the dimer so as to split them into the constituent monomers; this process is called photoreactivation. DNA photolyases require two chromophore cofactors for their activity. All monomers contain a reduced FAD moiety, and, in addition, either a reduced pterin or 8-hydroxy-5-deazaflavin as a second chromophore. Either chromophore may act as the primary photon acceptor, peak absorptions occurring in the blue region of the spectrum and in the UV-B region, at a wavelength around 290 nm.
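To give a sense of scale for the 300 to 500 nm photons mentioned above, the photon energy follows from E = hc/λ. The snippet below is a simple worked calculation using the exact SI values of the constants; the function name `photon_energy_ev` is an illustrative assumption and nothing here is specific to photolyase itself.

```python
# Photon energy from wavelength, E = h * c / wavelength.
# Constants are the exact SI defining values.
PLANCK_J_S = 6.62607015e-34        # Planck constant, J s
LIGHT_SPEED_M_S = 299_792_458.0    # speed of light, m/s
JOULES_PER_EV = 1.602176634e-19    # one electronvolt in joules


def photon_energy_ev(wavelength_nm: float) -> float:
    """Return the energy in eV of a photon with the given wavelength in nm."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m / JOULES_PER_EV


if __name__ == "__main__":
    for nm in (300, 500):
        print(f"{nm} nm -> {photon_energy_ev(nm):.2f} eV")
```

Across the quoted range the photon energy spans roughly 2.5 eV (500 nm) to 4.1 eV (300 nm).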
6-phosphogluconate dehydrogenase catalyses the oxidative decarboxylation of 6-phosphogluconate to ribulose 5-phosphate with the concomitant reduction of NADP to NADPH. The metazoan 6PGDHs have a well-conserved glycine-serine rich sequence at the C-terminus, which is lacking from bacterial enzymes and from those of the parasitic protozoan Trypanosoma brucei. The active dimer of the mammalian enzyme assembles with the C-terminal tail of one subunit threaded through the other, forming part of the substrate-binding site. The tail of T. brucei 6PGDH is shorter than that of the mammalian enzyme and its terminal residues associate tightly with the second monomer. The three-dimensional structure shows this generates additional interactions between the subunits close to the active site; the coenzyme-binding domain is thereby associated more tightly with the helical domain. Three residues, conserved in all other known sequences, are important in creating a salt bridge between monomers close to the substrate-binding site.
This domain is structurally similar to domains found in several different families, including those represented by mannitol 2-dehydrogenase, acetohydroxy acid isomeroreductase, short chain L-3-hydroxyacyl CoA dehydrogenase, UDP-glucose/GDP-mannose dehydrogenase (dimerisation domain), N-(1-D-carboxyethyl)-L-norvaline dehydrogenase, glycerol-3-phosphate dehydrogenase, and ketopantoate reductase (PanE).
Protein prenyltransferases catalyze the transfer of the carbon moiety of C15 farnesyl pyrophosphate or geranylgeranyl pyrophosphate to a conserved cysteine residue in a CaaX motif of protein and peptide substrates. The addition of a farnesyl group is required to anchor proteins to the cell membrane. In the 3D structure of a mammalian Ras farnesyltransferase (FTase), both subunits are largely composed of alpha-helices. The alpha-2 to alpha-15 helices in the alpha subunit fold into a novel helical hairpin structure, resulting in a crescent-shaped domain that envelops part of the subunit. The 12 helices of the beta-subunit form an alpha-alpha barrel. Six additional helices connect the inner core of helices and form the outside of the helical barrel. A deep cleft surrounded by hydrophobic amino acids in the centre of the barrel is proposed as the FPP-binding pocket. A single Zn2+ ion is located at the junction between the hydrophilic surface groove near the subunit interface
Terpenoid cyclases such as squalene cyclase, pentalenene synthase, 5-epi-aristolochene synthase, and trichodiene synthase are responsible for the synthesis of cholesterol, a hydrocarbon precursor of the pentalenolactone family of antibiotics, a precursor of the antifungal phytoalexin capsidiol, and the precursor of antibiotics and mycotoxins, respectively. The structures of these enzymes share a feature referred to as the 'terpenoid synthase fold', with 10-12 mostly antiparallel alpha-helices, which is also observed in protein prenyltransferases. The high structural similarity supports the hypothesis that the three families of prenyltransferases are evolutionarily related despite their low sequence similarity.
Citrate synthase is a member of a small family of enzymes that can directly form a carbon-carbon bond without the presence of metal ion cofactors. It catalyses the first reaction in the Krebs cycle, namely the conversion of oxaloacetate and acetyl-coenzyme A into citrate and coenzyme A. This reaction is important for energy generation and for carbon assimilation. The reaction proceeds via a non-covalently bound citryl-coenzyme A intermediate in a 2-step process (aldol-Claisen condensation followed by the hydrolysis of citryl-CoA).
Citrate synthase enzymes are found in two distinct structural types: type I enzymes (found in eukaryotes, Gram-positive bacteria and archaea) form homodimers and have shorter sequences than type II enzymes, which are found in Gram-negative bacteria and are hexameric in structure. In both types, the monomer is composed of two domains: a large alpha-helical domain consisting of two structural repeats, where the second repeat is interrupted by a small alpha-helical domain. The cleft between these domains forms the active site, where both citrate and acetyl-coenzyme A bind. The enzyme undergoes a conformational change upon binding of the oxaloacetate ligand, whereby the active site cleft closes over in order to form the acetyl-CoA binding site. The energy required for domain closure comes from the interaction of the enzyme with the substrate. Type II enzymes possess an extra N-terminal beta-sheet domain, and some type II enzymes are allosterically inhibited by NADH.
This entry represents the core of type I and II citrate synthase enzymes, comprising both the large and small alpha-helical domains. In addition, this entry represents the related enzymes 2-methylcitrate synthase and ATP citrate synthase. 2-methylcitrate synthase catalyses the conversion of oxaloacetate and propanoyl-CoA into (2R,3S)-2-hydroxybutane-1,2,3-tricarboxylate and coenzyme A. This enzyme is induced during bacterial growth on propionate, while type II hexameric citrate synthase is constitutive. ATP citrate synthase (also known as ATP citrate lyase) catalyses the MgATP-dependent, CoA-dependent cleavage of citrate into oxaloacetate and acetyl-CoA, a key step in the reductive tricarboxylic acid pathway of CO2 assimilation used by a variety of autotrophic bacteria and archaea to fix carbon dioxide. ATP citrate synthase is composed of two distinct subunits. In eukaryotes, ATP citrate synthase is a homotetramer of a single large polypeptide, and is used to produce cytosolic acetyl-CoA from mitochondrial-produced citrate.
Ribosomal protein L7/12 consists of two domains that are connected by a flexible region. The N-terminal domain is required for dimer formation and for anchoring the protein to the ribosome by binding to ribosomal protein L10, while the C-terminal domain is required for binding translation factors.
This entry represents type 2 phosphatidic acid phosphatase (PAP2) enzymes, such as phosphatidylglycerophosphatase B from Escherichia coli. PAP2 enzymes have a core structure consisting of a 5-helical bundle, where the beginning of the third helix binds the cofactor. PAP2 enzymes catalyse the dephosphorylation of phosphatidate, yielding diacylglycerol and inorganic phosphate. In eukaryotic cells, PAP activity has a central role in the synthesis of phospholipids and triacylglycerol through its product diacylglycerol, and it also generates and/or degrades lipid-signalling molecules that are related to phosphatidate.
Other related enzymes have a similar core structure, including haloperoxidases such as bromoperoxidase (contains one core bundle, but forms a dimer) and chloroperoxidases (contain two core bundles arranged as in other family dimers), bacitracin transport permease from Bacillus licheniformis, and glucose-6-phosphatase from rat. The vanadium-dependent haloperoxidases exclusively catalyse the oxidation of halides, and act as histidine phosphatases, using histidine for the nucleophilic attack in the first step of the reaction. Amino acid residues involved in binding phosphate/vanadate are conserved between the two families, supporting a proposal that vanadium passes through a tetrahedral intermediate during the reaction mechanism.
Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of the newly synthesised DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.
MutS is a modular protein with a complex structure, and is composed of:
Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.
This entry represents the core domain (domain 3) found in proteins of the MutS family. The core domain of MutS adopts a multi-helical structure comprised of two subdomains, which are interrupted by the clamp domain. Two of the helices in the core domain comprise the levers that extend towards the DNA.
Proteins containing a RhoGAP (Rho GTPase Activating Protein) domain usually function to catalyze the hydrolysis of GTP that is bound to Rho, Rac and/or Cdc42, inactivating these regulators of the actin cytoskeleton. The 53 known human RhoGAP domain-containing proteins are the largest known group of Rho GTPase regulators and significantly outnumber the 21 Rho GTPases they presumably regulate. This excess of GAP proteins probably indicates complex regulation of the Rho GTPases and is consistent with the existence of almost as many (48) human Dbl domain-containing Rho GEFs that act antagonistically to the RhoGAP proteins by activating the Rho GTPases. Phylogenetic analysis offers evidence for frequent domain duplication and for duplication of the entire genes containing these GAP domains.
This entry represents a structural domain with an armadillo (ARM)-like fold, consisting of a multi-helical fold comprised of two curved layers of alpha helices arranged in a regular right-handed superhelix, where the repeats that make up this structure are arranged about a common axis. These superhelical structures present an extensive solvent-accessible surface that is well suited to binding large substrates such as proteins and nucleic acids. Domains and repeats with an ARM-like fold have been found in a number of proteins, including:
The sequence similarity among these different repeats or domains is low; however, they exhibit considerable structural similarity. Furthermore, the number of repeats present in the superhelical structure can vary between orthologues, indicating that rapid loss/gain of repeats has occurred frequently in evolution. A common phylogenetic origin has been proposed for the armadillo and HEAT repeats.
The ankyrin repeat is one of the most common protein-protein interaction motifs in nature. Ankyrin repeats are tandemly repeated modules of about 33 amino acids. They occur in a large number of functionally diverse proteins, mainly from eukaryotes. The few known examples from prokaryotes and viruses may be the result of horizontal gene transfers. The repeat has been found in proteins of diverse function such as transcriptional initiators, cell-cycle regulators, cytoskeletal proteins, ion transporters and signal transducers. The ankyrin fold appears to be defined by its structure rather than its function, since there is no specific sequence or structure which is universally recognised by it.
The conserved fold of the ankyrin repeat unit is known from several crystal and solution structures. Each repeat folds into a helix-loop-helix structure with a beta-hairpin/loop region projecting out from the helices at a 90° angle. The repeats stack together to form an L-shaped structure.
The 14-3-3 proteins are a large family of approximately 30 kDa acidic proteins which exist primarily as homo- and heterodimers within all eukaryotic cells. There is a high degree of sequence identity and conservation between all the 14-3-3 isotypes, particularly in the regions which form the dimer interface or line the central ligand binding channel of the dimeric molecule. Each 14-3-3 protein sequence can be roughly divided into three sections: a divergent amino terminus, the conserved core region and a divergent carboxyl terminus. The conserved middle core region of the 14-3-3s encodes an amphipathic groove that forms the main functional domain, a cradle for interacting with client proteins. The monomer consists of nine helices organised in an antiparallel manner, forming an L-shaped structure. The interior of the L-structure is composed of four helices: H3 and H5, which contain many charged and polar amino acids, and H7 and H9, which contain hydrophobic amino acids. These four helices form the concave amphipathic groove that interacts with target peptides.
14-3-3 proteins mainly bind proteins containing phosphothreonine or phosphoserine motifs; however, exceptions to this rule do exist. Extensive investigation of the 14-3-3 binding site of the mammalian serine/threonine kinase Raf-1 has produced a consensus sequence for 14-3-3-binding, RSxpSxP (in the single-letter amino-acid code, where x denotes any amino acid and p indicates that the next residue is phosphorylated). 14-3-3 proteins appear to effect intracellular signalling in one of three ways: by direct regulation of the catalytic activity of the bound protein; by regulating interactions between the bound protein and other molecules in the cell by sequestration or modification; or by controlling the subcellular localisation of the bound ligand. Proteins appear to initially bind to a single dominant site and then subsequently to many, much weaker secondary interaction sites. The 14-3-3 dimer is capable of changing the conformation of its bound ligand whilst itself undergoing minimal structural alteration.
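The RSxpSxP consensus can be used to scan a protein sequence for candidate 14-3-3 binding sites. The sketch below is illustrative only: a plain sequence carries no phosphorylation information, so it reports positions where the residues R-S-x-S-x-P occur and the second serine could be the phosphorylated one. The function name, the pattern constant and the toy sequence are assumptions, not data from the entry.

```python
import re

# R-S-x-S-x-P, where x is any residue. The lookahead allows overlapping hits.
# Phosphorylation of the second serine cannot be read from sequence alone,
# so matches are only candidate sites.
CANDIDATE_MOTIF = re.compile(r"(?=(RS[A-Z]S[A-Z]P))")


def candidate_1433_sites(sequence):
    """Return (1-based position of the candidate phosphoserine, motif) pairs."""
    seq = sequence.upper()
    # m.start() is the 0-based index of R; the serine of interest is 3 residues on,
    # so its 1-based position is m.start() + 4.
    return [(m.start() + 4, m.group(1)) for m in CANDIDATE_MOTIF.finditer(seq)]


if __name__ == "__main__":
    toy_sequence = "AARSASTPGG"  # made-up sequence for illustration
    print(candidate_1433_sites(toy_sequence))
```

Because the text notes that exceptions to the consensus exist, a scan like this would miss non-canonical sites and should be read as a first-pass filter rather than a predictor.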
This entry represents domains with a multi-helical, alpha-alpha 2-layered structural fold as found in: the ENTH domain of Epsin; the VHS domain of Hrs, Tom1, and ADP-ribosylation factors; the RPR domain of PCF11 protein; and the N-terminal domain of phosphoinositide-binding clathrin adaptor.
The epsin NH2-terminal homology (ENTH) domain is a membrane interacting module composed of a superhelix of alpha-helices. It is present at the NH2-terminus of proteins that often contain consensus sequences for binding to clathrin coat components and their accessory factors, and therefore function as endocytic adaptors. ENTH domain containing proteins have additional roles in signalling and actin regulation and may have yet other actions in the nucleus. The ENTH domain is structurally similar to the VHS domain.
The ENTH domain is approximately 150 amino acids long. The ENTH domain forms a compact globular structure, composed of eight alpha-helices connected by loops of varying length. Three helical hairpins that are stacked consecutively with a right-handed twist determine the general topology of the domain. This stacking gives the ENTH domain a rectangular appearance when viewed face on. The most highly conserved amino acids fall roughly into two classes: internal residues that are involved in packing and therefore are necessary for structural integrity, and solvent accessible residues that may be involved in protein-protein interactions.
VHS domains are found at the N-termini of select proteins involved in intracellular membrane trafficking. The domain consists of eight helices arranged in a superhelix. The surface of the domain has two main features: a basic patch on one side due to several conserved positively charged residues on helix 3 and a negatively charged ridge on the opposite side, formed by residues on helix 2. Comparison of the two VHS domains and the ENTH domain reveals a conserved surface, composed of helices 2 and 4, that is utilised for protein-protein interactions. In addition, VHS domain-containing proteins are also often localized to membranes. It has therefore been suggested that the conserved positively charged surface of helix 3 in VHS and ENTH domains plays a role in membrane binding.
The enzymes belonging to this family, which include phospholipase C and P1 nuclease, are involved in phosphate ester hydrolysis and contain a triad of closely spaced zinc ions at their active centres. Both groups of enzymes hydrolyse phosphodiesters. Substrates for phospholipase C are phosphatidylinositol and phosphatidylcholine, while P1 nuclease is an endonuclease hydrolysing single-stranded ribo- and deoxyribonucleotides. P1 nuclease also has activity as a phosphomonoesterase against 3'-terminal phosphates of nucleotides. The Zn ions in both enzymes form almost identical trinuclear sites.
The enzyme L-aspartate ammonia-lyase (aspartase) catalyses the reversible deamination of the amino acid L-aspartic acid, using a carbanion mechanism to produce fumaric acid and ammonium ion. Aspartases from different organisms show high sequence homology, and this homology extends to functionally related enzymes such as the class II fumarases, the argininosuccinate and adenylosuccinate lyases. The high-resolution structure of aspartase reveals a monomer that is composed of three domains oriented in an elongated S-shape. The central domain, composed of five helices, provides the subunit contacts in the functionally active tetramer. The active sites are located in clefts between the subunits, and structural and mutagenic studies have identified several of the active site functional groups. A separate regulatory site has been identified. The substrate, aspartic acid, can also play the role of an activator, binding at this site along with a required divalent metal ion.
Terpenoid cyclases catalyze remarkably complex cyclisation cascades that are initiated by the formation of a highly reactive carbocation in a polyisoprene substrate. The pathways of monoterpene, sesquiterpene, and diterpene biosynthesis are conveniently divided into several stages. The first encompasses the synthesis of isopentenyl diphosphate, isomerization to dimethylallyl diphosphate, prenyltransferase-catalysed condensation of these two C5-units to geranyl diphosphate (GDP), and the subsequent 1'-4 additions of isopentenyl diphosphate to generate farnesyl (FDP) and geranylgeranyl (GGDP) diphosphate. In the second stage, the prenyl diphosphates undergo a range of cyclisations based on variations on the same mechanistic theme to produce the parent skeletons of each class. Thus, GDP (C10) gives rise to monoterpenes, FDP (C15) to sesquiterpenes, and GGDP (C20) to diterpenes. These transformations catalysed by the terpenoid synthases (cyclases) may be followed by a variety of redox modifications of the parent skeletal types to produce the many thousands of different terpenoid metabolites of the essential oils, turpentines, and resins of plant origin. Terpenoid synthase enzymes provide a template for binding and stabilizing the flexible substrate in the precise orientation required for catalysis, trigger carbocation formation, chaperone the conformations of the reactive carbocation intermediates through a unique cyclisation sequence, and sequester and stabilize carbocations from premature quenching.
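The relationship between the number of C5 units, the prenyl diphosphate precursor and the resulting terpene class can be written out as simple arithmetic. This is a minimal sketch of the bookkeeping described above; the names `PRECURSORS` and `describe_precursor` are hypothetical and the table covers only the three cases named in the text.

```python
# Each isoprene building block (isopentenyl or dimethylallyl diphosphate) adds 5 carbons.
ISOPRENE_CARBONS = 5

# Number of C5 units -> (prenyl diphosphate precursor, terpene class), from the text.
PRECURSORS = {
    2: ("GDP", "monoterpenes"),
    3: ("FDP", "sesquiterpenes"),
    4: ("GGDP", "diterpenes"),
}


def describe_precursor(c5_units):
    """Return (precursor, carbon count, terpene class) for a number of C5 units."""
    if c5_units not in PRECURSORS:
        raise ValueError(f"no precursor described for {c5_units} C5 units")
    precursor, terpene_class = PRECURSORS[c5_units]
    return precursor, c5_units * ISOPRENE_CARBONS, terpene_class


if __name__ == "__main__":
    for units in sorted(PRECURSORS):
        name, carbons, product = describe_precursor(units)
        print(f"{units} x C5 -> {name} (C{carbons}) -> {product}")
```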
Partially folded polypeptide chains, either newly made by ribosomes or emerging from mature proteins unfolded by stress, run the risk of aggregating with one another to the detriment of the organism. Folding of newly synthesised polypeptides in the crowded cellular environment requires the assistance of molecular chaperone proteins, such as the large bacterial chaperonins GroEL and GroES.
GroEL and GroES prevent aggregation by encapsulating individual chains within the so-called 'Anfinsen cage' provided by the GroEL-GroES complex, where they can fold in isolation from one another. GroEL consists of two heptameric rings of identical ATPase subunits stacked back to back, containing a cage in each ring. Each subunit consists of three domains. The equatorial domain contains the nucleotide binding site and is connected by a flexible intermediate domain with the apical domain. The latter presents several hydrophobic amino-acid side chains at the top of the ring, orientated towards the cavity of the cage. These side chains are involved in binding either a partially folded polypeptide chain or a single molecule of GroES.
The assembly of proteins has been thought to be the sole result of properties inherent in the primary sequence of polypeptides themselves. In some cases, however, structural information from other protein molecules is required for correct folding and subsequent assembly into oligomers. These 'helper' molecules are referred to as molecular chaperones, a subfamily of which are the chaperonins, which include 10 kDa and 60 kDa proteins. These are found in abundance in prokaryotes, chloroplasts and mitochondria. They are required for normal cell growth (as demonstrated by the fact that no temperature sensitive mutants for the chaperonin genes can be found in the temperature range 20 to 43 degrees centigrade), and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions.
The 10 kDa chaperonin (cpn10 - or groES in bacteria) exists as a ring-shaped oligomer of between 6 and 8 identical subunits, whereas the 60 kDa chaperonin (cpn60 - or groEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The cpn10 and cpn60 oligomers also require Mg2+-ATP in order to interact to form a functional complex, although the mechanism of this interaction is as yet unknown. This chaperonin complex is essential for the correct folding and assembly of polypeptides into oligomeric structures, of which the chaperonins themselves are not a part. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60.
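The subunit stoichiometry described above implies approximate sizes for the two oligomers. The following Python sketch works these out; the function names are hypothetical, and the masses assume the nominal 10 kDa and 60 kDa subunit sizes, so the totals are rough estimates rather than measured values.

```python
# Illustrative sketch: subunit counts and nominal masses of chaperonin oligomers.
# Masses assume the nominal 10 kDa and 60 kDa subunit sizes quoted in the text.

CPN60_SUBUNIT_KDA = 60
CPN10_SUBUNIT_KDA = 10


def cpn60_subunits(rings: int = 2, per_ring: int = 7) -> int:
    """Total cpn60 subunits in the stacked-ring structure."""
    return rings * per_ring


def nominal_mass_kda(subunits: int, subunit_kda: int) -> int:
    """Nominal oligomer mass: subunit count times nominal subunit mass."""
    return subunits * subunit_kda


cpn60_total = cpn60_subunits()
cpn60_mass = nominal_mass_kda(cpn60_total, CPN60_SUBUNIT_KDA)

# The cpn10 ring is described as having between 6 and 8 subunits,
# so its nominal mass is reported as a (low, high) range.
cpn10_mass_range = tuple(nominal_mass_kda(n, CPN10_SUBUNIT_KDA) for n in (6, 8))

print(f"cpn60: {cpn60_total} subunits, about {cpn60_mass} kDa")
print(f"cpn10: about {cpn10_mass_range[0]}-{cpn10_mass_range[1]} kDa")
```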
The 60 kDa form of chaperonin is the immunodominant antigen of patients with Legionnaires' disease, and is thought to play a role in the protection of the Legionella bacteria from oxygen radicals within macrophages. This hypothesis is based on the finding that the cpn60 gene is upregulated in response to hydrogen peroxide, a source of oxygen radicals. Cpn60 has also been found to display strong antigenicity in many bacterial species, and has the potential for inducing immune protection against unrelated bacterial infections. The RuBisCO subunit binding protein (which has been implicated in the assembly of RuBisCO) and cpn60 have been found to be evolutionary homologues, the RuBisCO subunit binding protein having the C-terminal Gly-Gly-Met repeat found in all bacterial cpn60 sequences. Although the precise function of this repeat is unknown, it is thought to be important as it is also found in 70 kDa heat-shock proteins. The crystal structure of Escherichia coli GroEL has been resolved to 2.8 Å. The TCP-1 family of proteins acts as molecular chaperones for tubulin, actin and probably some other proteins. They are weakly, but significantly, related to the cpn60/groEL chaperonin family.
This entry represents a multi-helical structural domain consisting of two structural repeats (duplication) of a 3-helical motif. This domain can be found in both eukaryotic and prokaryotic haem oxygenases, in TENA/THI-4 proteins that lack the haem-binding site, and in coenzyme PQQ (pyrrolo-quinoline-quinone) biosynthesis protein C (PqqC).
Haem oxygenase (HO) is the microsomal enzyme that, in animals, carries out the oxidation of haem, cleaving the haem ring at the alpha-methene bridge to form biliverdin and carbon monoxide. Biliverdin is subsequently converted to bilirubin by biliverdin reductase. In mammals there are three isozymes of haem oxygenase: HO-1 to HO-3. The first two isozymes differ in their tissue expression and their inducibility: HO-1 is highly inducible by its substrate haem and by various non-haem substances, while HO-2 is non-inducible. Haem oxygenase is also present in certain bacteria, where it is involved in the acquisition of iron from the host haem.
The THI-4 protein is involved in thiamine biosynthesis, while TENA is one of a number of proteins that enhance the expression of extracellular enzymes, such as alkaline protease, neutral protease and levansucrase.
Coenzyme PQQ (pyrrolo-quinoline-quinone) biosynthesis protein C (PqqC) is required for the synthesis of PQQ, a prosthetic group found in several bacterial enzymes, including methanol dehydrogenase of methylotrophs and the glucose dehydrogenase of a number of bacteria.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic and archaebacterial large subunit ribosomal proteins can be grouped on the basis of sequence similarities. These proteins are very basic. About 50 residues long, they are the smallest proteins of eukaryotic-type ribosomes.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.
This family constitutes the mitochondrial ATP synthase epsilon subunit, which is distinct from the bacterial epsilon subunit (the latter being homologous to the mitochondrial delta subunit). The mitochondrial epsilon subunit is located in the stalk region of the F1 complex, and acts as an inhibitor of the ATPase catalytic core. The epsilon subunit can assume two conformations, contracted and extended, where the latter inhibits ATP hydrolysis. The conformation of the epsilon subunit is determined by the direction of rotation of the gamma subunit, and possibly by the presence of ADP. The epsilon subunit is thought to become extended in the presence of ADP, thereby acting as a safety lock to prevent wasteful ATP hydrolysis.
More information about this protein can be found at Protein of the Month: ATP Synthases.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer.
Clathrin coats contain both clathrin and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors. All AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). Each subunit has a specific function. Adaptin subunits recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal appendage domains. By contrast, GGAs are monomers composed of four domains, which have functions similar to AP subunits: an N-terminal VHS (Vps27p/Hrs/Stam) domain, a GAT (GGA and Tom1) domain, a hinge region, and a C-terminal GAE (gamma-adaptin ear) domain. The GAE domain is similar to the AP gamma-adaptin ear domain, being responsible for the recruitment of accessory proteins that regulate clathrin-mediated endocytosis.
While clathrin mediates endocytic protein transport from ER to Golgi, coatomers (COPI, COPII) primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and budding from Golgi membranes. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta', gamma, delta, epsilon and zeta subunits.
This entry represents a beta-sandwich structural motif found in the appendage (ear) domain of alpha-, beta- and gamma-adaptin from AP clathrin adaptor complexes, the GAE (gamma-adaptin ear) domain of GGA adaptor proteins, and the appendage domain of the gamma subunit of coatomer complexes. These domains have an immunoglobulin-like beta-sandwich fold containing 7 or 8 strands in 2 beta-sheets in a Greek key topology. Although the appendage domains from AP / GGA adaptins and coatomers share a similar fold, there is little sequence identity between them. However, they also share similar motif-based cargo recognition and accessory factor recruitment mechanisms.
More information about these proteins can be found at Protein of the Month: Clathrin.
The PapD-like superfamily of periplasmic chaperones directs the assembly of over 30 diverse adhesive surface organelles that mediate the attachment of many different pathogenic bacteria to host tissues, a critical early step in the development of disease. PapD, the prototypical chaperone, is necessary for the assembly of P pili. P pili contain the adhesin PapG, which mediates the attachment of uropathogenic Escherichia coli to Gal(alpha) Gal receptors present on kidney cells and are critical for the initiation of pyelonephritis. The PapD-like chaperones consist of two Ig-like domains oriented toward each other, forming L-shaped molecules. In the chaperone-subunit complex, the G1beta strand of the chaperone completes an atypical Ig fold in the subunit by occupying the groove and running parallel to the subunit C-terminal F strand. This donor strand complementation interaction simultaneously stabilizes pilus subunits and caps their interactive surfaces, preventing their premature oligomerisation in the periplasm. During pilus biogenesis, the highly conserved N-terminal extension of one subunit has been proposed to displace the chaperone G1beta strand from its neighbouring subunit in a mechanism termed donor strand exchange.
This entry represents the immunoglobulin (Ig)-like beta-sandwich domain found in PapD, as well as in other periplasmic chaperone proteins that include FimC and SfaE from E. coli, and Caf1m from Yersinia pestis. In addition, major sperm proteins (MSP) and other related sperm proteins (such as WR4 and SSP-19) contain an Ig-like domain with a similar structural fold to PapD. Major sperm proteins are central components in molecular interactions underlying sperm motility, with many isoforms existing in Caenorhabditis elegans.
This domain is found in a number of transcription factors, including p53, NFATC, TonEBP, STAT-1, and NFkappaB, where it is responsible for DNA-binding. These transcription factors play diverse roles in the regulation of cellular functions: the p53 tumour suppressor upregulates the expression of genes involved in cell cycle arrest and apoptosis; NFATC regulates the production of effector proteins involved in coordinating the immune response; TonEBP regulates gene expression induced by osmotic stress and helps regulate intracellular volume during cell growth; STAT-1 plays an important role in B lymphocyte growth and function; and NFkappaB is involved in the inflammatory response. The DNA-binding domain acts to clamp, or in the case of TonEBP, encircle the DNA target in order to stabilize the protein-DNA complex. Protein interactions may also serve to stabilize the protein-DNA complex, for example in the STAT-1 dimer the SH2 (Src homology 2) domain in each monomer is coupled to the DNA-binding domain to increase stability. The DNA-binding domain consists of a beta-sandwich formed of 9 strands in 2 sheets with a Greek-key topology. This structure is found in many transcription factors, often within the DNA-binding domain.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin, which acts as a scaffold, and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.
AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.
This entry represents the C-terminal domain of the mu subunit from various clathrin adaptors (AP1, AP2 and AP3). The C-terminal domain has an immunoglobulin-like beta-sandwich fold consisting of 9 strands in 2 sheets with a Greek key topology, similar to that found in cytochrome f and certain transcription factors. The mu subunit regulates the coupling of clathrin lattices with particular membrane proteins by self-phosphorylation via a mechanism that is still unclear. The mu subunit possesses a highly conserved N-terminal domain of around 230 amino acids, which may be the region of interaction with other AP proteins; a linker region of between 10 and 42 amino acids; and a less well-conserved C-terminal domain of around 190 amino acids, which may be the site of specific interaction with the protein being transported in the vesicle.
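The three-region layout of the mu subunit implies an approximate overall length. The sketch below adds up the quoted region sizes; the names are hypothetical, and the "around" figures are treated as fixed values purely for illustration, so the result is only a rough range.

```python
# Illustrative sketch: approximate mu subunit length from its three regions.
# Each region is a (minimum, maximum) length in amino acids. The "around 230"
# and "around 190" figures from the text are treated as fixed for simplicity.

N_TERMINAL_DOMAIN = (230, 230)
LINKER_REGION = (10, 42)
C_TERMINAL_DOMAIN = (190, 190)


def total_length_range(*regions):
    """Sum the minimum and maximum lengths across all regions."""
    low = sum(region[0] for region in regions)
    high = sum(region[1] for region in regions)
    return low, high


low, high = total_length_range(N_TERMINAL_DOMAIN, LINKER_REGION, C_TERMINAL_DOMAIN)
print(f"Approximate mu subunit length: {low}-{high} amino acids")
```

The variation in total length comes entirely from the linker region under these assumptions.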
More information about these proteins can be found at Protein of the Month: Clathrin.
The Escherichia coli Hsp40 DnaJ and Hsp70 DnaK cooperate in the binding of proteins at intermediate stages of folding, assembly, and translocation across membranes. Binding of protein substrates to the DnaK C-terminal domain is controlled by ATP binding and hydrolysis in the N-terminal ATPase domain. The interaction of DnaJ with DnaK is mediated at least in part by the highly conserved N-terminal J-domain of DnaJ. The J-domain interaction is localized to the ATPase domain of DnaK and is likely to be dominated by electrostatic interactions. The J-domain may tether DnaK to DnaJ-bound substrates, which DnaK then binds with its C-terminal peptide-binding domain. The peptide-binding domain of DnaJ is composed of a beta-sandwich made up of 6 beta-strands divided into 2 sheets.
Copper is one of the most prevalent transition metals in living organisms and its biological function is intimately related to its redox properties. Since free copper is toxic, even at very low concentrations, its homeostasis in living organisms is tightly controlled by subtle molecular mechanisms. In eukaryotes, before being transported inside the cell via the high-affinity copper transporters of the CTR family, the copper (II) ion is reduced to copper (I). In blue copper proteins such as cupredoxin, the copper (I) ion form is stabilised by a constrained His2Cys coordination environment.
This entry represents cupredoxin proteins, as well as structural homologues to cupredoxin. Structurally, the cupredoxin-like fold consists of a beta-sandwich with 7 strands in 2 beta-sheets, which is arranged in a Greek-key beta-barrel. Some of these proteins have lost the ability to bind copper. Proteins with a cupredoxin-type fold are found in the following family groups:
The Ca2+-dependent, lipid-binding domain (CaLB) has been identified in a number of proteins, for example the amino-terminal, 138 amino acid C2 domain of cytosolic phospholipase A2 (cPLA2-C2) which mediates an initial step in the production of lipid mediators of inflammation: the Ca2+-dependent translocation of the enzyme to intracellular membranes with subsequent liberation of arachidonic acid. The domain is composed of eight antiparallel beta-strands with six interconnecting loops that fits the "type II" topology for C2 domains. The structure has been identified as a beta-sandwich in the "Greek key" motif.
The tumour necrosis factor receptor (TNFR) associated factors (TRAFs) act as signal transducers for both TNFRs and interleukin-1/Toll-like receptors. TRAFs function in immunity, embryonic development, stress response and bone metabolism through their induction of cell proliferation, differentiation, and apoptosis. TRAFs are characterised by two domains: an N-terminal domain containing RING and zinc finger motifs that is essential for the activation of downstream effectors, and a C-terminal TRAF domain that is essential for self-association and receptor interaction. The TRAF domain-like fold is a beta-sandwich consisting of 8 strands in 2 beta-sheets and has a circularly permuted Greek-key immunoglobulin-fold topology that contains an extra strand.
The substrate-binding domain (SBD) of the SIAH (seven in absentia homolog) family of proteins is structurally highly similar to the TRAF domain. The SIAH SBD interacts with a number of proteins, and is involved in TNF-alpha-mediated NFkappaB activation.
Lipoxygenases are a class of iron-containing dioxygenases that catalyse the hydroperoxidation of lipids containing a cis,cis-1,4-pentadiene structure. They are common in plants, where they may be involved in a number of diverse aspects of plant physiology including growth and development, pest resistance, and senescence or responses to wounding. In mammals a number of lipoxygenase isozymes are involved in the metabolism of prostaglandins and leukotrienes. Sequence data is available for the following lipoxygenases:
The iron atom in lipoxygenases is bound by four ligands, three of which are histidine residues. Six histidines are conserved in all lipoxygenase sequences; five of them are found clustered in a stretch of 40 amino acids. This region contains two of the three histidine ligands of the iron atom; the other histidines have been shown to be important for the activity of lipoxygenases.
This entry represents a domain found in lipoxygenases and other enzymes. Known as the PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology) domain, it is found in a variety of membrane- or lipid-associated proteins. Structurally, this domain forms a beta-sandwich composed of two sheets of four strands each. The most highly conserved regions coincide with the beta-strands, with most of the highly conserved residues being buried within the protein. An exception to this is a lysine or arginine that occurs on the surface of the fifth beta-strand of the eukaryotic domains. In pancreatic lipase, the lysine in this position forms a salt bridge with the procolipase protein. The conservation of a charged surface residue may indicate the location of a conserved ligand-binding site. It is thought that this domain may mediate membrane attachment via other protein binding partners.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.
This group of cysteine peptidases belongs to the MEROPS peptidase family C2 (calpain family, clan CA). A type example is calpain, which is an intracellular protease involved in many important cellular functions that are regulated by calcium. The protein is a complex of 2 polypeptide chains (light and heavy), with three known forms in mammals: a highly calcium-sensitive (i.e., micro-molar range) form known as mu-calpain, mu-CANP or calpain I; a form sensitive to calcium in the milli-molar range, known as m-calpain, m-CANP or calpain II; and a third form, known as p94, which is found in skeletal muscle only.
All forms have identical light but different heavy chains. Both mu- and m-calpain are heterodimers containing an identical 28-kDa subunit and an 80-kDa subunit that shares 55-65% sequence homology between the two proteases. The crystallographic structure of m-calpain reveals six "domains" in the 80-kDa subunit:
Domain 2 shows low levels of sequence similarity to papain; although the catalytic His has not been located by biochemical means, it is likely that calpain and papain are related.
Calpain-like mRNAs have been identified in other organisms including bacteria, but the molecules encoded by these mRNAs have not been isolated, so little is known about their properties. How calpain activity is regulated in these organisms is still unclear. In metazoans, the activity of calpain is controlled by a single proteinase inhibitor, calpastatin. The calpastatin gene can produce eight or more calpastatin polypeptides ranging from 17 to 85 kDa by use of different promoters and alternative splicing events. The physiological significance of these different calpastatins is unclear, although all bind to three different places on the calpain molecule; binding to at least two of the sites is Ca2+ dependent. The calpains ostensibly participate in a variety of cellular processes including remodelling of cytoskeletal/membrane attachments, different signal transduction pathways, and apoptosis. Deregulated calpain activity following loss of Ca2+ homeostasis results in tissue damage in response to events such as myocardial infarcts, stroke, and brain trauma.
Hsp20 is a mammalian small heat-shock protein family that occurs most abundantly in skeletal muscle and heart. It has a tendency to form dimers, via a disulphide linkage formed by an N-terminal cysteine, low heat stability and a poor chaperoning ability in comparison with other family members. Structurally, this and related proteins contain a beta-sandwich fold consisting of 8 strands in 2 beta-sheets in a Greek-key topology.
The PEBP (PhosphatidylEthanolamine-Binding Protein) family is a highly conserved group of proteins that have been identified in numerous tissues in a wide variety of organisms, including bacteria, yeast, nematodes, plants, Drosophila and mammals. The various functions described for members of this family include lipid binding, neuronal development, serine protease inhibition, the control of the morphological switch between shoot growth and flower structures, and the regulation of several signalling pathways such as the MAP kinase pathway and the NF-kappaB pathway. The control of the latter two pathways involves the PEBP protein RKIP, which interacts with MEK and Raf-1 to inhibit the MAP kinase pathway, and with TAK1, NIK, IKKalpha and IKKbeta to inhibit the NF-kappaB pathway. Other PEBP-like proteins that show strong structural homology to PEBP include Escherichia coli YBHB and YBCL, the Rattus norvegicus (Rat) neuropeptide HCNP, and Antirrhinum majus (Garden snapdragon) protein centroradialis (CEN).
Structures have been determined for several members of the PEBP-like family, all of which show extensive fold conservation. The structure consists of a large central beta-sheet flanked by a smaller beta-sheet on one side, and an alpha helix on the other. Sequence alignments show two conserved central regions, CR1 and CR2, that form a consensus signature for the PEBP family. These two regions form part of the ligand-binding site, which can accommodate various anionic groups. The N- and C-terminal regions are the least conserved, and may be involved in interactions with different protein partners. The N-terminal residues 2-12 form the natural cleavage peptide HCNP involved in neuronal development. The C-terminal region is deleted in plant and bacterial PEBP homologues, and may help control accessibility to the active site.
Proteins containing a galactose-binding domain-like fold can be found in several different protein families, in both eukaryotes and prokaryotes. The common function of these domains is to bind to specific ligands, such as cell-surface-attached carbohydrate substrates for galactose oxidase and sialidase, phospholipids on the outer side of the mammalian cell membrane for coagulation factor Va, membrane-anchored ephrin for the Eph family of receptor tyrosine kinases, and a complex of broken single-stranded DNA and DNA polymerase beta for XRCC1.
The structure of the galactose-binding domain-like members consists of a beta-sandwich, in which the strands making up the sheets exhibit a jellyroll fold. There is a high degree of similarity in the beta-sandwich and in the loops between different family members, despite an often low level of sequence similarity.
FHA and SMAD (MH2) domains share a common structure consisting of a sandwich of eleven beta strands in two sheets with Greek key topology. Forkhead-associated (FHA) domains were originally identified as a sequence profile of about 75 amino acids, whereas the full-length domain is closer to about 150 amino acids. FHA domains are found in transcription factors, kinesin motors, and in a variety of other signalling molecules in organisms ranging from eubacteria to humans. FHA domains are protein-protein interaction domains that are specific for phosphoproteins. FHA-containing proteins function in maintaining cell-cycle checkpoints, DNA repair and transcriptional regulation. FHA domain proteins include the Chk2/Rad53/Cds1 family of proteins that contain one or more FHA domains, as well as a Ser/Thr kinase domain.
SMAD (Mothers against decapentaplegic (MAD) homolog) domain proteins are found in a range of species from nematodes to humans. These highly conserved proteins contain an N-terminal MH1 domain that contacts DNA, and is separated by a short linker region from the C-terminal MH2 domain, the latter showing a striking similarity to FHA domains. SMAD proteins mediate signalling by the TGF-beta/activin/BMP-2/4 cytokines from receptor Ser/Thr protein kinases at the cell surface to the nucleus. SMAD proteins fall into three functional classes: the receptor-regulated SMADs (R-SMADs), including SMAD1, -2, -3, -5, and -8, each of which is involved in a ligand-specific signalling pathway; the comediator SMADs (co-SMADs), including SMAD4, which interact with R-SMADs to participate in signalling; and the inhibitory SMADs (I-SMADs), including SMAD6 and -7, which block the activation of R-SMADs and co-SMADs, thereby negatively regulating signalling pathways.
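The three functional classes can be written down as a simple lookup. The Python sketch below is illustrative only: `SMAD_CLASSES` and `smad_class` are hypothetical names, and the table lists only the members named in the text, which gives them as examples of each class rather than as complete lists.

```python
# Illustrative sketch: functional classes of SMAD proteins as a lookup table.
# Only the members named in the text are included; the lists may be incomplete.

SMAD_CLASSES = {
    "R-SMAD": {"SMAD1", "SMAD2", "SMAD3", "SMAD5", "SMAD8"},  # receptor-regulated
    "co-SMAD": {"SMAD4"},                                     # comediator
    "I-SMAD": {"SMAD6", "SMAD7"},                             # inhibitory
}


def smad_class(protein: str):
    """Return the functional class of a SMAD protein, or None if not listed."""
    name = protein.upper()
    for functional_class, members in SMAD_CLASSES.items():
        if name in members:
            return functional_class
    return None


for protein in ("SMAD2", "SMAD4", "SMAD7"):
    print(protein, "->", smad_class(protein))
```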
Domains with this fold are also found as the transactivation domain of interferon regulatory factor 3 (IRF3), which has a weak homology to SMAD domains, and the N-terminal domain of EssC protein in Staphylococcus aureus.
Lectins and glucanases exhibit the common property of reversibly binding to specific complex carbohydrates. The lectins/glucanases are a diverse group of proteins found in a wide range of species from prokaryotes to humans. The different family members all contain a concanavalin A-like domain, which consists of a sandwich of 12-14 beta strands in two sheets with a complex topology. Members of this family are diverse, and include the lectins: legume lectins, cereal lectins, viral lectins, and animal lectins. Plant lectins function in the storage and transport of carbohydrates in seeds, the binding of nitrogen-fixing bacteria to root hairs, the inhibition of fungal growth or insect feeding, and in hormonally regulated plant growth. Protein members include concanavalin A (Con A), favin, isolectin I, lectin IV, soybean agglutinin and lentil lectin. Animal lectins include the galectins, which are S-type lactose-binding and IgE-binding proteins such as S-lectin, CLC protein, galectin1, galectin2, galectin3 CRD, and Congerin I.
Other members with a Con A-like domain include the glucanases and xylanases. Bacterial and fungal beta-glucanases, such as Bacillus 1-3,1-4-beta-glucanase, carry out the acid catalysis of beta-glucans found in microorganisms and plants. Similarly, kappa-carrageenase degrades kappa-carrageenans from marine red algae cell walls. Xylanase and cellobiohydrolase I degrade hemicellulose and cellulose, respectively.
There are many Con A-like domains found in proteins involved in cell recognition and adhesion. For example, several viral and bacterial toxins carry Con A-like domains. Examples include the Clostridium neurotoxins responsible for the neuroparalytic effects of botulism and tetanus. The Pseudomonas exotoxin A, a virulence factor which is highly toxic to eukaryotic cells, causing the arrest of protein synthesis, contains a Con A-like domain involved in receptor binding. Vibrio cholerae neuraminidase can bind to cell surfaces, possibly through its Con A-like domains, where it functions as part of a mucinase complex to degrade the mucin layer of the gastrointestinal tract. The rotaviral outer capsid protein, VP4, has a Con A-like sialic acid binding domain, which functions in cell attachment and membrane penetration.
Con A-like domains also play a role in cell recognition in eukaryotes. Proteins containing a Con A-like domain include the sex hormone-binding globulins, which transport sex steroids in blood and regulate their access to target tissues; laminins, which are large heterotrimeric glycoproteins involved in basement membrane architecture and function; neurexins, which are expressed in hundreds of isoforms on the neuronal cell surface, where they may function as cell recognition molecules; and sialidases, which are found in both microorganisms and animals and function in cell adhesion and signal transduction.
Other proteins containing a Con A-like domain include pentraxins and calnexins. The pentraxin PTX3 is a TNFalpha-induced, secreted protein of adipose cells produced during inflammation. The calnexin family of molecular chaperones is conserved among plants, fungi, and animals. Family members include Calnexin, a type-I integral membrane protein in the endoplasmic reticulum which coordinates the processing of newly synthesized N-linked glycoproteins with their productive folding, calmegin, a type-I membrane protein expressed mainly in the spermatids of the testis, and calreticulin, a soluble ER lumenal paralog.
Ubiquinol-cytochrome c reductase (bc1 complex or complex III) is an enzyme complex of bacterial and mitochondrial oxidative phosphorylation systems. It catalyses the oxidoreduction of the mobile redox components ubiquinol and cytochrome c, generating an electrochemical potential, which is linked to ATP synthesis. The complex consists of three subunits in most bacteria, and nine in mitochondria: both bacterial and mitochondrial complexes contain cytochrome b and cytochrome c1 subunits, and an iron-sulphur 'Rieske' subunit, which contains a high potential 2Fe-2S cluster. The mitochondrial form also includes six other subunits that do not possess redox centres. Plastoquinone-plastocyanin reductase (b6f complex), present in cyanobacteria and the chloroplasts of plants, catalyses the oxidoreduction of plastoquinol and cytochrome f. This complex, which is functionally similar to ubiquinol-cytochrome c reductase, comprises cytochrome b6, cytochrome f and Rieske subunits.
The Rieske subunit acts by binding either a ubiquinol or plastoquinol anion, transferring an electron to the 2Fe-2S cluster, then releasing the electron to the cytochrome c or cytochrome f haem iron. The Rieske domain has a [2Fe-2S] centre. Two conserved cysteines coordinate one Fe ion, while the other Fe ion is coordinated by two conserved histidines. The 2Fe-2S cluster is bound in the highly conserved C-terminal region of the Rieske subunit.
The fundamental activity of the ribosome is two-fold: to decode the message of the mRNA in the small subunit, and to form a peptide bond between peptidyl-tRNA and aminoacyl-tRNA by a peptidyl transferase activity in the large subunit. Several prokaryotic and eukaryotic proteins that are involved in the translation process contain an SH3-like domain. The structure of the translation protein SH3-like domain is a partly opened beta barrel, where the last strand is interrupted by a 3-10 helical turn. The structure of the RNA-binding C-terminal domain of the Bacillus stearothermophilus (Geobacillus stearothermophilus) ribosomal protein L2 has been shown to adopt the SH3-like barrel topology. The L2 protein is located near the peptidyl transferase centre in the large ribosomal subunit where it may contribute to peptidyl transferase activity, and is involved in the assembly of the 23S rRNA. Likewise, the N-terminal domain of the ubiquitous eukaryotic translation initiation factor 5A (IF-5A) protein adopts the SH3-like barrel topology. IF-5A is involved in the initial step of peptide bond formation in translation and in cell-cycle regulation. IF-5A acts as a cofactor of the Rev protein in HIV-1-infected cells and of the Rex protein in T-cell leukaemia virus 1-infected cells.
GroES (chaperonin 10) is an oligomeric molecular chaperone, which functions in protein folding and possibly in intercellular signalling, being found on the surface of various prokaryotic and eukaryotic cells, as well as being released from cells. Secreted chaperonins are thought to act as intercellular signals, interacting with a variety of cell types, including leukocytes, vascular endothelial cells and epithelial cells, as well as activating key cellular activities such as the synthesis of cytokines and adhesion proteins. GroES works as a co-chaperone with GroEL (chaperonin 60) during protein folding. The polypeptide substrate is captured by GroEL, which binds the co-chaperone GroES and ATP, and discharges the substrate into a unique microenvironment inside the chaperone, which promotes productive folding. After hydrolysis of ATP, the polypeptide is released into solution. GP31 from Bacteriophage T4 is functionally equivalent to GroES. GroES folds as a partly opened beta-barrel.
The N-terminal domain of alcohol dehydrogenase-like proteins has a GroES-like fold, while the C-terminal domain has a classical Rossmann fold. These proteins include alcohol dehydrogenase, which contains a zinc-finger subdomain within the GroES-like domain, ketose reductase (sorbitol dehydrogenase), formaldehyde dehydrogenase, quinone oxidoreductase and 2,4-dienoyl-CoA reductase.
PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 µM. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated.
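To give a feel for what a binding affinity of 1 to 10 µM means in practice, the following minimal sketch computes the fraction of PDZ domains occupied at a given free ligand concentration. It assumes a simple 1:1 equilibrium with the ligand in excess, which the text does not state; the function name `fraction_bound` and the example concentrations are illustrative only.

```python
def fraction_bound(ligand_um: float, kd_um: float) -> float:
    """Fractional occupancy for simple 1:1 binding (assumed model).

    ligand_um: free ligand concentration in micromolar.
    kd_um:     dissociation constant in micromolar.
    """
    if ligand_um < 0 or kd_um <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_um / (kd_um + ligand_um)


if __name__ == "__main__":
    # Scan the affinity range quoted in the text at a hypothetical
    # free ligand concentration of 10 micromolar.
    for kd in (1.0, 10.0):
        print(f"Kd = {kd:4.1f} uM -> fraction bound = {fraction_bound(10.0, kd):.3f}")
```

Under these assumptions, half of the domains are occupied when the free ligand concentration equals the dissociation constant, so a tenfold change in affinity shifts occupancy substantially at micromolar ligand concentrations.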
PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.
This domain is found as the core structure in Lsm (like-Sm) proteins and bacterial Lsm-related Hfq proteins, and as the middle domain of the mechanosensitive channel protein MscS. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology.
Lsm proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. These snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Other snRNPs, such as U7 snRNP, can contain different Lsm proteins. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus, suggesting a more general role for Lsm proteins.
The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA. Hfq adopts an Lsm-like fold; however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.
The middle domain of the mechanosensitive channel of small conductance protein (MscS or YggB) structurally resembles an Lsm protein. MscS is a mechanosensitive channel present in the membrane of bacteria, archaea and eukarya that responds both to stretching of the cell membrane and to membrane depolarisation. MscS folds as a homo-heptamer with a cylindrical shape, and can be divided into transmembrane and extramembrane regions: an N-terminal periplasmic region, a transmembrane region, and a C-terminal cytoplasmic region. The C-terminal cytoplasmic region can be further divided into middle and C-terminal domains, which together create a framework that connects to the cytoplasm through distinct openings. The middle domain exhibits an Lsm-like structure, consisting of five beta-strands that pack together with those of other subunits to form a barrel-like sheet extending around the entire protein.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein L14 is one of the proteins from the large ribosomal subunit. In eubacteria, L14 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins, which have been grouped on the basis of sequence similarities. Based on amino-acid sequence homology, it is predicted that ribosomal protein L14 is a member of a recently identified family of structurally related RNA-binding proteins. L14 is a protein of 119 to 137 amino-acid residues.
Staphylococcus aureus nuclease (SNase) homologues, previously thought to be restricted to bacteria and archaea, are also found in eukaryotes. Staphylococcal nuclease has a multi-domain organization. The human cellular coactivator p100 contains four repeats, each of which is a SNase homologue. These repeats are unlikely to possess SNase-like activities as each lacks equivalent SNase catalytic residues, yet they may mediate p100's single-stranded DNA-binding function. A variety of proteins, including many that are still uncharacterised, belong to this group.
SNase domains have an OB-fold consisting of a closed or partly open beta-barrel with Greek key topology.
A five-stranded beta-barrel was first noted as a common structure among four proteins binding single-stranded nucleic acids (staphylococcal nuclease and aspartyl-tRNA synthetase) or oligosaccharides (B subunits of enterotoxin and verotoxin-1), and has been termed the oligonucleotide/oligosaccharide binding motif, or OB fold. The fold is a five-stranded beta-sheet coiled to form a closed beta-barrel capped by an alpha helix located between the third and fourth strands. Two ribosomal proteins, S17 and S1, are members of this class, and have different variations of the OB fold theme. Comparisons with other OB fold nucleic acid binding proteins suggest somewhat different mechanisms of nucleic acid recognition in each case.
There are many nucleic acid-binding proteins that contain domains with this OB-fold structure, including anticodon-binding tRNA synthetases, ssDNA-binding proteins (CDC13, telomere-end binding proteins), phage ssDNA-binding proteins (gp32, gp2.5, gpV), cold shock proteins, DNA ligases, RNA-capping enzymes, DNA replication initiators and RNA polymerase subunit RPB8.
Inorganic pyrophosphatase (PPase) is the enzyme responsible for the hydrolysis of pyrophosphate (PPi), which is formed principally as the product of the many biosynthetic reactions that utilise ATP. All known PPases require the presence of divalent metal cations, with magnesium conferring the highest activity. Among other residues, a lysine has been postulated to be part of or close to the active site. PPases have been sequenced from bacteria such as Escherichia coli (homohexamer), Bacillus PS3 (Thermophilic bacterium PS-3) and Thermus thermophilus, from the archaebacterium Thermoplasma acidophilum, from fungi (homodimer), from a plant, and from bovine retina. In yeast, a mitochondrial isoform of PPase has been characterised which seems to be involved in energy production and whose activity is stimulated by uncouplers of ATP synthesis.
The sequences of PPases share some regions of similarities, among which is a region that contains three conserved aspartates that are involved in the binding of cations.
The plant cytotoxin ricin is a heterodimer. The A chain, known to be a specific N-glycosidase, has a prominent active site cleft. The B chain is a two-domain lectin, which arose from the replication of a primitive sugar binding peptide. The B chain subunit of ricin (RTB) binds to mammalian cell membranes by recognising galactose-containing receptors. RTB has two domains each with three subdomains; tripeptide kinks in the loops from subdomains 1alpha, 1beta, 2alpha, and 2gamma may interact with galactosides. Each of these subdomains has aromatic residues that can interact with the nonpolar face of galactose, and three of the four subdomain folds (1alpha, 1beta, and 2gamma) have polar residues for hydrogen bond formation to the sugar hydroxyls.
The family 10 xylanase from Streptomyces olivaceoviridis (Streptomyces corchorusii) E-86 contains a (beta/alpha)(8)-barrel as a catalytic domain, a family 13 carbohydrate binding module as a xylan binding domain (XBD) and a Gly/Pro-rich linker between them. The crystal structure of this enzyme showed that XBD has three similar subdomains, as indicated by the presence of a triple-repeated sequence, forming a galactose binding lectin fold similar to that found in the ricin toxin B-chain.
A beta barrel of circularly permuted topology is found in many translation proteins, including initiation and elongation factors, and also some ribosomal proteins, although in these cases the fold is elaborated with additional structures. The beta barrel domain is represented by domain 2 of the elongation factors EF-Tu and eEF1A, both of which function to recognize and transport aminoacyl-tRNA to the acceptor (A) site of the ribosome during the elongation process, and of EF-G, which functions in translocating the peptidyl-tRNA from the A site to the peptidyl (P) site. This domain is also present in initiation factors, in domain 2 of eIF2 gamma subunit, and domains 2 and 4 of IF2/eIF5B, both of which function to transport the initiator methionyl-tRNA to the ribosome. This beta barrel domain may be involved in interactions with the switch 2 region to stabilise the relative orientations of the domains, which undergo functionally important conformational changes between GTP- and GDP-bound states.
More information about translation elongation factors can be found at Protein of the Month: Elongation Factors.
A beta barrel of circularly permuted topology is found in the C-terminus of many translation elongation and initiation factors. This domain is found in the elongation factors EF1A (or EF-Tu) of both eukaryotes and prokaryotes, which functions to recognize and transport aminoacyl-tRNA to the acceptor (A) site of the ribosome during the elongation process. This domain is also found in the initiation factor IF2 gamma subunit of eukaryotes, which functions to transport the initiator methionyl-tRNA to the ribosome. The C-terminal extension of mitochondrial EF1A (or EF-Tu) has structural similarities with DNA recognising zinc fingers, suggesting that the extension may be involved in recognition of RNA.
More information about EF1A proteins can be found at Protein of the Month: Elongation Factors.
Methionyl-tRNA formyltransferase (FMT) transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA. The formyl group appears to play a dual role in the initiator identity of N-formylmethionyl-tRNA by promoting its recognition by IF2 and by impairing its binding to EF-Tu-GTP. This family also includes formyltetrahydrofolate dehydrogenases, which produce formate from formyl-tetrahydrofolate. These enzymes contain an N-terminal domain in common with other formyl transferase enzymes. The C-terminal domain has an open beta-barrel fold.
The C-terminal domain of FMT structurally resembles methylpurine-DNA glycosylases (MPG). Human 3-methyladenine DNA glycosylase (AAG) catalyses the first step of base excision repair by cleaving damaged bases from DNA, excising a chemically diverse selection of substrate bases damaged by alkylation or deamination.
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.
This signature recognises a large group of serine and cysteine peptidases (including prokaryotic, eukaryotic and viral), which share a common closed beta barrel structure.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
This entry represents the alpha and beta subunits found in the F1 and A1 complexes of F- and A-ATPases, respectively (sometimes called the A and B subunits in V- and A-ATPases), as well as the alpha subunit from certain V1-ATPases. The F-ATPases (or F1F0-ATPases), V-ATPases (or V1V0-ATPases) and A-ATPases (or A1A0-ATPases) are composed of two linked complexes: the F1, V1 or A1 complex contains the catalytic core that synthesizes/hydrolyses ATP, and the F0, V0 or A0 complex forms the membrane-spanning pore. The F-, V- and A-ATPases all contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis.
In F-ATPases, there are three copies each of the alpha and beta subunits that form the catalytic core of the F1 complex, while the remaining F1 subunits (gamma, delta, epsilon) form part of the stalks. There is a substrate-binding site on each of the alpha and beta subunits, those on the beta subunits being catalytic, while those on the alpha subunits are regulatory. The alpha and beta subunits form a cylinder that is attached to the central stalk. The alpha/beta subunits undergo a sequence of conformational changes leading to the formation of ATP from ADP. These changes are induced by the rotation of the gamma subunit, which is itself driven by the movement of protons through the F0 complex C subunit.
In V- and A-ATPases, the alpha/A and beta/B subunits of the V1 or A1 complex are homologous to the alpha and beta subunits in the F1 complex of F-ATPases, except that the alpha subunit is catalytic and the beta subunit is regulatory.
The alpha/A and beta/B subunits can each be divided into three regions, or domains, centred around the ATP-binding pocket, and based on structure and function, where the central region is the nucleotide-binding domain. This entry represents the N-terminal domain of the alpha/A/beta/B subunits, which forms a closed beta-barrel with Greek-key topology.
More information about this protein can be found at Protein of the Month: ATP Synthases.
This entry represents a beta-barrel domain found at the C-terminal of alanine racemase and in group IV pyridoxal-5'-phosphate (PLP)-dependent decarboxylases, such as eukaryotic ornithine decarboxylase, arginine decarboxylase and diaminopimelate decarboxylase. These enzymes belong to the same structural family.
Alanine racemase plays a role in providing the D-alanine required for cell wall biosynthesis by isomerising L-alanine to D-alanine. Proteins containing this domain are found in both prokaryotes and eukaryotes. The molecular structure of alanine racemase from Bacillus stearothermophilus (Geobacillus stearothermophilus) was determined by X-ray crystallography to a resolution of 1.9 Å. The alanine racemase monomer is composed of two domains, an eight-stranded alpha/beta barrel at the N-terminus, and a C-terminal domain essentially composed of beta-strands. The pyridoxal 5'-phosphate (PLP) cofactor lies in and above the mouth of the alpha/beta barrel and is covalently linked via an aldimine linkage to a lysine residue, which is at the C-terminus of the first beta-strand of the alpha/beta barrel.
Eukaryotic ornithine decarboxylase (ODC) acts as a homodimer to produce putrescine (1,4-diaminobutane) from ornithine, where putrescine is the precursor of other polyamines in animals, plants, and bacteria. Arginine decarboxylase is also involved in putrescine biosynthesis. This is the first committed step in polyamine biosynthesis. Alanine racemase is a structurally homologous enzyme. Both proteins share a common alpha/beta barrel that binds the cofactor via a Schiff base on the C-terminal end of the barrel.
Diaminopimelate decarboxylase (DapDC) catalyzes the final step of lysine biosynthesis in bacteria.
Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described.
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.
These aspartate proteases all contain a common closed beta barrel structure, which includes pepsin, cathepsin, chymosin, beta-secretase, plasmepsin, plant acid proteases and retroviral proteases.
Certain aminoacyl-tRNA synthetases prevent potential errors in protein synthesis through deacylation of mischarged tRNAs. The close homologs isoleucyl-tRNA synthetase (IleRS) and valyl-tRNA synthetase (ValRS) deacylate Val-tRNA(Ile) and Thr-tRNA(Val), respectively. These reactions strictly require the presence of the cognate tRNA. In the absence of tRNA, the enzymatically generated misactivated adenylates remain in the active site, sequestered from hydrolysis. Upon addition of cognate tRNA the misactivated amino acids are hydrolyzed, regenerating the free tRNA and amino acid, while converting 1 equivalent of ATP to AMP. A prominent mechanism for editing misactivated amino acids is the rapid hydrolysis of transiently mischarged tRNA. This reaction is catalyzed at a second active site on IleRS and ValRS. This site is located within a large insertion (termed CP1) into the canonical class I aminoacyl-tRNA synthetase active-site fold. The CP1 domain as an isolated polypeptide hydrolyzes its cognate mischarged tRNA.
Beta barrels are commonly observed in protein structures. They are classified in terms of two integral parameters: the number of strands in the sheet, n, and the shear number, S, a measure of the stagger of the strands in the beta-sheet. These two parameters have been shown to determine the major geometrical features of beta-barrels. Six-stranded beta-barrels with a pseudo-twofold axis are found in several proteins. One involving parallel strands forming two psi structures is known as the double-psi barrel. The first psi structure consists of the loop connecting strands beta1 and beta2 (a 'psi loop') and the strand beta5, whereas the second psi structure consists of the loop connecting strands beta4 and beta5 and the strand beta2. All the psi structures in double-psi barrels have a unique handedness, in that beta1 (beta4), beta2 (beta5) and the loop following beta5 (beta2) form a right-handed helix. The unique handedness may be related to the fact that the twisting angle between the parallel pair of strands is always larger than that between the antiparallel pair.
In many cases, including aspartate decarboxylase and aspartic proteinases, strands 1 and 4 are each bent and consist of two sections. The two sections normally make a right angle; sometimes their hydrogen-bond patterns are disrupted at the corner by a bulge or even by a large insertion. In these cases, the barrel can also be viewed as a pair of orthogonally packed sheets, each with four strands.
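The statement that the strand number n and the shear number S determine the major geometrical features of a beta-barrel can be made concrete with a small calculation. The sketch below uses the idealised relations commonly quoted in the beta-barrel literature, tan(alpha) = S*a / (n*b) for the strand tilt and R = b / (2 sin(pi/n) cos(alpha)) for the barrel radius. The formulas and the spacing constants (a = 3.3 Å along a strand, b = 4.4 Å between strands) are assumptions brought in for illustration; they are not given in this text, and real barrels deviate from the ideal values.

```python
import math

A_RISE = 3.3  # Angstrom, assumed C-alpha spacing along a strand
B_SEP = 4.4   # Angstrom, assumed spacing between adjacent strands


def barrel_geometry(n: int, shear: int) -> tuple[float, float]:
    """Return (strand tilt in degrees, barrel radius in Angstrom).

    Uses the idealised (n, S) relations stated in the lead-in; both
    are assumptions, not measurements.
    """
    if n < 3 or shear < 0:
        raise ValueError("need n >= 3 strands and a non-negative shear number")
    alpha = math.atan((shear * A_RISE) / (n * B_SEP))
    radius = B_SEP / (2.0 * math.sin(math.pi / n) * math.cos(alpha))
    return math.degrees(alpha), radius


if __name__ == "__main__":
    # Hypothetical (n, S) pairs chosen only to show the trend.
    for n, s in [(6, 6), (6, 12), (8, 8)]:
        tilt, radius = barrel_geometry(n, s)
        print(f"n={n}, S={s}: tilt = {tilt:.1f} deg, radius = {radius:.2f} A")
```

In this idealised model, increasing S at fixed n tilts the strands further from the barrel axis and widens the barrel, which is why the two integers together are sufficient to describe its overall shape.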
The bacterial ribosomal protein L25 is bound to 5S rRNA along with L5 and L18, forming a separate domain of the ribosome. The solution structure of protein L25 uncomplexed with RNA shows two significantly disordered loops and a closed beta-barrel domain with a complex topology that has significant structural similarities to the N-terminal domain of the Thermus thermophilus ribosomal protein TL5, to the general stress protein CTC, and to the C-terminal anticodon-binding domain of Escherichia coli glutaminyl-tRNA synthetase (GlnRS). GlnRS contains a duplication consisting of two L25-like beta-barrel domains with the swapping of N-terminal strands.
Transcription factor IIA (TFIIA) is one of several factors that form part of a transcription pre-initiation complex along with RNA polymerase II, the TATA-box-binding protein (TBP) and TBP-associated factors, on the TATA-box sequence upstream of the initiation start site. After initiation, some components of the pre-initiation complex (including TFIIA) remain attached and re-initiate a subsequent round of transcription. TFIIA binds to TBP to stabilise TBP binding to the TATA element. TFIIA also inhibits the cytokine HMGB1 (high mobility group 1 protein) binding to TBP, and can dissociate HMGB1 already bound to TBP/TATA-box.
Human and Drosophila TFIIA have three subunits: two large subunits, LN/alpha and LC/beta, derived from the same gene, and a small subunit, S/gamma. Yeast TFIIA has two subunits: a large TOA1 subunit that shows sequence similarity to the N-terminal of LN/alpha and the C-terminal of LC/beta, and a small subunit, TOA2 that is highly homologous with S/gamma. The conserved regions of the large and small subunits of TFIIA combine to form two domains: a four-helix bundle (helical domain) composed of two helices from each of the N-terminal regions of TOA1 and TOA2 in yeast; and a beta-barrel (beta-barrel domain) composed of beta-sheets from the C-terminal regions of TOA1 and TOA2.
This entry represents the beta-barrel domain found at the C-terminal of both TOA1 (or alpha/beta) and TOA2 (or gamma) subunits of TFIIA, and their homologues.
Pyruvate kinase (PK) catalyses the final step in glycolysis, the conversion of phosphoenolpyruvate to pyruvate with concomitant phosphorylation of ADP to ATP:
ADP + phosphoenolpyruvate = ATP + pyruvate
The enzyme, which is found in all living organisms, requires both magnesium and potassium ions for its activity. In vertebrates, there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart and brain), and M2 (early foetal tissue). In plants, PK exists as cytoplasmic and plastid isozymes, while most bacteria and lower eukaryotes have one form, although certain bacteria, such as Escherichia coli, have two isozymes. All isozymes appear to be tetramers of identical subunits of ~500 residues.
PK helps control the rate of glycolysis, along with phosphofructokinase and hexokinase. PK possesses allosteric sites for numerous effectors, yet the isozymes respond differently, in keeping with their different tissue distributions. The activity of L-type (liver) PK is increased by fructose-1,6-bisphosphate (F1,6BP) and lowered by ATP and alanine (a gluconeogenic precursor); therefore, when glucose levels are high, glycolysis is promoted, and when levels are low, gluconeogenesis is promoted. L-type PK is also hormonally regulated, being activated by insulin and inhibited by glucagon, which covalently modifies the PK enzyme. M1-type (muscle, brain) PK is inhibited by ATP, but F1,6BP and alanine have no effect, which correlates with the function of muscle and brain, as opposed to the liver.
The structures of several pyruvate kinases from various organisms have been determined. The protein comprises three to four domains: a small N-terminal helical domain (absent in bacterial PK), a beta/alpha-barrel domain, a beta-barrel domain (inserted within the beta/alpha-barrel domain), and a 3-layer alpha/beta/alpha sandwich domain.
This entry represents the beta-barrel domain (note: it does not include the beta/alpha-barrel it is inserted into). This domain has a similar topology to the beta-strand-rich C-terminal domain of molybdenum cofactor (MOCO) sulphurase (MOSC domain). MOSC domains are found alone in bacterial YiiM proteins, or fused to other domains, such as a NifS-like catalytic domain in MOCO sulphurase. The MOSC domain is predicted to be a sulphur-carrier domain that receives sulphur abstracted from pyridoxal phosphate-dependent NifS-like enzymes, using it for the formation of diverse sulphur-metal clusters.
This entry represents the core beta-barrel (8,10) domain found in cyclophilin (peptidylprolyl isomerase). This domain is related to a beta-barrel domain found in several outer membrane proteins, usually at the C-terminus; in these proteins, the beta-barrel (7,10) lacks the N-terminal strand of the cyclophilin domain, but remains closed.
Cyclophilin is the major high-affinity binding protein in vertebrates for the immunosuppressive drug cyclosporin A (CSA), but is also found in other organisms. It exhibits a peptidyl-prolyl cis-trans isomerase activity (PPIase or rotamase). PPIase is an enzyme that accelerates protein folding by catalyzing the cis-trans isomerization of proline imidic peptide bonds in oligopeptides. It is probable that CSA mediates some of its effects by forming a tight complex with cyclophilin that inhibits the phosphatase activity of calcineurin. Cyclophilin A is a cytosolic and highly abundant protein. The protein belongs to a family of isozymes, including cyclophilins B and C, and natural killer cell cyclophilin-related protein. Major isoforms have been found throughout the cell, including the ER, and some are even secreted. The sequences of the different forms of cyclophilin-type PPIases are well conserved.
Mannose-6-phosphate receptors (MPRs) are transmembrane proteins involved in the transport of lysosomal enzymes from the Golgi complex and the cell surface to lysosomes. Lysosomal enzymes bearing phosphomannosyl residues bind specifically to MPRs in the Golgi apparatus and the resulting receptor-ligand complex is transported to an acidic prelysosomal compartment, where the low pH mediates dissociation of the complex. There are two distinct MPRs that function in the recognition of mannose-6-phosphate-containing proteins: the cation-dependent MPR (CD-MPR) and the cation-independent MPR (CI-MPR). The CI-MPR is also known as the insulin-like growth factor II receptor, a multi-functional protein implicated in tumour suppression.
The crystal structure of the N-terminal, extracytoplasmic, receptor-binding domain of bovine CD-MPR (excluding the signal sequence) reveals structural similarity to the fifteen homologous, repeating domains comprising the extracellular region of human CI-MPR. The structure consists of a partly opened, nine-stranded, beta-barrel.
This entry represents a beta-propeller domain found in galactose oxidase and in Kelch repeat-containing proteins.
The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase.
Galactose oxidase is a monomeric enzyme that contains a single copper ion and catalyses the stereospecific oxidation of primary alcohols to their corresponding aldehydes. The protein contains an unusual covalent thioether bond between a tyrosine and a cysteine that forms during its maturation. Galactose oxidase is a three-domain protein: the N-terminal domain forms a jelly-roll sandwich, the central domain forms a seven-bladed beta-propeller in which each blade is a four-stranded beta-sheet, and the C-terminal domain has an immunoglobulin-like fold.
Quinohemoprotein amine dehydrogenase (QHNDH) from Paracoccus denitrificans is a heterotrimer consisting of alpha, beta and gamma chains. The alpha chain has a four-domain structure that includes a dihaem cytochrome c, the beta chain forms a 7-bladed beta-propeller that is part of the enzyme active site, and the gamma chain contains the redox factor cysteine tryptophylquinone (CTQ).
The beta chain of QHNDH structurally resembles the 7-bladed beta-propeller of the H chain of the periplasmic quinoprotein methylamine dehydrogenase (MADH), found in methylotrophic bacteria. MADH is a heterotetramer consisting of two heavy (H) chains and two light (L) chains, and contains the redox cofactor tryptophan tryptophylquinone (TTQ). There is no similarity between the quinone-containing chains of MADH and QHNDH.
The beta-propeller structure found in MADH and QHNDH is similar to the YVTN (Tyr-Val-Thr-Asn) repeat that folds into a beta-propeller found in the N-terminal domain of archaeal surface layer proteins, which help protect cells from extreme environments.
WD-40 repeats (also known as WD or beta-transducin repeats) are short ~40 amino acid motifs, often terminating in a Trp-Asp (W-D) dipeptide. WD40 repeats usually assume a 7-8 bladed beta-propeller fold, but proteins have been found with 4 to 16 repeated units, which also form a circularised beta-propeller structure. WD-repeat proteins are a large family found in all eukaryotes and are implicated in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control and apoptosis. Repeated WD40 motifs act as a site for protein-protein interaction, and proteins containing WD40 repeats are known to serve as platforms for the assembly of protein complexes or mediators of transient interplay among other proteins. The specificity of the proteins is determined by the sequences outside the repeats themselves. Examples of such complexes are G proteins (beta subunit is a beta-propeller), TAFII transcription factor, and E3 ubiquitin ligase. In Arabidopsis spp., several WD40-containing proteins act as key regulators of plant-specific developmental events.
The structures of several WD40 repeat-containing proteins have been determined, including the beta-1 subunit of the signal-transducing G protein heterotrimer, the C-terminal domain of yeast Tup1, the C-terminal domain of Groucho/tle1, the Cdc4 propeller domain, the bovine Arp2/3 complex 41 kDa subunit ARPC1, and actin interacting protein 1.
The beta-lactamase-inhibitor protein II (BLIP-II) is a secreted protein produced by the soil bacterium Streptomyces exfoliatus SMF19. BLIP-II acts as a potent inhibitor of beta-lactamases such as TEM-1, which is the most widespread resistance enzyme to penicillin antibiotics. BLIP-II binds competitively to TEM-1, but no direct contacts are made with TEM-1 active site residues. BLIP-II shows no sequence similarity with BLIP, even though both bind to and inhibit TEM-1. However, BLIP-II does share significant sequence identity with the regulator of chromosome condensation (RCC1) family of proteins. These two families are clearly related, both having a seven-bladed beta-propeller structure, although they differ in the number of strands per blade: BLIP-II has three antiparallel beta-strands per blade, while RCC1 has four-stranded blades. RCC1 is a eukaryotic nuclear protein that acts as a guanine nucleotide exchange factor for Ran, a member of the Ras GTPase family. RCC1 mediates a Ran-GTP gradient necessary for the regulation of spindle formation and nuclear assembly during mitosis, as well as for the transport of macromolecules across the nuclear membrane during interphase.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.
Clathrin is a trimer composed of three heavy chains and three light chains, each monomer projecting outwards like a leg; this three-legged structure is known as a triskelion. The heavy chains form the legs, their N-terminal beta-propeller regions extending outwards, while their C-terminal alpha-alpha-superhelical regions form the central hub of the triskelion. Peptide motifs can bind between the beta-propeller blades. The light chains appear to have a regulatory role, and may help orient the assembly and disassembly of clathrin coats as they interact with hsc70 uncoating ATPase. Clathrin triskelia self-polymerise into a curved lattice by twisting individual legs together. The clathrin lattice forms around a vesicle as it buds from the TGN, plasma membrane or endosomes, acting to stabilise the vesicle and facilitate the budding process. The multiple blades created when the triskelia polymerise are involved in multiple protein interactions, enabling the recruitment of different cargo adaptors and membrane attachment proteins.
This entry represents the N-terminal beta-propeller region of clathrin heavy chains, which extends away from the hub of the triskelion and is responsible for peptide binding.
More information about these proteins can be found at Protein of the Month: Clathrin.
Cytochrome cd1 (cyt cd1) nitrite reductase is a dimeric enzyme of the bacterial periplasm that plays a key role in denitrification, the respiratory reduction of nitrite to nitric oxide in the nitrogen cycle. Each subunit of the cyt cd1 dimer contains one cytochrome c and one d1 haem group. The active site contains a specialised d1 haem, where the nitrite substrate is bound and reduced. This d1 haem is bound in an 8-bladed beta-propeller, which is also found in some members of the WD40 repeat-containing proteins.
Synonym(s): Rsp5 or WWP domain
The WW domain is a short conserved region in a number of unrelated proteins, which folds as a stable, triple-stranded beta-sheet. This short domain of approximately 40 amino acids may be repeated up to four times in some proteins. The name WW or WWP derives from the presence of two signature tryptophan residues that are spaced 20-23 amino acids apart and are present in most WW domains known to date, as well as that of a conserved Pro. The WW domain binds to proteins with particular proline motifs, [AP]-P-P-[AP]-Y, and/or phosphoserine- or phosphothreonine-containing motifs. It is frequently associated with other domains typical for proteins in signal transduction processes.
A large variety of proteins containing the WW domain are known. These include: dystrophin, a multidomain cytoskeletal protein; utrophin, a dystrophin-like protein of unknown function; vertebrate YAP protein, substrate of an unknown serine kinase; Mus musculus (Mouse) NEDD-4, involved in the embryonic development and differentiation of the central nervous system; Saccharomyces cerevisiae (Baker's yeast) RSP5, similar to NEDD-4 in its molecular organization; Rattus norvegicus (Rat) FE65, a transcription-factor activator expressed preferentially in liver; Nicotiana tabacum (Common tobacco) DB10 protein; and others.
In prokaryotes, the nucleotide exchange factor GrpE and the chaperone DnaJ are required for nucleotide binding of the molecular chaperone DnaK. The DnaK reaction cycle involves rapid peptide binding and release, which is dependent upon nucleotide binding. DnaJ accelerates the hydrolysis of ATP by DnaK, which enables the ADP-bound DnaK to tightly bind peptide. GrpE catalyses the release of ADP from DnaK, which is required for peptide release. In eukaryotes, GrpE is essential for mitochondrial Hsp70 function; however, the cytosolic Hsp70 homologues are GrpE-independent.
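The proline-rich ligand motif quoted above, [AP]-P-P-[AP]-Y, maps directly onto a regular expression. The following is a minimal sketch for scanning a one-letter protein sequence for candidate WW-domain ligand sites; the function name and the toy sequences are hypothetical, and a pattern match only flags a candidate site, not demonstrated binding.

```python
import re

# [AP]-P-P-[AP]-Y written as a regex; the lookahead lets overlapping
# candidates be reported as well.
WW_LIGAND_MOTIF = re.compile(r"(?=([AP]PP[AP]Y))")


def find_ww_ligand_motifs(sequence):
    """Return (1-based start position, matched residues) for each candidate site."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in WW_LIGAND_MOTIF.finditer(seq)]


if __name__ == "__main__":
    toy_sequence = "MGSPPPAYLK"  # invented sequence, for illustration only
    print(find_ww_ligand_motifs(toy_sequence))
```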
GrpE binds as a homodimer to the ATPase domain of DnaK, and may interact with the peptide-binding domain of DnaK. GrpE accomplishes nucleotide exchange by opening the nucleotide-binding cleft of DnaK. GrpE is comprised of two domains, the N-terminal coiled coil domain, which may facilitate peptide release, and the C-terminal head domain, which forms part of the contact surface with the ATPase domain of DnaK. The head domain is comprised of six short beta strands with a limited hydrophobic core.
Carbonic anhydrases (CAs) are zinc metalloenzymes which catalyse the reversible hydration of carbon dioxide to bicarbonate. CAs have essential roles in facilitating the transport of carbon dioxide and protons in the intracellular space, across biological membranes and in the layers of the extracellular space; they are also involved in many other processes, from respiration and photosynthesis in eukaryotes to cyanate degradation in prokaryotes. There are five known evolutionarily distinct CA families (alpha, beta, gamma, delta and epsilon) that have no significant sequence identity and have structurally distinct overall folds. Some CAs are membrane-bound, while others act in the cytosol; there are several related proteins that lack enzymatic activity. The active site of alpha-CAs is well described, consisting of a zinc ion coordinated through 3 histidine residues and a water molecule/hydroxide ion that acts as a potent nucleophile. The enzyme employs a two-step mechanism: in the first step, there is a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide; in the second step, the active site is regenerated by the ionisation of the zinc-bound water molecule and the removal of a proton from the active site. Beta- and gamma-CAs also employ a zinc hydroxide mechanism, although at least some beta-class enzymes do not have water directly coordinated to the metal ion.
This entry represents alpha class carbonic anhydrases.
More information about these proteins can be found at Protein of the Month: Carbonic Anhydrase.
Serralysin is a bacterial Zn-endopeptidase that acts as a virulence factor to cause tissue damage and anaphylactic response. Many Zn-endopeptidases contain the metal binding motif HexxHxxGxxH; in addition to these coordinated histidine residues, serralysin contains a coordinated tyrosine residue that is unique to the astacin-like Zn enzymes. The Zn-endopeptidases containing the histidine motif are structurally similar to one another, containing an N-terminal catalytic domain that belongs to the zincin family, and a C-terminal beta-helix metal-binding domain. These peptidases include the astacin family, snake venom Zn-endopeptidases, the extracellular metalloproteases from Serratia sp., Pseudomonas sp. and Erwinia sp., and the matrixins.
This domain is characterised by trimeric LpxA-like enzymes that display a single-stranded left-handed beta-helix fold, composed of tandem repeats of a hexapeptide, as represented by the Bacterial transferase hexapeptide repeat, where the hexapeptide repeats correspond to individual strands. Many bacterial transferases contain this domain. The structures of several proteins with this domain have been determined, including UDP N-acetylglucosamine acyltransferase (LpxA) from Escherichia coli, the first enzyme in the lipid A biosynthetic pathway; galactoside acetyltransferase (GAT, LacA) from E. coli, a gene product of the lac operon that may assist cellular detoxification; gamma-class archaeal carbonic anhydrase, a zinc-containing enzyme that catalyses the reversible hydration of carbon dioxide; tetrahydrodipicolinate-N-succinyltransferase (DapD) from Mycobacterium bovis, an enzyme from the lysine biosynthetic pathway that contains an extra N-terminal 3-helical domain; and the C-terminal domain of N-acetylglucosamine 1-phosphate uridyltransferase (GlmU) from E. coli, a trimeric bifunctional enzyme that catalyses the last two sequential reactions in the de novo biosynthetic pathway for UDP-N-acetylglucosamine, an essential precursor for many biomolecules.
RmlC (dTDP (deoxythymidine diphosphate)-4-dehydrorhamnose 3,5-epimerase) is a dTDP-sugar isomerase enzyme involved in the synthesis of L-rhamnose, a saccharide required for the virulence of some pathogenic bacteria. RmlC is a dimer, each monomer being formed from two beta-sheets arranged in a beta-sandwich, where the substrate-binding site is located between the two sheets of both monomers.
Other protein families contain domains that share this fold, including glucose-6-phosphate isomerase; germin, a metal-binding protein with oxalate oxidase and superoxide dismutase activities; auxin-binding protein; seed storage protein 7S; acireductone dioxygenase; as well as three proteins that have metal-binding sites similar to that of germin, namely quercetin 2,3-dioxygenase, phosphomannose isomerase and homogentisate dioxygenase, the last three sharing a 2-domain fold with storage protein 7S.
The single hybrid motif has a beta-barrel sandwich hybrid fold, consisting of a sandwich of half-barrel shaped beta-sheets. This motif is found in biotinyl/lipoyl-carrier proteins and domains, where the biotin and lipoic acid moieties act as covalently attached coenzyme cofactors in enzymes that catalyse metabolic reactions. For example, this motif can be found in the biotinyl domain of Escherichia coli acetyl-CoA carboxylase, protein H of the glycine cleavage system in Pisum sativum (Garden pea), the lipoyl domain of dihydrolipoamide acetyltransferase, which is a component of the pyruvate dehydrogenase complex, the lipoyl domain of the 2-oxoglutarate dehydrogenase complex, and the lipoyl domain of the mitochondrial branched-chain alpha-ketoacid dehydrogenase.
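The metal-binding motif HexxHxxGxxH can be searched for in the same way as any short sequence pattern. The sketch below assumes that "e" denotes glutamate (E) and that each "x" is any one residue; those readings, the function name and the toy sequence are assumptions for illustration. It reports the positions of the three histidines in each match, which are the candidate zinc ligands.

```python
import re

# H-E-x-x-H-x-x-G-x-x-H, with x = any single residue (assumed reading).
ZINC_MOTIF = re.compile(r"(?=(HE[A-Z]{2}H[A-Z]{2}G[A-Z]{2}H))")

# Offsets of the three histidines within the 11-residue motif.
HIS_OFFSETS = (0, 4, 10)


def find_zinc_motifs(sequence):
    """Return, for each match, the 1-based positions of the three histidines."""
    seq = sequence.upper()
    hits = []
    for match in ZINC_MOTIF.finditer(seq):
        start = match.start()
        hits.append(tuple(start + offset + 1 for offset in HIS_OFFSETS))
    return hits


if __name__ == "__main__":
    toy_sequence = "AAHEAAHAAGAAHAA"  # invented sequence, for illustration only
    print(find_zinc_motifs(toy_sequence))
```

A sequence match alone does not establish metal binding; the additional tyrosine ligand of the astacin-like enzymes, for instance, lies outside this pattern.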
The rudiment single hybrid motif has a beta-barrel sandwich hybrid motif, consisting of a sandwich of half-barrel shaped beta-sheets. This motif is found in the small domain of cytochrome f, as well as in the C-terminal domain of the biotin carboxylase subunit of acetyl-CoA carboxylase, and its family members, such as glycinamide ribonucleotide synthetase C-terminal domain, N5-carboxyaminoimidazole ribonucleotide synthetase PurK C-terminal domain, and glycinamide ribonucleotide transformylase PurT C-terminal domain.
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
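The link between the linear order of the triad residues and clan membership can be expressed as a small lookup. The sketch below covers only the three clans named above; the function names are hypothetical, and the example residue numbers (His57, Asp102, Ser195 in chymotrypsin numbering; Asp32, His64, Ser221 in subtilisin numbering) are supplied for illustration and are not taken from this text.

```python
from typing import Dict, Optional

# Triad order along the sequence -> clan, limited to the examples in the text.
TRIAD_ORDER_TO_CLAN = {"HDS": "PA", "DHS": "SB", "SDH": "SC"}


def triad_order(positions: Dict[str, int]) -> str:
    """Sort the catalytic residues (keys 'S', 'D', 'H') by sequence position."""
    if set(positions) != {"S", "D", "H"}:
        raise ValueError("expected exactly the residues S, D and H")
    return "".join(sorted(positions, key=lambda residue: positions[residue]))


def clan_from_triad(positions: Dict[str, int]) -> Optional[str]:
    """Return the clan suggested by the triad order, or None if not listed."""
    return TRIAD_ORDER_TO_CLAN.get(triad_order(positions))


if __name__ == "__main__":
    print(clan_from_triad({"H": 57, "D": 102, "S": 195}))  # chymotrypsin-like
    print(clan_from_triad({"D": 32, "H": 64, "S": 221}))   # subtilisin-like
```

Because the text says only that the order "commonly" reflects clan relationships, a lookup like this is a hint for classification rather than a rule.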
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This signature is associated with serine peptidases belonging to MEROPS peptidase families: S24 (LexA family, clan SF); S26A (signal peptidase I), S26B (signalase) and S26C (TraF peptidase).
The S26 family includes Escherichia coli signal peptidase, SPase, which is a membrane-bound endopeptidase, with two N-terminal transmembrane segments and a C-terminal catalytic region. SPase functions to release proteins that have been translocated into the inner membrane from the cell interior, by cleaving off their signal peptides.
The S24 family includes:
All of these proteins, with the possible exception of RulA, interact with RecA, which activates self-cleavage, either derepressing transcription in the case of CI and LexA or activating the lesion-bypass polymerase in the case of UmuD and MucA. UmuD'2 is the homodimeric component of DNA pol V, which is produced from UmuD by RecA-facilitated self-cleavage. The first 24 N-terminal residues of UmuD are removed; UmuD'2 is a DNA lesion bypass polymerase. MucA, like UmuD, is a plasmid-encoded DNA polymerase (pol RI) which is converted into the active lesion-bypass polymerase by a self-cleavage reaction involving RecA.
This group of proteins also contains proteins not recognised as peptidases as well as those classified as non-peptidase homologues as they either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for catalytic activity.
This entry represents a structural domain with a complex fold consisting of several coiled beta-sheets. This domain exists as a duplication, consisting of a tandem repeat of two similar structural motifs. These domains can be found in:
Mss4 is a conserved accessory factor for Rab GTPases, which function as ubiquitous regulators of intracellular membrane trafficking. Mss4 acts to promote nucleotide release from exocytic but not endocytic Rab GTPases. Mss4 has a complex fold made of several coiled beta-sheets, and consists of a duplication of tandem repeats of two similar structural motifs. It contains a zinc-binding site.
Other proteins that show structural similarity to Mss4 include the translationally controlled tumour-associated proteins (TCTPs), which contain an insertion of an alpha-helical hairpin and lack the zinc-binding site. TCTPs are a highly conserved and abundantly expressed family of eukaryotic proteins that are implicated in both cell growth and the human acute allergic response.
The C-terminal MsrB domain of peptide methionine sulphoxide reductase PilB is structurally similar to Mss4. Methionine sulphoxide reductases protect against oxidative damage that can contribute to cell death. The tandem Msr domains (MsrA and MsrB) of the pilB protein from Neisseria gonorrhoeae each reduce different epimeric forms of methionine sulphoxide.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.
This family represents subunits called delta (in mitochondrial ATPase) or epsilon (in bacteria or chloroplast ATPase). The interaction site of subunit C of the F0 complex with the delta or epsilon subunit of the F1 complex may be important for connecting the rotor of F1 (gamma subunit) to the rotor of F0 (C subunit). In bacterial species, the delta subunit is the equivalent of the Oligomycin sensitive subunit (OSCP) in metazoans. The C-terminal domain of the epsilon subunit appears to act as an inhibitor of ATPase activity.
More information about this protein can be found at Protein of the Month: ATP Synthases.
Triosephosphate isomerase (TIM) is the glycolytic enzyme that catalyses the reversible interconversion of glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. TIM plays an important role in several metabolic pathways and is essential for efficient energy production. It is a dimer of identical subunits, each of which is made up of about 250 amino-acid residues. A glutamic acid residue is involved in the catalytic mechanism. The sequence around the active site residue is perfectly conserved in all known TIMs. Deficiencies in TIM are associated with haemolytic anaemia coupled with a progressive, severe neurological disorder.
The ribulose-phosphate binding barrel consists of a parallel beta-sheet barrel fold containing a phosphate-binding site. Several proteins display this fold, including histidine biosynthesis enzymes, tryptophan biosynthesis enzymes, D-ribulose-5-phosphate 3-epimerase, and decarboxylases.
Thiamine monophosphate synthase (TMP) catalyzes the substitution of the pyrophosphate of 2-methyl-4-amino-5-hydroxymethylpyrimidine pyrophosphate by 4-methyl-5-(beta-hydroxyethyl)thiazole phosphate to yield thiamine phosphate in the thiamine biosynthesis pathway.
TENI, a protein from Bacillus subtilis that regulates the production of several extracellular enzymes by reducing alkaline protease production, belongs to this group.
The aldo-keto reductase family includes a number of related monomeric NADPH-dependent oxidoreductases, such as aldehyde reductase, aldose reductase, prostaglandin F synthase, xylose reductase, rho crystallin, and many others. All possess a similar structure, with a beta-alpha-beta fold characteristic of nucleotide binding proteins. The fold comprises a parallel beta-8/alpha-8-barrel, which contains a novel NADP-binding motif. The binding site is located in a large, deep, elliptical pocket in the C-terminal end of the beta sheet, the substrate being bound in an extended conformation. The hydrophobic nature of the pocket favours aromatic and apolar substrates over highly polar ones.
Binding of the NADPH coenzyme causes a massive conformational change, reorienting a loop, effectively locking the coenzyme in place. This binding is more similar to FAD- than to NAD(P)-binding oxidoreductases.
Some proteins of this entry contain a K+ ion channel beta chain regulatory domain; these are reported to have oxidoreductase activity.
O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in 'clans'.
This entry represents the catalytic TIM beta/alpha barrel common to many different families of glycosyl hydrolases. Structures have been determined for several proteins containing this domain, including family 13 glycosyl hydrolases (such as alpha-amylase), beta-glycanases, family 1 glycosyl hydrolases (such as beta-glucosidase), type II chitinases, 1,4-beta-N-acetylmuraminidases, and beta-N-acetylhexosaminidases.
More information about this protein can be found at Protein of the Month: alpha-Amylase.
Pyruvate kinase controls the exit from the glycolysis pathway, catalysing the transfer of phosphate from phosphoenolpyruvate (PEP) to ADP. Mammalian pyruvate kinase is a homotetramer, where each polypeptide subunit consists of four domains: N-terminal, A domain, B domain and C-terminal. Activation of the enzyme is believed to occur via the clamping down of the B domain onto the A domain to dehydrate the active site cleft. The N- and C-terminal domains are situated at inter-subunit contact sites, and could be involved in assembly and communication within the complex. The N-terminal domain has a TIM beta/alpha-barrel structure. Homologous TIM-barrel domains are found in the following proteins:
This entry represents a structural motif with a beta/alpha TIM barrel found in several protein families:
These proteins share similar, but not identical, metal-binding sites. In addition, xylose isomerase and L-rhamnose isomerase each have additional alpha-helical domains involved in tetramer formation. This entry differs from IPR012307 in having a wider coverage of TIM-barrel protein families.
This entry represents a structural domain consisting of a TIM beta/alpha-barrel. These domains are found in several phospholipase C (PLC) like phosphodiesterases, including:
Phospholipase C (PLC) isozymes are directly activated by heterotrimeric G proteins and Ras-like GTPases to hydrolyze phosphatidylinositol 4,5-bisphosphate into the second messengers diacylglycerol and inositol 1,4,5-trisphosphate. PLC enzymes often play central roles in various signalling cascades.
All organisms require reduced folate cofactors for the synthesis of a variety of metabolites. The enzyme 7,8-dihydropteroate synthase (DHPS) catalyses the condensation of para-aminobenzoic acid (pABA) with 6-hydroxymethyl-7,8-dihydropterin-pyrophosphate to form 7,8-dihydropteroate and pyrophosphate. DHPS is essential for the de novo synthesis of folate in prokaryotes, lower eukaryotes, and in plants, but is absent in mammals. By contrast, higher vertebrates possess an active transport system that enables them to use dietary folates. DHPS is the target of sulphonamides, which are substrate analogues that compete with pABA, but which do not affect vertebrates as they lack the DHPS enzyme. DHPS is a single domain protein that forms an eight-stranded TIM alpha/beta barrel, where the 7,8-dihydropterin pyrophosphate substrate binds in a deep cleft in the barrel. In the lower eukaryote Pneumocystis carinii, DHPS is the C-terminal domain of a multifunctional folate synthesis enzyme (gene fas).
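The statement that sulphonamides are substrate analogues competing with pABA corresponds to classical competitive inhibition. The sketch below is a minimal illustration using the textbook Michaelis-Menten rate law with a competitive inhibitor; the function name and the default values of vmax, km and ki are placeholders chosen for illustration and are not measured DHPS constants.

```python
def competitive_rate(s, i, vmax=1.0, km=1.0, ki=1.0):
    """Michaelis-Menten rate in the presence of a competitive inhibitor.

    s  : substrate concentration (here standing in for pABA)
    i  : inhibitor concentration (here standing in for a sulphonamide)
    vmax, km, ki : placeholder kinetic constants (assumed, arbitrary units)
    """
    if s < 0 or i < 0:
        raise ValueError("concentrations must be non-negative")
    # A competitive inhibitor raises the apparent Km but leaves Vmax unchanged.
    apparent_km = km * (1.0 + i / ki)
    return vmax * s / (apparent_km + s)


# Without inhibitor, the rate at s == km is half of vmax.
rate_free = competitive_rate(1.0, 0.0)
# With inhibitor at i == ki, the apparent Km doubles and the rate drops.
rate_inhibited = competitive_rate(1.0, 1.0)
# A large excess of substrate outcompetes the inhibitor.
rate_excess = competitive_rate(1000.0, 1.0)
```

Under this model, raising the substrate concentration restores the rate towards vmax, which is the defining behaviour of competition at a shared binding site.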
Other proteins contain a DHPS-like domain, including members of the methyltetrahydrofolate:corrinoid iron-sulphur protein methyltransferase (MeTr) family. MeTr catalyses a key step in the Wood-Ljungdahl pathway of carbon dioxide fixation. Other members of this family that contain a DHPS-like domain include methionine synthase and methanogenic enzymes that activate the methyl group of methyltetrahydromethano(or -sarcino)pterin.
This entry represents NAD- and NADP-binding domains with a core Rossmann-type fold, which consists of 3-layers alpha/beta/alpha, where the six beta strands are parallel in the order 321456. Many different enzymes contain an NAD/NADP-binding domain, including:
3-isopropylmalate dehydratase (or isopropylmalate isomerase) catalyses the stereo-specific isomerisation of 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate. This enzyme performs the second step in the biosynthesis of leucine, and is present in most prokaryotes and many fungal species. The prokaryotic enzyme is a heterodimer composed of a large (LeuC) and small (LeuD) subunit, while the fungal form is a monomeric enzyme. Both forms of isopropylmalate dehydratase are related and are part of the larger aconitase family. Aconitases are mostly monomeric proteins which share four domains in common and contain a single, labile [4Fe-4S] cluster. Three structural domains (1, 2 and 3) are tightly packed around the iron-sulphur cluster, while a fourth domain (4) forms a deep active-site cleft. The prokaryotic enzyme is encoded by two adjacent genes, leuC and leuD, corresponding to aconitase domains 1-3 and 4 respectively. LeuC does not bind an iron-sulphur cluster. It is thought that some prokaryotic isopropylmalate dehydratases can also function as homoaconitase, converting cis-homoaconitate to homoisocitric acid in lysine biosynthesis. Homoaconitase has been identified in higher fungi (mitochondria), several archaea and one thermophilic species of bacteria, Thermus thermophilus.
Aconitase (aconitate hydratase) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also two forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3). Below is a description of some of the multi-functional activities associated with different aconitases.
This entry represents the 'swivel' domain found at the C-terminal of eukaryotic mAcn, cAcn/IRP1 and IRP2, and bacterial AcnA, but in the N-terminal region following the HEAT-like domain in bacterial AcnB. This domain has a three layer beta/beta/alpha structure, and in cytosolic Acn is known to rotate between the cAcn and IRP1 forms of the enzyme. This domain is also found in the small subunit of isopropylmalate dehydratase (LeuD).
More information about these proteins can be found at Protein of the Month: Aconitase.
Carbamoyl phosphate synthase (CPSase) is a heterodimeric enzyme composed of a small and a large subunit (with the exception of CPSase III, see below). CPSase catalyses the synthesis of carbamoyl phosphate from bicarbonate, ATP and glutamine or ammonia, and represents the first committed step in pyrimidine and arginine biosynthesis in prokaryotes and eukaryotes, and in the urea cycle in most terrestrial vertebrates. CPSase has three active sites, one in the small subunit and two in the large subunit. The small subunit contains the glutamine binding site and catalyses the hydrolysis of glutamine to glutamate and ammonia. The large subunit has two homologous carboxy phosphate domains, both of which have ATP-binding sites; however, the N-terminal carboxy phosphate domain catalyses the phosphorylation of bicarbonate, while the C-terminal domain catalyses the phosphorylation of the carbamate intermediate. The carboxy phosphate domain found duplicated in the large subunit of CPSase is also present as a single copy in the biotin-dependent enzymes acetyl-CoA carboxylase (ACC), propionyl-CoA carboxylase (PCCase), pyruvate carboxylase (PC) and urea carboxylase.
Most prokaryotes carry one form of CPSase that participates in both arginine and pyrimidine biosynthesis; however, certain bacteria can have separate forms. The large subunit in bacterial CPSase has four structural domains: the carboxy phosphate domain 1, the oligomerisation domain, the carbamoyl phosphate domain 2 and the allosteric domain. CPSase heterodimers from Escherichia coli contain two molecular tunnels: an ammonia tunnel and a carbamate tunnel. These inter-domain tunnels connect the three distinct active sites, and function as conduits for the transport of unstable reaction intermediates (ammonia and carbamate) between successive active sites. The catalytic mechanism of CPSase involves the diffusion of carbamate through the interior of the enzyme from the site of synthesis within the N-terminal domain of the large subunit to the site of phosphorylation within the C-terminal domain.
Eukaryotes have two distinct forms of CPSase: a mitochondrial enzyme (CPSase I) that participates in both arginine biosynthesis and the urea cycle; and a cytosolic enzyme (CPSase II) involved in pyrimidine biosynthesis. CPSase II occurs as part of a multi-enzyme complex along with aspartate transcarbamoylase and dihydroorotase; this complex is referred to as the CAD protein. The hepatic expression of CPSase is transcriptionally regulated by glucocorticoids and/or cAMP. There is a third form of the enzyme, CPSase III, found in fish, which uses glutamine as a nitrogen source instead of ammonia. CPSase III is closely related to CPSase I, and is composed of a single polypeptide that may have arisen from gene fusion of the glutaminase and synthetase domains.
This entry represents the N-terminal domain of the small subunit of carbamoyl phosphate synthase. The small subunit catalyses the hydrolysis of glutamine to ammonia, which is in turn used by the large chain to synthesize carbamoyl phosphate. The small subunit has a 3-layer beta/beta/alpha structure, and is thought to be mobile in most proteins that carry it. The C-terminal domain of the small subunit of CPSase has glutamine amidotransferase activity.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
The L32e family consists of proteins that have 135 to 240 amino-acid residues.
Other members of the family are transfer proteins that include guanine nucleotide exchange factor that may function as an effector of RAC1, phosphatidylinositol/phosphatidylcholine transfer protein that is required for the transport of secretory proteins from the Golgi complex, and alpha-tocopherol transfer protein that enhances the transfer of the ligand between separate membranes.
The BRCT domain (after the C-terminal domain of a breast cancer susceptibility protein) is found predominantly in proteins involved in cell cycle checkpoint functions responsive to DNA damage, for example as found in the breast cancer DNA-repair protein BRCA1. The domain is an approximately 100 amino acid tandem repeat, which appears to act as a phospho-protein binding domain.
A chitin biosynthesis protein from yeast also seems to belong to this group.
This entry represents various uracil-DNA glycosylases and related DNA glycosylases, such as uracil-DNA glycosylase, thermophilic uracil-DNA glycosylase, G:T/U mismatch-specific DNA glycosylase (Mug), and single-strand selective monofunctional uracil-DNA glycosylase (SMUG1). These proteins have a 3-layer alpha/beta/alpha structure. Uracil-DNA glycosylases are DNA repair enzymes that excise uracil residues from DNA by cleaving the N-glycosylic bond, initiating the base excision repair pathway. Uracil in DNA can arise either through the deamination of cytosine to form mutagenic U:G mispairs, or through the incorporation of dUMP by DNA polymerase to form U:A pairs. These aberrant uracil residues are genotoxic. The sequence of uracil-DNA glycosylase is extremely well conserved in bacteria and eukaryotes as well as in herpes viruses. More distantly related uracil-DNA glycosylases are also found in poxviruses. In eukaryotic cells, UNG activity is found in both the nucleus and the mitochondria. Human UNG1 protein is transported to both the mitochondria and the nucleus. The N-terminal 77 amino acids of UNG1 seem to be required for mitochondrial localization, but the presence of a mitochondrial transit peptide has not been directly demonstrated. The most N-terminal conserved region contains an aspartic acid residue which has been proposed, based on X-ray structures, to act as a general base in the catalytic mechanism.
This entry represents a structural domain with a 3-layer alpha/beta/alpha topology. This domain can be found in acyl transferases such as bacterial malonyl-CoA ACP transacylase (FabD) and the homologous domain from eukaryotic fatty acid synthase. This domain is also found in lysophospholipases such as cytosolic phospholipase A2 (which has additional structural features), and in patatin proteins, which are plant glycoproteins that act as non-specific lipid acyl hydrolases.
Ribosomal protein L13 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L13 is known to be one of the early assembly proteins of the 50S ribosomal subunit.
This family includes ribosomal L4/L1 from eukaryotes and plants and L4 from bacteria. L4 from yeast has been shown to bind rRNA. These proteins have 246 (plant) to 427 (human) amino acids.
This entry represents a structural domain consisting of 3-layers, alpha/beta/alpha. This domain is found in both the alpha and beta chains of succinyl-CoA synthase (both GDP-forming and ADP-forming). This domain can also be found in ATP citrate synthase, malate-CoA ligase and acetate-CoA ligase (or acetyl-CoA synthase), as well as bacterial Fdr. Some members of the domain utilise ATP, others use GTP.
Ribosomal S2 proteins have been shown to belong to a family that includes 40S ribosomal subunit 40kDa proteins, putative laminin-binding proteins, NAB-1 protein and 29.3kDa protein from Haloarcula marismortui. The laminin-receptor proteins are thus predicted to be the eukaryotic homologue of the eubacterial S2 ribosomal proteins.
DNA photolyases are enzymes that bind to DNA containing pyrimidine dimers: on absorption of visible light, they catalyse dimer splitting into the constituent monomers, a process called photoreactivation. This is a DNA repair mechanism, repairing mismatched pyrimidine dimers induced by exposure to ultra-violet light. The precise mechanisms involved in substrate binding, conversion of light energy to the mechanical energy needed to rupture the cyclobutane ring, and subsequent release of the product are uncertain. Analysis of DNA lyases has revealed the presence of an intrinsic chromophore, all monomers containing a reduced FAD moiety, and, in addition, either a reduced pterin or 8-hydroxy-5-deazaflavin as a second chromophore. Either chromophore may act as the primary photon acceptor, peak absorptions occurring in the blue region of the spectrum and in the UV-B region, at a wavelength around 290 nm.
This domain binds a light harvesting cofactor.
The ATP-grasp fold is one of several distinct ATP-binding folds, and is found in enzymes that catalyse the formation of amide bonds, catalysing the ATP-dependent ligation of a carboxylate-containing molecule to an amino or thiol group-containing molecule. This fold is found in many different enzyme families, including various peptide synthetases, biotin carboxylase, synapsin, succinyl-CoA synthetase, pyruvate phosphate dikinase, glutathione synthetase and glutathionylspermidine synthase, amongst others. These enzymes contribute predominantly to macromolecular synthesis, using ATP-hydrolysis to activate their substrates.
This entry represents the pre-ATP-grasp structural domain, which precedes the ATP-grasp domain in all superfamily members, and which usually occurs at the N-terminus of the protein. The structure of the pre-ATP-grasp domain consists of alpha/beta/alpha in three layers, and is possibly a rudimentary form of the Rossmann-fold. This domain can have a substrate-binding function.
This domain is found in all tubulin chains, as well as the bacterial FtsZ family of proteins. These proteins are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ and tubulin are GTPases; this entry represents the GTPase domain. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea.
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
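The link between the linear order of the triad residues and the clan can be made concrete with a short sketch. The function names and the lookup table are hypothetical helpers written for this illustration; the residue numbers are the commonly cited ones for chymotrypsin, subtilisin and carboxypeptidase Y and are included only as example input, not taken from this entry.

```python
# Linear triad order -> clan, as stated in the text (PA, SB, SC).
CLAN_BY_TRIAD_ORDER = {"HDS": "PA", "DHS": "SB", "SDH": "SC"}


def triad_order(positions):
    """Return the one-letter residue codes sorted by sequence position.

    positions maps a residue code ('S', 'D' or 'H') to its position in
    the polypeptide chain.
    """
    ordered = sorted(positions.items(), key=lambda item: item[1])
    return "".join(residue for residue, _ in ordered)


def clan_for_triad(positions):
    """Look up the clan for a triad, or None if the order is not listed."""
    return CLAN_BY_TRIAD_ORDER.get(triad_order(positions))


# Example input (assumed, commonly cited residue numbering).
chymotrypsin = {"H": 57, "D": 102, "S": 195}
subtilisin = {"D": 32, "H": 64, "S": 221}
carboxypeptidase_y = {"S": 146, "D": 338, "H": 397}

for name, triad in [("chymotrypsin", chymotrypsin),
                    ("subtilisin", subtilisin),
                    ("carboxypeptidase Y", carboxypeptidase_y)]:
    print(name, triad_order(triad), clan_for_triad(triad))
```

The residue order is only a convenient indicator here; clan assignment itself rests on evidence of a shared structural fold.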
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of serine peptidases belongs to the MEROPS peptidase families S8 (subfamilies S8A (subtilisin) and S8B (kexin)) and S53 (sedolisin), both of which are members of clan SB.
The subtilisin family is the second largest serine protease family characterised to date. Over 200 subtilases are presently known, more than 170 of them with complete amino acid sequences. The family is widespread, being found in eubacteria, archaebacteria, eukaryotes and viruses. The vast majority of the family are endopeptidases, although there is an exopeptidase, tripeptidyl peptidase. Structures have been determined for several members of the subtilisin family: they exploit the same catalytic triad as the chymotrypsins, although the residues occur in a different order (HDS in chymotrypsin and DHS in subtilisin), but the structures show no other similarity. Some subtilisins are mosaic proteins, and others contain N- and C-terminal extensions that show no sequence similarity to any other known protein. Based on sequence homology, a subdivision into six families has been proposed.
The proprotein-processing endopeptidases kexin, furin and related enzymes form a distinct subfamily known as the kexin subfamily (S8B). These preferentially cleave C-terminally to paired basic amino acids. Members of this subfamily can be identified by subtly different motifs around the active site. Members of the kexin family, along with endopeptidases R, T and K from the yeast Tritirachium and cuticle-degrading peptidase from Metarhizium, require thiol activation. This can be attributed to the presence of Cys-173 near to the active histidine. Only one viral member of the subtilisin family is known, a 56-kDa protease from herpes virus 1, which infects the channel catfish.
Sedolisins (serine-carboxyl peptidases) are proteolytic enzymes whose fold resembles that of subtilisin; however, they are considerably larger, with the mature catalytic domains containing approximately 375 amino acids. The defining features of these enzymes are a unique catalytic triad, Ser-Glu-Asp, as well as the presence of an aspartic acid residue in the oxyanion hole. High-resolution crystal structures have now been solved for sedolisin from Pseudomonas sp. 101, as well as for kumamolisin from a thermophilic bacterium, Bacillus sp. MN-32. Mutations in the human gene lead to a fatal neurodegenerative disease.
Rhodanese, a sulphurtransferase involved in cyanide detoxification, shares an evolutionary relationship with a large family of proteins, including:
Rhodanese has an internal duplication. This domain is found as a single copy in other proteins, including phosphatases and ubiquitin C-terminal hydrolases.
Several biological processes regulate the activity of target proteins through changes in the redox state of thiol groups (S2 to SH2), where a hydrogen donor is linked to an intermediary disulphide protein. Such processes include the ferredoxin/thioredoxin system, the NADP/thioredoxin system, and the glutathione/glutaredoxin system. Several of these disulphide proteins share a common structure, consisting of a three-layer alpha/beta/alpha core. Proteins that contain domains with a thioredoxin fold include:
Transketolase C-terminal-like domains can be found in a number of different enzymes, including the C-terminal domain of the pyruvate dehydrogenase E1 component, the C-terminal domain of branched-chain alpha-keto acid dehydrogenases, and domain II of pyruvate-ferredoxin oxidoreductase (PFOR). Structural studies reveal this domain to comprise three layers, alpha/beta/alpha. The mixed beta sheet consists of five strands in the order 13245, where strand 1 is antiparallel to the others.
Pyruvate kinase (PK) catalyses the final step in glycolysis, the conversion of phosphoenolpyruvate to pyruvate with concomitant phosphorylation of ADP to ATP:
ADP + phosphoenolpyruvate = ATP + pyruvate
The enzyme, which is found in all living organisms, requires both magnesium and potassium ions for its activity. In vertebrates, there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart and brain), and M2 (early foetal tissue). In plants, PK exists as cytoplasmic and plastid isozymes, while most bacteria and lower eukaryotes have one form, except in certain bacteria, such as Escherichia coli, that have two isozymes. All isozymes appear to be tetramers of identical subunits of ~500 residues.
PK helps control the rate of glycolysis, along with phosphofructokinase and hexokinase. PK possesses allosteric sites for numerous effectors, yet the isozymes respond differently, in keeping with their different tissue distributions. The activity of L-type (liver) PK is increased by fructose-1,6-bisphosphate (F1,6BP) and lowered by ATP and alanine (gluconeogenic precursor), therefore when glucose levels are high, glycolysis is promoted, and when levels are low, gluconeogenesis is promoted. L-type PK is also hormonally regulated, being activated by insulin and inhibited by glucagon, which covalently modifies the PK enzyme. M1-type (muscle, brain) PK is inhibited by ATP, but F1,6BP and alanine have no effect, which correlates with the function of muscle and brain, as opposed to the liver.
The structures of several pyruvate kinases from various organisms have been determined. The protein comprises three or four domains: a small N-terminal helical domain (absent in bacterial PK), a beta/alpha-barrel domain, a beta-barrel domain (inserted within the beta/alpha-barrel domain), and a 3-layer alpha/beta/alpha sandwich domain.
This entry represents the 3-layer alpha/beta/alpha sandwich domain. This domain has a similar topology to the archaeal hypothetical protein, MTH1675 from Methanobacterium thermoautotrophicum.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.
The ATPase F1 complex gamma subunit forms the central shaft that connects the F0 rotary motor to the F1 catalytic core. The gamma subunit functions as a rotary motor inside the cylinder formed by the alpha(3)beta(3) subunits in the F1 complex. The best-conserved region of the gamma subunit is its C-terminus, which seems to be essential for assembly and catalysis.
More information about this protein can be found at Protein of the Month: ATP Synthases.
Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin. However, there is still considerable diversity amongst restriction endonucleases. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone.
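The palindromic nature of the recognition sites described above can be illustrated with a minimal Python sketch. It is not part of the original entry: the function names are hypothetical, and the example uses the well-known EcoRI recognition sequence GAATTC. A site is treated as palindromic when it equals its own reverse complement, that is, when it reads the same 5' to 3' on both strands.

```python
# Watson-Crick complements for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic_site(seq: str) -> bool:
    """True if the sequence reads the same 5'->3' on both strands."""
    return seq.upper() == reverse_complement(seq)


def find_sites(dna: str, site: str) -> list[int]:
    """Return the 0-based start position of every occurrence of `site` in `dna`."""
    dna, site = dna.upper(), site.upper()
    return [i for i in range(len(dna) - len(site) + 1) if dna.startswith(site, i)]


if __name__ == "__main__":
    print(is_palindromic_site("GAATTC"))              # EcoRI recognition sequence
    print(find_sites("AAGAATTCTTGAATTC", "GAATTC"))   # hypothetical test fragment
```

This sketch only locates candidate recognition sequences; it does not model cleavage position, cofactor requirements, or the conformational changes that accompany specific binding.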
There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements, as summarised below:
This entry represents the core structure found in most type II restriction endonucleases, consisting of a 3-layer alpha/beta/alpha topology with mixed beta-sheets. This core structure can be found in the restriction endonucleases EcoRI, EcoRV, BamHI, BglI, BglII, BstyI, PvuII, MunI, NseI, NgoIV, BsobI, HincII, MspI, FokI (C-terminal), EcoO109IR, as well as in lambda exonuclease, DNA mismatch repair protein MutH, VSR (very short repair) endonucleases, TnsA endonucleases (N-terminal), endonucleases I (Holliday junction resolvase), Hjc-like enzymes, XPF/Rad1/Mus81 nucleases, RecB and RecC exodeoxyribonuclease V (C-terminal), and RecU-like enzymes.
This entry represents a 3-layer alpha/beta/alpha domain found as the catalytic domain at the C-terminal in homotetrameric tRNA-intron endonucleases, and as domains 2 and 4 (C-terminal) in the homodimeric enzymes. tRNA-intron endonucleases remove tRNA introns by cleaving pre-tRNA at the 5'- and 3'-splice sites to release the intron. The products are an intron and two tRNA half-molecules bearing 2',3' cyclic phosphate and 5'-hydroxyl termini. These enzymes recognise a pseudosymmetric substrate in which 2 bulged loops of 3 bases are separated by a stem of 4 bp. Although homotetrameric enzymes contain four active sites, only two participate in the cleavage, so the enzyme should be considered a dimer of dimers.
Prokaryotes contain a single DNA-dependent RNA polymerase (RNAP) that is responsible for the transcription of all genes, while eukaryotes have three classes of RNAPs (I-III) that transcribe different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. Certain subunits of RNAPs, including RPB5 (POLR2E in mammals), are common to all three eukaryotic polymerases. RPB5 plays a role in the transcription activation process. Eukaryotic RPB5 has a bipartite structure consisting of a unique N-terminal region, plus a C-terminal region that is structurally homologous to the prokaryotic RPB5 homologue, subunit H (gene rpoH).
This entry represents the N-terminal domain of eukaryotic RPB5, which has a core structure consisting of three layers, alpha/beta/alpha. The N-terminal domain is involved in DNA binding and is part of the jaw module in the RNA pol II structure. This module is important for positioning the downstream DNA.
The catalytic domains of several polynucleotidyl transferases share a similar structure, consisting of a 3-layer alpha/beta/alpha fold that contains mixed beta sheets, suggesting that they share a similar mechanism of catalysis. Polynucleotidyl transferases containing this domain include ribonuclease H class I (RNase HI) and class II (RNase HII), HIV RNase (reverse transcriptase domain), retroviral integrase (catalytic domain), Mu transposase (core domain), transposase inhibitor Tn5 (containing additional all-alpha subdomains), DnaQ-like 3'-5' exonucleases (exonuclease domains), RuvC resolvase, and mitochondrial resolvase ydc2 (catalytic domain).
Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.
MutS is a modular protein with a complex structure, and is composed of:
Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.
This entry represents the connector domain (domain 2) found in proteins of the MutS family. The structure of the MutS connector domain consists of a parallel beta-sheet surrounded by four alpha helices, which is similar to the structure of the Holliday junction resolvase RuvC.
The bacterial cell wall provides strength and rigidity to counteract internal osmotic pressure, and protection against the environment. The peptidoglycan layer gives the cell wall its strength, and helps maintain the overall shape of the cell. The basic peptidoglycan structure of both Gram-positive and Gram-negative bacteria is comprised of a sheet of glycan chains connected by short cross-linking polypeptides. Biosynthesis of peptidoglycan is a multi-step (11-12 steps) process comprising three main stages:
Stage two involves four key Mur ligase enzymes: MurC, MurD, MurE and MurF. These four Mur ligases are responsible for the successive additions of L-alanine, D-glutamate, meso-diaminopimelate or L-lysine, and D-alanyl-D-alanine to UDP-N-acetylmuramic acid. All four Mur ligases are topologically similar to one another, even though they display low sequence identity. They are each composed of three domains: an N-terminal Rossmann-fold domain responsible for binding the UDPMurNAc substrate; a central domain (similar to ATP-binding domains of several ATPases and GTPases); and a C-terminal domain (similar to dihydrofolate reductase fold) that appears to be associated with binding the incoming amino acid. The conserved sequence motifs found in the four Mur enzymes also map to other members of the Mur ligase family, including folylpolyglutamate synthetase, cyanophycin synthetase and the capB enzyme from Bacillales.
This entry represents the C-terminal domain from all four stage 2 Mur enzymes: UDP-N-acetylmuramate-L-alanine ligase (MurC), UDP-N-acetylmuramoylalanine-D-glutamate ligase (MurD), UDP-N-acetylmuramoylalanyl-D-glutamate-2,6-diaminopimelate ligase (MurE), and UDP-N-acetylmuramoyl-tripeptide-D-alanyl-D-alanine ligase (MurF). This entry also includes the C-terminal domain of folylpolyglutamate synthase that transfers glutamate to folylpolyglutamate and cyanophycin synthetase that catalyses the biosynthesis of the cyanobacterial reserve material multi-L-arginyl-poly-L-aspartate (cyanophycin).
This C-terminal domain is almost always found together with the N-terminal domain of the cytoplasmic peptidoglycan synthetases.
The trifunctional glycinamide ribonucleotide synthetase-aminoimidazole ribonucleotide synthetase-glycinamide ribonucleotide transformylase catalyses the second, third and fifth steps in de novo purine biosynthesis. The glycinamide ribonucleotide transformylase belongs to this group.
Pyridoxal phosphate (PLP) is the active form of vitamin B6 (pyridoxine or pyridoxal). PLP is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy.
PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the epsilon-amino group of an active site lysine residue on the enzyme. The alpha-amino group of the substrate displaces the lysine epsilon-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic.
This entry represents the major region of PLP-dependent transferases. This domain has a three-layer alpha/beta/alpha sandwich topology, with mixed beta-sheets of 7 strands. The major region can be found in the following PLP-dependent transferase families:
The bacterial cell wall provides strength and rigidity to counteract internal osmotic pressure, and protection against the environment. The peptidoglycan layer gives the cell wall its strength, and helps maintain the overall shape of the cell. The basic peptidoglycan structure of both Gram-positive and Gram-negative bacteria is comprised of a sheet of glycan chains connected by short cross-linking polypeptides. Biosynthesis of peptidoglycan is a multi-step (11-12 steps) process comprising three main stages:
Stage two involves four key Mur ligase enzymes: MurC, MurD, MurE and MurF. These four Mur ligases are responsible for the successive additions of L-alanine, D-glutamate, meso-diaminopimelate or L-lysine, and D-alanyl-D-alanine to UDP-N-acetylmuramic acid. All four Mur ligases are topologically similar to one another, even though they display low sequence identity. They are each composed of three domains: an N-terminal Rossmann-fold domain responsible for binding the UDPMurNAc substrate; a central domain (similar to ATP-binding domains of several ATPases and GTPases); and a C-terminal domain (similar to dihydrofolate reductase fold) that appears to be associated with binding the incoming amino acid. The conserved sequence motifs found in the four Mur enzymes also map to other members of the Mur ligase family, including folylpolyglutamate synthetase, cyanophycin synthetase and the capB enzyme from Bacillales.
This entry represents the C-terminal domain from all four stage 2 Mur enzymes: UDP-N-acetylmuramate-L-alanine ligase (MurC), UDP-N-acetylmuramoylalanine-D-glutamate ligase (MurD), UDP-N-acetylmuramoylalanyl-D-glutamate-2,6-diaminopimelate ligase (MurE), and UDP-N-acetylmuramoyl-tripeptide-D-alanyl-D-alanine ligase (MurF). This entry also includes folylpolyglutamate synthase that transfers glutamate to folylpolyglutamate and cyanophycin synthetase that catalyses the biosynthesis of the cyanobacterial reserve material multi-L-arginyl-poly-L-aspartate (cyanophycin).
This entry represents a structural domain with a core 3-layer alpha/beta/alpha structure, which can sometimes contain additional subdomains (also covered by this entry). These domains form the core domain of alkaline phosphatases. This structural domain is found in:
This family contains two related enzymes:
It has been shown that these two enzymes are evolutionarily related. The predicted secondary structures of both enzymes are similar and there are some regions of sequence similarity. One of these regions includes three residues which have been shown, by crystallographic studies, to be implicated in binding the phosphoryl group of carbamoyl phosphate.
3-isopropylmalate dehydratase (or isopropylmalate isomerase) catalyses the stereo-specific isomerisation of 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate. This enzyme performs the second step in the biosynthesis of leucine, and is present in most prokaryotes and many fungal species. The prokaryotic enzyme is a heterodimer composed of a large (LeuC) and small (LeuD) subunit, while the fungal form is a monomeric enzyme. Both forms of isopropylmalate dehydratase are related and are part of the larger aconitase family. Aconitases are mostly monomeric proteins which share four domains in common and contain a single, labile [4Fe-4S] cluster. Three structural domains (1, 2 and 3) are tightly packed around the iron-sulphur cluster, while a fourth domain (4) forms a deep active-site cleft. The prokaryotic enzyme is encoded by two adjacent genes, leuC and leuD, corresponding to aconitase domains 1-3 and 4 respectively. LeuD does not bind an iron-sulphur cluster. It is thought that some prokaryotic isopropylmalate dehydratases can also function as homoaconitase, converting cis-homoaconitate to homoisocitric acid in lysine biosynthesis. Homoaconitase has been identified in higher fungi (mitochondria), several archaea and one thermophilic species of bacteria, Thermus thermophilus.
Aconitase (aconitate hydratase) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also 2 forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3). Below is a description of some of the multi-functional activities associated with different aconitases.
This entry represents a region containing 3 domains, each with a 3-layer alpha/beta/alpha topology. This region represents the [4Fe-4S] cluster-binding region found at the N-terminal of eukaryotic mAcn, cAcn/IRP1 and IRP2, and bacterial AcnA, but in the C-terminal of bacterial AcnB. This domain is also found in the large subunit of isopropylmalate dehydratase (LeuC).
More information about these proteins can be found at Protein of the Month: Aconitase.
The alpha-D-phosphohexomutase superfamily is composed of four related enzymes, each of which catalyses a phosphoryl transfer on its sugar substrate: phosphoglucomutase (PGM), phosphoglucomutase/phosphomannomutase (PGM/PMM), phosphoglucosamine mutase (PNGM), and phosphoacetylglucosamine mutase (PAGM). PGM converts D-glucose 1-phosphate into D-glucose 6-phosphate, and participates in both the breakdown and synthesis of glucose. PGM/PMM are primarily bacterial enzymes that use either glucose or mannose as substrate, participating in the biosynthesis of a variety of carbohydrates such as lipopolysaccharides and alginate. Both PNGM and PAGM are involved in the biosynthesis of UDP-N-acetylglucosamine.
Despite differences in substrate specificity, these enzymes share a similar catalytic mechanism, converting 1-phospho-sugars to 6-phospho-sugars via a bisphosphorylated 1,6-phospho-sugar. The active enzyme is phosphorylated at a conserved serine residue and binds one magnesium ion; residues around the active site serine are well conserved among family members. The reaction mechanism involves phosphoryl transfer from the phosphoserine to the substrate to create a bisphosphorylated sugar, followed by a phosphoryl transfer from the substrate back to the enzyme.
The structures of PGM and PGM/PMM have been determined, and were found to be very similar in topology. These enzymes are both composed of four domains and a large central active site cleft, where each domain contains residues essential for catalysis and/or substrate recognition. Domain I contains the catalytic phosphoserine, domain II contains a metal-binding loop to coordinate the magnesium ion, domain III contains the sugar-binding loop that recognises the two different binding orientations of the 1- and 6-phospho-sugars, and domain IV contains a phosphate-binding site required for orienting the incoming phospho-sugar substrate.
This entry represents domains I, II and III found in alpha-D-phosphohexomutase enzymes. All three domains share a 3-layer alpha/beta/alpha topology.
Phosphoglycerate kinase (PGK) is an enzyme that catalyses the reversible transfer of a phosphate group from 1,3-diphosphoglycerate to ADP, forming 3-phosphoglycerate and ATP. In the second step of the second phase of glycolysis, 1,3-diphosphoglycerate is converted to 3-phosphoglycerate, forming one molecule of ATP. In the reverse reaction, one molecule of ADP is formed. This reaction is essential in most cells for the generation of ATP in aerobes, for fermentation in anaerobes and for carbon fixation in plants.
PGK is found in all living organisms and its sequence has been highly conserved throughout evolution. The enzyme exists as a monomer containing two nearly equal-sized domains that correspond to the N- and C-termini of the protein (the last 15 C-terminal residues loop back into the N-terminal domain). 3-phosphoglycerate (3-PG) binds to the N-terminal domain, while the nucleotide substrates, MgATP or MgADP, bind to the C-terminal domain of the enzyme. This extended two-domain structure is associated with large-scale 'hinge-bending' conformational changes, similar to those found in hexokinase. At the core of each domain is a 6-stranded parallel beta-sheet surrounded by alpha helices. Domain 1 has a parallel beta-sheet of six strands with an order of 342156, while domain 2 has a parallel beta-sheet of six strands with an order of 321456. Analysis of the reversible unfolding of yeast phosphoglycerate kinase leads to the conclusion that the two lobes are capable of folding independently, consistent with the presence of intermediates on the folding pathway with a single domain folded.
Phosphoglycerate kinase (PGK) deficiency is associated with haemolytic anaemia and mental disorders in man.
This entry represents the full PGK enzyme.
PFK is ~300 amino acids in length, and structural studies of the bacterial enzyme have shown it comprises two similar (alpha/beta) lobes: one involved in ATP binding and the other housing both the substrate-binding site and the allosteric site (a regulatory binding site distinct from the active site, but that affects enzyme activity). The identical tetramer subunits adopt 2 different conformations: in a 'closed' state, the bound magnesium ion bridges the phosphoryl groups of the enzyme products (ADP and fructose-1,6-bisphosphate); and in an 'open' state, the magnesium ion binds only the ADP, as the 2 products are now further apart. These conformations are thought to be successive stages of a reaction pathway that requires subunit closure to bring the 2 molecules sufficiently close to react.
Deficiency in PFK leads to glycogenosis type VII (Tarui's disease), an autosomal recessive disorder characterised by severe nausea, vomiting, muscle cramps and myoglobinuria in response to bursts of intense or vigorous exercise. Sufferers are usually able to lead a reasonably ordinary life by learning to adjust activity levels.
Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including cobalamin (vitamin B12), haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin.
This entry represents several tetrapyrrole methylases, which consist of two non-similar domains. These enzymes catalyse the methylation of their substrates using S-adenosyl-L-methionine as a methyl source. Enzymes in this family include:
This entry represents a structural domain with a thiolase-like 3-layer alpha/beta/alpha topology. This domain usually occurs in two similar copies that are related by a pseudo-dyad, and which arose through duplication. The proteins in this entry can be split into two groups: those related to thiolase, and those related to chalcone synthase. The thiolase-like enzymes include:
The chalcone synthase-like enzymes include:
The iron-only hydrogenases catalyse the two-electron reduction of two protons to yield dihydrogen, as part of an energy cycle. Fe-only hydrogenases are restricted to strictly anaerobic microbes, and are often very sensitive to molecular oxygen. The cytoplasmic monomeric Fe hydrogenases are involved in hydrogen production, while the periplasmic, heterodimeric Fe hydrogenases are involved in hydrogen uptake. Fe hydrogenases consist of two intertwined domains: the catalytic domain and the larger subunit C domain. The larger subunit C domain can be divided into three subdomains. There are five distinct metal clusters, one of which is in the catalytic domain. This entry represents the two intertwined domains, the catalytic domain and the large subunit C domain.
This entry represents a structural domain with a core fold consisting of alpha-beta(2)-(alpha-beta)2 that folds into three layers, a/b/a; some members may contain an extra C-terminal strand. This domain is found in several types of proteins, including:
This entry represents a chromo (CHRomatin Organization MOdifier) domain-like structural domain, which consists of an SH3-like beta-barrel capped by a C-terminal helix. Chromo domains are conserved modules of around 60 amino acids that are implicated in the recognition of lysine-methylated histone tails and nucleic acids. Chromo domains were originally identified in Drosophila modifiers of variegation, proteins that alter the structure of chromatin to the condensed morphology of heterochromatin. Domains with a chromo domain structural fold include:
Chromo domains can be found in various nuclear proteins, including heterochromatin protein 1 (HP1) (N-terminal chromo domain and C-terminal chromo shadow domain), where the chromo domain recognises histone tails with specifically methylated lysines; polycomb protein Pc, which is essential for maintaining the silencing state of homeotic genes during development (chromo domain important for chromatin targeting); histone methyltransferase clr4, which regulates silencing and switching at the mating-type loci and affects chromatin structure at centromeres; and the ATP-dependent helicase CHD1, which regulates ATP-dependent nucleosome assembly and mobilisation through conserved double chromo domains and a SWI2/SNF2 helicase/ATPase domain.
Chromo barrel domains are found in various histone acetyltransferases, such as MYST1 from Mus musculus (Mouse) and MOF from Drosophila melanogaster (Fruit fly). This domain can also be found in the human mortality factor 4-like protein, MRG15.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Both the L23 and L15e ribosomal proteins have a core domain consisting of a beta-(alpha)-beta-alpha-beta(2) structure folded into three layers, alpha/beta/alpha, where the beta-sheets are antiparallel.
The histidine triad motif (HIT) is related to the sequence H-phi-H-phi-H-phi-phi (where phi is a hydrophobic amino acid). Proteins containing HIT domains form a superfamily of nucleotide hydrolases and transferases that act on the alpha-phosphate of ribonucleotides. HIT-containing proteins fall into three families:
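The H-phi-H-phi-H-phi-phi pattern described above can be expressed as a regular expression. The following Python sketch is illustrative only: the set of residues counted as hydrophobic (phi) is an assumption made for this example, not a definition taken from the entry, and the function name and test peptides are hypothetical.

```python
import re

# Assumption: "phi" is approximated by this set of hydrophobic residues
# (one-letter codes). Other definitions of hydrophobicity are possible.
HYDROPHOBIC = "AVLIMFWYC"

# H-phi-H-phi-H-phi-phi, written with a character class for each phi position.
HIT_PATTERN = re.compile(
    rf"H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]{{2}}"
)


def find_hit_motifs(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in HIT_PATTERN.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical peptides, not real protein sequences.
    print(find_hit_motifs("MKHVHLHVLGG"))
    print(find_hit_motifs("MKHVHLHGG"))
```

A pattern match of this kind only flags candidate motifs; it does not establish that a protein belongs to the HIT superfamily.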
The ferredoxin protein family are electron carrier proteins with an iron-sulphur cofactor that act in a wide variety of metabolic reactions. Ferredoxins can be divided into several subgroups depending upon the physiological nature of the iron-sulphur cluster(s) and according to sequence similarities.
This entry represents members of the 2Fe-2S ferredoxin family that have a general core structure consisting of beta(2)-alpha-beta(2), which includes putidaredoxin and terpredoxin, and adrenodoxin. They are proteins of around one hundred amino acids with four conserved cysteine residues to which the 2Fe-2S cluster is ligated. This conserved region is also found as a domain in various metabolic enzymes and in multidomain proteins, such as aldehyde oxidoreductase (N-terminal), xanthine oxidase (N-terminal), phthalate dioxygenase reductase (C-terminal), succinate dehydrogenase iron-sulphur protein (N-terminal), and methane monooxygenase reductase (N-terminal).
This entry represents a ssDNA-binding transcriptional regulator domain consisting of a helix-swapped dimer of beta(4)-alpha motifs. This domain is found as a C-terminal domain in the transcriptional co-activator PC4 (where it is a dimer of two separate motifs), and in the plant transcriptional regulator PBF-2 (where it is a single chain domain formed by a tandem repeat of two motifs).
Transcriptional regulators play a critical role in controlling the level of transcription from specific genes in response to different stimuli. Members of this family of transcriptional regulators, which preferentially bind single-stranded DNA, include PBF-2 from plants, the mammalian nuclear factor 1-X (NF1-X), and positive cofactor 4 (PC4). These proteins are structurally similar, consisting of a helix-swapped dimer of beta(4)-alpha motifs.
The plant defence transcription factor PBF-2 is comprised of four p24 subunits that interact through a helix-loop-helix motif to produce a central pore. PBF-2 functions as part of the plant's defence system in response to the detection of a pathogen. Upon stimulation, PBF-2 induces several signal transduction pathways leading to changes in the expression of defence genes, including the pathogenesis-related (PR) genes.
NF1-X is one of several NF1 proteins that function as transcription factors. NF1-X consists of two functionally distinct domains: a conserved N-terminal DNA-binding domain and a C-terminal transcriptional regulatory domain. NF1-X binds to the promoter for the 3-hydroxy-3-methylglutaryl-CoA (HMG-CoA) reductase gene.
PC4 (or P15) possesses the ability to co-activate and suppress transcription via its DNA-binding activity. PC4 has been shown to stimulate transcription in vitro with diverse activators, including VP16, thyroid hormone receptor and BRCA-1, often involving TFIIA. PC4 and TFIIA are thought to facilitate the assembly of the pre-initiation complex. The repressive activity of PC4 can be alleviated by the transcription factor TFIIH, which protects promoters from PC4 repression. PC4 consists of two domains: an N-terminal regulatory domain and a C-terminal cryptic DNA-binding domain. The protein acts as a dimer with two ssDNA binding channels running in opposite directions to each other.
This entry represents a structural domain with an alpha-beta(4)-alpha(3) core fold. Domains of this structure are found in:
Tubby, an autosomal recessive mutation mapping to mouse chromosome 7, was recently found to be the result of a splicing defect in a novel gene of unknown function. This mutation maps to the tub gene. The mouse tubby mutation is the cause of maturity-onset obesity, insulin resistance and sensory deficits. By contrast with the rapid juvenile-onset weight gain seen in diabetes (db) and obese (ob) mice, obesity in tubby mice develops gradually, and strongly resembles the late-onset obesity observed in the human population. Excessive deposition of adipose tissue culminates in a two-fold increase in body weight. Tubby mice also suffer retinal degeneration and neurosensory hearing loss. The tripartite character of the tubby phenotype is highly similar to human obesity syndromes, such as Alstrom and Bardet-Biedl. Although these phenotypes indicate a vital role for tubby proteins, no biochemical function has yet been ascribed to any family member. It has been suggested, however, that the phenotypic features of tubby mice may be the result of cellular apoptosis triggered by expression of the mutated tub gene. TUB is the founding member of the tubby-like proteins, the TULPs. TULPs are found in multicellular organisms from both the plant and animal kingdoms. Ablation of members of this protein family causes disease phenotypes that are indicative of their importance in nervous-system function and development.
Mammalian TUB is a hydrophilic protein of ~500 residues. The N-terminal portion of the protein is conserved neither in length nor sequence, but, in TUB, contains the nuclear localisation signal and may have transcriptional-activation activity. The C-terminal 250 residues are highly conserved. The C-terminal extremity contains a cysteine residue that might play an important role in the normal functioning of these proteins. The crystal structure of the C-terminal core domain from mouse tubby has been determined to 1.9 Å resolution. This domain is arranged as a 12-stranded, all anti-parallel, closed beta-barrel that surrounds a central alpha helix (located at the extreme carboxyl terminus of the protein), which forms most of the hydrophobic core. Structural analyses suggest that TULPs constitute a unique family of bipartite transcription factors.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein S16 is one of the proteins from the small ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:
The small subunit ribosomal proteins can be categorised as: primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S19 contains 88-144 amino acid residues. In Escherichia coli, S19 is known to form a complex with S13 that binds strongly to 16S ribosomal RNA. Experimental evidence has revealed that S19 is moderately exposed on the ribosomal surface, and is designated a secondary rRNA binding protein. S19 belongs to a family of ribosomal proteins that includes: eubacterial S19; algal and plant chloroplast S19; cyanelle S19; archaebacterial S19; plant mitochondrial S19; and eukaryotic S15 ('rig' protein).
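The primary/secondary/tertiary classification above amounts to a dependency hierarchy, which can be sketched in Python as follows. This is a minimal illustration only: the dependency map uses placeholder names rather than real assembly-map data, and the function names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: protein -> proteins that must already be bound.
# The names are placeholders, not real 16S rRNA assembly data.
REQUIRES = {
    "primary_1": set(),
    "primary_2": set(),
    "secondary_1": {"primary_1"},
    "tertiary_1": {"secondary_1", "primary_2"},
}

def binding_class(protein, requires=REQUIRES):
    """Classify a protein by what it needs before it can bind."""
    deps = requires[protein]
    if not deps:
        return "primary"      # binds 16S rRNA directly and independently
    if all(not requires[d] for d in deps):
        return "secondary"    # needs only primary binding proteins
    return "tertiary"         # needs at least one non-primary protein

def assembly_order(requires=REQUIRES):
    """Return one binding order in which every prerequisite comes first."""
    return list(TopologicalSorter(requires).static_order())

if __name__ == "__main__":
    for name in assembly_order():
        print(name, binding_class(name))
```

Any order returned by `assembly_order` places a protein after everything it depends on, which mirrors the contingency described in the text.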
A number of eukaryotic and archaebacterial large subunit ribosomal proteins can be grouped on the basis of sequence similarities. These proteins have 87 to 128 amino-acid residues. This family consists of:
Dynein is a multisubunit microtubule-dependent motor enzyme that acts as the force generating protein of eukaryotic cilia and flagella. The cytoplasmic isoform of dynein acts as a motor for the intracellular retrograde motility of vesicles and organelles along microtubules.
Dynein is composed of a number of ATP-binding large subunits, intermediate-size subunits and small subunits. Among the small subunits is a group of highly conserved proteins, which make up this family.
Both type 1 (DLC1) and 2 (DLC2) dynein light chains have a similar two-layer alpha-beta core structure consisting of beta-alpha(2)-beta-X-beta(2).
This entry represents a structural domain with an alpha/beta-hammerhead fold, where the beta-hammerhead motif is similar to that in barrel-sandwich hybrids. Domains of this structure can be found in ribosomal proteins L10e and L16.
The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is a versatile protein-protein interaction motif involved in many cellular functions, including transcriptional regulation, cytoskeleton dynamics, ion channel assembly and gating, and targeting proteins for ubiquitination. The BTB domain can occur alongside other domains: BTB-zinc finger (BTB-ZF), BTB-BACK-Kelch (BBK), voltage-gated potassium channel T1 (T1-Kv), MATH-BTB, BTB-NPH3 and BTB-BACK-PHR (BBP). Other proteins, such as Skp1 and ElonginC, consist almost exclusively of the core BTB fold. In all of these protein families, the BTB core fold is structurally conserved, consisting of a 2-layer alpha/beta topology where a cluster of alpha helices is flanked by short beta-sheets. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN.
This entry differs from IPR000210 in including POZ-containing Skp1 proteins.
Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.
Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta).
This entry represents the C-terminal dimerisation domain found primarily in EF-Tu (EF1A) proteins from bacteria, mitochondria and chloroplasts.
More information about these proteins can be found at Protein of the Month: Elongation Factors.
Superoxide dismutases (SODs) catalyse the conversion of superoxide radicals to molecular oxygen and hydrogen peroxide. Their function is to destroy the radicals that are normally produced within cells and are toxic to biological systems. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one. This family includes both single metal-binding SODs and cambialistic SODs, which can bind either Mn or Fe. Fe/MnSODs are ubiquitous enzymes that are responsible for the majority of SOD activity in prokaryotes, fungi, blue-green algae and mitochondria. Fe/MnSODs are found as homodimers or homotetramers.
The structure of Fe/MnSODs can be divided into two domains, an alpha N-terminal domain and an alpha/beta C-terminal domain, connected by a loop. The N-terminal domain consists of two helices in an antiparallel hairpin, with a left-handed twist. The C-terminal domain is of the alpha/beta type, and consists of a three-stranded antiparallel beta-sheet in the order 213, along with four helices in the arrangement alpha/beta(2)/alpha/beta/alpha(2).
This entry represents a domain found at the C-terminus of ribosomal proteins L7 and L12, and also in the adaptor protein ClpS, forming an alpha/beta sandwich.
The L7 and L12 ribosomal proteins are part of the large 50S ribosomal subunit, and occur in four copies organised as two dimers. The L7/L12 dimer probably interacts with EF-Tu. L7 and L12 differ only in a single post-translational modification: the addition of an acetyl group to the N terminus of L7.
ClpS is an adaptor protein that influences protein degradation through its binding to the N-terminal domain of the chaperone ClpA in the ClpAP chaperone-protease pair. The degradation of ClpAP substrates, both SsrA-tagged proteins and ClpA itself, is specifically inhibited by ClpS. ClpS modifies ClpA substrate specificity, potentially redirecting degradation by ClpAP toward aggregated proteins.
Ribosomal protein L11 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L11 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, plant chloroplast, red algal chloroplast, cyanelle and archaebacterial L11; and mammalian, plant and yeast L12 (YL15). L11 is a protein of 140 to 165 amino-acid residues. In E. coli, the C-terminal half of L11 has been shown to be in an extended and loosely folded conformation and is likely to be buried within the ribosomal structure.
The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.
This entry represents both the 9 kDa SRP9 and the 14 kDa SRP14 components. Both SRP9 and SRP14 have the same (beta)-alpha-beta(3)-alpha fold. The heterodimer has pseudo two-fold symmetry and is saddle-like, consisting of a curved six-stranded beta-sheet that has four helices packed on the convex side and an exposed concave surface lined with positively charged residues. The SRP9/SRP14 heterodimer is essential for SRP RNA binding, mediating the pausing of synthesis of ribosome associated nascent polypeptides that have been engaged by the targeting domain of SRP.
Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including vitamin B12, haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin.
The first stage in tetrapyrrole synthesis is the synthesis of 5-aminolaevulinic acid (ALA) via two possible routes: (1) condensation of succinyl CoA and glycine (C4 pathway) using ALA synthase, or (2) decarboxylation of glutamate (C5 pathway) via three different enzymes: glutamyl-tRNA synthetase to charge a tRNA with glutamate, glutamyl-tRNA reductase to reduce glutamyl-tRNA to glutamate-1-semialdehyde (GSA), and GSA aminotransferase to catalyse a transamination reaction to produce ALA.
The second stage is to convert ALA to uroporphyrinogen III, the first macrocyclic tetrapyrrolic structure in the pathway. This is achieved by the action of three enzymes in one common pathway: porphobilinogen (PBG) synthase (or ALA dehydratase) to condense two ALA molecules to generate porphobilinogen; hydroxymethylbilane synthase (or PBG deaminase) to polymerise four PBG molecules into preuroporphyrinogen (tetrapyrrole structure); and uroporphyrinogen III synthase to link two pyrrole units together (rings A and D) to yield uroporphyrinogen III.
Uroporphyrinogen III is the first branch point of the pathway. To synthesise cobalamin (vitamin B12), sirohaem, and coenzyme F430, uroporphyrinogen III needs to be converted into precorrin-2 by the action of uroporphyrinogen III methyltransferase. To synthesise haem and chlorophyll, uroporphyrinogen III needs to be decarboxylated into coproporphyrinogen III by the action of uroporphyrinogen III decarboxylase.
This entry represents hydroxymethylbilane synthase (or porphobilinogen deaminase), which functions during the second stage of tetrapyrrole biosynthesis. This enzyme catalyses the polymerisation of four PBG molecules into the tetrapyrrole structure, preuroporphyrinogen, with the concomitant release of four molecules of ammonia. This enzyme uses a unique dipyrromethane cofactor made from two molecules of PBG, which is covalently attached to a cysteine side chain. The tetrapyrrole product is synthesised in an ordered, sequential fashion, by initial attachment of the first pyrrole unit (ring A) to the cofactor, followed by subsequent additions of the remaining pyrrole units (rings B, C, D) to the growing pyrrole chain. The link between the pyrrole ring and the cofactor is broken once all the pyrroles have been added. This enzyme is folded into three distinct domains that enclose a single, large active site that makes use of an aspartic acid as its one essential catalytic residue, acting as a general acid/base during catalysis. A deficiency of hydroxymethylbilane synthase is implicated in the neuropathic disease acute intermittent porphyria (AIP).
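The ordered assembly described above can be illustrated with a toy Python model. It tracks only the bookkeeping stated in the text: rings A to D are attached in order to a cofactor-bound chain, one ammonia is released per PBG added, and the product is released once all four rings are in place. The class and function names are hypothetical, and no chemistry or kinetics is modelled.

```python
from __future__ import annotations

from dataclasses import dataclass, field

RING_ORDER = ("A", "B", "C", "D")  # order of addition given in the text

@dataclass
class AssemblyState:
    chain: list[str] = field(default_factory=list)  # rings attached so far
    ammonia_released: int = 0                        # one per PBG added

def add_pbg(state: AssemblyState) -> AssemblyState:
    """Attach the next PBG unit to the growing chain."""
    if len(state.chain) == len(RING_ORDER):
        raise ValueError("chain is complete; the product must be released")
    next_ring = RING_ORDER[len(state.chain)]
    return AssemblyState(state.chain + [next_ring], state.ammonia_released + 1)

def synthesise() -> tuple[str, int]:
    """Run four ordered additions, then release the linear tetrapyrrole."""
    state = AssemblyState()
    for _ in RING_ORDER:
        state = add_pbg(state)
    # The link to the cofactor is broken only after ring D has been added.
    return "-".join(state.chain), state.ammonia_released

if __name__ == "__main__":
    print(synthesise())
```

Calling `add_pbg` on a complete chain raises an error, reflecting that a fifth unit is not added before the product is released.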
The K homology domain is a common RNA-binding motif present in one or multiple copies in both prokaryotic and eukaryotic regulatory proteins. The KH motifs may act cooperatively to bind RNA in the case of multiple motifs, or independently in the case of single KH motif proteins. Prokaryotic (pKH) and eukaryotic (eKH) KH domains share a KH-motif, but have different topologies. The pKH domain has been found in a number of proteins, including the N-terminal domain of the S3 ribosomal protein, the C-terminal domain of Era GTPase and the two C-terminal domains of the NusA transcription factor. The structure of the pKH domain consists of a two-layer alpha/beta fold in the arrangement alpha/beta(2)/alpha/beta.
More information about these proteins can be found at Protein of the Month: RNA Exosomes.
Ribosomal protein S3 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S3 is known to be involved in the binding of initiator Met-tRNA. This family of ribosomal proteins includes S3 from bacteria, algae and plant chloroplast, cyanelle, archaebacteria, plant mitochondria, vertebrates, insects, Caenorhabditis elegans and yeast. This entry is the C-terminal domain.
Ribosomal protein L22 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L22 is known to bind 23S rRNA. It belongs to a family of ribosomal proteins which includes: bacterial L22; algal and plant chloroplast L22 (in legumes L22 is encoded in the nucleus instead of the chloroplast); cyanelle L22; archaebacterial L22; mammalian L17; plant L17 and yeast YL17.
This entry represents a structural domain found in the nitrogen regulatory protein PII, in ATP phosphoribosyltransferases (C-terminal domain), in the divalent ion tolerance protein CutA1, and in some bacterial hypothetical proteins. This domain consists of a ferredoxin-like alpha/beta sandwich, which forms trimeric structures with orthogonally packed beta-sheets around a three-fold axis.
PII is a trimeric protein encoded by glnB that functions as a component of the adenylation cascade involved in the regulation of glutamine synthetase (GS) activity. PII helps regulate the level of glutamine synthetase in response to nitrogen source availability. In nitrogen-limiting conditions, PII is uridylylated to form PII-UMP, which allows the deadenylation of glutamine synthetase, thus activating the enzyme. Conversely, in nitrogen excess, PII-UMP is deuridylated to PII, promoting the adenylation and deactivation of glutamine synthetase.
ATP phosphoribosyltransferase is the first enzyme of the histidine pathway. It is allosterically regulated, controlling the flow of intermediates through the pathway. The C-terminal domain is the regulatory region of the protein, which binds the allosteric inhibitor histidine.
CutA1 functions in divalent ion tolerance in bacteria, plants and animals. Divalent metal ions play key roles in all living organisms, serving as cofactors for many proteins involved in a variety of electron-transfer activities. In Escherichia coli it is thought to be involved in copper ion tolerance, excessive copper ions being toxic.
Nucleoside diphosphate kinases (NDK) are enzymes required for the synthesis of nucleoside triphosphates (NTP) other than ATP. They provide NTPs for nucleic acid synthesis, CTP for lipid synthesis, UTP for polysaccharide synthesis and GTP for protein elongation, signal transduction and microtubule polymerization.
In eukaryotes, there seems to be a small family of NDK isozymes each of which acts in a different subcellular compartment and/or has a distinct biological function. Eukaryotic NDK isozymes are hexamers of two highly related chains (A and B). By random association (A6, A5B...AB5, B6), these two kinds of chain form isoenzymes differing in their isoelectric point.
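The random association of A and B chains described above can be enumerated directly. The following is a minimal sketch, not a model taken from the source: the function names are hypothetical, and the binomial weighting assumes that both chains are equally abundant and are incorporated independently, which the text does not state.

```python
from math import comb


def _part(name: str, count: int) -> str:
    """Format one chain type, e.g. ('A', 5) -> 'A5', ('B', 1) -> 'B', ('A', 0) -> ''."""
    if count == 0:
        return ""
    return name if count == 1 else f"{name}{count}"


def hexamer_compositions(n_chains: int = 6, frac_a: float = 0.5):
    """List every A/B composition of an n-chain oligomer with a binomial weight.

    frac_a is the assumed fraction of A chains in the pool (an illustration
    parameter, not a measured value).
    """
    compositions = []
    for n_a in range(n_chains, -1, -1):
        n_b = n_chains - n_a
        label = _part("A", n_a) + _part("B", n_b)
        # Probability of drawing n_a A chains in n_chains independent picks.
        weight = comb(n_chains, n_a) * frac_a**n_a * (1 - frac_a) ** n_b
        compositions.append((label, weight))
    return compositions


if __name__ == "__main__":
    for label, weight in hexamer_compositions():
        print(f"{label:>5}  {weight:.4f}")
```

Under these assumptions there are seven distinct compositions, from A6 to B6, and the mixed A3B3 species is the most frequent. Changing `frac_a` shifts the distribution toward one of the homohexamers.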
NDK are proteins of about 17 kDa that act via a ping-pong mechanism in which a histidine residue is phosphorylated by transfer of the terminal phosphate group from ATP. In the presence of magnesium, the phosphoenzyme can transfer its phosphate group to any NDP to produce an NTP.
NDK isozymes have been sequenced from prokaryotic and eukaryotic sources. It has also been shown that the Drosophila awd (abnormal wing discs) protein is a microtubule-associated NDK. Mammalian NDK is also known as metastasis inhibition factor nm23. The sequence of NDK has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. Our signature pattern contains this residue.
Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.
EF2 (or EF-G) participates in the elongation phase of protein synthesis by promoting the GTP-dependent translocation of the peptidyl-tRNA of the nascent protein chain from the A-site (acceptor site) to the P-site (peptidyl-tRNA site) of the ribosome. EF2 also has a role after the termination phase of translation, where, together with the ribosomal recycling factor, it facilitates the release of tRNA and mRNA from the ribosome, and the splitting of the ribosome into two subunits. EF2 is folded into five domains, with domains I and II forming the N-terminal block, domains IV and V forming the C-terminal block, and domain III providing the covalently linked flexible connection between the two. Domains III and V have the same fold (although they are not completely superimposable and domain III lacks some of the superfamily characteristics), consisting of an alpha/beta sandwich with an antiparallel beta-sheet in a (beta/alpha/beta)x2 topology. This double split beta/alpha/beta fold is also seen in a number of ribonucleotide binding proteins. It is the most common motif occurring in the translation system and is referred to as the ribonucleoprotein (RNP) or RNA recognition (RRM) motif.
This domain is found in EF2 proteins from both prokaryotes and eukaryotes, as well as in some tetracycline resistance proteins, peptide chain release factors, and in the C-terminal region of the bacterial hypothetical protein, YigZ.
Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta).
This entry represents the guanine nucleotide exchange domain of the beta (EF-1beta, also known as EF1B-alpha) and delta (EF-1delta, also known as EF1B-beta) chains of EF1B proteins from eukaryotes and archaea. The beta and delta chains have exchange activity, which mainly resides in their homologous guanine nucleotide exchange domains, found in the C-terminal region of the peptides. Their N-terminal regions may be involved in interactions with the gamma chain (EF-1gamma).
This is the anticodon binding domain found in some phenylalanyl tRNA synthetases. The domain has a ferredoxin fold, consisting of an alpha+beta sandwich with anti-parallel beta-sheets (beta-alpha-beta x2).
Ribosomal protein S6 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S6 is known to bind together with S18 to 16S ribosomal RNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, red algal chloroplast and cyanelle S6 ribosomal proteins.
Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins.
The small ribosomal subunit protein S10 consists of about 100 amino acid residues. In Escherichia coli, S10 is involved in binding tRNA to the ribosome, and also operates as a transcriptional elongation factor. Experimental evidence has revealed that S10 has virtually no groups exposed on the ribosomal surface, and is one of the "split proteins": these are a discrete group that are selectively removed from 30S subunits under low salt conditions and are required for the formation of activated 30S reconstitution intermediate (RI*) particles. S10 belongs to a family of proteins that includes: bacterial S10; algal chloroplast S10; cyanelle S10; archaebacterial S10; Marchantia polymorpha and Prototheca wickerhamii mitochondrial S10; Arabidopsis thaliana mitochondrial S10 (nuclear encoded); vertebrate S20; plant S20; and yeast URP2.
Nucleotidyltransferases can be divided into two classes based on highly conserved features of the nucleotidyltransferase motif. Class I enzymes include eukaryotic poly(A) polymerase (PAP), archaeal tRNA CCA-adding enzyme and possibly DNA polymerase beta, while class II enzymes include eukaryotic and eubacterial tRNA CCA-adding enzymes. This entry represents the C-terminal domain of class I nucleotidyltransferases. The C-terminal domain has an alpha/beta sandwich fold, although the archaeal tRNA CCA-adding enzyme has a large insertion; this fold is reminiscent of the RNA-recognition motif fold.
Poly(A) polymerase, the enzyme at the heart of the polyadenylation machinery, is a template-independent RNA polymerase that specifically incorporates ATP at the 3' end of mRNA. In eukaryotes, polyadenylation of pre-mRNA plays an essential role in the initiation step of protein synthesis, as well as in the export and stability of mRNAs. The catalytic domain of poly(A) polymerase shares substantial structural homology with other nucleotidyl transferases such as DNA polymerase beta and kanamycin transferase. The three invariant aspartates of the catalytic triad ligate two of the three active site metals. One of these metals also contacts the adenine ring. Furthermore, conserved, catalytically important residues contact the nucleotide. These contacts, taken together with metal coordination of the adenine base, provide a structural basis for ATP selection by poly(A) polymerase.
The archaeal CCA-adding enzyme builds and repairs the 3' end of tRNA. A single active site (nucleotidyltransferase motif) adds both CTP and ATP. This enzyme is the only RNA polymerase that can build or rebuild a specific nucleic acid sequence without using a nucleic acid template.
Guanylate cyclases catalyse the formation of cyclic GMP (cGMP) from GTP. cGMP acts as an intracellular messenger, activating cGMP-dependent kinases and regulating cGMP-sensitive ion channels. The role of cGMP as a second messenger in vascular smooth muscle relaxation and retinal photo-transduction is well established. Guanylate cyclase is found both in the soluble and particulate fractions of eukaryotic cells. The soluble and plasma membrane-bound forms differ in structure, regulation and other properties. Most currently known plasma membrane-bound forms are receptors for small polypeptides. The soluble forms of guanylate cyclase are cytoplasmic heterodimers having alpha and beta subunits.
In all characterised eukaryote guanylyl- and adenylyl cyclases, cyclic nucleotide synthesis is carried out by the conserved class III cyclase domain.
Ribosomal protein L30 is one of the proteins from the large ribosomal subunit. L30 belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial and archaeal L30, yeast mitochondrial L33, and Drosophila melanogaster, Dictyostelium discoideum (slime mold), fungal and mammalian L7 ribosomal proteins. L30 from bacteria are small proteins of about 60 residues, those from archaea are proteins of about 150 residues, and eukaryotic L7 are proteins of about 250 to 270 residues.
This entry represents a domain with a ferredoxin-like fold, with a core structure consisting of beta-alpha-beta-alpha-beta. This domain is found in prokaryotic ribosomal protein L30 (short-chain member of the family), as well as in archaeal L30 (L30a) (long-chain member of the family), the latter containing an additional C-terminal (sub)domain.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine and threonine belong to class II.
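The class assignments listed above lend themselves to a simple lookup. The sketch below only encodes the two lists as given in the text, together with the preferred hydroxyl for each class; the names `CLASS_I`, `CLASS_II` and `synthetase_class` are hypothetical, and the a/b/c subclasses are not modelled because the text does not give their membership.

```python
# Three-letter amino acid codes for the synthetase specificities named in the text.
CLASS_I = frozenset(
    {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
)
CLASS_II = frozenset(
    {"Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"}
)

# Hydroxyl of the tRNA to which the aminoacyl group is coupled, per the text
# (stated as a preference for class II).
ACYLATION_SITE = {"I": "2'-hydroxyl", "II": "3'-hydroxyl"}


def synthetase_class(amino_acid: str) -> str:
    """Return 'I' or 'II' for a three-letter amino acid code (case-insensitive)."""
    code = amino_acid.strip().capitalize()
    if code in CLASS_I:
        return "I"
    if code in CLASS_II:
        return "II"
    raise ValueError(f"unknown amino acid code: {amino_acid!r}")


if __name__ == "__main__":
    for aa in ("Arg", "thr", "PHE"):
        cls = synthetase_class(aa)
        print(f"{aa}: class {cls}, acylation at the {ACYLATION_SITE[cls]}")
```

The two sets are disjoint and together cover the 20 standard amino acids, which makes the lookup total for valid three-letter codes.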
This entry represents a structural domain containing a two-layer core alpha/beta structure: alpha-beta(2)-alpha-beta(2). It is a putative editing domain found in the N-terminal part of threonyl-tRNA synthetase (ThrRS), in the C-terminal part of alanyl-tRNA synthetase (AlaRS), and as a stand-alone hypothetical protein from the archaeon Pyrococcus horikoshii; the fold is probably a circular permutation of that of LuxS.
This domain, also called additional domain 1 (Add-1), is found at the N-terminus of arginyl-tRNA synthetase. It is about 140 residues long, and it has been suggested that it is involved in tRNA recognition.
The ribosome recycling factor or ribosome release factor (RRF) dissociates ribosomes from mRNA after termination of translation, and is essential for bacterial growth. Thus ribosomes are 'recycled' and ready for another round of protein synthesis.
Initiation factor 3 (IF-3) (gene infC) is one of the three factors required for the initiation of protein biosynthesis in bacteria. IF-3 is thought to function as a fidelity factor during the assembly of the ternary initiation complex, which consists of the 30S ribosomal subunit, the initiator tRNA and the messenger RNA. IF-3 is a basic protein that binds to the 30S ribosomal subunit. The chloroplast initiation factor IF-3(chl) is a protein that enhances the poly(A,U,G)-dependent binding of the initiator tRNA to chloroplast ribosomal 30S subunits; its central section is evolutionarily related to the sequence of bacterial IF-3.
This entry represents an alpha/beta domain consisting of alternating beta-strands and alpha-helices in two layers. This domain is found in RNA 3'-terminal phosphate cyclase (RPTC), where it occurs as a duplication of three repeats of this fold packed together around a pseudo three-fold axis. RNA cyclases are a family of RNA-modifying enzymes that catalyse the ATP-dependent conversion of the 3'-phosphate to the 2',3'-cyclic phosphodiester at the end of RNA. These cyclases contain an insert alpha/beta domain with a thioredoxin topology.
This domain is also found in enolpyruvate transferase, where it occurs as a duplication of six repeats of this fold organised into two RPTC-like domains. Enolpyruvate transferase is the first enzyme in bacterial peptidoglycan biosynthesis, catalysing the transfer of enolpyruvate from phosphoenolpyruvate to UDP-N-acetyl-glucosamine.
DCoH is the dimerisation cofactor of hepatocyte nuclear factor 1 (HNF-1) that functions as both a transcriptional coactivator and a pterin dehydratase. X-ray crystallographic studies have shown that the ligand binds at four sites per tetrameric enzyme, with little apparent conformational change in the protein.
DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.
RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:
RNA polymerase (RNAP) II, which is responsible for all mRNA synthesis in eukaryotes, consists of 12 subunits. Subunits Rpb3 and Rpb11 form a heterodimer that is functionally analogous to the archaeal RNAP D/L heterodimer, and the prokaryotic RNAP alpha subunit homodimer. In each case, they play a key role in RNAP assembly by forming a platform on which the catalytic subunits (eukaryotic Rpb1/Rpb2, and prokaryotic beta/beta-prime) can interact. These different subunits share regions of homology. Rpb11 contains a domain (Rpb11-like domain) that is required for dimerisation, and binds to a homologous region on Rpb3. The Rpb11-like domain in Rpb11 and archaeal L subunits is contiguous, whereas in Rpb3, archaeal D, and prokaryotic alpha subunits, the Rpb11-like domain is interrupted by an insert domain. In the prokaryotic RNAP alpha subunit, the Rpb11-like domain and the insert domain form two subregions of the N-terminal domain.
The Rpb11-like domain has a two-layer alpha/beta fold consisting of beta(2)-alpha-beta(2)-alpha. Rpb3 and Rpb11 in yeast RNAP have been shown to share a high degree of sequence and structural similarity to the alpha subunit of bacterial RNAP.
Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.
MutS is a modular protein with a complex structure, and is composed of:
Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.
This entry represents the N-terminal domain of proteins in the MutS family of DNA mismatch repair proteins. The N-terminal domain of MutS is responsible for mismatch recognition and forms a 6-stranded mixed beta-sheet surrounded by three alpha-helices, which is similar to the structure of tRNA endonuclease.
The glycine-tyrosine-phenylalanine (GYF) domain is a domain of around 60 amino acids which contains a conserved GP[YF]xxxx[MV]xxWxxx[GN]YF motif. It was identified in the human intracellular protein termed CD2 binding protein 2 (CD2BP2), which binds to a site containing two tandem PPPGHR segments within the cytoplasmic region of CD2. Binding experiments and mutational analyses have demonstrated the critical importance of the GYF tripeptide in ligand binding. A GYF domain is also found in several other eukaryotic proteins of unknown function. It has been proposed that the GYF domain found in these proteins could also be involved in proline-rich sequence recognition. Resolution of the structure of the CD2BP2 GYF domain by NMR spectroscopy revealed a compact domain with a beta-beta-alpha-beta-beta topology, where the single alpha-helix is tilted away from the twisted, anti-parallel beta-sheet. The conserved residues of the GYF domain create a contiguous patch of predominantly hydrophobic nature which forms an integral part of the ligand-binding site. There is limited homology within the C-terminal 20-30 amino acids of various GYF domains, supporting the idea that this part of the domain is structurally but not functionally important.
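The conserved GP[YF]xxxx[MV]xxWxxx[GN]YF motif can be translated into a regular expression, reading each x as any single residue. The sketch below is only an illustration: the function name `find_gyf_motifs` is hypothetical, and the test sequence is synthetic (alanine filler around the conserved positions), not a real CD2BP2 sequence. A motif match alone does not establish that a protein contains a folded GYF domain.

```python
import re

# GP[YF]xxxx[MV]xxWxxx[GN]YF, with each 'x' written as '.' (any residue).
GYF_MOTIF = re.compile(r"GP[YF].{4}[MV].{2}W.{3}[GN]YF")


def find_gyf_motifs(sequence: str):
    """Return (start_index, matched_segment) for each non-overlapping motif hit.

    The sequence is expected in one-letter amino acid code; it is upper-cased
    before matching, and start_index is 0-based.
    """
    return [
        (match.start(), match.group())
        for match in GYF_MOTIF.finditer(sequence.upper())
    ]


if __name__ == "__main__":
    # Synthetic sequence built only to exercise the pattern.
    synthetic = "AAA" + "GPY" + "AAAA" + "M" + "AA" + "W" + "AAA" + "G" + "YF" + "AAA"
    print(find_gyf_motifs(synthetic))
```

The pattern spans 17 residues, so it covers only the conserved core of the roughly 60-residue domain.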
Ribosomal protein L5 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L5 is known to be involved in binding 5S RNA to the large ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:
L5 is a protein of about 180 amino-acid residues.
Prokaryotes contain a single DNA-dependent RNA polymerase (RNAP) that is responsible for the transcription of all genes, while eukaryotes have three classes of RNAPs (I-III) that transcribe different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. Certain subunits of RNAPs, including RPB5 (POLR2E in mammals), are common to all three eukaryotic polymerases. RPB5 plays a role in the transcription activation process. Eukaryotic RPB5 has a bipartite structure consisting of a unique N-terminal region, plus a C-terminal region that is structurally homologous to the prokaryotic RPB5 homologue, subunit H (gene rpoH).
This entry represents prokaryotic subunit H and the C-terminal domain of eukaryotic RPB5, which share a two-layer alpha/beta fold, with a core structure of beta/alpha/beta/alpha/beta(2).
This domain is found in the tubulin alpha, beta and gamma chains, as well as the bacterial FtsZ family of proteins. These proteins are GTPases and are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of the cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea. This is the C-terminal domain.
This entry represents a structural domain with a core structure consisting of beta-alpha-beta-alpha-beta(2), which is found in two enzymes of the purine biosynthetic pathway: at the N-terminal of aminoimidazole ribonucleotide (AIR) synthetase (PurM), as well as the N1 and N2 domains of formylglycinamide ribonucleotide (FGAR) amidotransferase (PurL) (PurM-like module). PurM and PurL utilise ATP to activate the oxygen of an amide within their substrate toward nucleophilic attack by a nitrogen. PurM uses the product of PurL, formylglycinamidine ribonucleotide (FGAM) and ATP to make AIR, ADP and P(i). It is also found as domains 1 and 3 in phosphoribosylformylglycinamidine synthase II (smPurL) (carries a duplication: tandem repeats of two PurM-like units arranged like the PurM subunits in the dimer).
This domain is also found at the N-terminal of thiamine monophosphate kinase (ThiL). ThiL phosphorylates thiamin monophosphate to form thiamin pyrophosphate, an essential cofactor that is synthesised de novo by Salmonella typhimurium.
This entry represents a dimerisation domain that is usually found at the C-terminal of FAD and NAD-linked reductases. This domain has a core alpha+beta sandwich structure consisting of beta(3,4)-alpha(3). The first two domains are of the same beta/beta/alpha fold. This domain can be found in the following proteins:
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases.
In eukaryotes, cyclin-dependent protein kinases interact with cyclins to regulate cell cycle progression, and are required for the G1 and G2 stages of cell division. The proteins bind to a regulatory subunit, cyclin-dependent kinase regulatory subunit (CKS), which is essential for their function. This regulatory subunit is a small protein of 79 to 150 residues. In yeast (gene CKS1) and in fission yeast (gene suc1) a single isoform is known, while mammals have two highly related isoforms. The regulatory subunits exist as hexamers, formed by the symmetrical assembly of 3 interlocked homodimers, creating an unusual 12-stranded beta-barrel structure. Through the barrel centre runs a 12 Å diameter tunnel, lined by 6 exposed helix pairs. Six kinase units can be modelled to bind the hexameric structure, which may thus act as a hub for cyclin-dependent protein kinase multimerisation.
The N-terminal domain of the ribosomal protein L9 is a regulatory RNA-binding module that binds to 23S rRNA. L9 is composed of two domains and functions as a structural protein in the large subunit of the ribosome.
The N-terminal domain of eukaryotic RNase HI, which is lacking in retroviral and prokaryotic enzymes, shows a striking structural similarity to the L9 N-terminal domain, and may also function as a regulatory RNA-binding module. Eukaryotic RNases HI possess either one or two copies of the small N-terminal domain, in addition to the well-conserved catalytic RNase H domain. RNase HI belongs to the family of ribonuclease H enzymes that recognise RNA:DNA hybrids and degrade the RNA component.
The structures of both the L9 and the RNase HI N-terminal domains consist of a three-stranded antiparallel beta-sheet sandwiched between two short alpha-helices. The hydrophobic core of the domain is formed by the conserved residues that are involved in the packing of the alpha-helices onto the beta-sheet. The (beta)2/alpha/beta/alpha topology of the domain differs from the structures of known RNA binding domains such as the double-stranded RNA binding domain (dsRBD), the hnRNP K homology (KH) domain and the RNP motif.
The PH (phosphorolytic) domain is responsible for 3'-5' exoribonuclease activity, although in some proteins this domain has lost its catalytic function. An active PH domain uses inorganic phosphate as a nucleophile, adding it across the phosphodiester bond between the end two nucleotides in order to release ribonucleoside 5'-diphosphate (rNDP) from the 3' end of the RNA substrate.
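The phosphorolytic step described above can be summarised as a reaction scheme. This is only a sketch: the subscripted symbol RNA_n is shorthand introduced here (it does not appear in the original text) for an RNA substrate of n nucleotides, P_i stands for inorganic phosphate, and rNDP for the released ribonucleoside 5'-diphosphate.

```latex
\[
\mathrm{RNA}_{n} + \mathrm{P_i} \;\longrightarrow\; \mathrm{RNA}_{n-1} + \mathrm{rNDP}
\]
```

Each turnover shortens the substrate by one nucleotide at its 3' end, which is why the activity is described as 3'-5' exoribonucleolytic.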
PH domains can be found in bacterial/organelle RNases and PNPases (polynucleotide phosphorylases), as well as in archaeal and eukaryotic RNA exosomes, the latter acting as nano-compartments for the degradation or processing of RNA (including mRNA, rRNA, snRNA and snoRNA). Bacterial/organelle PNPases share a common barrel structure with RNA exosomes, consisting of a hexameric ring of PH domains that act as a degradation chamber, and an S1-domain/KH-domain containing cap that binds the RNA substrate (and sometimes accessory proteins) in order to regulate and restrict entry into the degradation chamber. Unstructured RNA substrates feed in through the pore made by the S1 domains, are degraded by the PH domain ring, and exit as nucleotides via the PH pore at the opposite end of the barrel.
This entry represents the phosphorolytic (PH) domain 2, which has a core 3-layer alpha/beta/alpha structure. This domain is found in bacterial/organelle PNPases and in archaeal/eukaryotic exosomes.
More information about these proteins can be found at Protein of the Month: RNA Exosomes.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer.
Clathrin coats contain both clathrin and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors. All AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). Each subunit has a specific function. Adaptin subunits recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal appendage domains. By contrast, GGAs are monomers composed of four domains, which have functions similar to AP subunits: an N-terminal VHS (Vps27p/Hrs/Stam) domain, a GAT (GGA and Tom1) domain, a hinge region, and a C-terminal GAE (gamma-adaptin ear) domain. The GAE domain is similar to the AP gamma-adaptin ear domain, being responsible for the recruitment of accessory proteins that regulate clathrin-mediated endocytosis.
While clathrin mediates endocytic protein transport from ER to Golgi, coatomers (COPI, COPII) primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and budding from Golgi membranes. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta', gamma, delta, epsilon and zeta subunits.
The alpha and beta2 adaptor subunits can each be divided into a trunk domain and the appendage domain (or ear domain), separated by a linker region. Clathrin polymerisation is promoted by its binding to the beta2 appendage and hinge domains. The alpha appendage domain interacts with a number of accessory proteins, including eps15, epsin, amphiphysin, AP180, auxilin, numb, and Dab2, thereby regulating the translocation of these proteins to the bud site.
This entry represents a subdomain of the appendage (ear) domain of alpha- and beta-adaptin from AP clathrin adaptor complexes, and the appendage domain of the gamma subunit of coatomer complexes. These domains have a three-layer arrangement, alpha-beta-alpha, with a bifurcated antiparallel beta-sheet. Although the appendage domains from AP adaptins and coatomers share a similar fold, there is little sequence identity between them. However, they also share similar motif-based cargo recognition and accessory factor recruitment mechanisms.
More information about these proteins can be found at Protein of the Month: Clathrin.
This entry represents a structural domain consisting of a 3-layer alpha/beta/alpha fold. The beta layer is composed of seven beta-sheets, and the overall order is: (beta-hairpin)-beta(3)-alpha-beta(4)-alpha. Domains with this structure are found in the following protein families:
This entry represents a structural domain found in several acyl-CoA acyltransferase enzymes. This domain has a 3-layer alpha/beta/alpha structure that contains mixed beta-sheets, and can be found in the following proteins:
Several proteins carry a duplication of this domain, which consists of two NAT-like domains swapped with the C-terminal strands, including:
A protein structurally similar to profilin is present in the genome of Variola virus and Vaccinia virus (gene A42R).
Some of the proteins in this family are allergens. Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation.
The allergens in this family include allergens with the following designations: Ara t 8, Bet v 2, Cyn d 12, Hel a 2, Mer a 1 and Phl p 11.
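The designation rule described above (first three letters of the genus, a space, the first letter of the species name, a space, and an arabic number) can be sketched in a few lines of Python. This is a minimal illustration, not an official tool: the function names `allergen_designation` and `parse_designation` are hypothetical, the capitalisation is inferred from the listed examples, and the disambiguation rule (extra letters when two species collide) is deliberately not modelled.

```python
import re
from typing import Tuple


def allergen_designation(genus: str, species: str, number: int) -> str:
    """Build a designation such as 'Bet v 2' from genus, species and number."""
    if len(genus) < 3 or not species or number < 1:
        raise ValueError("need a genus of at least 3 letters, a species name and a positive number")
    # First three letters of the genus, first letter of the species, arabic number.
    return f"{genus[:3].capitalize()} {species[0].lower()} {number}"


# Basic pattern only (assumption): the extra letters used to separate
# colliding species designations are not accepted by this expression.
DESIGNATION_RE = re.compile(r"^([A-Z][a-z]{2}) ([a-z]) ([1-9][0-9]*)$")


def parse_designation(text: str) -> Tuple[str, str, int]:
    """Split a basic designation into (genus prefix, species initial, number)."""
    match = DESIGNATION_RE.match(text)
    if match is None:
        raise ValueError(f"not a basic allergen designation: {text!r}")
    return match.group(1), match.group(2), int(match.group(3))


if __name__ == "__main__":
    print(allergen_designation("Betula", "verrucosa", 2))
    print(parse_designation("Phl p 11"))
```

Keeping construction and parsing separate makes it easy to check that a designation read from a file round-trips to the same string.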
The generic name 'NUDIX hydrolases' (NUcleoside DIphosphate linked to some other moiety X) has been coined for this domain family. The family can be divided into a number of subgroups, of which MutT anti-mutagenic activity represents only one type; most of the rest hydrolyse diverse nucleoside diphosphate derivatives (including ADP-ribose, GDP-mannose, TDP-glucose, NADH, UDP-sugars, dNTP and NTP).
This entry represents a structural domain consisting of segregated alpha and beta regions in 3-layers. Homologous domains with this structure are found in:
DHBP synthase RibB catalyses the conversion of D-ribulose 5-phosphate to formate and 3,4-dihydroxy-2-butanone 4-phosphate, the latter serving as the biosynthetic precursor for the xylene ring of riboflavin. In Photobacterium leiognathi, the riboflavin synthesis genes ribB (DHBP synthase), ribE (riboflavin synthase), ribH (lumazine synthase) and ribA (GTP cyclohydrolase II) all reside in the lux operon. RibB is sometimes found as a bifunctional enzyme with GTP cyclohydrolase II that catalyses the first committed step in the biosynthesis of riboflavin. No sequences with significant homology to DHBP synthase are found in the metazoa.
The YrdC family of hypothetical proteins is widely distributed in eukaryotes and prokaryotes and occurs as: (i) independent proteins, (ii) proteins with C-terminal extensions, and (iii) domains in larger proteins, some of which are implicated in regulation. YrdC from Escherichia coli preferentially binds to double-stranded RNA and DNA. YrdC is predicted to be an rRNA maturation factor, as deletions in its gene lead to immature ribosomal 30S subunits and, consequently, fewer translating ribosomes. Therefore, YrdC may function by keeping an rRNA structure needed for proper processing of 16S rRNA, especially at lower temperatures. Sua5 is an example of a multi-domain protein that contains an N-terminal YrdC-like domain and a C-terminal Sua5 domain. Sua5 was identified in Saccharomyces cerevisiae (Baker's yeast) as a suppressor of a translation initiation defect in the cytochrome c gene and is required for normal growth in yeast; however, its exact function remains unknown. HypF is involved in the synthesis of the active site of [NiFe]-hydrogenases.
5,10-methylenetetrahydrofolate + dUMP = dihydrofolate + dTMP. This provides the sole de novo pathway for production of dTMP and is the only enzyme in folate metabolism in which the 5,10-methylenetetrahydrofolate is oxidised during one-carbon transfer. The enzyme is essential for regulating the balanced supply of the 4 DNA precursors in normal DNA replication: defects in the enzyme activity affecting the regulation process cause various biological and genetic abnormalities, such as thymineless death. The enzyme is an important target for certain chemotherapeutic drugs. Thymidylate synthase is an enzyme of about 30 to 35 kDa in most species except in protozoa and plants, where it exists as a bifunctional enzyme that includes a dihydrofolate reductase domain. A cysteine residue is involved in the catalytic mechanism (it covalently binds the 5,6-dihydro-dUMP intermediate). The sequence around the active site of this enzyme is conserved from phages to vertebrates.
The 3D structure of bovine cyt b5 is known, the fold belonging to the alpha+beta class, with 5 strands and 5 short helices forming a framework for supporting a central haem group. The cytochrome b5 domain is similar to that of a number of oxidoreductases, such as plant and fungal nitrate reductases, sulphite oxidase, yeast flavocytochrome b2 (L-lactate dehydrogenase) and plant cyt b5/acyl lipid desaturase fusion protein.
This domain is found in several ATP-binding proteins, for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
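The difference between the loose HEXXH motif and the stricter 'abXHEbbHbc' definition can be illustrated with a short motif scan. The sketch below is an approximation under stated assumptions: the residue classes `UNCHARGED` and `HYDROPHOBIC` are one possible reading of 'uncharged' and 'hydrophobic' (the text does not list them), position 'a' is left unrestricted because it is only "most often" valine or threonine, and the example peptide is a made-up toy sequence rather than a real protein.

```python
import re
from typing import List, Pattern

# Assumed residue classes (not given in the text): "uncharged" is taken as
# everything except D, E, K and R; "hydrophobic" as A, C, F, I, L, M, V, W, Y.
UNCHARGED = "ACFGHILMNPQSTVWY"
HYDROPHOBIC = "ACFILMVWY"

# Loose motif: H-E-X-X-H. The lookahead lets overlapping hits be reported.
HEXXH: Pattern[str] = re.compile(r"(?=HE[A-Z]{2}H)")

# Stricter motif a-b-X-H-E-b-b-H-b-c, with 'a' and 'X' left as any residue.
REFINED: Pattern[str] = re.compile(
    rf"(?=[A-Z][{UNCHARGED}][A-Z]HE[{UNCHARGED}]{{2}}H[{UNCHARGED}][{HYDROPHOBIC}])"
)


def motif_starts(pattern: Pattern[str], sequence: str) -> List[int]:
    """Return 0-based start positions of every (possibly overlapping) match."""
    return [m.start() for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    toy = "MKVVAHELTHAVTD"  # hypothetical peptide used only for illustration
    print("HEXXH starts:  ", motif_starts(HEXXH, toy))
    print("refined starts:", motif_starts(REFINED, toy))
```

Because the stricter pattern begins three residues upstream of the first histidine, its reported start position is offset from the HEXXH hit for the same site.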
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This entry contains proteins that belong to MEROPS peptidase family M24 (clan MG), which share a common structural fold, the "pita-bread" fold. The fold contains both alpha helices and an anti-parallel beta sheet within two structurally similar domains that are thought to be derived from an ancient gene duplication. The active site, where conserved, is located between the two domains. The fold is common to methionine aminopeptidase, aminopeptidase P, prolidase, agropine synthase and creatinase. Though many of these peptidases require a divalent cation, creatinase is not a metal-dependent enzyme.
The entry also contains proteins that have lost catalytic activity, for example Spt16, which is a component of the FACT complex. The crystal structure of the N-terminal domain of Spt16, determined to 2.1 Å, reveals an aminopeptidase P fold whose enzymatic activity has been lost. This fold binds directly to histones H3-H4 through an interaction with their globular core domains, as well as with their N-terminal tails.
The FACT complex is a stable heterodimer in Saccharomyces cerevisiae (Baker's yeast) comprising Spt16p and Pob3p. The complex plays a role in transcription initiation and promotes binding of TATA-binding protein (TBP) to a TATA box in chromatin; it also facilitates RNA Polymerase II transcription elongation through nucleosomes by destabilizing and then reassembling nucleosome structure.
Transcription factor TFIID (also known as TATA-binding protein, TBP) is a general factor that plays a central role in the activation of eukaryotic genes transcribed by RNA polymerase II. TFIID binds specifically to the TATA-box promoter element, which lies close to the position of transcription initiation. The C-terminal domain (~180 residues) of eukaryotic TFIID sequences is highly conserved and is involved in TATA-box binding. The most striking feature of the domain is the presence of 2 conserved 77 amino-acid repeats. The symmetrical disposition of these features generates a saddle-shaped structure that straddles the DNA.
DNA glycosylases are involved in the repair of damaged bases in DNA, acting to cleave the bond between the damaged, modified base and the deoxyribose sugar backbone of the DNA. These DNA repair activities are conserved from bacteria to man. Different DNA glycosylases can have different overall folds, even though many of them work by a common mechanism, involving bending the DNA and clamping on to the damaged base to excise it. This entry is represented by 3-methyladenine DNA glycosylase II (AlkA) from Escherichia coli and human 8-oxoguanine glycosylase, whose N-terminal domains display a beta-alpha-beta(4)-alpha fold similar to that found in the C-terminal domain of TFIID. However, unlike TFIID, which contains a duplication of this fold, these DNA glycosylases carry only a single copy of this fold.
S-adenosylmethionine synthetase (MAT) is the enzyme that catalyzes the formation of S-adenosylmethionine (AdoMet) from methionine and ATP. AdoMet is an important methyl donor for transmethylation and is also the propylamino donor in polyamine biosynthesis.
In bacteria there is a single isoform of AdoMet synthetase (gene metK); there are two in budding yeast (genes SAM1 and SAM2) and in mammals, while in plants there is generally a multigene family.
The sequence of AdoMet synthetase is highly conserved throughout isozymes and species. The active sites of both the Escherichia coli and rat liver MAT reside between two subunits, with contributions from side chains of residues from both subunits, resulting in a dimer as the minimal catalytic entity. The side chains that contribute to the ligand binding sites are conserved between the two proteins. In the structures of complexes with the E. coli enzyme, the phosphate groups have the same positions in the (PPi plus Pi) complex and the (ADP plus Pi) complex, and are located at the bottom of a deep cavity with the adenosyl group nearer the entrance.
This entry represents the B3/B4 domain found in tRNA synthetase beta subunits as well as in some non-tRNA synthetase proteins. This domain has a 3-layer structure containing a beta-sandwich fold of unusual topology and a putative tRNA-binding structural motif. In Thermus thermophilus, both the catalytic alpha- and the non-catalytic beta-subunits comprise the characteristic fold of the class II active-site domains. The presence of an RNA-binding domain, similar to that of the U1A spliceosomal protein, in the beta-subunit of tRNA synthetase indicates structural relationships among different families of RNA-binding proteins.
Aminoacyl-tRNA synthetases can catalyse editing reactions to correct errors produced during amino acid activation and tRNA esterification, in order to prevent the attachment of incorrect amino acids to tRNA. The B3/B4 domain of the beta subunit contains an editing site, which lies close to the active site on the alpha subunit. Disruption of this site abolished tRNA editing, a process that is essential for faithful translation of the genetic code.
This entry includes hydrogenase expression/formation protein HypE, which may be involved in the maturation of NiFe hydrogenase; AIR synthase and FGAM synthase, which are involved in de novo purine biosynthesis; and selenide, water dikinase, an enzyme which synthesizes selenophosphate from selenide and ATP.
Ribosomal protein S8 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S8 is known to bind directly to 16S ribosomal RNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups eubacterial, algal and plant chloroplast, cyanelle, archaebacterial and Marchantia polymorpha mitochondrial S8; mammalian and plant S15A; and yeast S22 (S24) ribosomal proteins.
L6 is a protein from the large (50S) subunit. In Escherichia coli, it is located in the aminoacyl-tRNA binding site of the peptidyltransferase centre, and is known to bind directly to 23S rRNA. It belongs to a family of ribosomal proteins, including L6 from bacteria, cyanelles (structures that perform similar functions to chloroplasts, but have structural and biochemical characteristics of Cyanobacteria) and mitochondria; and L9 from mammals, Drosophila, plants and yeast. L6 comprises 2 almost identical folds, suggesting that it was derived by the duplication of an ancient RNA-binding protein gene. Analysis reveals several sites on the protein surface where interactions with other ribosome components may occur, the N-terminus being involved in protein-protein interactions and the C-terminus containing possible RNA-binding sites.
Protein kinases catalyze the phosphotransfer reaction fundamental to most signalling and regulatory processes in the eukaryotic cell. The catalytic subunit contains a core that is common to both serine/threonine and tyrosine protein kinases. The catalytic domain contains the nucleotide-binding site and the catalytic apparatus in an inter-lobe cleft. It shares functional and structural similarities with the ATP-grasp fold, which is found in enzymes that catalyse the formation of an amide bond, and with PIPK (phosphoinositol phosphate kinase). The three-dimensional fold of the protein kinase catalytic domain is similar to domains found in several other proteins. These include the catalytic domain of actin-fragmin kinase, an atypical protein kinase that regulates the F-actin capping activity in plasmodia; the catalytic domain of phosphoinositide-3-kinase (PI3K), which phosphorylates phosphoinositides and as such is involved in a number of fundamental cellular processes such as apoptosis, proliferation, motility and adhesion; the catalytic domain of the MHCK/EF2 kinase, an atypical protein kinase that includes the TRP (transient receptor potential) calcium-channel kinase involved in the modulation of calcium channels in eukaryotic cells in response to external signals; choline kinase, which catalyses the ATP-dependent phosphorylation of choline during the biosynthesis of phosphatidylcholine; and 3',5'-aminoglycoside phosphotransferase type IIIa, a bacterial enzyme that confers resistance to a range of aminoglycoside antibiotics.
The name HECT comes from 'Homologous to the E6-AP Carboxyl Terminus'. Proteins containing this domain at the C-terminus include ubiquitin-protein ligase, which regulates ubiquitination of CDC25. Ubiquitin-protein ligase accepts ubiquitin from an E2 ubiquitin-conjugating enzyme in the form of a thioester, and then directly transfers the ubiquitin to targeted substrates. A cysteine residue is required for ubiquitin-thioester formation. Human thyroid receptor interacting protein 12, which also contains this domain, is a component of an ATP-dependent multisubunit protein that interacts with the ligand binding domain of the thyroid hormone receptor. It could be an E3 ubiquitin-protein ligase. Human ubiquitin-protein ligase E3A interacts with the E6 protein of the cancer-associated Human papillomavirus type 16 and Human papillomavirus type 18. The E6/E6-AP complex binds to and targets the p53 tumour-suppressor protein for ubiquitin-mediated proteolysis.
These proteins transfer the 4'-phosphopantetheine (4'-PP) moiety from coenzyme A (CoA) to the invariant serine of the pp-binding domain. This post-translational modification renders holo-ACP capable of acyl group activation via thioesterification of the cysteamine thiol of 4'-PP. This superfamily consists of two subtypes: the ACPS type, such as ACPS_ECOLI, and the Sfp type, such as SFP_BACSU. The structure of the Sfp type is known, and shows that the active site accommodates a magnesium ion. The most highly conserved regions of the alignment are involved in binding the magnesium ion.
This domain is found in a large number of proteins, including magnesium-dependent endonucleases and phosphatases involved in intracellular signalling. Proteins containing this domain include AP endonucleases, DNase I proteins, synaptojanin (an inositol-1,4,5-trisphosphate phosphatase) and sphingomyelinase.
S-adenosylmethionine decarboxylase (AdoMetDC) catalyzes the removal of the carboxylate group of S-adenosylmethionine to form S-adenosyl-5'-3-methylpropylamine which then acts as the n-propylamine group donor in the synthesis of the polyamines spermidine and spermine from putrescine.
The catalytic mechanism of AdoMetDC involves a covalently-bound pyruvoyl group. This group is post-translationally generated by a self-catalyzed intramolecular proteolytic cleavage reaction between a glutamate and a serine. This cleavage generates two chains, beta (N-terminal) and alpha (C-terminal). The N-terminal serine residue of the alpha chain is then converted by nonhydrolytic serinolysis into a pyruvoyl group.
chorismate + L-glutamine = anthranilate + pyruvate + L-glutamate. The enzyme is a tetramer comprising two I and two II components: this entry is restricted to component I, which catalyses the formation of anthranilate using ammonia rather than glutamine, while component II provides glutamine amidotransferase activity.
This entry represents a structural motif found at the C-terminal of lactate dehydrogenases and malate dehydrogenases, as well as at the C-terminal of family 4 glycoside hydrolases. These domains have an unusual fold consisting of segregated alpha-helical and beta-sheet regions, although they contain predominantly anti-parallel beta-sheets.
L-lactate dehydrogenases are metabolic enzymes that catalyse the conversion of L-lactate to pyruvate, the last step in anaerobic glycolysis. L-lactate dehydrogenase is also found as a lens crystallin in bird and crocodile eyes. Malate dehydrogenases catalyse the interconversion of malate and oxaloacetate. The enzyme participates in the citric acid cycle.
O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in 'clans'. Glycoside hydrolase family 4 comprises enzymes with several known activities: 6-phospho-beta-glucosidase; 6-phospho-alpha-glucosidase; alpha-galactosidase.
Phage integrases are enzymes that mediate unidirectional site-specific recombination between two DNA recognition sequences, the phage attachment site, attP, and the bacterial attachment site, attB. Integrases may be grouped into two major families, the tyrosine recombinases and the serine recombinases, based on their mode of catalysis. Tyrosine family integrases, such as Bacteriophage lambda integrase, utilise a catalytic tyrosine to mediate strand cleavage, tend to recognize longer attP sequences, and require other proteins encoded by the phage or the host bacteria.
The 356 amino acid lambda integrase consists of two domains: an N-terminal domain that includes residues 1-64 and is responsible for binding the arm-type sites of attP, and a C-terminal domain (CTD) that binds the lower affinity core-type sites and contains the catalytic site. The CTD can be further divided into the core-type binding domain (residues 65-169) and the catalytic core domain (170-356), the latter representing this entry. The catalytic core adopts an alpha3-beta3-alpha4 fold, where one side of the beta sheet is exposed.
The recombinases Cre from phage P1, XerD from Escherichia coli and Flp from yeast are members of the tyrosine recombinase family, and have a two-domain motif resembling that of lambda integrase, as well as sharing a conserved binding mechanism. The structural fold of their catalytic core domains resembles that of lambda integrase.
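The domain boundaries quoted above can be written down as a small lookup, which may help when mapping a residue of interest onto the domain it belongs to. This is a minimal sketch: the residue ranges come from the text, while the labels and the function names (domain_for_residue, in_ctd) are illustrative assumptions rather than part of any annotation resource.

```python
# Residue ranges (1-based, inclusive) as given in the text for the
# 356-residue lambda integrase. Labels and names are illustrative only.
LAMBDA_INT_DOMAINS = [
    (1, 64, "N-terminal arm-type binding domain"),
    (65, 169, "core-type binding domain"),
    (170, 356, "catalytic core domain"),
]


def domain_for_residue(position):
    """Return the domain label for a 1-based residue position."""
    for start, end, label in LAMBDA_INT_DOMAINS:
        if start <= position <= end:
            return label
    raise ValueError(f"position {position} is outside residues 1-356")


def in_ctd(position):
    """True if the residue lies in the C-terminal domain (residues 65-356)."""
    return domain_for_residue(position) != LAMBDA_INT_DOMAINS[0][2]
```

For example, domain_for_residue(200) returns the catalytic core label, which is the region this entry represents.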
The catalytic core of the eukaryotic DNA topoisomerase I shares significant structural similarity with the bacteriophage family of DNA integrases. Topoisomerases I promote the relaxation of DNA superhelical tension by introducing a transient single-stranded break in duplex DNA and are vital for the processes of replication, transcription, and recombination.
Peptide deformylase (PDF) is an essential metalloenzyme required for the removal of the formyl group at the N-terminus of nascent polypeptide chains in eubacteria. The enzyme acts as a monomer and binds a single zinc ion, catalysing the reaction:
N-formyl-L-methionine + H2O = formate + methionyl peptide. Catalytic efficiency strongly depends on the identity of the bound metal.
The structure of these enzymes is known. PDF, a member of the zinc metalloprotease family, comprises an active core domain of 147 residues and a C-terminal tail of 21 residues. The 3D fold of the catalytic core has been determined by X-ray crystallography and NMR. Overall, the structure contains a series of anti-parallel beta-strands that surround two perpendicular alpha-helices. The C-terminal helix contains the characteristic HEXXH motif of metalloenzymes, which is crucial for activity. The helical arrangement, and the way the histidine residues bind the zinc ion, is reminiscent of other metalloproteases, such as thermolysin or metzincins. However, the arrangement of secondary and tertiary structures of PDF, and the positioning of its third zinc ligand (a cysteine residue), are quite different. These discrepancies, together with notable biochemical differences, suggest that PDF constitutes a new class of zinc-metalloproteases.
The egg peptide speract receptor is a transmembrane glycoprotein. Other members of this family include the macrophage scavenger receptor type I (a membrane glycoprotein implicated in the pathologic deposition of cholesterol in arterial walls during atherogenesis), an enteropeptidase and T-cell surface glycoprotein CD5 (may act as a receptor in regulating T-cell proliferation).
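The HEXXH motif mentioned above (His-Glu-any-any-His in one-letter amino acid code) is simple enough to locate with a regular expression. The following is a minimal sketch, not a substitute for a profile-based domain search: the function name find_hexxh is hypothetical, and a sequence match alone says nothing about whether the histidines actually coordinate a metal ion.

```python
import re

# H-E-x-x-H, where x is any residue. The lookahead lets overlapping
# occurrences be reported as well.
HEXXH = re.compile(r"(?=(HE..H))")


def find_hexxh(sequence):
    """Return (1-based start, matched text) for every HEXXH occurrence."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in HEXXH.finditer(seq)]
```

On the made-up peptide "MSVLQHEAGHLA" (not a real PDF sequence), find_hexxh reports a single hit, "HEAGH", starting at residue 6.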
Fibrinogen plays key roles in both blood clotting and platelet aggregation. During blood clot formation, the conversion of soluble fibrinogen to insoluble fibrin is triggered by thrombin, resulting in the polymerisation of fibrin, which forms a soft clot; this is then converted to a hard clot by factor XIIIA, which cross-links fibrin molecules. Platelet aggregation involves the binding of the platelet protein receptor integrin alpha(IIb)-beta(3) to the C-terminal D domain of fibrinogen. In addition to platelet aggregation, platelet-fibrinogen interaction mediates both adhesion and fibrin clot retraction.
Fibrinogen occurs as a dimer, where each monomer is composed of three non-identical chains, alpha, beta and gamma, linked together by several disulphide bonds. The N-terminals of all six chains come together to form the centre of the molecule (E domain), from which the monomers extend in opposite directions as coiled coils, followed by C-terminal globular domains (D domains). Therefore, the domain composition is: D-coil-E-coil-D. At each end, the C-terminal of the alpha chain extends beyond the D domain as a protuberance that is important for cross-linking the molecule.
During clot formation, the N-terminal fragments of the alpha and beta chains (within the E domain) in fibrinogen are cleaved by thrombin, releasing fibrinopeptides A and B, respectively, and producing fibrin. This cleavage results in the exposure of four binding sites on the E domain, each of which can bind to a D domain from different fibrin molecules. The binding of fibrin molecules produces a polymer consisting of a lattice network of fibrins that form a long, branching, flexible fibre. Fibrin fibres interact with platelets to increase the size of the clot, as well as with several different proteins and cells, thereby promoting the inflammatory response and concentrating the cells required for wound repair at the site of damage.
This entry represents the C-terminal globular D domain of the alpha, beta and gamma chains. These domains are related to domains in other proteins: in the Parastichopus parvimensis (Sea cucumber) fibrinogen-like FreP-A and FreP-B proteins; in the C-terminus of the Drosophila scabrous protein that is involved in the regulation of neurogenesis, possibly through the inhibition of R8 cell differentiation; and in ficolin proteins, which display lectin activity towards N-acetylglucosamine through their fibrinogen-like domains.
More information about these proteins can be found at Protein of the Month: Fibrinogen.
DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.
RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:
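The idea of a transcript being complementary to the template DNA can be illustrated with a toy function. This is a sketch under stated assumptions: it uses the standard Watson-Crick pairing rules (A with U, T with A, G with C, C with G), it assumes the template is supplied reading 3' to 5' so that the returned RNA reads 5' to 3', and the name transcribe is illustrative. It models none of the initiation, elongation or termination machinery described here.

```python
# Standard base pairing between a DNA template base and the RNA base
# incorporated opposite it.
_PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5):
    """Return the RNA (written 5'->3') complementary to a DNA template
    that is given reading 3'->5' (an assumed input convention)."""
    try:
        return "".join(_PAIRING[base] for base in template_3to5.upper())
    except KeyError as err:
        raise ValueError(f"unexpected base in template: {err.args[0]!r}") from None
```

For the made-up template 3'-TACGGT-5', transcribe returns "AUGCCA".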
RNA polymerase (RNAP) II, which is responsible for all mRNA synthesis in eukaryotes, consists of 12 subunits. Subunits Rpb3 and Rpb11 form a heterodimer that is functionally analogous to the archaeal RNAP D/L heterodimer, and to the prokaryotic RNAP alpha (RpoA) subunit homodimer. In each case, they play a key role in RNAP assembly by forming a platform on which the catalytic subunits (eukaryotic Rpb1/Rpb2, and prokaryotic beta/beta') can interact.
The dimerisation domains differ between the different subunit families. In eukaryotic Rpb3, archaeal D and bacterial RpoA subunits, the dimerisation domain is comprised of a central insert domain, which interrupts an Rpb11-like domain, dividing it into two halves. In eukaryotic Rpb11 and archaeal L subunits, the insert domain is lacking, leaving the Rpb11-like domain intact and contiguous.
This entry represents a beta-lactamase structural motif, which contains a cluster of alpha-helices and an alpha/beta sandwich. In addition to beta-lactamases, this domain is also found in D-ala carboxypeptidase/transpeptidase, esterase (EstB), the penicillin receptor BlaR (C-terminal domain), D-aminopeptidase (N-terminal domain), penicillin-binding proteins (e.g. PBP2x, PBP5), and in glutaminase (GlnA). Beta-lactamases are the most common bacterial resistance mechanism against beta-lactam antibiotics. Beta-lactamases appear to have evolved from DD-transpeptidases, which are penicillin-binding proteins involved in cell wall biosynthesis, and as such are one of the main targets of beta-lactam antibiotics.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.
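The classification in the two paragraphs above can be summarised as a lookup table. This is only a sketch of one way to encode it: the TopoInfo type, the dictionary keys and the strands_broken helper are hypothetical names, and fields are set to None where these paragraphs give no subtype or ATP information (as is the case for the type II enzymes).

```python
from typing import NamedTuple, Optional


class TopoInfo(NamedTuple):
    topo_class: str                # "I" or "II"
    subtype: Optional[str]         # "IA" or "IB"; None if not given above
    atp_dependent: Optional[bool]  # None if not stated above


# Entries follow the text: type I enzymes are ATP-independent,
# with reverse gyrase as the stated exception.
TOPOISOMERASES = {
    "topoisomerase I (bacterial or archaeal)": TopoInfo("I", "IA", False),
    "topoisomerase III": TopoInfo("I", "IA", False),
    "reverse gyrase": TopoInfo("I", "IA", True),
    "topoisomerase I (eukaryotic)": TopoInfo("I", "IB", False),
    "topoisomerase V": TopoInfo("I", "IB", False),
    "topoisomerase II": TopoInfo("II", None, None),
    "topoisomerase IV": TopoInfo("II", None, None),
    "topoisomerase VI": TopoInfo("II", None, None),
}


def strands_broken(name):
    """Type I enzymes break one DNA strand; type II enzymes break both."""
    return 1 if TOPOISOMERASES[name].topo_class == "I" else 2
```

With this table, strands_broken("topoisomerase III") gives 1 and strands_broken("topoisomerase VI") gives 2.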
This entry describes the core region of type IA topoisomerases, which are highly conserved enzymes that are structurally distinct from type IB enzymes. The structures of both topoisomerases I and III have been elucidated, and consist of four domains that together form a toroidal molecule with a central hole that is large enough to accommodate single- and double-stranded DNA. It is believed that the domains transiently separate from one another to allow the entrance and exit of DNA strands.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils.
Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions (differs between eukaryotic and bacterial enzymes), domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA.
Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.
This entry represents the C-terminal of subunit B (gyrB and parE) and the N-terminal of subunit A (gyrA and parC) of bacterial gyrase and topoisomerase IV, and the equivalent central region in eukaryotic topoisomerase II composed of a single polypeptide.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
This entry represents Spo11, a meiotic recombination protein found in eukaryotes, and subunit A of topoisomerase VI, a type IIB topoisomerase found predominantly in archaea. These two types of proteins share structural homology.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. They can be divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA. Topoisomerase VI is a type IIB enzyme that assembles as a heterotetramer, consisting of two A subunits required for DNA cleavage and two B subunits required for ATP hydrolysis. The B subunit is structurally similar to the ATPase domain of type IIA topoisomerases, but the A subunit is distinct, and instead shares homology with the Spo11 protein.
Spo11 is a meiosis-specific protein that is responsible for the initiation of recombination through the formation of DNA double-strand breaks by a type II DNA topoisomerase-like activity. Spo11 acts in conjunction with several other proteins, including Rec102 in yeast, to bring about meiotic recombination.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.
This entry represents the N-terminal DNA-binding domain found in eukaryotic topoisomerase I, which is a type IB enzyme. To cleave the DNA backbone, these enzymes must make a transient phosphotyrosine bond. The N-terminal domain of human topoisomerase I is thought to coordinate the restriction of free strand rotation during the topoisomerisation step of catalysis. A conserved tryptophan residue may be important for the DNA-interaction ability of the N-terminal domain. Human topoisomerase I has been shown to be inhibited by camptothecin (CPT), a plant alkaloid with antitumour activity. A binding mode for the anticancer drug camptothecin has been proposed on the basis of chemical and biochemical information combined with the three-dimensional structures of topoisomerase I-DNA complexes.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
Aminotransferases share certain mechanistic features with other pyridoxal-phosphate dependent enzymes, such as the covalent binding of the pyridoxal-phosphate group to a lysine residue. On the basis of sequence similarity, these various enzymes can be grouped into subfamilies.
One of these, called class-IV, currently consists of proteins of about 270 to 415 amino-acid residues that share a few regions of sequence similarity. Surprisingly, the best conserved region does not include the lysine residue to which the pyridoxal-phosphate group is known to be attached, in ilvE, but is located some 40 residues on the C-terminal side of the pyridoxal-phosphate-lysine. The D-amino acid transferases (D-AAT), which are among the members of this entry, are required by bacteria to catalyse the synthesis of D-glutamic acid and D-alanine, which are essential constituents of the bacterial cell wall and are the building blocks for other D-amino acids. Despite the difference in the structure of the substrates, D-AATs and L-AATs have strong similarity.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the pr