AAA ATPases (ATPases Associated with diverse cellular Activities) form a large protein family and play a number of roles in the cell, including cell-cycle regulation, proteolysis and protein disaggregation, organelle biogenesis and intracellular transport. Some of them function as molecular chaperones, subunits of proteolytic complexes or independent proteases (FtsH, Lon). They also act as DNA helicases and transcription factors.
AAA ATPases belong to the AAA+ superfamily of ring-shaped P-loop NTPases, which act via the energy-dependent unfolding of macromolecules. There are six major clades of AAA domains (proteasome subunits, metalloproteases, domains D1 and D2 of ATPases with two AAA domains, the MSP1/katanin/spastin group, and BCS1 and its homologues), as well as a number of deeply branching minor clades.
They assemble into oligomers (often hexamers) that form a ring-shaped structure with a central pore. These proteins act as molecular motors that couple ATP binding and hydrolysis to changes in conformational state, which act upon a target substrate by either translocating or remodelling it.
They are found in all living organisms and share a highly conserved AAA domain called the AAA module. This domain is responsible for ATP binding and hydrolysis. It contains 200-250 residues, including two classical motifs, Walker A (GX4GKT) and Walker B (HyDE).
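The Walker A consensus quoted above can be turned into a simple sequence scan. The following is a minimal sketch, assuming the motif is read as G, any four residues, then G-K-T; the relaxed variant that also accepts serine at the last position is an assumption reflecting common usage rather than something stated in this entry, and the test fragment is invented for illustration.

```python
import re

# Walker A as written in the text: G, any four residues, then G-K-T.
# Assumption: many annotations also accept S at the last position,
# i.e. G-x(4)-G-K-[ST]; that relaxed form is offered as an option only.
WALKER_A_STRICT = re.compile(r"G.{4}GKT")
WALKER_A_RELAXED = re.compile(r"G.{4}GK[ST]")


def find_walker_a(sequence, relaxed=False):
    """Return (1-based start, matched text) for each Walker A hit."""
    pattern = WALKER_A_RELAXED if relaxed else WALKER_A_STRICT
    # Wrapping the pattern in a lookahead lets overlapping hits be reported.
    lookahead = re.compile(f"(?=({pattern.pattern}))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(sequence.upper())]


# Hypothetical fragment made up for illustration, not a real protein.
toy = "MKLVGPPGTGKTLLAKAGAGESGKSTT"
print(find_walker_a(toy))                # strict reading of GX4GKT
print(find_walker_a(toy, relaxed=True))  # also accepts ...GKS
```

A hit of this kind only flags a candidate P-loop; the Walker B motif (HyDE) is written in terms of hydrophobic residues and would need a residue class chosen by the user, so it is left out of the sketch.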
The functional variety seen between AAA ATPases is in part due to their extensive number of accessory domains and factors, and to their variable organisation within oligomeric assemblies, in addition to changes in key functional residues within the ATPase domain itself.
More information about these proteins can be found at Protein of the Month: AAA ATPases.
ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.
ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain.
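The domain bookkeeping described above (two ABC modules plus two TMDs, spread over one to four polypeptides) can be checked with a small tally. This is only an illustrative sketch: the three layouts and their names are assumptions modelled on the architectures mentioned in the text, not database records, and the extracytoplasmic binding proteins of importers are deliberately left out of the count.

```python
from collections import Counter

# Each polypeptide is given as (domain list, copy number in the assembly).
# The layouts below are illustrative assumptions, not curated entries.
ARCHITECTURES = {
    "prokaryotic importer (separate chains)": [(["ABC"], 2), (["TMD"], 2)],
    "half-transporter homodimer (fused chain)": [(["TMD", "ABC"], 2)],
    "full exporter (single chain)": [(["TMD", "ABC", "TMD", "ABC"], 1)],
}


def domain_counts(chains):
    """Total the domains contributed by every copy of every chain."""
    totals = Counter()
    for domains, copies in chains:
        for domain in domains:
            totals[domain] += copies
    return totals


def is_core_transporter(chains):
    """True when the assembly has exactly two ABC modules and two TMDs."""
    return domain_counts(chains) == Counter({"ABC": 2, "TMD": 2})


for name, chains in ARCHITECTURES.items():
    print(name, dict(domain_counts(chains)), is_core_transporter(chains))
```

Describing each chain with a copy number keeps homodimers and single-chain transporters in the same representation, so the four-domain rule is tested the same way for all of them.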
The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop (Walker A) and a magnesium-binding site (Walker B). Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature motif (LSGGQ), which is specific to ABC transporters; and the Q-loop (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site.
The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII consists mostly of the alpha-helical subdomain with the signature motif. It seems to be required only for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI, and are related by a two-fold axis.
The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).
On the basis of sequence similarities a family of related ATP-binding proteins has been characterised.
The proteins belonging to this family also contain one or two copies of the 'A' consensus sequence or the 'P-loop'.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
This entry represents the alpha and beta subunits found in the F1, V1, and A1 complexes of F-, V- and A-ATPases, respectively (sometimes called the A and B subunits in V- and A-ATPases), as well as flagellar ATPase and the termination factor Rho. The F-ATPases (or F1F0-ATPases), V-ATPases (or V1V0-ATPases) and A-ATPases (or A1A0-ATPases) are composed of two linked complexes: the F1, V1 or A1 complex contains the catalytic core that synthesises/hydrolyses ATP, and the F0, V0 or A0 complex forms the membrane-spanning pore. The F-, V- and A-ATPases all contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis.
In F-ATPases, there are three copies each of the alpha and beta subunits that form the catalytic core of the F1 complex, while the remaining F1 subunits (gamma, delta, epsilon) form part of the stalks. There is a substrate-binding site on each of the alpha and beta subunits, those on the beta subunits being catalytic, while those on the alpha subunits are regulatory. The alpha and beta subunits form a cylinder that is attached to the central stalk. The alpha/beta subunits undergo a sequence of conformational changes leading to the formation of ATP from ADP. These changes are induced by the rotation of the gamma subunit, which is itself driven by the movement of protons through the F0 complex C subunit.
In V- and A-ATPases, the alpha/A and beta/B subunits of the V1 or A1 complex are homologous to the alpha and beta subunits in the F1 complex of F-ATPases, except that the alpha subunit is catalytic and the beta subunit is regulatory.
The alpha/A and beta/B subunits can each be divided into three regions, or domains, centred around the ATP-binding pocket, and based on structure and function. The central domain contains the nucleotide-binding residues that make direct contact with the ADP/ATP molecule.
More information about this protein can be found at Protein of the Month: ATP Synthases.
In both prokaryotes and eukaryotes, there are three distinct types of elongation factors: EF-1alpha (EF-Tu), which binds GTP and an aminoacyl-tRNA and delivers the latter to the A site of ribosomes; EF-1beta (EF-Ts), which interacts with EF-1alpha/EF-Tu to displace GDP and thus allows the regeneration of GTP-EF-1alpha; and EF-2 (EF-G), which binds GTP and peptidyl-tRNA and translocates the latter from the A site to the P site. In EF-1alpha, a specific region has been shown to be involved in a conformational change mediated by the hydrolysis of GTP to GDP. This region is conserved in both EF-1alpha/EF-Tu and EF-2/EF-G and thus seems typical for GTP-dependent proteins which bind non-initiator tRNAs to the ribosome. The GTP-binding protein synthesis factor family also includes the eukaryotic peptide chain release factor GTP-binding subunits and prokaryotic peptide chain release factor 3 (RF-3); the prokaryotic GTP-binding protein lepA and its homologues in yeast (GUF1) and Caenorhabditis elegans (ZK1236.1); yeast HBS1; rat statin S1; and the prokaryotic selenocysteine-specific elongation factor selB.
Prokaryotic and eukaryotic organisms respond to heat shock or other environmental stress by inducing the synthesis of proteins collectively known as heat-shock proteins (hsp). Amongst them is a family of proteins with an average molecular weight of 20 kDa, known as the hsp20 proteins. These seem to act as chaperones that can protect other proteins against heat-induced denaturation and aggregation. Hsp20 proteins seem to form large heterooligomeric aggregates. Structurally, this family is characterised by the presence of a conserved C-terminal domain of about 100 residues.
Hsp70 chaperones are heat shock proteins that help to fold many proteins. Hsp70-assisted folding involves repeated cycles of substrate binding and release. Hsp70 activity is ATP-dependent. Hsp70 proteins are made up of two regions: the amino terminus is the ATPase domain and the carboxyl terminus is the substrate-binding region.
Hsp70 proteins have an average molecular weight of 70 kDa. In most species, there are many proteins that belong to the hsp70 family. Some of these are only expressed under stress conditions (strictly inducible), while some are present in cells under normal growth conditions and are not heat-inducible (constitutive or cognate). Hsp70 proteins can be found in different cellular compartments (nuclear, cytosolic, mitochondrial, endoplasmic reticulum, for example).
The K homology (KH) domain was first identified in the human heterogeneous nuclear ribonucleoprotein (hnRNP) K. It is a domain of around 70 amino acids that is present in a wide variety of quite diverse nucleic acid-binding proteins. It has been shown to bind RNA. Like many other RNA-binding motifs, KH motifs are found in one or multiple copies (14 copies in chicken vigilin) and, at least for hnRNP K (three copies) and FMR-1 (two copies), each motif is necessary for in vitro RNA binding activity, suggesting that they may function cooperatively or, in the case of single KH motif proteins (for example, Mer1p), independently.
According to structural analysis, the KH domain can be separated into two groups. The first group, type-1, contains a beta-alpha-alpha-beta-beta-alpha structure, whereas in type-2 the last two beta-sheets are located in the N-terminal part of the domain (alpha-beta-beta-alpha-alpha-beta). Sequence similarity between these two folds is limited to a short region (VIGXXGXXI) in the RNA binding motif. This motif is located between helices 1 and 2 in type-1 and between helices 2 and 3 in type-2. Proteins known to contain a type-1 KH domain include bacterial polyribonucleotide nucleotidyltransferases; vertebrate fragile X mental retardation protein 1 (FMR1); eukaryotic heterogeneous nuclear ribonucleoprotein K (hnRNP K), one of at least 20 major proteins that are part of hnRNP particles in mammalian cells; mammalian poly(rC) binding proteins; Artemia salina glycine-rich protein GRP33; yeast PAB1-binding protein 2 (PBP2); vertebrate vigilin; and human high-density lipoprotein binding protein (HDL-binding protein).
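One way to see how the two element orders quoted above relate is to compare them as strings. The sketch below is an illustration under stated assumptions: the function name and the one-letter encoding are hypothetical, and the shared beta-alpha-alpha-beta stretch is simply what the two written topologies have in common, not a claim taken from this entry.

```python
# One-letter codes for the two topologies given in the text.
TYPE_BY_CODE = {"baabba": "type-1", "abbaab": "type-2"}

# Stretch common to both written orders (an observation from the strings).
CORE = "baab"


def describe_kh(topology):
    """Return (type, N-terminal extension, C-terminal extension).

    `topology` is a hyphen-separated element order such as
    'beta-alpha-alpha-beta-beta-alpha'.
    """
    code = "".join(element[0] for element in topology.lower().split("-"))
    kh_type = TYPE_BY_CODE.get(code, "unknown")
    start = code.find(CORE)
    if start == -1:
        return kh_type, None, None
    return kh_type, code[:start], code[start + len(CORE):]


print(describe_kh("beta-alpha-alpha-beta-beta-alpha"))
print(describe_kh("alpha-beta-beta-alpha-alpha-beta"))
```

Read this way, type-1 carries its extra beta and alpha elements after the common stretch and type-2 carries them before it, which matches the description of the additional elements sitting in the N-terminal part of the type-2 domain.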
More information about these proteins can be found at Protein of the Month: RNA Exosomes.
A number of proteins, some of which are known to be receptors for growth factors, have been found to contain a cysteine-rich domain at the N-terminal region that can be subdivided into four (or in some cases, three) repeats containing six conserved cysteines, all of which are involved in intrachain disulphide bonds.
CD27 (also called S152 or T14) mediates a co-stimulatory signal for T and B cell activation and is involved in murine T cell development. Tyrosine-phosphorylation of ZAP-70 following CD27 ligation of T cells has been reported, but not confirmed independently. CD30 was originally identified as Ki-1, an antigen expressed on Reed-Sternberg cells in Hodgkin's lymphomas and in non-Hodgkin's lymphomas, particularly diffuse large-cell lymphoma and immunoblastic lymphoma. CD30 has pleiotropic effects on CD30-positive lymphoma cell lines ranging from cell proliferation to cell death. It is thought to be involved in negative selection of T-cells in the thymus and is involved in TCR-mediated cell death. CD30, a member of the TNFR family of molecules, activates NFkB through interaction with TRAF2 and TRAF5. CD40 (Bp50) plays a central role in the regulation of cell-mediated immunity as well as antibody-mediated immunity. It is central to T cell dependent (TD)-responses and may influence survival of B cell lymphomas.
CD95 (also called APO-1, Fas antigen, tumour necrosis factor receptor superfamily member 6 (TNFRSF6) or apoptosis antigen 1 (APT1)) is expressed, typically at high levels, on activated T and B cells. It is involved in the mediation of apoptosis-inducing signals.
Other proteins known to belong to this family are tumour necrosis factor type I and type II receptors (TNFR), Rabbit fibroma virus soluble TNF receptor (protein T2), lymphotoxin alpha/beta receptor, low-affinity nerve growth factor receptor (LA-NGFR) (p75), T-cell antigen OX40, Wsl-1, a receptor (for an as yet undefined ligand) that mediates apoptosis, and Vaccinia virus protein A53 (SalF19R).
CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).
Actin is a ubiquitous protein involved in the formation of filaments that are major components of the cytoskeleton. These filaments interact with myosin to produce a sliding effect, which is the basis of muscular contraction and many aspects of cell motility, including cytokinesis. Each actin protomer binds one molecule of ATP and has one high affinity site for either calcium or magnesium ions, as well as several low affinity sites. Actin exists as a monomer in low salt concentrations, but filaments form rapidly as salt concentration rises, with the consequent hydrolysis of ATP. Actin from many sources forms a tight complex with deoxyribonuclease (DNase I) although the significance of this is still unknown. The formation of this complex results in the inhibition of DNase I activity, and actin loses its ability to polymerise. It has been shown that an ATPase domain of actin shares similarity with ATPase domains of hexokinase and hsp70 proteins.
In vertebrates there are three groups of actin isoforms: alpha, beta and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. In plants there are many isoforms which are probably involved in a variety of functions such as cytoplasmic streaming, cell shape determination, tip growth, graviperception, cell wall deposition, etc.
Recently some divergent actin-like proteins have been identified in several species. These proteins include centractin (actin-RPV) from mammals, fungi (yeast ACT5, Neurospora crassa ro-4) and Pneumocystis carinii, which seems to be a component of a multi-subunit centrosomal complex involved in microtubule-based vesicle motility (this subfamily is known as ARP1); the ARP2 subfamily, which includes chicken ACTL, Saccharomyces cerevisiae ACT2, Drosophila melanogaster 14D and Caenorhabditis elegans actC; the ARP3 subfamily, which includes actin 2 from mammals, Drosophila 66B, yeast ACT4 and Schizosaccharomyces pombe act2; and the ARP4 subfamily, which includes yeast ACT3 and Drosophila 13E.
The ankyrin repeat is one of the most common protein-protein interaction motifs in nature. Ankyrin repeats are tandemly repeated modules of about 33 amino acids. They occur in a large number of functionally diverse proteins, mainly from eukaryotes. The few known examples from prokaryotes and viruses may be the result of horizontal gene transfers. The repeat has been found in proteins of diverse function such as transcriptional initiators, cell-cycle regulators, cytoskeletal proteins, ion transporters and signal transducers. The ankyrin fold appears to be defined by its structure rather than its function, since there is no specific sequence or structure which is universally recognised by it.
The conserved fold of the ankyrin repeat unit is known from several crystal and solution structures. Each repeat folds into a helix-loop-helix structure with a beta-hairpin/loop region projecting out from the helices at a 90° angle. The repeats stack together to form an L-shaped structure.
It has been shown that the N-terminal N domains of members of the plasminogen/hepatocyte growth factor family, the apple domains of the plasma prekallikrein/coagulation factor XI family, and domains of various nematode proteins belong to the same module superfamily, the PAN module. PAN contains a conserved core of three disulphide bridges. In some members of the family there is an additional fourth disulphide bridge that links the N and C termini of the domain. The domain is found in diverse proteins: in some it mediates protein-protein interactions, in others protein-carbohydrate interactions.
The small ADP ribosylation factor (Arf) GTP-binding proteins are major regulators of vesicle biogenesis in intracellular traffic. They are the founding members of a growing family that includes Arl (Arf-like), Arp (Arf-related proteins) and the remotely related Sar (Secretion-associated and Ras-related) proteins. Arf proteins cycle between inactive GDP-bound and active GTP-bound forms that bind selectively to effectors. The classical structural GDP/GTP switch is characterised by conformational changes at the so-called switch 1 and switch 2 regions, which bind tightly to the gamma-phosphate of GTP but poorly or not at all to the GDP nucleotide. Structural studies of Arf1 and Arf6 have revealed that although these proteins feature the switch 1 and 2 conformational changes, they depart from other small GTP-binding proteins in that they use an additional, unique switch to propagate structural information from one side of the protein to the other.
The GDP/GTP structural cycles of human Arf1 and Arf6 feature a unique conformational change that affects the beta2-beta3 strands connecting switch 1 and switch 2 (interswitch) and also the amphipathic helical N-terminus. In GDP-bound Arf1 and Arf6, the interswitch is retracted and forms a pocket to which the N-terminal helix binds, the latter serving as a molecular hasp to maintain the inactive conformation. In the GTP-bound form of these proteins, the interswitch undergoes a two-residue register shift that pulls switch 1 and switch 2 'up', restoring an active conformation that can bind GTP. In this conformation, the interswitch projects out of the protein and extrudes the N-terminal hasp by occluding its binding pocket.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described.
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.
This group of aspartic peptidases belongs to MEROPS peptidase family A1 (pepsin family, clan AA). The type example is pepsin A from Homo sapiens (Human).
More than 70 aspartic peptidases, all from eukaryotic organisms, have been identified. These include pepsins, cathepsins, and renins. The enzymes are synthesised with signal peptides, and the proenzymes are secreted or passed into the lysosomal/endosomal system, where acidification leads to autocatalytic activation.
Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions. Crystallography has shown the active site to form a groove across the junction of the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors within the active site. Specificity is determined by several hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap. Cysteine residues are well conserved within the pepsin family, pepsin itself containing three disulphide loops. The first loop is found in all but the fungal enzymes, and is usually around five residues in length, but is longer in barrierpepsin and candidapepsin; the second loop is also small and found only in the animal enzymes; and the third loop is the largest, found in all members of the family, except for the cysteine-free polyporopepsin. The loops are spread unequally throughout the two lobes, suggesting that they formed after the initial gene duplication and fusion event.
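The substrate rule at the start of the paragraph above (peptides of at least six residues, hydrophobic residues at both P1 and P1') can be expressed as a simple filter. This is a rough sketch only: the hydrophobic residue set is a working assumption chosen for illustration, the function name is hypothetical, and real specificity also depends on the flap and surrounding residues, which are not modelled here.

```python
# Assumption: a working definition of "hydrophobic" for this illustration.
HYDROPHOBIC = set("AVLIMFWY")


def candidate_bonds(peptide, min_length=6):
    """List bonds with hydrophobic residues at both P1 and P1'.

    Returns (1-based position of P1, P1 residue, P1' residue) tuples.
    Peptides shorter than `min_length` yield no candidates.
    """
    peptide = peptide.upper()
    if len(peptide) < min_length:
        return []
    return [
        (i + 1, peptide[i], peptide[i + 1])
        for i in range(len(peptide) - 1)
        if peptide[i] in HYDROPHOBIC and peptide[i + 1] in HYDROPHOBIC
    ]


# Hypothetical peptides made up for illustration.
print(candidate_bonds("GSEFLKD"))  # one Phe-Leu bond
print(candidate_bonds("AFL"))      # too short to qualify
```

The output is a list of candidate scissile bonds, not a prediction of actual cleavage; changing `HYDROPHOBIC` is the obvious place to tune the sketch for a particular enzyme.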
This family includes neither the retroviral nor the retrotransposon aspartic proteases, which are much smaller and appear to be homologous to the single domain aspartic proteases.
In the mitochondrion of eukaryotes and in aerobic prokaryotes, cytochrome b is a component of respiratory chain complex III, also known as the bc1 complex or ubiquinol-cytochrome c reductase. In plant chloroplasts and cyanobacteria, there is an analogous protein, cytochrome b6, a component of the plastoquinone-plastocyanin reductase, also known as the b6f complex.
Cytochrome b/b6 is an integral membrane protein of approximately 400 amino acid residues that probably has 8 transmembrane segments. In plants and cyanobacteria, cytochrome b6 consists of two subunits encoded by the petB and petD genes. The sequence of petB is colinear with the N-terminal part of mitochondrial cytochrome b, while petD corresponds to the C-terminal part. Cytochrome b/b6 non-covalently binds two haem groups, known as b562 and b566. Four conserved histidine residues are postulated to be the ligands of the iron atoms of these two haem groups.
Apart from regions around some of the histidine haem ligands, there are a few conserved regions in the sequence of b/b6. The best conserved of these regions includes an invariant P-E-W triplet which lies in the loop that separates the fifth and sixth transmembrane segments. It seems to be important for electron transfer at the ubiquinone redox site - called Qz or Qo (where o stands for outside) - located on the outer side of the membrane. This entry is the C-terminus of these proteins.
In the mitochondrion of eukaryotes and in aerobic prokaryotes, cytochrome b is a component of respiratory chain complex III, also known as the bc1 complex or ubiquinol-cytochrome c reductase. In plant chloroplasts and cyanobacteria, there is an analogous protein, cytochrome b6, a component of the plastoquinone-plastocyanin reductase, also known as the b6f complex.
Cytochrome b/b6 is an integral membrane protein of approximately 400 amino acid residues that probably has 8 transmembrane segments. In plants and cyanobacteria, cytochrome b6 consists of two subunits encoded by the petB and petD genes. The sequence of petB is colinear with the N-terminal part of mitochondrial cytochrome b, while petD corresponds to the C-terminal part. Cytochrome b/b6 non-covalently binds two haem groups, known as b562 and b566. Four conserved histidine residues are postulated to be the ligands of the iron atoms of these two haem groups.
Apart from regions around some of the histidine haem ligands, there are a few conserved regions in the sequence of b/b6. The best conserved of these regions includes an invariant P-E-W triplet which lies in the loop that separates the fifth and sixth transmembrane segments. It seems to be important for electron transfer at the ubiquinone redox site - called Qz or Qo (where o stands for outside) - located on the outer side of the membrane. This entry is the N-terminus of these proteins.
Cytochromes c (cytC) can be defined as electron-transfer proteins having one or several haem c groups, bound to the protein by one or, more generally, two thioether bonds involving sulphydryl groups of cysteine residues. The fifth haem iron ligand is always provided by a histidine residue. CytC possess a wide range of properties and function in a large number of different redox processes.
Ambler recognised four classes of cytC.
Class I includes the low-spin soluble cytC of mitochondria and bacteria, with the haem-attachment site towards the N-terminus, and the sixth ligand provided by a methionine residue about 40 residues further on towards the C-terminus. On the basis of sequence similarity, class I cytC were further subdivided into five classes, IA to IE. Class IB includes the eukaryotic mitochondrial cytC and prokaryotic 'short' cyt c2 exemplified by Rhodopila globiformis cyt c2; class IA includes 'long' cyt c2, such as Rhodospirillum rubrum cyt c2 and Aquaspirillum itersonii cyt c-550, which have several extra loops by comparison with class IB cytC.
Ferredoxins are iron-sulphur proteins that mediate electron transfer in a range of metabolic reactions; they fall into several subgroups according to the nature of their iron-sulphur cluster(s). One group, originally found in bacteria, has been termed "bacterial-type", in which the active centre is a 4Fe-4S cluster. 4Fe-4S ferredoxins may in turn be subdivided into further groups, based on their sequence properties. Most contain at least one conserved domain, including four Cys residues that bind to a 4Fe-4S centre.
During the evolution of bacterial-type ferredoxins, intrasequence gene duplication, transposition and fusion events occurred, resulting in the appearance of proteins with multiple iron-sulphur centres: e.g. dicluster-type (2[4Fe-4S]) and polyferredoxins, iron-sulphur subunits of bacterial succinate dehydrogenase/fumarate reductase, formate hydrogenlyase and formate dehydrogenase complexes, pyruvate-flavodoxin oxidoreductase, NADH:ubiquinone reductase and others. In some bacterial ferredoxins, one of the duplicated domains has lost one or more of the four conserved Cys residues. These domains have either lost their iron-sulphur binding property, or bind to a 3Fe-4S centre instead of a 4Fe-4S centre. 3D structures are now known for a number of both monocluster-type and dicluster-type 4Fe-4S ferredoxins.
CAUTION: PRINTS signature in the current entry is known to miss protein matches and should be updated in the near future.
In eukaryotes, glutathione S-transferases (GSTs) participate in the detoxification of reactive electrophilic compounds by catalysing their conjugation to glutathione. The GST domain is also found in S-crystallins from squid, and proteins with no known GST activity, such as eukaryotic elongation factors 1-gamma and the HSP26 family of stress-related proteins, which include auxin-regulated proteins in plants and stringent starvation proteins in Escherichia coli. The major lens polypeptide of cephalopods is also a GST.
Bacterial GSTs of known function often have a specific, growth-supporting role in biodegradative metabolism: epoxide ring opening and tetrachlorohydroquinone reductive dehalogenation are two examples of the reactions catalysed by these bacterial GSTs. Some regulatory proteins, like the stringent starvation proteins, also belong to the GST family. GST seems to be absent from Archaea, in which gamma-glutamylcysteine substitutes for glutathione as the major thiol.
Glutathione S-transferases form homodimers, but in eukaryotes can also form heterodimers of the A1 and A2 or YC1 and YC2 subunits. The homodimeric enzymes display a conserved structural fold. Each monomer is composed of a distinct N-terminal sub-domain, which adopts the thioredoxin fold, and a C-terminal all-helical sub-domain. This entry is the C-terminal domain.
Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) plays an important role in glycolysis and gluconeogenesis by reversibly catalysing the oxidation and phosphorylation of D-glyceraldehyde-3-phosphate to 1,3-diphospho-glycerate. The enzyme exists as a tetramer of identical subunits, each containing 2 conserved functional domains: an NAD-binding domain, and a highly conserved catalytic domain. The enzyme has been found to bind to actin and tropomyosin, and may thus have a role in cytoskeleton assembly. Alternatively, the cytoskeleton may provide a framework for precise positioning of the glycolytic enzymes, thus permitting efficient passage of metabolites from enzyme to enzyme.
GAPDH displays diverse non-glycolytic functions as well, its role depending upon its subcellular location. For instance, the translocation of GAPDH to the nucleus acts as a signalling mechanism for programmed cell death, or apoptosis. The accumulation of GAPDH within the nucleus is involved in the induction of apoptosis, where GAPDH functions in the activation of transcription. The presence of GAPDH is associated with the synthesis of pro-apoptotic proteins like BAX, c-JUN and GAPDH itself.
GAPDH has been implicated in certain neurological diseases: GAPDH is able to bind to the gene products from neurodegenerative disorders such as Huntington's disease, Alzheimer's disease, Parkinson's disease and Machado-Joseph disease through stretches encoded by their CAG repeats. Abnormal neuronal apoptosis is associated with these diseases. Propargylamines such as deprenyl increase neuronal survival by interfering with apoptosis signalling pathways via their binding to GAPDH, which decreases the synthesis of pro-apoptotic proteins.
L-lactate dehydrogenases are metabolic enzymes which catalyse the conversion of L-lactate to pyruvate, the last step in anaerobic glycolysis. L-lactate dehydrogenase is also found as a lens crystallin in bird and crocodile eyes. L-2-hydroxyisocaproate dehydrogenases are also members of the family. Malate dehydrogenases catalyse the interconversion of malate and oxaloacetate. The enzyme participates in the citric acid cycle.
Muscle contraction is caused by sliding between the thick and thin filaments of the myofibril. Myosin is a major component of thick filaments and exists as a hexamer of 2 heavy chains, 2 alkali light chains, and 2 regulatory light chains. The heavy chain can be subdivided into the N-terminal globular head and the C-terminal coiled-coil rod-like tail, although some forms have a globular region at their C-terminus. There are many cell-specific isoforms of myosin heavy chains, coded for by a multi-gene family. Myosin interacts with actin to convert chemical energy, in the form of ATP, to mechanical energy. The 3-D structure of the head portion of myosin has been determined and a model for the actin-myosin complex has been constructed.
The globular head is well conserved, with some highly conserved regions possibly relating to functional and structural domains. The rod-like tail starts with an invariant proline residue and contains many repeats of a 28-residue region, interrupted at 4 regularly spaced points known as skip residues. Although the sequence of the tail is not well conserved, its chemical character is, with hydrophobic, charged and skip residues occurring in a highly ordered and repeated fashion.
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.
Eukaryotic protein kinases are enzymes that belong to a very extensive family of proteins which share a conserved catalytic core common to both serine/threonine and tyrosine protein kinases. There are a number of conserved regions in the catalytic domain of protein kinases. In the N-terminal extremity of the catalytic domain there is a glycine-rich stretch of residues in the vicinity of a lysine residue, which has been shown to be involved in ATP binding. In the central part of the catalytic domain there is a conserved aspartic acid residue which is important for the catalytic activity of the enzyme. This entry includes protein kinases from eukaryotes and viruses and may include some bacterial hits too.
This entry describes a small NADH binding domain within a larger FAD binding domain. It is found in both class I and class II oxidoreductases.
FAD flavoproteins belonging to the family of pyridine nucleotide-disulphide oxidoreductases (glutathione reductase, trypanothione reductase, lipoamide dehydrogenase, mercuric reductase, thioredoxin reductase, alkyl hydroperoxide reductase) share sequence similarity with a number of other flavoprotein oxidoreductases, in particular with ferredoxin-NAD+ reductases involved in oxidative metabolism of a variety of hydrocarbons (rubredoxin reductase, putidaredoxin reductase, terpredoxin reductase, ferredoxin-NAD+ reductase components of benzene 1,2-dioxygenase, toluene 1,2-dioxygenase, chlorobenzene dioxygenase, biphenyl dioxygenase), NADH oxidase and NADH peroxidase. Comparison of the crystal structures of human glutathione reductase and Escherichia coli thioredoxin reductase reveals different locations of their active sites, suggesting that the enzymes diverged from an ancestral FAD/NAD(P)H reductase and acquired their disulphide reductase activities independently.
Despite functional similarities, oxidoreductases of this family show no sequence similarity with adrenodoxin reductases and flavoprotein pyridine nucleotide cytochrome reductases (FPNCR). Assuming that disulphide reductase activity emerged later, during divergent evolution, the family can be referred to as FAD-dependent pyridine nucleotide reductases, FADPNR.
To date, 3D structures of glutathione reductase, thioredoxin reductase, mercuric reductase, lipoamide dehydrogenase, trypanothione reductase and NADH peroxidase have been solved. The enzymes share similar tertiary structures based on a doubly-wound alpha/beta fold, but the relative orientations of their FAD- and NAD(P)H-binding domains may vary significantly. By contrast with the FPNCR family, the folds of the FAD- and NAD(P)H-binding domains are similar, suggesting that the domains evolved by gene duplication.
Many members of the Ras superfamily of GTPases have been implicated in the regulation of hematopoietic cells, with roles in growth, survival, differentiation, cytokine production, chemotaxis, vesicle-trafficking, and phagocytosis. The Ras superfamily of proteins now includes over 150 small GTPases (distinguished from the large, heterotrimeric GTPases, the G-proteins). It comprises six subfamilies, the Ras, Rho, Ran, Rab, Arf, and Kir/Rem/Rad subfamilies. They exhibit remarkable overall amino acid identities, especially in the regions interacting with the guanine nucleotide exchange factors that catalyze their activation.
The RNase H domain is responsible for hydrolysis of the RNA portion of RNA:DNA hybrids, and this activity requires the presence of divalent cations (Mg2+ or Mn2+) that bind its active site. This domain is a part of a large family of homologous RNase H enzymes of which the RNase HI protein from Escherichia coli is the best characterised. Secondary structure predictions for the enzymes from E. coli, yeast, human liver and diverse retroviruses (such as Rous sarcoma virus and the Foamy viruses) supported, in every case, the five beta-strands (1 to 5) and four or five alpha-helices (A, B/C, D, E) that have been identified by crystallography in the RNase H domain of Human immunodeficiency virus 1 (HIV-1) reverse transcriptase and in E. coli RNase H. Reverse transcriptase (RT) is a modular enzyme carrying polymerase and ribonuclease H (RNase H) activities in separable domains. RT converts the single-stranded RNA genome of a retrovirus into a double-stranded DNA copy for integration into the host genome. This process requires ribonuclease H as well as RNA- and DNA-directed DNA polymerase activities.
Retroviral RNase H is synthesised as part of the POL polyprotein, which contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. Bacterial RNase H catalyses endonucleolytic cleavage to 5'-phosphomonoester, acting on RNA-DNA hybrids.
The 3D structure of the RNase H domain from diverse bacteria and retroviruses has been solved. All have four beta strands and four to five alpha helices. The E. coli RNase H1 protein binds a single Mg2+ ion cofactor in the active site of the enzyme. The divalent cation is bound by the carboxyl groups of four acidic residues, Asp-10, Glu-48, Asp-70, and Asp-134. The first three acidic residues are highly conserved in all bacterial and retroviral RNase H sequences.
Many eukaryotic proteins containing one or more copies of a putative RNA-binding domain of about 90 amino acids are known to bind single-stranded RNAs. The largest group of single strand RNA-binding proteins is the eukaryotic RNA recognition motif (RRM) family that contains an eight amino acid RNP-1 consensus sequence. RRM proteins have a variety of RNA binding preferences and functions, and include heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing (SR, U2AF, Sxl), protein components of small nuclear ribonucleoproteins (U1 and U2 snRNPs), and proteins that regulate RNA stability and translation (PABP, La, Hu). The RRM in heterodimeric splicing factor U2 snRNP auxiliary factor (U2AF) appears to have two RRM-like domains with specialised features for protein recognition. The motif also appears in a few single stranded DNA binding proteins.
The typical RRM consists of four anti-parallel beta-strands and two alpha-helices arranged in a beta-alpha-beta-beta-alpha-beta fold with side chains that stack with RNA bases. Specificity of RNA binding is determined by multiple contacts with surrounding amino acids. A third helix is present during RNA binding in some cases. The RRM is reviewed in a number of publications.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described.
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned to clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.
This group of aspartic peptidases belongs to the MEROPS peptidase family A2 (retropepsin family, clan AA), subfamily A2A. The family includes the single-domain aspartic proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses).
Retroviral aspartyl protease is synthesised as part of the POL polyprotein, which contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins.
Retroviral reverse transcriptase is synthesised as part of the POL polyprotein, which contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The discovery of retroelements in prokaryotes raises intriguing questions concerning their roles in bacteria, the origin and evolution of reverse transcriptases, and whether the bacterial reverse transcriptases are older than eukaryotic reverse transcriptases.
Superoxide dismutases (SODs) catalyse the conversion of superoxide radicals to molecular oxygen and hydrogen peroxide. Their function is to destroy the radicals that are normally produced within cells and are toxic to biological systems. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one. This family includes both single metal-binding SODs and cambialistic SODs, which can bind either Mn or Fe. Fe/MnSODs are ubiquitous enzymes that are responsible for the majority of SOD activity in prokaryotes, fungi, blue-green algae and mitochondria. Fe/MnSODs are found as homodimers or homotetramers.
The structure of Fe/MnSODs can be divided into two domains, an alpha N-terminal domain and an alpha/beta C-terminal domain, connected by a loop. The structure of the N-terminal domain consists of two helices in an antiparallel hairpin, with a left-handed twist. The structure of the C-terminal domain is of the alpha/beta type, and consists of a three-stranded antiparallel beta-sheet in the order 213, along with four helices in the arrangement alpha/beta(2)/alpha/beta/alpha(2).
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
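The relationship between the linear order of the catalytic residues and the clan can be illustrated with a small lookup. The following is only a sketch: the function names are hypothetical, the table holds just the three clans named above, and the residue positions in the usage example are illustrative inputs rather than data taken from this entry.

```python
# Linear order of the catalytic triad residues, keyed by clan,
# restricted to the three clans mentioned in the text above.
TRIAD_ORDER_BY_CLAN = {
    "PA": "HDS",  # chymotrypsin clan
    "SB": "DHS",  # subtilisin clan
    "SC": "SDH",  # carboxypeptidase clan
}


def triad_order(positions):
    """Return the triad letters sorted by sequence position.

    positions maps each residue letter ('S', 'D', 'H') to its
    1-based position in the protein sequence.
    """
    if set(positions) != {"S", "D", "H"}:
        raise ValueError("expected positions for exactly S, D and H")
    return "".join(sorted(positions, key=positions.get))


def clans_matching(positions):
    """List the clans (from the small table above) with this residue order."""
    order = triad_order(positions)
    return [clan for clan, expected in TRIAD_ORDER_BY_CLAN.items()
            if expected == order]


if __name__ == "__main__":
    # Illustrative positions only (assumed inputs for the example).
    print(clans_matching({"H": 57, "D": 102, "S": 195}))
    print(clans_matching({"D": 32, "H": 64, "S": 221}))
```

A matching residue order is only consistent with a clan; it does not establish membership, since clan assignment rests on structural fold and other evidence.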
This group of serine peptidases belongs to the MEROPS peptidase families S8 (subfamilies S8A (subtilisin) and S8B (kexin)) and S53 (sedolisin), both of which are members of clan SB.
The subtilisin family is the second largest serine protease family characterised to date. Over 200 subtilases are presently known, more than 170 of them with complete amino acid sequences. The family is widespread, being found in eubacteria, archaebacteria, eukaryotes and viruses. The vast majority of the family are endopeptidases, although there is an exopeptidase, tripeptidyl peptidase. Structures have been determined for several members of the subtilisin family: they exploit the same catalytic triad as the chymotrypsins, although the residues occur in a different order (HDS in chymotrypsin and DHS in subtilisin), but the structures show no other similarity. Some subtilisins are mosaic proteins, and others contain N- and C-terminal extensions that show no sequence similarity to any other known protein. Based on sequence homology, a subdivision into six families has been proposed.
The proprotein-processing endopeptidases kexin, furin and related enzymes form a distinct subfamily known as the kexin subfamily (S8B). These preferentially cleave C-terminally to paired basic amino acids. Members of this subfamily can be identified by subtly different motifs around the active site. Members of the kexin family, along with endopeptidases R, T and K from the yeast Tritirachium and cuticle-degrading peptidase from Metarhizium, require thiol activation. This can be attributed to the presence of Cys-173 near to the active histidine. Only 1 viral member of the subtilisin family is known, a 56-kDa protease from herpes virus 1, which infects the channel catfish.
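The kexin-subfamily preference for cleaving C-terminally to paired basic amino acids can be sketched as a simple sequence scan. This is a deliberately naive illustration: the function name is hypothetical, it assumes "basic" means lysine (K) or arginine (R) in one-letter code, and it reports every adjacent K/R pair as a candidate site without modelling any further specificity of the real enzymes.

```python
BASIC_RESIDUES = {"K", "R"}  # assumption: lysine and arginine only


def dibasic_cleavage_sites(sequence):
    """Return candidate cleavage positions after paired basic residues.

    Each returned value is the 1-based position of the second residue
    of a K/R pair; cleavage C-terminal to the pair would occur directly
    after that residue.
    """
    seq = sequence.upper()
    sites = []
    for i in range(len(seq) - 1):
        if seq[i] in BASIC_RESIDUES and seq[i + 1] in BASIC_RESIDUES:
            sites.append(i + 2)  # convert 0-based index i+1 to 1-based
    return sites


def split_at_sites(sequence, sites):
    """Split a sequence into fragments after each listed 1-based position."""
    fragments = []
    start = 0
    for site in sorted(sites):
        fragments.append(sequence[start:site])
        start = site
    fragments.append(sequence[start:])
    return fragments


if __name__ == "__main__":
    toy = "AAKRGG"  # made-up peptide, not a real substrate
    sites = dibasic_cleavage_sites(toy)
    print(sites, split_at_sites(toy, sites))
```

Overlapping pairs (for example in a run of three basic residues) are all reported, so the output should be read as a list of candidates rather than predicted cleavage events.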
Sedolisins (serine-carboxyl peptidases) are proteolytic enzymes whose fold resembles that of subtilisin; however, they are considerably larger, with the mature catalytic domains containing approximately 375 amino acids. The defining features of these enzymes are a unique catalytic triad, Ser-Glu-Asp, as well as the presence of an aspartic acid residue in the oxyanion hole. High-resolution crystal structures have now been solved for sedolisin from Pseudomonas sp. 101, as well as for kumamolisin from a thermophilic bacterium, Bacillus sp. MN-32. Mutations in the human gene lead to a fatal neurodegenerative disease.
Recent genome-sequencing data and a wealth of biochemical and molecular genetic investigations have revealed the occurrence of dozens of families of primary and secondary transporters. Two such families have been found to occur ubiquitously in all classifications of living organisms. These are the ATP-binding cassette (ABC) superfamily and the major facilitator superfamily (MFS), also called the uniporter-symporter-antiporter family. While ABC family permeases are in general multicomponent primary active transporters, capable of transporting both small molecules and macromolecules in response to ATP hydrolysis, the MFS transporters are single-polypeptide secondary carriers capable only of transporting small solutes in response to chemiosmotic ion gradients. Although well over 100 families of transporters have now been recognized and classified, the ABC superfamily and MFS account for nearly half of the solute transporters encoded within the genomes of microorganisms. They are also prevalent in higher organisms. The importance of these two families of transport systems to living organisms can therefore not be overestimated.
The MFS was originally believed to function primarily in the uptake of sugars, but subsequent studies revealed that drug efflux systems, transporters of Krebs cycle metabolites, organophosphate:phosphate exchangers, oligosaccharide:H+ symport permeases, and bacterial aromatic acid permeases were all members of the MFS. These observations suggested that the MFS is far more widespread in nature and far more diverse in function than had been thought previously. Seventeen subgroups of the MFS have been identified.
Evidence suggests that the MFS permeases arose by a tandem intragenic duplication event in the early prokaryotes. This event generated a 12-transmembrane-spanner (TMS) protein topology from a primordial 6-TMS unit. Surprisingly, all currently recognized MFS permeases retain the two six-TMS units within a single polypeptide chain, although in 3 of the 17 MFS families, an additional two TMSs are found. Moreover, the well-conserved MFS-specific motif between TMS2 and TMS3 and the related but less well conserved motif between TMS8 and TMS9 prove to be characteristic of virtually all of the more than 300 MFS proteins identified.
Thioredoxins are small disulphide-containing redox proteins that have been found in all the kingdoms of living organisms. Thioredoxin serves as a general protein disulphide oxidoreductase. It interacts with a broad range of proteins by a redox mechanism based on reversible oxidation of two cysteine thiol groups to a disulphide, accompanied by the transfer of two electrons and two protons. The net result is the covalent interconversion of a disulphide and a dithiol. In the NADPH-dependent protein disulphide reduction, thioredoxin reductase (TR) catalyses the reduction of oxidised thioredoxin (trx) by NADPH using FAD and its redox-active disulphide; reduced thioredoxin then directly reduces the disulphide in the substrate protein.
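The two-step electron flow described above can be written as a pair of schematic reactions. This is a notational sketch only: the labels Trx-S2 (oxidised thioredoxin), Trx-(SH)2 (reduced thioredoxin) and Protein-S2 / Protein-(SH)2 (substrate disulphide and dithiol) are shorthand introduced here, and the proton bookkeeping follows the usual convention for NADPH-dependent reductions.

```latex
\begin{align*}
\text{Trx-S}_2 + \text{NADPH} + \text{H}^+
  &\xrightarrow{\ \text{TR (FAD)}\ } \text{Trx-(SH)}_2 + \text{NADP}^+ \\
\text{Trx-(SH)}_2 + \text{Protein-S}_2
  &\longrightarrow \text{Trx-S}_2 + \text{Protein-(SH)}_2
\end{align*}
```

Summing the two lines gives the net NADPH-dependent reduction of the substrate disulphide, with thioredoxin regenerated in its oxidised form.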
Thioredoxin is present in prokaryotes and eukaryotes and the sequence around the redox-active disulphide bond is well conserved. All thioredoxins contain a cis-proline located in a loop preceding beta-strand 4, which makes contact with the active site cysteines, and is important for stability and function. Thioredoxin belongs to a structural family that includes glutaredoxin, glutathione peroxidase, bacterial protein disulphide isomerase DsbA, and the N-terminal domain of glutathione transferase. Thioredoxins have a beta-alpha unit preceding the motif common to all these proteins.
A number of eukaryotic proteins contain domains evolutionarily related to thioredoxin; most of them are protein disulphide isomerases (PDI). PDI is an endoplasmic reticulum multi-functional enzyme that catalyses the formation and rearrangement of disulphide bonds during protein folding. All PDIs contain two or three (ERp72) copies of the thioredoxin domain, each of which contributes to disulphide isomerase activity, but which are functionally non-equivalent. Moreover, PDI exhibits chaperone-like activity towards proteins that contain no disulphide bonds, i.e. behaving independently of its disulphide isomerase activity. The various forms of PDI which are currently known are:
Bacterial proteins that act as thiol:disulphide interchange proteins, allowing disulphide bond formation in some periplasmic proteins, also contain a thioredoxin domain. These proteins are:
This entry represents the thioredoxin domain.
Thrombospondins are multimeric multidomain glycoproteins that function at cell surfaces and in the extracellular matrix milieu. They act as regulators of cell interactions in vertebrates. They are divided into two subfamilies, A and B, according to their overall molecular organisation. The subgroup A proteins TSP-1 and -2 contain an N-terminal domain, a VWFC domain, three TSP1 repeats, three EGF-like domains, TSP3 repeats and a C-terminal domain. They are assembled as trimers. The subgroup B thrombospondins, designated TSP-3, -4, and COMP (cartilage oligomeric matrix protein, also designated TSP-5), are distinct in that they contain unique N-terminal regions, lack the VWFC domain and TSP1 repeats, contain four copies of EGF-like domains, and are assembled as pentamers. EGF, TSP3 repeats and the C-terminal domain are thus the hallmark of a thrombospondin.
This repeat was first described in 1986 by Lawler and Hynes. It was found in the thrombospondin protein, where it is repeated 3 times. A number of proteins involved in the complement pathway (properdin, C6, C7, C8A, C8B, C9), as well as extracellular matrix proteins such as mindin, F-spondin and SCO-spondin, and even the circumsporozoite surface protein 2 and TRAP proteins of Plasmodium, are now known to contain one or more instances of this repeat. It has been implicated in cell-cell interaction, inhibition of angiogenesis and apoptosis.
The intron-exon organisation of the properdin gene confirms the hypothesis that the repeat might have evolved by a process involving exon shuffling. A study of properdin structure provides some information about the structure of the thrombospondin type I repeat.
This domain is found in all tubulin chains, as well as the bacterial FtsZ family of proteins. These proteins are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of the cell membrane and cell envelope to yield two daughter cells. FtsZ and tubulin are GTPases; this entry represents the GTPase domain. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea.
C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents the classical C2H2 type zinc finger domain.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
This entry represents the CysCysHisCys (CCHC) type zinc finger domains, which have the sequence:
where X can be any amino acid, and the number indicates the number of residues. These 18-residue CCHC zinc finger domains are mainly found in the nucleocapsid protein of retroviruses, where they are required for viral genome packaging and for the early infection process. They are also found in eukaryotic proteins involved in RNA binding or single-stranded DNA binding.
alcohol + NAD = aldehyde or ketone + NADH
Currently three structurally and catalytically different types of alcohol dehydrogenases are known:
In addition, this family includes NADP-dependent quinone oxidoreductase, an enzyme found in bacteria (gene qor), in yeast and in mammals where, in some species such as rodents, it has been recruited as an eye lens protein and is known as zeta-crystallin. The sequence of quinone oxidoreductase is distantly related to that of other zinc-containing alcohol dehydrogenases, and it lacks the zinc-ligand residues. The torpedo fish and mammalian synaptic vesicle membrane protein vat-1 is related to qor.
This entry represents the cofactor-binding domain of these enzymes, which is normally found towards the C-terminus. Structural studies indicate that it forms a classical Rossmann fold that reversibly binds NAD(H).
Two different types of thiolase are found both in eukaryotes and in prokaryotes: acetoacetyl-CoA thiolase and 3-ketoacyl-CoA thiolase. 3-ketoacyl-CoA thiolase (also called thiolase I) has a broad chain-length specificity for its substrates and is involved in degradative pathways such as fatty acid beta-oxidation. Acetoacetyl-CoA thiolase (also called thiolase II) is specific for the thiolysis of acetoacetyl-CoA and involved in biosynthetic pathways such as poly beta-hydroxybutyrate synthesis or steroid biogenesis.
In eukaryotes, there are two forms of 3-ketoacyl-CoA thiolase: one located in the mitochondrion and the other in peroxisomes.
There are two conserved cysteine residues important for thiolase activity. The first located in the N-terminal section of the enzymes is involved in the formation of an acyl-enzyme intermediate; the second located at the C-terminal extremity is the active site base involved in deprotonation in the condensation reaction.
Mammalian nonspecific lipid-transfer protein (nsL-TP) (also known as sterol carrier protein 2) is a protein which seems to exist in two different forms: a 14 kDa protein (SCP-2) and a larger 58 kDa protein (SCP-x). The former is found in the cytoplasm or the mitochondria and is involved in lipid transport; the latter is found in peroxisomes. The C-terminal part of SCP-x is identical to SCP-2, while the N-terminal portion is evolutionarily related to thiolases.
Beta-ketoacyl-ACP synthase (KAS) is the enzyme that catalyzes the condensation of malonyl-ACP with the growing fatty acid chain. It is found as a component of a number of enzymatic systems, including fatty acid synthetase (FAS), which catalyzes the formation of long-chain fatty acids from acetyl-CoA, malonyl-CoA and NADPH; the multi-functional 6-methylsalicylic acid synthase (MSAS) from Penicillium patulum, which is involved in the biosynthesis of a polyketide antibiotic; polyketide antibiotic synthase enzyme systems; Emericella nidulans multifunctional protein Wa, which is involved in the biosynthesis of conidial green pigment; Rhizobium nodulation protein nodE, which probably acts as a beta-ketoacyl synthase in the synthesis of the nodulation Nod factor fatty acyl chain; and yeast mitochondrial protein CEM1. The condensation reaction is a two-step process: first the acyl component of an activated acyl primer is transferred to a cysteine residue of the enzyme, and it is then condensed with an activated malonyl donor with the concomitant release of carbon dioxide.
This entry represents the N-terminal domain of beta-ketoacyl-ACP synthases.
Ferredoxins are electron carrier proteins with an iron-sulphur cofactor that act in a wide variety of metabolic reactions. Ferredoxins can be divided into several subgroups depending upon the physiological nature of the iron-sulphur cluster(s) and according to sequence similarities.
This entry represents members of the 2Fe-2S ferredoxin family that have a general core structure consisting of beta(2)-alpha-beta(2), which includes putidaredoxin and terpredoxin, and adrenodoxin. They are proteins of around one hundred amino acids with four conserved cysteine residues to which the 2Fe-2S cluster is ligated. This conserved region is also found as a domain in various metabolic enzymes and in multidomain proteins, such as aldehyde oxidoreductase (N-terminal), xanthine oxidase (N-terminal), phthalate dioxygenase reductase (C-terminal), succinate dehydrogenase iron-sulphur protein (N-terminal), and methane monooxygenase reductase (N-terminal).
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.
This group of proteins belongs to the peptidase family C1, sub-family C1A (papain family, clan CA). It includes proteins classed as non-peptidase homologs. These have either been shown experimentally to lack peptidase activity or lack one or more of the active site residues.
The papain family has a wide variety of activities, including broad-range (papain) and narrow-range endo-peptidases, aminopeptidases, dipeptidyl peptidases and enzymes with both exo- and endo-peptidase activity. Members of the papain family are widespread, found in baculovirus, eubacteria, yeast, and practically all protozoa, plants and mammals. The proteins are typically lysosomal or secreted, and proteolytic cleavage of the propeptide is required for enzyme activation, although bleomycin hydrolase is cytosolic in fungi and mammals. Papain-like cysteine proteinases are essentially synthesised as inactive proenzymes (zymogens) with N-terminal propeptide regions. The activation process of these enzymes includes the removal of propeptide regions. The propeptide regions serve a variety of functions in vivo and in vitro. The pro-region is required for the proper folding of the newly synthesised enzyme, the inactivation of the peptidase domain and stabilisation of the enzyme against denaturing at neutral to alkaline pH conditions. Amino acid residues within the pro-region mediate their membrane association, and play a role in the transport of the proenzyme to lysosomes. Among the most notable features of propeptides is their ability to inhibit the activity of their cognate enzymes and that certain propeptides exhibit high selectivity for inhibition of the peptidases from which they originate.
The catalytic residues of papain are Cys-25 and His-159, other important residues being Gln-19, which helps form the 'oxyanion hole', and Asn-175, which orientates the imidazole ring of His-159.
Enolase (2-phospho-D-glycerate hydrolase) is an essential glycolytic enzyme that catalyses the interconversion of 2-phosphoglycerate and phosphoenolpyruvate. In vertebrates, there are 3 different, tissue-specific isoenzymes, designated alpha, beta and gamma. Alpha is present in most tissues, beta is localised in muscle tissue, and gamma is found only in nervous tissue. The functional enzyme exists as a dimer of any 2 isoforms. In immature organs and in adult liver, it is usually an alpha homodimer, in adult skeletal muscle, a beta homodimer, and in adult neurons, a gamma homodimer. In developing muscle, it is usually an alpha/beta heterodimer, and in the developing nervous system, an alpha/gamma heterodimer. The tissue-specific forms display minor kinetic differences. Tau-crystallin, one of the major lens proteins in some fish, reptiles and birds, has been shown to be evolutionarily related to enolase.
Neuron-specific enolase is released in a variety of neurological diseases, such as multiple sclerosis and after seizures or acute stroke. Several tumour cells have also been found positive for neuron-specific enolase. Beta-enolase deficiency is associated with glycogenosis type XIII defect.
The enzyme complex consists of 3-4 subunits (prokaryotes) up to 13 polypeptides (mammals) of which only the catalytic subunit (equivalent to mammalian subunit I (CO I)) is found in all haem-copper respiratory oxidases. The presence of a bimetallic centre (formed by a high-spin haem and copper B) as well as a low-spin haem, both ligated to six conserved histidine residues near the outer side of four transmembrane spans within CO I is common to all family members. In contrast to eukaryotes the respiratory chain of prokaryotes is branched to multiple terminal oxidases. The enzyme complexes vary in haem and copper composition, substrate type and substrate affinity. The different respiratory oxidases allow the cells to customize their respiratory systems according to a variety of environmental growth conditions.
It has been shown that eubacterial quinol oxidase was derived from cytochrome c oxidase in Gram-positive bacteria and that archaebacterial quinol oxidase has an independent origin. A considerable amount of evidence suggests that proteobacteria (Purple bacteria) acquired quinol oxidase through a lateral gene transfer from Gram-positive bacteria.
Nitric oxide reductase (NOR) exists in denitrifying species of archaea and eubacteria and is a heterodimer of cytochromes b and c. Phenazine methosulphate can act as acceptor. The prosite signature in this entry recognises the haem-copper site of the nitric oxidases.
Cytochrome c oxidase is an oligomeric enzymatic complex which is a component of the respiratory chain and is involved in the transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane. The number of polypeptides in the complex ranges from 3-4 (prokaryotes) up to 13 (mammals).
Subunit 2 (CO II) transfers the electrons from cytochrome c to the catalytic subunit 1. It contains two adjacent transmembrane regions in its N-terminus and the major part of the protein is exposed to the periplasmic or to the mitochondrial intermembrane space, respectively. CO II provides the substrate-binding site and contains a copper centre called Cu(A), probably the primary acceptor in cytochrome c oxidase. An exception is the corresponding subunit of the cbb3-type oxidase which lacks the copper A redox-centre. Several bacterial CO II have a C-terminal extension that contains a covalently bound haem c.
Glutamine amidotransferase (GATase) activity involves the removal of the ammonia group from a glutamine molecule and its subsequent transfer to a specific substrate, thus creating a new carbon-nitrogen group on the substrate. This activity is found in a range of biosynthetic enzymes, including glutamine amidotransferase, anthranilate synthase component II, p-aminobenzoate, and glutamine-dependent carbamoyl-phosphate synthase (CPSase). Glutamine amidotransferase (GATase) domains can occur either as single polypeptides, as in glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. On the basis of sequence similarities two classes of GATase domains have been identified, class-I (also known as trpG-type) and class-II (also known as purF-type). Class-I GATase domains are defined by a conserved catalytic triad consisting of cysteine, histidine and glutamate. Class-I GATase domains have been found in the following enzymes: the second component of anthranilate synthase and 4-amino-4-deoxychorismate (ADC) synthase; CTP synthase; GMP synthase; glutamine-dependent carbamoyl-phosphate synthase; phosphoribosylformylglycinamidine synthase II; and the histidine amidotransferase hisH.
These signatures also detect peptidases belonging to MEROPS peptidase family C26 (gamma-glutamyl hydrolase), and non-peptidase homologs belonging to family C56 (PfpI endopeptidase) both of which are members of clan PC(C). Other members of family C56 are found in
Partially folded polypeptide chains, either newly made by ribosomes or emerging from mature proteins unfolded by stress, run the risk of aggregating with one another to the detriment of the organism. Folding of newly synthesised polypeptides in the crowded cellular environment requires the assistance of molecular chaperone proteins, such as the large bacterial chaperonins GroEL and GroES.
GroEL and GroES prevent aggregation by encapsulating individual chains within the so-called 'Anfinsen cage' provided by the GroEL-GroES complex, where they can fold in isolation from one another. GroEL consists of two heptameric rings of identical ATPase subunits stacked back to back, containing a cage in each ring. Each subunit consists of three domains. The equatorial domain contains the nucleotide binding site and is connected by a flexible intermediate domain with the apical domain. The latter presents several hydrophobic amino-acid side chains at the top of the ring, orientated towards the cavity of the cage. These side chains are involved in binding either a partially folded polypeptide chain or a single molecule of GroES.
The assembly of proteins has been thought to be the sole result of properties inherent in the primary sequence of polypeptides themselves. In some cases, however, structural information from other protein molecules is required for correct folding and subsequent assembly into oligomers. These 'helper' molecules are referred to as molecular chaperones, a subfamily of which are the chaperonins, which include 10 kDa and 60 kDa proteins. These are found in abundance in prokaryotes, chloroplasts and mitochondria. They are required for normal cell growth (as demonstrated by the fact that no temperature sensitive mutants for the chaperonin genes can be found in the temperature range 20 to 43 degrees centigrade), and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions.
The 10 kDa chaperonin (cpn10 - or groES in bacteria) exists as a ring-shaped oligomer of between 6 and 8 identical subunits, whereas the 60 kDa chaperonin (cpn60 - or groEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The cpn10 and cpn60 oligomers also require Mg2+-ATP in order to interact to form a functional complex, although the mechanism of this interaction is as yet unknown. This chaperonin complex is essential for the correct folding and assembly of polypeptides into oligomeric structures, of which the chaperonins themselves are not a part. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60.
The 60 kDa form of chaperonin is the immunodominant antigen of patients with Legionnaires' disease, and is thought to play a role in the protection of the Legionella bacteria from oxygen radicals within macrophages. This hypothesis is based on the finding that the cpn60 gene is upregulated in response to hydrogen peroxide, a source of oxygen radicals. Cpn60 has also been found to display strong antigenicity in many bacterial species, and has the potential for inducing immune protection against unrelated bacterial infections. The RuBisCO subunit binding protein (which has been implicated in the assembly of RuBisCO) and cpn60 have been found to be evolutionary homologues, the RuBisCO subunit binding protein having the C-terminal Gly-Gly-Met repeat found in all bacterial cpn60 sequences. Although the precise function of this repeat is unknown, it is thought to be important as it is also found in 70 kDa heat-shock proteins. The crystal structure of Escherichia coli GroEL has been resolved to 2.8 Å. The TCP-1 family of proteins act as molecular chaperones for tubulin, actin and probably some other proteins. They are weakly, but significantly, related to the cpn60/groEL chaperonin family.
Glutamine synthetase (GS) plays an essential role in the metabolism of nitrogen by catalyzing the condensation of glutamate and ammonia to form glutamine.
There seem to be three different classes of GS:
While the three classes of GS are clearly structurally related, the sequence similarities are not so extensive.
Triosephosphate isomerase (TIM) is the glycolytic enzyme that catalyses the reversible interconversion of glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. TIM plays an important role in several metabolic pathways and is essential for efficient energy production. It is a dimer of identical subunits, each of which is made up of about 250 amino-acid residues. A glutamic acid residue is involved in the catalytic mechanism. The sequence around the active site residue is perfectly conserved in all known TIMs. Deficiencies in TIM are associated with haemolytic anaemia coupled with a progressive, severe neurological disorder.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
P-ATPases (sometimes known as E1-E2 ATPases) are found in bacteria and in a number of eukaryotic plasma membranes and organelles. P-ATPases function to transport a variety of different compounds, including ions and phospholipids, across a membrane using ATP hydrolysis for energy. There are many different classes of P-ATPases, each of which transports a specific type of ion: H+, Na+, K+, Mg2+, Ca2+, Ag+ and Ag2+, Zn2+, Co2+, Pb2+, Ni2+, Cd2+, Cu+ and Cu2+. P-ATPases can be composed of one or two polypeptides, and can usually assume two main conformations called E1 and E2.
This entry represents an ATPase-associated region found in P-type ATPases. P-type (or E1-E2-type) ATPases, which form an aspartyl phosphate intermediate in the course of ATP hydrolysis, can be divided into 4 major groups: (1) Ca2+-transporting ATPases; (2) Na+/K+- and gastric H+/K+-transporting ATPases; (3) plasma membrane H+-transporting ATPases (proton pumps) of plants, fungi and lower eukaryotes; and (4) all bacterial P-type ATPases, except the Mg2+-ATPase of Salmonella typhimurium, which is more similar to the eukaryotic sequences. However, the wide variety of sequence analysis methods in use has led to differing classifications.
More information about this protein can be found at Protein of the Month: ATP Synthases.
The core histones together with some other DNA binding proteins appear to form a superfamily defined by a common fold and distant sequence similarities. Some proteins contain local homology domains related to the histone fold.
Diacylglycerol (DAG) is an important second messenger. Phorbol esters (PE) are analogues of DAG and potent tumour promoters that cause a variety of physiological changes when administered to both cells and tissues. DAG activates a family of serine/threonine protein kinases, collectively known as protein kinase C (PKC). Phorbol esters can directly stimulate PKC. The N-terminal region of PKC, known as C1, has been shown to bind PE and DAG in a phospholipid and zinc-dependent fashion. The C1 region contains one or two copies (depending on the isozyme of PKC) of a cysteine-rich domain, which is about 50 amino-acid residues long, and which is essential for DAG/PE-binding. The DAG/PE-binding domain binds two zinc ions; the ligands of these metal ions are probably the six cysteines and two histidines that are conserved in this domain.
A variety of bacterial transferases contain a repeat structure composed of tandem repeats of a [LIV]-G-X(4) hexapeptide, which, in the tertiary structure of LpxA (UDP N-acetylglucosamine acyltransferase), has been shown to form a left-handed parallel beta helix. A number of different transferase protein families contain this repeat, such as galactoside acetyltransferase-like proteins, the gamma-class of carbonic anhydrases, and tetrahydrodipicolinate-N-succinyltransferases (DapD), the latter containing an extra N-terminal 3-helical domain.
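As a rough illustration of the [LIV]-G-X(4) repeat described above, the following Python sketch scans a sequence for runs of back-to-back hexapeptides. The function name, the `min_repeats` threshold and the toy sequence in the usage note are assumptions made for this example; the regular expression is only a direct transliteration of the pattern quoted in the text, not the full signature used by any database.

```python
import re

# Regex form of the [LIV]-G-X(4) hexapeptide quoted in the text:
# one of L/I/V, then G, then any four residues (one-letter codes).
HEXAPEPTIDE = r"[LIV]G[A-Z]{4}"


def find_tandem_hexapeptides(sequence, min_repeats=2):
    """Return (start, end, n_repeats) for each run of consecutive hexapeptides.

    `min_repeats` is an illustrative threshold, not a value from the source.
    Coordinates are 0-based, with `end` exclusive.
    """
    pattern = re.compile(rf"(?:{HEXAPEPTIDE}){{{min_repeats},}}")
    runs = []
    for match in pattern.finditer(sequence.upper()):
        length = match.end() - match.start()
        runs.append((match.start(), match.end(), length // 6))
    return runs
```

For example, calling `find_tandem_hexapeptides("MKLGAAAAIGDDDDVGEEEEPP")` on this made-up sequence reports a single run of three repeats starting at index 2. A match of this kind only flags candidate regions; it says nothing about whether the region actually folds into a left-handed beta helix.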
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases.
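The class assignment listed above can be captured in a small lookup table. The Python sketch below encodes only the two amino-acid lists given in the text; the names `CLASS_I`, `CLASS_II` and `synthetase_class` are invented for this illustration, and the a/b/c subclasses are not modelled because the text does not enumerate them.

```python
# Amino acids whose synthetases the text assigns to each class.
CLASS_I = frozenset({
    "arginine", "cysteine", "glutamic acid", "glutamine", "isoleucine",
    "leucine", "methionine", "tyrosine", "tryptophan", "valine",
})
CLASS_II = frozenset({
    "alanine", "asparagine", "aspartic acid", "glycine", "histidine",
    "lysine", "phenylalanine", "proline", "serine", "threonine",
})


def synthetase_class(amino_acid):
    """Return "I" or "II" for the synthetase specific to `amino_acid`.

    Names are matched case-insensitively against the lists above;
    anything not listed raises ValueError.
    """
    name = amino_acid.strip().lower()
    if name in CLASS_I:
        return "I"
    if name in CLASS_II:
        return "II"
    raise ValueError(f"no class listed for {amino_acid!r}")
```

With this table, `synthetase_class("Lysine")` returns `"II"`, in line with the grouping of the asparagine, aspartic acid and lysine synthetases among the class II enzymes.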
Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles, and regulate cyclin-dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins, G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E.
Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi's sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus.
Cyclins contain two domains of similar all-alpha fold, of which this entry is associated with the N-terminal domain.
DNA is the biological information that instructs cells how to exist in an ordered fashion: accurate replication is thus one of the most important events in the life cycle of a cell. This function is performed by DNA-directed DNA polymerases, which add nucleotide triphosphate (dNTP) residues to the 3'-end of the growing chain of DNA, using a complementary DNA chain as a template. Small RNA molecules are generally used as primers for chain elongation, although terminal proteins may also be used for the de novo synthesis of a DNA chain. Even though there are 2 different methods of priming, these are mediated by 2 very similar polymerase classes, A and B, with similar methods of chain elongation. A number of DNA polymerases have been grouped under the designation of DNA polymerase family B. Six regions of similarity (numbered from I to VI) are found in all or a subset of the B family polymerases. The most conserved region (I) includes a conserved tetrapeptide with two aspartate residues. Its function is not yet known; however, it has been suggested that it may be involved in binding a magnesium ion. All sequences in the B family contain a characteristic DTDS motif, and possess many functional domains, including a 5'-3' elongation domain, a 3'-5' exonuclease domain, a DNA binding domain, and binding domains for both dNTPs and pyrophosphate.
This region of DNA polymerase B appears to consist of more than one structural domain, possibly including elongation, DNA-binding and dNTP binding activities.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
The F-ATPases (or F1F0-ATPases) and V-ATPases (or V1V0-ATPases) are each composed of two linked complexes: the F1 or V1 complex contains the catalytic core that synthesizes/hydrolyses ATP, and the F0 or V0 complex forms the membrane-spanning pore. The F- and V-ATPases all contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis.
This entry represents subunit C (also called subunit 9, or proteolipid in F-ATPases, or the 16 kDa proteolipid in V-ATPases) found in the F0 or V0 complex of F- and V-ATPases, respectively. In F-ATPases, ten C subunits form an oligomeric ring that makes up the F0 rotor. The flux of protons through the ATPase channel drives the rotation of the C subunit ring, which in turn is coupled to the rotation of the F1 complex gamma subunit rotor due to the permanent binding between the gamma and epsilon subunits of F1 and the C subunit ring of F0. The sequential protonation and deprotonation of Asp61 of subunit C is coupled to the stepwise movement of the rotor.
In V-ATPases, there are three proteolipid subunits (c, c' and c'') that form part of the proton-conducting pore, each containing a buried glutamic acid residue that is essential for proton transport, and together they form a hexameric ring spanning the membrane.
More information about this protein can be found at Protein of the Month: ATP Synthases.
Protein phosphorylation plays a central role in the regulation of cell functions, causing the activation or inhibition of many enzymes involved in various biochemical pathways. Kinases and phosphatases are the enzymes responsible for this, and may themselves be subject to control through the action of hormones and growth factors. Serine/threonine (S/T) phosphatases catalyse the dephosphorylation of phosphoserine and phosphothreonine residues. In mammalian tissues four different types of protein phosphatase (PP) have been identified, known as PP1, PP2A, PP2B and PP2C. Except for PP2C, these enzymes are evolutionarily related. The catalytic regions of the proteins are well conserved and have a slow mutation rate, suggesting that major changes in these regions are highly detrimental.
The metallo-phosphoesterase motif is found in a large number of proteins involved in phosphorylation. These include serine/threonine phosphatases, DNA polymerase, exonucleases, and other phosphatases.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases.
This entry includes the asparagine, aspartic acid and lysine tRNA synthetases.
A variety of substrate carrier proteins that are involved in energy transfer are found in the inner mitochondrial membrane or integral to the membrane of other eukaryotic organelles such as the peroxisome. Such proteins include: ADP, ATP carrier protein (ADP/ATP translocase); 2-oxoglutarate/malate carrier protein; phosphate carrier protein; tricarboxylate transport protein (or citrate transport protein); Graves disease carrier protein; yeast mitochondrial proteins MRS3 and MRS4; yeast mitochondrial FAD carrier protein; and many others. Structurally, these proteins can consist of up to three tandem repeats of a domain of approximately 100 residues, each domain containing two transmembrane regions.
The name PRT comes from phosphoribosyltransferase (PRTase) enzymes, which carry out phosphoryl transfer reactions on 5-phosphoribosyl-alpha1-pyrophosphate (PRPP), an activated form of ribose-5-phosphate. Members of the phosphoribosyltransferase (PRT) family are catalytic and regulatory proteins involved in nucleotide synthesis and salvage. The family includes a range of diverse phosphoribosyl transferase enzymes, including adenine phosphoribosyltransferase; hypoxanthine-guanine-xanthine phosphoribosyltransferase; hypoxanthine phosphoribosyltransferase; ribose-phosphate pyrophosphokinase; amidophosphoribosyltransferase; orotate phosphoribosyltransferase; uracil phosphoribosyltransferase; and xanthine-guanine phosphoribosyltransferase.
Not all PRT proteins are enzymes. For example, in some bacteria PRT proteins regulate the expression of purine and pyrimidine synthetic genes.
Members of PRT are defined by the protein fold and by a short 13-residue sequence motif. The motif consists of four hydrophobic amino acids, two acidic amino acids and seven amino acids of variable character, usually including glycine and threonine. The motif was predicted to be a PRPP-binding site in advance of structural information. Apart from this motif, different PRT proteins have a low level of sequence identity, less than 15%. The PRT sequence motif is only found in PRTases from the nucleotide synthesis and salvage pathways. Other PRTases, from the tryptophan, histidine and nicotinamide synthetic and salvage pathways, lack the PRT sequence motif and appear to be unrelated to each other and unrelated to the PRT family.
Cyclophilin is the major high-affinity binding protein in vertebrates for the immunosuppressive drug cyclosporin A (CSA), but is also found in other organisms. It exhibits a peptidyl-prolyl cis-trans isomerase activity (PPIase or rotamase). PPIase is an enzyme that accelerates protein folding by catalysing the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides. It is probable that CSA mediates some of its effects by forming a tight complex with cyclophilin that inhibits the phosphatase activity of calcineurin. Cyclophilin A is a cytosolic and highly abundant protein. The protein belongs to a family of isozymes, including cyclophilins B and C, and natural killer cell cyclophilin-related protein. Major isoforms have been found throughout the cell, including the ER, and some are even secreted. The sequences of the different forms of cyclophilin-type PPIases are well conserved.
Phosphoglycerate kinase (PGK) is an enzyme that catalyses the conversion of ADP to ATP and vice versa. In the second step of the second phase in glycolysis, 1,3-diphosphoglycerate is converted to 3-phosphoglycerate, forming one molecule of ATP. If the reverse were to occur, one molecule of ADP would be formed. This reaction is essential in most cells for the generation of ATP in aerobes, for fermentation in anaerobes and for carbon fixation in plants.
PGK is found in all living organisms and its sequence has been highly conserved throughout evolution. The enzyme exists as a monomer containing two nearly equal-sized domains that correspond to the N- and C-termini of the protein (the last 15 C-terminal residues loop back into the N-terminal domain). 3-phosphoglycerate (3-PG) binds to the N-terminal domain, while the nucleotide substrates, MgATP or MgADP, bind to the C-terminal domain of the enzyme. This extended two-domain structure is associated with large-scale 'hinge-bending' conformational changes, similar to those found in hexokinase. At the core of each domain is a 6-stranded parallel beta-sheet surrounded by alpha helices. Domain 1 has a parallel beta-sheet of six strands with an order of 342156, while domain 2 has a parallel beta-sheet of six strands with an order of 321456. Analysis of the reversible unfolding of yeast phosphoglycerate kinase leads to the conclusion that the two lobes are capable of folding independently, consistent with the presence of intermediates on the folding pathway with a single domain folded.
Phosphoglycerate kinase (PGK) deficiency is associated with haemolytic anaemia and mental disorders in man.
This entry represents the full PGK enzyme.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein S4 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S4 is known to bind directly to 16S ribosomal RNA. Mutations in S4 have been shown to increase translational error frequencies. S4 is a protein of 171 to 205 amino-acid residues (except for NAM9, which is much larger). The crystal structure of a bacterial S4 protein revealed a two-domain molecule. The first domain is composed of four helices in the known structure. The second domain is located in the middle of the first one and displays some structural homology with the ETS DNA binding domain. This family includes small ribosomal subunit S4 from prokaryotes and S9 from animals.
Ribosomal protein S12 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S12 is known to be involved in the translation initiation step. It is a very basic protein of 120 to 150 amino-acid residues. S12 belongs to a family of ribosomal proteins which are grouped on the basis of sequence similarities. This protein is known typically as S12 in bacteria, S23 in eukaryotes and as either S12 or S23 in the Archaea.
Bacterial S12 molecules contain a conserved aspartic acid residue which undergoes a novel post-translational modification, beta-methylthiolation, to form the corresponding 3-methylthioaspartic acid.
The chaperonins are 'helper' molecules required for correct folding and subsequent assembly of some proteins. These are required for normal cell growth, and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions. Type I chaperonins present in eubacteria, mitochondria and chloroplasts require the concerted action of 2 proteins, chaperonin 60 (cpn60) and chaperonin 10 (cpn10).
The 10 kDa chaperonin (cpn10 - or groES in bacteria) exists as a ring-shaped oligomer of between six and eight identical subunits, while the 60 kDa chaperonin (cpn60 - or groEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The central cavity of the cylindrical cpn60 tetradecamer provides an isolated environment for protein folding whilst cpn10 binds to cpn60 and synchronises the release of the folded protein in an Mg2+-ATP dependent manner. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60.
Escherichia coli GroES has also been shown to bind ATP cooperatively, and with an affinity comparable to that of GroEL. Each GroEL subunit contains three structurally distinct domains: an apical, an intermediate and an equatorial domain. The apical domain contains the binding sites for both GroES and the unfolded protein substrate. The equatorial domain contains the ATP-binding site and most of the oligomeric contacts. The intermediate domain links the apical and equatorial domains and transfers allosteric information between them. The GroEL oligomer is a tetradecamer, cylindrically shaped, that is organised in two heptameric rings stacked back to back. Each GroEL ring contains a central cavity, known as the 'Anfinsen cage', that provides an isolated environment for protein folding. The identical 10 kDa subunits of GroES form a dome-like heptameric oligomer in solution. ATP binding to GroES may be important in charging the seven subunits of the interacting GroEL ring with ATP, to facilitate cooperative ATP binding and hydrolysis for substrate protein release.
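The stoichiometry described above can be turned into a rough estimate of the size of the assembled rings. The following is a back-of-the-envelope sketch only: the function name is hypothetical, and the 10 kDa and 60 kDa figures are the nominal subunit sizes taken from the protein names, not measured subunit masses.

```python
def nominal_complex_mass_kda(subunit_kda, subunits_per_ring, rings=1):
    """Nominal mass (kDa) of a homo-oligomeric ring assembly.

    Assumes identical subunits and ignores bound nucleotide and substrate.
    """
    return subunit_kda * subunits_per_ring * rings


# GroEL (cpn60): two stacked heptameric rings of nominal 60 kDa subunits.
groel = nominal_complex_mass_kda(60, 7, rings=2)

# GroES (cpn10): a single heptameric ring of nominal 10 kDa subunits.
groes = nominal_complex_mass_kda(10, 7)

print(f"GroEL tetradecamer: ~{groel} kDa")
print(f"GroES heptamer: ~{groes} kDa")
```

On these nominal figures the GroEL tetradecamer is roughly twelve times the mass of the GroES heptamer that caps it.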
The 3D structure of the C2 domain of synaptotagmin has been reported; the domain forms an eight-stranded beta sandwich constructed around a conserved 4-stranded motif, designated a C2 key. Calcium binds in a cup-shaped depression formed by the N- and C-terminal loops of the C2-key motif. Structural analyses of several C2 domains have shown them to consist of similar tertiary structures in which three Ca2+-binding loops are located at the end of an 8-stranded antiparallel beta sandwich.
The 'pleckstrin homology' (PH) domain is a domain of about 100 residues that occurs in a wide range of proteins involved in intracellular signalling or as constituents of the cytoskeleton.
The function of this domain is not clear; several putative functions have been suggested:
It is possible that different PH domains have totally different ligand requirements.
The 3D structure of several PH domains has been determined. All known cases have a common structure consisting of two perpendicular anti-parallel beta sheets, followed by a C-terminal amphipathic helix. The loops connecting the beta-strands differ greatly in length, making the PH domain relatively difficult to detect. There are no totally invariant residues within the PH domain.
Proteins reported to contain one or more PH domains belong to the following families:
The 3D structure of bovine cyt b5 is known, the fold belonging to the alpha+beta class, with 5 strands and 5 short helices forming a framework for supporting a central haem group. The cytochrome b5 domain is similar to that of a number of oxidoreductases, such as plant and fungal nitrate reductases, sulphite oxidase, yeast flavocytochrome b2 (L-lactate dehydrogenase) and plant cyt b5/acyl lipid desaturase fusion protein.
Bacterial ferredoxin-NADP+ reductase may be bound to the thylakoid membrane or anchored to the thylakoid-bound phycobilisomes. Chloroplast ferredoxin-NADP+ reductase may play a key role in regulating the relative amounts of cyclic and non-cyclic electron flow to meet the demands of the plant for ATP and reducing power. It is involved in the final step in the linear photosynthetic electron transport chain and has also been implicated in cyclic electron flow around photosystem I where its role would be to return electrons from ferredoxin to the cytochrome B-F complex.
This domain is present in a variety of proteins that include bacterial flavohemoprotein, mammalian NADH-cytochrome b5 reductase, eukaryotic NADPH-cytochrome P450 reductase, nitrate reductase from plants, nitric-oxide synthase, bacterial vanillate demethylase and others.
This domain is found in proteins involved in a variety of processes including transcription regulation (e.g., SNF2, STH1, brahma, MOT1), DNA repair (e.g., ERCC6, RAD16, RAD5), DNA recombination (e.g., RAD54), and chromatin unwinding (e.g., ISWI) as well as a variety of other proteins with little functional information (e.g., lodestar, ETL1). SNF2 functions as the ATPase component of the SNF2/SWI multisubunit complex, which utilises energy derived from ATP hydrolysis to disrupt histone-DNA interactions, resulting in the increased accessibility of DNA to transcription factors.
Proteins that contain this domain appear to be distantly related to the DEAX box helicases; however, no helicase activity has ever been demonstrated for these proteins.
Ribosomal protein S7 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S7 is known to bind directly to part of the 3' end of 16S ribosomal RNA. It belongs to a family of ribosomal proteins which have been grouped on the basis of sequence similarities. The structure of S7 is known.
The post-translational attachment of ubiquitin to proteins (ubiquitinylation) alters the function, location or trafficking of a protein, or targets it to the 26S proteasome for degradation. Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), and a ubiquitin ligase (E3), which work sequentially in a cascade. The E1 enzyme mediates an ATP-dependent transfer of a thioester-linked ubiquitin molecule to a cysteine residue on the E2 enzyme. The E2 enzyme then either transfers the ubiquitin moiety directly to a substrate, or to an E3 ligase, which can also ubiquitinylate a substrate.
There are several different E2 enzymes (over 30 in humans), which are broadly grouped into four classes, all of which have a core catalytic domain (containing the active site cysteine), and some of which have short N- and C-terminal amino acid extensions: class I enzymes consist of just the catalytic core domain (UBC), class II possess a UBC and a C-terminal extension, class III possess a UBC and an N-terminal extension, and class IV possess a UBC and both N- and C-terminal extensions. These extensions appear to be important for some subfamily function, including E2 localisation and protein-protein interactions. In addition, there are proteins with an E2-like fold that are devoid of catalytic activity, but which appear to assist in poly-ubiquitin chain formation.
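The four-class scheme above depends only on which terminal extensions flank the UBC core, so it can be written as a small decision rule. This is an illustrative sketch: the function name and its boolean parameters are hypothetical, and deciding whether a real sequence carries an extension is outside its scope.

```python
def e2_class(has_n_extension: bool, has_c_extension: bool) -> str:
    """Return the E2 class (I-IV) implied by the extensions around the UBC core."""
    if has_n_extension and has_c_extension:
        return "IV"   # UBC plus both N- and C-terminal extensions
    if has_n_extension:
        return "III"  # UBC plus an N-terminal extension
    if has_c_extension:
        return "II"   # UBC plus a C-terminal extension
    return "I"        # catalytic core domain only


for n_ext in (False, True):
    for c_ext in (False, True):
        print(f"N-ext={n_ext}, C-ext={c_ext} -> class {e2_class(n_ext, c_ext)}")
```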
Isocitrate dehydrogenase (IDH) is an important enzyme of carbohydrate metabolism which catalyses the oxidative decarboxylation of isocitrate into alpha-ketoglutarate. IDH is either dependent on NAD+ or on NADP+. In eukaryotes there are at least three isozymes of IDH: two are located in the mitochondrial matrix (one NAD+-dependent, the other NADP+-dependent), while the third one (also NADP+-dependent) is cytoplasmic. In Escherichia coli the activity of a NADP+-dependent form of the enzyme is controlled by the phosphorylation of a serine residue; the phosphorylated form of IDH is completely inactivated.
3-isopropylmalate dehydrogenase (IMDH) catalyses the third step in the biosynthesis of leucine in bacteria and fungi, the oxidative decarboxylation of 3-isopropylmalate into 2-oxo-4-methylvalerate. Tartrate dehydrogenase catalyses the reduction of tartrate to oxaloglycolate.
These enzymes are evolutionarily related. The best conserved region of these enzymes is a glycine-rich stretch of residues located in the C-terminal section.
Ribosomal protein L2 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L2 is known to bind to the 23S rRNA and to have peptidyltransferase activity. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:
Prokaryotes and eukaryotes respond to heat shock and other forms of environmental stress by inducing synthesis of heat-shock proteins (hsp). The 90 kDa heat shock protein, Hsp90, is one of the most abundant proteins in eukaryotic cells, comprising 1-2% of cellular proteins under non-stress conditions. Its contribution to various cellular processes including signal transduction, protein folding, protein degradation and morphological evolution has been extensively studied. The full functional activity of Hsp90 is gained in concert with other co-chaperones, playing an important role in the folding of newly synthesised proteins and stabilisation and refolding of denatured proteins after stress. Apart from its co-chaperones, Hsp90 binds to an array of client proteins, where the co-chaperone requirement varies and depends on the actual client.
The sequences of hsp90s show a distinctive domain structure, with a highly-conserved N-terminal domain separated from a conserved, acidic C-terminal domain by a highly-acidic, flexible linker region.
This family contains two related enzymes:
Dihydrofolate reductase (DHFR) catalyses the NADPH-dependent reduction of dihydrofolate to tetrahydrofolate, an essential step in de novo synthesis both of glycine and of purines and deoxythymidine phosphate (the precursors of DNA synthesis), and important also in the conversion of deoxyuridine monophosphate to deoxythymidine monophosphate. Although DHFR is found ubiquitously in prokaryotes and eukaryotes, and is found in all dividing cells, maintaining levels of fully reduced folate coenzymes, the catabolic steps are still not well understood.
Bacterial species possess distinct DHFR enzymes (based on their pattern of binding diaminoheterocyclic molecules), but mammalian DHFRs are highly similar. The active site is situated in the N-terminal half of the sequence, which includes a conserved Pro-Trp dipeptide; the tryptophan has been shown to be involved in the binding of substrate by the enzyme. Its central role in DNA precursor synthesis, coupled with its inhibition by antagonists such as trimethoprim and methotrexate, which are used as anti-bacterial or anti-cancer agents, has made DHFR a target of anticancer chemotherapy. However, resistance has developed against some drugs, as a result of changes in DHFR itself.
Aminotransferases share certain mechanistic features with other pyridoxalphosphate-dependent enzymes, such as the covalent binding of the pyridoxalphosphate group to a lysine residue. On the basis of sequence similarity, these various enzymes can be grouped into subfamilies. One of these, called class-III, includes acetylornithine aminotransferase, which catalyzes the transfer of an amino group from acetylornithine to alpha-ketoglutarate, yielding N-acetyl-glutamic-5-semi-aldehyde and glutamic acid; ornithine aminotransferase, which catalyzes the transfer of an amino group from ornithine to alpha-ketoglutarate, yielding glutamic-5-semi-aldehyde and glutamic acid; omega-amino acid--pyruvate aminotransferase, which catalyzes transamination between a variety of omega-amino acids, mono- and diamines, and pyruvate; 4-aminobutyrate aminotransferase (GABA transaminase), which catalyzes the transfer of an amino group from GABA to alpha-ketoglutarate, yielding succinate semialdehyde and glutamic acid; DAPA aminotransferase, a bacterial enzyme (bioA), which catalyzes an intermediate step in the biosynthesis of biotin, the transamination of 7-keto-8-aminopelargonic acid to form 7,8-diaminopelargonic acid; 2,2-dialkylglycine decarboxylase, a Burkholderia cepacia (Pseudomonas cepacia) enzyme (dgdA) that catalyzes the decarboxylating amino transfer of 2,2-dialkylglycine and pyruvate to dialkyl ketone, alanine and carbon dioxide; glutamate-1-semialdehyde aminotransferase (GSA); Bacillus subtilis aminotransferases yhxA and yodT; Haemophilus influenzae aminotransferase HI0949; and Caenorhabditis elegans aminotransferase T01B11.2.
The small subunit ribosomal proteins can be categorised as: primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S19 contains 88-144 amino acid residues. In Escherichia coli, S19 is known to form a complex with S13 that binds strongly to 16S ribosomal RNA. Experimental evidence has revealed that S19 is moderately exposed on the ribosomal surface, and is designated a secondary rRNA binding protein. S19 belongs to a family of ribosomal proteins that includes: eubacterial S19; algal and plant chloroplast S19; cyanelle S19; archaebacterial S19; plant mitochondrial S19; and eukaryotic S15 ('rig' protein).
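The three binding categories above amount to levels in a dependency hierarchy, which the following sketch makes explicit. It is an illustration under stated assumptions: the function name is hypothetical, and the protein labels P1, P2, Q1, T1 and T2 are placeholders, not real assembly data for any ribosome.

```python
def binding_tiers(requires):
    """Classify proteins as primary, secondary or tertiary binders.

    `requires` maps each protein to the proteins that must already be
    bound before it can assemble. A protein with no requirements is
    primary; one that needs only primary binders is secondary; anything
    that needs a secondary or tertiary binder is tertiary.
    """
    names = {1: "primary", 2: "secondary", 3: "tertiary"}
    levels = {}

    def level(protein, path=()):
        if protein in levels:
            return levels[protein]
        if protein in path:
            raise ValueError(f"circular dependency involving {protein}")
        deps = requires.get(protein, ())
        if not deps:
            result = 1
        else:
            # One level above the deepest requirement, capped at tertiary.
            result = min(3, 1 + max(level(d, path + (protein,)) for d in deps))
        levels[protein] = result
        return result

    return {protein: names[level(protein)] for protein in requires}


# Hypothetical dependency map (placeholder labels, not real proteins).
example = {
    "P1": [],
    "P2": [],
    "Q1": ["P1"],
    "T1": ["Q1"],
    "T2": ["T1", "P2"],
}
tiers = binding_tiers(example)
print(tiers)
```

Capping the level at three reflects the statement that tertiary binding proteins may depend on other tertiary binding proteins without forming a fourth category.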
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils.
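For the six topoisomerases named above, the class can be read off the Roman numeral: odd numerals (I, III, V) are type I and even numerals (II, IV, VI) are type II. The sketch below encodes only that observation and the IIA/IIB split given in the text; the names `ROMAN`, `TYPE_II_SUBCLASS` and `topoisomerase_type` are hypothetical, and the parity rule is assumed to hold only for the enzymes listed here.

```python
# Roman numerals of the topoisomerases named in the text.
ROMAN = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5, "VI": 6}

# Structural subdivision of the type II enzymes as given in the text.
TYPE_II_SUBCLASS = {"II": "IIA", "IV": "IIA", "VI": "IIB"}


def topoisomerase_type(numeral: str) -> str:
    """Return 'type I' (single-strand break) or 'type II' (double-strand break)."""
    n = ROMAN[numeral.upper()]
    return "type I" if n % 2 == 1 else "type II"


for numeral in ROMAN:
    kind = topoisomerase_type(numeral)
    subclass = TYPE_II_SUBCLASS.get(numeral)
    suffix = f" ({subclass})" if subclass else ""
    print(f"topoisomerase {numeral}: {kind}{suffix}")
```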
Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions, domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA.
Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.
This entry represents the second domain found in subunit B (gyrB and parE) of bacterial gyrase and topoisomerase IV, and the equivalent N-terminal region in eukaryotic topoisomerase II composed of a single polypeptide.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
A number of enzymes require thiamine pyrophosphate (TPP) (vitamin B1) as a cofactor. It has been shown that some of these enzymes are structurally related. This central domain of TPP enzymes contains a 2-fold Rossmann fold.
A number of enzymes, belonging to the lyase class, for which fumarate is a substrate, have been shown to share a short conserved sequence around a methionine which is probably involved in the catalytic activity of this type of enzyme. The following are examples of members of this family:
Glutamate, leucine, phenylalanine and valine dehydrogenases are structurally and functionally related. They contain a Gly-rich region containing a conserved Lys residue, which has been implicated in the catalytic activity, in each case a reversible oxidative deamination reaction.
Glutamate dehydrogenases (GluDH) are enzymes that catalyse the NAD- and/or NADP-dependent reversible deamination of L-glutamate into alpha-ketoglutarate. GluDH isozymes are generally involved with either ammonia assimilation or glutamate catabolism. Two separate enzymes are present in yeasts: the NADP-dependent enzyme, which catalyses the amination of alpha-ketoglutarate to L-glutamate; and the NAD-dependent enzyme, which catalyses the reverse reaction - this form links the L-amino acids with the Krebs cycle, which provides a major pathway for metabolic interconversion of alpha-amino acids and alpha-keto acids.
Leucine dehydrogenase (LeuDH) is an NAD-dependent enzyme that catalyses the reversible deamination of leucine and several other aliphatic amino acids to their keto analogues. Each subunit of this octameric enzyme from Bacillus sphaericus contains 364 amino acids and folds into two domains, separated by a deep cleft. The nicotinamide ring of the NAD+ cofactor binds deep in this cleft, which is thought to close during the hydride transfer step of the catalytic cycle.
Phenylalanine dehydrogenase (PheDH) is an NAD-dependent enzyme that catalyses the reversible deamination of L-phenylalanine into phenyl-pyruvate.
Valine dehydrogenase (ValDH) is an NADP-dependent enzyme that catalyses the reversible deamination of L-valine into 3-methyl-2-oxobutanoate.
This entry represents the C-terminal domain of these proteins.
Guanylate cyclases catalyse the formation of cyclic GMP (cGMP) from GTP. cGMP acts as an intracellular messenger, activating cGMP-dependent kinases and regulating cGMP-sensitive ion channels. The role of cGMP as a second messenger in vascular smooth muscle relaxation and retinal photo-transduction is well established. Guanylate cyclase is found both in the soluble and particulate fractions of eukaryotic cells. The soluble and plasma membrane-bound forms differ in structure, regulation and other properties. Most currently known plasma membrane-bound forms are receptors for small polypeptides. The soluble forms of guanylate cyclase are cytoplasmic heterodimers having alpha and beta subunits.
In all characterised eukaryote guanylyl- and adenylyl cyclases, cyclic nucleotide synthesis is carried out by the conserved class III cyclase domain.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.
This family represents subunits called delta in bacterial and chloroplast ATPase, or OSCP (oligomycin sensitivity conferral protein) in mitochondrial ATPase (note that in mitochondria there is a different delta subunit). The OSCP/delta subunit appears to be part of the peripheral stalk that holds the F1 complex alpha3beta3 catalytic core stationary against the torque of the rotating central stalk, and links subunit A of the F0 complex with the F1 complex. In mitochondria, the peripheral stalk consists of OSCP, as well as F0 components F6, B and D. In bacteria and chloroplasts the peripheral stalks have different subunit compositions: delta and two copies of F0 component B (bacteria), or delta and F0 components B and B' (chloroplasts).
More information about this protein can be found at Protein of the Month: ATP Synthases.
Orotidine 5'-phosphate decarboxylase (OMPdecase) catalyses the last step in the de novo biosynthesis of pyrimidines, the decarboxylation of OMP into UMP. In higher eukaryotes OMPdecase is part, with orotate phosphoribosyltransferase, of a bifunctional enzyme, while the prokaryotic and fungal OMPdecases are monofunctional proteins.
Some parts of the sequence of OMPdecase are well conserved across species. The best conserved region is located in the N-terminal half of OMPdecases and is centred around a lysine residue which is essential for the catalytic function of the enzyme.
Pyruvate kinase (PK) catalyses the final step in glycolysis, the conversion of phosphoenolpyruvate to pyruvate with concomitant phosphorylation of ADP to ATP:
ADP + phosphoenolpyruvate = ATP + pyruvate
The enzyme, which is found in all living organisms, requires both magnesium and potassium ions for its activity. In vertebrates, there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart and brain), and M2 (early foetal tissue). In plants, PK exists as cytoplasmic and plastid isozymes, while most bacteria and lower eukaryotes have one form; certain bacteria, such as Escherichia coli, have two isozymes. All isozymes appear to be tetramers of identical subunits of ~500 residues.
PK helps control the rate of glycolysis, along with phosphofructokinase and hexokinase. PK possesses allosteric sites for numerous effectors, yet the isozymes respond differently, in keeping with their different tissue distributions. The activity of L-type (liver) PK is increased by fructose-1,6-bisphosphate (F1,6BP) and lowered by ATP and alanine (gluconeogenic precursor), therefore when glucose levels are high, glycolysis is promoted, and when levels are low, gluconeogenesis is promoted. L-type PK is also hormonally regulated, being activated by insulin and inhibited by glucagon, which covalently modifies the PK enzyme. M1-type (muscle, brain) PK is inhibited by ATP, but F1,6BP and alanine have no effect, which correlates with the function of muscle and brain, as opposed to the liver.
The structure of several pyruvate kinases from various organisms have been determined. The protein comprises three-four domains: a small N-terminal helical domain (absent in bacterial PK), a beta/alpha-barrel domain, a beta-barrel domain (inserted within the beta/alpha-barrel domain), and a 3-layer alpha/beta/alpha sandwich domain.
This entry represents the two barrel domains, the beta/alpha-barrel, and the beta-barrel inserted within it.
Kinesin is a microtubule-associated force-producing protein that may play a role in organelle transport. The kinesin motor activity is directed toward the microtubule's plus end. Kinesin is an oligomeric complex composed of two heavy chains and two light chains. The maintenance of the quaternary structure does not require interchain disulphide bonds.
The heavy chain is composed of three structural domains: a large globular N-terminal domain, which is responsible for the motor activity of kinesin (it is known to hydrolyse ATP and to bind and move on microtubules); a central alpha-helical coiled-coil domain that mediates heavy chain dimerisation; and a small globular C-terminal domain, which interacts with other proteins (such as the kinesin light chains), vesicles and membranous organelles.
A number of proteins have been recently found that contain a domain similar to that of the kinesin 'motor' domain:
The kinesin motor domain is located in the N-terminal part of most of the above proteins, with the exception of KAR3, klpA, and ncd where it is located in the C-terminal section.
The kinesin motor domain contains about 330 amino acids. An ATP-binding motif of type A is found near positions 80 to 90, and the C-terminal half of the domain is involved in microtubule binding.
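As an illustration of how such an ATP-binding motif might be located in a sequence, the sketch below scans for the P-loop (Walker A) consensus. It assumes that "type A" refers to the Walker A motif and uses the widely cited consensus [AG]-x(4)-G-K-[ST]; the function name and the toy sequence are hypothetical and are not taken from any real kinesin entry.

```python
import re

# Assumed consensus for the Walker A / P-loop motif: [AG]-x(4)-G-K-[ST].
# The lookahead lets overlapping candidates be reported as well.
WALKER_A = re.compile(r"(?=([AG].{4}GK[ST]))")


def find_walker_a(sequence):
    """Return (1-based start, matched residues) for each Walker A-like match."""
    return [(m.start() + 1, m.group(1)) for m in WALKER_A.finditer(sequence.upper())]


# Hypothetical toy sequence containing a single P-loop-like stretch.
toy_sequence = "MSTAAAGQTGSGKTHTMLL"
walker_a_hits = find_walker_a(toy_sequence)
print(walker_a_hits)
```

A match only flags a candidate site; short consensus patterns of this kind also occur by chance, so the position within the motor domain would still need to be checked.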
The prokaryotic heat shock protein DnaJ interacts with the chaperone hsp70-like DnaK protein. Structurally, the DnaJ protein consists of an N-terminal conserved domain (called 'J' domain) of about 70 amino acids, a glycine-rich region ('G' domain) of about 30 residues, a central domain containing four repeats of a CXXCXGXG motif ('CRR' domain) and a C-terminal region of 120 to 170 residues.
Such a structure is shown in the following schematic representation:
It is thought that the 'J' domain of DnaJ mediates the interaction with the DnaK protein and consists of four helices, the second of which has a charged surface that includes at least one pair of basic residues that are essential for interaction with the ATPase domain of Hsp70. The J- and CRR-domains are found in many prokaryotic and eukaryotic proteins, either together or separately. In yeast, J-domains have been classified into 3 groups; the class III proteins are functionally distinct and do not appear to act as molecular chaperones.
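The CXXCXGXG repeats of the 'CRR' domain lend themselves to a simple pattern search. The following is a minimal sketch, assuming that X stands for any residue; the function name and the toy sequence are hypothetical and chosen only to show two repeats being found.

```python
import re

# CXXCXGXG with X taken to mean any residue (an assumption of this sketch).
CRR_REPEAT = re.compile(r"C..C.G.G")


def find_crr_repeats(sequence):
    """Return (1-based start, matched residues) for each non-overlapping repeat."""
    return [(m.start() + 1, m.group()) for m in CRR_REPEAT.finditer(sequence.upper())]


# Hypothetical toy sequence with two repeats separated by a short linker.
toy_sequence = "AACPTCHGSGAAACATCNGRGAA"
crr_hits = find_crr_repeats(toy_sequence)
print(len(crr_hits), crr_hits)
```

Counting the hits gives a quick check against the four repeats expected in a full-length DnaJ central domain.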
This group contains threonine peptidases and non-peptidase homologs belonging to MEROPS peptidase family T1 (proteasome family, clan PB(T)). The family consists of the protease components of the archaeal and bacterial proteasomes and the alpha and beta subunits of the eukaryotic proteasome.
ATP-dependent protease complexes are present in all three kingdoms of life, where they rid the cell of misfolded or damaged proteins and control the level of certain regulatory proteins. They include the proteasome in Eukaryotes, Archaea, and Actinomycetales and the HslVU (ClpQY, clpXP) complex in other eubacteria. Genes homologous to eubacterial HslV (ClpQ) and HslU (ClpY, clpX) have also been demonstrated to be present in the genome of trypanosomatid protozoa.
The proteasome (or macropain) is a multicatalytic proteinase complex that is involved in an ATP/ubiquitin-dependent non-lysosomal proteolytic pathway. In eukaryotes the proteasome is composed of about 28 distinct subunits, which form a highly ordered ring-shaped structure (20S ring) of about 700 kDa. Most proteasome subunits can be classified, on the basis of sequence similarities, into two groups, A and B. In eukaryotic organisms there are up to seven different types of beta subunits, three of which may carry the N-terminal threonine residues that are the nucleophiles in catalysis, and show different specificities. The molecule is barrel-shaped, and the active sites are on the inner surfaces. Terminal apertures restrict access of substrates to the active sites.
In prokaryotes, the ATP-dependent proteasome is coded for by the heat-shock locus VU (HslVU). It consists of HslV, the protease (MEROPS peptidase subfamily T1B), and HslU, the ATPase and chaperone belonging to the AAA/Clp/Hsp100 family. The crystal structure of Thermotoga maritima HslV has been determined to 2.1 Å resolution. The structure of the dodecameric enzyme is well conserved compared to those from Escherichia coli and Haemophilus influenzae.
A number of transmembrane (TM) channel proteins can be grouped together on the basis of sequence similarities.
These include:
MIP family proteins are thought to contain 6 TM domains. Sequence analysis suggests that the proteins may have arisen through tandem, intragenic duplication from an ancestral protein that contained 3 TM domains.
Some of the proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins. Aquaporin-CHIP (Aquaporin 1) belongs to the Colton blood group system and is associated with Co(a/b) antigen.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.
The ATPase F1 complex gamma subunit forms the central shaft that connects the F0 rotary motor to the F1 catalytic core. The gamma subunit functions as a rotary motor inside the cylinder formed by the alpha(3)beta(3) subunits in the F1 complex. The best-conserved region of the gamma subunit is its C-terminus, which seems to be essential for assembly and catalysis.
More information about this protein can be found at Protein of the Month: ATP Synthases.
The cyclic nucleotide phosphodiesterases (PDE) comprise a group of enzymes that degrade the phosphodiester bond in the second messenger molecules cAMP and cGMP. They are divided into 11 families. They regulate the localisation, duration and amplitude of cyclic nucleotide signalling within subcellular domains. PDEs are therefore important for signal transduction.
PDE enzymes are often targets for pharmacological inhibition due to their unique tissue distribution, structural properties, and functional properties. Inhibitors include: Roflumilast for chronic obstructive pulmonary disease and asthma, Sildenafil for erectile dysfunction and Cilostazol for peripheral arterial occlusive disease, amongst others.
Retinal 3',5'-cGMP phosphodiesterase is located in photoreceptor outer segments: it is light activated, playing a pivotal role in signal transduction. In rod cells, PDE is oligomeric, comprising an alpha-, a beta- and 2 gamma-subunits, while in cones, PDE is a homodimer of alpha chains, which are associated with several smaller subunits. Both rod and cone PDEs catalyse the hydrolysis of cAMP or cGMP to the corresponding nucleoside 5' monophosphates, both enzymes also binding cGMP with high affinity. The cGMP-binding sites are located in the N-terminal half of the protein sequence, while the catalytic core resides in the C-terminal portion.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein L22 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L22 is known to bind 23S rRNA. It belongs to a family of ribosomal proteins which includes: bacterial L22; algal and plant chloroplast L22 (in legumes L22 is encoded in the nucleus instead of the chloroplast); cyanelle L22; archaebacterial L22; mammalian L17; plant L17 and yeast YL17.
Ribosomal protein L14 is one of the proteins from the large ribosomal subunit. In eubacteria, L14 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins, which have been grouped on the basis of sequence similarities. Based on amino-acid sequence homology, it is predicted that ribosomal protein L14 is a member of a recently identified family of structurally related RNA-binding proteins. L14 is a protein of 119 to 137 amino-acid residues.
Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), and a ubiquitin ligase (E3), which work sequentially in a cascade. There are many different E3 ligases, which are responsible for the type of ubiquitin chain formed, the specificity of the target protein, and the regulation of the ubiquitinylation process. Ubiquitinylation is an important regulatory tool that controls the concentration of key signalling proteins, such as those involved in cell cycle control, as well as removing misfolded, damaged or mutant proteins that could be harmful to the cell. Several ubiquitin-like molecules have been discovered, such as Ufm1, SUMO1, NEDD8, Rad23, Elongin B and Parkin, the latter being involved in Parkinson's disease.
Ubiquitin is a protein of 76 amino acid residues, found in all eukaryotic cells, whose sequence is extremely well conserved from protozoa to vertebrates. Ubiquitin acts through its post-translational attachment (ubiquitinylation) to other proteins, where these modifications alter the function, location or trafficking of the protein, or target it for destruction by the 26S proteasome. The terminal glycine in the C-terminal 4-residue tail of ubiquitin can form an isopeptide bond with a lysine residue in the target protein, or with a lysine in another ubiquitin molecule to form a ubiquitin chain that attaches itself to a target protein. Ubiquitin has seven lysine residues, any one of which can be used to link ubiquitin molecules together, resulting in different structures that alter the target protein in different ways. It appears that Lys(11)-, Lys(29)- and Lys(48)-linked poly-ubiquitin chains target the protein to the proteasome for degradation, while mono-ubiquitinylated and Lys(6)- or Lys(63)-linked poly-ubiquitin chains signal reversible modifications in protein activity, location or trafficking. For example, Lys(63)-linked poly-ubiquitinylation is known to be involved in DNA damage tolerance, inflammatory response, protein trafficking and signal transduction through kinase activation. In addition, the length of the ubiquitin chain alters the fate of the target protein. Regulatory proteins such as transcription factors and histones are frequent targets of ubiquitinylation.
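The relationship between linkage type and outcome described above can be written down as a small lookup. This is only a sketch of the generalisation given in the text, not a predictive model; the function name and the wording of the returned labels are hypothetical, and linkages that the text does not mention are reported as such rather than guessed.

```python
# Linkage positions grouped as described in the text above.
DEGRADATION_LINKAGES = {11, 29, 48}
REVERSIBLE_LINKAGES = {6, 63}


def predicted_fate(linkage=None):
    """Map a ubiquitin chain linkage (lysine position) to the outcome named in the text.

    Pass None for mono-ubiquitinylation.
    """
    if linkage is None:
        return "reversible modification"
    if linkage in DEGRADATION_LINKAGES:
        return "proteasomal degradation"
    if linkage in REVERSIBLE_LINKAGES:
        return "reversible modification"
    # Any other position is outside what the text describes.
    return "not described"


for site in (48, 63, None, 27):
    print(site, "->", predicted_fate(site))
```

Chain length, which the text notes also affects the fate of the target, is deliberately left out of this sketch.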
The actin-depolymerising factor homology (ADF-H) domain is an ~150-amino acid motif that is present in three phylogenetically distinct classes of eukaryotic actin-binding proteins:
Although these proteins are biochemically distinct and play different roles in actin dynamics, they all appear to use the ADF-H domain for their interactions with actin.
The ADF-H domain consists of a six-stranded mixed beta-sheet in which the four central strands (beta2-beta5) are anti-parallel and the two edge strands (beta1 and beta6) run parallel with the neighbouring strands. The sheet is surrounded by two alpha-helices on each side.
The 14-3-3 proteins are a large family of approximately 30 kDa acidic proteins which exist primarily as homo- and heterodimers within all eukaryotic cells. There is a high degree of sequence identity and conservation between all the 14-3-3 isotypes, particularly in the regions which form the dimer interface or line the central ligand binding channel of the dimeric molecule. Each 14-3-3 protein sequence can be roughly divided into three sections: a divergent amino terminus, the conserved core region and a divergent carboxyl terminus. The conserved middle core region of the 14-3-3s encodes an amphipathic groove that forms the main functional domain, a cradle for interacting with client proteins. The monomer consists of nine helices organised in an antiparallel manner, forming an L-shaped structure. The interior of the L-structure is composed of four helices: H3 and H5, which contain many charged and polar amino acids, and H7 and H9, which contain hydrophobic amino acids. These four helices form the concave amphipathic groove that interacts with target peptides.
14-3-3 proteins mainly bind proteins containing phosphothreonine or phosphoserine motifs; however, exceptions to this rule do exist. Extensive investigation of the 14-3-3 binding site of the mammalian serine/threonine kinase Raf-1 has produced a consensus sequence for 14-3-3-binding, RSxpSxP (in the single-letter amino-acid code, where x denotes any amino acid and p indicates that the next residue is phosphorylated). 14-3-3 proteins appear to effect intracellular signalling in one of three ways: by direct regulation of the catalytic activity of the bound protein; by regulating interactions between the bound protein and other molecules in the cell by sequestration or modification; or by controlling the subcellular localisation of the bound ligand. Proteins appear to initially bind to a single dominant site and then subsequently to many, much weaker secondary interaction sites. The 14-3-3 dimer is capable of changing the conformation of its bound ligand whilst itself undergoing minimal structural alteration.
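The RSxpSxP consensus combines a sequence pattern with a phosphorylation requirement, which can be sketched as a two-step check: find R-S-x-S-x-P in the sequence, then confirm that the second serine is phosphorylated. The function name, the toy sequence and the way phosphosites are supplied (a set of 1-based positions) are assumptions made for this example.

```python
import re

# Sequence part of the RSxpSxP consensus; the phosphate is checked separately.
MOTIF_1433 = re.compile(r"(?=(RS.S.P))")


def find_1433_sites(sequence, phospho_positions):
    """Return (1-based start, residues) for motif matches whose 4th residue is phosphorylated."""
    sites = []
    for m in MOTIF_1433.finditer(sequence.upper()):
        phospho_ser = m.start() + 4  # 1-based position of the serine marked pS
        if phospho_ser in phospho_positions:
            sites.append((m.start() + 1, m.group(1)))
    return sites


# Hypothetical toy sequence: the motif starts at residue 3, so pS is residue 6.
toy_sequence = "MKRSASEPLL"
print(find_1433_sites(toy_sequence, {6}))    # serine 6 phosphorylated
print(find_1433_sites(toy_sequence, set()))  # no phosphosites supplied
```

Because exceptions to the consensus exist, as noted above, the absence of a match does not rule out 14-3-3 binding.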
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
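A search for the basic HEXXH motif can be sketched in a few lines. The example covers only the loose five-residue form, with X taken as any residue; the stricter 'abXHEbbHbc' definition is not implemented because the residue classes would have to be assumed. The function name and the toy sequence are hypothetical.

```python
import re

# Loose HEXXH motif; the lookahead also reports overlapping occurrences.
HEXXH = re.compile(r"(?=(HE..H))")


def find_hexxh(sequence):
    """Return (1-based start, matched residues) for each HEXXH occurrence."""
    return [(m.start() + 1, m.group(1)) for m in HEXXH.finditer(sequence.upper())]


# Hypothetical toy sequence with one HEXXH site starting at residue 6.
toy_sequence = "GAVAAHELGHNLGM"
hexxh_hits = find_hexxh(toy_sequence)
print(hexxh_hits)
```

Since the motif is relatively common, a hit on its own is weak evidence; it is better treated as a filter applied before checking the surrounding residues.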
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of sequences contains a diverse range of gene families, which include metallopeptidases belonging to MEROPS peptidase family M14 (carboxypeptidase A, clan MC), subfamilies M14A and M14B.
The carboxypeptidase A family can be divided into two subfamilies: carboxypeptidase H (regulatory) and carboxypeptidase A (digestive). Members of the H family have longer C-termini than those of family A, and carboxypeptidase M (a member of the H family) is bound to the membrane by a glycosylphosphatidylinositol anchor, unlike the majority of the M14 family, which are soluble.
The zinc ligands have been determined as two histidines and a glutamate, and the catalytic residue has been identified as a C-terminal glutamate, but these do not form the characteristic metalloprotease HEXXH motif. Members of the carboxypeptidase A family are synthesised as inactive molecules with propeptides that must be cleaved to activate the enzyme. Structural studies of carboxypeptidases A and B reveal the propeptide to exist as a globular domain, followed by an extended alpha-helix; this shields the catalytic site, without specifically binding to it, while the substrate-binding site is blocked by making specific contacts.
Other examples of protein families in this entry include:
The aldo-keto reductase family includes a number of related monomeric NADPH-dependent oxidoreductases, such as aldehyde reductase, aldose reductase, prostaglandin F synthase, xylose reductase, rho crystallin, and many others. All possess a similar structure, with a beta-alpha-beta fold characteristic of nucleotide binding proteins. The fold comprises a parallel beta-8/alpha-8-barrel, which contains a novel NADP-binding motif. The binding site is located in a large, deep, elliptical pocket in the C-terminal end of the beta sheet, the substrate being bound in an extended conformation. The hydrophobic nature of the pocket favours aromatic and apolar substrates over highly polar ones.
Binding of the NADPH coenzyme causes a massive conformational change, reorienting a loop, effectively locking the coenzyme in place. This binding is more similar to FAD- than to NAD(P)-binding oxidoreductases.
Some proteins of this entry contain a K+ ion channel beta chain regulatory domain; these are reported to have oxidoreductase activity.
This entry represents a structural domain with an alpha/beta-hammerhead fold, where the beta-hammerhead motif is similar to that in barrel-sandwich hybrids. Domains of this structure can be found in ribosomal proteins L10e and L16.
S14 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S14 is known to be required for the assembly of 30S particles and may also be responsible for determining the conformation of 16S rRNA at the A site. It belongs to a family of ribosomal proteins that includes bacterial, algal and plant chloroplast, yeast mitochondrial, cyanelle and archaeal (Methanococcus vannielii) S14s, as well as yeast mitochondrial MRP2, yeast YS29A/B and mammalian S29.
Synonym(s): Peptidylprolyl cis-trans isomerase
FKBP-type peptidylprolyl isomerases in vertebrates are receptors for the two immunosuppressants FK506 and rapamycin. The drugs inhibit T cell proliferation by arresting two distinct cytoplasmic signal transmission pathways. Peptidylprolyl isomerases accelerate protein folding by catalysing the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides. These proteins are found in a variety of organisms.
L15 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L15 is known to bind the 23S rRNA. Ribosomal protein L15 from bacteria and plant chloroplasts (nuclear-encoded) belongs to this family. Vertebrate L27a, Tetrahymena thermophila L29 and fungal L27a (L29, CRP-1, CYH2) are also members of this group.
Ribosomal L18E protein from a number of archaebacteria shows homology to both the eukaryotic L18 and eubacterial ribosomal protein L15, an observation that has been taken to support the view that archaea represent an evolutionary stage between bacteria and eukaryotes.
This domain is found in a number of proteins including flavodoxin and nitric-oxide synthase. Flavodoxins are electron-transfer proteins that function in various electron transport systems. They bind one FMN molecule, which serves as a redox-active prosthetic group, and are functionally interchangeable with ferredoxins. They have been isolated from prokaryotes, cyanobacteria, and some eukaryotic algae. Nitric oxide synthase produces nitric oxide from L-arginine and NADPH. Nitric oxide acts as a messenger molecule in the body.
Some of the proteins in this family are allergens. Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation.
The allergens in this family include allergens with the following designations: Met e 1.
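The designation rule described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name allergen_designation is hypothetical, the source organism used in the example (Metapenaeus ensis, for Met e 1) is an assumption not stated in this entry, and the disambiguation step (adding extra letters when two species would otherwise share a designation) is deliberately not implemented.

```python
def allergen_designation(genus: str, species: str, number: int) -> str:
    """Build an allergen designation from genus, species and an arabic number.

    Rule taken from the text: first three letters of the genus, a space,
    the first letter of the species name, a space, and the number.
    Collision handling (extra letters) is NOT implemented in this sketch.
    """
    genus = genus.strip()
    species = species.strip()
    if len(genus) < 3 or not species:
        raise ValueError("need a genus of at least three letters and a species name")
    if number < 1:
        raise ValueError("allergen number must be a positive integer")
    # Genus abbreviation is capitalised, species initial is lower case.
    return f"{genus[:3].capitalize()} {species[0].lower()} {number}"


# Assumed example organism (hypothetical input, not given in the entry text).
print(allergen_designation("Metapenaeus", "ensis", 1))
```

A real implementation would also need a registry of existing designations to detect and resolve clashes between species, as the nomenclature requires.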
Ribonucleotide reductase catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides:
2'-deoxyribonucleoside diphosphate + oxidized thioredoxin + H2O = ribonucleoside diphosphate + reduced thioredoxin
It provides the precursors necessary for DNA synthesis. RNRs divide into three classes on the basis of their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, bacteriophage and viruses, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in anaerobic bacteria and bacteriophage, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes.
Ribonucleotide reductase is an oligomeric enzyme composed of a large subunit (700 to 1000 residues) and a small subunit (300 to 400 residues) - class II RNRs are less complex, using the small molecule B12 in place of the small chain. The small chain binds two iron atoms (three Glu, one Asp, and two His are involved in metal binding) and contains an active site tyrosine radical. The regions of the sequence that contain the metal-binding residues and the active site tyrosine are conserved in ribonucleotide reductase small chain from prokaryotes, eukaryotes and viruses. We have selected one of these regions as a signature pattern. It contains the active site residue as well as a glutamate and a histidine involved in the binding of iron.
Members of this family include the DEAD and DEAH box helicases. Helicases are involved in unwinding nucleic acids. The DEAD box helicases are involved in various aspects of RNA metabolism, including nuclear transcription, pre-mRNA splicing, ribosome biogenesis, nucleocytoplasmic transport, translation, RNA decay and organellar gene expression.
The domain that defines this group of proteins is found in a wide variety of helicases and helicase-related proteins. It may not be an autonomously folding unit, but rather an integral part of the helicase.
The eukaryotic translation initiation factor 4A (eIF4A) is a member of the DEA(D/H)-box RNA helicase family. This is a diverse group of proteins that couples an ATPase activity to RNA binding and unwinding. The structure of the carboxyl-terminal domain of eIF4A has been determined to 1.75 Å resolution; it has a parallel alpha-beta topology that superimposes, with minor variations, on the structures and conserved motifs of the equivalent domain in other, distantly related helicases.
Fructose-bisphosphate aldolase is a glycolytic enzyme that catalyses the reversible aldol cleavage or condensation of fructose-1,6-bisphosphate into dihydroxyacetone-phosphate and glyceraldehyde 3-phosphate. There are two classes of fructose-bisphosphate aldolases with different catalytic mechanisms: class I enzymes are found in animals, do not require a metal ion, and are characterised by the formation of a Schiff base intermediate between a highly conserved active site lysine and a substrate carbonyl group, while the class II enzymes are produced in bacteria and fungi, and require an active-site divalent metal ion. This entry represents the class I enzymes.
In vertebrates, three forms of this enzyme are found: aldolase A is expressed in muscle, aldolase B in liver, kidney, stomach and intestine, and aldolase C in brain, heart and ovary. The different isozymes have different catalytic functions: aldolases A and C are mainly involved in glycolysis, while aldolase B is involved in both glycolysis and gluconeogenesis. Defects in aldolase B result in hereditary fructose intolerance.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
This domain is found in both eukaryotic L25 and prokaryotic and eukaryotic L23 proteins.
Ribosomal protein L5 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L5 is known to be involved in binding 5S RNA to the large ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:
L5 is a protein of about 180 amino-acid residues.
Citrate synthase is a member of a small family of enzymes that can directly form a carbon-carbon bond without the presence of metal ion cofactors. It catalyses the first reaction in the Krebs cycle, namely the conversion of oxaloacetate and acetyl-coenzyme A into citrate and coenzyme A. This reaction is important for energy generation and for carbon assimilation. The reaction proceeds via a non-covalently bound citryl-coenzyme A intermediate in a 2-step process (aldol-Claisen condensation followed by the hydrolysis of citryl-CoA).
Citrate synthase enzymes are found in two distinct structural types: type I enzymes (found in eukaryotes, Gram-positive bacteria and archaea) form homodimers and have shorter sequences than type II enzymes, which are found in Gram-negative bacteria and are hexameric in structure. In both types, the monomer is composed of two domains: a large alpha-helical domain consisting of two structural repeats, where the second repeat is interrupted by a small alpha-helical domain. The cleft between these domains forms the active site, where both citrate and acetyl-coenzyme A bind. The enzyme undergoes a conformational change upon binding of the oxaloacetate ligand, whereby the active site cleft closes over in order to form the acetyl-CoA binding site. The energy required for domain closure comes from the interaction of the enzyme with the substrate. Type II enzymes possess an extra N-terminal beta-sheet domain, and some type II enzymes are allosterically inhibited by NADH.
This entry represents types I and II citrate synthase enzymes, as well as the related enzymes 2-methylcitrate synthase and ATP citrate synthase. 2-methylcitrate synthase catalyses the conversion of oxaloacetate and propanoyl-CoA into (2R,3S)-2-hydroxybutane-1,2,3-tricarboxylate and coenzyme A. This enzyme is induced during bacterial growth on propionate, while type II hexameric citrate synthase is constitutive. ATP citrate synthase (also known as ATP citrate lyase) catalyses the MgATP-dependent, CoA-dependent cleavage of citrate into oxaloacetate and acetyl-CoA, a key step in the reductive tricarboxylic acid pathway of CO2 assimilation used by a variety of autotrophic bacteria and archaea to fix carbon dioxide. ATP citrate synthase is composed of two distinct subunits. In eukaryotes, ATP citrate synthase is a homotetramer of a single large polypeptide, and is used to produce cytosolic acetyl-CoA from mitochondrially produced citrate.
The galacto-, homoserine, mevalonate and phosphomevalonate kinases contain, in their N-terminal section, a conserved Gly/Ser-rich region which is probably involved in the binding of ATP. This group of kinases has been called 'GHMP' (from the first letter of their substrates).
Carbamoyl phosphate synthase (CPSase) is a heterodimeric enzyme composed of a small and a large subunit (with the exception of CPSase III, see below). CPSase catalyses the synthesis of carbamoyl phosphate from bicarbonate, ATP and glutamine or ammonia, and represents the first committed step in pyrimidine and arginine biosynthesis in prokaryotes and eukaryotes, and in the urea cycle in most terrestrial vertebrates. CPSase has three active sites, one in the small subunit and two in the large subunit. The small subunit contains the glutamine binding site and catalyses the hydrolysis of glutamine to glutamate and ammonia. The large subunit has two homologous carboxy phosphate domains, both of which have ATP-binding sites; however, the N-terminal carboxy phosphate domain catalyses the phosphorylation of bicarbonate, while the C-terminal domain catalyses the phosphorylation of the carbamate intermediate. The carboxy phosphate domain found duplicated in the large subunit of CPSase is also present as a single copy in the biotin-dependent enzymes acetyl-CoA carboxylase (ACC), propionyl-CoA carboxylase (PCCase), pyruvate carboxylase (PC) and urea carboxylase.
Most prokaryotes carry one form of CPSase that participates in both arginine and pyrimidine biosynthesis; however, certain bacteria can have separate forms. The large subunit in bacterial CPSase has four structural domains: the carboxy phosphate domain 1, the oligomerisation domain, the carbamoyl phosphate domain 2 and the allosteric domain. CPSase heterodimers from Escherichia coli contain two molecular tunnels: an ammonia tunnel and a carbamate tunnel. These inter-domain tunnels connect the three distinct active sites, and function as conduits for the transport of unstable reaction intermediates (ammonia and carbamate) between successive active sites. The catalytic mechanism of CPSase involves the diffusion of carbamate through the interior of the enzyme from the site of synthesis within the N-terminal domain of the large subunit to the site of phosphorylation within the C-terminal domain.
Eukaryotes have two distinct forms of CPSase: a mitochondrial enzyme (CPSase I) that participates in both arginine biosynthesis and the urea cycle; and a cytosolic enzyme (CPSase II) involved in pyrimidine biosynthesis. CPSase II occurs as part of a multi-enzyme complex along with aspartate transcarbamoylase and dihydroorotase; this complex is referred to as the CAD protein. The hepatic expression of CPSase is transcriptionally regulated by glucocorticoids and/or cAMP. There is a third form of the enzyme, CPSase III, found in fish, which uses glutamine as a nitrogen source instead of ammonia. CPSase III is closely related to CPSase I, and is composed of a single polypeptide that may have arisen from gene fusion of the glutaminase and synthetase domains.
This entry represents the N-terminal domain of the large subunit of carbamoyl phosphate synthase. This domain can also be found in certain other related proteins.
The generic name 'NUDIX hydrolases' (NUcleoside DIphosphate linked to some other moiety X) has been coined for this domain family. The family can be divided into a number of subgroups, of which MutT anti-mutagenic activity represents only one type; most of the rest hydrolyse diverse nucleoside diphosphate derivatives (including ADP-ribose, GDP-mannose, TDP-glucose, NADH, UDP-sugars, dNTP and NTP).
This entry includes a variety of carbohydrate and pyrimidine kinases. The family includes phosphomethylpyrimidine kinase. This enzyme is part of the thiamine pyrophosphate (TPP) synthesis pathway; TPP is an essential cofactor for many enzymes.
Ribosomal protein L3 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L3 is known to bind to the 23S rRNA and may participate in the formation of the peptidyltransferase centre of the ribosome. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, includes bacterial, red algal, cyanelle, mammalian, yeast and Arabidopsis thaliana L3 proteins; archaeal Haloarcula marismortui HmaL3 (HL1); and yeast mitochondrial YmL9.
Ribosomal protein L11 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L11 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, plant chloroplast, red algal chloroplast, cyanelle and archaebacterial L11; and mammalian, plant and yeast L12 (YL15). L11 is a protein of 140 to 165 amino-acid residues. In E. coli, the C-terminal half of L11 has been shown to be in an extended and loosely folded conformation and is likely to be buried within the ribosomal structure.
Phosphoglycerate mutase (PGAM) and bisphosphoglycerate mutase (BPGM) are structurally related enzymes that catalyse reactions involving the transfer of phospho groups between the three carbon atoms of phosphoglycerate. Both enzymes can catalyse three different reactions with different specificities: the isomerization of 2-phosphoglycerate (2-PGA) to 3-phosphoglycerate (3-PGA) with 2,3-diphosphoglycerate (2,3-DPG) as the primer of the reaction, the synthesis of 2,3-DPG from 1,3-DPG with 3-PGA as a primer, and the degradation of 2,3-DPG to 3-PGA (phosphatase activity).
In mammals, PGAM is a dimeric protein with two isoforms, the M (muscle) and B (brain) forms. In yeast, PGAM is a tetrameric protein.
BPGM is a dimeric protein and is found mainly in erythrocytes where it plays a major role in regulating haemoglobin oxygen affinity as a consequence of controlling 2,3-DPG concentration. The catalytic mechanism of both PGAM and BPGM involves the formation of a phosphohistidine intermediate.
A number of other proteins contain this domain, including the bifunctional enzyme 6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase, which catalyses both the synthesis and the degradation of fructose-2,6-bisphosphate, and bacterial alpha-ribazole-5'-phosphate phosphatase, which is involved in cobalamin biosynthesis.
Thymidylate synthase catalyses the reductive methylation of dUMP by 5,10-methylenetetrahydrofolate:
5,10-methylenetetrahydrofolate + dUMP = dihydrofolate + dTMP
This provides the sole de novo pathway for production of dTMP, and thymidylate synthase is the only enzyme in folate metabolism in which the 5,10-methylenetetrahydrofolate is oxidised during one-carbon transfer. The enzyme is essential for regulating the balanced supply of the 4 DNA precursors in normal DNA replication: defects in the enzyme activity affecting the regulation process cause various biological and genetic abnormalities, such as thymineless death. The enzyme is an important target for certain chemotherapeutic drugs. Thymidylate synthase is an enzyme of about 30 to 35 kDa in most species except in protozoa and plants, where it exists as a bifunctional enzyme that includes a dihydrofolate reductase domain. A cysteine residue is involved in the catalytic mechanism (it covalently binds the 5,6-dihydro-dUMP intermediate). The sequence around the active site of this enzyme is conserved from phages to vertebrates.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
This entry represents the alpha and beta subunits found in the F1, V1, and A1 complexes of F-, V- and A-ATPases, respectively (sometimes called the A and B subunits in V- and A-ATPases). The F-ATPases (or F1F0-ATPases), V-ATPases (or V1V0-ATPases) and A-ATPases (or A1A0-ATPases) are composed of two linked complexes: the F1, V1 or A1 complex contains the catalytic core that synthesizes/hydrolyses ATP, and the F0, V0 or A0 complex forms the membrane-spanning pore. The F-, V- and A-ATPases all contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis.
In F-ATPases, there are three copies each of the alpha and beta subunits that form the catalytic core of the F1 complex, while the remaining F1 subunits (gamma, delta, epsilon) form part of the stalks. There is a substrate-binding site on each of the alpha and beta subunits, those on the beta subunits being catalytic, while those on the alpha subunits are regulatory. The alpha and beta subunits form a cylinder that is attached to the central stalk. The alpha/beta subunits undergo a sequence of conformational changes leading to the formation of ATP from ADP; these changes are induced by the rotation of the gamma subunit, which is itself driven by the movement of protons through the F0 complex C subunit.
In V- and A-ATPases, the alpha/A and beta/B subunits of the V1 or A1 complex are homologous to the alpha and beta subunits in the F1 complex of F-ATPases, except that the alpha subunit is catalytic and the beta subunit is regulatory.
The alpha/A and beta/B subunits can each be divided into three regions, or domains, centred around the ATP-binding pocket, and based on structure and function, where the central region is the nucleotide-binding domain. This entry represents the C-terminal domain of the alpha/A/beta/B subunits, which forms a left-handed superhelix composed of 4-5 individual helices. The C-terminal domain can vary between the alpha and beta subunits, and between different ATPases.
More information about this protein can be found at Protein of the Month: ATP Synthases.
The calponin homology domain (also known as CH-domain) is a superfamily of actin-binding domains found in both cytoskeletal proteins and signal transduction proteins. It comprises the following groups of actin-binding domains:
A comprehensive review of proteins containing this type of actin-binding domain has been published.
The CH domain is involved in actin binding in some members of the family. However, in calponins there is evidence that the CH domain is not involved in actin binding. Most proteins have two copies of the CH domain; however, some proteins, such as calponin and the human vav proto-oncogene, have only a single copy. The structure of an example CH domain has been solved.
A large group of biosynthetic enzymes are able to catalyse the removal of the ammonia group from glutamine and then to transfer this group to a substrate to form a new carbon-nitrogen group. This catalytic activity is known as glutamine amidotransferase (GATase). The GATase domain exists either as a separate polypeptidic subunit or as part of a larger polypeptide fused in different ways to a synthase domain. On the basis of sequence similarities two classes of GATase domains have been identified, class-I (also known as trpG-type) and class-II (also known as purF-type). Enzymes containing Class-II GATase domains include amido phosphoribosyltransferase (glutamine phosphoribosylpyrophosphate amidotransferase), which catalyses the first step in purine biosynthesis (gene purF in bacteria, ADE4 in yeast); glucosamine--fructose-6-phosphate aminotransferase, which catalyses the formation of glucosamine 6-phosphate from fructose 6-phosphate and glutamine (gene glmS in Escherichia coli, nodM in Rhizobium, GFA1 in yeast); and asparagine synthetase (glutamine-hydrolysing), which is responsible for the synthesis of asparagine from aspartate and glutamine. A cysteine is present at the N-terminal extremity of the mature form of all these enzymes.
This domain is found in a number of cysteine peptidases belonging to MEROPS peptidase family C44 and their non-peptidase homologs.
Phosphoenolpyruvate carboxylase (PEPCase), an enzyme found in all multicellular plants, catalyses the formation of oxaloacetate from phosphoenolpyruvate (PEP) and a hydrocarbonate ion. This reaction is harnessed by C4 plants to capture and concentrate carbon dioxide into the photosynthetic bundle sheath cells. It also plays a key role in the nitrogen fixation pathway in legume root nodules: here it functions in concert with glutamine, glutamate and asparagine synthetases and aspartate amido transferase, to synthesise aspartate and asparagine, the major nitrogen transport compounds in various amine-transporting plant species.
PEPCase also plays an anaplerotic role in bacteria and plant cells, supplying oxaloacetate to the TCA cycle, which requires continuous input of C4 molecules in order to replenish the intermediates removed for amino acid biosynthesis. The C-terminus of the enzyme contains the active site that includes a conserved lysine residue, involved in substrate binding, and other conserved residues important for the catalytic mechanism.
Ribosomal protein S15 is one of the proteins from the small ribosomal subunit. In Escherichia coli, this protein binds to 16S ribosomal RNA and functions at early steps in ribosome assembly. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial and plant chloroplast S15; archaeal Haloarcula marismortui HmaS15 (HS11); yeast mitochondrial S28; and mammalian, yeast, Brugia pahangi and Wuchereria bancrofti S13. S15 is a protein of 80 to 250 amino-acid residues.
Ribonucleotide reductase catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs divide into three classes on the basis of their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, bacteriophage and viruses, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in anaerobic bacteria and bacteriophage, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes.
Ribonucleotide reductase is an oligomeric enzyme composed of a large subunit (700 to 1000 residues) and a small subunit (300 to 400 residues) - class II RNRs are less complex, using the small molecule B12 in place of the small chain.
The reduction of ribonucleotides to deoxyribonucleotides involves the transfer of free radicals; the function of each metallocofactor is to generate an active site thiyl radical. This thiyl radical then initiates the nucleotide reduction process by hydrogen atom abstraction from the ribonucleotide. The radical-based reaction involves five cysteines: two of these are located at adjacent anti-parallel strands in a new type of ten-stranded alpha/beta-barrel; two others reside at the carboxyl end in a flexible arm; and the fifth, in a loop in the centre of the barrel, is positioned to initiate the radical reaction. There are several regions of similarity in the sequence of the large chain of prokaryotes, eukaryotes and viruses spread across 3 domains: an N-terminal domain common to the mammalian and bacterial enzymes; a C-terminal domain common to the mammalian and viral ribonucleotide reductases; and a central domain common to all three.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal S2 proteins have been shown to belong to a family that includes 40S ribosomal subunit 40kDa proteins, putative laminin-binding proteins, NAB-1 protein and 29.3kDa protein from Haloarcula marismortui. The laminin-receptor proteins are thus predicted to be the eukaryotic homologue of the eubacterial S2 ribosomal proteins.
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
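The relationship between triad order and clan can be illustrated with a minimal Python sketch. It sorts the three catalytic residues by sequence position and looks the resulting order up in a table built only from the three examples given above. The function name triad_clan and the residue positions in the usage line are illustrative assumptions, not MEROPS data or a MEROPS interface.

```python
# Linear order of the catalytic triad (N- to C-terminus) for the three
# clans named in the text. Other clans are deliberately not listed.
TRIAD_ORDER_TO_CLAN = {
    "HDS": "PA",  # chymotrypsin clan
    "DHS": "SB",  # subtilisin clan
    "SDH": "SC",  # carboxypeptidase clan
}


def triad_clan(positions):
    """Return the clan whose triad order matches the given residues.

    `positions` maps the one-letter codes 'S', 'D' and 'H' to the
    sequence position of each catalytic residue (hypothetical input).
    Returns None when the order is not one of the three table entries.
    """
    if set(positions) != {"S", "D", "H"}:
        raise ValueError("expected exactly the residues S, D and H")
    # Sort the residue letters by their position along the sequence.
    order = "".join(sorted(positions, key=positions.get))
    return TRIAD_ORDER_TO_CLAN.get(order)


# Illustrative positions only: histidine first, then aspartate, then serine.
print(triad_clan({"H": 57, "D": 102, "S": 195}))
```

A lookup of this kind only reflects the residue order; it says nothing about fold, so a match should be read as a hint of clan membership rather than an assignment.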
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This domain covers the active site serine of the serine peptidases belonging to MEROPS peptidase family S9 (prolyl oligopeptidase family, clan SC). The protein fold of the peptidase domain for members of this family resembles that of serine carboxypeptidase D, the type example of clan SC. Examples of protein families containing this domain are:
These proteins belong to MEROPS peptidase families S9A, S9B and S9C.
Ribosomal protein L30 is one of the proteins from the large ribosomal subunit. L30 belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacteria and archaea L30, yeast mitochondrial L33, and Drosophila melanogaster, Dictyostelium discoideum (Slime mold), fungal and mammalian L7 ribosomal proteins. L30 from bacteria are small proteins of about 60 residues; those from archaea are proteins of about 150 residues; and eukaryotic L7 are proteins of about 250 to 270 residues.
This entry represents the core domain of prokaryotic L30 and eukaryotic L7.
Acid phosphatases are a heterogeneous group of proteins that hydrolyse phosphate esters, optimally at low pH. It has been shown that a number of acid phosphatases, from both prokaryotes and eukaryotes, share two regions of sequence similarity, each centred around a conserved histidine residue. These two histidines seem to be involved in the enzymes' catalytic mechanism. The first histidine is located in the N-terminal section and forms a phosphohistidine intermediate while the second is located in the C-terminal section and possibly acts as proton donor. Enzymes belonging to this family are called 'histidine acid phosphatases' and include:
3-isopropylmalate dehydratase (or isopropylmalate isomerase) catalyses the stereo-specific isomerisation of 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate. This enzyme performs the second step in the biosynthesis of leucine, and is present in most prokaryotes and many fungal species. The prokaryotic enzyme is a heterodimer composed of a large (LeuC) and small (LeuD) subunit, while the fungal form is a monomeric enzyme. Both forms of isopropylmalate dehydratase are related and are part of the larger aconitase family. Aconitases are mostly monomeric proteins which share four domains in common and contain a single, labile [4Fe-4S] cluster. Three structural domains (1, 2 and 3) are tightly packed around the iron-sulphur cluster, while a fourth domain (4) forms a deep active-site cleft. The prokaryotic enzyme is encoded by two adjacent genes, leuC and leuD, corresponding to aconitase domains 1-3 and 4 respectively. LeuC does not bind an iron-sulphur cluster. It is thought that some prokaryotic isopropylmalate dehydrogenases can also function as homoaconitase, converting cis-homoaconitate to homoisocitric acid in lysine biosynthesis. Homoaconitase has been identified in higher fungi (mitochondria), several archaea and one thermophilic species of bacteria, Thermus thermophilus.
Aconitase (aconitate hydratase) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also 2 forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3). Below is a description of some of the multi-functional activities associated with different aconitases.
This entry represents a region containing 3 domains, each with a 3-layer alpha/beta/alpha topology. This region represents the [4Fe-4S] cluster-binding region found at the N-terminal of eukaryotic mAcn, cAcn/IRP1 and IRP2, and bacterial AcnA, but in the C-terminal of bacterial AcnB. This domain is also found in the large subunit of isopropylmalate dehydratase (LeuC).
More information about these proteins can be found at Protein of the Month: Aconitase.
Ribosomal protein S5 is one of the proteins from the small ribosomal subunit, and is a protein of 166 to 254 amino-acid residues. In Escherichia coli, S5 is known to be important in the assembly and function of the 30S ribosomal subunit. Mutations in S5 have been shown to increase translational error frequencies. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, cyanelle, red algal chloroplast, archaeal and fungal mitochondrial S5; mammalian, Caenorhabditis elegans, Drosophila and plant S2; and yeast S4 (SUP44).
This entry represents the N-terminal domain of ribosomal protein S5, which has an alpha-beta(3)-alpha structure that folds into two layers, alpha/beta.
Nucleoside diphosphate kinases (NDK) are enzymes required for the synthesis of nucleoside triphosphates (NTP) other than ATP. They provide NTPs for nucleic acid synthesis, CTP for lipid synthesis, UTP for polysaccharide synthesis and GTP for protein elongation, signal transduction and microtubule polymerization.
In eukaryotes, there seems to be a small family of NDK isozymes each of which acts in a different subcellular compartment and/or has a distinct biological function. Eukaryotic NDK isozymes are hexamers of two highly related chains (A and B). By random association (A6, A5B...AB5, B6), these two kinds of chain form isoenzymes differing in their isoelectric point.
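The series of isoenzymes produced by random association (A6, A5B...AB5, B6) can be enumerated with a short Python sketch. The helper names below are hypothetical, and the binomial fractions rest on an assumption not stated in the text: that each of the six positions is filled independently, with chain A chosen at a fixed fraction frac_a.

```python
from math import comb

HEXAMER_SIZE = 6  # NDK isozymes are described as hexamers


def hexamer_compositions(size=HEXAMER_SIZE):
    """List every A/B chain composition of the oligomer, from A6 to B6."""
    labels = []
    for n_b in range(size + 1):
        n_a = size - n_b
        # Omit the count when it is 1 and the letter when it is 0,
        # so that the labels read A6, A5B, ..., AB5, B6.
        part_a = f"A{n_a}" if n_a > 1 else ("A" if n_a == 1 else "")
        part_b = f"B{n_b}" if n_b > 1 else ("B" if n_b == 1 else "")
        labels.append(part_a + part_b)
    return labels


def composition_fractions(frac_a, size=HEXAMER_SIZE):
    """Binomial fraction of each composition under independent association.

    `frac_a` is the assumed fraction of A chains in the available pool.
    """
    frac_b = 1.0 - frac_a
    return [
        comb(size, n_b) * frac_a ** (size - n_b) * frac_b ** n_b
        for n_b in range(size + 1)
    ]


for label, fraction in zip(hexamer_compositions(), composition_fractions(0.5)):
    print(f"{label}: {fraction:.4f}")
```

Two chain types in a hexamer give seven possible compositions. Under the equal-pool assumption used in the loop above, the mixed A3B3 species would be the most abundant and the two homohexamers the least.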
NDK are proteins of 17 kDa that act via a ping-pong mechanism in which a histidine residue is phosphorylated by transfer of the terminal phosphate group from ATP. In the presence of magnesium, the phosphoenzyme can transfer its phosphate group to any NDP, to produce an NTP.
NDK isozymes have been sequenced from prokaryotic and eukaryotic sources. It has also been shown that the Drosophila awd (abnormal wing discs) protein is a microtubule-associated NDK. Mammalian NDK is also known as metastasis inhibition factor nm23. The sequence of NDK has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. Our signature pattern contains this residue.
Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins.
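The three binding categories amount to a dependency hierarchy, which the following Python sketch makes explicit. It is a simplified reading of the definitions above: a protein with no prerequisites is primary, one that depends only on primary binders is secondary, and anything else is tertiary. The function name binding_class, the protein labels (P1, Q1, R1 and so on) and the dependency map are all hypothetical and do not describe a real 30S assembly map.

```python
def binding_class(protein, requires):
    """Classify a protein as a primary, secondary or tertiary binder.

    `requires` maps each protein name to the set of proteins that must
    already be bound before it can assemble. The map is assumed to be
    acyclic; a cyclic map would recurse without end.
    """
    prerequisites = requires.get(protein, set())
    if not prerequisites:
        # Binds 16S rRNA directly and independently.
        return "primary"
    prerequisite_classes = {binding_class(p, requires) for p in prerequisites}
    if prerequisite_classes == {"primary"}:
        # Depends only on primary binding proteins.
        return "secondary"
    # Depends on at least one secondary (or tertiary) binding protein.
    return "tertiary"


# Hypothetical dependency map used only to exercise the function.
example_requires = {
    "P1": set(),
    "P2": set(),
    "Q1": {"P1"},
    "R1": {"Q1"},
    "R2": {"Q1", "R1"},
}

for name in example_requires:
    print(name, binding_class(name, example_requires))
```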
The small ribosomal subunit protein S10 consists of about 100 amino acid residues. In Escherichia coli, S10 is involved in binding tRNA to the ribosome, and also operates as a transcriptional elongation factor. Experimental evidence has revealed that S10 has virtually no groups exposed on the ribosomal surface, and is one of the "split proteins": these are a discrete group that are selectively removed from 30S subunits under low salt conditions and are required for the formation of activated 30S reconstitution intermediate (RI*) particles. S10 belongs to a family of proteins that includes: bacteria S10; algal chloroplast S10; cyanelle S10; archaebacterial S10; Marchantia polymorpha and Prototheca wickerhamii mitochondrial S10; Arabidopsis thaliana mitochondrial S10 (nuclear encoded); vertebrate S20; plant S20; and yeast URP2.
Phosphoglucose isomerase (PGI) is a dimeric enzyme that catalyses the reversible isomerization of glucose-6-phosphate and fructose-6-phosphate. PGI is involved in different pathways: in most higher organisms it is involved in glycolysis; in mammals it is involved in gluconeogenesis; in plants in carbohydrate biosynthesis; in some bacteria it provides a gateway for fructose into the Entner-Doudoroff pathway. The multifunctional protein, PGI, is also known as neuroleukin (a neurotrophic factor that mediates the differentiation of neurons), autocrine motility factor (a tumour-secreted cytokine that regulates cell motility), differentiation and maturation mediator and myofibril-bound serine proteinase inhibitor, and has different roles inside and outside the cell. In the cytoplasm, it catalyses the second step in glycolysis, while outside the cell it serves as a nerve growth factor and cytokine.
PGI from Bacillus stearothermophilus has an open twisted alpha/beta structural motif consisting of two globular domains and two protruding parts. It has been suggested that the top part of the large domain together with one of the protruding loops might participate in inducing the neurotrophic activity. The structure of rabbit muscle phosphoglucose isomerase complexed with various inhibitors shows that the enzyme is a dimer with two alpha/beta-sandwich domains in each subunit. The location of the bound D-gluconate 6-phosphate inhibitor leads to the identification of residues involved in substrate specificity. In addition, the positions of amino acid residues that are substituted in the genetic disease nonspherocytic hemolytic anemia suggest how these substitutions can result in altered catalysis or protein stability.
Secretion across the inner membrane in some Gram-negative bacteria occurs via the preprotein translocase pathway. Proteins are produced in the cytoplasm as precursors, and require a chaperone subunit to direct them to the translocase component. From there, the mature proteins are either targeted to the outer membrane, or remain as periplasmic proteins. The translocase protein subunits are encoded on the bacterial chromosome.
The translocase itself comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of the mature peptide into the periplasm (SecD and SecF). The chaperone protein SecB is a highly acidic homotetrameric protein that exists as a "dimer of dimers" in the bacterial cytoplasm. SecB maintains preproteins in an unfolded state after translation, and targets these to the peripheral membrane protein ATPase SecA for secretion. The structure of the Escherichia coli SecYEG assembly revealed a sandwich of two membranes interacting through the extensive cytoplasmic domains. Each membrane is composed of dimers of SecYEG. The monomeric complex contains 15 transmembrane helices.
The eubacterial secY protein interacts with the signal sequences of secretory proteins as well as with two other components of the protein translocation system: secA and secE. SecY is an integral plasma membrane protein of 419 to 492 amino acid residues that apparently contains 10 transmembrane (TM), 6 cytoplasmic and 5 periplasmic regions.
Cytoplasmic regions 2 and 3, and TM domains 1, 2, 4, 5, 7 and 10 are well conserved: the conserved cytoplasmic regions are believed to interact with cytoplasmic secretion factors, while the TM domains may participate in protein export. Homologs of secY are found in archaebacteria. SecY is also encoded in the chloroplast genome of some algae where it could be involved in a prokaryotic-like protein export system across the two membranes of the chloroplast endoplasmic reticulum (CER) which is present in chromophyte and cryptophyte algae.
L6 is a protein from the large (50S) subunit. In Escherichia coli, it is located in the aminoacyl-tRNA binding site of the peptidyltransferase centre, and is known to bind directly to 23S rRNA. It belongs to a family of ribosomal proteins, including L6 from bacteria, cyanelles (structures that perform similar functions to chloroplasts, but have structural and biochemical characteristics of Cyanobacteria) and mitochondria; and L9 from mammals, Drosophila, plants and yeast. L6 comprises 2 almost identical folds, suggesting that it was derived by the duplication of an ancient RNA-binding protein gene. Analysis reveals several sites on the protein surface where interactions with other ribosome components may occur, the N-terminus being involved in protein-protein interactions and the C-terminus containing possible RNA-binding sites.
Hexokinase is an important enzyme that catalyses the ATP-dependent conversion of aldo- and keto-hexose sugars to the hexose-6-phosphate (H6P). The enzyme can catalyse this reaction on glucose, fructose, sorbitol and glucosamine, and as such is the first step in a number of metabolic pathways. The addition of a phosphate group to the sugar acts to trap it in a cell, since the negatively charged phosphate cannot easily traverse the plasma membrane.
The enzyme is widely distributed in eukaryotes. There are three isozymes of hexokinase in yeast (PI, PII and glucokinase): isozymes PI and PII phosphorylate both aldo- and keto-sugars; glucokinase is specific for aldo-hexoses. All three isozymes contain two domains. Structural studies of yeast hexokinase reveal a well-defined catalytic pocket that binds ATP and hexose, allowing easy transfer of the phosphate from ATP to the sugar. Vertebrates contain four hexokinase isozymes, designated I to IV, where types I to III contain a duplication of the two-domain yeast-type hexokinases. Both the N- and C-terminal halves bind hexose and H6P, though in types I and III only the C-terminal half supports catalysis, while both halves support catalysis in type II. The N-terminal half is the regulatory region. Type IV hexokinase is similar to the yeast enzyme in containing only the two domains, and is sometimes incorrectly referred to as glucokinase.
The different vertebrate isozymes differ in their catalysis, localisation and regulation, thereby contributing to the different patterns of glucose metabolism in different tissues. Whereas types I to III can phosphorylate a variety of hexose sugars and are inhibited by glucose-6-phosphate (G6P), type IV is specific for glucose and shows no G6P inhibition. Type I enzyme may have a catabolic function, producing H6P for energy production in glycolysis; it is bound to the mitochondrial membrane, which enables the coordination of glycolysis with the TCA cycle. Types II and III enzyme may have anabolic functions, providing H6P for glycogen or lipid synthesis. Type IV enzyme is found in the liver and pancreatic beta-cells, where it is controlled by insulin (activation) and glucagon (inhibition). In pancreatic beta-cells, type IV enzyme acts as a glucose sensor to modify insulin secretion. Mutations in type IV hexokinase have been associated with diabetes mellitus.
Membrane transport between compartments in eukaryotic cells requires proteins that allow the budding and scission of nascent cargo vesicles from one compartment and their targeting and fusion with another. Dynamins are large GTPases that belong to a protein superfamily that, in eukaryotic cells, includes classical dynamins, dynamin-like proteins, OPA1, Mx proteins, mitofusins and guanylate-binding proteins/atlastins, and are involved in the scission of a wide range of vesicles and organelles. They play a role in many processes including budding of transport vesicles, division of organelles, cytokinesis and pathogen resistance.
The minimal distinguishing architectural features that are common to all dynamins and are distinct from other GTPases are the structure of the large GTPase domain (300 amino acids) and the presence of two additional domains: the middle domain and the GTPase effector domain (GED), which are involved in oligomerization and regulation of the GTPase activity.
This entry represents the GTPase domain, containing the GTP-binding motifs that are needed for guanine-nucleotide binding and hydrolysis. The conservation of these motifs is absolute except for the final motif in guanylate-binding proteins. The GTPase catalytic activity can be stimulated by oligomerisation of the protein, which is mediated by interactions between the GTPase domain, the middle domain and the GED.
The TATA-box binding protein (TBP) is required for the initiation of transcription by RNA polymerases I, II and III, from promoters with or without a TATA box. TBP associates with a host of factors, including the general transcription factors TFIIA, -B, -D, -E, and -H, to form huge multi-subunit pre-initiation complexes on the core promoter. Through its association with different transcription factors, TBP can initiate transcription from different RNA polymerases. There are several related TBPs, including TBP-like (TBPL) proteins.
The C-terminal core of TBP (~180 residues) is highly conserved and contains two 77-amino acid repeats that produce a saddle-shaped structure that straddles the DNA; this region binds to the TATA box and interacts with transcription factors and regulatory proteins. By contrast, the N-terminal region varies in both length and sequence.
Ubiquinol-cytochrome c reductase (bc1 complex or complex III) is an enzyme complex of bacterial and mitochondrial oxidative phosphorylation systems. It catalyses the oxidoreduction of the mobile redox components ubiquinol and cytochrome c, generating an electrochemical potential, which is linked to ATP synthesis. The complex consists of three subunits in most bacteria, and nine in mitochondria: both bacterial and mitochondrial complexes contain cytochrome b and cytochrome c1 subunits, and an iron-sulphur 'Rieske' subunit, which contains a high potential 2Fe-2S cluster. The mitochondrial form also includes six other subunits that do not possess redox centres. Plastoquinone-plastocyanin reductase (b6f complex), present in cyanobacteria and the chloroplasts of plants, catalyses the oxidoreduction of plastoquinol and cytochrome f. This complex, which is functionally similar to ubiquinol-cytochrome c reductase, comprises cytochrome b6, cytochrome f and Rieske subunits.
The Rieske subunit acts by binding either a ubiquinol or plastoquinol anion, transferring an electron to the 2Fe-2S cluster, then releasing the electron to the cytochrome c or cytochrome f haem iron. The Rieske domain has a [2Fe-2S] centre. Two conserved cysteines coordinate one Fe ion, while the other Fe ion is coordinated by two conserved histidines. The 2Fe-2S cluster is bound in the highly conserved C-terminal region of the Rieske subunit.
PFK is ~300 amino acids in length, and structural studies of the bacterial enzyme have shown it comprises two similar (alpha/beta) lobes: one involved in ATP binding and the other housing both the substrate-binding site and the allosteric site (a regulatory binding site distinct from the active site, but that affects enzyme activity). The identical tetramer subunits adopt 2 different conformations: in a 'closed' state, the bound magnesium ion bridges the phosphoryl groups of the enzyme products (ADP and fructose-1,6-bisphosphate); and in an 'open' state, the magnesium ion binds only the ADP, as the 2 products are now further apart. These conformations are thought to be successive stages of a reaction pathway that requires subunit closure to bring the 2 molecules sufficiently close to react.
Deficiency in PFK leads to glycogenosis type VII (Tarui's disease), an autosomal recessive disorder characterised by severe nausea, vomiting, muscle cramps and myoglobinuria in response to bursts of intense or vigorous exercise. Sufferers are usually able to lead a reasonably ordinary life by learning to adjust activity levels.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
The ribosomal proteins catalyse ribosome assembly and stabilise the rRNA, tuning the structure of the ribosome for optimal function. Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S17 is known to bind specifically to the 5' end of 16S ribosomal RNA in Escherichia coli (primary rRNA binding protein), and is thought to be involved in the recognition of termination codons. Experimental evidence has revealed that S17 has virtually no groups exposed on the ribosomal surface.
This entry represents the N-terminal domain of these proteins. It adopts a ribonuclease H-like fold and is structurally related to the C-terminal domain.
The crotonase superfamily is comprised of mechanistically diverse proteins that share a conserved trimeric quaternary structure (sometimes a hexamer consisting of a dimer of trimers), the core of which consists of 4 turns of a (beta/beta/alpha)n superhelix. Some enzymes in the superfamily have been shown to display dehalogenase, hydratase, and isomerase activities, while others have been implicated in carbon-carbon bond formation and cleavage as well as the hydrolysis of thioesters. However, these different enzymes share the need to stabilise an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two structurally conserved peptidic NH groups that provide hydrogen bonds to the carbonyl moieties of the acyl-CoA substrates and form an "oxyanion hole". The CoA thioester derivatives bind in a characteristic hooked shape and a conserved tunnel binds the pantetheine group of CoA, which links the 3'-phosphate ADP binding site to the site of reaction. Enzymes in the crotonase superfamily include:
This entry represents the core domain found in crotonase superfamily members.
Ribosomal protein S9 is one of the proteins from the small ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, algal chloroplast, cyanelle and archaeal S9 proteins; and mammalian, plant and yeast mitochondrial ribosomal S9 proteins.
Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles, and regulate cyclin dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins, G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E.
Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi's sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus.
In eukaryotes, transcription initiation of all protein-encoding genes involves the polymerase II system. This system is modulated by both general and specific transcription factors. The general factors (which include TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIG and TFIIH) operate through common promoter elements, such as the TATA box. Transcription factor IIB (TFIIB) is of central importance in transcription of class II genes. It associates with TFIID-TFIIA bound to DNA (the DA complex) to form a ternary TFIID-IIA-IIB (DAB) complex, which is recognized by RNA polymerase II. TFIIB comprises ~315-340 residues and contains an imperfect C-terminal repeat of a 75-residue domain that may contribute to the symmetry of the folded protein. The basal archaeal transcription machinery resembles that of the eukaryotic polymerase II system and includes a homologue of TFIIB.
This entry represents a cyclin-like domain which is found repeated in the C-terminal region of a variety of eukaryotic TFIIBs and their archaeal counterparts. These domains individually form the typical cyclin fold, and in the transcription complex they straddle the C-terminal region of the TATA-binding protein - an interaction essential for the formation of the transcription initiation complex.
Cytidine deaminase (cytidine aminohydrolase) catalyzes the hydrolysis of cytidine into uridine and ammonia while deoxycytidylate deaminase (dCMP deaminase) hydrolyzes dCMP into dUMP. Both enzymes are known to bind zinc and to require it for their catalytic activity. These two enzymes do not share any sequence similarity with the exception of a region that contains three conserved histidine and cysteine residues which are thought to be involved in the binding of the catalytic zinc ion.
Such a region is also found in other proteins:
Phosphatidylinositol-specific phospholipase C, a eukaryotic intracellular enzyme, plays an important role in signal transduction processes. It catalyzes the hydrolysis of 1-phosphatidyl-D-myo-inositol-4,5-bisphosphate into the second messenger molecules diacylglycerol and inositol-1,4,5-trisphosphate. This catalytic process is tightly regulated by reversible phosphorylation and binding of regulatory proteins.
In mammals, there are at least 6 different isoforms of PI-PLC, which differ in their domain structure, their regulation, and their tissue distribution. Lower eukaryotes also possess multiple isoforms of PI-PLC.
All eukaryotic PI-PLCs contain two regions of homology, sometimes referred to as 'X-box' and 'Y-box'. The order of these two regions is always the same (NH2-X-Y-COOH), but the spacing is variable. In most isoforms, the distance between these two regions is only 50-100 residues but in the gamma isoforms one PH domain, two SH2 domains, and one SH3 domain are inserted between the two PLC-specific domains. The two conserved regions have been shown to be important for the catalytic activity. At the C-terminal of the Y-box, there is a C2 domain possibly involved in Ca-dependent membrane attachment.
6-Phosphogluconate dehydrogenase (6PGD) is an oxidative decarboxylase that catalyses the oxidative decarboxylation of 6-phosphogluconate into ribulose 5-phosphate in the presence of NADP. This reaction is a component of the hexose mono-phosphate shunt and pentose phosphate pathways (PPP). Prokaryotic and eukaryotic 6PGD are proteins of about 470 amino acids whose sequences are highly conserved. The protein is a homodimer in which the monomers act independently: each contains a large, mainly alpha-helical domain and a smaller beta-alpha-beta domain, containing a mixed parallel and anti-parallel 6-stranded beta sheet. NADP is bound in a cleft in the small domain, the substrate binding in an adjacent pocket.
This entry represents the C-terminal all-alpha domain of 6-phosphogluconate dehydrogenase. The domain contains two structural repeats of 5 helices each. The NAD-binding domain is described in
Synonym(s): Rsp5 or WWP domain
The WW domain is a short conserved region in a number of unrelated proteins, which folds as a stable, triple stranded beta-sheet. This short domain of approximately 40 amino acids may be repeated up to four times in some proteins. The name WW or WWP derives from the presence of two signature tryptophan residues that are spaced 20-23 amino acids apart and are present in most WW domains known to date, as well as that of a conserved Pro. The WW domain binds to proteins with particular proline-motifs, [AP]-P-P-[AP]-Y, and/or phosphoserine- or phosphothreonine-containing motifs. It is frequently associated with other domains typical for proteins in signal transduction processes.
A large variety of proteins containing the WW domain are known. These include: dystrophin, a multidomain cytoskeletal protein; utrophin, a dystrophin-like protein of unknown function; vertebrate YAP protein, substrate of an unknown serine kinase; Mus musculus (Mouse) NEDD-4, involved in the embryonic development and differentiation of the central nervous system; Saccharomyces cerevisiae (Baker's yeast) RSP5, similar to NEDD-4 in its molecular organization; Rattus norvegicus (Rat) FE65, a transcription-factor activator expressed preferentially in liver; Nicotiana tabacum (Common tobacco) DB10 protein and others.
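The proline-rich ligand motif and the tryptophan spacing described above for the WW domain can be written as simple regular expressions. The following is a minimal sketch, not a validated domain detector: the function names and the test sequences are made up for illustration, and "spaced 20-23 amino acids apart" is read here as 20-23 intervening residues, which is an assumption.

```python
import re

# Ligand motif quoted in the text: [AP]-P-P-[AP]-Y (one-letter amino acid codes).
LIGAND_MOTIF = re.compile(r"[AP]PP[AP]Y")

# Two signature tryptophans separated by 20-23 residues.
# Assumption: the spacing counts the residues between the two W positions.
WW_SPACING = re.compile(r"W.{20,23}W")


def find_ligand_motifs(sequence):
    """Return (start_index, matched_text) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in LIGAND_MOTIF.finditer(sequence)]


def has_ww_spacing(sequence):
    """Return True if two tryptophans with the assumed spacing are present."""
    return WW_SPACING.search(sequence) is not None


if __name__ == "__main__":
    # Synthetic sequences, not taken from any real protein.
    print(find_ligand_motifs("GSAPPPYTT"))
    print(has_ww_spacing("W" + "A" * 21 + "W"))
```

A pattern match of this kind only flags candidates; real WW domain annotation relies on profile-based methods rather than a single regular expression.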
This family of proteins includes rRNA adenine dimethylases (e.g. KsgA) and the Erythromycin resistance methylases (Erm).
The bacterial enzyme KsgA catalyses the transfer of a total of four methyl groups from S-adenosyl-L-methionine (S-AdoMet) to two adjacent adenosine bases in 16S rRNA. This enzyme and the resulting modified adenosine bases appear to be conserved in all species of eubacteria, eukaryotes, and archaea, and in eukaryotic organelles. Bacterial resistance to the aminoglycoside antibiotic kasugamycin involves inactivation of KsgA and resulting loss of the dimethylations, with modest consequences to the overall fitness of the organism. In contrast, the yeast ortholog, Dim1, is essential. In Saccharomyces cerevisiae (Baker's yeast), and presumably in other eukaryotes, the enzyme performs a vital role in pre-rRNA processing in addition to its methylating activity. The best conserved region in these enzymes is located in the N-terminal section and corresponds to a region that is probably involved in S-adenosyl methionine (SAM) binding.
The crystal structure of KsgA from Escherichia coli has been solved to a resolution of 2.1 Å. It bears a strong similarity to the crystal structure of ErmC' from Bacillus stearothermophilus and a lesser similarity to the yeast mitochondrial transcription factor, sc-mtTFB.
The Erm family of RNA methyltransferases, which methylate a single adenosine base in 23S rRNA, confer resistance to the MLS-B group of antibiotics. Despite their sequence similarity, the two enzyme families have strikingly different levels of regulation that remain to be elucidated. Other orthologs of this family include the yeast and Homo sapiens (Human) mitochondrial transcription factors (MTF1 and h-mtTFB respectively), which are nuclear encoded. Human-mtTFB is able to stimulate transcription in vitro independently of its S-adenosylmethionine binding and rRNA methyltransferase activity.
WD-40 repeats (also known as WD or beta-transducin repeats) are short ~40 amino acid motifs, often terminating in a Trp-Asp (W-D) dipeptide. WD40 repeats usually assume a 7-8 bladed beta-propeller fold, but proteins have been found with 4 to 16 repeated units, which also form a circularised beta-propeller structure. WD-repeat proteins are a large family found in all eukaryotes and are implicated in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control and apoptosis. Repeated WD40 motifs act as a site for protein-protein interaction, and proteins containing WD40 repeats are known to serve as platforms for the assembly of protein complexes or mediators of transient interplay among other proteins. The specificity of the proteins is determined by the sequences outside the repeats themselves. Examples of such complexes are G proteins (beta subunit is a beta-propeller), TAFII transcription factor, and E3 ubiquitin ligase. In Arabidopsis spp., several WD40-containing proteins act as key regulators of plant-specific developmental events.
AMP + MgATP = ADP + MgADP, an essential reaction for many processes in living cells. Two ADK isozymes have been identified in mammalian cells. These specifically bind AMP and favour binding to ATP over other nucleotide triphosphates (AK1 is cytosolic and AK2 is located in the mitochondria). A third ADK has been identified in bovine heart and human cells; this is a mitochondrial GTP:AMP phosphotransferase, also specific for the phosphorylation of AMP, but can only use GTP or ITP as a substrate. ADK has also been identified in different bacterial species and in yeast. Two further enzymes are known to be related to the ADK family, i.e. yeast uridine monophosphokinase and slime mold UMP-CMP kinase. Within the ADK family there are several conserved regions, including the ATP-binding domains. One of the most conserved areas includes an Arg residue, whose modification inactivates the enzyme, together with an Asp that resides in the catalytic cleft of the enzyme and participates in a salt bridge.
The alpha-D-phosphohexomutase superfamily is composed of four related enzymes, each of which catalyses a phosphoryl transfer on their sugar substrates: phosphoglucomutase (PGM), phosphoglucomutase/phosphomannomutase (PGM/PMM), phosphoglucosamine mutase (PNGM), and phosphoacetylglucosamine mutase (PAGM). PGM converts D-glucose 1-phosphate into D-glucose 6-phosphate, and participates in both the breakdown and synthesis of glucose. PGM/PMM are primarily bacterial enzymes that use either glucose or mannose as substrate, participating in the biosynthesis of a variety of carbohydrates such as lipopolysaccharides and alginate. Both PNGM and PAGM are involved in the biosynthesis of UDP-N-acetylglucosamine.
Despite differences in substrate specificity, these enzymes share a similar catalytic mechanism, converting 1-phospho-sugars to 6-phospho-sugars via a bisphosphorylated 1,6-phospho-sugar. The active enzyme is phosphorylated at a conserved serine residue and binds one magnesium ion; residues around the active site serine are well conserved among family members. The reaction mechanism involves phosphoryl transfer from the phosphoserine to the substrate to create a bisphosphorylated sugar, followed by a phosphoryl transfer from the substrate back to the enzyme.
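The two phosphoryl transfers described above can be laid out as a two-step scheme. This is only a schematic restatement of the paragraph: the labels E-Ser-P (phosphoserine form of the enzyme) and E-Ser (dephosphorylated form) are shorthand introduced here, and the magnesium ion is omitted.

```latex
\begin{align*}
\text{E-Ser-P} + \text{sugar 1-phosphate}
  &\rightleftharpoons \text{E-Ser} + \text{sugar 1,6-bisphosphate} \\
\text{E-Ser} + \text{sugar 1,6-bisphosphate}
  &\rightleftharpoons \text{E-Ser-P} + \text{sugar 6-phosphate}
\end{align*}
```

Summing the two steps gives the net conversion of the 1-phospho-sugar to the 6-phospho-sugar, with the enzyme returned to its phosphorylated, active form.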
The structures of PGM and PGM/PMM have been determined, and were found to be very similar in topology. These enzymes are both composed of four domains and a large central active site cleft, where each domain contains residues essential for catalysis and/or substrate recognition. Domain I contains the catalytic phosphoserine, domain II contains a metal-binding loop to coordinate the magnesium ion, domain III contains the sugar-binding loop that recognises the two different binding orientations of the 1- and 6-phospho-sugars, and domain IV contains a phosphate-binding site required for orienting the incoming phospho-sugar substrate.
This entry represents the C-terminal domain of alpha-D-phosphohexomutase enzymes.
Ribosomal protein S8 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S8 is known to bind directly to 16S ribosomal RNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups eubacterial, algal and plant chloroplast, cyanelle, archaebacterial and Marchantia polymorpha mitochondrial S8; mammalian and plant S15A; and yeast S22 (S24) ribosomal proteins.
Ribosomal protein S11 plays an essential role in selecting the correct tRNA in protein biosynthesis. It is located on the large lobe of the small ribosomal subunit. On the basis of sequence similarities, S11 belongs to a family of bacterial, archaeal and eukaryotic ribosomal proteins.
The regulator of chromosome condensation (RCC1) is a eukaryotic protein which binds to chromatin and interacts with ran, a nuclear GTP-binding protein, to promote the loss of bound GDP and the uptake of fresh GTP, thus acting as a guanine-nucleotide dissociation stimulator (GDS). The interaction of RCC1 with ran probably plays an important role in the regulation of gene expression.
RCC1, known as PRP20 or SRM1 in yeast, pim1 in fission yeast and BJ1 in Drosophila, is a protein that contains seven tandem repeats of a domain of about 50 to 60 amino acids. As shown in the following schematic representation, the repeats make up the major part of the length of the protein. Outside the repeat region, there is just a small N-terminal domain of about 40 to 50 residues and, in the Drosophila protein only, a C-terminal domain of about 130 residues.
The RCC1-type of repeat is also found in the X-linked retinitis pigmentosa GTPase regulator. The RCC repeats form a beta-propeller structure.
Ribosomal protein S13 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S13 is known to be involved in binding fMet-tRNA and, hence, in the initiation of translation. It is a basic protein of 115 to 177 amino-acid residues. This family of ribosomal proteins is present in prokaryotes and eukaryotes.
chorismate + L-glutamine = anthranilate + pyruvate + L-glutamate. The enzyme is a tetramer comprising two component I and two component II subunits: this entry is restricted to component I, which catalyses the formation of anthranilate using ammonia rather than glutamine, while component II provides glutamine amidotransferase activity.
The 60S acidic ribosomal protein plays an important role in the elongation step of protein synthesis. This family includes archaebacterial L12, eukaryotic P0, P1 and P2.
Some of the proteins in this family are allergens. Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation.
The allergens in this family include allergens with the following designations: Alt a 6, Alt a 12, Cla h 3, Cla h 4 and Cla h 12.
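The naming rule described above (first three letters of the genus, a space, the first letter of the species name, a space, an arabic number) is mechanical enough to sketch in code. The function below is a hypothetical helper written for illustration only; it does not implement the extra-letter disambiguation mentioned in the text, and the genus and species names in the usage lines are assumed source organisms for the "Alt a" and "Cla h" prefixes rather than names given in this entry.

```python
def allergen_designation(genus, species, number):
    """Build a designation such as 'Alt a 6' from genus, species and number.

    Follows the basic rule only: three genus letters, one species letter,
    then the number. Collisions between species are NOT disambiguated here.
    """
    if len(genus) < 3:
        raise ValueError("genus must have at least three letters")
    if not species:
        raise ValueError("species must not be empty")
    if int(number) < 1:
        raise ValueError("number must be a positive integer")
    prefix = genus[:3].capitalize()
    return f"{prefix} {species[0].lower()} {int(number)}"


if __name__ == "__main__":
    # Organism names below are illustrative assumptions.
    print(allergen_designation("Alternaria", "alternata", 6))
    print(allergen_designation("Cladosporium", "herbarum", 12))
```

Keeping the rule in one small function makes it easy to see where a collision-handling step would have to be added if two species produced the same prefix.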
The beta subunit of the farnesyltransferases is responsible for peptide binding. Squalene-hopene cyclase is a bacterial enzyme that catalyzes the cyclization of squalene into hopene, a key step in hopanoid (triterpenoid) metabolism. Lanosterol synthase (oxidosqualene-lanosterol cyclase) catalyzes the cyclization of (S)-2,3-epoxysqualene to lanosterol, the initial precursor of cholesterol, steroid hormones and vitamin D in vertebrates and of ergosterol in fungi. Cycloartenol synthase (2,3-epoxysqualene-cycloartenol cyclase) is a plant enzyme that catalyzes the cyclization of (S)-2,3-epoxysqualene to cycloartenol.
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.
This domain is found in a large variety of protein kinases with different functions and dependencies. Protein kinase C, for example, is a calcium-activated, phospholipid-dependent serine- and threonine-specific enzyme. It is activated by diacylglycerol and, in turn, phosphorylates a range of cellular proteins. This domain is most often found associated with
S-adenosylmethionine synthetase (MAT) is the enzyme that catalyzes the formation of S-adenosylmethionine (AdoMet) from methionine and ATP. AdoMet is an important methyl donor for transmethylation and is also the propylamino donor in polyamine biosynthesis.
Bacteria have a single isoform of AdoMet synthetase (gene metK); budding yeast (genes SAM1 and SAM2) and mammals each have two, while plants generally have a multigene family.
The sequence of AdoMet synthetase is highly conserved throughout isozymes and species. The active sites of both the Escherichia coli and rat liver MAT reside between two subunits, with contributions from side chains of residues from both subunits, resulting in a dimer as the minimal catalytic entity. The side chains that contribute to the ligand binding sites are conserved between the two proteins. In the structures of complexes with the E. coli enzyme, the phosphate groups have the same positions in the (PPi plus Pi) complex and the (ADP plus Pi) complex, and are located at the bottom of a deep cavity with the adenosyl group nearer the entrance.
The precise function of the domain is unclear, but it may be involved in protein-protein interactions and may play a role in assembly or activity of multi-component complexes involved in transcriptional activation.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.
This group of cysteine peptidases belongs to the MEROPS peptidase family C19 (ubiquitin-specific protease family, clan CA). Families within the CA clan are loosely termed papain-like, as the protein fold of the peptidase unit resembles that of papain, the type example for clan CA. Predicted active site residues for members of this family and family C1 occur in the same order in the sequence: N/Q, C, H. The type example is human ubiquitin-specific protease 14.
Ubiquitin is highly conserved, commonly found conjugated to proteins in eukaryotic cells, where it may act as a marker for rapid degradation, or it may have a chaperone function in protein assembly. The ubiquitin is released by cleavage from the bound protein by a protease. A number of deubiquitinising proteases are known: all are activated by thiol compounds, and inhibited by thiol-blocking agents and ubiquitin aldehyde, and as such have the properties of cysteine proteases.
The deubiquitinising proteases can be split into 2 size ranges (20-30 kDa and 100-200 kDa): this family comprises the 100-200 kDa group, which includes the Ubp1 ubiquitin peptidase from yeast. Only one conserved cysteine can be identified, along with two conserved histidines. The spacing between the cysteine and the second histidine is thought to be more representative of the cysteine/histidine spacing of a cysteine protease catalytic dyad.
The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.
This entry represents the GTPase domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7S RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich, lower in complexity and poorly conserved between species. The GTPase domain is evolutionarily related to P-loop NTPase domains found in a variety of other proteins.
These proteins include Escherichia coli and Bacillus subtilis ffh protein (P48), which seems to be the prokaryotic counterpart of SRP54; signal recognition particle receptor alpha subunit (docking protein), an integral membrane GTP-binding protein which ensures, in conjunction with SRP, the correct targeting of nascent secretory proteins to the endoplasmic reticulum membrane; bacterial FtsY protein, which is believed to play a similar role to that of the docking protein in eukaryotes; the pilA protein from Neisseria gonorrhoeae, the homolog of ftsY; and bacterial flagellar biosynthesis protein flhF.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
L20 is a protein from the large (50S) subunit; in Escherichia coli it is known to bind directly to the 23S rRNA, and is required for ribosome assembly, but does not take part in protein synthesis. It belongs to a family of ribosomal proteins, including L20 from eubacteria, plant and alga chloroplasts and cyanelles.
Phosphatidylinositol 3-kinase (PI3-kinase) is an enzyme that phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol ring. The three products of PI3-kinase, PI-3-P, PI-3,4-P(2) and PI-3,4,5-P(3), function as secondary messengers in cell signalling. Phosphatidylinositol 4-kinase (PI4-kinase) is an enzyme that acts on phosphatidylinositol (PI) in the first committed step in the production of the secondary messenger inositol-1,4,5-trisphosphate. This domain is also present in a wide range of protein kinases involved in diverse cellular functions, such as control of cell growth, regulation of cell cycle progression, a DNA damage checkpoint, recombination, and maintenance of telomere length. Despite significant homology to lipid kinases, no lipid kinase activity has been demonstrated for any of the PIK-related kinases.
The PI3- and PI4-kinases share a well conserved domain at their C-terminal section; this domain seems to be distantly related to the catalytic domain of protein kinases. The catalytic domain of PI3K has the typical bilobal structure that is seen in other ATP-dependent kinases, with a small N-terminal lobe and a large C-terminal lobe. The core of this domain is the most conserved region of the PI3Ks. The ATP cofactor binds in the crevice formed by the N- and C-terminal lobes, a loop between two strands provides a hydrophobic pocket for binding of the adenine moiety, and a lysine residue interacts with the alpha-phosphate. In contrast to protein kinases, the PI3K loop that interacts with the phosphates of the ATP, known as the glycine-rich or P-loop, contains no glycine residues. Instead, contact with the ATP phosphate is maintained through the side chain of a conserved serine residue.
Transketolase (TK) catalyzes the reversible transfer of a two-carbon ketol unit from xylulose 5-phosphate to an aldose acceptor, such as ribose 5-phosphate, to form sedoheptulose 7-phosphate and glyceraldehyde 3-phosphate. This enzyme, together with transaldolase, provides a link between the glycolytic and pentose-phosphate pathways. TK requires thiamine pyrophosphate as a cofactor. In most sources where TK has been purified, it is a homodimer of approximately 70 kDa subunits. TK sequences from a variety of eukaryotic and prokaryotic sources show that the enzyme has been evolutionarily conserved. In the peroxisomes of the methylotrophic yeast Pichia angusta (Hansenula polymorpha), there is a highly related enzyme, dihydroxy-acetone synthase (DHAS) (also known as formaldehyde transketolase), which exhibits a very unusual specificity by including formaldehyde amongst its substrates.
1-deoxyxylulose-5-phosphate synthase (DXP synthase) is an enzyme so far found in bacteria (gene dxs) and plants (gene CLA1) which catalyzes the thiamine pyrophosphate-dependent acyloin condensation reaction between carbon atoms 2 and 3 of pyruvate and glyceraldehyde 3-phosphate to yield 1-deoxy-D-xylulose-5-phosphate (dxp), a precursor in the biosynthetic pathway to isoprenoids, thiamine (vitamin B1), and pyridoxol (vitamin B6). DXP synthase is evolutionarily related to TK. The N-terminal section contains a histidine residue which appears to function in proton transfer during catalysis. In the central section there are conserved acidic residues that are part of the active cleft and may participate in substrate-binding. This family includes transketolase enzymes and also partially matches the 2-oxoisovalerate dehydrogenase beta subunit. Both these enzymes utilise thiamine pyrophosphate as a cofactor, suggesting there may be common aspects in their mechanism of catalysis.
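The two-carbon transfer can be checked with simple carbon bookkeeping. The sketch below is only an illustration of the stoichiometry stated above; the function name `transfer_ketol_unit` and the dictionary `CARBONS` are hypothetical, and the carbon counts follow from the sugar names (pentose = 5, heptose = 7, triose = 3).

```python
# Carbon counts of the four sugar phosphates named in the reaction.
CARBONS = {
    "xylulose 5-phosphate": 5,       # ketose donor
    "ribose 5-phosphate": 5,         # aldose acceptor
    "sedoheptulose 7-phosphate": 7,  # product formed from the acceptor
    "glyceraldehyde 3-phosphate": 3, # product left over from the donor
}


def transfer_ketol_unit(donor_carbons: int, acceptor_carbons: int, unit: int = 2):
    """Move a ketol unit of `unit` carbons from donor to acceptor.

    Returns (carbons left on the donor, carbons on the extended acceptor).
    """
    return donor_carbons - unit, acceptor_carbons + unit


donor_left, acceptor_extended = transfer_ketol_unit(
    CARBONS["xylulose 5-phosphate"], CARBONS["ribose 5-phosphate"]
)
# The donor remnant matches glyceraldehyde 3-phosphate and the extended
# acceptor matches sedoheptulose 7-phosphate, so carbon is conserved (5 + 5 = 3 + 7).
print(donor_left, acceptor_extended)
```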
Glutaredoxins, also known as thioltransferases (disulphide reductases), are small proteins of approximately one hundred amino-acid residues which utilise glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system.
Glutaredoxin functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin, which functions in a similar way, glutaredoxin possesses an active centre disulphide bond. It exists in either a reduced form or an oxidized form, in which the two cysteine residues are linked in an intramolecular disulphide bond.
Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed that Vaccinia virus protein O2L is most probably a glutaredoxin. Finally, it must be noted that Bacteriophage T4 thioredoxin also seems to be evolutionarily related. In position 5 of the pattern T4 thioredoxin has Val instead of Pro.
This entry represents Glutaredoxin.
Serine hydroxymethyltransferase (SHMT) is a pyridoxal phosphate (PLP) dependent enzyme and belongs to the aspartate aminotransferase superfamily (fold type I). The pyridoxal-P group is attached to a lysine residue around which the sequence is highly conserved in all forms of the enzyme. The enzyme carries out interconversion of serine and glycine using PLP as the cofactor. SHMT catalyses the transfer of a hydroxymethyl group from N5,N10-methylenetetrahydrofolate to glycine, resulting in the formation of serine and tetrahydrofolate. Both eukaryotic and prokaryotic SHMT enzymes form tight obligate homodimers and the mammalian enzyme forms a homotetramer. PLP dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalysed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into five major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), D-amino acid superfamily (fold type IV) and glycogen phosphorylase family (fold type V).
In vertebrates, glycine hydroxymethyltransferase exists in a cytoplasmic and a mitochondrial form whereas only one form is found in prokaryotes.
On the basis of sequence similarities the following prokaryotic and eukaryotic ribosomal proteins can be grouped:
The KOW (Kyprides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and the bacterial transcription antitermination proteins NusG.
DNA-directed DNA polymerases are the key enzymes catalysing the accurate replication of DNA. They require either a small RNA molecule or a protein as a primer for the de novo synthesis of a DNA chain. A number of polymerases belong to this family.
This entry contains two related enzymes, IMP dehydrogenase and GMP reductase. These enzymes adopt a TIM barrel structure.
IMP dehydrogenase (IMPDH) catalyzes the rate-limiting reaction of de novo GTP biosynthesis, the NAD-dependent oxidation of IMP to XMP.
Inosine 5'-phosphate + NAD+ + H2O = xanthosine 5'-phosphate + NADH
IMP dehydrogenase is associated with cell proliferation and is a possible target for cancer chemotherapy. Mammalian and bacterial IMPDHs are tetramers of identical chains. There are two IMP dehydrogenase isozymes in humans. IMP dehydrogenase nearly always contains a long insertion that has two CBS domains within it.
GMP reductase catalyzes the irreversible and NADPH-dependent reductive deamination of GMP into IMP.
NADPH + guanosine 5'-phosphate = NADP+ + inosine 5'-phosphate + NH3
It converts nucleobase, nucleoside and nucleotide derivatives of G to A nucleotides, and maintains intracellular balance of A and G nucleotides.
Glucose-6-phosphate dehydrogenase (G6PDH) is a ubiquitous protein, present in bacteria and all eukaryotic cell types. The enzyme catalyses the first step in the pentose pathway, i.e. the conversion of glucose-6-phosphate to gluconolactone 6-phosphate in the presence of NADP, producing NADPH. The ubiquitous expression of the enzyme gives it a major role in the production of NADPH for the many NADPH-mediated reductive processes in all cells. Deficiency of G6PDH is a common genetic abnormality affecting millions of people worldwide. Many sequence variants, most caused by single point mutations, are known, exhibiting a wide variety of phenotypes.
This domain is found in protein phosphatase 2C, as well as other proteins, e.g. [pyruvate dehydrogenase (lipoamide)]-phosphatase and adenylate cyclase.
Protein phosphatase 2C (PP2C) is one of the four major classes of mammalian serine/threonine specific protein phosphatases. PP2C is a monomeric enzyme of about 42 kDa which shows broad substrate specificity and is dependent on divalent cations (mainly manganese and magnesium) for its activity. Its exact physiological role is still unclear. Three isozymes are currently known in mammals: PP2C-alpha, -beta and -gamma. In yeast, there are at least four PP2C homologs: phosphatase PTC1, which has weak tyrosine phosphatase activity in addition to its activity on serines, phosphatases PTC2 and PTC3, and hypothetical protein YBR125c. Isozymes of PP2C are also known from Arabidopsis thaliana (ABI1, PPH1), Caenorhabditis elegans (FEM-2, F42G9.1, T23F11.1), Leishmania chagasi and Paramecium tetraurelia. In A. thaliana, the kinase associated protein phosphatase (KAPP) is an enzyme that dephosphorylates the Ser/Thr receptor-like kinase RLK5 and which contains a C-terminal PP2C domain.
PP2C does not seem to be evolutionarily related to the main family of serine/threonine phosphatases: PP1, PP2A and PP2B. However, it is significantly similar to the catalytic subunit of pyruvate dehydrogenase phosphatase (PDPC), which catalyzes dephosphorylation and concomitant reactivation of the alpha subunit of the E1 component of the pyruvate dehydrogenase complex. PDPC is a mitochondrial enzyme and, like PP2C, is magnesium-dependent.
Nucleotidyl transferases transfer nucleotides from one compound to another. This domain is found in a number of enzymes that transfer nucleotides onto phosphosugars.
Fatty acid desaturases are enzymes that catalyse the insertion of a double bond at the delta position of fatty acids.
There seem to be two distinct families of fatty acid desaturases, which do not seem to be evolutionarily related.
Family 1 is composed of:
Family 2 is composed of:
This entry contains fatty acid desaturases belonging to Family 1.
Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.
MutS is a modular protein with a complex structure, and is composed of:
Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.
This entry represents the C-terminal region found in proteins in the MutS family of DNA mismatch repair proteins. The C-terminal region of MutS is comprised of the ATPase domain and the HTH (helix-turn-helix) domain, the latter being involved in dimer contacts. Yeast MSH3, bacterial proteins involved in DNA mismatch repair, and the predicted protein product of the Rep-3 gene of mouse share extensive sequence similarity. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein.
Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including vitamin B12, haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin.
The first stage in tetrapyrrole synthesis is the synthesis of 5-aminolaevulinic acid (ALA) via two possible routes: (1) condensation of succinyl CoA and glycine (C4 pathway) using ALA synthase, or (2) decarboxylation of glutamate (C5 pathway) via three different enzymes, glutamyl-tRNA synthetase to charge a tRNA with glutamate, glutamyl-tRNA reductase to reduce glutamyl-tRNA to glutamate-1-semialdehyde (GSA), and GSA aminotransferase to catalyse a transamination reaction to produce ALA.
The second stage is to convert ALA to uroporphyrinogen III, the first macrocyclic tetrapyrrolic structure in the pathway. This is achieved by the action of three enzymes in one common pathway: porphobilinogen (PBG) synthase (or ALA dehydratase) to condense two ALA molecules to generate porphobilinogen; hydroxymethylbilane synthase (or PBG deaminase) to polymerise four PBG molecules into preuroporphyrinogen (tetrapyrrole structure); and uroporphyrinogen III synthase to link two pyrrole units together (rings A and D) to yield uroporphyrinogen III.
Uroporphyrinogen III is the first branch point of the pathway. To synthesise cobalamin (vitamin B12), sirohaem, and coenzyme F430, uroporphyrinogen III needs to be converted into precorrin-2 by the action of uroporphyrinogen III methyltransferase. To synthesise haem and chlorophyll, uroporphyrinogen III needs to be decarboxylated into coproporphyrinogen III by the action of uroporphyrinogen III decarboxylase.
This entry represents porphobilinogen (PBG) synthase (PBGS, or 5-aminolaevulinic acid dehydratase, or ALAD), which functions during the second stage of tetrapyrrole biosynthesis. This enzyme catalyses a Knorr-type condensation reaction between two molecules of ALA to generate porphobilinogen, the pyrrolic building block used in later steps. The structure of the enzyme is based on a TIM barrel topology made up of eight identical subunits, where each subunit binds to a metal ion that is essential for activity, usually zinc (in yeast, mammals and certain bacteria) or magnesium (in plants and other bacteria). A lysine has been implicated in the catalytic mechanism. The lack of PBGS enzyme causes a rare porphyric disorder known as ALAD porphyria, which appears to involve conformational changes in the enzyme.
The ureohydrolase superfamily includes arginase, agmatinase, formiminoglutamase and proclavaminate amidinohydrolase. These enzymes share a 3-layer alpha-beta-alpha structure, and play important roles in arginine/agmatine metabolism, the urea cycle, histidine degradation, and other pathways.
Arginase, which catalyses the conversion of arginine to urea and ornithine, is one of the five members of the urea cycle enzymes that convert ammonia to urea as the principal product of nitrogen excretion. There are several arginase isozymes that differ in catalytic, molecular and immunological properties. Deficiency in the liver isozyme leads to argininemia, which is usually associated with hyperammonemia.
Agmatinase hydrolyses agmatine to putrescine, the precursor for the biosynthesis of higher polyamines, spermidine and spermine. In addition, agmatine may play an important regulatory role in mammals.
Formiminoglutamase catalyses the fourth step in histidine degradation, acting to hydrolyse N-formimidoyl-L-glutamate to L-glutamate and formamide.
Proclavaminate amidinohydrolase is involved in clavulanic acid biosynthesis. Clavulanic acid acts as an inhibitor of a wide range of beta-lactamase enzymes that are used by various microorganisms to resist beta-lactam antibiotics. As a result, clavulanic acid improves the effectiveness of beta-lactam antibiotics.
MCM proteins are DNA-dependent ATPases required for the initiation of eukaryotic DNA replication. In eukaryotes there is a family of six proteins, MCM2 to MCM7. They were first identified in yeast where most of them have a direct role in the initiation of chromosomal DNA replication by interacting directly with autonomously replicating sequences (ARS). They were thus called minichromosome maintenance proteins, MCM proteins.
This family is also present in the archaebacteria in 1 to 4 copies. Methanocaldococcus jannaschii (Methanococcus jannaschii) has four members, MJ0363, MJ0961, MJ1489 and MJECL13.
The "MCM motif" contains Walker-A and Walker-B type nucleotide binding motifs. The diagnostic sequence defining the MCMs is IDEFDKM. Only Mcm2 (aka Cdc19 or Nda1) has been subjected to mutational analysis in this region, and most mutations abolish its activity. The presence of a putative ATP-binding domain implies that these proteins may be involved in an ATP-consuming step in the initiation of DNA replication in eukaryotes.
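As a rough illustration, the diagnostic sequence can be located with a plain string pattern, and a Walker-A-like site with a regular expression. The sketch below rests on assumptions: the Walker A pattern `G.{4}GK[ST]` is a common generic form rather than one given in this entry, the function name `find_motifs` is hypothetical, and the test sequence is an invented toy peptide, not a real MCM protein.

```python
import re

# Patterns are illustrative: IDEFDKM is the diagnostic sequence named in the
# text; the Walker A expression is an assumed generic form (G-x(4)-G-K-[ST]).
MOTIFS = {
    "mcm_diagnostic": re.compile(r"IDEFDKM"),
    "walker_a": re.compile(r"G.{4}GK[ST]"),
}


def find_motifs(sequence: str) -> dict:
    """Return 0-based start positions of each motif in a protein sequence."""
    sequence = sequence.upper()
    return {
        name: [match.start() for match in pattern.finditer(sequence)]
        for name, pattern in MOTIFS.items()
    }


# Invented toy peptide containing one site of each kind.
toy_sequence = "MAAGDPGTGKSQLLKAAIDEFDKMSE"
print(find_motifs(toy_sequence))
```

A real analysis would use curated profiles or hidden Markov models rather than fixed patterns, since conserved motifs vary between family members.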
The MCM proteins bind together in a large complex. Within this complex, individual subunits associate with different affinities, and there is a tightly associated core of Mcm4 (Cdc21), Mcm6 (Mis5) and Mcm7. This core complex in human MCMs has been associated with helicase activity in vitro, leading to the suggestion that the MCM proteins are the eukaryotic replicative helicase.
Schizosaccharomyces pombe (Fission yeast) MCMs, like those in metazoans, are found in the nucleus throughout the cell cycle. This is in contrast to Saccharomyces cerevisiae (Baker's yeast), in which MCM proteins move in and out of the nucleus during each cell cycle. The assembly of the MCM complex in S. pombe is required for MCM localisation, ensuring that only intact MCM complexes remain in the nucleus.
The forkhead-associated (FHA) domain is a phosphopeptide recognition domain found in many regulatory proteins. It displays specificity for phosphothreonine-containing epitopes but will also recognise phosphotyrosine with relatively high affinity. It spans approximately 80-100 amino acid residues folded into an 11-stranded beta sandwich, which sometimes contains small helical insertions between the loops connecting the strands.
To date, genes encoding FHA-containing proteins have been identified in eubacterial and eukaryotic but not archaeal genomes. The domain is present in a diverse range of proteins, such as kinases, phosphatases, kinesins, transcription factors, RNA-binding proteins and metabolic enzymes which partake in many different cellular processes - DNA repair, signal transduction, vesicular transport and protein degradation are just a few examples.
A number of prokaryotic and eukaryotic enzymes, which appear to act via an ATP-dependent covalent binding of AMP to their substrate, share a region of sequence similarity. This region is a Ser/Thr/Gly-rich domain that is further characterised by a conserved Pro-Lys-Gly triplet. The family of enzymes includes luciferase, long chain fatty acid Co-A ligase, acetyl-CoA synthetase and various other closely-related synthetases.
High mobility group (HMG or HMGB) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG1 (also called HMG-T in fish) and HMG2 are two highly related proteins that bind single-stranded DNA preferentially and unwind double-stranded DNA. Although they have no sequence specificity, they have a high affinity for bent or distorted DNA, and bend linear DNA. HMG1 and HMG2 contain two DNA-binding HMG-box domains (A and B) that show structural and functional differences, and have a long acidic C-terminal domain rich in aspartic and glutamic acid residues. The acidic tail modulates the affinity of the tandem HMG boxes in HMG1 and 2 for a variety of DNA targets. HMG1 and 2 appear to play important architectural roles in the assembly of nucleoprotein complexes in a variety of biological processes, for example V(D)J recombination, the initiation of transcription, and DNA repair.
The profile in this entry describing the HMG-domains is much more general than the signature. In addition to the HMG1 and HMG2 proteins, HMG-domains occur in single or multiple copies in the following protein classes: the SOX family of transcription factors; SRY sex determining region Y protein and related proteins; LEF1 lymphoid enhancer binding factor 1; SSRP recombination signal recognition protein; MTF1 mitochondrial transcription factor 1; UBF1/2 nucleolar transcription factors; Abf2 yeast ARS-binding factor; and Saccharomyces cerevisiae transcription factors Ixr1, Rox1, Nhp6a, Nhp6b and Spp41.
Cytochrome c oxidase is the terminal enzyme of the respiratory chain of mitochondria and many aerobic bacteria. It catalyses the transfer of electrons from reduced cytochrome c to molecular oxygen:
4 cytochrome c+2 + 4 H+ + O2 --> 4 cytochrome c+3 + 2 H2O
This reaction is coupled to the pumping of four additional protons across the mitochondrial or bacterial membrane.
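The stoichiometry of the reaction above can be checked with a short balance calculation. The sketch below is an illustration only: it treats cytochrome c as an opaque carrier (labelled `Cyt`, a hypothetical placeholder) whose +2/+3 labels are read as formal charges, and it covers only the protons consumed chemically, not the four additional pumped protons, which are translocated rather than consumed.

```python
from collections import Counter

# Each species maps to (element counts, net charge). "Cyt" is a placeholder
# for the unchanged cytochrome c carrier; only its charge differs.
SPECIES = {
    "cyt_c_red": ({"Cyt": 1}, 2),
    "cyt_c_ox": ({"Cyt": 1}, 3),
    "H+": ({"H": 1}, 1),
    "O2": ({"O": 2}, 0),
    "H2O": ({"H": 2, "O": 1}, 0),
}

LEFT = {"cyt_c_red": 4, "H+": 4, "O2": 1}
RIGHT = {"cyt_c_ox": 4, "H2O": 2}


def totals(side):
    """Sum element counts and net charge over one side of a reaction."""
    atoms = Counter()
    charge = 0
    for name, coefficient in side.items():
        elements, species_charge = SPECIES[name]
        for element, count in elements.items():
            atoms[element] += coefficient * count
        charge += coefficient * species_charge
    return atoms, charge


left_atoms, left_charge = totals(LEFT)
right_atoms, right_charge = totals(RIGHT)
print("atoms balanced:", left_atoms == right_atoms)
print("charge balanced:", left_charge == right_charge)
```

Both sides carry the same atoms and a total charge of +12 under these assumptions, consistent with four one-electron transfers reducing one O2 to two H2O.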
Cytochrome c oxidase is an oligomeric enzymatic complex that is located in the mitochondrial inner membrane of eukaryotes and in the plasma membrane of aerobic prokaryotes. The core structure of prokaryotic and eukaryotic cytochrome c oxidase contains three common subunits, I, II and III. In prokaryotes, subunits I and III can be fused and a fourth subunit is sometimes found, whereas in eukaryotes there are a variable number of additional small polypeptidic subunits. The functional role of subunit III is not yet understood.
As the bacterial respiratory systems are branched, they have a number of distinct terminal oxidases, rather than the single cytochrome c oxidase present in the eukaryotic mitochondrial systems. Although the cytochrome o oxidases catalyse the oxidation of quinol (ubiquinol) rather than cytochrome c, they belong to the same haem-copper oxidase superfamily as cytochrome c oxidases. Members of this family share sequence similarities in all three core subunits: subunit I is the most conserved subunit, whereas subunit II is the least conserved.
The armadillo (Arm) repeat is an approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila melanogaster segment polarity gene armadillo involved in signal transduction through wingless. Animal Arm-repeat proteins function in various processes, including intracellular signalling and cytoskeletal regulation, and include such proteins as beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumour suppressor protein, and the nuclear transport factor importin-alpha, amongst others. A subset of these proteins is conserved across eukaryotic kingdoms. In higher plants, some Arm-repeat proteins function in intracellular signalling like their mammalian counterparts, while others have novel functions.
The 3-dimensional fold of an armadillo repeat is known from the crystal structure of beta-catenin, where the 12 repeats form a superhelix of alpha helices with three helices per unit. The cylindrical structure features a positively charged groove, which presumably interacts with the acidic surfaces of the known interaction partners of beta-catenin.
The tetratricopeptide repeat (TPR) is a structural motif present in a wide range of proteins. It mediates protein-protein interactions and the assembly of multiprotein complexes. The TPR motif consists of 3-16 tandem repeats of 34 amino acid residues, although individual TPR motifs can be dispersed in the protein sequence. Sequence alignment of the TPR domains reveals a consensus sequence defined by a pattern of small and large amino acids. TPR motifs have been identified in various different organisms, ranging from bacteria to humans. Proteins containing TPRs are involved in a variety of biological processes, such as cell cycle regulation, transcriptional control, mitochondrial and peroxisomal protein transport, neurogenesis and protein folding.
The X-ray structure of a domain containing three TPRs from protein phosphatase 5 revealed that TPR adopts a helix-turn-helix arrangement, with adjacent TPR motifs packing in a parallel fashion, resulting in a spiral of repeating anti-parallel alpha-helices. The two helices are denoted helix A and helix B. The packing angle between helix A and helix B is ~24 degrees within a single TPR and generates a right-handed superhelical shape. Helix A interacts with helix B and with helix A' of the next TPR. Two protein surfaces are generated: the inner concave surface is contributed to mainly by residues on helices A, and the other surface presents residues from both helices A and B.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils.
Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions (differs between eukaryotic and bacterial enzymes), domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA.
Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.
This entry represents subunit A (gyrA and parC) of bacterial gyrase and topoisomerase IV, and the equivalent C-terminal region in eukaryotic topoisomerase II composed of a single polypeptide. This subunit has DNA-binding capacity.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
The egg peptide speract receptor is a transmembrane glycoprotein. Other members of this family include the macrophage scavenger receptor type I (a membrane glycoprotein implicated in the pathologic deposition of cholesterol in arterial walls during atherogenesis), an enteropeptidase and T-cell surface glycoprotein CD5 (may act as a receptor in regulating T-cell proliferation).
The death domain (DD) is a homotypic protein interaction module composed of a bundle of six alpha-helices. DD is related in sequence and structure to the death effector domain (DED) and the caspase recruitment domain (CARD), which work in similar pathways and show similar interaction properties. DDs bind each other, forming oligomers. Mammals have numerous and diverse DD-containing proteins. Within these proteins, the DD domains can be found in combination with other domains, including: CARDs, DEDs, ankyrin repeats, caspase-like folds, kinase domains, leucine zippers, leucine-rich repeats (LRR), TIR domains, and ZU5 domains.
Some DD-containing proteins are involved in the regulation of apoptosis and inflammation through their activation of caspases and NF-kappaB, which typically involves interactions with TNF (tumour necrosis factor) cytokine receptors. In humans, eight of the over 30 known TNF receptors contain DD in their cytoplasmic tails; several of these TNF receptors use caspase activation as a signalling mechanism. The DD mediates self-association of these receptors, thus giving the signal to downstream events that lead to apoptosis. Other DD-containing proteins, such as ankyrin, MyD88 and pelle, are probably not directly involved in cell death signalling. DD-containing proteins also have links to innate immunity, communicating with Toll family receptors through bipartite adapter proteins such as MyD88.
The BRCT domain (after the C-terminal domain of a breast cancer susceptibility protein) is found predominantly in proteins involved in cell cycle checkpoint functions responsive to DNA damage, for example as found in the breast cancer DNA-repair protein BRCA1. The domain is an approximately 100 amino acid tandem repeat, which appears to act as a phospho-protein binding domain.
A chitin biosynthesis protein from yeast also seems to belong to this group.
The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates and related proteins into distinct sequence based families has been described. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form 'clans'.
Proteins containing this domain transfer UDP, ADP, GDP or CMP linked sugars to a variety of substrates, including glycogen, fructose-6-phosphate and lipopolysaccharides. The bacterial enzymes are involved in various biosynthetic processes that include exopolysaccharide biosynthesis, lipopolysaccharide core biosynthesis and the biosynthesis of the slime polysaccharide colanic acid. Mutations in this domain of the human N-acetylglucosaminyl-phosphatidylinositol biosynthetic protein are the cause of paroxysmal nocturnal hemoglobinuria (PNH), an acquired hemolytic blood disorder characterised by venous thrombosis, erythrocyte hemolysis, infections and defective hematopoiesis.
This domain is found in a diverse family of glycosyl transferases that transfer the sugar from UDP-glucose, UDP-N-acetyl-galactosamine, GDP-mannose or CDP-abequose, to a range of substrates including cellulose, dolichol phosphate and teichoic acids.
The sterile alpha motif (SAM) domain is a putative protein interaction module present in a wide variety of proteins involved in many biological processes. The SAM domain, which spans around 70 residues, is found in diverse eukaryotic organisms. SAM domains have been shown to homo- and hetero-oligomerise, forming multiple self-association architectures and also binding to various non-SAM domain-containing proteins, although with a low affinity constant. SAM domains also appear to possess the ability to bind RNA. Smaug, a protein that helps to establish a morphogen gradient in Drosophila embryos by repressing the translation of nanos (nos) mRNA, binds to the 3' untranslated region (UTR) of nos mRNA via two similar hairpin structures. The 3D crystal structure of the Smaug RNA-binding region shows a cluster of positively charged residues on the Smaug-SAM domain, which could be the RNA-binding surface. This electropositive potential is unique among all previously determined SAM-domain structures and is conserved among Smaug-SAM homologs. These results suggest that the SAM domain might have a primary role in RNA binding.
Structural analyses show that the SAM domain is arranged in a small five-helix bundle with two large interfaces. In the case of the SAM domain of EphB2, each of these interfaces is able to form dimers. The presence of these two distinct intermonomer binding surfaces suggests that SAM could form extended polymeric structures.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
This entry represents the C-terminal domain of the large subunit ribosomal proteins, known as the L7/L12 family. L7/L12 is present in each 50S subunit in four copies organised as two dimers. The L8 protein complex consisting of two dimers of L7/L12 and L10 in Escherichia coli ribosomes is assembled on the conserved region of 23S rRNA termed the GTPase-associated domain. The L7/L12 dimer probably interacts with EF-Tu. L7 and L12 differ only in a single post-translational modification: the addition of an acetyl group to the N terminus of L7.
This entry represents a domain found in both the alpha and beta chains of succinyl-CoA synthase (GDP-forming and ADP-forming). This domain can also be found in ATP citrate synthase and malate-CoA ligase. Some members utilise ATP, others use GTP.
Phosphopantetheine (or pantetheine 4' phosphate) is the prosthetic group of acyl carrier proteins (ACP) in some multienzyme complexes where it serves as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups.
The amino-terminal region of the ACP proteins is well defined and consists of four alpha helices arranged in a right-handed bundle held together by interhelical hydrophobic interactions. The Asp-Ser-Leu (DSL) motif is conserved in all of the ACP sequences, and the 4'-PP prosthetic group is covalently linked via a phosphodiester bond to the serine residue. The DSL sequence is present at the amino terminus of helix II, a domain of the protein referred to as the recognition helix and which is responsible for the interaction of ACPs with the enzymes of type II fatty acid synthesis.
The trifunctional glycinamide ribonucleotide synthetase-aminoimidazole ribonucleotide synthetase-glycinamide ribonucleotide transformylase catalyses the second, third and fifth steps in de novo purine biosynthesis. The glycinamide ribonucleotide transformylase belongs to this group.
Integrase comprises three domains capable of folding independently and whose three-dimensional structures are known. However, the manner in which the N-terminal, catalytic core, and C-terminal domains interact in the holoenzyme remains obscure. Numerous studies indicate that the enzyme functions as a multimer, minimally a dimer. The integrase proteins from Human immunodeficiency virus 1 (HIV-1) and Avian sarcoma virus have been studied most carefully with respect to the structural basis of catalysis. Although the active site of avian virus integrase does not undergo significant conformational changes on binding the required metal cofactor, that of HIV-1 does. This active site-mediated conformational change in HIV-1 reorganises the catalytic core and C-terminal domains and appears to promote an interaction that is favourable for catalysis.
Retroviral integrase is synthesised as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The presence of retrovirus integrase-related gene sequences in eukaryotes is known. Bacterial transposases involved in the transposition of the insertion sequence also belong to this group.
HIV-1 integrase catalyses the incorporation of virally derived DNA into the human genome. This unique step in the virus life cycle provides a variety of points for intervention and hence is an attractive target for the development of new therapeutics for the treatment of AIDS. Substrate recognition by the retroviral integrase enzyme is critical for retroviral integration. To catalyze this recombination event, integrase must recognize and act on two types of substrates, viral DNA and host DNA, yet the necessary interactions exhibit markedly different degrees of specificity.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.
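The difference between the loose HEXXH motif and the stricter 'abXHEbbHbc' definition can be illustrated with two regular expressions. This is a sketch under stated assumptions, not a curated classifier: the residue classes below (which amino acids count as "uncharged" or "hydrophobic"), the restriction of 'a' to valine or threonine only, and the exclusion of proline from every variable position are all simplifications introduced here, and both test sequences are toy fragments.

```python
import re

# Assumed residue classes (one-letter codes); these sets are illustrative.
UNCHARGED = "ACFGILMNQSTVWY"    # excludes D, E, K, R, H and P
HYDROPHOBIC = "AFILMVWY"
NOT_PRO = "ACDEFGHIKLMNQRSTVWY"  # any standard residue except proline

# Loose motif: H-E-X-X-H with any two residues in between.
LOOSE = re.compile(r"HE..H")

# Stricter motif: a-b-X-H-E-b-b-H-b-c, with 'a' restricted to V or T.
STRICT = re.compile(
    f"[VT][{UNCHARGED}][{NOT_PRO}]HE[{UNCHARGED}]{{2}}H"
    f"[{UNCHARGED}][{HYDROPHOBIC}]"
)


def motif_starts(pattern, sequence):
    """Return 1-based start positions of non-overlapping pattern matches."""
    return [m.start() + 1 for m in pattern.finditer(sequence.upper())]


# Toy fragments for illustration only.
strict_like = "GSVVAHELTHAVTDY"
loose_only = "AAHEKRHDD"

print(motif_starts(LOOSE, strict_like), motif_starts(STRICT, strict_like))
print(motif_starts(LOOSE, loose_only), motif_starts(STRICT, loose_only))
```

The second fragment matches only the loose pattern because charged residues occupy the 'b' positions, which shows how the stricter definition filters out chance HEXXH occurrences.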
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This entry contains proteins that belong to MEROPS peptidase family M24 (clan MG), which share a common structural-fold, the "pita-bread" fold. The fold contains both alpha helices and an anti-parallel beta sheet within two structurally similar domains that are thought to be derived from an ancient gene duplication. The active site, where conserved, is located between the two domains. The fold is common to methionine aminopeptidase, aminopeptidase P, prolidase, agropine synthase and creatinase. Though many of these peptidases require a divalent cation, creatinase is not a metal-dependent enzyme.
The entry also contains proteins that have lost catalytic activity, for example Spt16, which is a component of the FACT complex. The crystal structure of the N terminal domain of Spt16, determined to 2.1 Å, reveals an aminopeptidase P fold whose enzymatic activity has been lost. This fold binds directly to histones H3-H4 through an interaction with their globular core domains, as well as with their N-terminal tails.
The FACT complex is a stable heterodimer in Saccharomyces cerevisiae (Baker's yeast) comprising Spt16p and Pob3p. The complex plays a role in transcription initiation and promotes binding of TATA-binding protein (TBP) to a TATA box in chromatin; it also facilitates RNA Polymerase II transcription elongation through nucleosomes by destabilizing and then reassembling nucleosome structure.
Glutamate synthase (GltS) is a key enzyme in the early stages of the assimilation of ammonia in bacteria, yeasts, and plants. In bacteria, L-glutamate is involved in osmoregulation, is the precursor for other amino acids, and can be the precursor for haem biosynthesis. In plants, GltS is especially essential in the reassimilation of ammonia released by photorespiration. On the basis of the amino acid sequence and the nature of the electron donor, three different classes of GltS can be defined as follows: 1) ferredoxin-dependent GltS (Fd-GltS), 2) NADPH-dependent GltS (NADPH-GltS), and 3) NADH-dependent GltS (properties of the three classes have been reviewed extensively). The enzyme is a complex iron-sulphur flavoprotein catalysing the reductive transfer of the amido nitrogen from L-glutamine to 2-oxoglutarate to form two molecules of L-glutamate via intramolecular channelling of ammonia from the amidotransferase domain to the FMN-binding domain.
Reaction of amidotransferase domain:
L-glutamine + H2O = L-glutamate + NH3
Reactions of FMN-binding domain:
2-oxoglutarate + NH3 = 2-iminoglutarate + H2O
2e + FMNox = FMNred
2-iminoglutarate + FMNred = L-glutamate + FMNox
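Summing the four partial reactions above shows how the intermediates cancel to leave two molecules of L-glutamate. The sketch below does this bookkeeping; the species names are copied from the reactions as written, electrons are written as `e-`, and the helper name `net_reaction` is hypothetical.

```python
from collections import Counter

# Partial reactions as (reactants, products), copied from the text above.
STEPS = [
    ({"L-glutamine": 1, "H2O": 1}, {"L-glutamate": 1, "NH3": 1}),
    ({"2-oxoglutarate": 1, "NH3": 1}, {"2-iminoglutarate": 1, "H2O": 1}),
    ({"e-": 2, "FMNox": 1}, {"FMNred": 1}),
    ({"2-iminoglutarate": 1, "FMNred": 1}, {"L-glutamate": 1, "FMNox": 1}),
]


def net_reaction(steps):
    """Add up partial reactions and cancel species that appear on both sides."""
    balance = Counter()
    for reactants, products in steps:
        for species, n in reactants.items():
            balance[species] -= n
        for species, n in products.items():
            balance[species] += n
    consumed = {s: -n for s, n in balance.items() if n < 0}
    produced = {s: n for s, n in balance.items() if n > 0}
    return consumed, produced


consumed, produced = net_reaction(STEPS)
print("consumed:", consumed)
print("produced:", produced)
```

Under this bookkeeping, NH3, H2O, 2-iminoglutarate and both FMN states cancel, so the net reaction is L-glutamine + 2-oxoglutarate + 2e = 2 L-glutamate.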
The alpha/beta hydrolase fold is common to a number of hydrolytic enzymes of widely differing phylogenetic origin and catalytic function. The core of each enzyme is an alpha/beta-sheet (rather than a barrel), containing 8 strands connected by helices. The enzymes are believed to have diverged from a common ancestor, preserving the arrangement of the catalytic residues. All have a catalytic triad, the elements of which are borne on loops, which are the best conserved structural features of the fold. Esterase (EST) from Pseudomonas putida is a member of the alpha/beta hydrolase fold superfamily of enzymes.
In most of the family members the beta-strands are parallel, but some have an inversion of the first strands, which gives them an antiparallel orientation. The catalytic triad residues are presented on loops. One of these is the nucleophile elbow and is the most conserved feature of the fold. Some other members lack one or all of the catalytic residues. Some members are therefore inactive but others are involved in surface recognition. The ESTHER database gathers and annotates all the published information related to gene and protein sequences of this superfamily.
This entry represents fold-1 of alpha/beta hydrolase.
DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.
RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise.
RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain represents the hybrid-binding domain and the wall domain. The hybrid-binding domain binds the nascent RNA strand/template DNA strand in the Pol II transcription elongation complex. This domain contains the important structural motifs, switch 3 and the flap loop and binds an active site metal ion. This domain is also involved in binding to Rpb1 and Rpb3. Many of the bacterial members contain large insertions within this domain, which are known as dispensable region 2 (DRII).
Staphylococcus aureus nuclease (SNase) homologues, previously thought to be restricted to bacteria and archaea, are also found in eukaryotes. Staphylococcal nuclease has multidomain organization. The human cellular coactivator p100 contains four repeats, each of which is a SNase homologue. These repeats are unlikely to possess SNase-like activities as each lacks equivalent SNase catalytic residues, yet they may mediate p100's single-stranded DNA-binding function. A variety of proteins including many that are still uncharacterised belong to this group.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents ZZ-type zinc finger domains, named because of their ability to bind two zinc ions. These domains contain 4-6 Cys residues that participate in zinc binding (plus additional Ser/His residues), including a Cys-X2-Cys motif found in other zinc finger domains. These zinc fingers are thought to be involved in protein-protein interactions. The structure of the ZZ domain shows that it belongs to the family of cross-brace zinc finger motifs that include the PHD, RING, and FYVE domains. ZZ-type zinc finger domains are found in:
Single copies of the ZZ zinc finger occur in the transcriptional adaptor/coactivator proteins P300, in cAMP response element-binding protein (CREB)-binding protein (CBP) and ADA2. CBP provides several binding sites for transcriptional coactivators. The site of interaction of the tumour suppressor protein p53 and the oncoprotein E1A with CBP/P300 is a Cys-rich region that incorporates two zinc-binding motifs: ZZ-type and TAZ2-type. The ZZ-type zinc finger of CBP contains two twisted anti-parallel beta-sheets and a short alpha-helix, and binds two zinc ions. One zinc ion is coordinated by four cysteine residues via 2 Cys-X2-Cys motifs, and the second zinc ion via a third Cys-X-Cys motif and a His-X-His motif. The first zinc cluster is strictly conserved, whereas the second zinc cluster displays variability in the position of the two His residues.
In Arabidopsis thaliana (Mouse-ear cress), the hypersensitive to red and blue 1 (Hrb1) protein, which regulates both red and blue light responses, contains a ZZ-type zinc finger domain.
ZZ-type zinc finger domains have also been identified in the testis-specific E3 ubiquitin ligase MEX that promotes death receptor-induced apoptosis. MEX has four putative zinc finger domains: one ZZ-type, one SWIM-type and two RING-type. The region containing the ZZ-type and RING-type zinc fingers is required for interaction with UbcH5a and MEX self-association, whereas the SWIM domain was critical for MEX ubiquitination.
In addition, the Cys-rich domains of dystrophin, utrophin and an 87 kDa post-synaptic protein contain a ZZ-type zinc finger with high sequence identity to P300/CBP ZZ-type zinc fingers. In dystrophin and utrophin, the ZZ-type zinc finger lies between a WW domain (flanked by an EF hand) and the C-terminal coiled-coil domain. Dystrophin is thought to act as a link between the actin cytoskeleton and the extracellular matrix, and perturbations of the dystrophin-associated complex, for example, between dystrophin and the transmembrane glycoprotein beta-dystroglycan, may lead to muscular dystrophy. Dystrophin and its autosomal homologue utrophin interact with beta-dystroglycan via their C-terminal regions, which comprise a WW domain, an EF hand domain and a ZZ-type zinc finger domain. The WW domain is the primary site of interaction between dystrophin or utrophin and dystroglycan, while the EF hand and ZZ-type zinc finger domains stabilise and strengthen this interaction.
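The Cys-X2-Cys motif mentioned for ZZ-type zinc fingers (a cysteine, any two residues, then another cysteine) can be located in a protein sequence with a simple pattern search. The following is a minimal sketch only: the function name and the toy sequence are hypothetical, and finding this short motif does not by itself indicate a ZZ-type or any other zinc finger domain.

```python
import re

# Cys-X2-Cys: cysteine, any two residues, cysteine.
# The pattern is wrapped in a lookahead so overlapping motifs are also reported.
CYS_X2_CYS = re.compile(r"(?=(C..C))")


def find_cys_x2_cys(sequence):
    """Return (1-based start position, matched residues) for each Cys-X2-Cys motif."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in CYS_X2_CYS.finditer(seq)]


# Hypothetical toy sequence, not a real ZZ domain.
toy_sequence = "MKCAACGHTCPDCWLR"
for position, motif in find_cys_x2_cys(toy_sequence):
    print(position, motif)
```

Real domain assignment relies on profile-based methods over the whole domain rather than on a single short motif.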
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes.
Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking, while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites.
Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern. The crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations.
Mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein L13 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L13 is known to be one of the early assembly proteins of the 50S ribosomal subunit.
This family includes ribosomal L4/L1 from eukaryotes and plants and L4 from bacteria. L4 from yeast has been shown to bind rRNA. These proteins have 246 (plant) to 427 (human) amino acids.
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
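The relationship between the linear order of the catalytic triad and the clan can be illustrated with a small lookup. This is only a sketch under stated assumptions: the function names are hypothetical, the table holds just the three orders quoted above, and the example positions are the commonly cited residue numbers for chymotrypsin (His57, Asp102, Ser195) and subtilisin (Asp32, His64, Ser221). Triad order is a hint of clan membership, not a classification method.

```python
# Linear triad orders quoted in the text, mapped to their clans.
TRIAD_ORDER_TO_CLAN = {"HDS": "PA", "DHS": "SB", "SDH": "SC"}


def triad_order(positions):
    """Return residue codes sorted by sequence position.

    positions maps a one-letter residue code (H, D, S) to its position.
    """
    return "".join(sorted(positions, key=positions.get))


def clan_from_triad(positions):
    """Return the clan whose quoted triad order matches, or None if no match."""
    return TRIAD_ORDER_TO_CLAN.get(triad_order(positions))


# Commonly cited catalytic residue numbering (an assumption of this example).
chymotrypsin = {"H": 57, "D": 102, "S": 195}
subtilisin = {"D": 32, "H": 64, "S": 221}
print(triad_order(chymotrypsin), clan_from_triad(chymotrypsin))
print(triad_order(subtilisin), clan_from_triad(subtilisin))
```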
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of serine peptidases belongs to the MEROPS peptidase family S14 (ClpP endopeptidase family, clan SK). ClpP is an ATP-dependent protease that cleaves a number of proteins, such as casein and albumin. It exists as a heterodimer of ATP-binding regulatory A and catalytic P subunits, both of which are required for effective levels of protease activity in the presence of ATP, although the P subunit alone does possess some catalytic activity. This family of sequences represents the P subunit.
Proteases highly similar to ClpP have been found to be encoded in the genome of bacteria, metazoa, some viruses and in the chloroplast of plants. A number of the proteins in this family are classified as non-peptidase homologues as they have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for catalytic activity.
The S1 domain was originally identified in ribosomal protein S1 but is found in a large number of RNA-associated proteins. The structure of the S1 RNA-binding domain from the Escherichia coli polynucleotide phosphorylase has been determined using NMR methods and consists of a five-stranded antiparallel beta barrel. Conserved residues on one face of the barrel and adjacent loops form the putative RNA-binding site.
The structure of the S1 domain is very similar to that of cold shock proteins. This suggests that they may both be derived from an ancient nucleic acid-binding protein.
More information about these proteins can be found at Protein of the Month: RNA Exosomes.
Peroxiredoxins (Prxs) are a ubiquitous family of antioxidant enzymes that also control cytokine-induced peroxide levels which mediate signal transduction in mammalian cells. Prxs can be regulated by changes to phosphorylation, redox and possibly oligomerisation states. Prxs are divided into three classes: typical 2-Cys Prxs; atypical 2-Cys Prxs; and 1-Cys Prxs. All Prxs share the same basic catalytic mechanism, in which an active-site cysteine (the peroxidatic cysteine) is oxidised to a sulphenic acid by the peroxide substrate. The recycling of the sulphenic acid back to a thiol is what distinguishes the three enzyme classes. Using crystal structures, a detailed catalytic cycle has been derived for typical 2-Cys Prxs, including a model for the redox-regulated oligomeric state proposed to control enzyme activity.
Alkyl hydroperoxide reductase (AhpC) is responsible for directly reducing organic hydroperoxides in its reduced dithiol form. Thiol-specific antioxidant (TSA) is a physiologically important antioxidant which constitutes an enzymatic defence against sulphur-containing radicals. This family contains AhpC and TSA, as well as related proteins.
Some of the proteins in this family are allergens. Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee, King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation.
The allergens in this family include allergens with the following designations: Asp f 3, Mal f 2 and Mal f 3.
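The designation rule described above (first three letters of the genus, a space, the first letter of the species name, a space, and an Arabic number) can be sketched as a small helper. The function name is hypothetical, and the sketch deliberately ignores the extra letters added when two species would otherwise share a designation. The examples assume that Asp f and Mal f abbreviate Aspergillus fumigatus and Malassezia furfur.

```python
def allergen_designation(genus, species, number):
    """Build a designation such as 'Asp f 3' from genus, species and allergen number.

    Does not handle the disambiguation rule for clashing species designations.
    """
    genus_part = genus[:3].capitalize()   # first three letters of the genus
    species_part = species[0].lower()     # first letter of the species name
    return f"{genus_part} {species_part} {number}"


print(allergen_designation("Aspergillus", "fumigatus", 3))
print(allergen_designation("Malassezia", "furfur", 2))
```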
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.
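The class assignment listed above can be captured as a simple lookup keyed by three-letter amino acid code. This is an illustrative sketch only: the names are hypothetical, the sets transcribe the two lists in the text, and the subclasses a, b and c are not represented because the text does not enumerate them.

```python
# Amino acid specificities of the two synthetase classes, as listed in the text.
CLASS_I = {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
CLASS_II = {"Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"}


def synthetase_class(amino_acid):
    """Return 'I' or 'II' for a three-letter amino acid code, or None if unknown."""
    code = amino_acid.capitalize()
    if code in CLASS_I:
        return "I"
    if code in CLASS_II:
        return "II"
    return None


for aa in ("Gly", "His", "Val"):
    print(aa, synthetase_class(aa))
```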
Members of this family are helicases that catalyse ATP-dependent unwinding of double-stranded DNA to single-stranded DNA. The family includes both Rep and UvrD helicases. The Rep family helicases are composed of four structural domains. The Rep proteins function as dimers.
Rhodanese, a sulphurtransferase involved in cyanide detoxification, shares an evolutionary relationship with a large family of proteins.
Rhodanese has an internal duplication. This domain is found as a single copy in other proteins, including phosphatases and ubiquitin C-terminal hydrolases.
Histone acetylation is carried out by a class of enzymes known as histone acetyltransferases (HATs), which catalyze the transfer of an acetyl group from acetyl-CoA to the lysine ε-amino groups on the N-terminal tails of histones. Early indication that HATs were involved in transcription came from the observation that in actively transcribed regions of chromatin, histones tend to be hyperacetylated, whereas in transcriptionally silent regions histones are hypoacetylated. The histone acetyltransferases are divided into five families. These include the Gcn5-related acetyltransferases (GNATs); the MYST (for 'MOZ, Ybf2/Sas3, Sas2 and Tip60')-related HATs; p300/CBP HATs; the general transcription factor HATs, which include the TFIID subunit TAF250; and the nuclear hormone-related HATs SRC1 and ACTR (SRC3). The GCN5-related N-acetyltransferase superfamily includes such enzymes as the histone acetyltransferases GCN5 and Hat1, the elongator complex subunit Elp3, the mediator-complex subunit Nut1, and Hpa2.
Many GNATs share several functional domains, including an N-terminal region of variable length, an acetyltransferase domain that encompasses the conserved sequence motifs described above, a region that interacts with the coactivator Ada2, and a C-terminal bromodomain that is believed to interact with acetyl-lysine residues. Members of the GNAT family are important for the regulation of cell growth and development. In mice, knockouts of Gcn5L are embryonic lethal. Yeast Gcn5 is needed for normal progression through the G2/M boundary and mitotic gene expression. The importance of GNATs is probably related to their role in transcription and DNA repair.
The yeast GCN5 (yGCN5) transcriptional coactivator functions as a histone acetyltransferase (HAT) to promote transcriptional activation. The crystal structure of the yeast histone acetyltransferase Hat1-acetyl coenzyme A (AcCoA) shows that Hat1 has an elongated, curved structure, and the AcCoA molecule is bound in a cleft on the concave surface of the protein, marking the active site of the enzyme. A channel of variable width and depth that runs across the protein is probably the binding site for the histone substrate. The central protein core associated with AcCoA binding appears to be structurally conserved among a superfamily of N-acetyltransferases, including yeast histone acetyltransferase 1 and Serratia marcescens aminoglycoside 3-N-acetyltransferase.
Secretion across the inner membrane in some Gram-negative bacteria occurs via the preprotein translocase pathway. Proteins are produced in the cytoplasm as precursors, and require a chaperone subunit to direct them to the translocase component. From there, the mature proteins are either targeted to the outer membrane, or remain as periplasmic proteins. The translocase protein subunits are encoded on the bacterial chromosome.
The translocase itself comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of the mature peptide into the periplasm (SecD and SecF). The chaperone protein SecB is a highly acidic homotetrameric protein that exists as a "dimer of dimers" in the bacterial cytoplasm. SecB maintains preproteins in an unfolded state after translation, and targets these to the peripheral membrane protein ATPase SecA for secretion. SecE, part of the main SecYEG translocase complex, is ~106 residues in length, and spans the inner membrane of the Gram-negative bacterial envelope. Together with SecY and SecG, SecE forms a multimeric channel through which preproteins are translocated, using both proton motive forces and ATP-driven secretion. The latter is mediated by SecA.
In eukaryotes, the evolutionarily related protein sec61-gamma plays a role in protein translocation through the endoplasmic reticulum; it is part of a trimeric complex that also consists of sec61-alpha and beta. Both secE and sec61-gamma are small proteins of about 60 to 90 amino acids that contain a single transmembrane region at their C-terminal extremity (Escherichia coli secE is an exception, in that it possesses an extra N-terminal segment of 60 residues that contains two additional transmembrane domains).
This domain includes the glycine, histidine, proline, threonine and serine tRNA synthetases.
Phage integrase proteins cleave DNA substrates by a series of staggered cuts, during which the protein becomes covalently linked to the DNA through a catalytic tyrosine residue at the carboxy end of the alignment.
The catalytic site residues in Cre recombinase are Arg-173, His-289, Arg-292 and Tyr-324.
Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including cobalamin (vitamin B12), haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin.
This entry represents several tetrapyrrole methylases, which consist of two non-similar domains. These enzymes catalyse the methylation of their substrates using S-adenosyl-L-methionine as a methyl source. Enzymes in this family include:
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.
Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The DAG kinase domain is assumed to be an accessory domain. Upon cell stimulation, DAG kinase converts DAG into phosphatidate, initiating the resynthesis of phosphatidylinositols and attenuating protein kinase C activity. It catalyses the reaction: ATP + 1,2-diacylglycerol = ADP + 1,2-diacylglycerol 3-phosphate. The enzyme is stimulated by calcium and phosphatidylserine and phosphorylated by protein kinase C. This domain is always associated with
The FCH domain is a short conserved region of around 60 amino acids first described as a region of homology between FER and CIP4 proteins. Many proteins containing an FCH domain are involved in the regulation of cytoskeletal rearrangements, vesicular transport and endocytosis. In the CIP4 protein the FCH domain binds to microtubules. The FCH domain is always found N-terminally and is followed by a coiled-coil region.
Proteins containing an FCH domain can be divided into 3 classes:
Calmodulin (CaM) is recognized as a major calcium sensor and orchestrator of regulatory events through its interaction with a diverse group of cellular proteins. Three classes of recognition motifs exist for many of the known CaM binding proteins: the IQ motif as a consensus for Ca2+-independent binding and two related motifs for Ca2+-dependent binding, termed 1-8-14 and 1-5-10 based on the position of conserved hydrophobic residues.
The regulatory domain of scallop myosin is a three-chain protein complex that switches on this motor in response to Ca2+ binding. Side-chain interactions link the two light chains in tandem to adjacent segments of the heavy chain bearing the IQ-sequence motif. The Ca2+-binding site is a novel EF-hand motif on the essential light chain and is stabilized by linkages involving the heavy chain and both light chains, accounting for the requirement of all three chains for Ca2+ binding and regulation in the intact myosin molecule.
Phosphatidylinositol 3-kinase (PI3-kinase) is an enzyme that phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol ring. The role of the accessory domain of phosphoinositide 3-kinase (PI3-kinase) is unclear. It may be involved in substrate presentation.
Phosphatidylcholine-hydrolysing phospholipase D (PLD) isoforms are activated by ADP-ribosylation factors (ARFs). PLD produces phosphatidic acid from phosphatidylcholine, which may be essential for the formation of certain types of transport vesicles or may be constitutive vesicular transport to signal transduction pathways. PC-hydrolysing PLD is a homologue of cardiolipin synthase, phosphatidylserine synthase, bacterial PLDs, and viral proteins. Each of these appears to possess a domain duplication which is apparent by the presence of two motifs containing well-conserved histidine, lysine, and/or asparagine residues which may contribute to the active site aspartic acid. An Escherichia coli endonuclease (nuc) and similar proteins appear to be PLD homologues but possess only one of these motifs.
RNA polymerases catalyse the DNA-dependent polymerisation of RNA, using the four ribonucleoside triphosphates as substrates. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Eukaryotic RNA polymerase I is essentially used to transcribe ribosomal RNA units, polymerase II is used for mRNA precursors, and III is used to transcribe 5S and tRNA genes. Each class of RNA polymerase is assembled from nine to fourteen different polypeptides. Members of the family include the largest subunit from eukaryotes; the gamma subunit from Cyanobacteria; the beta' subunit from bacteria; the A' subunit from archaea; and the B'' subunit from chloroplast RNA polymerases.
Guanylate kinase (GK) catalyzes the ATP-dependent phosphorylation of GMP into GDP. It is essential for recycling GMP and, indirectly, cGMP. In prokaryotes (such as Escherichia coli), lower eukaryotes (such as yeast) and in vertebrates, GK is a highly conserved monomeric protein of about 200 amino acids. GK has been shown to be structurally similar to protein A57R (or SalG2R) from various strains of Vaccinia virus.
Proteins containing one or more copies of the DHR domain, an SH3 domain as well as a C-terminal GK-like domain, are collectively termed MAGUKs (membrane-associated guanylate kinase homologs), and include Drosophila lethal(1)discs large-1 tumor suppressor protein (gene dlg1); mammalian tight junction protein Zo-1; a family of mammalian synaptic proteins that seem to interact with the cytoplasmic tail of NMDA receptor subunits (SAP90/PSD-95, CHAPSYN-110/PSD-93, SAP97/DLG1 and SAP102); vertebrate 55 kD erythrocyte membrane protein (p55); Caenorhabditis elegans protein lin-2; rat protein CASK; and human proteins DLG2 and DLG3. There is an ATP-binding site (P-loop) in the N-terminal section of GK, which is not conserved in the GK-like domain of the above proteins. However these proteins retain the residues known, in GK, to be involved in the binding of GMP.
Gelsolin is a cytoplasmic, calcium-regulated, actin-modulating protein that binds to the barbed ends of actin filaments, preventing monomer exchange (end-blocking or capping). It can promote nucleation (the assembly of monomers into filaments), as well as sever existing filaments. In addition, this protein binds with high affinity to fibronectin. Plasma gelsolin and cytoplasmic gelsolin are derived from a single gene by alternate initiation sites and differential splicing.
Sequence comparisons indicate an evolutionary relationship between gelsolin, villin, fragmin and severin. Six large repeating segments occur in gelsolin and villin, and 3 similar segments in severin and fragmin. While the multiple repeats have yet to be related to any known function of the actin-severing proteins, the superfamily appears to have evolved from an ancestral sequence of 120 to 130 amino acid residues.
UBA domains are a commonly occurring sequence motif of approximately 45 amino acid residues that are found in diverse proteins involved in the ubiquitin/proteasome pathway, DNA excision-repair, and cell signalling via protein kinases. The human homologue of yeast Rad23A is one example of a nucleotide excision-repair protein that contains both an internal and a C-terminal UBA domain. The solution structure of human Rad23A UBA(2) showed that the domain forms a compact three-helix bundle. Comparison of the structures of UBA(1) and UBA(2) reveals that both form very similar folds and have a conserved large hydrophobic surface patch which may be a common protein-interacting surface present in diverse UBA domains. Evidence that ubiquitin binds to UBA domains leads to the prediction that the hydrophobic surface patch of UBA domains interacts with the hydrophobic surface on the five-stranded beta-sheet of ubiquitin.
This domain is similar in sequence to the N-terminal domain of translation elongation factor EF1B (or EF-Ts) from bacteria, mitochondria and chloroplasts.
More information about EF1B (EF-Ts) proteins can be found at Protein of the Month: Elongation Factors.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents the PHD (homeodomain) zinc finger domain, which is a C4HC3 zinc-finger-like motif found in nuclear proteins thought to be involved in chromatin-mediated transcriptional regulation. The PHD finger motif is reminiscent of, but distinct from the C3HC4 type RING finger.
The function of this domain is not yet known but in analogy with the LIM domain it could be involved in protein-protein interaction and be important for the assembly or activity of multicomponent complexes involved in transcriptional activation or repression. Alternatively, the interactions could be intra-molecular and be important in maintaining the structural integrity of the protein. In similarity to the RING finger and the LIM domain, the PHD finger is thought to bind two zinc ions.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
The many different actin cross-linking proteins share a common architecture, consisting of a globular actin-binding domain and an extended rod. Whereas their actin-binding domains consist of two calponin homology domains, their rods fall into three families.
The rod domain of the family including the Dictyostelium discoideum (Slime mould) gelation factor (ABP120) and human filamin (ABP280) is constructed from tandem repeats of a 100-residue motif that is glycine and proline rich. The gelation factor's rod contains 6 copies of the repeat, whereas filamin has a rod constructed from 24 repeats. The resolution of the 3D structure of rod repeats from the gelation factor has shown that they consist of a beta-sandwich, formed by two beta-sheets arranged in an immunoglobulin-like fold. Because conserved residues that form the core of the repeats are preserved in filamin, the repeat structure should be common to the members of the gelation factor/filamin family.
The head to tail homodimerisation is crucial to the function of the ABP120 and ABP280 proteins. This interaction involves a small portion at the distal end of the rod domains. For the gelation factor it has been shown that the carboxy-terminal repeat 6 dimerises through a double edge-to-edge extension of the beta-sheet and that repeat 5 contributes to dimerisation to some extent.
The name HECT comes from 'Homologous to the E6-AP Carboxyl Terminus'. Proteins containing this domain at the C-terminus include ubiquitin-protein ligase, which regulates ubiquitination of CDC25. Ubiquitin-protein ligase accepts ubiquitin from an E2 ubiquitin-conjugating enzyme in the form of a thioester, and then directly transfers the ubiquitin to targeted substrates. A cysteine residue is required for ubiquitin-thiolester formation. Human thyroid receptor interacting protein 12, which also contains this domain, is a component of an ATP-dependent multisubunit protein that interacts with the ligand binding domain of the thyroid hormone receptor. It could be an E3 ubiquitin-protein ligase. Human ubiquitin-protein ligase E3A interacts with the E6 protein of the cancer-associated Human papillomavirus type 16 and Human papillomavirus type 18. The E6/E6-AP complex binds to and targets the P53 tumour-suppressor protein for ubiquitin-mediated proteolysis.
The breast cancer type 2 susceptibility protein has a number of 39 amino acid repeats that are critical for binding to RAD51 (a key protein in DNA recombinational repair) and resistance to methyl methanesulphonate treatment. BRCA2 is a breast tumour suppressor with a potential function in the cellular response to DNA damage. At the cellular level, expression is regulated in a cell-cycle dependent manner and peak expression of BRCA2 mRNA is found in S phase, suggesting BRCA2 may participate in regulating cell proliferation. There are eight repeats in BRCA2 designated as BRC1 to BRC8. BRC1, BRC2, BRC3, BRC4, BRC7, and BRC8 are highly conserved and bind to Rad51, whereas BRC5 and BRC6 are less well conserved and do not bind to Rad51. It has been suggested that BRCA2 plays a role in positioning Rad51 at the site of DNA repair or in removing Rad51 from DNA once repair has been completed.
Major sperm proteins (MSP) are central components in molecular interactions underlying sperm motility in Caenorhabditis elegans, whose sperm employ an amoeboid crawling motion using an MSP-containing lamellipod, rather than the flagellar-based swimming motion associated with other sperm. These proteins oligomerise to form an extensive filament system that extends from sperm villipoda, along the leading edge of the pseudopod. About 30 MSP isoforms may exist in C. elegans.
MSPs form a fibrous network, whereby MSP dimers form helical subfilaments that coil around one another to produce filaments, which in turn form supercoils to produce bundles. The crystal structure of MSP from C. elegans reveals an immunoglobulin (Ig)-like seven-stranded beta sandwich fold.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.
Clathrin is a trimer composed of three heavy chains and three light chains, each monomer projecting outwards like a leg; this three-legged structure is known as a triskelion. The heavy chains form the legs, their N-terminal beta-propeller regions extending outwards, while their C-terminal alpha-alpha-superhelical regions form the central hub of the triskelion. Peptide motifs can bind between the beta-propeller blades. The light chains appear to have a regulatory role, and may help orient the assembly and disassembly of clathrin coats as they interact with hsc70 uncoating ATPase. Clathrin triskelia self-polymerise into a curved lattice by twisting individual legs together. The clathrin lattice forms around a vesicle as it buds from the TGN, plasma membrane or endosomes, acting to stabilise the vesicle and facilitate the budding process. The multiple blades created when the triskelia polymerise are involved in multiple protein interactions, enabling the recruitment of different cargo adaptors and membrane attachment proteins.
This entry represents the 7-fold alpha-alpha-superhelical ARM-type repeat found at the C terminus of clathrin heavy chains and in VPS (vacuolar protein sorting-associated) proteins. In clathrin heavy chains, the C-terminal 7-fold ARM-type repeats interact to form the central hub of the triskelion. VPS proteins are required for vacuolar assembly and vacuolar trafficking, and contain one clathrin-type repeat.
More information about these proteins can be found at Protein of the Month: Clathrin.
Ran is an evolutionarily conserved member of the Ras superfamily that regulates all receptor-mediated transport between the nucleus and the cytoplasm. Ran Binding Protein 1 (RanBP1) has guanine nucleotide dissociation inhibitor (GDI) activity specific for the GTP-bound form of Ran, and also stimulates GTP hydrolysis by Ran mediated by the Ran GTPase-activating protein (GAP). RanBP1 thus contributes to keeping either the RanGTP gradient across the nuclear envelope high (GDI activity) or the cytoplasmic levels of RanGTP low (GAP cofactor activity).
All RanBP1 proteins contain a Ran binding domain of approximately 150 amino acid residues. RanBP1 binds directly to RanGTP with high affinity. There are four sites of contact between Ran and the Ran binding domain. One of these involves binding of the C-terminal segment of Ran to a groove on the Ran binding domain that is analogous to the surface utilised in the EVH1-peptide interaction. Nup358 contains four Ran binding domains. The structure of the first of these is known.
This entry represents the zinc finger domain found in RanBP2 proteins. Ran is an evolutionarily conserved member of the Ras superfamily that regulates all receptor-mediated transport between the nucleus and the cytoplasm. Ran binding protein 2 (RanBP2) is a 358-kDa nucleoporin located on the cytoplasmic side of the nuclear pore complex, where it plays a role in nuclear protein import. RanBP2 contains multiple zinc fingers which mediate binding to RanGDP.
This entry represents C-x8-C-x5-C-x3-H (CCCH) type zinc finger (Znf) domains. Proteins containing CCCH Znf domains include eukaryotic Znf proteins involved in cell cycle or growth phase-related regulation, e.g. human TIS11B (butyrate response factor 1), a probable regulatory protein involved in regulating the response to growth factors, and the mouse growth factor-inducible nuclear protein TTP, which has the same function. Another protein containing this domain is the human splicing factor U2AF 35 kD subunit, which plays a critical role in both constitutive and enhancer-dependent splicing by mediating essential protein-protein interactions and protein-RNA interactions required for 3' splice site selection. It has been shown that different CCCH-type Znf proteins interact with the 3'-untranslated region of various mRNAs. This type of Znf is very often present in two copies.
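The C-x8-C-x5-C-x3-H spacing given above can be read directly as a search pattern. The following is a minimal Python sketch, assuming that "x" stands for any residue and that a plain regular expression is an acceptable first-pass screen; the function name `find_ccch` and the toy sequence are hypothetical, and a real annotation pipeline would rely on curated profiles rather than this bare pattern.

```python
import re

# C-x8-C-x5-C-x3-H: Cys, 8 residues, Cys, 5 residues, Cys, 3 residues, His.
# The lookahead lets overlapping candidates be reported as well.
CCCH_PATTERN = re.compile(r"(?=(C.{8}C.{5}C.{3}H))")


def find_ccch(sequence):
    """Return (0-based start, matched segment) for each candidate CCCH motif."""
    return [(m.start(), m.group(1)) for m in CCCH_PATTERN.finditer(sequence.upper())]


# Hypothetical toy sequence built only to illustrate the spacing; not a real protein.
toy = "MA" + "C" + "A" * 8 + "C" + "G" * 5 + "C" + "S" * 3 + "H" + "KK"

for start, segment in find_ccch(toy):
    print(start, segment)
```

A match only shows that the cysteine and histidine spacing is present; it says nothing about whether the segment actually folds or binds zinc.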
This entry represents B-box-type zinc finger domains, which are around 40 residues in length. B-box zinc fingers can be divided into two groups, where types 1 and 2 B-box domains differ in their consensus sequence and in the spacing of the 7-8 zinc-binding residues. Several proteins contain both types 1 and 2 B-boxes, suggesting some level of cooperativity between these two domains. B-box domains are found in over 1500 proteins from a variety of organisms. They are found in TRIM (tripartite motif) proteins that consist of an N-terminal RING finger (originally called an A-box), followed by 1-2 B-box domains and a coiled-coil domain (also called RBCC for Ring, B-box, Coiled-Coil). TRIM proteins contain a type 2 B-box domain, and may also contain a type 1 B-box. In proteins that do not contain RING or coiled-coil domains, the B-box domain is primarily type 2. Many type 2 B-box proteins are involved in ubiquitinylation. Proteins containing a B-box zinc finger domain include transcription factors, ribonucleoproteins and proto-oncoproteins; for example, MID1, MID2, TRIM9, TNL, TRIM36, TRIM63, TRIFIC, NCL1 and CONSTANS-like proteins.
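The TRIM domain order described above (an N-terminal RING finger, then one or two B-boxes including a type 2 B-box, then a coiled-coil) can be expressed as a small check on an ordered list of domain labels. This is an illustrative Python sketch only: the labels "RING", "B-box1", "B-box2" and "coiled-coil", and the function name `is_tripartite_motif`, are assumptions made for the example and do not correspond to any database vocabulary.

```python
def is_tripartite_motif(domains):
    """Check an N-to-C ordered list of domain labels for the RBCC arrangement.

    Requires RING first, then 1-2 B-boxes of which at least one is type 2,
    immediately followed by a coiled-coil. Trailing domains are ignored.
    """
    if not domains or domains[0] != "RING":
        return False
    i = 1
    # Consume the run of B-box domains that follows the RING finger.
    while i < len(domains) and domains[i] in ("B-box1", "B-box2"):
        i += 1
    bboxes = domains[1:i]
    has_coiled_coil = i < len(domains) and domains[i] == "coiled-coil"
    return 1 <= len(bboxes) <= 2 and "B-box2" in bboxes and has_coiled_coil


# Hypothetical architectures, written N-terminus first.
print(is_tripartite_motif(["RING", "B-box1", "B-box2", "coiled-coil", "other"]))
print(is_tripartite_motif(["B-box2", "coiled-coil"]))
```

The check deliberately ignores whatever follows the coiled-coil, since the text above only constrains the N-terminal tripartite region.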
The microtubule-associated E3 ligase MID1 contains a type 1 B-box zinc finger domain. MID1 specifically binds Alpha-4, which in turn recruits the catalytic subunit of phosphatase 2A (PP2Ac). This complex is required for targeting of PP2Ac for proteasome-mediated degradation. The MID1 B-box coordinates two zinc ions and adopts a beta/beta/alpha cross-brace structure similar to that of ZZ, PHD, RING and FYVE zinc fingers.
Poly(ADP-ribose) polymerases (PARP) are a family of enzymes present in eukaryotes, which catalyze the poly(ADP-ribosyl)ation of a limited number of proteins involved in chromatin architecture, DNA repair, or in DNA metabolism, including PARP itself. PARP, also known as poly(ADP-ribose) synthetase and poly(ADP-ribose) transferase, transfers the ADP-ribose moiety from its substrate, nicotinamide adenine dinucleotide (NAD), to carboxylate groups of aspartic and glutamic residues. Whereas some PARPs might function in genome protection, others appear to play different roles in the cell, including telomere replication and cellular transport. PARP-1 is a multifunctional enzyme. The polypeptide has a highly conserved modular organization consisting of an N-terminal DNA-binding domain, a central regulating segment, and a C-terminal or F region accommodating the catalytic centre. The F region is composed of two parts: a purely alpha-helical N-terminal domain (alpha-hd), and the mixed alpha/beta C-terminal catalytic domain bearing the putative NAD binding site. Although proteins of the PARP family are related through their PARP catalytic domain, they do not resemble each other outside of that region, but rather, they contain unique domains that distinguish them from each other and hint at their discrete functions. Domains with which the PARP catalytic domain is found associated include zinc fingers, SAP, ankyrin, BRCT, Macro, SAM, WWE and UIM domains.
The alpha-hd domain is about 130 amino acids in length and consists of an up-up-down-up-down-down motif of helices. It is thought to relay the activation signal issued on binding to damaged DNA. The PARP catalytic domain is about 230 residues in length. Its core consists of a five-stranded antiparallel beta-sheet and a four-stranded mixed beta-sheet. The two sheets are consecutive and are connected via a single pair of hydrogen bonds between two strands that run at an angle of 90 degrees. These central beta-sheets are surrounded by five alpha-helices, three 3(10)-helices, and by a three- and a two-stranded beta-sheet in a 37-residue excursion between two central beta-strands. The active site, known as the 'PARP signature', is formed by a block of 50 amino acids that is strictly conserved among the vertebrates and highly conserved among all species. The 'PARP signature' is characteristic of all PARP protein family members. It is a segment of conserved amino acid residues comprising a beta-sheet, an alpha-helix, a 3(10)-helix, a beta-sheet, and an alpha-helix.
This entry represents PARP (poly(ADP-ribose) polymerase) type zinc finger domains.
NAD(+) ADP-ribosyltransferase is a eukaryotic enzyme that catalyses the covalent attachment of ADP-ribose units from NAD(+) to various nuclear acceptor proteins. This post-translational modification of nuclear proteins is dependent on DNA. It appears to be involved in the regulation of various important cellular processes such as differentiation, proliferation and tumour transformation as well as in the regulation of the molecular events involved in the recovery of the cell from DNA damage. Structurally, NAD(+) ADP-ribosyltransferase consists of three distinct domains: an N-terminal zinc-dependent DNA-binding domain, a central automodification domain and a C-terminal NAD-binding domain. The DNA-binding region contains a pair of PARP-type zinc finger domains which have been shown to bind DNA in a zinc-dependent manner. The PARP-type zinc finger domains seem to bind specifically to single-stranded DNA and to act as a DNA nick sensor. DNA ligase III contains, in its N-terminal section, a single copy of a zinc finger highly similar to those of PARP.
Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.
Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta).
This entry represents a conserved domain usually found near the C-terminus of EF1B-gamma chains, a peptide of 410-440 residues. The gamma chain appears to play a role in anchoring the EF1B complex to the beta and delta chains and to other cellular components.
More information about these proteins can be found at Protein of the Month: Elongation Factors.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.
This group of cysteine peptidases belongs to the MEROPS peptidase family C2 (calpain family, clan CA). A type example is calpain, which is an intracellular protease involved in many important cellular functions that are regulated by calcium. The protein is a complex of 2 polypeptide chains (light and heavy), with three known forms in mammals: a highly calcium-sensitive (i.e., micromolar range) form known as mu-calpain, mu-CANP or calpain I; a form sensitive to calcium in the millimolar range, known as m-calpain, m-CANP or calpain II; and a third form, known as p94, which is found in skeletal muscle only.
All forms have identical light but different heavy chains. Both mu- and m-calpain are heterodimers containing an identical 28-kDa subunit and an 80-kDa subunit that shares 55-65% sequence homology between the two proteases. The crystallographic structure of m-calpain reveals six "domains" in the 80-kDa subunit.
Domain 2 shows low levels of sequence similarity to papain; although the catalytic His has not been located by biochemical means, it is likely that calpain and papain are related.
Calpain-like mRNAs have been identified in other organisms including bacteria, but the molecules encoded by these mRNAs have not been isolated, so little is known about their properties. How calpain activity is regulated in the cells of these organisms is still unclear. In metazoans, the activity of calpain is controlled by a single proteinase inhibitor, calpastatin. The calpastatin gene can produce eight or more calpastatin polypeptides ranging from 17 to 85 kDa by use of different promoters and alternative splicing events. The physiological significance of these different calpastatins is unclear, although all bind to three different places on the calpain molecule; binding to at least two of the sites is Ca2+ dependent. The calpains ostensibly participate in a variety of cellular processes including remodelling of cytoskeletal/membrane attachments, different signal transduction pathways, and apoptosis. Deregulated calpain activity following loss of Ca2+ homeostasis results in tissue damage in response to events such as myocardial infarcts, stroke, and brain trauma.
Other members of the family are transfer proteins, which include a guanine nucleotide exchange factor that may function as an effector of RAC1, a phosphatidylinositol/phosphatidylcholine transfer protein that is required for the transport of secretory proteins from the Golgi complex, and an alpha-tocopherol transfer protein that enhances the transfer of the ligand between separate membranes.
The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N terminus of a fraction of zinc finger proteins and in proteins that contain the Kelch motif, such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN.
This group of sequences represent the p20 (20kDa) and p10 (10kDa) subunits of caspases, which together form the catalytic domain of the caspase and are derived from the p45 (45 kDa) precursor.
Caspases (Cysteine-dependent ASPartyl-specific proteASE) are cysteine peptidases that belong to the MEROPS peptidase family C14 (caspase family, clan CD) based on the architecture of their catalytic dyad or triad. Caspases are tightly regulated proteins that require zymogen activation to become active, and once active can be regulated by caspase inhibitors. Activated caspases act as cysteine proteases, using the sulphydryl group of a cysteine side chain for catalysing peptide bond cleavage at aspartyl residues in their substrates. The catalytic cysteine and histidine residues are on the p20 subunit after cleavage of the p45 precursor.
Caspases are mainly involved in mediating cell death (apoptosis). They have two main roles within the apoptosis cascade: as initiators that trigger the cell death process, and as effectors of the process itself. Caspase-mediated apoptosis follows two main pathways, one extrinsic and the other intrinsic or mitochondrial-mediated. The extrinsic pathway involves the stimulation of various TNF (tumour necrosis factor) cell surface receptors on cells targeted to die by various TNF cytokines that are produced by cells such as cytotoxic T cells. The activated receptor transmits the signal to the cytoplasm by recruiting FADD, which forms a death-inducing signalling complex (DISC) with caspase-8. The subsequent activation of caspase-8 initiates the apoptosis cascade involving caspases 3, 4, 6, 7, 9 and 10. The intrinsic pathway arises from signals that originate within the cell as a consequence of cellular stress or DNA damage. The stimulation or inhibition of different Bcl-2 family receptors results in the leakage of cytochrome c from the mitochondria, and the formation of an apoptosome composed of cytochrome c, Apaf1 and caspase-9. The subsequent activation of caspase-9 initiates the apoptosis cascade involving caspases 3 and 7, among others. At the end of the cascade, caspases act on a variety of signal transduction proteins, cytoskeletal and nuclear proteins, chromatin-modifying proteins, DNA repair proteins and endonucleases that destroy the cell by disintegrating its contents, including its DNA. The different caspases have different domain architectures depending upon where they fit into the apoptosis cascades; however, they all carry the catalytic p10 and p20 subunits.
Caspases can have roles other than in apoptosis, such as caspase-1 (interleukin-1 beta convertase), which is involved in the inflammatory process. The activation of apoptosis can sometimes lead to caspase-1 activation, providing a link between apoptosis and inflammation, such as during the targeting of infected cells. Caspases may also be involved in cell differentiation.
The polyadenylate-binding protein (PABP) has a conserved C-terminal domain (PABC), which is also found in the hyperplastic discs protein (HYD) family of ubiquitin ligases that contain HECT domains. PABP recognises the 3' mRNA poly(A) tail and plays an essential role in eukaryotic translation initiation and mRNA stabilisation/degradation. PABC domains of PABP are peptide-binding domains that mediate PABP homo-oligomerisation and protein-protein interactions. In mammals, the PABC domain of PABP functions to recruit several different translation factors to the mRNA poly(A) tail.
ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.
ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain.
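The domain arrangements described above can be sketched as a small data model. The following Python snippet is only an illustration: the function name `is_minimal_transporter`, the list-of-lists representation and the order of domains within each chain are assumptions made for the example, and accessory binding proteins are ignored.

```python
from collections import Counter

def is_minimal_transporter(polypeptides):
    """Return True if the chains together supply two ABC modules and two TMDs."""
    counts = Counter(domain for chain in polypeptides for domain in chain)
    return counts["ABC"] == 2 and counts["TMD"] == 2

# Importer-style layout: four independent polypeptides.
importer = [["ABC"], ["ABC"], ["TMD"], ["TMD"]]

# Exporter-style layout: two chains, each with a TMD fused to an ABC module.
# The TMD-then-ABC order is an assumption; the text allows several combinations.
fused_exporter = [["TMD", "ABC"], ["TMD", "ABC"]]

# Some eukaryotic exporters: all four domains on a single chain.
single_chain_exporter = [["TMD", "ABC", "TMD", "ABC"]]

for name, layout in [("importer", importer),
                     ("fused exporter", fused_exporter),
                     ("single-chain exporter", single_chain_exporter)]:
    print(name, is_minimal_transporter(layout))
```

All three layouts satisfy the four-domain requirement, whereas a single fused TMD-ABC chain on its own does not, which is consistent with such half-transporters having to dimerise.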
The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop (Walker A) and a magnesium-binding site (Walker B). Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ) specific to the ABC transporter; and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site.
The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis.
The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).
A variety of ATP-binding transport proteins have a region of six transmembrane helices. They are all integral membrane proteins involved in a variety of transport systems. Members of this family include the cystic fibrosis transmembrane conductance regulator (CFTR), bacterial leukotoxin secretion ATP-binding protein, multidrug resistance proteins, the yeast leptomycin B resistance protein, the mammalian sulphonylurea receptor and antigen peptide transporter 2. Many of these proteins have two such regions.
Integrase comprises three domains capable of folding independently and whose three-dimensional structures are known. However, the manner in which the N-terminal, catalytic, and C-terminal domains interact in the holoenzyme remains obscure. Numerous studies indicate that the enzyme functions as a multimer, minimally a dimer. The integrase proteins from Human immunodeficiency virus 1 (HIV-1) and Avian sarcoma virus (ASV) have been studied most carefully with respect to the structural basis of catalysis. Although the active site of ASV integrase does not undergo significant conformational changes on binding the required metal cofactor, that of HIV-1 does. This active site-mediated conformational change in HIV-1 reorganises the catalytic core and C-terminal domains and appears to promote an interaction that is favourable for catalysis.
Retroviral integrase is synthesised as part of the POL polyprotein, which contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The presence of retrovirus integrase-related gene sequences in eukaryotes is known. Bacterial transposases involved in the transposition of the insertion sequence also belong to this group.
HIV integrase catalyses the incorporation of virally derived DNA into the human genome. This unique step in the virus life cycle provides a variety of points for intervention and hence is an attractive target for the development of new therapeutics for the treatment of AIDS. Substrate recognition by the retroviral integrase enzyme is critical for retroviral integration. To catalyse this recombination event, integrase must recognise and act on two types of substrates, viral DNA and host DNA, yet the necessary interactions exhibit markedly different degrees of specificity.
This domain is found in sulphite reductase, NADPH cytochrome P450 reductase, nitric oxide synthase and methionine synthase reductase. Flavoprotein pyridine nucleotide cytochrome reductases (FPNCR) catalyse the interchange of reducing equivalents between one-electron carriers and the two-electron-carrying nicotinamide dinucleotides. The enzymes include ferredoxin:NADP+ reductases (FNR), plant and fungal NAD(P)H:nitrate reductases, NADH:cytochrome b5 reductases, NADPH:P450 reductases, NADPH:sulphite reductases, nitric oxide synthases, phthalate dioxygenase reductase, and various other flavoproteins.
S-adenosyl-L-homocysteine hydrolase (AdoHcyase) is an enzyme of the activated methyl cycle, responsible for the reversible hydrolysis of S-adenosyl-L-homocysteine into adenosine and homocysteine. AdoHcyase is a ubiquitous enzyme which binds and requires NAD+ as a cofactor. AdoHcyase is a highly conserved protein of about 430 to 470 amino acids.
This entry represents the glycine-rich region in the central part of AdoHcyase, which is thought to be involved in NAD-binding.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Ribosomal protein L5 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L5 is known to be involved in binding 5S RNA to the large ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:
L5 is a protein of about 180 amino-acid residues.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
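As a minimal sketch of how the loose HEXXH pattern might be located in a sequence, the Python snippet below uses a regular expression. The function name `find_hexxh` and the toy sequence are illustrative assumptions, and the more stringent 'abXHEbbHbc' form is not implemented because it would require choosing explicit residue sets for 'b' and 'c'. A hit only flags a candidate site; it does not establish metal binding.

```python
import re

# Loose zinc-binding motif: His-Glu-any-any-His.
# The lookahead lets overlapping occurrences be reported as well.
HEXXH = re.compile(r"(?=(HE..H))")

def find_hexxh(sequence):
    """Return (1-based start position, matched residues) for each HEXXH hit."""
    return [(m.start() + 1, m.group(1)) for m in HEXXH.finditer(sequence.upper())]

# Toy sequence invented for illustration only (not a real protein).
toy_sequence = "MKVAIHELGHALGLSHSS"
hits = find_hexxh(toy_sequence)
print(hits)
```

Because the motif is short and relatively common, a scan of this kind is best treated as a first filter before structural or family-level evidence is considered.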
The majority of the sequences in this entry are metallopeptidases and non-peptidase homologues that belong to MEROPS peptidase family M16 (clan ME), subfamilies M16A, M16B and M16C; they include:
These proteins do not share many regions of sequence similarity; the most noticeable is in the N-terminal section. This region includes a conserved histidine followed, two residues later, by a glutamate and another histidine. In pitrilysin, it has been shown that this H-x-x-E-H motif is involved in enzymatic activity; the two histidines bind zinc and the glutamate is necessary for catalytic activity. The proteins classified as non-peptidase homologues either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity.
Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.
Elongation factor EF2 (EF-G) is a G-protein. It brings about the translocation of peptidyl-tRNA and mRNA through a ratchet-like mechanism: the binding of GTP-EF2 to the ribosome causes a counter-clockwise rotation in the small ribosomal subunit; the hydrolysis of GTP to GDP by EF2 and the subsequent release of EF2 causes a clockwise rotation of the small subunit back to the starting position. This twisting action destabilises tRNA-ribosome interactions, freeing the tRNA to translocate along the ribosome upon GTP hydrolysis by EF2. EF2 binding also affects the entry and exit channel openings for the mRNA, widening them when bound to enable the mRNA to translocate along the ribosome.
This entry represents the C-terminal domain found in EF2 (or EF-G) of both prokaryotes and eukaryotes (also known as eEF2), as well as in some tetracycline-resistance proteins. This domain adopts a ferredoxin-like fold consisting of an alpha/beta sandwich with anti-parallel beta-sheets. It resembles the topology of domain III found in these elongation factors, with which it forms the C-terminal block, but these two domains cannot be superimposed. This domain is often found associated with, which contains the signatures for the N-terminus of the proteins.
More information about these proteins can be found at Protein of the Month: Elongation Factors.
Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolyzing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate binding domain. Dimeric GrpE is the co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle.
Besides stimulating the ATPase activity of DnaK through its J-domain, DnaJ also associates with unfolded polypeptide chains and prevents their aggregation. Thus, DnaK and DnaJ may bind to one and the same polypeptide chain to form a ternary complex. The formation of a ternary complex may result in cis-interaction of the J-domain of DnaJ with the ATPase domain of DnaK. An unfolded polypeptide may enter the chaperone cycle by associating first either with ATP-liganded DnaK or with DnaJ. DnaK interacts with both the backbone and side chains of a peptide substrate; it thus shows binding polarity and admits only L-peptide segments. In contrast, DnaJ has been shown to bind both L- and D-peptides and is assumed to interact only with the side chains of the substrate.
Ribosomal protein L1 is the largest protein from the large ribosomal subunit. The L1 protein contains two domains: a 2-layer alpha/beta domain and a 3-layer alpha/beta domain (which interrupts the first domain). In Escherichia coli, L1 is known to bind to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
P-ATPases (sometimes known as E1-E2 ATPases) are found in bacteria and in a number of eukaryotic plasma membranes and organelles. P-ATPases function to transport a variety of different compounds, including ions and phospholipids, across a membrane using ATP hydrolysis for energy. There are many different classes of P-ATPases, each of which transports a specific type of ion: H+, Na+, K+, Mg2+, Ca2+, Ag+ and Ag2+, Zn2+, Co2+, Pb2+, Ni2+, Cd2+, Cu+ and Cu2+. P-ATPases can be composed of one or two polypeptides, and can usually assume two main conformations called E1 and E2.
This entry represents the conserved C-terminal region found in several classes of cation-transporting P-type ATPases, including those that transport H+, Na+, Ca2+, Na+/K+, and H+/K+. In the H+/K+- and Na+/K+-exchange P-ATPases, this domain is found in the catalytic alpha chain.
More information about this protein can be found at Protein of the Month: ATP Synthases.
This entry represents the conserved N-terminal region found in several classes of cation-transporting P-type ATPases, including those that transport H+, Na+, Ca2+, Na+/K+, and H+/K+. In the H+/K+- and Na+/K+-exchange P-ATPases, this domain is found in the catalytic alpha chain. In gastric H+/K+-ATPases, this domain undergoes reversible sequential phosphorylation inducing conformational changes that may be important for regulating the function of these ATPases.
Synonym(s): dUTP diphosphatase, Deoxyuridine-triphosphatase
The essential enzyme dUTP pyrophosphatase is specific for dUTP and is critical for the fidelity of DNA replication and repair. dUTPase hydrolyzes dUTP to dUMP and pyrophosphate, simultaneously reducing dUTP levels and providing the dUMP for dTTP biosynthesis. dUTPase decreases the intracellular concentration of dUTP so that uracil cannot be incorporated into DNA.
The crystal structure of human dUTPase reveals that each subunit of the dUTPase trimer folds into an eight-stranded jelly-roll beta barrel, with the C-terminal beta strands interchanged among the subunits. The structure is similar to that of the Escherichia coli enzyme, despite low sequence homology between the two enzymes.
Other enzymes that specifically bind uridine, such as deoxycytidine triphosphate (dCTP) deaminase, also belong to this group, suggesting that the signature may recognise a putative uridine-binding motif.
Some retroviruses encode dUTPases. Retroviral dUTPase is synthesised as part of the POL polyprotein, which contains an aspartyl protease, a reverse transcriptase, dUTPase and RNase H.
3-isopropylmalate dehydratase (or isopropylmalate isomerase) catalyses the stereo-specific isomerisation of 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate. This enzyme performs the second step in the biosynthesis of leucine, and is present in most prokaryotes and many fungal species. The prokaryotic enzyme is a heterodimer composed of a large (LeuC) and small (LeuD) subunit, while the fungal form is a monomeric enzyme. Both forms of isopropylmalate isomerase are related and are part of the larger aconitase family. Aconitases are mostly monomeric proteins which share four domains in common and contain a single, labile [4Fe-4S] cluster. Three structural domains (1, 2 and 3) are tightly packed around the iron-sulphur cluster, while a fourth domain (4) forms a deep active-site cleft. The prokaryotic enzyme is encoded by two adjacent genes, leuC and leuD, corresponding to aconitase domains 1-3 and 4 respectively. LeuC does not bind an iron-sulphur cluster. It is thought that some prokaryotic isopropylmalate dehydratases can also function as homoaconitase, converting cis-homoaconitate to homoisocitric acid in lysine biosynthesis. Homoaconitase has been identified in higher fungi (mitochondria), several archaea and one thermophilic species of bacteria, Thermus thermophilus.
Aconitase (aconitate hydratase) is an iron-sulphur protein that contains a [4Fe-4S] cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S] cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also two forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3). Below is a description of some of the multi-functional activities associated with different aconitases.
This entry represents the 'swivel' domain found at the C terminus of eukaryotic mAcn, cAcn/IRP1 and IRP2, and bacterial AcnA. This domain has a three-layer beta/beta/alpha structure, and in cytosolic Acn is known to rotate between the cAcn and IRP1 forms of the enzyme. This domain is also found in the small subunit of isopropylmalate dehydratase (LeuD).
More information about these proteins can be found at Protein of the Month: Aconitase.
This group of hydrolase enzymes is structurally different from the alpha/beta hydrolase family (abhydrolase). This group includes L-2-haloacid dehalogenase, epoxide hydrolases and phosphatases. The structure consists of two domains. One is an inserted four helix bundle, which is the least well conserved region of the alignment, between residues 16 and 96 of HAD1_PSESP. The rest of the fold is composed of the core alpha/beta domain.
O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in 'clans'.
Some members of this family belong to the chitinase class II group which includes chitinase, chitodextrinase and the killer toxin of Kluyveromyces lactis. The chitinases hydrolyse chitin oligosaccharides. The family also includes various glycoproteins from mammals; cartilage glycoprotein and the oviduct-specific glycoproteins are two examples.
Proliferating cell nuclear antigen (PCNA), or cyclin, is a non-histone acidic nuclear protein that plays a key role in the control of eukaryotic DNA replication. It acts as a co-factor for DNA polymerase delta, which is responsible for leading strand DNA replication. The sequence of PCNA is well conserved between plants and animals, indicating a strong selective pressure for structure conservation, and suggesting that this type of DNA replication mechanism is conserved throughout eukaryotes. In Saccharomyces cerevisiae (Baker's yeast), POL30 is associated with polymerase III, the yeast analogue of polymerase delta.
Homologues of PCNA have also been identified in the archaea (Euryarchaeota and Crenarchaeota) and in Paramecium bursaria Chlorella virus 1 (PBCV-1) and in nuclear polyhedrosis viruses.
Adenylosuccinate synthetase plays an important role in purine biosynthesis, by catalysing the GTP-dependent conversion of IMP and aspartic acid to AMP. Adenylosuccinate synthetase has been characterised from various sources ranging from Escherichia coli (gene purA) to vertebrate tissues. In vertebrates, two isozymes are present: one involved in purine biosynthesis and the other in the purine nucleotide cycle.
The crystal structure of adenylosuccinate synthetase from E. coli reveals that the dominant structural element of each monomer of the homodimer is a central beta-sheet of 10 strands. The first nine strands of the sheet are mutually parallel with right-handed crossover connections between the strands. The 10th strand is antiparallel with respect to the first nine strands. In addition, the enzyme has two antiparallel beta-sheets, comprised of two strands and three strands each, 11 alpha-helices and two short 3/10-helices. Further, it has been suggested that the similarities in the GTP-binding domains of the synthetase and the p21ras protein are an example of convergent evolution of two distinct families of GTP-binding proteins. Comparison of the structures of adenylosuccinate synthetase from Triticum aestivum and Arabidopsis thaliana with the known structures from E. coli reveals that the overall fold is very similar to that of the E. coli protein.
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
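The relationship between the linear order of the triad residues and the clan can be illustrated with a minimal Python sketch. The function names and the input format (a mapping from one-letter residue code to sequence position) are assumptions made for this example, and the lookup covers only the three orders quoted above.

```python
# Catalytic-triad orders quoted in the text, keyed by the order in which the
# residues appear along the polypeptide chain (N- to C-terminus).
TRIAD_ORDER_TO_CLAN = {
    "HDS": "PA (chymotrypsin)",
    "DHS": "SB (subtilisin)",
    "SDH": "SC (carboxypeptidase)",
}


def triad_order(positions):
    """Return the triad residues sorted by sequence position, e.g. 'HDS'.

    positions maps a one-letter residue code (H, D, S) to its 1-based
    position in the sequence (hypothetical input format).
    """
    if set(positions) != {"H", "D", "S"}:
        raise ValueError("expected exactly the residues H, D and S")
    return "".join(sorted(positions, key=positions.get))


def guess_clan(positions):
    """Look up the clan suggested by the linear order; None if not listed."""
    return TRIAD_ORDER_TO_CLAN.get(triad_order(positions))
```

With illustrative positions, guess_clan({"H": 57, "D": 102, "S": 195}) gives the PA entry and guess_clan({"D": 32, "H": 64, "S": 221}) gives the SB entry. The order alone is only a hint, since the text notes that it commonly, not always, reflects clan relationships.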
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This entry represents the C-terminal domain of the Escherichia coli LexA protein and the C-terminal domain of the E. coli signal peptidase (SPase). They share the same structural topology, consisting of a complex fold made of several coiled beta-sheets, and containing an SH3-like beta-barrel. This entry is associated with serine peptidases belonging to MEROPS peptidase families S24 (LexA family, clan SF), S26A (signal peptidase I) and S26B (signalase).
The S26 family includes E. coli signal peptidase, SPase, which is a membrane-bound endopeptidase, with two N-terminal transmembrane segments and a C-terminal catalytic region. SPase functions to release proteins that have been translocated into the inner membrane from the cell interior, by cleaving off their signal peptides.
The S24 family includes:
All of these proteins, with the possible exception of RulA, interact with RecA, which activates self-cleavage, either derepressing transcription in the case of CI and LexA or activating the lesion-bypass polymerase in the case of UmuD and MucA. UmuD'2 is the homodimeric component of DNA pol V, which is produced from UmuD by RecA-facilitated self-cleavage. The first 24 N-terminal residues of UmuD are removed; UmuD'2 is a DNA lesion bypass polymerase. MucA, like UmuD, is a plasmid-encoded DNA polymerase (pol RI) which is converted into the active lesion-bypass polymerase by a self-cleavage reaction involving RecA.
This group of proteins also contains proteins not recognised as peptidases as well as those classified as non-peptidase homologues as they either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for catalytic activity.
Inorganic pyrophosphatase (PPase) is the enzyme responsible for the hydrolysis of pyrophosphate (PPi), which is formed principally as the product of the many biosynthetic reactions that utilise ATP. All known PPases require the presence of divalent metal cations, with magnesium conferring the highest activity. Among other residues, a lysine has been postulated to be part of or close to the active site. PPases have been sequenced from bacteria such as Escherichia coli (homohexamer), Bacillus PS3 (Thermophilic bacterium PS-3) and Thermus thermophilus, from the archaebacterium Thermoplasma acidophilum, from fungi (homodimer), from a plant, and from bovine retina. In yeast, a mitochondrial isoform of PPase has been characterised which seems to be involved in energy production and whose activity is stimulated by uncouplers of ATP synthesis.
The sequences of PPases share some regions of similarities, among which is a region that contains three conserved aspartates that are involved in the binding of cations.
Endonuclease III is a DNA repair enzyme which removes a number of damaged pyrimidines from DNA via its glycosylase activity and also cleaves the phosphodiester backbone at apurinic / apyrimidinic sites via a beta-elimination mechanism. The structurally related DNA glycosylase MutY recognises and excises the mutational intermediate 8-oxoguanine-adenine mispair. The 3-D structures of Escherichia coli endonuclease III and the catalytic domain of MutY have been determined. The structures contain two all-alpha domains: a sequence-continuous, six-helix domain (residues 22-132) and a Greek-key, four-helix domain formed by one N-terminal and three C-terminal helices (residues 1-21 and 133-211) together with the [Fe4S4] cluster. The cluster is bound entirely within the C-terminal loop by four cysteine residues with a ligation pattern Cys-(Xaa)6-Cys-(Xaa)2-Cys-(Xaa)5-Cys which is distinct from all other known Fe4S4 proteins. This structural motif is referred to as a [Fe4S4] cluster loop (FCL). Two DNA-binding motifs have been proposed, one at either end of the interdomain groove: the helix-hairpin-helix (HhH) and FCL motifs. The primary role of the iron-sulphur cluster appears to involve positioning conserved basic residues for interaction with the DNA phosphate backbone by forming the loop of the FCL motif.
The HhH-GPD domain gets its name from its hallmark helix-hairpin-helix and Gly/Pro rich loop followed by a conserved aspartate. This domain is found in a diverse range of structurally related DNA repair proteins that include endonuclease III and DNA glycosylase MutY, an A/G-specific adenine glycosylase. Both of these enzymes have a C-terminal iron-sulphur cluster loop (FCL). The methyl-CpG binding protein (MBD4) also contains a related domain that is a thymine DNA glycosylase. The family also includes DNA-3-methyladenine glycosylase II, 8-oxoguanine DNA glycosylases and other members of the AlkA family.
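The cysteine ligation pattern Cys-(Xaa)6-Cys-(Xaa)2-Cys-(Xaa)5-Cys translates directly into a regular expression. The sketch below is one possible way to scan a protein sequence for candidate sites; the function name is hypothetical, and a match indicates only that the spacing is present, not that an [Fe4S4] cluster is actually bound.

```python
import re

# Cys-(Xaa)6-Cys-(Xaa)2-Cys-(Xaa)5-Cys written as a regular expression.
# The lookahead lets overlapping candidates be reported as well.
FCL_PATTERN = re.compile(r"(?=(C.{6}C.{2}C.{5}C))")


def find_fcl_candidates(sequence):
    """Return (1-based start, matched segment) for every candidate site."""
    return [
        (m.start() + 1, m.group(1))
        for m in FCL_PATTERN.finditer(sequence.upper())
    ]
```

Each hit spans 17 residues (four cysteines plus 6 + 2 + 5 spacer positions), so a sequence shorter than that can never match.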
Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.
Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta).
This entry represents the guanine nucleotide exchange domain of the beta (EF-1beta, also known as EF1B-alpha) and delta (EF-1delta, also known as EF1B-beta) chains of EF1B proteins from eukaryotes and archaea. The beta and delta chains have exchange activity, which mainly resides in their homologous guanine nucleotide exchange domains, found in the C-terminal region of the peptides. Their N-terminal regions may be involved in interactions with the gamma chain (EF-1gamma).
More information about these proteins can be found at Protein of the Month: Elongation Factors.
Viruses, parasites and bacteria are covered in protein and sugar molecules that help them gain entry into a host by counteracting the host's defences. One such molecule is the M protein produced by certain streptococcal bacteria. M proteins embody a motif that is now known to be shared by many Gram-positive bacterial surface proteins. The motif includes a conserved hexapeptide, which precedes a hydrophobic C-terminal membrane anchor, which itself precedes a cluster of basic residues. This structure is represented in the following schematic representation:
It has been proposed that this hexapeptide sequence is responsible for a post-translational modification necessary for the proper anchoring of the proteins which bear it to the cell wall.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.
Glutamyl-tRNA synthetase is a class Ic synthetase and shows several similarities with glutaminyl-tRNA synthetase concerning structure and catalytic properties. It is an alpha2 dimer. To date, one crystal structure of a glutamyl-tRNA synthetase (Thermus thermophilus) has been solved. The molecule has the form of a bent cylinder and consists of four domains. The N-terminal half (domains 1 and 2) contains the 'Rossmann fold' typical for class I synthetases and resembles the corresponding part of Escherichia coli GlnRS, whereas the C-terminal half exhibits a GluRS-specific structure.
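The class assignment described above can be captured as a small lookup table. The following Python sketch encodes only the two lists and the hydroxyl preference given in the text; the names CLASS_I, CLASS_II, PREFERRED_HYDROXYL and synthetase_class are chosen for this example, and the subclasses a, b and c are not modelled because the text does not list their members.

```python
# Class membership as listed in the text (three-letter amino acid codes).
CLASS_I = {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
CLASS_II = {"Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"}

# tRNA hydroxyl used (class I) or preferred (class II) for aminoacylation.
PREFERRED_HYDROXYL = {"I": "2'-OH", "II": "3'-OH"}


def synthetase_class(amino_acid):
    """Return 'I' or 'II' for a three-letter amino acid code (any case)."""
    code = amino_acid.strip().capitalize()
    if code in CLASS_I:
        return "I"
    if code in CLASS_II:
        return "II"
    raise ValueError(f"not one of the 20 listed amino acids: {amino_acid!r}")
```

For instance, synthetase_class("Glu") returns "I" and PREFERRED_HYDROXYL[synthetase_class("Ser")] returns "3'-OH".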
This entry represents the core region of arginyl-tRNA synthetase, which has been crystallised; a preliminary X-ray crystallographic analysis of yeast arginyl-tRNA synthetase-yeast tRNAArg complexes is available.
Xeroderma pigmentosum (XP) is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. The skin cells of people with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-G is one of the rarest and most phenotypically heterogeneous XP groups, showing anything from slight to extreme dysfunction in DNA excision repair. XP-G can be corrected by a 133 kDa nuclear protein, XPGC. XPGC is an acidic protein that confers normal UV resistance in expressing cells. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms. XPGC cleaves one strand of the duplex at the border with the single-stranded region.
XPG belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker's yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5'-3' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1, and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C-terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]). It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved.
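The conserved pentapeptide is written above in a PROSITE-like notation, E-A-[DE]-A-[QS]. As a rough sketch, such a pattern can be converted to a regular expression and used to scan a sequence. The converter below is an assumption-laden helper written for this example: it handles only single residues, bracketed alternatives, the wildcard x and simple repeat counts, not the full PROSITE syntax.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regex string.

    Handles single residues (A), alternatives ([DE]), wildcards (x) and
    fixed repeats such as x(5); other PROSITE syntax is not covered.
    """
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"(\[[A-Z]+\]|[A-Zx])(?:\((\d+)\))?", element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, count = m.groups()
        residue = "." if residue == "x" else residue
        parts.append(residue + (f"{{{count}}}" if count else ""))
    return "".join(parts)


def find_pattern(sequence, pattern):
    """Return (1-based start, matched segment) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence.upper())]
```

Here prosite_to_regex("E-A-[DE]-A-[QS]") yields the regex EA[DE]A[QS]. A hit marks a candidate I-region core only; the surrounding 27-residue conserved core would still need to be checked by alignment.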
This entry represents the N-terminal region of XPG.
Ferrochelatase catalyses the last step in haem biosynthesis: the chelation of a ferrous ion to protoporphyrin IX, to form protohaem. In eukaryotic cells, it binds to the mitochondrial inner membrane with its active site on the matrix side of the membrane.
The X-ray structures of Bacillus subtilis and human ferrochelatase have been solved. The human enzyme exists as a homodimer. Each subunit contains one [Fe2S2] cluster. The monomer is folded into two similar domains, each with a four-stranded parallel beta-sheet flanked by an alpha-helix in a beta-alpha-beta motif that is reminiscent of the fold found in the periplasmic binding proteins. The topological similarity between the domains suggests that they have arisen from a gene duplication event. However, significant differences exist between the two domains, including an N-terminal section (residues 80-130) that forms part of the active site pocket, and a C-terminal extension (residues 390-423) that is involved in coordination of the [Fe2S2] cluster and in stabilisation of the homodimer. The [Fe2S2] cluster ligands are Cys196, Cys403, Cys406 and Cys411. Experiments with Co(II) binding show that His230 and Asp383 are part of the enzyme active site.
Ferrochelatase seems to have a structurally conserved core region that is common to the enzyme from bacteria, plants and mammals. Porphyrin binds in the identified cleft; this cleft also includes the metal-binding site of the enzyme. It is likely that the structure of the cleft region will have different conformations upon substrate binding and release.
This group of bacterial and eukaryotic proteins includes both characterised enzymes and sequences related to exoribonuclease II (RNase II) and ribonuclease R, a bacterial 3' --> 5' exoribonuclease homologous to RNase II.
The size of these proteins ranges from 644 residues (rnb) to 1250 (SSD1). While their sequences are highly divergent, they share a conserved domain in their C-terminal section. It is possible that this domain plays a role in the exonuclease function.
Diacylglycerol kinase (DGK) phosphorylates diacylglycerol (DAG) to yield phosphatidic acid. This enzyme initiates resynthesis of phosphoinositides consumed by phospholipase C during cellular signal transduction. Mammalian DGK consists of nine isozymes encoded by separate genes. In addition to PKC-like zinc fingers and catalytic regions commonly conserved in all DGKs, these isozymes contain a variety of regulatory domains of known and/or predicted functions. The mammalian isozymes are named according to the order of their cDNA cloning and are subdivided into five groups based on their characteristic structural features. Each DGK isozyme is a critical downstream component of a DAG-dependent signalling system.
Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family.
This domain is usually associated with an accessory domain.
Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPases) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation. The PTP superfamily can be divided into four subfamilies:
Based on their cellular localisation, PTPases are also classified as:
All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel beta-sheet with flanking alpha-helices containing a beta-loop-alpha-loop that encompasses the PTP signature motif. Functional diversity between PTPases is endowed by regulatory domains and subunits.
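The C(X)5R signature fixes the spacing between the motif's cysteine and arginine, so candidate sites can be located with a short scan. This is a minimal sketch with hypothetical names; the pattern is common enough that a match alone does not identify a PTPase.

```python
import re

# C(X)5R: a cysteine, any five residues, then an arginine.
PTP_SIGNATURE = re.compile(r"(?=(C.{5}R))")


def find_ptp_signatures(sequence):
    """Return one dict per candidate site with 1-based Cys and Arg positions."""
    hits = []
    for m in PTP_SIGNATURE.finditer(sequence.upper()):
        cys = m.start() + 1
        hits.append({"cys": cys, "arg": cys + 6, "segment": m.group(1)})
    return hits
```

On the test string "AAHCSAGAGRSG" the function reports a single site with the cysteine at position 4 and the arginine at position 10.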
This entry represents dual specificity protein-tyrosine phosphatases. Ser/Thr and Tyr dual specificity phosphatases are a group of enzymes with both Ser/Thr and tyrosine-specific protein phosphatase activity, able to remove serine/threonine- or tyrosine-bound phosphate groups from a wide range of phosphoproteins, including a number of enzymes that have been phosphorylated by a kinase. Dual specificity protein phosphatases (DSPs) regulate mitogenic signal transduction and control the cell cycle. The crystal structure of a human DSP, vaccinia H1-related phosphatase (or VHR), has been determined at 2.1 angstrom resolution. A shallow active site pocket in VHR allows for the hydrolysis of phosphorylated serine, threonine, or tyrosine protein residues, whereas the deeper active site of protein tyrosine phosphatases (PTPs) restricts substrate specificity to only phosphotyrosine. Positively charged crevices near the active site may explain the enzyme's preference for substrates with two phosphorylated residues. The VHR structure defines a conserved structural scaffold for both DSPs and PTPs. A "recognition region" connecting helix alpha1 to strand beta1 may determine differences in substrate specificity between VHR, the PTPs, and other DSPs.
These proteins may also have inactive phosphatase domains, and depending on the domain composition this loss of catalytic activity has different effects on protein function. Inactive single domain phosphatases can still specifically bind substrates and protect against dephosphorylation, while the inactive domains of tandem phosphatases can be further subdivided into two classes. Those which bind phosphorylated tyrosine residues may recruit multi-phosphorylated substrates for the adjacent active domains and are more conserved, while the other class has accumulated several variable amino acid substitutions and has completely lost its tyrosine binding capability. The second class shows a release of evolutionary constraint for the sites around the catalytic centre, which emphasises a difference in function from the first group. There is a region of higher conservation common to both classes, suggesting a new regulatory centre.
The PX (phox) domain occurs in a variety of eukaryotic proteins and has been implicated in highly diverse functions such as cell signalling, vesicular trafficking, protein sorting and lipid modification. PX domains are important phosphoinositide-binding modules that have varying lipid-binding specificities. The PX domain is approximately 120 residues long, and folds into a three-stranded beta-sheet followed by three alpha-helices and a proline-rich region that immediately precedes a membrane-interaction loop and spans approximately eight hydrophobic and polar residues. The PX domain of p47phox binds to the SH3 domain in the same protein. Phosphorylation of p47(phox), a cytoplasmic activator of the microbicidal phagocyte oxidase (phox), elicits interaction of p47(phox) with phosphoinositides. The protein phosphorylation-driven conformational change of p47(phox) enables its PX domain to bind to phosphoinositides, the interaction of which plays a crucial role in recruitment of p47(phox) from the cytoplasm to membranes and subsequent activation of the phagocyte oxidase. The lipid-binding activity of this protein is normally suppressed by intramolecular interaction of the PX domain with the C-terminal Src homology 3 (SH3) domain.
The PX domain is conserved from yeast to human. Although multiple alignments of representative PX domain sequences show relatively little sequence conservation, their structure appears to be highly conserved. Although phosphatidylinositol-3-phosphate (PtdIns(3)P) is the primary target of PX domains, binding to phosphatidic acid, phosphatidylinositol-3,4-bisphosphate (PtdIns(3,4)P2), phosphatidylinositol-3,5-bisphosphate (PtdIns(3,5)P2), phosphatidylinositol-4,5-bisphosphate (PtdIns(4,5)P2), and phosphatidylinositol-3,4,5-trisphosphate (PtdIns(3,4,5)P3) has been reported as well. The PX domain is also a protein-protein interaction domain.
Phosphatidylinositol 3-kinase (PI3-kinase) is an enzyme that phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol ring. The usually N-terminal C2 domain interacts mainly with the scaffolding helical domain of the enzyme, and exhibits only minor interactions with the catalytic domain. The domain consists of two four-stranded antiparallel beta-sheets that form a beta-sandwich. The isolated C2 domain binds multilamellar phospholipid vesicles, which suggests that this domain could play a role in membrane association. Membrane attachment by C2 domains is typically mediated by the loops connecting beta-strand regions, which in other C2 domain-containing proteins are calcium-binding regions.
Syntaxins A and B are nervous system-specific proteins implicated in the docking of synaptic vesicles with the presynaptic plasma membrane. Syntaxins are a family of receptors for intracellular transport vesicles. Each target membrane may be identified by a specific member of the syntaxin family. Members of the syntaxin family share several features: a size ranging from 30 kDa to 40 kDa; a highly hydrophobic C-terminal extremity that anchors the protein on the cytoplasmic surface of cellular membranes; and a central, well-conserved region that seems to be in a coiled-coil conformation.
The Drosophila pumilio gene codes for an unusual protein that binds RNA through the Puf domain, which usually occurs as a tandem repeat of eight domains. The FBF-2 protein of Caenorhabditis elegans also has a Puf domain. Both proteins function as translational repressors in early embryonic development by binding sequences in the 3' UTR of target mRNAs. The same type of repetitive domain has been found in a number of other proteins from all eukaryotic kingdoms. The Puf proteins characterised to date have been reported to bind to 3'-untranslated region (UTR) sequences encompassing a so-called UGUR tetranucleotide motif and thereby to repress gene expression by affecting mRNA translation or stability.
In Saccharomyces cerevisiae (Baker's yeast), five proteins, termed Puf1p to Puf5p, bear six to eight Puf repeats. Puf3p binds nearly exclusively to cytoplasmic mRNAs that encode mitochondrial proteins; Puf1p and Puf2p interact preferentially with mRNAs encoding membrane-associated proteins; Puf4p preferentially binds mRNAs encoding nucleolar ribosomal RNA-processing factors; and Puf5p is associated with mRNAs encoding chromatin modifiers and components of the spindle pole body. This suggests the existence of an extensive network of RNA-protein interactions that coordinate the post-transcriptional fate of large sets of cytotopically and functionally related RNAs through each stage of their lifecycle.
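As an illustration of the UGUR motif, the sketch below scans a 3' UTR for the tetranucleotide, reading R as the IUPAC code for a purine (A or G). The function name is hypothetical, and the presence of a UGUR site is at best a starting point: the text says only that reported binding sites encompass the motif.

```python
import re

# R is the IUPAC code for a purine (A or G), so UGUR covers UGUA and UGUG.
UGUR = re.compile(r"(?=(UGU[AG]))")


def find_ugur_sites(utr):
    """Return (1-based start, tetranucleotide) for each UGUR site.

    Accepts RNA or DNA letters in either case; T is read as U.
    """
    rna = utr.upper().replace("T", "U")
    return [(m.start() + 1, m.group(1)) for m in UGUR.finditer(rna)]
```

Because T is converted to U, the same function works on cDNA-style input such as "aatgtaaattgtgcc".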
The CCAAT-binding factor (CBF) is a mammalian transcription factor that binds to a CCAAT motif in the promoters of a wide variety of genes, including type I collagen and albumin. The factor is a heteromeric complex of A and B subunits, both of which are required for DNA-binding. The subunits can interact in the absence of DNA-binding, conserved regions in each being important in mediating this interaction.
The A subunit can be split into 3 domains on the basis of sequence similarity: a non-conserved N-terminal 'A domain'; a highly-conserved central 'B domain' involved in DNA-binding; and a C-terminal 'C domain', which contains a number of glutamine and acidic residues involved in protein-protein interactions. The A subunit shows striking similarity to the HAP3 subunit of the yeast CCAAT-binding heterotrimeric transcription factor. The Kluyveromyces lactis HAP3 protein has been predicted to contain a 4-cysteine zinc finger, which is thought to be present in similar HAP3 and CBF subunit A proteins, in which the third cysteine is replaced by a serine. This domain is found in the CCAAT transcription factor and archaeal histones.
All organisms require reduced folate cofactors for the synthesis of a variety of metabolites. Most microorganisms must synthesize folate de novo because they lack the active transport system of higher vertebrate cells that allows these organisms to use dietary folates. Proteins containing this domain include dihydropteroate synthase as well as a group of methyltransferase enzymes including methyltetrahydrofolate:corrinoid iron-sulphur protein methyltransferase (MeTr), which catalyses a key step in the Wood-Ljungdahl pathway of carbon dioxide fixation.
Dihydropteroate synthase (DHPS) catalyses the condensation of 6-hydroxymethyl-7,8-dihydropteridine pyrophosphate to para-aminobenzoic acid to form 7,8-dihydropteroate. This is the second step in the three-step pathway leading from 6-hydroxymethyl-7,8-dihydropterin to 7,8-dihydrofolate. DHPS is the target of sulphonamides, which are substrate analogues that compete with para-aminobenzoic acid. Bacterial DHPS (gene sul or folP) is a protein of about 275 to 315 amino acid residues that is either chromosomally encoded or found on various antibiotic resistance plasmids. In the lower eukaryote Pneumocystis carinii, DHPS is the C-terminal domain of a multifunctional folate synthesis enzyme (gene fas).
Proteins resident in the lumen of the endoplasmic reticulum (ER) contain a C-terminal tetrapeptide, commonly known as Lys-Asp-Glu-Leu (KDEL) in mammals and His-Asp-Glu-Leu (HDEL) in yeast (Saccharomyces cerevisiae) that acts as a signal for their retrieval from subsequent compartments of the secretory pathway. The receptor for this signal is a ~26 kDa Golgi membrane protein, initially identified as the ERD2 gene product in S. cerevisiae. The receptor molecule, known variously as the ER lumen protein retaining receptor or the 'KDEL receptor', is believed to cycle between the cis side of the Golgi apparatus and the ER. It has also been characterised in a number of other species, including plants, Plasmodium, Drosophila and mammals. In mammals, 2 highly related forms of the receptor are known.
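Since the retrieval signal is defined as a C-terminal tetrapeptide, checking for it amounts to inspecting the last four residues. The sketch below covers only the two variants named in the text (KDEL and HDEL); the function and constant names are assumptions for this example.

```python
# C-terminal retrieval signals named in the text.
ER_RETRIEVAL_SIGNALS = ("KDEL", "HDEL")


def er_retrieval_signal(sequence):
    """Return the C-terminal retrieval tetrapeptide, or None if absent."""
    tail = sequence.strip().upper()[-4:]
    return tail if tail in ER_RETRIEVAL_SIGNALS else None
```

The tetrapeptide only counts at the very C-terminus, so er_retrieval_signal("KDELMAAA") returns None even though KDEL occurs in the sequence.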
The KDEL receptor is a highly hydrophobic protein of 220 residues; its sequence exhibits 7 hydrophobic regions, all of which have been suggested to traverse the membrane. More recently, however, it has been suggested that only 6 of these regions are transmembrane (TM), resulting in both N- and C-termini on the cytoplasmic side of the membrane.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
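The difference between the loose HEXXH motif and the stricter 'abXHEbbHbc' definition can be sketched with two regular expressions. The residue sets below are assumptions made for this example, not definitions from the text: 'a' is restricted to V or T (the text says only "most often"), 'b' is taken as any standard residue other than D, E, K and R, and 'c' as a commonly used hydrophobic set. The proline exclusion is not encoded because the text does not say which position it applies to.

```python
import re

# Assumed residue classes (see the lead-in); adjust to taste.
A_SITE = "VT"                      # 'a': most often valine or threonine
UNCHARGED = "ACFGHILMNPQSTVWY"     # 'b': everything except D, E, K, R
HYDROPHOBIC = "ACFILMVWY"          # 'c': an assumed hydrophobic set

LOOSE = re.compile(r"(?=(HE..H))")
STRICT = re.compile(
    rf"(?=([{A_SITE}][{UNCHARGED}].HE[{UNCHARGED}]{{2}}H[{UNCHARGED}][{HYDROPHOBIC}]))"
)


def find_hexxh(sequence, strict=False):
    """Return (1-based start, matched segment) for each motif occurrence.

    strict=False searches for HEXXH; strict=True for abXHEbbHbc.
    """
    pattern = STRICT if strict else LOOSE
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence.upper())]
```

On the test string "GGVVAHELTHAVGG" both searches succeed, whereas "GGKDAHELTHKEGG" passes the loose search but fails the strict one because its flanking residues are charged.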
This group of metallopeptidases belongs to MEROPS peptidase family M22 (clan MK), the type example being O-sialoglycoprotein endopeptidase from Pasteurella haemolytica (Mannheimia haemolytica).
O-Sialoglycoprotein endopeptidase is secreted by the bacterium P. haemolytica, and digests only proteins that are heavily sialylated, in particular those with sialylated serine and threonine residues. Substrate proteins include glycophorin A and leukocyte surface antigens CD34, CD43, CD44 and CD45. Removal of glycosylation, by treatment with neuraminidase, completely negates susceptibility to O-sialoglycoprotein endopeptidase digestion.
Sequence similarity searches have revealed other members of the M22 family, from yeast, Mycobacterium, Haemophilus influenzae and the cyanobacterium Synechocystis. The zinc-binding and catalytic residues of this family have not been determined, although the motif HMEGH may be a zinc-binding region.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of:
These proteins have about 200 amino acid residues.
Members of this family are large subunit ribosomal proteins which are found in the Eukaryota and Archaea. These proteins have 115 to 187 amino-acid residues. The family consists of:
Ribosomal protein L21 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L21 is known to bind to the 23S rRNA in the presence of L20. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:
Bacterial L21 is a protein of about 100 amino-acid residues; the mature form of the spinach chloroplast L21 has 200 residues.
The ribosomal L28 protein family includes proteins from bacteria and chloroplasts. The L24 protein from yeast, found in the large subunit of the mitochondrial ribosome, contains a region similar to the bacterial L28 protein.
Ribosomal protein L29 is one of the proteins from the large ribosomal subunit. L29 belongs to a family of ribosomal proteins of 63 to 138 amino-acid residues which, on the basis of sequence similarities, groups:
A number of eukaryotic and archaebacterial ribosomal proteins can be grouped in this family of ribosomal proteins, S17e. They include vertebrate, Drosophila and Neurospora crassa (crp-3) S17 proteins, as well as yeast S17a (RP51A) and S17b (RP51B) and archaebacterial S17e.
Mammalian translationally controlled tumour protein (TCTP) (or P23) is a protein which has been found to be preferentially synthesised in cells during the early growth phase of some types of tumour, but which is also expressed in normal cells. The physiological function of TCTP is still not known. It was first identified as a histamine-releasing factor, acting in IgE-dependent allergic reactions. In addition, TCTP has been shown to bind to tubulin in the cytoskeleton, has a high affinity for calcium, is the binding target for the antimalarial compound artemisinin, and is induced in vitamin D-dependent apoptosis. TCTP production is thought to be controlled at the translational as well as the transcriptional level.
TCTP is a hydrophilic protein of 18 to 20 Kd. TCTPs do not share significant sequence similarity with any other class of proteins. Recently, the structure of TCTP was determined and exhibited significant structural similarity to the human protein Mss4, which is a guanine nucleotide-free chaperone of the Rab protein. Close homologues have been found in plants, earthworm, Caenorhabditis elegans (F52H2.11), Hydra, Saccharomyces cerevisiae (YKL056c) and Schizosaccharomyces pombe (SpAC1F12.02c).
Pathogenesis-related genes transcriptional activator binds to the GCC-box pathogenesis-related promoter element and activates the plant's defence genes. Ethylene, chemically the simplest plant hormone, participates in a number of stress responses and developmental processes: e.g., fruit ripening, inhibition of stem and root elongation, promotion of seed germination and flowering, senescence of leaves and flowers, and sex determination. DNA sequence elements that confer ethylene responsiveness have been shown to contain two 11bp GCC boxes, which are necessary and sufficient for transcriptional control by ethylene. Ethylene responsive element binding proteins (EREBPs) have now been identified in a variety of plants. The proteins share a similar domain of around 59 amino acids, which interacts directly with the GCC box in the ERE.
The SET domain generally appears as one part of a larger multidomain protein. Three structures of very different proteins with distinct domain compositions have recently been described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs, which methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although electron density maps in all three studies revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical AdoMet-dependent methyltransferase fold. A tyrosine that is strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, the Rubisco LSMT structure shows AdoMet apparently binding in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure.
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTases) differ both in their substrate specificity for the various acceptor lysines and in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone target and the roles of the invariant residues in the SET domain in determining the methylation specificities.
The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These nine cysteines coordinate three zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together two long segments of random coils.
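The counts given above (nine cysteines, three zinc ions, four cysteine ligands per zinc) only add up if some cysteines are shared between zinc ions. The following is a minimal sketch of that bookkeeping, assuming each cysteine binds either one zinc ion or exactly two; the function name and this one-or-two assumption are illustrative and not taken from the source.

```python
def bridging_cysteines(n_zinc, ligands_per_zinc, n_cysteines):
    """Return how many cysteines must bridge two zinc ions.

    Assumption (not from the source): every cysteine binds either one
    zinc ion (terminal) or exactly two zinc ions (bridging). Then
    terminal + 2 * bridging = n_zinc * ligands_per_zinc, and
    terminal + bridging = n_cysteines.
    """
    bridging = n_zinc * ligands_per_zinc - n_cysteines
    if not 0 <= bridging <= n_cysteines:
        raise ValueError("counts are inconsistent with the one-or-two-zinc assumption")
    return bridging


# Pre-SET cluster as described: 3 zinc ions, 4 cysteine ligands each, 9 cysteines.
n_bridging = bridging_cysteines(n_zinc=3, ligands_per_zinc=4, n_cysteines=9)
n_terminal = 9 - n_bridging
print(n_bridging, n_terminal)
```

Under this assumption, three cysteines would be shared between pairs of zinc ions and six would each bind a single zinc, which fits a triangular arrangement with one shared ligand per edge.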
The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes it.
This family includes L18 from bacteria and L5 from eukaryotes. The ribosomal 5S RNA is the only known rRNA species to bind a ribosomal protein before its assembly into the ribosomal subunits. In eukaryotes, the 5S rRNA molecule binds one protein species, a 34-kDa protein which has been implicated in the intracellular transport of 5S rRNA, while in bacteria it binds two or three different protein species.
Xeroderma pigmentosum (XP) is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. The skin cells of people with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-G is one of the rarest and most phenotypically heterogeneous of the XP groups, showing anything from slight to extreme dysfunction in DNA excision repair. XP-G can be corrected by a 133 Kd nuclear protein, XPGC. XPGC is an acidic protein that confers normal UV resistance in expressing cells. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms. XPGC cleaves one strand of the duplex at the border with the single-stranded region.
XPG belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker's yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5'-3' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1, and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C-terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]). It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved.
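The conserved pentapeptide is written above in a PROSITE-like notation, where hyphens separate positions and square brackets list the residues allowed at one position. As a rough illustration of how such a pattern could be searched for in a sequence, the sketch below converts it to a regular expression. The helper names and the toy sequence are hypothetical, and the converter handles only plain residues, bracketed alternatives and `x` wildcards, not the full PROSITE syntax.

```python
import re


def prosite_to_regex(pattern):
    """Convert a simple PROSITE-style pattern to a regex string.

    Handles only single residues, [..] alternatives and 'x' wildcards;
    repeats, exclusions and anchors are deliberately not supported.
    """
    return "".join("." if tok.lower() == "x" else tok for tok in pattern.split("-"))


# Pentapeptide as given in the text: E-A-[DE]-A-[QS]
PENTAPEPTIDE = re.compile(prosite_to_regex("E-A-[DE]-A-[QS]"))


def find_pentapeptide(sequence):
    """Return (1-based start, matched residues) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in PENTAPEPTIDE.finditer(sequence.upper())]


# Hypothetical toy sequence, not a real protein.
toy = "GGSEAEAQLLKEADASPP"
print(find_pentapeptide(toy))
```

A match of a five-residue pattern is only suggestive on its own; in practice it would be read in the context of the surrounding conserved I-region.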
DNA photolyases are enzymes that bind to DNA containing pyrimidine dimers: on absorption of visible light, they catalyse dimer splitting into the constituent monomers, a process called photoreactivation. This is a DNA repair mechanism, repairing mismatched pyrimidine dimers induced by exposure to ultra-violet light. The precise mechanisms involved in substrate binding, conversion of light energy to the mechanical energy needed to rupture the cyclobutane ring, and subsequent release of the product are uncertain. Analysis of DNA photolyases has revealed the presence of an intrinsic chromophore, all monomers containing a reduced FAD moiety, and, in addition, either a reduced pterin or 8-hydroxy-5-deazaflavin as a second chromophore. Either chromophore may act as the primary photon acceptor, peak absorptions occurring in the blue region of the spectrum and in the UV-B region, at a wavelength around 290 nm.
This domain binds a light harvesting cofactor.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
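To show why the loose HEXXH motif is described as relatively common, here is a minimal sketch of scanning a sequence for it with a regular expression. The function name and the toy sequence are hypothetical. The stricter 'abXHEbbHbc' form is not encoded, because that would require committing to exact residue sets for 'uncharged' and 'hydrophobic', which the text does not specify.

```python
import re

# Loose motif from the text: His-Glu-X-X-His, with X standing for any residue.
# The lookahead lets overlapping occurrences be reported as well.
HEXXH = re.compile(r"(?=(HE..H))")


def find_hexxh(sequence):
    """Return (1-based start, matched residues) for every HEXXH occurrence."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in HEXXH.finditer(seq)]


# Hypothetical toy sequence, not a real metalloprotease.
toy = "MKTAVVHELGHALGLSHSND"
print(find_hexxh(toy))
```

Because only three of the five positions are fixed, hits in unrelated proteins are expected by chance, which is the motivation for the more stringent definition given above.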
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of metallopeptidases belongs to the MEROPS peptidase family M17 (leucyl aminopeptidase family, clan MF), the type example being leucyl aminopeptidase from Bos taurus (Bovine).
Aminopeptidases are exopeptidases involved in the processing and regular turnover of intracellular proteins, although their precise role in cellular metabolism is unclear. Leucine aminopeptidases cleave leucine residues from the N-terminus of polypeptide chains, but substantial rates are evident for all amino acids.
The enzymes exist as homo-hexamers, comprising 2 trimers stacked on top of one another. Each monomer binds 2 zinc ions and folds into 2 alpha/beta-type quasi-spherical globular domains, producing a comma-like shape. The N-terminal 150 residues form a 5-stranded beta-sheet with 4 parallel and 1 anti-parallel strand sandwiched between 4 alpha-helices. An alpha-helix extends into the C-terminal domain, which comprises a central 8-stranded saddle-shaped beta-sheet sandwiched between groups of helices, forming the monomer hydrophobic core. A 3-stranded beta-sheet resides on the surface of the monomer, where it interacts with other members of the hexamer. The 2 zinc ions and the active site are entirely located in the C-terminal catalytic domain.
Acyl-CoA-binding protein (ACBP) is a small (10 Kd) protein that binds medium- and long-chain acyl-CoA esters with very high affinity and may function as an intracellular carrier of acyl-CoA esters. ACBP is also known as diazepam binding inhibitor (DBI) or endozepine (EP) because of its ability to displace diazepam from the benzodiazepine (BZD) recognition site located on the GABA type A receptor. It is therefore possible that this protein also acts as a neuropeptide to modulate the action of the GABA receptor.
ACBP is a highly conserved protein of about 90 residues that is found in all four eukaryotic kingdoms, Animalia, Plantae, Fungi and Protista, and in some eubacterial species.
Although ACBP occurs as a completely independent protein, intact ACB domains have been identified in a number of large, multifunctional proteins in a variety of eukaryotic species. These include large membrane-associated proteins with N-terminal ACB domains, multifunctional enzymes with both ACB and peroxisomal Delta(3),Delta(2)-enoyl-CoA isomerase domains, and proteins with both an ACB domain and ankyrin repeats.
The ACB domain consists of four alpha-helices arranged in a bowl shape with a highly exposed acyl-CoA-binding site. The ligand is bound through specific interactions with residues on the protein, most notably several conserved positive charges that interact with the phosphate group on the adenosine-3'-phosphate moiety, and the acyl chain is sandwiched between the hydrophobic surfaces of CoA and the protein.
Other proteins containing an ACB domain include:
Cullins are a family of hydrophobic proteins that act as scaffolds for ubiquitin ligases (E3). Cullins are found throughout eukaryotes. Humans express seven cullins (Cul1, 2, 3, 4A, 4B, 5 and 7), each forming part of a multi-subunit ubiquitin complex. Cullin-RING ubiquitin ligases (CRLs), such as Cul1 (SCF), play an essential role in targeting proteins for ubiquitin-mediated destruction; as such, they are diverse in terms of composition and function, regulating many different processes from glucose sensing and DNA replication to limb patterning and circadian rhythms. The catalytic core of CRLs consists of a RING protein and a cullin family member. For Cul1, the C-terminal cullin-homology domain binds the RING protein. The RING protein appears to function as a docking site for ubiquitin-conjugating enzymes (E2s). Other proteins contain a cullin-homology domain, such as the APC2 subunit of the anaphase-promoting complex/cyclosome and the p53 cytoplasmic anchor PARC; both APC2 and PARC have ubiquitin ligase activity. The N-terminal region of cullins is more variable, and is used to interact with specific adaptor proteins.
This entry represents the N-terminal region of cullin proteins, which consists of several domains, including a cullin repeat domain, a 4-helical bundle domain, an alpha+beta domain, and a winged helix-like domain.
Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.
Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta).
This entry represents the C-terminal dimerisation domain found primarily in EF-Tu (EF1A) proteins from bacteria, mitochondria and chloroplasts.
More information about these proteins can be found at Protein of the Month: Elongation Factors.
In bacteria, two distinct membrane-bound enzyme complexes are responsible for the interconversion of fumarate and succinate: fumarate reductase (Frd) is used in anaerobic growth, and succinate dehydrogenase (Sdh) is used in aerobic growth. Both complexes consist of two main components: a membrane-extrinsic component composed of a FAD-binding flavoprotein and an iron-sulphur protein; and a hydrophobic component composed of a membrane anchor protein and/or a cytochrome B.
In eukaryotes, mitochondrial succinate dehydrogenase (ubiquinone) is an enzyme composed of two subunits: a FAD flavoprotein and an iron-sulphur protein.
The flavoprotein subunit is a protein of about 60 to 70 Kd to which FAD is covalently bound to a histidine residue which is located in the N-terminal section of the protein. The sequence around that histidine is well conserved in Frd and Sdh from various bacterial and eukaryotic species.
This family includes members that bind FAD such as the flavoprotein subunits from succinate and fumarate dehydrogenase, aspartate oxidase and the alpha subunit of adenylylsulphate reductase.
The catalytic domain of the ubiquitin-activating enzyme family shares significant similarity with a large family of NAD/FAD-binding proteins. This domain is based on the common NAD/FAD-binding fold and finds members of several families, including UBA ubiquitin-activating enzymes; the hesA/moeB/thiF family; NADH peroxidases; the LDH family; sarcosine oxidase; phytoene dehydrogenases; alanine dehydrogenases; hydroxyacyl-CoA dehydrogenases and many other NAD/FAD-dependent dehydrogenases and oxidases.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families includes yeast S7 (YS6); archaeal S4e; and mammalian and plant cytoplasmic S4. Two highly similar isoforms of mammalian S4 exist, one coded by a gene on chromosome Y, and the other on chromosome X. These proteins have 233 to 264 amino acids.
This entry represents the central region of these proteins.
A number of proteins involved in the transport of sulphate across a membrane, as well as some as yet uncharacterised proteins, have been shown to be evolutionarily related. These proteins are:
These proteins are highly hydrophobic and seem to contain about 12 transmembrane domains.
Although apparently functionally unrelated, intracellular TRAFs and extracellular meprins share a conserved region of about 180 residues, the meprin and TRAF homology (MATH) domain. Meprins are mammalian tissue-specific metalloendopeptidases of the astacin family implicated in developmental, normal and pathological processes by hydrolysing a variety of proteins. Various growth factors, cytokines, and extracellular matrix proteins are substrates for meprins. They are composed of five structural domains: an N-terminal endopeptidase domain, a MAM domain, a MATH domain, an EGF-like domain and a C-terminal transmembrane region. Meprin A and B form membrane-bound homotetramers, whereas homooligomers of meprin A are secreted. A proteolytic site adjacent to the MATH domain, present only in meprin A, allows the release of the protein from the membrane.
TRAF proteins were first isolated by their ability to interact with TNF receptors. They promote cell survival by the activation of downstream protein kinases and, finally, transcription factors of the NF-kB and AP-1 family. The TRAF proteins are composed of 3 structural domains: a RING finger in the N-terminal part of the protein, one to seven TRAF zinc fingers in the middle and the MATH domain in the C-terminal part. The MATH domain is necessary and sufficient for self-association and receptor interaction. From the structural analysis two consensus sequences recognized by the TRAF domain have been defined: a major one, [PSAT]x[QE]E, and a minor one, PxQxxD.
The structure of the TRAF2 protein reveals a trimeric self-association of the MATH domain. The domain forms a new, eight-stranded antiparallel beta sandwich structure. A coiled-coil region adjacent to the MATH domain is also important for the trimerisation. The oligomerisation is essential for establishing appropriate connections to form signalling complexes with TNF receptor-1. The ligand binding surface of TRAF proteins is located in beta-strands 6 and 7.
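The two consensus sequences recognized by the TRAF domain, [PSAT]x[QE]E and PxQxxD, are written in a PROSITE-like notation in which "x" stands for any residue. As a minimal sketch, they can be translated into regular expressions and used to scan a peptide for candidate sites. The function name `find_motifs` and the example peptide are hypothetical, chosen only for illustration; a pattern match alone does not show that a sequence actually binds a TRAF protein.

```python
import re

# Consensus sequences from the text, with "x" (any residue) written as ".".
# The lookahead wrapper (?=(...)) lets overlapping candidate sites be reported.
MAJOR_MOTIF = re.compile(r"(?=([PSAT].[QE]E))")   # [PSAT]x[QE]E
MINOR_MOTIF = re.compile(r"(?=(P.Q..D))")         # PxQxxD


def find_motifs(sequence):
    """Return sorted (start, matched_text, motif_name) tuples.

    Start positions are 0-based offsets into the sequence.
    """
    seq = sequence.upper()
    hits = []
    for name, pattern in (("major", MAJOR_MOTIF), ("minor", MINOR_MOTIF)):
        for match in pattern.finditer(seq):
            hits.append((match.start(), match.group(1), name))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical peptide, invented for illustration (not a real receptor tail).
    peptide = "AAPVQETLPEQKLDGG"
    for start, text, name in find_motifs(peptide):
        print(f"{name} motif {text} at position {start}")
```

Reporting 0-based offsets keeps the output consistent with Python slicing; add 1 to each start position to obtain conventional residue numbering.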
This entry represents an N-terminal domain found in a family of proteins defined by sequence similarity. Most of these proteins are not yet characterised, but those that are include
Mechanosensitive (MS) channels provide protection against hypo-osmotic shock, responding both to stretching of the cell membrane and to membrane depolarisation. They are present in the membranes of organisms from the three domains of life: bacteria, archaea, and eukarya. There are two families of MS channels: large-conductance MS channels (MscL) and small-conductance MS channels (MscS or YGGB). The pressure threshold for MscS opening is 50% that of MscL. The MscS family is much larger and more variable in size and sequence than the MscL family. Much of the diversity in MscS proteins occurs in the size of the transmembrane regions, which ranges from three to eleven transmembrane helices, although the three C-terminal helices are conserved. This family contains sequences from the MscS family of proteins.
MscS folds as a homo-heptamer with a cylindrical shape, and can be divided into transmembrane and extramembrane regions: an N-terminal periplasmic region, a transmembrane region, and a C-terminal cytoplasmic region (middle and C-terminal domains). The transmembrane region forms a channel through the membrane that opens into a chamber enclosed by the extramembrane portion, the latter connecting to the cytoplasm through distinct portals.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.
AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.
This entry represents the C-terminal domain of the mu subunit from various clathrin adaptors (AP1, AP2 and AP3). The C-terminal domain has an immunoglobulin-like beta-sandwich fold consisting of 9 strands in 2 sheets with a Greek key topology, similar to that found in cytochrome f and certain transcription factors. The mu subunit regulates the coupling of clathrin lattices with particular membrane proteins by self-phosphorylation via a mechanism that is still unclear. The mu subunit possesses a highly conserved N-terminal domain of around 230 amino acids, which may be the region of interaction with other AP proteins; a linker region of between 10 and 42 amino acids; and a less well-conserved C-terminal domain of around 190 amino acids, which may be the site of specific interaction with the protein being transported in the vesicle.
More information about these proteins can be found at Protein of the Month: Clathrin.
A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of mammalian, Trypanosoma brucei, Caenorhabditis elegans and fungal L44, and Haloarcula marismortui LA.
DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.
RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:
This is a family of single-chain polymerases, which are evolutionarily related to one another and to the T3/T7 bacteriophage polymerases.
This entry represents a conserved region found in a family of UDP-GlcNAc/MurNAc: polyisoprenol-P GlcNAc/MurNAc-1-P transferases. Members of the family include eukaryotic N-acetylglucosamine-1-phosphate transferases, which catalyse the conversion of UDP-N-acetyl-D-glucosamine and dolichyl phosphate to UMP and N-acetyl-D-glucosaminyl-diphosphodolichol in the glycosylation pathway; and bacterial phospho-N-acetylmuramoyl-pentapeptide-transferases, which catalyse the first step of the lipid cycle reactions in the biosynthesis of cell wall peptidoglycan.
It is thought that nucleosome assembly proteins (NAPs) act as histone chaperones, shuttling both core and linker histones from their site of synthesis in the cytoplasm to the nucleus. The proteins may be involved in regulating gene expression and therefore cellular differentiation.
The centrosomal protein c-Nap1, also known as Cep250, has been implicated in the cell-cycle-regulated cohesion of microtubule-organizing centres. This 281 kDa protein consists mainly of domains predicted to form coiled coil structures. The C-terminal region defines a novel histone-binding domain that is responsible for targeting CNAP1, and possibly condensin, to mitotic chromosomes. During interphase, C-Nap1 localizes to the proximal ends of both parental centrioles, but it dissociates from these structures at the onset of mitosis. Re-association with centrioles then occurs in late telophase or at the very beginning of G1 phase, when daughter cells are still connected by post-mitotic bridges. Electron microscopic studies performed on isolated centrosomes suggest that a proteinaceous linker connects parental centrioles and C-Nap1 may be part of a linker structure that assures the cohesion of duplicated centrosomes during interphase, but that is dismantled upon centrosome separation at the onset of mitosis.
Synaptobrevin is an intrinsic membrane protein of small synaptic vesicles, specialised secretory organelles of neurons that actively accumulate neurotransmitters and participate in their calcium-dependent release by exocytosis. Vesicle function is mediated by proteins in their membranes, although the precise nature of the protein-protein interactions underlying this is still uncertain. Synaptobrevin may play a role in the molecular events underlying neurotransmitter release and vesicle recycling and may be involved in the regulation of membrane flow in the nerve terminal, a process mediated by interaction with low molecular weight GTP-binding proteins. Synaptic vesicle-associated membrane proteins (VAMPs) from Torpedo californica (Pacific electric ray) and SNC1 from yeast are related to synaptobrevin.
The amidotransferase family of enzymes utilises the ammonia derived from the hydrolysis of glutamine for a subsequent chemical reaction catalyzed by the same enzyme. The ammonia intermediate does not dissociate into solution during the chemical transformations. GMP synthetase is a glutamine amidotransferase from the de novo purine biosynthetic pathway. The C-terminal domain is specific to the GMP synthases. In prokaryotes this domain mediates dimerisation. Eukaryotic GMP synthases are monomers. This domain in eukaryotes includes several large insertions that may form globular domains.
Adenosine deaminase catalyzes the hydrolytic deamination of adenosine into inosine and AMP deaminase catalyzes the hydrolytic deamination of AMP into IMP. It has been shown that these two enzymes share three regions of sequence similarities; these regions are centred on residues which are proposed to play an important role in the catalytic mechanism of these two enzymes.
These sequences contain an oxidoreductase FAD-binding domain.
To date, the 3D-structures of the flavoprotein domain of Zea mays (Maize) nitrate reductase and of pig NADH:cytochrome b5 reductase have been solved. The overall fold is similar to that of ferredoxin:NADP+ reductase: the FAD-binding domain (N-terminal) has the topology of an anti-parallel beta-barrel, while the NAD(P)-binding domain (C-terminal) has the topology of a classical pyridine dinucleotide-binding fold (i.e. a central parallel beta-sheet flanked by 2 helices on each side).
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils.
Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions, domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA.
Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.
This entry represents the C-terminal region (C-terminal part of domain 2) of subunit B found in topoisomerase II (gyrB) and topoisomerase IV (parE), which are primarily of bacterial origin. It does not include the topoisomerase II enzymes composed of a single polypeptide, as are found in most eukaryotes. This region is involved in subunit interaction, which accounts for the difference between subunit B and single polypeptide topoisomerase II.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
Carbamoyl phosphate synthase (CPSase) is a heterodimeric enzyme composed of a small and a large subunit (with the exception of CPSase III, see below). CPSase catalyses the synthesis of carbamoyl phosphate from bicarbonate, ATP and glutamine or ammonia, and represents the first committed step in pyrimidine and arginine biosynthesis in prokaryotes and eukaryotes, and in the urea cycle in most terrestrial vertebrates. CPSase has three active sites, one in the small subunit and two in the large subunit. The small subunit contains the glutamine binding site and catalyses the hydrolysis of glutamine to glutamate and ammonia. The large subunit has two homologous carboxy phosphate domains, both of which have ATP-binding sites; however, the N-terminal carboxy phosphate domain catalyses the phosphorylation of bicarbonate, while the C-terminal domain catalyses the phosphorylation of the carbamate intermediate. The carboxy phosphate domain found duplicated in the large subunit of CPSase is also present as a single copy in the biotin-dependent enzymes acetyl-CoA carboxylase (ACC), propionyl-CoA carboxylase (PCCase), pyruvate carboxylase (PC) and urea carboxylase.
Most prokaryotes carry one form of CPSase that participates in both arginine and pyrimidine biosynthesis; however, certain bacteria can have separate forms. The large subunit in bacterial CPSase has four structural domains: the carboxy phosphate domain 1, the oligomerisation domain, the carbamoyl phosphate domain 2 and the allosteric domain. CPSase heterodimers from Escherichia coli contain two molecular tunnels: an ammonia tunnel and a carbamate tunnel. These inter-domain tunnels connect the three distinct active sites, and function as conduits for the transport of unstable reaction intermediates (ammonia and carbamate) between successive active sites. The catalytic mechanism of CPSase involves the diffusion of carbamate through the interior of the enzyme from the site of synthesis within the N-terminal domain of the large subunit to the site of phosphorylation within the C-terminal domain.
Eukaryotes have two distinct forms of CPSase: a mitochondrial enzyme (CPSase I) that participates in both arginine biosynthesis and the urea cycle; and a cytosolic enzyme (CPSase II) involved in pyrimidine biosynthesis. CPSase II occurs as part of a multi-enzyme complex along with aspartate transcarbamoylase and dihydroorotase; this complex is referred to as the CAD protein. The hepatic expression of CPSase is transcriptionally regulated by glucocorticoids and/or cAMP. There is a third form of the enzyme, CPSase III, found in fish, which uses glutamine as a nitrogen source instead of ammonia. CPSase III is closely related to CPSase I, and is composed of a single polypeptide that may have arisen from gene fusion of the glutaminase and synthetase domains.
This entry represents the N-terminal domain of the small subunit of carbamoyl phosphate synthase. The small subunit catalyses the hydrolysis of glutamine to ammonia, which is in turn used by the large chain to synthesize carbamoyl phosphate. The small subunit has a 3-layer beta/beta/alpha structure, and is thought to be mobile in most proteins that carry it. The C-terminal domain of the small subunit of CPSase has glutamine amidotransferase activity.
Sec1-like molecules have been implicated in a variety of eukaryotic vesicle transport processes including neurotransmitter release by exocytosis. They regulate vesicle transport by binding to a t-SNARE from the syntaxin family. This process is thought to prevent SNARE complex formation, a protein complex required for membrane fusion. Whereas Sec1 molecules are essential for neurotransmitter release and other secretory events, their interaction with syntaxin molecules seems to represent a negative regulatory step in secretion.
The mechanism of REP-1-mediated membrane association of Rab5 is similar to that mediated by Rab GDP dissociation inhibitor (GDI). REP-1 and Rab GDI also share other functional properties, including the ability to inhibit the release of GDP and to remove Rab proteins from membranes.
The crystal structure of the bovine alpha-isoform of Rab GDI has been determined to a resolution of 1.81 Å. The protein is composed of two main structural units: a large complex multi-sheet domain I, and a smaller alpha-helical domain II.
The structural organisation of domain I is closely related to FAD-containing monooxygenases and oxidases. Conserved regions common to GDI and the choroideraemia gene product, which delivers Rab to catalytic subunits of Rab geranylgeranyltransferase II, are clustered on one face of the domain. The two most conserved regions form a compact structure at the apex of the molecule; site-directed mutagenesis has shown these regions to play a critical role in the binding of Rab proteins.
Sodium proton exchangers (NHEs) constitute a large family of integral membrane protein transporters that are responsible for the counter-transport of protons and sodium ions across lipid bilayers. These proteins are found in organisms across all domains of life. In archaea, bacteria, yeast and plants, these exchangers provide increased salt tolerance by removing sodium in exchange for extracellular protons. In mammals they participate in the regulation of cell pH, volume, and intracellular sodium concentration, as well as in the reabsorption of NaCl across renal, intestinal, and other epithelia. Human NHE is also involved in heart disease, cell growth and cell differentiation. The removal of intracellular protons in exchange for extracellular sodium effectively eliminates excess acid from actively metabolising cells. In mammalian cells, NHE activity is found in both the plasma membrane and inner mitochondrial membrane. To date, nine mammalian isoforms have been identified (designated NHE1-NHE9). These exchangers are highly regulated (glyco)phosphoproteins, which, based on their primary structure, appear to contain 10-12 membrane-spanning regions (M) at the N-terminus and a large cytoplasmic region at the C-terminus. The transmembrane regions M3-M12 share identity with other members of the family. The M6 and M7 regions are highly conserved and are therefore thought to be involved in the transport of sodium and hydrogen ions. The cytoplasmic region has little similarity throughout the family. There is some evidence that the exchangers may exist in the cell membrane as homodimers, but little is currently known about the mechanism of their antiport.
This entry represents a number of cation/proton exchangers, including Na+/H+ exchangers, K+/H+ exchangers and Na+(K+,Li+,Rb+)/H+ exchangers.
RNA polymerase (RNAP) II, which is responsible for all mRNA synthesis in eukaryotes, consists of 12 subunits. Subunits Rpb3 and Rpb11 form a heterodimer that is functionally analogous to the archaeal RNAP D/L heterodimer, and to the prokaryotic RNAP alpha (RpoA) subunit homodimer. In each case, they play a key role in RNAP assembly by forming a platform on which the catalytic subunits (eukaryotic Rpb1/Rpb2, and prokaryotic beta/beta') can interact.
The dimerisation domains differ between the different subunit families. In eukaryotic Rpb3, archaeal D and bacterial RpoA subunits, the dimerisation domain is comprised of a central insert domain, which interrupts an Rpb11-like domain, dividing it into two halves. In eukaryotic Rpb11 and archaeal L subunits, the insert domain is lacking, leaving the Rpb11-like domain intact and contiguous.
Initiation factor 2 binds to Met-tRNA, GTP and the small ribosomal subunit. The eukaryotic translation initiation factor EIF-2B is a complex made up of five different subunits, alpha, beta, gamma, delta and epsilon, and catalyses the exchange of EIF-2-bound GDP for GTP. This family includes initiation factor 2B alpha, beta and delta subunits from eukaryotes, related proteins from archaebacteria and IF-2 from prokaryotes. It also contains a subfamily of proteins found in eukaryotes, archaea (e.g. Pyrococcus furiosus), and eubacteria such as Bacillus subtilis and Thermotoga maritima. Many of these proteins were initially annotated as putative translation initiation factors despite the fact that there is no evidence for the requirement of an IF2 recycling factor in prokaryotic translation initiation. Recently, one of these proteins from B. subtilis has been functionally characterised as a 5-methylthioribose-1-phosphate isomerase (MTNA). This enzyme participates in the methionine salvage pathway, catalysing the isomerisation of 5-methylthioribose-1-phosphate to 5-methylthioribulose-1-phosphate. The methionine salvage pathway leads to the synthesis of methionine from methylthioadenosine, the end product of spermidine and spermine anabolism in many species.
Enzymes in this group have repeats of a beta propeller.
A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of proteins that have from 220 to 250 amino acids.
L27 is a protein from the large (50S) subunit; it is essential for ribosome function, but its exact role is unclear. It belongs to a family of ribosomal proteins, examples of which are found in bacteria, chloroplasts of plants and red algae and the mitochondria of fungi (e.g. MRP7 from yeast mitochondria). The schematic relationship between these groups of proteins is shown below.
Several proteins have recently been shown to contain the 5 structural motifs characteristic of GTP-binding proteins. These include murine DRG protein; GTP1 protein from Schizosaccharomyces pombe; OBG protein from Bacillus subtilis; and several others. Although the proteins contain GTP-binding motifs and are similar to each other, they do not share sequence similarity to other GTP-binding proteins, and have thus been classed as a novel group, the GTP1/OBG family. As yet, the function of these proteins is uncertain, but they have been shown to be important in development and normal cell metabolism.
Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolysing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate binding domain. In prokaryotes, dimeric GrpE is the co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle.
The X-ray crystal structure of GrpE in complex with the ATPase domain of DnaK revealed that GrpE is an asymmetric homodimer, bent in a manner that favours extensive contacts with only one DnaK ATPase monomer. GrpE does not actively compete for the atomic positions occupied by the nucleotide. GrpE and ADP mutually reduce one another's affinity for DnaK 200-fold, and ATP instantly dissociates GrpE from DnaK.
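As a rough guide to the size of this effect, a fold change in affinity can be converted into a coupling free energy using ΔΔG = RT ln(fold change). The sketch below assumes that the 200-fold figure refers to a 200-fold increase in the dissociation constant and that the temperature is 25 °C; neither assumption is stated in the text, and the function name is illustrative.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)


def coupling_free_energy(fold_change, temperature_k=298.15):
    """Free energy (kJ/mol) equivalent to a fold change in dissociation constant.

    Assumes the fold change is a ratio of Kd values (an assumption, see lead-in).
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be positive")
    return R_KJ * temperature_k * math.log(fold_change)


ddg = coupling_free_energy(200)
print(f"{ddg:.1f} kJ/mol")  # prints "13.1 kJ/mol"
```

Under these assumptions the mutual destabilisation corresponds to roughly 13 kJ/mol (about 3 kcal/mol), which is comparable in magnitude to a few hydrogen bonds.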
This family of proteins of unknown function contains a subset of Bax inhibitor-1 proteins.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.
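The classification described above can be written down as a small lookup table. This is only a sketch of the grouping as stated in the text; the table names, the enzyme name strings and the `describe` helper are illustrative and do not correspond to any database schema.

```python
# Grouping exactly as stated in the text above; the names and layout of these
# tables are illustrative, not taken from any database schema.
TYPE_I_SUBTYPES = {
    "IA": ("bacterial topoisomerase I", "archaeal topoisomerase I",
           "topoisomerase III", "reverse gyrase"),
    "IB": ("eukaryotic topoisomerase I", "topoisomerase V"),
}
TYPE_II_MEMBERS = ("topoisomerase II", "topoisomerase IV", "topoisomerase VI")
ATP_DEPENDENT_TYPE_I = {"reverse gyrase"}  # the one exception noted above


def describe(enzyme):
    """Return the class, subtype and strand-break mode for a named enzyme."""
    for subtype, members in TYPE_I_SUBTYPES.items():
        if enzyme in members:
            return {
                "type": "I",
                "subtype": subtype,
                "break": "single-strand",
                "atp_independent": enzyme not in ATP_DEPENDENT_TYPE_I,
            }
    if enzyme in TYPE_II_MEMBERS:
        return {"type": "II", "subtype": None, "break": "double-strand"}
    raise KeyError(f"unclassified enzyme: {enzyme}")


print(describe("reverse gyrase"))
print(describe("topoisomerase IV"))
```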
This entry represents the catalytic core of eukaryotic and viral topoisomerase I (type IB) enzymes, which occurs near the C-terminal region of the protein.
Human topoisomerase I has been shown to be inhibited by camptothecin (CPT), a plant alkaloid with antitumour activity. The crystal structures of human topoisomerase I comprising the core and carboxyl-terminal domains in covalent and noncovalent complexes with 22-base pair DNA duplexes reveal an enzyme that "clamps" around essentially B-form DNA. The core domain and the first eight residues of the carboxyl-terminal domain of the enzyme, including the active-site nucleophile tyrosine-723, share significant structural similarity with the bacteriophage family of DNA integrases. A binding mode for the anticancer drug camptothecin has been proposed on the basis of chemical and biochemical information combined with the three-dimensional structures of topoisomerase I-DNA complexes.
Vaccinia virus, a cytoplasmically-replicating poxvirus, encodes a type I DNA topoisomerase that is biochemically similar to eukaryotic-like DNA topoisomerases I, and which has been widely studied as a model topoisomerase. It is the smallest topoisomerase known and is unusual in that it is resistant to the potent chemotherapeutic agent camptothecin. The crystal structure of an amino-terminal fragment of vaccinia virus DNA topoisomerase I shows that the fragment forms a five-stranded, antiparallel beta-sheet with two short alpha-helices and connecting loops. Residues that are conserved between all eukaryotic-like type I topoisomerases are not clustered in particular regions of the structure.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
Members in this domain include biotin-dependent carboxylases. The carboxyl transferase domain carries out the following reaction: transcarboxylation from biotin to an acceptor molecule. There are two recognised types of carboxyl transferase. One of them uses acyl-CoA and the other uses 2-oxo acid as the acceptor molecule of carbon dioxide. All of the members in this family utilise acyl-CoA as the acceptor molecule.
The COX10/ctaB/cyoE signature is found in prenyltransferases including bacterial 4-hydroxybenzoate octaprenyltransferase (gene ubiA); yeast mitochondrial para-hydroxybenzoate--polyprenyltransferase (gene COQ2); and protohaem IX farnesyltransferase (haem O synthase) from yeast and mammals (gene COX10), and from bacteria (genes cyoE or ctaB). These are integral membrane proteins, which probably contain seven transmembrane segments. The signature is also found in cytochrome C oxidase assembly factor. The complexity of cytochrome C oxidase requires assistance in building the complex, and this is carried out by the cytochrome C oxidase assembly factor.
Phosphorylases in this entry include:
ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.
ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain.
The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop (Walker A) and a magnesium binding site (Walker B). Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ), specific to the ABC transporter; and the Q-motif or Q-loop (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site.
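A minimal sketch of how two of these motifs might be located in a protein sequence with regular expressions is given below. The Walker A pattern is written as GX4GK[ST], a commonly used relaxation of the GX4GKT consensus that is an assumption here; the signature motif is taken literally as LSGGQ, although real sequences often deviate from it. The example sequence is invented for illustration and is not a real protein.

```python
import re

# Patterns are simplified consensus forms (assumptions, see lead-in).
MOTIFS = {
    "Walker A": re.compile(r"G.{4}GK[ST]"),
    "ABC signature": re.compile(r"LSGGQ"),
}


def find_motifs(sequence):
    """Return {motif name: [(1-based start position, matched residues), ...]}."""
    seq = sequence.upper()
    return {
        name: [(m.start() + 1, m.group()) for m in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }


# Hypothetical toy sequence, constructed only to contain both motifs.
toy = "MKLLEVRGLSGSGKSTLLRAIAGEEKLSGGQRQRVALARALDE"
for name, hits in find_motifs(toy).items():
    print(name, hits)
```

Note that `finditer` reports non-overlapping matches only, and that the Walker B motif is not included because its hydrophobic consensus is too loose to capture reliably with a short pattern.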
The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes, related by a two-fold axis, contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI.
The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).
A number of bacterial transport systems have been found to contain integral membrane components that have similar sequences: these systems fit the characteristics of ATP-binding cassette transporters. The proteins form homo- or hetero-oligomeric channels, allowing ATP-mediated transport. Hydropathy analysis of the proteins has revealed the presence of 6 possible transmembrane regions. These proteins belong to family 2 of ABC transporters.
Aminotransferases share certain mechanistic features with other pyridoxal-phosphate dependent enzymes, such as the covalent binding of the pyridoxal-phosphate group to a lysine residue. On the basis of sequence similarity, these various enzymes can be grouped into subfamilies.
One of these, called class-IV, currently consists of proteins of about 270 to 415 amino-acid residues that share a few regions of sequence similarity. Surprisingly, the best conserved region does not include the lysine residue to which the pyridoxal-phosphate group is known to be attached in ilvE, but is located some 40 residues on the C-terminal side of the pyridoxal-phosphate-lysine. The D-amino acid transferases (D-AAT), which are among the members of this entry, are required by bacteria to catalyse the synthesis of D-glutamic acid and D-alanine, which are essential constituents of the bacterial cell wall and are the building blocks for other D-amino acids. Despite the difference in the structure of the substrates, D-AATs and L-AATs have strong similarity.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.
This group of cysteine peptidases belongs to the MEROPS peptidase family C2 (calpain family, clan CA). A type example is calpain, which is an intracellular protease involved in many important cellular functions that are regulated by calcium. The protein is a complex of 2 polypeptide chains (light and heavy), with three known forms in mammals: a highly calcium-sensitive (i.e., micro-molar range) form known as mu-calpain, mu-CANP or calpain I; a form sensitive to calcium in the milli-molar range, known as m-calpain, m-CANP or calpain II; and a third form, known as p94, which is found in skeletal muscle only.
All forms have identical light but different heavy chains. Both mu- and m-calpain are heterodimers containing an identical 28-kDa subunit and an 80-kDa subunit that shares 55-65% sequence homology between the two proteases. The crystallographic structure of m-calpain reveals six "domains" in the 80-kDa subunit:
Domain 2 shows low levels of sequence similarity to papain; although the catalytic His has not been located by biochemical means, it is likely that calpain and papain are related.
Calpain-like mRNAs have been identified in other organisms including bacteria, but the molecules encoded by these mRNAs have not been isolated, so little is known about their properties. How calpain activity is regulated in these organisms is still unclear. In metazoans, the activity of calpain is controlled by a single proteinase inhibitor, calpastatin. The calpastatin gene can produce eight or more calpastatin polypeptides ranging from 17 to 85 kDa by use of different promoters and alternative splicing events. The physiological significance of these different calpastatins is unclear, although all bind to three different places on the calpain molecule; binding to at least two of the sites is Ca2+ dependent. The calpains ostensibly participate in a variety of cellular processes including remodelling of cytoskeletal/membrane attachments, different signal transduction pathways, and apoptosis. Deregulated calpain activity following loss of Ca2+ homeostasis results in tissue damage in response to events such as myocardial infarcts, stroke, and brain trauma.
This domain belongs to a more diverse superfamily, which includes the catalytic domains of the mRNA capping enzyme and NAD-dependent DNA ligase.
Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins.
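The three binding classes can be read as a rule over an assembly dependency map. The sketch below is a simplification under stated assumptions: a protein with no protein prerequisites is treated as primary, one whose prerequisites are all primary as secondary, and anything else as tertiary. The protein labels and the dependency map are hypothetical placeholders, not the experimentally determined 30S assembly map.

```python
def classify_binding(requires):
    """Assign a binding class to each protein from its assembly prerequisites.

    `requires` maps a protein label to the labels of proteins that must
    already be bound before it can assemble (empty if it binds rRNA directly).
    """
    unknown = {d for deps in requires.values() for d in deps} - set(requires)
    if unknown:
        raise KeyError(f"prerequisites missing from map: {sorted(unknown)}")

    primary = {p for p, deps in requires.items() if not deps}
    classes = {}
    for protein, deps in requires.items():
        if not deps:
            classes[protein] = "primary"
        elif set(deps) <= primary:
            classes[protein] = "secondary"
        else:
            # Needs at least one secondary (or another tertiary) protein.
            classes[protein] = "tertiary"
    return classes


# Hypothetical dependency map, for illustration only.
example = {
    "P1": [],
    "P2": [],
    "Q1": ["P1"],
    "Q2": ["P1", "P2"],
    "T1": ["Q1"],
    "T2": ["Q2", "T1"],
}
print(classify_binding(example))
```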
The small ribosomal subunit protein S18 is known to be involved in binding the aminoacyl-tRNA complex in Escherichia coli, and appears to be situated at the tRNA A-site. Experimental evidence has revealed that S18 is well exposed on the surface of the E. coli ribosome, and is a secondary rRNA binding protein. S18 belongs to a family of ribosomal proteins that includes: eubacterial S18; metazoan mitochondrial S18; algal and plant chloroplast S18; and cyanelle S18.
This group of cysteine peptidases belongs to the MEROPS peptidase family C12 (ubiquitin C-terminal hydrolase family, clan CA). Families within the CA clan are loosely termed papain-like, as the protein fold of the peptidase unit resembles that of papain, the type example for clan CA. The type example is the human ubiquitin C-terminal hydrolase UCH-L1.
Ubiquitin is highly conserved, commonly found conjugated to proteins in eukaryotic cells, where it may act as a marker for rapid degradation, or it may have a chaperone function in protein assembly. The ubiquitin is released by cleavage from the bound protein by a protease. A number of deubiquitinising proteases are known: all are activated by thiol compounds, and inhibited by thiol-blocking agents and ubiquitin aldehyde, and as such have the properties of cysteine proteases.
The deubiquitinising proteases can be split into 2 size ranges (20-30 kDa and 100-200 kDa): this family comprises the 20-30 kDa polypeptides, which include yeast yuh1. Yeast yuh1 protease is known to be active only against small ubiquitin conjugates, being inactive against conjugated beta-galactosidase. A mammalian homologue, UCH (ubiquitin conjugate hydrolase), is one of the most abundant proteins in the brain. Only one conserved cysteine can be identified, along with two conserved histidines. The spacing between the cysteine and the second histidine is thought to be more representative of the cysteine/histidine spacing of a cysteine protease catalytic dyad.
This family includes a number of eukaryotic and archaebacterial ribosomal proteins: mammalian S19, Drosophila S19, Ascaris lumbricoides S19g (ALEP-1) and S19s, yeast YS16 (RP55A and RP55B), Aspergillus S16 and Haloarcula marismortui HS12.
A number of eukaryotic and archaebacterial ribosomal proteins have been grouped on the basis of sequence similarities. Ribosomal protein S6 is the major substrate of protein kinases in eukaryotic ribosomes and may play an important role in controlling cell growth and proliferation through the selective translation of particular classes of mRNA.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents a zinc finger motif found in transcription factor IIS (TFIIS). In eukaryotes the initiation of transcription of protein encoding genes by polymerase II (Pol II) is modulated by general and specific transcription factors. The general transcription factors operate through common promoter elements (such as the TATA box). At least eight different proteins associate to form the general transcription factors: TFIIA, -IIB, -IID, -IIE, -IIF, -IIG, -IIH and -IIS. During mRNA elongation, Pol II can encounter DNA sequences that cause reverse movement of the enzyme. Such backtracking involves extrusion of the RNA 3'-end into the pore, and can lead to transcriptional arrest. Escape from arrest requires cleavage of the extruded RNA with the help of TFIIS, which induces mRNA cleavage by enhancing the intrinsic nuclease activity of Pol II, past template-encoded pause sites. TFIIS extends from the polymerase surface via a pore to the internal active site. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.
TFIIS is a protein of about 300 amino acids. It contains three regions: a variable N-terminal domain not required for TFIIS activity; a conserved central domain required for Pol II binding; and a conserved C-terminal C4-type zinc finger essential for RNA cleavage. The zinc finger folds in a conformation termed a zinc ribbon, characterised by a three-stranded antiparallel beta-sheet and two beta-hairpins. A backbone model for the Pol II-TFIIS complex was obtained from X-ray analysis. It shows that a beta hairpin protrudes from the zinc finger and complements the Pol II active site.
Some viral proteins also contain the TFIIS zinc ribbon C-terminal domain. The Vaccinia virus protein, unlike its eukaryotic homologue, is an integral RNA polymerase subunit rather than a readily separable transcription factor.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
p24 proteins are major membrane components of COPI- and COPII-coated vesicles and are implicated in cargo selectivity of ER to Golgi transport. Multiple members of the p24 family are found in all eukaryotes, from yeast to mammals. Members of the p24 family are type I membrane proteins with a signal peptide at the amino terminus, a lumenal coiled-coil (extracytosolic) domain, a single transmembrane domain with conserved amino acids, and a short cytoplasmic tail. They may be grouped into at least three subfamilies based on primary sequence. One subfamily comprises yeast Emp24p and mammalian p24A. Another subfamily comprises yeast Erv25p and mammalian Tmp21, and the third subfamily comprises mammalian gp25L proteins.
Iron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S]. FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems.
The ISC system is conserved in eubacteria and eukaryotes (mitochondria), and has broad specificity, targeting general FeS proteins. It is encoded by the isc operon (iscRSUA-hscBA-fdx-iscX). IscS is a cysteine desulphurase, which obtains S from cysteine (converting it to alanine) and serves as an S donor for FeS cluster assembly. IscU and IscA act as scaffolds to accept S and Fe atoms, assembling clusters and transferring them to recipient apoproteins. HscA is a molecular chaperone and HscB is a co-chaperone. Fdx is a [2Fe-2S]-type ferredoxin. IscR is a transcription factor that regulates expression of the isc operon. IscX (also known as YfhJ) appears to interact with IscS and may function as an Fe donor during cluster assembly.
The SUF system is an alternative pathway to the ISC system that operates under iron starvation and oxidative stress. It is found in eubacteria, archaea and eukaryotes (plastids). The SUF system is encoded by the suf operon (sufABCDSE), and the six encoded proteins are arranged into two complexes (SufSE and SufBCD) and one protein (SufA). SufS is a pyridoxal-phosphate (PLP) protein displaying cysteine desulphurase activity. SufE acts as a scaffold protein that accepts S from SufS and donates it to SufA. SufC is an ATPase with an unorthodox ATP-binding cassette (ABC)-like component. No specific functions have been assigned to SufB and SufD. SufA is homologous to IscA, acting as a scaffold protein in which Fe and S atoms are assembled into [FeS] cluster forms, which can then easily be transferred to apoprotein targets.
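The operon names given above (iscRSUA-hscBA-fdx-iscX and sufABCDSE) pack several gene names into one string. The following is a minimal sketch of how such shorthand could be expanded, assuming the usual convention that each hyphen-separated block is a lowercase stem followed by one capital letter per gene. The function name expand_operon is hypothetical and does not belong to any existing library.

```python
import re


def expand_operon(shorthand: str) -> list[str]:
    """Expand operon shorthand such as 'iscRSUA-hscBA' into gene names.

    Assumption (illustrative only): each hyphen-separated block is a
    lowercase stem followed by zero or more capital letters, one per gene.
    """
    genes = []
    for block in shorthand.split("-"):
        match = re.fullmatch(r"([a-z]+)([A-Z]*)", block)
        if match is None:
            raise ValueError(f"unrecognised block: {block!r}")
        stem, suffixes = match.groups()
        if suffixes:
            # One gene per capital letter, e.g. 'hscBA' -> hscB, hscA
            genes.extend(stem + letter for letter in suffixes)
        else:
            # A bare stem such as 'fdx' names a single gene
            genes.append(stem)
    return genes


print(expand_operon("iscRSUA-hscBA-fdx-iscX"))
print(expand_operon("sufABCDSE"))
```

Gene order is preserved exactly as written in the shorthand, so the result reflects the order in the operon name rather than any functional grouping such as the SufSE and SufBCD complexes.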
In the NIF system, NifS and NifU are required for the formation of metalloclusters of nitrogenase in Azotobacter vinelandii, and other organisms, as well as in the maturation of other FeS proteins. Nitrogenase catalyses the fixation of nitrogen. It contains a complex cluster, the FeMo cofactor, which contains molybdenum, Fe and S. NifS is a cysteine desulphurase. NifU binds one Fe atom at its N-terminal, assembling an FeS cluster that is transferred to nitrogenase apoproteins. Nif proteins involved in the formation of FeS clusters can also be found in organisms that do not fix nitrogen.
This entry represents the C-terminal of NifU and homologous proteins. NifU contains two domains: an N-terminal and a C-terminal domain. These domains exist either together or on different polypeptides, both domains being found in organisms that do not fix nitrogen (e.g. yeast), so they have a broader significance in the cell than nitrogen fixation.
The actin filament system, a prominent part of the cytoskeleton in eukaryotic cells, is both a static structure and a dynamic network that can undergo rearrangements: it is thought to be involved in processes such as cell movement and phagocytosis, as well as muscle contraction.
The F-actin capping protein binds in a calcium-independent manner to the fast growing ends of actin filaments (barbed end) thereby blocking the exchange of subunits at these ends. Unlike gelsolin and severin this protein does not sever actin filaments. The F-actin capping protein is a heterodimer composed of two unrelated subunits: alpha and beta. Neither of the subunits shows sequence similarity to other filament-capping proteins.
The beta subunit is a protein of about 280 amino acid residues whose sequence is well conserved in eukaryotic species.
This entry represents the C-terminal domain of DNA mismatch repair proteins, such as MutL. This domain functions in promoting dimerisation. The dimeric MutL protein has a key function in communicating mismatch recognition by MutS to downstream repair processes. Mismatch repair contributes to the overall fidelity of DNA replication by targeting mispaired bases that arise through replication errors, during homologous recombination, and as a result of DNA damage. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex.
This family contains dephospho-CoA kinases, which catalyse the final step in CoA biosynthesis, the phosphorylation of the 3'-hydroxyl group of ribose using ATP as a phosphate donor.
The crystal structures of a number of the proteins in this entry have been determined, including the structure of the protein from Haemophilus influenzae to 2.0 Å resolution in a complex with ATP. The protein consists of three domains: the nucleotide-binding domain with a five-stranded parallel beta-sheet, the substrate-binding alpha-helical domain, and the lid domain formed by a pair of alpha-helices. The overall topology of the protein resembles the structures of other nucleotide kinases.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.
Type IA topoisomerases comprise four domains that together form a toroidal structure with a central hole large enough to accommodate single- and double-stranded DNA: domain 1 is an N-terminal alpha/beta Toprim domain, domain 2 and the C-terminal domain 4 are winged-helix domains, and domain 3 is a beta-barrel. Domains 1 (Toprim) and 3 form the active site of the enzyme, while the winged-helix domains 2 and 4 form a single-strand DNA-binding groove. This entry represents the central portion of the enzyme, which covers domains 2 and 3 in topoisomerase type IA enzymes.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
GidA is a tRNA modification enzyme found in bacteria and mitochondria. Though the precise molecular function of these proteins is not known, GidA is involved in the 5-carboxymethylaminomethyl modification of the wobble uridine base in some tRNAs. Sequence variations in the human mitochondrial protein may influence the severity of aminoglycoside-induced deafness.
This entry represents GidA and related proteins, such as Gid, whose functions are not known.
Protein-L-isoaspartate(D-aspartate) O-methyltransferase (PCMT), also known as L-isoaspartyl protein carboxyl methyltransferase, is an enzyme that catalyses the transfer of a methyl group from S-adenosylmethionine to the free carboxyl groups of D-aspartyl or L-isoaspartyl residues in a variety of peptides and proteins. The enzyme does not act on normal L-aspartyl residues. L-isoaspartyl and D-aspartyl residues are the products of the spontaneous deamidation and/or isomerisation of normal L-aspartyl and L-asparaginyl residues in proteins. PCMT plays a role in the repair and/or degradation of these damaged proteins; the enzymatic methyl esterification of the abnormal residues can lead to their conversion to normal L-aspartyl residues. The SAM domain is present in most of these proteins.
ATP + RNA 3'-terminal-phosphate = AMP + diphosphate + RNA terminal-2',3'-cyclic-phosphate
These enzymes might be responsible for production of the cyclic phosphate RNA ends that are known to be required by many RNA ligases in both prokaryotes and eukaryotes.
RNA cyclase is a protein of from 36 to 42 kDa. The best conserved region is a glycine-rich stretch of residues located in the central part of the sequence and which is reminiscent of various ATP, GTP or AMP glycine-rich loops.
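As a rough illustration, a mass range such as the one above can be turned into an approximate chain length. The sketch below assumes an average residue mass of about 110 Da, which is a common rule of thumb rather than a value taken from this entry, and the function name estimate_residues is hypothetical.

```python
# Assumed average mass per amino acid residue (rule of thumb, not from this entry).
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_residues(mass_kda: float) -> int:
    """Return an approximate residue count for a protein of the given mass."""
    return round(mass_kda * 1000 / AVERAGE_RESIDUE_MASS_DA)


for mass in (36, 42):
    print(f"{mass} kDa is roughly {estimate_residues(mass)} residues")
```

The estimate depends on amino acid composition, so it should be read as an order-of-magnitude guide only.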
The crystal structure of RNA 3'-terminal phosphate cyclase shows that each molecule consists of two domains. The larger domain contains three repeats of a folding unit comprising two parallel alpha helices and a four-stranded beta sheet; this fold was previously identified in translation initiation factor 3 (IF3). The large domain is similar to one of the two domains of 5-enolpyruvylshikimate-3-phosphate synthase and UDP-N-acetylglucosamine enolpyruvyl transferase. The smaller domain uses a similar secondary structure element with different topology, observed in many other proteins such as thioredoxin. Although the active site of this enzyme could not be unambiguously assigned, it can be mapped to a region surrounding His309, an adenylate acceptor, in which a number of amino acids are highly conserved in the enzyme from different sources.
The PH (phosphorolytic) domain is responsible for 3'-5' exoribonuclease activity, although in some proteins this domain has lost its catalytic function. An active PH domain uses inorganic phosphate as a nucleophile, adding it across the phosphodiester bond between the end two nucleotides in order to release ribonucleoside 5'-diphosphate (rNDP) from the 3' end of the RNA substrate.
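The stepwise 3'-5' phosphorolysis described above can be sketched as a toy model. This is a minimal illustration, assuming the RNA is written 5' to 3' as a string of base letters and that each step consumes one inorganic phosphate to release the 3'-terminal nucleotide as an rNDP. The function name phosphorolyse is hypothetical, and the model ignores processivity, substrate structure and any minimum substrate length.

```python
def phosphorolyse(rna: str, steps: int) -> tuple[str, list[str]]:
    """Remove up to `steps` nucleotides from the 3' end of `rna` (written 5'->3').

    Each step stands for one phosphate attack on the last phosphodiester
    bond, releasing the terminal nucleotide as a ribonucleoside diphosphate.
    Returns the shortened RNA and the released products in order of release.
    """
    released = []
    for _ in range(steps):
        if len(rna) <= 1:
            # A single nucleotide has no phosphodiester bond left to attack
            break
        released.append(rna[-1] + "DP")
        rna = rna[:-1]
    return rna, released


remaining, products = phosphorolyse("AUGGCU", 3)
print(remaining, products)
```

Because the products are diphosphates rather than monophosphates, the reaction differs from hydrolytic exonuclease activity, which the model does not attempt to represent.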
PH domains can be found in bacterial/organelle RNases and PNPases (polynucleotide phosphorylases), as well as in archaeal and eukaryotic RNA exosomes, the latter acting as nano-compartments for the degradation or processing of RNA (including mRNA, rRNA, snRNA and snoRNA). Bacterial/organelle PNPases share a common barrel structure with RNA exosomes, consisting of a hexameric ring of PH domains that act as a degradation chamber, and an S1-domain/KH-domain containing cap that binds the RNA substrate (and sometimes accessory proteins) in order to regulate and restrict entry into the degradation chamber. Unstructured RNA substrates feed in through the pore made by the S1 domains, are degraded by the PH domain ring, and exit as nucleotides via the PH pore at the opposite end of the barrel.
This entry represents the phosphorolytic (PH) domain 1, which has a core 2-layer alpha/beta structure with a left-handed crossover, similar to that found in ribosomal protein S5. This domain is found in bacterial/organelle PNPases and in archaeal/eukaryotic exosomes.
More information about these proteins can be found at Protein of the Month: RNA Exosomes.
This entry represents tRNA pseudouridine synthase D (TruD) proteins, which appear to be responsible for synthesis of pseudouridine from uracil-13 in transfer RNAs. They are hydrophilic proteins of from 39 to 77 kDa and homologues are found in bacteria, archaea, and eukarya.
CTP + phosphatidate = diphosphate + CDP-diacylglycerol
CDP-diacylglycerol is an important branch point intermediate in both prokaryotic and eukaryotic organisms. CDS is a membrane-bound enzyme.
A number of nucleoside diphosphate and triphosphate hydrolases as well as some yet uncharacterised proteins have been found to belong to the same family. The uncharacterised proteins all seem to be membrane-bound.
CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).
The eukaryotic integral membrane proteins in this group are evolutionarily related, but their exact function has not yet been clearly established. The proteins have from 290 to 435 amino acid residues. Structurally, they seem to be formed of three sections: an N-terminal region with two transmembrane domains, a central hydrophilic loop and a C-terminal region that contains from one to three transmembrane domains. Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis. Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. Yeast ELO3 affects plasma membrane H+-ATPase activity, and may act on a glucose-signalling pathway that controls the expression of several genes that are transcriptionally regulated by glucose, such as PMA1.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
The L21E family contains proteins from a number of eukaryotic and archaebacterial organisms, including mammalian L2, Entamoeba histolytica L21, Caenorhabditis elegans L21 (C14B9.7), Saccharomyces cerevisiae (Baker's yeast) L21E (URP1) and Haloarcula marismortui HL31.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. The L36E ribosomal family consists of mammalian, Caenorhabditis elegans and Drosophila L36, Candida albicans L39, and yeast YL39 ribosomal proteins.
The PEBP (PhosphatidylEthanolamine-Binding Protein) family is a highly conserved group of proteins that have been identified in numerous tissues in a wide variety of organisms, including bacteria, yeast, nematodes, plants, Drosophila and mammals. The various functions described for members of this family include lipid binding, neuronal development, serine protease inhibition, the control of the morphological switch between shoot growth and flower structures, and the regulation of several signalling pathways such as the MAP kinase pathway and the NF-kappaB pathway. The control of the latter two pathways involves the PEBP protein RKIP, which interacts with MEK and Raf-1 to inhibit the MAP kinase pathway, and with TAK1, NIK, IKKalpha and IKKbeta to inhibit the NF-kappaB pathway. Other PEBP-like proteins that show strong structural homology to PEBP include Escherichia coli YBHB and YBCL, the Rattus norvegicus (Rat) neuropeptide HCNP, and Antirrhinum majus (Garden snapdragon) protein centroradialis (CEN).
Structures have been determined for several members of the PEBP-like family, all of which show extensive fold conservation. The structure consists of a large central beta-sheet flanked by a smaller beta-sheet on one side, and an alpha helix on the other. Sequence alignments show two conserved central regions, CR1 and CR2, that form a consensus signature for the PEBP family. These two regions form part of the ligand-binding site, which can accommodate various anionic groups. The N- and C-terminal regions are the least conserved, and may be involved in interactions with different protein partners. The N-terminal residues 2-12 form the natural cleavage peptide HCNP involved in neuronal development. The C-terminal region is deleted in plant and bacterial PEBP homologues, and may help control accessibility to the active site.
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.
This entry represents RIO kinases, which exhibit little sequence similarity with eukaryotic protein kinases (ePKs) and are classified as atypical protein kinases. The conformation of ATP when bound to the RIO kinases is unique when compared with ePKs, such as serine/threonine kinases or the insulin receptor tyrosine kinase, suggesting that the detailed mechanism by which the catalytic aspartate of RIO kinases participates in phosphoryl transfer may not be identical to that employed in known serine/threonine ePKs. Representatives of the RIO family are present in organisms ranging from Archaea to humans, although the RIO3 proteins have only been identified in multicellular eukaryotes to date.
Yeast Rio1 and Rio2 proteins are required for proper cell cycle progression and chromosome maintenance, and are necessary for survival of the cells. These proteins are involved in the processing of 20 S pre-rRNA via late 18 S rRNA processing.
Tubby, an autosomal recessive mutation mapping to mouse chromosome 7, was found to be the result of a splicing defect in a novel gene of unknown function. This mutation maps to the tub gene. The mouse tubby mutation is the cause of maturity-onset obesity, insulin resistance and sensory deficits. By contrast with the rapid juvenile-onset weight gain seen in diabetes (db) and obese (ob) mice, obesity in tubby mice develops gradually, and strongly resembles the late-onset obesity observed in the human population. Excessive deposition of adipose tissue culminates in a two-fold increase of body weight. Tubby mice also suffer retinal degeneration and neurosensory hearing loss. The tripartite character of the tubby phenotype is highly similar to human obesity syndromes, such as Alstrom and Bardet-Biedl. These phenotypes indicate a vital role for tubby proteins, but no biochemical function has yet been ascribed to any family member, although it has been suggested that the phenotypic features of tubby mice may be the result of cellular apoptosis triggered by expression of the mutated tub gene. TUB is the founding member of the tubby-like proteins, the TULPs. TULPs are found in multicellular organisms from both the plant and animal kingdoms. Ablation of members of this protein family causes disease phenotypes that are indicative of their importance in nervous-system function and development.
Mammalian TUB is a hydrophilic protein of ~500 residues. The N-terminal portion of the protein is conserved neither in length nor sequence, but, in TUB, contains the nuclear localisation signal and may have transcriptional-activation activity. The C-terminal 250 residues are highly conserved. The C-terminal extremity contains a cysteine residue that might play an important role in the normal functioning of these proteins. The crystal structure of the C-terminal core domain from mouse tubby has been determined to 1.9 Å resolution. This domain is arranged as a 12-stranded, all anti-parallel, closed beta-barrel that surrounds a central alpha helix (located at the extreme carboxyl terminus of the protein) that forms most of the hydrophobic core. Structural analyses suggest that TULPs constitute a unique family of bipartite transcription factors.
This entry represents the PP-loop motif superfamily. The PP-loop motif appears to be a modified version of the P-loop of nucleotide-binding domains that is involved in phosphate binding. It is named the PP-motif because it appears to be part of a previously uncharacterised ATP pyrophosphatase domain. ATP sulfurylases, Escherichia coli NtrL, and Bacillus subtilis OutB consist of this domain alone. In other proteins, the pyrophosphatase domain is associated with amidotransferase domains (type I or type II), a putative citrulline-aspartate ligase domain or a nitrilase/amidase domain.
A number of uncharacterised hydrophilic proteins of about 30 kDa share regions of similarity. These include,
Members of this family are involved in the pyridoxine biosynthetic pathway. The regulation of cellular growth and proliferation in response to environmental cues is critical for development and the maintenance of viability in all organisms. In unicellular organisms, such as the budding yeast Saccharomyces cerevisiae (Baker's yeast), growth and proliferation are regulated by nutrient availability.
The S1 domain of around 70 amino acids, originally identified in ribosomal protein S1, is found in a large number of RNA-associated proteins. It has been shown that S1 proteins bind RNA through their S1 domains with some degree of sequence specificity. This type of S1 domain is found in translation initiation factor 1.
The solution structure of one S1 RNA-binding domain from Escherichia coli polynucleotide phosphorylase has been determined. It displays some similarity with the cold shock domain (CSD). Both the S1 and the CSD domain consist of an antiparallel beta barrel of the same topology with 5 beta strands. This fold is also shared by many other proteins of unrelated function and is known as the OB fold. However, the S1 and CSD fold can be distinguished from the other OB folds by the presence of a short 3(10) helix at the end of strand 3. This unique feature is likely to form a part of the DNA/RNA-binding site.
More information about these proteins can be found at Protein of the Month: RNA Exosomes.
Dihydroorotate dehydrogenase (DHOD), also known as dihydroorotate oxidase, catalyses the fourth step in de novo pyrimidine biosynthesis, the stereospecific oxidation of (S)-dihydroorotate to orotate, which is the only redox reaction in this pathway. DHODs can be divided into two main classes: class 1 cytosolic enzymes found primarily in Gram-positive bacteria, and class 2 membrane-associated enzymes found primarily in eukaryotic mitochondria and Gram-negative bacteria.
The class 1 DHODs can be further divided into subclasses 1A and 1B, which differ in their structural organisation and use of electron acceptors. The 1A enzyme is a homodimer of two PyrD subunits where each subunit forms a TIM barrel fold with a bound FMN cofactor located near the top of the barrel. Fumarate is the natural electron acceptor for this enzyme. The 1B enzyme, in contrast, is a heterotetramer composed of a central, FMN-containing PyrD homodimer resembling the 1A homodimer, and two additional PyrK subunits which contain FAD and a 2Fe-2S cluster. These additional groups allow the enzyme to use NAD(+) as its natural electron acceptor.
The class 2 membrane-associated enzymes are monomers which have the FMN-containing TIM barrel domain found in the class 1 PyrD subunit, and an additional N-terminal alpha helical domain. These enzymes use respiratory quinones as the physiological electron acceptor.
This entry represents the FMN-binding subunit common to all classes of dihydroorotate dehydrogenase.
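The DHOD classification described above can be summarised as a small lookup table. The sketch below is illustrative only: the class names, field names and the helper classes_with_cofactor are hypothetical, and the values simply restate what the text says about each class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DhodClass:
    location: str
    assembly: str
    cofactors: tuple[str, ...]
    electron_acceptor: str


# Values restate the descriptions above; the keys are the class labels.
DHOD_CLASSES = {
    "1A": DhodClass("cytosolic", "PyrD homodimer", ("FMN",), "fumarate"),
    "1B": DhodClass(
        "cytosolic",
        "heterotetramer of two PyrD and two PyrK subunits",
        ("FMN", "FAD", "2Fe-2S"),
        "NAD(+)",
    ),
    "2": DhodClass(
        "membrane-associated", "monomer", ("FMN",), "respiratory quinones"
    ),
}


def classes_with_cofactor(cofactor: str) -> list[str]:
    """Return the DHOD classes whose description lists the given cofactor."""
    return sorted(
        name for name, info in DHOD_CLASSES.items() if cofactor in info.cofactors
    )


print(classes_with_cofactor("FMN"))
print(classes_with_cofactor("FAD"))
```

Querying for FMN returns every class, which mirrors the statement that the FMN-binding subunit is common to all of them, whereas FAD is specific to the PyrK-containing 1B enzyme.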
Macrophage migration inhibitory factor (MIF) is a key regulatory cytokine within innate and adaptive immune responses, capable of promoting and modulating the magnitude of the response. MIF is released from T-cells and macrophages, and acts within the neuroendocrine system. MIF is capable of tautomerase activity, although its biological function has not been fully characterised. It is induced by glucocorticoid and is capable of overriding the anti-inflammatory actions of glucocorticoid. MIF regulates cytokine secretion and the expression of receptors involved in the immune response. It can be taken up into target cells in order to interact with intracellular signalling molecules, inhibiting p53 function, and/or activating components of the mitogen-activated protein kinase and Jun-activation domain-binding protein-1 (Jab-1). MIF has been linked to various inflammatory diseases, such as rheumatoid arthritis and atherosclerosis.
The MIF homologue D-dopachrome tautomerase is involved in detoxification through the conversion of dopaminechrome (and possibly norepinephrinechrome), the toxic quinone product of the neurotransmitter dopamine (and norepinephrine), to an indole derivative that can serve as a precursor to neuromelanin.
This domain is found in archaeal, bacterial and eukaryotic proteins.
In the archaea and bacteria, they are annotated as putative nucleolar protein, Sun (Fmu) family protein or tRNA/rRNA cytosine-C5-methylase. The majority have the S-adenosyl methionine (SAM) binding domain and are related to Escherichia coli Fmu (Sun) protein (16S rRNA m5C 967 methyltransferase) whose structure has been determined.
In the eukaryota, the majority are annotated as being 'hypothetical protein', nucleolar protein or the Nop2/Sun (Fmu) family. Unlike their bacterial homologues, few of the eukaryotic members of this family have the SAM-binding signature. Despite this, Saccharomyces cerevisiae (Baker's yeast) Nop2p is a probable RNA m5C methyltransferase. It is localised to the nucleolus, is essential for viability, and is required for processing and maturation of 27S pre-rRNA and for large ribosomal subunit biogenesis. Reduced Nop2p expression limits yeast growth and decreases levels of mature 60S ribosomal subunits while altering rRNA processing. There is substantial identity between Nop2p and Homo sapiens (Human) p120 (NOL1), which is also called the proliferation-associated nucleolar antigen.
Prokaryotes contain a single DNA-dependent RNA polymerase (RNAP) that is responsible for the transcription of all genes, while eukaryotes have three classes of RNAPs (I-III) that transcribe different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. Certain subunits of RNAPs, including RPB5 (POLR2E in mammals), are common to all three eukaryotic polymerases. RPB5 plays a role in the transcription activation process. Eukaryotic RPB5 has a bipartite structure consisting of a unique N-terminal region, plus a C-terminal region that is structurally homologous to the prokaryotic RPB5 homologue, subunit H (gene rpoH).
This entry represents prokaryotic subunit H and the C-terminal domain of eukaryotic RPB5, which share a two-layer alpha/beta fold, with a core structure of beta/alpha/beta/alpha/beta(2).
In eukaryotes, there are three different forms of DNA-dependent RNA polymerases transcribing different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. In archaebacteria, there is generally a single form of RNA polymerase which also consists of an oligomeric assemblage of 10 to 13 polypeptides. A component of 14 to 18 kDa that is shared by all three forms of eukaryotic RNA polymerases, and which has been sequenced in budding yeast (gene RPB6 or RPO26), in Schizosaccharomyces pombe (Fission yeast) (gene rpb6 or rpo15), in human and in African swine fever virus (ASFV), is evolutionarily related to the archaebacterial subunit K (gene rpoK). The archaebacterial protein is colinear with the C-terminal part of the eukaryotic subunit.
The structures of the omega subunit and RPB6, and the structures of the omega/beta' and RPB6/RPB1 interfaces, suggest a molecular mechanism for the function of omega and RPB6 in promoting RNAP assembly and/or stability. The conserved regions of omega and RPB6 form a compact structural domain that interacts simultaneously with conserved regions of the largest RNAP subunit and with the C-terminal tail following a conserved region of the largest RNAP subunit. The second half of the conserved region of omega and RPB6 forms an arc that projects away from the remainder of the structural domain and wraps over and around the C-terminal tail of the largest RNAP subunit, clamping it in a crevice, and threading the C-terminal tail of the largest RNAP subunit through the narrow gap between omega and RPB6.
DNA-directed RNA polymerases(also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.
RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:
RNA polymerase (RNAP) II, which is responsible for all mRNA synthesis in eukaryotes, consists of 12 subunits. Subunits Rpb3 and Rpb11 form a heterodimer that is functionally analogous to the archaeal RNAP D/L heterodimer, and to the prokaryotic RNAP alpha subunit (RpoA) homodimer. In each case, they play a key role in RNAP assembly by forming a platform on which the catalytic subunits (eukaryotic Rpb1/Rpb2, and prokaryotic beta/beta') can interact. These different subunits share regions of homology required for dimerisation. In eukaryotic Rpb11 and archaeal L subunits, the dimerisation domain consists of a contiguous Rpb11-like domain, whereas in eukaryotic Rpb3, archaeal D and bacterial RpoA subunits, the dimerisation domain consists of the Rpb11-like domain interrupted by an insert domain. In the prokaryotic alpha subunit, this dimerisation domain is the N-terminal domain.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
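The codon-by-codon decoding of the mRNA described above can be sketched as a simple table lookup. The snippet below is an illustration only: `CODON_TABLE` is a deliberately small subset of the standard genetic code, `translate` is a hypothetical helper name, and the example mRNA is made up. It does not model tRNA selection, EF-Tu, EF-G or the A and P sites.

```python
# Small subset of the standard genetic code ("*" marks a stop codon).
CODON_TABLE = {
    "AUG": "M", "UUU": "F", "UUC": "F", "GGU": "G", "GGC": "G",
    "GCU": "A", "AAA": "K", "UAA": "*", "UAG": "*", "UGA": "*",
}


def translate(mrna: str) -> str:
    """Read an mRNA (5'->3') three bases at a time until a stop codon is met."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]  # KeyError for codons outside the subset
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


if __name__ == "__main__":
    print(translate("AUGUUUGGCAAAUAA"))  # MFGK
```

Each loop iteration corresponds to one elongation cycle, in which the chain is extended by a single residue.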
Ribosomal protein L17 is one of the proteins from the large ribosomal subunit. Bacterial L17 is a protein of 120 to 130 amino-acid residues while yeast YmL8 is twice as large (238 residues). The N-terminal half of YmL8 is colinear with the sequence of L17 from Escherichia coli.
A number of eukaryotic and archaebacterial large subunit ribosomal proteins can be grouped on the basis of sequence similarities. These proteins have 87 to 128 amino-acid residues. This family consists of:
A number of eukaryotic and archaebacterial ribosomal proteins belong to the L34e family. These include vertebrate L34, mosquito L31, plant L34, yeast putative ribosomal protein YIL052c and archaebacterial L34e.
A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. Examples are:
These proteins have from 64 to 78 amino acids and a highly conserved C-terminal extremity region.
A number of eukaryotic and archaeal ribosomal proteins have been grouped based on sequence similarities. One of these families, S8e, consists of a number of proteins with either about 220 amino acids (in eukaryotes) or about 125 amino acids (in archaea).
Shikimate kinase catalyses the fifth step of the shikimate pathway, which leads to chorismate, the common precursor of the aromatic amino acids. The enzyme catalyses the following reaction:
ATP + shikimate = ADP + shikimate-3-phosphate
The protein is found in bacteria (gene aroK or aroL), plants and fungi (where it is part of a multifunctional enzyme that catalyses five consecutive steps in this pathway). In 1994, the 3D structure of shikimate kinase was predicted to be very close to that of adenylate kinase, suggesting a functional similarity as well as an evolutionary relationship. This prediction has since been confirmed experimentally. The protein is reported to possess an alpha/beta fold, consisting of a central sheet of five parallel beta-strands flanked by alpha-helices. Such a topology is very similar to that of adenylate kinase.
Members of this family catalyse the reduction of the 5,6-double bond of a uridine residue on tRNA. Dihydrouridine modification of tRNA is widely observed in prokaryotes and eukaryotes, and also in some archaea. Most dihydrouridines are found in the D loop of tRNAs. The role of dihydrouridine in tRNA is currently unknown, but it may increase the conformational flexibility of the tRNA. It is likely that different family members have different substrate specificities, which may overlap. Dus 1 from Saccharomyces cerevisiae (Baker's yeast) acts on pre-tRNA-Phe, while Dus 2 acts on pre-tRNA-Tyr and pre-tRNA-Leu. Dus 1 is active as a single subunit, requiring NADPH or NADH, and is stimulated by the presence of FAD. Some family members may be targeted to the mitochondria and may even have a role there.
Uroporphyrinogen decarboxylase (URO-D), the fifth enzyme of the haem biosynthetic pathway, catalyses the sequential decarboxylation of the four acetyl side chains of uroporphyrinogen to yield coproporphyrinogen. URO-D deficiency is responsible for the human genetic diseases familial porphyria cutanea tarda (fPCT) and hepatoerythropoietic porphyria (HEP). The sequence of URO-D has been well conserved throughout evolution. The best conserved region is located in the N-terminal section; it contains a perfectly conserved hexapeptide. There are two arginine residues in this hexapeptide which could be involved in the binding, via salt bridges, to the carboxyl groups of the propionate side chains of the substrate.
The crystal structure of human uroporphyrinogen decarboxylase shows that it comprises a single domain containing a (beta/alpha)8-barrel with a deep active site cleft formed by loops at the C-terminal ends of the barrel strands. URO-D is a dimer in solution. Dimerisation juxtaposes the active site clefts of the monomers, suggesting a functionally important interaction between the catalytic centres.
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleoside triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.
Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.
The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.
Casein kinase, a ubiquitous, well-conserved protein kinase involved in cell metabolism and differentiation, is characterised by its preference for Ser or Thr in acidic stretches of amino acids. The enzyme is a tetramer of 2 alpha- and 2 beta-subunits. However, some species (e.g., mammals) possess 2 related forms of the alpha-subunit (alpha and alpha'), while others (e.g., fungi) possess 2 related beta-subunits (beta and beta'). The alpha-subunit is the catalytic unit and contains regions characteristic of serine/threonine protein kinases. The beta-subunit is believed to be regulatory, possessing an N-terminal auto-phosphorylation site, an internal acidic domain, and a potential metal-binding motif. The beta-subunit is a highly conserved protein of about 25 kDa that contains, in its central section, a cysteine-rich motif, CX(n)C, that could be involved in binding a metal such as zinc. The mammalian beta-subunit gene promoter shares common features with those of other mammalian protein kinases and is closely related to the promoter of the regulatory subunit of cAMP-dependent protein kinase.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer.
Clathrin coats contain both clathrin and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors. All AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). Each subunit has a specific function. Adaptin subunits recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal appendage domains. By contrast, GGAs are monomers composed of four domains, which have functions similar to AP subunits: an N-terminal VHS (Vps27p/Hrs/Stam) domain, a GAT (GGA and Tom1) domain, a hinge region, and a C-terminal GAE (gamma-adaptin ear) domain. The GAE domain is similar to the AP gamma-adaptin ear domain, being responsible for the recruitment of accessory proteins that regulate clathrin-mediated endocytosis.
While clathrin mediates endocytic protein transport and transport from the trans-Golgi network, coatomers (COPI, COPII) primarily mediate ER-to-Golgi and intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and budding from Golgi membranes. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta', gamma, delta, epsilon and zeta subunits.
This entry represents the small sigma subunit of various adaptins from different AP clathrin adaptor complexes (including AP1, AP2, AP3 and AP4), and the zeta subunit of various coatomer (COP) adaptors. The small sigma subunit of AP proteins has been characterised in several species. The sigma subunit plays a role in protein sorting in the late-Golgi/trans-Golgi network (TGN) and/or endosomes. The zeta subunit of coatomers (zeta-COP) is required for coatomer binding to Golgi membranes and for coat-vesicle assembly.
More information about these proteins can be found at Protein of the Month: Clathrin.
Dynein is a multisubunit microtubule-dependent motor enzyme that acts as the force generating protein of eukaryotic cilia and flagella. The cytoplasmic isoform of dynein acts as a motor for the intracellular retrograde motility of vesicles and organelles along microtubules.
Dynein is composed of a number of ATP-binding large subunits, intermediate-size subunits and small subunits. Among the small subunits, there is a group of highly conserved proteins that make up this family.
Both type 1 (DLC1) and 2 (DLC2) dynein light chains have a similar two-layer alpha-beta core structure consisting of beta-alpha(2)-beta-X-beta(2).
A number of bacterial and archaebacterial proteins involved in transporting formate or nitrite have been shown to be related:
These transporters are proteins of about 280 residues and seem to contain six transmembrane regions.
GTP cyclohydrolase I catalyzes the biosynthesis of formic acid and dihydroneopterin triphosphate from GTP. This reaction is the first step in the biosynthesis of tetrahydrofolate in prokaryotes, of tetrahydrobiopterin in vertebrates, and of pteridine-containing pigments in insects. The comparison of the sequence of the enzyme from bacterial and eukaryotic sources shows that the structure of this enzyme has been extremely well conserved throughout evolution.
The Histidine Triad (HIT) motif, His-phi-His-phi-His-phi-phi (where phi is a hydrophobic amino acid), was identified as being highly conserved in a variety of organisms. The crystal structure of rabbit Hint, purified as an adenosine and AMP-binding protein, showed that proteins in the HIT superfamily are conserved as nucleotide-binding proteins and that Hint homologues, which are found in all forms of life, are structurally related to Fhit homologues and GalT-related enzymes, which have more restricted phylogenetic profiles. Hint homologues including rabbit Hint and yeast Hnt1 hydrolyse adenosine 5' monophosphoramide substrates such as AMP-NH2 and AMP-lysine to AMP plus the amine product and function as positive regulators of Cdk7/Kin28 in vivo. Fhit homologues are diadenosine polyphosphate hydrolases and function as tumour suppressors in human and mouse, though the tumour-suppressing function of Fhit does not depend on ApppA hydrolysis. The third branch of the HIT superfamily, which includes GalT homologues, contains a related His-X-His-X-Gln motif and transfers nucleoside monophosphate moieties to phosphorylated second substrates rather than hydrolysing them.
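The His-phi-His-phi-His-phi-phi pattern described above lends itself to a simple sequence scan. The following is a minimal sketch, not a curated signature: the set of residues treated as hydrophobic (`HYDROPHOBIC`) is an assumption made for this example, `find_hit_motifs` is a hypothetical helper name, and the test sequence is made up.

```python
import re

# Assumption: residues counted as hydrophobic (phi) for this illustration.
HYDROPHOBIC = "AVILMFWYC"

# His-phi-His-phi-His-phi-phi written as a regular expression.
HIT_MOTIF = re.compile(
    f"H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]{{2}}"
)


def find_hit_motifs(protein_seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each HIT-like motif found."""
    return [(m.start(), m.group()) for m in HIT_MOTIF.finditer(protein_seq.upper())]


if __name__ == "__main__":
    # Made-up sequence, not a real Hint or Fhit protein.
    print(find_hit_motifs("MKTAGHVHLHVLGGRQ"))  # [(5, 'HVHLHVL')]
```

A match only indicates that the pattern is present; it says nothing about which branch of the superfamily a protein belongs to. The related His-X-His-X-Gln motif of the GalT branch would need a separate pattern.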
Mannose-6-phosphate isomerase or phosphomannose isomerase (PMI) is the enzyme that catalyses the interconversion of mannose-6-phosphate and fructose-6-phosphate. In eukaryotes, PMI is involved in the synthesis of GDP-mannose, a constituent of N- and O-linked glycans and GPI anchors, and in prokaryotes it participates in a variety of pathways, including capsular polysaccharide biosynthesis and D-mannose metabolism. PMIs belong to the cupin superfamily, whose functions range from isomerase and epimerase activities involved in the modification of cell wall carbohydrates in bacteria and plants, to non-enzymatic storage proteins in plant seeds, and transcription factors linked to congenital baldness in mammals. Three classes of PMI have been defined.
Type I includes eukaryotic PMI and the enzyme encoded by the manA gene in enterobacteria. PMI has a bound zinc ion, which is essential for activity.
A crystal structure of PMI from Candida albicans shows that the enzyme has three distinct domains. The active site lies in the central domain, contains a single essential zinc atom, and forms a deep, open cavity of suitable dimensions to contain mannose-6-phosphate (M6P) or fructose-6-phosphate (F6P). The central domain is flanked by a helical domain on one side and a jelly-roll-like domain on the other.
Protein prenylation is the posttranslational attachment of either a farnesyl group or a geranylgeranyl group via a thioether linkage (-C-S-C-) to a cysteine at or near the carboxyl terminus of the protein. Farnesyl and geranylgeranyl groups are polyisoprenes, unsaturated hydrocarbons with a multiple of five carbons; the chain is 15 carbons long in the farnesyl moiety and 20 carbons long in the geranylgeranyl moiety. There are three different protein prenyltransferases in humans: farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1) share the same motif (the CaaX box) around the cysteine in their substrates, and are thus called CaaX prenyltransferases, whereas geranylgeranyltransferase 2 (GGT2, also called Rab geranylgeranyltransferase) recognises a different motif and is thus called a non-CaaX prenyltransferase. Protein prenyltransferases are currently known only in eukaryotes, but they are widespread, being found in vertebrates, insects, nematodes, plants, fungi and protozoa, including several parasites.
Each protein consists of two subunits, alpha and beta; the alpha subunit of FT and GGT1 is encoded by the same gene, FNTA. The alpha subunit is thought to participate in a stable complex with the isoprenyl substrate; the beta subunit binds the peptide substrate. In the alpha subunits of both types of protein prenyltransferases, seven tetratricopeptide repeats are formed by pairs of helices that are stabilized by conserved intercalating residues. The alpha subunits of GGT2 in mammals and plants also have an immunoglobulin-like domain between the fifth and sixth tetratricopeptide repeat, as well as leucine-rich repeats at the carboxyl terminus. The functions of these additional domains in GGT2 are as yet undefined, but they are apparently not directly involved in the interaction with substrates and Rab escort proteins. The tetratricopeptide repeats of the alpha subunit form a right-handed superhelix, which embraces the (alpha-alpha)6 barrel of the beta subunit.
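The CaaX box and the carbon counts mentioned above can be made concrete with a short sketch. It assumes the usual reading of CaaX as a cysteine, two aliphatic residues and any residue at the extreme C-terminus. The residues accepted as aliphatic (`AVIL`), the helper names and the example sequences are assumptions chosen for illustration, not a validated substrate predictor.

```python
import re

# Assumption: C, then two aliphatic residues (A, V, I or L), then any residue,
# anchored at the C-terminal end of the sequence.
CAAX_BOX = re.compile(r"C[AVIL]{2}[A-Z]$")

# Chain lengths stated in the text; each isoprene unit contributes five carbons.
CARBONS_PER_ISOPRENE_UNIT = 5
PRENYL_CARBONS = {"farnesyl": 15, "geranylgeranyl": 20}


def has_caax_box(protein_seq: str) -> bool:
    """True if the sequence ends with a CaaX-like tetrapeptide."""
    return CAAX_BOX.search(protein_seq.upper()) is not None


def isoprene_units(group: str) -> int:
    """Number of five-carbon units in the named prenyl group."""
    return PRENYL_CARBONS[group] // CARBONS_PER_ISOPRENE_UNIT


if __name__ == "__main__":
    print(has_caax_box("MTEYKLGCVLS"))       # True  (made-up sequence ending in CVLS)
    print(has_caax_box("MTEYKLGCC"))         # False (C-terminal CC is not a CaaX box)
    print(isoprene_units("farnesyl"))        # 3
    print(isoprene_units("geranylgeranyl"))  # 4
```

The pattern does not try to separate FT substrates from GGT1 substrates, nor does it cover the different motif recognised by GGT2.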
Ribosomal protein L19 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L19 is known to be located at the 30S-50S ribosomal subunit interface and may play a role in the structure and function of the aminoacyl-tRNA binding site. It belongs to a family of ribosomal proteins, including L19 from bacteria and the chloroplasts of red algae.
L19 is a protein of 120 to 130 amino-acid residues.
A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of mammalian ribosomal protein L24; yeast ribosomal protein L30A/B (Rp29) (YL21); Kluyveromyces lactis ribosomal protein L30; Arabidopsis thaliana ribosomal protein L24 homolog; Haloarcula marismortui ribosomal protein HL21/HL22; and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ1201. These proteins have 60 to 160 amino-acid residues.
A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of:
These proteins have 87 to 110 amino-acid residues.
This family includes: Ribosomal L7A from metazoa, Ribosomal L8-A and L8-B from fungi, 30S ribosomal protein HS6 from archaebacteria, 40S ribosomal protein S12 from eukaryotes, ribosomal protein L30 from eukaryotes and archaebacteria, Gadd45 and MyD118.
A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. These proteins have 82 to 87 amino acids. The amino termini are all N-alpha-acetylated. The N-terminal halves of the protein molecules are highly conserved, in contrast to the carboxy-terminal parts.
Ribosomal protein S6 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S6 is known to bind together with S18 to 16S ribosomal RNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, red algal chloroplast and cyanelle S6 ribosomal proteins.
A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of Xenopus S8, and mammalian, insect and yeast S7. These proteins have about 200 amino acids.
Synonym(s): Di-trans-poly-cis-undecaprenyl-diphosphate synthase, Undecaprenyl pyrophosphate synthetase, Undecaprenyl pyrophosphate synthase, UPP synthetase
Di-trans-poly-cis-decaprenylcistransferase (UPP synthetase) generates undecaprenyl pyrophosphate (UPP) from isopentenyl pyrophosphate (IPP). This bacterial enzyme is also found in archaebacteria and in a number of uncharacterised proteins including some from yeasts.
This entry also matches related enzymes that transfer alkyl groups, such as dehydrodolichyl diphosphate synthase.
This family is related to hydroxyethylthiazole kinase and PfkB carbohydrate kinase, implying that it is also a carbohydrate kinase.
Several uncharacterised proteins have been shown to share regions of similarity, including yeast chromosome XI hypothetical protein YKL151c; Caenorhabditis elegans hypothetical protein R107.2; Escherichia coli hypothetical protein yjeF; Bacillus subtilis hypothetical protein yxkO; Helicobacter pylori hypothetical protein HP1363; Mycobacterium tuberculosis hypothetical protein MtCY77.05c; Mycobacterium leprae hypothetical protein B229_C2_201; Synechocystis sp. (strain PCC 6803) hypothetical protein sll1433; and Methanocaldococcus jannaschii (Methanococcus jannaschii) hypothetical protein MJ1586. These are proteins of about 30 to 40 kDa whose central region is well conserved.
This TIM alpha/beta barrel structure is found in xylose isomerase and in endonuclease IV. This domain is also found in the N termini of bacterial myo-inositol catabolism proteins. These are involved in the myo-inositol catabolism pathway, and are required for growth on myo-inositol in Rhizobium leguminosarum bv. viciae.
Alanine dehydrogenases and pyridine nucleotide transhydrogenase have been shown to share regions of similarity. Alanine dehydrogenase catalyzes the NAD-dependent reversible reductive amination of pyruvate into alanine. Pyridine nucleotide transhydrogenase catalyzes the reduction of NADP+ to NADPH with the concomitant oxidation of NADH to NAD+. This enzyme is located in the plasma membrane of prokaryotes and in the inner membrane of the mitochondria of eukaryotes. The transhydrogenation between NADH and NADP is coupled with the translocation of a proton across the membrane. In prokaryotes the enzyme is composed of two different subunits, an alpha chain (gene pntA) and a beta chain (gene pntB), while in eukaryotes it is a single-chain protein. The sequences of alanine dehydrogenases from several bacterial species are related to those of the alpha subunit of bacterial pyridine nucleotide transhydrogenase and of the N-terminal half of the eukaryotic enzyme. The two most conserved regions correspond, respectively, to the N-terminal extremity of these proteins and to a central glycine-rich region that is part of the NAD(H)-binding site.
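The two reactions described above can be written out as balanced equations. This is only a sketch in standard biochemical shorthand: the subscripts "out" and "in" on the proton are notation assumed here to mark the two sides of the membrane and do not appear in the entry itself.

```latex
\begin{align*}
\text{pyruvate} + \text{NH}_3 + \text{NADH} + \text{H}^+
  &\rightleftharpoons \text{L-alanine} + \text{NAD}^+ + \text{H}_2\text{O}
  && \text{(alanine dehydrogenase)} \\
\text{NADH} + \text{NADP}^+ + \text{H}^+_{\text{out}}
  &\rightleftharpoons \text{NAD}^+ + \text{NADPH} + \text{H}^+_{\text{in}}
  && \text{(pyridine nucleotide transhydrogenase)}
\end{align*}
```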
This is a C-terminal domain of alanine dehydrogenases. This domain is also found in the lysine 2-oxoglutarate reductases.
D-amino acid oxidase (DAMOX or DAO) is an FAD flavoenzyme that catalyzes the oxidation of neutral and basic D-amino acids into their corresponding keto acids. DAOs have been characterised and sequenced in fungi and vertebrates where they are known to be located in the peroxisomes. D-aspartate oxidase (DASOX) is an enzyme, structurally related to DAO, which catalyzes the same reaction but is active only toward dicarboxylic D-amino acids. In DAO, a conserved histidine has been shown to be important for the enzyme's catalytic activity.
This entry represents the ribosomal protein L19 from eukaryotes, as well as L19e from archaea. L19/L19e is absent in bacteria. L19/L19e is part of the large ribosomal subunit, whose structure has been determined in a number of eukaryotic and archaeal species. L19/L19e is a multi-helical protein consisting of two different 3-helical domains connected by a long, partly helical linker.
Ribosomal protein L9 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L9 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities.
The crystal structure of Bacillus stearothermophilus L9 shows the 149-residue protein comprises two globular domains connected by a rigid linker. Each domain contains an rRNA binding site, and the protein functions as a structural protein in the large subunit of the ribosome. The C-terminal domain consists of two loops, an alpha-helix and a three-stranded mixed parallel, anti-parallel beta-sheet packed against the central alpha-helix. The long central alpha-helix is exposed to solvent in the middle and participates in the hydrophobic cores of the two domains at both ends.
This family contains the S24e ribosomal proteins from eukaryotes and archaebacteria. These proteins have 101 to 148 amino acids.
A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. One of these families, the S26E family, includes mammalian S26; Octopus S26; Drosophila S26 (DS31); plant cytoplasmic S26; and fungal S26. These proteins have 114 to 127 amino acids.
Translation initiation factor 5A (IF-5A) is reported to be involved in the first step of peptide bond formation in translation, to be involved in cell-cycle regulation and to be a cofactor for the Rev and Rex transactivator proteins of human immunodeficiency virus-1 and T-cell leukaemia virus I, respectively. IF-5A contains an unusual amino acid, hypusine (N-epsilon-(4-aminobutyl-2-hydroxy)lysine), that is required for its function. The first step in the post-translational modification of lysine to hypusine is catalyzed by the enzyme deoxyhypusine synthase, the structure of which has been reported.
The crystal structure of IF-5A from the archaeon Pyrobaculum aerophilum has been determined to 1.75 Å. Unmodified P. aerophilum IF-5A is found to be a beta structure with two domains and three separate hydrophobic cores. The lysine (Lys42) that is post-translationally modified by deoxyhypusine synthase is found at one end of the IF-5A molecule in a turn between beta strands beta4 and beta5; this lysine residue is freely solvent accessible. The C-terminal domain is found to be homologous to the cold-shock protein CspA of E. coli, which has a well characterised RNA-binding fold, suggesting that IF-5A is involved in RNA binding.
Phosphoenolpyruvate carboxykinase (PEPCK) catalyses the first committed (rate-limiting) step in hepatic gluconeogenesis, namely the reversible decarboxylation of oxaloacetate to phosphoenolpyruvate (PEP) and carbon dioxide, using either ATP or GTP as a source of phosphate. The ATP-utilising and GTP-utilising enzymes form two divergent subfamilies, which have little sequence similarity but which retain conserved active site residues. ATP-utilising PEPCKs are monomers or oligomers of identical subunits found in certain bacteria, yeast, trypanosomatids, and plants, while GTP-utilising PEPCKs are mainly monomers found in animals and some bacteria. Both require divalent cations for activity, such as magnesium or manganese. One cation interacts with the enzyme at metal binding site 1 to elicit activation, while the second cation interacts at metal binding site 2 to serve as a metal-nucleotide substrate. In bacteria, fungi and plants, PEPCK is involved in the glyoxylate bypass, an alternative to the tricarboxylic acid cycle.
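For reference, the reaction catalysed by the two subfamilies can be sketched as follows. The equations restate the description above in standard shorthand; the divalent cations required for activity are omitted for brevity.

```latex
\begin{align*}
\text{oxaloacetate} + \text{ATP}
  &\rightleftharpoons \text{phosphoenolpyruvate} + \text{ADP} + \text{CO}_2
  && \text{(ATP-utilising PEPCK)} \\
\text{oxaloacetate} + \text{GTP}
  &\rightleftharpoons \text{phosphoenolpyruvate} + \text{GDP} + \text{CO}_2
  && \text{(GTP-utilising PEPCK)}
\end{align*}
```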
PEPCK helps to regulate blood glucose levels. The rate of gluconeogenesis can be controlled through transcriptional regulation of the PEPCK gene by cAMP (the mediator of glucagon and catecholamines), glucocorticoids and insulin. In general, PEPCK expression is induced by glucagon, catecholamines and glucocorticoids during periods of fasting and in response to stress, but is inhibited by (glucose-induced) insulin upon feeding. With type II diabetes, this regulation system can fail, resulting in increased gluconeogenesis that in turn raises glucose levels.
PEPCK consists of an N-terminal and a catalytic C-terminal domain, with the active site and metal ions located in a cleft between them. The two domains have alpha/beta topologies that are partly similar to one another. Substrate binding causes PEPCK to undergo a conformational change, which accelerates catalysis by forcing bulk solvent molecules out of the active site. PEPCK uses an alpha/beta/alpha motif for nucleotide binding; this motif differs from those of other kinase domains. GTP-utilising PEPCK has a PEP-binding domain and two kinase motifs to bind GTP and magnesium.
This entry represents ATP-utilising phosphoenolpyruvate carboxykinase enzymes.
The ribosomal protein L13e is widely found in vertebrates, Drosophila melanogaster, plants, yeast and others.
The YrdC family of hypothetical proteins is widely distributed in eukaryotes and prokaryotes, and its members occur: (i) as independent proteins, (ii) with C-terminal extensions, and (iii) as domains in larger proteins, some of which are implicated in regulation. The YrdC protein, which consists solely of this domain, forms an alpha/beta twisted open-sheet structure composed of seven alpha helices and seven beta strands. YrdC from Escherichia coli preferentially binds to double-stranded RNA and DNA. YrdC is predicted to be an rRNA maturation factor, as deletions in its gene lead to immature ribosomal 30S subunits and, consequently, fewer translating ribosomes. Therefore, YrdC may function by keeping an rRNA structure needed for proper processing of 16S rRNA, especially at lower temperatures. Sua5 is an example of a multi-domain protein that contains an N-terminal YrdC-like domain and a C-terminal Sua5 domain. Sua5 was identified in Saccharomyces cerevisiae (Baker's yeast) as a suppressor of a translation initiation defect in the cytochrome c gene and is required for normal growth in yeast; however, its exact function remains unknown. HypF is involved in the synthesis of the active site of [NiFe]-hydrogenases.
Cytoskeleton-associated proteins (CAP) are made of three distinct parts: an N-terminal section that is most probably globular and contains the CAP-Gly domain, a large central region predicted to be in an alpha-helical coiled-coil conformation and, finally, a short C-terminal globular domain. The CAP-Gly domain is a conserved, glycine-rich domain of about 42 residues found in some CAPs. Proteins known to contain this domain include restin (also known as cytoplasmic linker protein-170 or CLIP-170), a 160 kDa protein that is associated with intermediate filaments and links endocytic vesicles to microtubules; vertebrate dynactin (150 kDa dynein-associated polypeptide; DAP) and Drosophila glued, a major component of activator I; yeast protein BIK1, which seems to be required for the formation or stabilisation of microtubules during mitosis and for spindle pole body fusion during conjugation; yeast protein NIP100 (NIP80); human protein CKAP1/TFCB; Schizosaccharomyces pombe protein alp11 and Caenorhabditis elegans hypothetical protein F53F4.3. The latter proteins contain an N-terminal ubiquitin domain and a C-terminal CAP-Gly domain.
The crystal structure of the CAP-Gly domain of C. elegans F53F4.3 protein, solved by single wavelength sulphur-anomalous phasing, revealed a novel protein fold containing three beta-sheets. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove. Residues in the groove are highly conserved as measured from the information content of the aligned sequences. The C-terminal tail of another molecule in the crystal is bound in this groove.
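As an illustration of how the conserved GKNDG sequence might be located in a protein sequence, here is a minimal Python sketch. The function name `find_motif` and the toy sequence are hypothetical and invented for this example; the toy sequence is not a real CAP-Gly protein, and an exact-match scan is a simplification compared with the profile-based methods normally used to detect domains.

```python
import re


def find_motif(sequence, motif="GKNDG"):
    """Return the 0-based start positions of every exact occurrence of
    `motif` in `sequence`, including overlapping occurrences."""
    # A zero-width lookahead lets the scan report overlapping matches.
    pattern = re.compile("(?=" + re.escape(motif) + ")")
    return [match.start() for match in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Made-up sequence for demonstration only (not a real protein).
    toy_sequence = "MSTGKNDGAVLKGKNDGKNDGE"
    print(find_motif(toy_sequence))  # → [3, 12, 16]
```

The lookahead is used so that two motif copies sharing a glycine, as at positions 12 and 16 of the toy sequence, are both reported.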
L15 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L15 is known to bind the 23S rRNA. Ribosomal protein, L15 from bacteria and plant chloroplasts (nuclear-encoded) belong to this family. Vertebrate L27a, Tetrahymena thermophila L29 and fungal L27a (L29, CRP-1, CYH2) also are members of this group.
Ribosomal L18E protein from a number of archebacteria show homology to both the eukaryotic L18 and eubacterial ribosomal protein L15, an observation which has been seen to substantiate the belief that archaea represent an evolutionary stage between bacteria and eukaryotes.
Creatinase or creatine amidinohydrolase catalyses the conversion of creatine and water to sarcosine and urea. The enzyme works as a homodimer, and is induced by choline chloride. Each monomer of creatinase has two clearly defined domains, a small N-terminal domain, and a large C-terminal domain.
The structure of the C-terminal region represents the "pita-bread" fold. The fold contains both alpha helices and an anti-parallel beta sheet within two structurally similar domains that are thought to be derived from an ancient gene duplication. The active site, where conserved, is located between the two domains. The fold is common to methionine aminopeptidase, aminopeptidase P, prolidase, agropine synthase and creatinase. Though many of these peptidases require a divalent cation, creatinase is not a metal-dependent enzyme.
Peptide deformylase (PDF) is an essential metalloenzyme required for the removal of the formyl group at the N-terminus of nascent polypeptide chains in eubacteria. The enzyme acts as a monomer and binds a single zinc ion, catalysing the reaction:
N-formyl-L-methionine + H2O = formate + methionyl peptide
Catalytic efficiency strongly depends on the identity of the bound metal.
The structure of these enzymes is known. PDF, a member of the zinc metalloproteases family, comprises an active core domain of 147 residues and a C-terminal tail of 21 residues. The 3D fold of the catalytic core has been determined by X-ray crystallography and NMR. Overall, the structure contains a series of anti-parallel beta-strands that surround two perpendicular alpha-helices. The C-terminal helix contains the characteristic HEXXH motif of metalloenzymes, which is crucial for activity. The helical arrangement, and the way the histidine residues bind the zinc ion, is reminiscent of other metalloproteases, such as thermolysin or metzincins. However, the arrangement of secondary and tertiary structures of PDF, and the positioning of its third zinc ligand (a cysteine residue), are quite different. These discrepancies, together with notable biochemical differences, suggest that PDF constitutes a new class of zinc-metalloproteases.
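As a minimal sketch, the HEXXH motif described above can be located in a protein sequence with a regular expression. The function name `find_hexxh` and the test sequences are hypothetical, and X is assumed to stand for any residue; a match only flags a candidate site and says nothing about whether the region is helical or binds zinc.

```python
import re

# HEXXH: His, Glu, any two residues, His (X assumed to be any amino acid).
# The lookahead lets overlapping occurrences be reported as well.
HEXXH = re.compile(r"(?=HE..H)")

def find_hexxh(sequence: str) -> list[int]:
    """Return the 1-based start position of every HEXXH occurrence."""
    return [m.start() + 1 for m in HEXXH.finditer(sequence.upper())]
```

For the made-up sequence `MKTAIGHEAGHSLLGG`, the only occurrence starts at residue 7.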
The mRNA capping enzyme in yeasts is composed of two separate chains: alpha, an mRNA guanyltransferase, and beta, an RNA 5'-triphosphatase. X-ray crystallography reveals a large conformational change during guanyl transfer by mRNA capping enzymes. Binding of the enzyme to nucleotides is specific to the GMP moiety of GTP. The viral mRNA capping enzyme is a monomer that transfers a GMP cap onto the end of mRNA that terminates with a 5'-diphosphate tail.
The OB-fold (oligonucleotide/oligosaccharide-binding fold) is found in all three kingdoms and its common architecture presents a binding face that has adapted to bind different ligands. The OB-fold is a five/six-stranded closed beta-barrel formed by 70-80 amino acid residues. The strands are connected by loops of varying length which form the functional appendages of the protein. The majority of OB-fold proteins use the same face for ligand binding or as an active site. Different OB-fold proteins use this 'fold-related binding face' to, variously, bind oligosaccharides, oligonucleotides, proteins, metal ions and catalytic substrates.
This entry contains OB-fold domains that bind to nucleic acids. It includes the anti-codon binding domain of lysyl, aspartyl, and asparaginyl-tRNA synthetases. Aminoacyl-tRNA synthetases catalyse the addition of an amino acid to the appropriate tRNA molecule. This domain is found in RecG helicase involved in DNA repair. Replication factor A is a heterotrimeric complex that contains a subunit in this family. This domain is also found at the C terminus of bacterial DNA polymerase III alpha chain.
Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified. This sequence motif represents one beta-sheet blade, and several of these repeats can associate to form a beta-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein, creating a 6-bladed beta-propeller. The motif is also found in mouse protein MIPP and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin, and in galactose oxidase from the fungus Dactylium dendroides. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded anti-parallel beta-sheet motif that forms the repeat unit in a super-barrel structural fold.
The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase.
This entry represents a type of kelch sequence motif that comprises one beta-sheet blade.
Ribonuclease HII is involved in the degradation of the ribonucleotide moiety on RNA-DNA hybrid molecules, carrying out endonucleolytic cleavage to 5'-phosphomonoester. Proteins which belong to this family have been found in bacteria, archaea, and yeasts. This family also includes Ribonuclease HIII.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
The FYVE zinc finger is named after four proteins that it has been found in: Fab1, YOTB/ZK632.12, Vac1, and EEA1. The FYVE finger has been shown to bind two zinc ions. The FYVE finger has eight potential zinc coordinating cysteine positions. Many members of this family also include two histidines in a motif R+HHC+XCG, where + represents a charged residue and X any residue. FYVE-type domains are divided into two known classes: FYVE domains that specifically bind to phosphatidylinositol 3-phosphate in lipid bilayers and FYVE-related domains of undetermined function. Those that bind to phosphatidylinositol 3-phosphate are often found in proteins targeted to lipid membranes that are involved in regulating membrane traffic. Most FYVE domains target proteins to endosomes by binding specifically to phosphatidylinositol-3-phosphate at the membrane surface. By contrast, the CARP2 FYVE-like domain is not optimized to bind to phosphoinositides or insert into lipid bilayers. FYVE domains are distinguished from other zinc fingers by three signature sequences: an N-terminal WxxD motif, a basic R(R/K)HHCR patch, and a C-terminal RVC motif.
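The three signature sequences listed above can be sketched as an ordered pattern search. This is only an illustration: the function name `fyve_signatures` and the test sequence are hypothetical, 'x' is assumed to mean any residue, and a real FYVE assignment would rely on a curated profile rather than three short patterns.

```python
import re

# Signatures in N-to-C order, as listed in the text above.
SIGNATURES = {
    "WxxD": re.compile(r"W..D"),
    "R(R/K)HHCR": re.compile(r"R[RK]HHCR"),
    "RVC": re.compile(r"RVC"),
}

def fyve_signatures(sequence: str):
    """Return the 1-based start of each signature if all three occur in order.

    Each search starts where the previous hit ended; returns None when any
    signature is missing or out of order.
    """
    seq = sequence.upper()
    hits = {}
    pos = 0
    for name, pattern in SIGNATURES.items():
        match = pattern.search(seq, pos)
        if match is None:
            return None
        hits[name] = match.start() + 1
        pos = match.end()
    return hits
```

A sequence that carries the RVC motif before the WxxD motif, and nothing after it, is rejected because the order is part of the check.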
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
The N-terminal and internal 5'3'-exonuclease domains are commonly found together, and are most often associated with 5' to 3' nuclease activities. The XPG protein signatures are never found outside the '53EXO' domains. The latter are found in more diverse proteins. The number of amino acids that separate the two 53EXO domains, and the presence of accompanying motifs allow the diagnosis of several protein families.
In the eubacterial type A DNA-polymerases, the N-terminal and internal domains are separated by a few amino acids, usually four. The pattern DNA_POLYMERASE_A is always present towards the C-terminus. Several eukaryotic structure-dependent endonucleases and exonucleases have the 53EXO domains separated by 24 to 27 amino acids, and the XPG protein signatures are always present. In several proteins from herpesviridae, the two 53EXO domains are separated by 50 to 120 amino acids. These proteins are implicated in the inhibition of the expression of the host genes. Eukaryotic DNA repair proteins with 600 to 700 amino acids between the 53EXO domains all carry the XPG protein signatures.
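The spacing rules above lend themselves to a simple lookup. The sketch below is illustrative only: the function name, the returned labels and the cut-off of 10 residues for "a few amino acids" are assumptions, not part of any published classification.

```python
# Assumed upper bound for "a few amino acids, usually four".
MAX_SHORT_SPACER = 10

def classify_53exo_spacer(spacer_length: int, has_xpg_signature: bool = False) -> str:
    """Suggest a protein group from the number of residues between the two 53EXO domains."""
    if spacer_length < 0:
        raise ValueError("spacer length cannot be negative")
    if spacer_length <= MAX_SHORT_SPACER:
        return "eubacterial type A DNA polymerase"
    if 24 <= spacer_length <= 27 and has_xpg_signature:
        return "eukaryotic structure-dependent nuclease"
    if 50 <= spacer_length <= 120:
        return "herpesviridae protein"
    if 600 <= spacer_length <= 700 and has_xpg_signature:
        return "eukaryotic DNA repair protein"
    return "unclassified"
```

The XPG flag is required for the 24 to 27 and 600 to 700 ranges because the text states that those groups always carry the XPG signatures; anything outside the described ranges is left unclassified rather than guessed.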
This family of proteins utilises NAD as a cofactor. The proteins in this family use nucleotide-sugar substrates for a variety of chemical reactions. It contains the NAD(P)-binding domain, which is a commonly found domain with a core Rossmann-type fold. One of the best studied of these proteins is UDP-galactose 4-epimerase, which catalyses the conversion of UDP-galactose to UDP-glucose during galactose metabolism.
Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including vitamin B12, haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin.
The first stage in tetrapyrrole synthesis is the synthesis of 5-aminolaevulinic acid (ALA) via two possible routes: (1) condensation of succinyl CoA and glycine (C4 pathway) using ALA synthase, or (2) decarboxylation of glutamate (C5 pathway) via three different enzymes, glutamyl-tRNA synthetase to charge a tRNA with glutamate, glutamyl-tRNA reductase to reduce glutamyl-tRNA to glutamate-1-semialdehyde (GSA), and GSA aminotransferase to catalyse a transamination reaction to produce ALA.
The second stage is to convert ALA to uroporphyrinogen III, the first macrocyclic tetrapyrrolic structure in the pathway. This is achieved by the action of three enzymes in one common pathway: porphobilinogen (PBG) synthase (or ALA dehydratase) to condense two ALA molecules to generate porphobilinogen; hydroxymethylbilane synthase (or PBG deaminase) to polymerise four PBG molecules into preuroporphyrinogen (tetrapyrrole structure); and uroporphyrinogen III synthase to link two pyrrole units together (rings A and D) to yield uroporphyrinogen III.
Uroporphyrinogen III is the first branch point of the pathway. To synthesise cobalamin (vitamin B12), sirohaem, and coenzyme F430, uroporphyrinogen III needs to be converted into precorrin-2 by the action of uroporphyrinogen III methyltransferase. To synthesise haem and chlorophyll, uroporphyrinogen III needs to be decarboxylated into coproporphyrinogen III by the action of uroporphyrinogen III decarboxylase.
This entry represents hydroxymethylbilane synthase (or porphobilinogen deaminase), which functions during the second stage of tetrapyrrole biosynthesis. This enzyme catalyses the polymerisation of four PBG molecules into the tetrapyrrole structure, preuroporphyrinogen, with the concomitant release of four molecules of ammonia. This enzyme uses a unique dipyrromethane cofactor made from two molecules of PBG, which is covalently attached to a cysteine side chain. The tetrapyrrole product is synthesized in an ordered, sequential fashion, by initial attachment of the first pyrrole unit (ring A) to the cofactor, followed by subsequent additions of the remaining pyrrole units (rings B, C, D) to the growing pyrrole chain. The link between the pyrrole ring and the cofactor is broken once all the pyrroles have been added. This enzyme is folded into three distinct domains that enclose a single, large active site that makes use of an aspartic acid as its one essential catalytic residue, acting as a general acid/base during catalysis. A deficiency of hydroxymethylbilane synthase is implicated in the neuropathic disease, Acute Intermittent Porphyria (AIP).
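The stoichiometry implied by the second stage can be checked with a little arithmetic. The sketch below uses only the ratios stated in the text (two ALA per PBG, four PBG per tetrapyrrole, four ammonia released per tetrapyrrole); the function and constant names are hypothetical, and the two PBG units of the enzyme-bound cofactor are assumed not to be consumed per product molecule.

```python
ALA_PER_PBG = 2            # PBG synthase condenses two ALA molecules
PBG_PER_TETRAPYRROLE = 4   # hydroxymethylbilane synthase polymerises four PBG
NH3_PER_TETRAPYRROLE = 4   # four ammonia molecules released per tetrapyrrole

def precursor_requirements(n_tetrapyrroles: int) -> dict[str, int]:
    """Count precursors consumed and ammonia released for n tetrapyrrole molecules."""
    if n_tetrapyrroles < 0:
        raise ValueError("number of tetrapyrroles cannot be negative")
    pbg = n_tetrapyrroles * PBG_PER_TETRAPYRROLE
    return {
        "ALA": pbg * ALA_PER_PBG,
        "PBG": pbg,
        "NH3_released": n_tetrapyrroles * NH3_PER_TETRAPYRROLE,
    }
```

Under these assumptions, each molecule of preuroporphyrinogen accounts for eight molecules of ALA.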
This is a large family of DNA-binding helix-turn-helix proteins that includes a bacterial plasmid copy control protein, bacterial methylases, various bacteriophage transcription control proteins and a vegetative specific protein from Dictyostelium discoideum (Slime mould).
The PHO-4 family of transporters includes the phosphate-repressible phosphate permease (PHO-4) from Neurospora crassa which is probably a sodium-phosphate symporter. This family also includes the human leukemia virus receptor.
Some members of the collagen superfamily are not involved in connective tissue structure but share the same triple helical structure.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.
Clathrin is a trimer composed of three heavy chains and three light chains, each monomer projecting outwards like a leg; this three-legged structure is known as a triskelion. The heavy chains form the legs, their N-terminal beta-propeller regions extending outwards, while their C-terminal alpha-alpha-superhelical regions form the central hub of the triskelion. Peptide motifs can bind between the beta-propeller blades. The light chains appear to have a regulatory role, and may help orient the assembly and disassembly of clathrin coats as they interact with hsc70 uncoating ATPase. Clathrin triskelia self-polymerise into a curved lattice by twisting individual legs together. The clathrin lattice forms around a vesicle as it buds from the TGN, plasma membrane or endosomes, acting to stabilise the vesicle and facilitate the budding process. The multiple blades created when the triskelia polymerise are involved in multiple protein interactions, enabling the recruitment of different cargo adaptors and membrane attachment proteins.
This entry represents the N-terminal beta-propeller region of clathrin heavy chains, which extends away from the hub of triskelia and is responsible for peptide binding.
More information about these proteins can be found at Protein of the Month: Clathrin.
Members of this family are found in proteasome regulatory subunits, eukaryotic initiation factor 3 (eIF3) subunits and regulators of transcription factors. This family is also known as the MPN domain and PAD-1-like domain. It has been shown that this domain occurs in prokaryotes.
Mov34 proteins act as the regulatory subunit of the 26S proteasome, which is involved in the ATP-dependent degradation of ubiquitinated proteins. The function of this domain is unclear, but it is found in the N-terminus of the proteasome regulatory subunits, eukaryotic initiation factor 3 (eIF3) subunits and regulators of transcription factors.
A number of the proteins associated with this family belong to MEROPS peptidase family M67 (clan M-). This includes the Poh1 peptidase of Saccharomyces cerevisiae (Baker's yeast) which is a component of the 19S proteasome regulatory particle.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of metallopeptidases belongs to the MEROPS peptidase family M12, subfamily M12A (astacin family, clan MA(M)). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA, and the predicted active site residues for members of this family and thermolysin occur in the motif HEXXH.
The astacin family of metalloendopeptidases encompasses a range of proteins found in hydra to humans, in mature and developmental systems. Their functions include activation of growth factors, degradation of polypeptides, and processing of extracellular proteins. The proteins are synthesised with N-terminal signal and pro-enzyme sequences, and many contain multiple domains C-terminal to the protease domain. They are either secreted from cells, or are associated with the plasma membrane.
The astacin molecule adopts a kidney shape, with a deep active-site cleft between its N- and C-terminal domains. The zinc ion, which lies at the bottom of the cleft, exhibits a unique penta-coordinated mode of binding, involving 3 histidine residues, a tyrosine and a water molecule (which is also bound to the carboxylate side chain of Glu93). The N-terminal domain comprises 2 alpha-helices and a 5-stranded beta-sheet. The overall topology of this domain is shared by the archetypal zinc-endopeptidase thermolysin. Astacin protease domains also share common features with serralysins, matrix metalloendopeptidases, and snake venom proteases; they cleave peptide bonds in polypeptides such as insulin B chain and bradykinin, and in proteins such as casein and gelatin; and they have arylamidase activity.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.
Cysteinyl-tRNA synthetase is an alpha monomer and belongs to class Ia.
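The class assignments listed above can be captured in a small lookup table. This is a sketch only: the three-letter amino acid codes and the function names are illustrative choices, and the subclasses (a, b, c) are not encoded because the text does not list their membership.

```python
# Amino acid specificities by synthetase class, as listed in the text above.
CLASS_I = frozenset({"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"})
CLASS_II = frozenset({"Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"})

def synthetase_class(amino_acid: str) -> str:
    """Return 'I' or 'II' for a three-letter amino acid code (case-insensitive)."""
    code = amino_acid.strip().capitalize()
    if code in CLASS_I:
        return "I"
    if code in CLASS_II:
        return "II"
    raise ValueError(f"unknown amino acid code: {amino_acid!r}")

def preferred_hydroxyl(amino_acid: str) -> str:
    """Return the tRNA hydroxyl to which the aminoacyl group is preferentially coupled."""
    return "2'-OH" if synthetase_class(amino_acid) == "I" else "3'-OH"
```

For example, `synthetase_class("Cys")` gives class I, in line with the cysteinyl-tRNA synthetase entry above.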
Phenylalanyl-tRNA synthetase is an alpha2/beta2 tetramer composed of 2 subunits that belongs to class IIc. In eubacteria, a small subunit (pheS gene) can be designated as beta (E. coli) or alpha subunit (nomenclature adopted in InterPro). Reciprocally, the large subunit (pheT gene) can be designated as alpha (E. coli) or beta. In other kingdoms, such as eukaryota, the two subunits have equivalent length and can be identified by specific signatures. The enzyme from Thermus thermophilus has an alpha2/beta2 type quaternary structure and is one of the most complicated members of the synthetase family. Identification of phenylalanyl-tRNA synthetase as a member of class II aaRSs was based only on sequence alignment of the small alpha-subunit with other synthetases.
Alanyl-tRNA synthetase is an alpha4 tetramer that belongs to class IIc.
This entry describes a family of small GTPase activating proteins, for example ARF1-directed GTPase-activating protein, the cycle control GTPase activating protein (GAP) GCS1 which is important for the regulation of the ADP ribosylation factor ARF, a member of the Ras superfamily of GTP-binding proteins. The GTP-bound form of ARF is essential for the maintenance of normal Golgi morphology; it participates in the recruitment of coat proteins, which are required for budding and fission of membranes. Before fusion with an acceptor compartment, the membrane must be uncoated. This step requires the hydrolysis of GTP associated with ARF. These proteins contain a characteristic zinc finger motif (Cys-x2-Cys-x(16,17)-Cys-x2-Cys) which displays some similarity to the C4-type GATA zinc finger. The ARFGAP domain displays no obvious similarity to other GAP proteins.
The 3D structure of the ARFGAP domain of the PYK2-associated protein beta has been solved. It consists of a three-stranded beta-sheet surrounded by 5 alpha helices. The domain is organised around a central zinc atom which is coordinated by 4 cysteines. The ARFGAP domain is clearly unrelated to the other GAP protein structures, which are exclusively helical. Classical GAP proteins accelerate GTPase activity by supplying an arginine finger to the active site. The crystal structure of ARFGAP bound to ARF revealed that the ARFGAP domain does not supply an arginine to the active site, which suggests a more indirect role of the ARFGAP domain in GTP hydrolysis.
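A C4-type zinc finger of this kind can be searched for with a regular expression. The sketch below assumes the four-cysteine spacing Cys-x2-Cys-x(16,17)-Cys-x2-Cys, which is consistent with the four zinc-coordinating cysteines described above; the function name and test sequences are hypothetical, and 'x' is assumed to mean any residue.

```python
import re

# Four cysteines with spacers of 2, 16-17 and 2 residues (assumed spacing).
ARFGAP_ZNF = re.compile(r"(C).{2}(C).{16,17}(C).{2}(C)")

def find_arfgap_cysteines(sequence: str):
    """Return 1-based positions of the four cysteines in the first match, or None."""
    match = ARFGAP_ZNF.search(sequence.upper())
    if match is None:
        return None
    return [match.start(group) + 1 for group in range(1, 5)]
```

A hit identifies only a candidate zinc-binding site; a central spacer shorter than 16 residues, for instance, is rejected.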
The Rev protein of human immunodeficiency virus type 1 (HIV-1) facilitates nuclear export of unspliced and partly-spliced viral RNAs. Rev contains an RNA-binding domain and an effector domain; the latter is believed to interact with a cellular cofactor required for the Rev response and hence HIV-1 replication. Human Rev interacting protein (hRIP) specifically interacts with the Rev effector. The amino acid sequence of hRIP is characterised by an N-terminal, C-4 class zinc finger motif.
The ENTH (Epsin N-terminal homology) domain is approximately 150 amino acids in length and is always found located at the N-termini of proteins. The domain forms a compact globular structure, composed of 9 alpha-helices connected by loops of varying length. The general topology is determined by three helical hairpins that are stacked consecutively with a right hand twist. An N-terminal helix folds back, forming a deep basic groove that forms the binding pocket for the Ins(1,4,5)P3 ligand. The ligand is coordinated by residues from surrounding alpha-helices and all three phosphates are multiply coordinated. The coordination of Ins(1,4,5)P3 suggests that ENTH is specific for particular head groups.
Proteins containing this domain have been found to bind PtdIns(4,5)P2 and PtdIns(1,4,5)P3, suggesting that the domain may be a membrane interacting module. The main function of proteins containing this domain appears to be to act as accessory clathrin adaptors in endocytosis. Epsin is able to recruit and promote clathrin polymerisation on a lipid monolayer, but may have additional roles in signalling and actin regulation. Epsin causes a strong degree of membrane curvature and tubulation, even fragmentation of membranes with a high PtdIns(4,5)P2 content. Epsin binding to membranes facilitates their deformation by insertion of the N-terminal helix into the outer leaflet of the bilayer, pushing the head groups apart. This would reduce the energy needed to curve the membrane into a vesicle, making it easier for the clathrin cage to fix and stabilise the curved membrane. This points to a pioneering role for epsin in vesicle budding as it provides both a driving force and a link between membrane invagination and clathrin polymerisation.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of metallopeptidases belongs to the MEROPS peptidase family M12, subfamily M12B (adamalysin family, clan MA(M)). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA, and the predicted active site residues for members of this family and thermolysin occur in the motif HEXXH.
The adamalysins are zinc-dependent endopeptidases found in snake venom. There are also some mammalian members, such as fertilin. Fertilin and closely related proteins appear to lack some active site residues and may not be active enzymes.
CD156 (also called ADAM8 or MS2 human) has been implicated in extravasation of leukocytes. CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).
This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology.
Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, as in U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus, suggesting a more general role for Lsm proteins.
The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA. Hfq adopts an Lsm-like fold; however, unlike the heptameric Sm proteins, it forms a homo-hexameric ring.
Amidase signature (AS) enzymes are a large group of hydrolytic enzymes that contain a conserved stretch of approximately 130 amino acids known as the AS sequence. They are widespread, being found in both prokaryotes and eukaryotes. AS enzymes catalyse the hydrolysis of amide bonds (CO-NH2), although the family has diverged widely with regard to substrate specificity and function. Nonetheless, these enzymes maintain a core alpha/beta/alpha structure, where the topologies of the N- and C-terminal halves are similar. AS enzymes characteristically have a highly conserved C-terminal region rich in serine and glycine residues, but devoid of aspartic acid and histidine residues; they therefore differ from classical serine hydrolases. These enzymes possess a unique, highly conserved Ser-Ser-Lys catalytic triad used for amide hydrolysis, although the catalytic mechanism for acyl-enzyme intermediate formation can differ between enzymes.
Examples of AS enzymes include:
This group of metallopeptidases belongs to MEROPS peptidase family M3 (clan MA(E)), subfamilies M3A and M3B. The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA.
The Thimet oligopeptidase family is a large family of archaeal, bacterial and eukaryotic oligopeptidases that cleave medium-sized peptides. The group contains:
This group of metallopeptidases belongs to the MEROPS peptidase family M1 (clan MA(E)), the type example being aminopeptidase N from Homo sapiens (Human). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA.
Membrane alanine aminopeptidase is part of the HEXXH+E group; it consists entirely of aminopeptidases, spread across a wide variety of species. Functional studies show that CD13/APN catalyzes the removal of single amino acids from the amino terminus of small peptides and probably plays a role in their final digestion; one family member (leukotriene-A4 hydrolase) is known to hydrolyse the epoxide leukotriene-A4 to form an inflammatory mediator. This hydrolase has been shown to have aminopeptidase activity, and the zinc ligands of the M1 family were identified by site-directed mutagenesis on this enzyme. CD13 participates in trimming peptides bound to MHC class II molecules and cleaves MIP-1 chemokine, which alters target cell specificity from basophils to eosinophils. CD13 acts as a receptor for specific strains of RNA viruses (coronaviruses) which cause a relatively large percentage of upper respiratory tract infections.
CD molecules are leucocyte antigens on cell surfaces. CD antigen nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).
This group of metallopeptidases belongs to MEROPS peptidase family M41 (FtsH endopeptidase family, clan MA(E)). The predicted active site residues for members of this family and thermolysin, the type example for clan MA, occur in the motif HEXXH.
The peptidase M41 family belongs to a larger family of zinc metalloproteases. This family includes the cell division protein FtsH, and the yeast mitochondrial respiratory chain complexes assembly protein, which is a putative ATP-dependent protease required for assembly of the mitochondrial respiratory chain and ATPase complexes. FtsH is an integral membrane protein, which seems to act as an ATP-dependent zinc metallopeptidase that binds one zinc ion.
The NHL repeat, named after NCL-1, HT2A and Lin-41, is found in a large number of eukaryotic and prokaryotic proteins. For example, the repeat is found in a variety of enzymes of the copper type II, ascorbate-dependent monooxygenase family, which catalyse the C-terminal alpha-amidation of biological peptides. In many proteins it occurs in tandem arrays, for example in the RING finger, B-box, coiled-coil (RBCC) eukaryotic growth regulators. The 'Brain Tumor' protein (Brat) is one such growth regulator that contains a 6-bladed NHL-repeat beta-propeller.
NHL repeats are also found in serine/threonine protein kinases (STPKs) in a diverse range of pathogenic bacteria. These STPKs are transmembrane receptors with an intracellular N-terminal kinase domain and an extracellular C-terminal sensor domain. In the STPK PknD from Mycobacterium tuberculosis, the sensor domain forms a rigid, six-bladed beta-propeller composed of NHL repeats with a flexible tether to the transmembrane domain.
The ELM2 (Egl-27 and MTA1 homology 2) domain is a small domain of unknown function. It is found in the MTA1 protein that is part of the NuRD complex. The domain is usually found N-terminal to a myb-like DNA-binding domain and a GATA-binding domain. ELM2, in some instances, is also found associated with the ARID DNA-binding domain. This suggests that ELM2 may also be involved in DNA binding, or perhaps is a protein-protein interaction domain.
Iron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S]. FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems.
The ISC system is conserved in eubacteria and eukaryotes (mitochondria), and has broad specificity, targeting general FeS proteins. It is encoded by the isc operon (iscRSUA-hscBA-fdx-iscX). IscS is a cysteine desulphurase, which obtains S from cysteine (converting it to alanine) and serves as an S donor for FeS cluster assembly. IscU and IscA act as scaffolds to accept S and Fe atoms, assembling clusters and transferring them to recipient apoproteins. HscA is a molecular chaperone and HscB is a co-chaperone. Fdx is a [2Fe-2S]-type ferredoxin. IscR is a transcription factor that regulates expression of the isc operon. IscX (also known as YfhJ) appears to interact with IscS and may function as an Fe donor during cluster assembly.
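The operon shorthand used above (iscRSUA-hscBA-fdx-iscX) packs several gene names into one string. The sketch below expands that notation into individual genes and pairs each with the role stated in the text. The function name, the assumption of a three-letter gene stem, and the wording of the role labels are choices made for this example only.

```python
def expand_operon(shorthand: str, stem_len: int = 3) -> list[str]:
    """Expand operon shorthand such as 'iscRSUA-hscBA' into gene names.

    Assumes each dash-separated block is a stem of `stem_len` letters
    followed by zero or more one-letter gene suffixes.
    """
    genes = []
    for block in shorthand.split("-"):
        stem, suffixes = block[:stem_len], block[stem_len:]
        if suffixes:
            genes.extend(stem + letter for letter in suffixes)
        else:
            genes.append(stem)  # e.g. 'fdx' carries no suffix letters
    return genes


# Roles as summarised in the paragraph above (labels paraphrased).
ISC_ROLES = {
    "iscR": "transcription factor regulating the isc operon",
    "iscS": "cysteine desulphurase (S donor)",
    "iscU": "scaffold for cluster assembly",
    "iscA": "scaffold for cluster assembly",
    "hscB": "co-chaperone",
    "hscA": "molecular chaperone",
    "fdx": "[2Fe-2S]-type ferredoxin",
    "iscX": "possible Fe donor, interacts with IscS",
}

isc_genes = expand_operon("iscRSUA-hscBA-fdx-iscX")
for gene in isc_genes:
    print(f"{gene}: {ISC_ROLES[gene]}")
```

The same helper also expands other shorthand written in this style, for example `expand_operon("sufABCDSE")`.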
The SUF system is an alternative pathway to the ISC system that operates under iron starvation and oxidative stress. It is found in eubacteria, archaea and eukaryotes (plastids). The SUF system is encoded by the suf operon (sufABCDSE), and the six encoded proteins are arranged into two complexes (SufSE and SufBCD) and one protein (SufA). SufS is a pyridoxal-phosphate (PLP) protein displaying cysteine desulphurase activity. SufE acts as a scaffold protein that accepts S from SufS and donates it to SufA. SufC is an ATPase with an unorthodox ATP-binding cassette (ABC)-like component. No specific functions have been assigned to SufB and SufD. SufA is homologous to IscA, acting as a scaffold protein in which Fe and S atoms are assembled into [FeS] cluster forms, which can then easily be transferred to apoprotein targets.
In the NIF system, NifS and NifU are required for the formation of metalloclusters of nitrogenase in Azotobacter vinelandii and other organisms, as well as in the maturation of other FeS proteins. Nitrogenase catalyses the fixation of nitrogen. It contains a complex cluster, the FeMo cofactor, which contains molybdenum, Fe and S. NifS is a cysteine desulphurase. NifU binds one Fe atom at its N terminus, assembling an FeS cluster that is transferred to nitrogenase apoproteins. Nif proteins involved in the formation of FeS clusters can also be found in organisms that do not fix nitrogen.
This entry represents SufB and SufD proteins that form part of the SufBCD complex in the SUF system. No specific functions have been assigned to these proteins.
The major protein of the outer mitochondrial membrane of eukaryotes is a porin that forms a voltage-dependent anion-selective channel (VDAC) that behaves as a general diffusion pore for small hydrophilic molecules. The channel adopts an open conformation at low or zero membrane potential and a closed conformation at potentials above 30-40 mV.
This protein contains about 280 amino acids and its sequence is composed of between 12 and 16 beta-strands that span the mitochondrial outer membrane. Yeast contains two members of this family (genes POR1 and POR2); vertebrates have at least three members (genes VDAC1, VDAC2 and VDAC3).
SKP1 (together with SKP2) was identified as an essential component of the cyclin A-CDK2 S phase kinase complex. It was found to bind several F-box containing proteins (e.g., Cdc4, Skp2, cyclin F) and to be involved in the ubiquitin protein degradation pathway. A yeast homologue of SKP1 (P52286) was identified in the centromere bound kinetochore complex and is also involved in the ubiquitin pathway. In Dictyostelium discoideum (Slime mold) FP21 was shown to be glycosylated in the cytosol and has homology to SKP1.
This entry represents a dimerisation domain found at the C terminus of SKP1 proteins, as well as in subunit D of the centromere DNA-binding protein complex Cbf3. This domain is multi-helical in structure, and consists of an interlocked heterodimer in F-box proteins.
This family includes:
CTP:cholinephosphate cytidylyltransferase (CCT) is a key regulatory enzyme in phosphatidylcholine biosynthesis that catalyzes the formation of CDP-choline. A comparison of the catalytic domains of CCTs from a wide variety of organisms reveals a large number of completely conserved residues. There may be a role for the conserved HXGH sequence in catalysis. The membrane-binding domain in rat CCT has been defined, and it has been suggested that lipids may play a role in inactivating the enzyme. A phosphorylation domain has been described.
The PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was named after the proteins in which it was first found. PUA is a highly conserved RNA-binding motif found in a wide range of archaeal, bacterial and eukaryotic proteins, including enzymes that catalyse tRNA and rRNA post-transcriptional modifications, proteins involved in ribosome biogenesis and translation, as well as in enzymes involved in proline biosynthesis. The structures of several PUA-RNA complexes reveal a common RNA recognition surface, but also some versatility in the way in which the motif binds to RNA. PUA motifs are involved in dyskeratosis congenita and cancer, pointing to links between RNA metabolism and human diseases.
Lipoxygenases are a class of iron-containing dioxygenases that catalyse the hydroperoxidation of lipids containing a cis,cis-1,4-pentadiene structure. They are common in plants, where they may be involved in a number of diverse aspects of plant physiology including growth and development, pest resistance, and senescence or responses to wounding. In mammals a number of lipoxygenase isozymes are involved in the metabolism of prostaglandins and leukotrienes. Sequence data is available for the following lipoxygenases:
The iron atom in lipoxygenases is bound by four ligands, three of which are histidine residues. Six histidines are conserved in all lipoxygenase sequences; five of them are found clustered in a stretch of 40 amino acids. This region contains two of the three histidine iron ligands; the other histidines have been shown to be important for the activity of lipoxygenases.
This entry represents a domain found in lipoxygenases and other enzymes. Known as the PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology) domain, it is found in a variety of membrane- or lipid-associated proteins. Structurally, this domain forms a beta-sandwich composed of two sheets of four strands each. The most highly conserved regions coincide with the beta-strands, with most of the highly conserved residues being buried within the protein. An exception to this is a lysine or arginine that occurs on the surface of the fifth beta-strand of the eukaryotic domains. In pancreatic lipase, the lysine in this position forms a salt bridge with the procolipase protein. The conservation of a charged surface residue may indicate the location of a conserved ligand-binding site. It is thought that this domain may mediate membrane attachment via other protein binding partners.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA, and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
The S4 domain is a small domain consisting of 60-65 amino acid residues that was detected in the bacterial ribosomal protein S4, eukaryotic ribosomal S9, two families of pseudouridine synthases, a novel family of predicted RNA methylases, a yeast protein containing a pseudouridine synthetase and a deaminase domain, bacterial tyrosyl-tRNA synthetases, and a number of uncharacterised, small proteins that may be involved in translation regulation. The S4 domain probably mediates binding to RNA.
The PWI domain, named after a highly conserved PWI tri-peptide located within its N-terminal region, is a ~80 amino acid module, which is found either at the N terminus or at the C terminus of eukaryotic proteins involved in pre-mRNA processing. It is generally found in association with other domains such as RRM and RS. The PWI domain is an RNA/DNA-binding domain that has an equal preference for single- and double-stranded nucleic acids and is likely to have multiple important functions in pre-mRNA processing. Proteins containing this domain include the SR-related nuclear matrix protein of 160 kD (SRm160) splicing and 3'-end cleavage-stimulatory factor, and the mammalian splicing factor PRP3.
The PWI domain is a soluble, globular and independently folded domain which consists of a four-helix bundle, with structured N- and C-terminal elements.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents a cysteine-rich (C6HC) zinc finger domain that is present in Triad1, and which is conserved in other proteins encoded by various eukaryotes. The C6HC consensus pattern is:
The C6HC zinc finger motif is the fourth family member of the zinc-binding RING, LIM, and LAP/PHD fingers. Strikingly, in most of the proteins the C6HC domain is flanked by two RING finger structures. The novel C6HC motif has been called DRIL (double RING finger linked). The strong conservation of the larger tripartite TRIAD (two RING fingers and DRIL) structure indicates that the three subdomains are functionally linked and identifies a novel class of proteins.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Glutamate synthase (GltS) is a complex iron-sulphur flavoprotein that catalyses the reductive synthesis of L-glutamate from 2-oxoglutarate and L-glutamine via intramolecular channelling of ammonia, a reaction in the bacterial, yeast and plant pathways for ammonia assimilation. GltS is a multifunctional enzyme that functions through three distinct active centres carrying out multiple reaction steps: L-glutamine hydrolysis, conversion of 2-oxoglutarate into L-glutamate, and electron uptake from an electron donor. The active centres are synchronised to avoid the wasteful consumption of L-glutamine. There are three classes of GltS, which share many functional properties: bacterial NADPH-dependent GltS, ferredoxin-dependent GltS from photosynthetic cells, and NAD(P)H-dependent GltS from yeast, fungi and lower animals.
The dimeric alpha subunits each consist of four domains: N-terminal amidotransferase domain, the central domain, the FMN binding domain and the C-terminal domain. The C-terminal domain forms a right-handed beta-helix that comprises seven helical turns. Each helical turn has a sharp bend that is associated with a repeated sequence motif consisting of G-XX-G-XXX-G. This domain does not contain any residues directly involved in catalysis, but has a crucial structural role.
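The repeated G-XX-G-XXX-G motif described above can be written as a simple pattern, with glycines at relative positions 0, 3 and 7. The following sketch locates such motifs in a sequence; the function name and the toy sequence are hypothetical and are not derived from any real GltS protein.

```python
import re

# G-XX-G-XXX-G: Gly, two arbitrary residues, Gly, three arbitrary
# residues, Gly. The lookahead allows overlapping motifs to be found.
GLY_REPEAT = re.compile(r"(?=(G..G...G))")


def find_turn_motifs(sequence):
    """Return (0-based start, matched text) for each G-XX-G-XXX-G motif."""
    return [(m.start(1), m.group(1)) for m in GLY_REPEAT.finditer(sequence)]


# Hypothetical toy sequence containing two copies of the motif.
toy_helix = "MAGSTGKLVGDDGAAGPQRGE"
for start, motif in find_turn_motifs(toy_helix):
    print(start, motif)
```

Because the motif is short and glycine is common, hits found this way are only candidates; in practice they would need to be checked against the beta-helix structure.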
This domain is also found in proteins such as subunit C of formylmethanofuran dehydrogenase, which catalyses the first step in methane formation from carbon dioxide in methanogenic archaea. There are two isoenzymes of formylmethanofuran dehydrogenase: a tungsten-containing isoenzyme (FwdC) and a molybdenum-containing isoenzyme (FmdC). The tungsten isoenzyme is constitutively transcribed, whereas transcription of the molybdenum operon is induced by molybdate.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesises ATP, and the V0 or A0 complex forms the membrane-spanning pore. The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases.
This entry represents the 116-kDa subunit (or subunit a) and subunit I found in the V0 or A0 complex of V- or A-ATPases, respectively. The 116-kDa subunit is a transmembrane glycoprotein required for the assembly and proton transport activity of the ATPase complex. Several isoforms of the 116-kDa subunit exist, providing a potential role in the differential targeting and regulation of the V-ATPase for specific organelles.
More information about this protein can be found at Protein of the Month: ATP Synthases.
Members of this family are involved in modifying bases in RNA molecules. They carry out the conversion of uracil bases to pseudouridine, specifically converting uracil-55 to pseudouridine in most tRNAs. This family also includes Cbf5p that modifies rRNA.
The proteins in this entry are variously annotated as iron-sulphur cluster insertion protein or Fe/S biogenesis protein. They appear to be involved in Fe-S cluster biogenesis. This family includes IscA, HesB, YadR and YfhF-like proteins. The hesB gene is expressed only under nitrogen fixation conditions. IscA, an 11 kDa member of the hesB family of proteins, binds iron and [2Fe-2S] clusters, and participates in the biosynthesis of iron-sulphur proteins. IscA is able to bind at least 2 iron ions per dimer. Other members of this family include various hypothetical proteins that also contain the NifU-like domain suggesting that they too are able to bind iron and are involved in Fe-S cluster biogenesis. The HesB family are found in species as divergent as Homo sapiens (Human) and Haemophilus influenzae suggesting that these proteins are involved in basic cellular functions.
This entry represents the DHHC-type zinc finger domain, which is also known as NEW1. The DHHC Zn-finger was first isolated in the Drosophila putative transcription factor DNZ1. The function of this domain is unknown, but it has been predicted to be involved in protein-protein or protein-DNA interactions.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
This entry represents the PPR repeat.
Pentatricopeptide repeat (PPR) proteins are characterised by tandem repeats of a degenerate 35 amino acid motif. Most PPR proteins have roles in mitochondria or plastids. PPR repeats were discovered while screening Arabidopsis proteins for those predicted to be targeted to mitochondria or chloroplasts. Some of these proteins have been shown to play a role in post-transcriptional processes within organelles and they are thought to be sequence-specific RNA-binding proteins. Plant genomes have between one hundred and five hundred PPR genes per genome, whereas non-plant genomes encode two to six PPR proteins.
Although no PPR structures are yet known, the motif is predicted to fold into a helix-turn-helix structure similar to those found in the tetratricopeptide repeat (TPR) family.
The plant PPR protein family has been divided into two subfamilies on the basis of their motif content and organisation.
Examples of PPR repeat-containing proteins include PET309, which may be involved in RNA stabilisation, and crp1, which is involved in RNA processing. The repeat is associated with a predicted plant protein that has a domain organisation similar to the human BRCA1 protein.
S-adenosylmethionine decarboxylase (AdoMetDC) catalyzes the removal of the carboxylate group of S-adenosylmethionine to form S-adenosyl-5'-3-methylpropylamine, which then acts as the n-propylamine group donor in the synthesis of the polyamines spermidine and spermine from putrescine.
The catalytic mechanism of AdoMetDC involves a covalently-bound pyruvoyl group. This group is post-translationally generated by a self-catalyzed intramolecular proteolytic cleavage reaction between a glutamate and a serine. This cleavage generates two chains, beta (N-terminal) and alpha (C-terminal). The N-terminal serine residue of the alpha chain is then converted by nonhydrolytic serinolysis into a pyruvoyl group.
During the process of Escherichia coli nucleotide excision repair, DNA damage recognition and processing are achieved by the action of the uvrA, uvrB, and uvrC gene products. The UvrC proteins contain 4 conserved regions: a central region which interacts with UvrB (Uvr domain), a helix-hairpin-helix (HhH) domain important for 5' incision of damaged DNA, and the homology regions 1 and 2 of unknown function. UvrC homology region 2 is specific for UvrC proteins, whereas UvrC homology region 1 is also shared by a few other nucleases.
Homology region 1 is found in the amino-terminal region of excinuclease ABC subunit C (UvrC) and of bacteriophage T4 endonucleases segA, segB, segC, segD and segE; it is also found in putative endonucleases encoded by group I introns of fungi and phage.
Members of this family are integral membrane proteins that are found to increase tolerance to divalent metal ions such as cadmium, zinc, and cobalt. These proteins are considered to be efflux pumps that remove these ions from cells; however, others are implicated in ion uptake. The family has six predicted transmembrane domains. Members of the family are variable in length because of variably sized inserts, often containing low-complexity sequence.
This family contains acyltransferases involved in phospholipid biosynthesis and other proteins of unknown function. This domain is found in tafazzins, defects in which are the cause of Barth syndrome, a severe inherited disorder which is often fatal in childhood and is characterised by cardiac and skeletal abnormalities. Phospholipid/glycerol acyltransferase is not found in the viruses or the archaea and is under-represented in the bacteria. Bacterial glycerol-phosphate acyltransferases are involved in membrane biogenesis since they use fatty acid chains to form the first membrane phospholipids.
This domain is found in DNA methylases. In prokaryotes, the major role of DNA methylation is to protect host DNA against degradation by restriction enzymes. This family contains both N-4 cytosine-specific DNA methylases and N-6 adenine-specific DNA methylases. N-4 cytosine-specific DNA methylases are enzymes that specifically methylate the amino group at the C-4 position of cytosines in DNA. Such enzymes are found as components of type II restriction-modification systems in prokaryotes. They recognise a specific sequence in DNA and methylate a cytosine in that sequence. By this action they protect DNA from cleavage by type II restriction enzymes that recognise the same sequence. N-6 adenine-specific DNA methylases (A-Mtase) are enzymes that specifically methylate the amino group at the C-6 position of adenines in DNA. Such enzymes are found in the three existing types of bacterial restriction-modification systems (in type I systems the A-Mtase is the product of the hsdM gene, and in type III it is the product of the mod gene). All of these enzymes recognise a specific sequence in DNA and methylate an adenine in that sequence.
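The idea of sequence-specific methylation can be illustrated with a minimal sketch that locates every occurrence of a recognition sequence and reports the position of the base that would be modified. The function name, the default site GATC and the offset of the target adenine within it are illustrative assumptions, not details taken from this entry; only one strand is scanned.

```python
def find_target_bases(dna: str, site: str = "GATC", target_offset: int = 1):
    """Return 0-based positions of the target base in each occurrence of `site`.

    `site` and `target_offset` are hypothetical defaults chosen for
    illustration: the target is the base at index `target_offset` of the site.
    Overlapping occurrences are reported as well.
    """
    if not 0 <= target_offset < len(site):
        raise ValueError("target_offset must point inside the site")
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start + target_offset)
        # Resume one base further on so overlapping sites are not missed.
        start = dna.find(site, start + 1)
    return positions


# Toy sequence built for the example; it is not a real genomic fragment.
toy_dna = "TTGATCAAGATCGG"
targets = find_target_bases(toy_dna)
```

Scanning the reverse-complement strand, or handling degenerate recognition sequences, would need additional code that is deliberately left out here.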
Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolysing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate-binding domain. Dimeric GrpE is the co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle.
Besides stimulating the ATPase activity of DnaK through its J-domain, DnaJ also associates with unfolded polypeptide chains and prevents their aggregation. Thus, DnaK and DnaJ may bind to one and the same polypeptide chain to form a ternary complex. The formation of a ternary complex may result in cis-interaction of the J-domain of DnaJ with the ATPase domain of DnaK. An unfolded polypeptide may enter the chaperone cycle by associating first either with ATP-liganded DnaK or with DnaJ. DnaK interacts with both the backbone and side chains of a peptide substrate; it thus shows binding polarity and admits only L-peptide segments. In contrast, DnaJ has been shown to bind both L- and D-peptides and is assumed to interact only with the side chains of the substrate.
This domain consists of the C-terminal region of the DnaJ protein. Although the function of this region is unknown, it is always found associated withand
This is a group of polyamine biosynthetic enzymes involved in the fifth (last) step in the biosynthesis of spermidine from arginine and methionine. The group includes spermidine synthase, spermine synthase and putrescine N-methyltransferase.
The Thermotoga maritima spermidine synthase monomer consists of two domains: an N-terminal domain composed of six beta-strands, and a Rossmann-like C-terminal domain. The larger C-terminal catalytic core domain consists of a seven-stranded beta-sheet flanked by nine alpha helices. This domain resembles a topology observed in a number of nucleotide and dinucleotide-binding enzymes, and in S-adenosyl-L-methionine (AdoMet)-dependent methyltransferases (MTases).
The natural resistance-associated macrophage protein (NRAMP) family consists of Nramp1, Nramp2, and yeast proteins Smf1 and Smf2. The NRAMP family is a novel family of functionally related proteins defined by a conserved hydrophobic core of ten transmembrane domains. Nramp1 is an integral membrane protein expressed exclusively in cells of the immune system and is recruited to the membrane of a phagosome upon phagocytosis. Nramp2 is a multiple divalent cation transporter for Fe2+, Mn2+ and Zn2+ amongst others. It is expressed at high levels in the intestine and is the major transferrin-independent iron uptake system in mammals. The yeast proteins Smf1 and Smf2 may also transport divalent cations.
The natural resistance of mice to infection with intracellular parasites is controlled by the Bcg locus, which modulates the cytostatic/cytocidal activity of phagocytes. Nramp1, the gene responsible, is expressed exclusively in macrophages and polymorphonuclear leukocytes, and encodes a polypeptide (natural resistance-associated macrophage protein) with features typical of integral membrane proteins. Other transporter proteins from a variety of sources also belong to this family.
(6S)-tetrahydrofolate + S-aminomethyldihydrolipoylprotein = (6R)-5,10-methylenetetrahydrofolate + NH3 + dihydrolipoylprotein
Iron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S]. FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems.
The ISC system is conserved in eubacteria and eukaryotes (mitochondria), and has broad specificity, targeting general FeS proteins. It is encoded by the isc operon (iscRSUA-hscBA-fdx-iscX). IscS is a cysteine desulphurase, which obtains S from cysteine (converting it to alanine) and serves as a S donor for FeS cluster assembly. IscU and IscA act as scaffolds to accept S and Fe atoms, assembling clusters and transferring them to recipient apoproteins. HscA is a molecular chaperone and HscB is a co-chaperone. Fdx is a [2Fe-2S]-type ferredoxin. IscR is a transcription factor that regulates expression of the isc operon. IscX (also known as YfhJ) appears to interact with IscS and may function as an Fe donor during cluster assembly.
The SUF system is an alternative pathway to the ISC system that operates under iron starvation and oxidative stress. It is found in eubacteria, archaea and eukaryotes (plastids). The SUF system is encoded by the suf operon (sufABCDSE), and the six encoded proteins are arranged into two complexes (SufSE and SufBCD) and one protein (SufA). SufS is a pyridoxal-phosphate (PLP) protein displaying cysteine desulphurase activity. SufE acts as a scaffold protein that accepts S from SufS and donates it to SufA. SufC is an ATPase with an unorthodox ATP-binding cassette (ABC)-like component. No specific functions have been assigned to SufB and SufD. SufA is homologous to IscA, acting as a scaffold protein in which Fe and S atoms are assembled into [FeS] cluster forms, which can then easily be transferred to apoprotein targets.
In the NIF system, NifS and NifU are required for the formation of metalloclusters of nitrogenase in Azotobacter vinelandii and other organisms, as well as in the maturation of other FeS proteins. Nitrogenase catalyses the fixation of nitrogen. It contains a complex cluster, the FeMo cofactor, which contains molybdenum, Fe and S. NifS is a cysteine desulphurase. NifU binds one Fe atom at its N terminus, assembling an FeS cluster that is transferred to nitrogenase apoproteins. Nif proteins involved in the formation of FeS clusters can also be found in organisms that do not fix nitrogen.
This entry represents the N-terminal domain of NifU and homologous proteins. NifU contains two domains: an N-terminal and a C-terminal domain. These domains exist either together or on different polypeptides, both domains being found in organisms that do not fix nitrogen (e.g. yeast), so they have a broader significance in the cell than nitrogen fixation.
This is a family of glycine cleavage H-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. A lipoyl group is attached to a completely conserved lysine residue. The H protein shuttles the methylamine group of glycine from the P protein to the T protein.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
This family of ribosomal proteins consists mainly of the 40S ribosomal protein S27a, which is synthesized as a C-terminal extension of ubiquitin (CEP). The S27a domain comprises the C-terminal half of the protein. The synthesis of ribosomal proteins as extensions of ubiquitin promotes their incorporation into nascent ribosomes by a transient metabolic stabilisation and is required for efficient ribosome biogenesis. The ribosomal extension protein S27a contains a basic region that is proposed to form a zinc finger; its fusion gene is proposed as a mechanism to maintain a fixed ratio between ubiquitin, necessary for degrading proteins, and ribosomes, a source of proteins.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer.
Clathrin coats contain both clathrin and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors. All AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). Each subunit has a specific function. Adaptin subunits recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal appendage domains. By contrast, GGAs are monomers composed of four domains, which have functions similar to AP subunits: an N-terminal VHS (Vps27p/Hrs/Stam) domain, a GAT (GGA and Tom1) domain, a hinge region, and a C-terminal GAE (gamma-adaptin ear) domain. The GAE domain is similar to the AP gamma-adaptin ear domain, being responsible for the recruitment of accessory proteins that regulate clathrin-mediated endocytosis.
While clathrin mediates endocytic protein transport from ER to Golgi, coatomers (COPI, COPII) primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and budding from Golgi membranes. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta', gamma, delta, epsilon and zeta subunits.
This entry represents the N-terminal domain of various adaptins from different AP clathrin adaptor complexes (including AP1, AP2, AP3 and AP4), and from the beta and gamma subunits of various coatomer (COP) adaptors. This domain has a 2-layer alpha/alpha fold that forms a right-handed superhelix, and is a member of the ARM repeat superfamily. The N-terminal regions of the various AP adaptor proteins share strong sequence identity; by contrast, the C-terminal domains of different adaptins share similar structural folds, but have little sequence identity. It has been proposed that the N-terminal domain interacts with another uniform component of the coated vesicles.
More information about these proteins can be found at Protein of the Month: Clathrin.
This domain is responsible for the 3'-5' exonuclease proofreading activity of Escherichia coli DNA polymerase I (polI) and other enzymes; it catalyses the hydrolysis of unpaired or mismatched nucleotides. This domain consists of the amino-terminal half of the Klenow fragment in E. coli polI. It is also found in the Werner syndrome helicase (WRN), focus forming activity 1 protein (FFA-1) and ribonuclease D (RNase D).
Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.
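To make the notion of a mispaired nucleotide concrete, the following minimal sketch compares a template strand with a newly synthesised strand and reports every position where the two bases are not Watson-Crick partners. It is only a bookkeeping illustration: the function name and the toy sequences are assumptions made for the example, the strands are assumed to be written so that position i of one pairs with position i of the other, and insertion/deletion loops are not modelled.

```python
# Watson-Crick partner of each DNA base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_mismatches(template: str, new_strand: str):
    """Return (position, template_base, new_base) for each mispaired position.

    Both strands must have the same length, and new_strand[i] is assumed
    to be the base paired with template[i].
    """
    if len(template) != len(new_strand):
        raise ValueError("strands must be the same length")
    mismatches = []
    for i, (t_base, n_base) in enumerate(zip(template, new_strand)):
        if COMPLEMENT[t_base] != n_base:
            mismatches.append((i, t_base, n_base))
    return mismatches


# Toy strands: the correct partner of the template would be "TACGCA".
# Position 4 carries a G:T mispair.
mispairs = find_mismatches("ATGCGT", "TACGTA")
```

In the cell, of course, recognition is carried out by MutS binding to the distorted duplex rather than by any sequence comparison; the sketch only shows what counts as a mismatch.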
MutS is a modular protein with a complex structure, and is composed of:
Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.
This entry represents the N-terminal domain of proteins in the MutS family of DNA mismatch repair proteins, as well as closely related proteins. The N-terminal domain of MutS is responsible for mismatch recognition and forms a 6-stranded mixed beta-sheet surrounded by three alpha-helices, which is similar to the structure of tRNA endonuclease. Yeast MSH3, bacterial proteins involved in DNA mismatch repair, and the predicted protein product of the Rep-3 gene of mouse share extensive sequence similarity. Human MSH is a mismatch binding protein and has been implicated in hereditary non-polyposis colorectal carcinoma (HNPCC).
L35 is a basic protein of 60 to 70 amino-acid residues from the large (50S) ribosomal subunit. Like many basic polypeptides, L35 completely inhibits ornithine decarboxylase when present unbound in the cell, but the inhibitory function is abolished upon its incorporation into ribosomes. It belongs to a family of ribosomal proteins, including L35 from bacteria, plant chloroplasts, red algal chloroplasts and cyanelles. In plants it is a nuclear-encoded gene product, which suggests a chloroplast-to-nucleus relocation during the evolution of higher plants.
Choline kinase (ATP:choline phosphotransferase) belongs to the choline/ethanolamine kinase family.
Ethanolamine and choline are major membrane phospholipids, in the form of glycerophosphoethanolamine and glycerophosphocholine. Ethanolamine is also a component of the glycosylphosphatidylinositol (GPI) anchor, which is necessary for cell-surface protein attachment. The de novo synthesis of these phospholipids begins with the creation of phosphoethanolamine and phosphocholine by ethanolamine and choline kinases in the first step of the CDP-ethanolamine pathway. There are two putative choline/ethanolamine kinases (C/EKs) in the Trypanosoma brucei genome.
Ethanolamine kinase has no choline kinase activity and its activity is inhibited by ADP. Inositol supplementation represses ethanolamine kinase, decreasing the incorporation of ethanolamine into the CDP-ethanolamine pathway and into phosphatidylethanolamine and phosphatidylcholine.
Ferredoxin-dependent glutamate synthases have been implicated in a number of functions including photorespiration in Arabidopsis, where they may also play a role in primary nitrogen assimilation in roots. This region is expressed as a separate subunit in the glutamate synthase alpha subunit from archaebacteria, or as part of a large multidomain enzyme in other organisms.
The aligned region of these proteins contains a putative FMN binding site and Fe-S cluster.
These proteins transfer the 4'-phosphopantetheine (4'-PP) moiety from coenzyme A (CoA) to the invariant serine of pp-binding. This post-translational modification renders holo-ACP capable of acyl group activation via thioesterification of the cysteamine thiol of 4'-PP. This superfamily consists of two subtypes: the ACPS type, such as ACPS_ECOLI, and the Sfp type, such as SFP_BACSU. The structure of the Sfp type is known and shows that the active site accommodates a magnesium ion. The most highly conserved regions of the alignment are involved in binding the magnesium ion.
The ribosomal protein L32e family consists of proteins that have 135 to 240 amino-acid residues.
This is a region of myo-inositol-1-phosphate synthases that is related to the glyceraldehyde-3-phosphate dehydrogenase-like, C-terminal domain.
1L-myo-Inositol-1-phosphate synthase catalyzes the conversion of D-glucose 6-phosphate to 1L-myo-inositol-1-phosphate, the first committed step in the production of all inositol-containing compounds, including phospholipids, either directly or by salvage. The enzyme exists in a cytoplasmic form in a wide range of plants, animals, and fungi. It has also been detected in several bacteria, and a chloroplast form is observed in algae and higher plants. Inositol phosphates play an important role in signal transduction.
In Saccharomyces cerevisiae (Baker's yeast), the transcriptional regulation of the INO1 gene has been studied in detail and its expression is sensitive to the availability of phospholipid precursors as well as growth phase. The regulation of the structural gene encoding 1L-myo-inositol-1-phosphate synthase has also been analyzed at the transcriptional level in the aquatic angiosperm, Spirodela polyrrhiza (Giant duckweed) and the halophyte, Mesembryanthemum crystallinum (Common ice plant).
The Macro or A1pp domain is a module of about 180 amino acids which can bind ADP-ribose, an NAD metabolite or related ligands. The domain was described originally in association with ADP-ribose 1''-phosphate (Appr-1''-P) processing activity (A1pp) of the yeast YBR022W protein. The domain is also called Macro domain as it is the C-terminal domain of mammalian core histone macro-H2A. Macro domain proteins can be found in eukaryotes, in (mostly pathogenic) bacteria, in archaea and in ssRNA viruses, such as coronaviruses, Rubella and Hepatitis E viruses. In vertebrates the domain occurs e.g. in histone macroH2A, in predicted poly-ADP-ribose polymerases (PARPs) and in B aggressive lymphoma (BAL) protein. The macro domain can be associated with catalytic domains, such as PARP, or sirtuin. The Macro domain can recognize ADP-ribose or in some cases poly-ADP-ribose, which can be involved in ADP-ribosylation reactions that occur in important processes, such as chromatin biology, DNA repair and transcription regulation. The human macroH2A1.1 Macro domain binds an NAD metabolite O-acetyl-ADP-ribose. The Macro domain has been suggested to play a regulatory role in ADP-ribosylation, which is involved in inter- and intracellular signaling, transcriptional regulation, DNA repair pathways and maintenance of genomic stability, telomere dynamics, cell differentiation and proliferation, and necrosis and apoptosis.
The 3D structure of the Macro domain has a mixed alpha/beta fold, with a mixed beta sheet sandwiched between four helices. Several Macro-domain-only proteins are shorter than the structure of AF1521 and lack either the first strand or the C-terminal helix 5. Well conserved residues form a hydrophobic cleft and cluster around the AF1521-ADP-ribose binding site.
A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families includes mammalian, yeast, Chlamydomonas reinhardtii and Entamoeba histolytica S27, and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ0250. These proteins have from 62 to 87 amino acids. They contain, in their central section, a putative zinc-finger region of the type C-x(2)-C-x(14)-C-x(2)-C.
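The pattern C-x(2)-C-x(14)-C-x(2)-C can be read as four cysteines separated by 2, 14 and 2 arbitrary residues. As a minimal sketch, it can be translated into a regular expression and used to scan a protein sequence. The helper names and the toy sequence below are hypothetical and serve only as illustration; the translator handles just the pattern elements needed here (a single residue letter, 'x' and 'x(n)'), not the full pattern syntax.

```python
import re

# Pattern quoted in the text above.
ZNF_PATTERN = "C-x(2)-C-x(14)-C-x(2)-C"


def pattern_to_regex(pattern: str) -> str:
    """Translate a simple dash-separated motif pattern into a Python regex.

    Supported elements: a single uppercase residue letter, 'x' (any one
    residue) and 'x(n)' (any n residues). Anything else raises ValueError.
    """
    parts = []
    for element in pattern.split("-"):
        wildcard = re.fullmatch(r"x\((\d+)\)", element)
        if wildcard:
            parts.append(".{%s}" % wildcard.group(1))
        elif element == "x":
            parts.append(".")
        elif re.fullmatch(r"[A-Z]", element):
            parts.append(element)
        else:
            raise ValueError(f"unsupported pattern element: {element!r}")
    return "".join(parts)


def find_motif(sequence: str, pattern: str = ZNF_PATTERN):
    """Return (0-based start, matched text) for each non-overlapping match."""
    regex = re.compile(pattern_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


# Hypothetical sequence constructed to contain the motif; not a real S27 protein.
toy_sequence = "MK" + "C" + "AA" + "C" + "G" * 14 + "C" + "SS" + "C" + "KR"
hits = find_motif(toy_sequence)
```

A match only shows that the cysteine spacing is present; it says nothing about whether the region actually binds zinc.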
Snz1p is a highly conserved protein involved in growth arrest in Saccharomyces cerevisiae (Baker's yeast). Sor1 (singlet oxygen resistance) is essential in pyridoxine (vitamin B6) synthesis in Cercospora nicotianae and Aspergillus flavus. Pyridoxine quenches singlet oxygen at a rate comparable to that of vitamins C and E, two of the most highly efficient biological antioxidants, suggesting a previously unknown role for pyridoxine in active oxygen resistance.
Riboflavin is converted into catalytically active cofactors (FAD and FMN) by the actions of riboflavin kinase, which converts it into FMN, and FAD synthetase, which adenylates FMN to FAD. Eukaryotes usually have two separate enzymes, while most prokaryotes have a single bifunctional protein that can carry out both catalyses, although exceptions occur in both cases. While eukaryotic monofunctional riboflavin kinase is orthologous to the bifunctional prokaryotic enzyme, the monofunctional FAD synthetase differs from its prokaryotic counterpart, and is instead related to the PAPS-reductase family. The bacterial FAD synthetase that is part of the bifunctional enzyme has remote similarity to nucleotidyl transferases and, hence, it may be involved in the adenylylation reaction of FAD synthetases.
This entry represents riboflavin kinase, which occurs as part of a bifunctional enzyme or a stand-alone enzyme.
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
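The relationship between triad order and clan can be illustrated with a short sketch. It assumes only the three orderings named above (HDS, DHS, SDH); the function names are hypothetical and the residue numbers in the example are illustrative inputs rather than data taken from this entry.

```python
# Linear order of the catalytic triad mapped to the clans named in the text.
CLAN_BY_ORDER = {
    "HDS": "PA (chymotrypsin)",
    "DHS": "SB (subtilisin)",
    "SDH": "SC (carboxypeptidase)",
}

def triad_order(positions):
    """Given {'H': pos, 'D': pos, 'S': pos}, return the letters in sequence order."""
    return "".join(sorted(positions, key=positions.get))

def clan_for_triad(positions):
    """Return the clan for a triad order, or None if the order is not listed."""
    return CLAN_BY_ORDER.get(triad_order(positions))

# Illustrative residue numbers for a chymotrypsin-like and a subtilisin-like enzyme.
chymotrypsin_like = {"H": 57, "D": 102, "S": 195}
subtilisin_like = {"D": 32, "H": 64, "S": 221}
```

Because the text says the order "commonly" reflects clan membership, a lookup like this is a hint, not a classification.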
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of proteins contains serine peptidases belonging to the MEROPS peptidase family S54 (Rhomboid, clan S-). They are integral membrane proteins related to the Drosophila melanogaster (Fruit fly) rhomboid protein. Members of this family are found in archaea, bacteria and eukaryotes.
The D. melanogaster rhomboid protease cleaves type-1 transmembrane domains using a catalytic triad composed of serine, histidine and asparagine contributed by different transmembrane domains. It cleaves the transmembrane proteins Spitz, Gurken and Keren within their transmembrane domains to release a soluble TGFalpha-like growth factor. Cleavage occurs in the Golgi, following translocation of the substrates from the endoplasmic reticulum membrane by Star, another transmembrane protein. The growth factors are then able to activate the epidermal growth factor receptor.
Few substrates of mammalian rhomboid homologues have been determined, but rhomboid-like protein 2 (MEROPS S54.002) has been shown to cleave ephrin B3. Parasite-encoded rhomboid enzymes are also important for invasion of host cells by Toxoplasma and the malaria parasite.
In Saccharomyces cerevisiae (Baker's yeast) the Pcp1 (MDM37) protein (MEROPS S54.007) is a mitochondrial endopeptidase required for the activation of cytochrome c peroxidase and for the processing of the mitochondrial dynamin-like protein Mgm1. Mutations in Pcp1 result in cells with fragmented mitochondria, which have very few short tubules.
This family contains the Saccharomyces cerevisiae (Baker's yeast) HAM1 protein and other hypothetical archaeal, bacterial and Caenorhabditis elegans proteins. S. cerevisiae HAM1 protects against the mutagenic effects of the base analog 6-N-hydroxylaminopurine (HAP), which can be a natural product of monooxygenase activity on adenine. HAM1 protein protects the cell from HAP, either at the level of deoxynucleoside triphosphate or at the DNA level, by an as yet unidentified set of reactions.
RrmJ (FtsJ) is a well conserved heat shock protein present in prokaryotes, archaea, and eukaryotes. RrmJ is responsible for methylating 23S rRNA at position U2552 in the aminoacyl (A)-site of the ribosome. U2552 is one of the five universally conserved A-loop residues and has been shown to be methylated at the ribose 2'-OH group in the majority of organisms investigated so far. This suggests that this modification plays an important role in the A-loop function. RrmJ recognises its methylation target only when the 23S rRNA is present in 50S ribosomal subunits. This suggests that the RrmJ-mediated methylation must occur late in the maturation process of the ribosome. This is in contrast to other known 23S rRNA modifications that occur in earlier maturation steps.
The 1.5 Å crystal structure of RrmJ in complex with its cofactor S-adenosylmethionine revealed that RrmJ has a methyltransferase fold. The active site of RrmJ appears to be formed by a catalytic triad consisting of two lysine residues and a negatively charged aspartate residue. Another highly conserved glutamate residue that is present in the active site of RrmJ appears to play only a minor role in the methyltransfer reaction in vivo.
This group includes nucleic acid independent RNA polymerases, such as polynucleotide adenylyltransferase, which adds the poly(A) tail to mRNA. This group also includes the tRNA nucleotidyltransferase that adds the CCA to the 3' end of the tRNA.
In transfer RNA many different modified nucleosides are found, especially in the anticodon region. tRNA (guanine-N1-)-methyltransferase is one of several nucleases operating together with the tRNA-modifying enzymes before the formation of the mature tRNA. It catalyses the reaction:
S-adenosyl-L-methionine + tRNA -> S-adenosyl-L-homocysteine + tRNA containing N1-methylguanine
The enzyme methylates guanosine (G) to N1-methylguanine (1-methylguanosine (m1G)) at position 37 of tRNAs that read CUN (leucine), CCN (proline), and CGG (arginine) codons. The presence of m1G improves the cellular growth rate and the polypeptide step time and also prevents the tRNA from shifting the reading frame.
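The codon families named above use the IUPAC symbol N for any base. As a small sketch, the degenerate codons can be expanded into the concrete codons they stand for; the helper name is hypothetical and only the symbols needed here are included in the lookup table.

```python
from itertools import product

# Subset of the IUPAC nucleotide code: N stands for any of the four RNA bases.
IUPAC_RNA = {"A": "A", "C": "C", "G": "G", "U": "U", "N": "ACGU"}

def expand_codon(codon):
    """Expand a possibly degenerate codon such as 'CUN' into concrete codons."""
    return ["".join(bases) for bases in product(*(IUPAC_RNA[b] for b in codon))]

# Codon families read by tRNAs carrying m1G37, as listed in the text.
m1g_codons = [c for family in ("CUN", "CCN", "CGG") for c in expand_codon(family)]
```

Under this reading the three families cover nine codons in total: four for leucine, four for proline and one for arginine.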
The mechanism of the trmD3-induced frameshift involving mutant tRNA(Pro) and tRNA(Leu) species has been investigated. It has been suggested that the conformation of the anticodon loop may be a major determining element for the formation of m1G37 in vivo.
The exchange of macromolecules between the nucleus and cytoplasm takes place through nuclear pore complexes within the nuclear membrane. Active transport of large molecules through these pore complexes requires carrier proteins, called karyopherins (importins and exportins), which shuttle between the two compartments.
Members of the importin-alpha (karyopherin-alpha) family can form heterodimers with importin-beta. As part of a heterodimer, importin-beta mediates interactions with the pore complex, while importin-alpha acts as an adaptor protein to bind the nuclear localisation signal (NLS) on the cargo through the classical NLS import of proteins. Proteins can contain one (monopartite) or two (bipartite) NLS motifs. Importin-alpha contains several armadillo (ARM) repeats, which produce a curving structure with two NLS-binding sites, a major one close to the N-terminus and a minor one close to the C-terminus.
Ran GTPase helps to control the unidirectional transfer of cargo. The cytoplasm contains primarily RanGDP and the nucleus RanGTP through the actions of RanGAP and RanGEF, respectively. In the nucleus, RanGTP binds to importin-beta within the importin/cargo complex, causing a conformational change in importin-beta that releases it from importin-alpha-bound cargo. The N-terminal importin-beta-binding (IBB) domain of importin-alpha contains an auto-regulatory region that mimics the NLS motif. The release of importin-beta frees the auto-regulatory region on importin-alpha to loop back and bind to the major NLS-binding site, causing the cargo to be released.
This entry represents the N-terminal IBB domain of importin-alpha that contains the auto-regulatory region.
More information about these proteins can be found at Protein of the Month: Importins.
This is a conserved region from DNA primase. It corresponds to the Toprim (topoisomerase-primase) domain common to DnaG primases, topoisomerases, OLD family nucleases and RecR/M DNA repair proteins. Both DnaG motifs IV and V are present in the alignment; the DxD (V) motif may be involved in Mg2+ binding, and mutations to the conserved glutamate (IV) completely abolish DnaG type primase activity. DNA primase is a nucleotidyltransferase that synthesizes the oligoribonucleotide primers required for DNA replication on the lagging strand of the replication fork; it can also prime the leading strand and has been implicated in cell division. This family also includes the atypical archaeal A subunit from type II DNA topoisomerases. Type II DNA topoisomerases catalyse the relaxation of DNA supercoiling by causing transient double strand breaks.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents MYND-type zinc finger domains. The MYND domain (myeloid, Nervy, and DEAF-1) is present in a large group of proteins that includes RP-8 (PDCD2), Nervy, and predicted proteins from Drosophila, mammals, Caenorhabditis elegans, yeast, and plants. The MYND domain consists of a cluster of cysteine and histidine residues, arranged with an invariant spacing to form a potential zinc-binding motif. Mutating conserved cysteine residues in the DEAF-1 MYND domain does not abolish DNA binding, which suggests that the MYND domain might be involved in protein-protein interactions. Indeed, the MYND domain of ETO/MTG8 interacts directly with the N-CoR and SMRT co-repressors. Aberrant recruitment of co-repressor complexes and inappropriate transcriptional repression is believed to be a general mechanism of leukemogenesis caused by the t(8;21) translocations that fuse ETO with the acute myelogenous leukemia 1 (AML1) protein. ETO has been shown to be a co-repressor recruited by the promyelocytic leukemia zinc finger (PLZF) protein. A divergent MYND domain present in the adenovirus E1A binding protein BS69 was also shown to interact with N-CoR and mediate transcriptional repression. The current evidence suggests that the MYND motif in mammalian proteins constitutes a protein-protein interaction domain that functions as a co-repressor-recruiting interface.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
This entry represents the zinc finger domain found in A20. A20 is an inhibitor of cell death that inhibits NF-kappaB activation via the tumour necrosis factor receptor associated factor pathway. The zinc finger domains appear to mediate self-association in A20. These fingers also mediate IL-1-induced NF-kappaB activation.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Triglyceride lipases are lipolytic enzymes that hydrolyse ester linkages of triglycerides. Lipases are widely distributed in animals, plants and prokaryotes. This family of lipases has been called Class 3 as they are not closely related to other lipase families.
The ribosome recycling factor or ribosome release factor (RRF) dissociates ribosomes from mRNA after termination of translation, and is essential for bacterial growth. Thus ribosomes are 'recycled' and ready for another round of protein synthesis.
Ribosomal protein L18ae forms part of the 60S ribosomal subunit. This family is found in eukaryotes. Rat ribosomal protein L18 is homologous to Xenopus laevis L14.
Ribosomal protein L22e forms part of the 60S ribosomal subunit. This family is found in eukaryotes. Rattus norvegicus (Rat) L22 is related to ribosomal proteins from other eukaryotes and is identical in amino acid sequence to human EAP, the protein associated with EBER 1, an RNA encoded by Epstein-Barr virus (strain GD1) (HHV-4) (Human herpesvirus 4).
Ribosomal protein L27 is found in fungi, plants, algae and vertebrates. The family has a specific signature at the C terminus.
This ribosomal protein is found in archaebacteria and eukaryotes. Ribosomal protein L37 has a single zinc finger-like motif of the C2-C2 type.
The RimM protein is essential for efficient processing of 16S rRNA. The RimM protein was shown to have affinity for free ribosomal 30S subunits but not for 30S subunits in the 70S ribosomes.
2-deoxy-D-ribose 5-phosphate = D-glyceraldehyde 3-phosphate + acetaldehyde
The family also includes a group of related bacterial proteins of unknown function.
This is a family of methyltransferases, so called because they are responsible for the transfer of methyl groups between molecules. Despite its name, it does not occur solely in bacteria. This protein is essential in Escherichia coli and has been linked to peptidoglycan biosynthesis.
Proteins have been implicated in an expanding variety of functions during pre-mRNA splicing. Molecular cloning has identified genes encoding spliceosomal proteins that potentially act as novel RNA helicases, GTPases, or protein isomerases. Novel protein-protein and protein-RNA interactions that are required for functional spliceosome formation have also been described. Finally, growing evidence suggests that proteins may contribute directly to the spliceosome's active sites.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesizes ATP, and the V0 or A0 complex forms the membrane-spanning pore. The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases.
This entry represents the D subunit found in V1 and A1 complexes of V- and A-ATPases, respectively. Subunit D appears to be located in the central stalk, whereas subunits E and G form part of the peripheral stalk connecting V1 and V0. This subunit is the most likely homologue to the gamma subunit of the F1 complex in F-ATPases, which undergoes rotation during ATP hydrolysis and serves an essential function in rotary catalysis.
More information about this protein can be found at Protein of the Month: ATP Synthases.
The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerization of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}.
Perforin is a protein found in cytolytic T-cells and killer cells. In the presence of calcium, perforin polymerizes into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells.
There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin.
This domain is found in proteinase inhibitors as well as in many extracellular proteins. The domain typically contains ten cysteine residues that form five disulphide bonds. The cysteine residues that form the disulphide bonds are 1-7, 2-6, 3-5, 4-10 and 8-9.
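The pairing scheme above numbers the cysteines by their order in the domain, not by sequence position. A minimal sketch of how that scheme translates into residue positions follows; it assumes a domain sequence with exactly ten cysteines, and both the function name and the toy sequence are hypothetical.

```python
# Disulphide pairing by cysteine rank (1 = first cysteine in the domain).
PAIRING = [(1, 7), (2, 6), (3, 5), (4, 10), (8, 9)]

def disulphide_bonds(domain_sequence):
    """Map the rank-based pairing onto 1-based residue positions in the domain."""
    cys = [i for i, aa in enumerate(domain_sequence.upper(), start=1) if aa == "C"]
    if len(cys) != 10:
        raise ValueError(f"expected 10 cysteines, found {len(cys)}")
    return [(cys[a - 1], cys[b - 1]) for a, b in PAIRING]

# Hypothetical toy domain: a cysteine at every even position from 2 to 20.
toy_domain = "AC" * 10
bonds = disulphide_bonds(toy_domain)
```

The text says the domain "typically" has ten cysteines, so the length check is a deliberate simplification; variants with fewer cysteines would need their own pairing table.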
This inhibitor domain belongs to MEROPS inhibitor family I8 (clan IA). Proteins containing this domain inhibit peptidases belonging to families S1, S8, and M4 and are restricted to the chordata, nematoda, arthropoda and echinodermata. Examples of proteins containing this domain are:
Nascent polypeptide-associated complex (NAC) is among the first ribosome-associated entities to bind the nascent polypeptide after peptide bond formation. The nascent polypeptide-associated complex (NAC) of yeast functions in the targeting process of ribosomes to the ER membrane. NAC may prevent binding of ribosome nascent chains (RNCs) without a signal sequence to yeast membranes.
Moz is a monocytic leukemia zinc finger protein, and the SAS protein from Saccharomyces cerevisiae (Baker's yeast) is involved in silencing the Hmr locus. These proteins were reported to be homologous to acetyltransferases but this similarity is not supported by standard sequence analysis.
The contiguous gene deletion syndrome is characterised by Alport syndrome (A), mental retardation (M), midface hypoplasia (M), and elliptocytosis (E), as well as generalized hypoplasia and cardiac abnormalities. It is caused by a deletion in Xq22.3, comprising several genes including AMME chromosomal region gene 1 (AMMECR1), which encodes a protein with a nuclear location and presently unknown function. The C-terminal region of AMMECR1 (from residue 122 to 333) is well conserved, and homologues appear in species ranging from bacteria and archaea to eukaryotes. The high level of conservation of the AMMECR1 domain points to a basic cellular function, potentially in either the transcription, replication, repair or translation machinery.
The AMMECR1 domain contains a 6-amino-acid motif (LRGCIG) that might be functionally important since it is strikingly conserved throughout evolution. The AMMECR1 domain consists of two distinct subdomains of different sizes. The large subdomain, which contains both the N- and C-terminal regions, consists of five alpha-helices and five beta-strands. These five beta-strands form an antiparallel beta-sheet. The small subdomain consists of four alpha-helices and three beta-strands, and these beta-strands also form an antiparallel beta-sheet. The conserved 'LRGCIG' motif is located at beta(2) and its N-terminal loop, and most of the side chains of these residues point toward the interface of the two subdomains. The two subdomains are connected by only two loops, and the interaction between the two subdomains is not strong. Thus, these subdomains may move dynamically when the substrate enters the cleft. The size of the cleft suggests that the substrate is large, e.g., the substrate may be a nucleic acid or protein. However, the inner side of the cleft is not filled with positively charged residues, and therefore it is unlikely that negatively charged nucleic acids such as DNA or RNA interact at this site.
The beta subunit of archaeal and eukaryotic translation initiation factor 2 (IF2beta) and the N-terminal domain of translation initiation factor 5 (IF5) show significant sequence homology. Archaeal IF2beta contains two independent structural domains: an N-terminal mixed alpha/beta core domain (topological similarity to the common core of ribosomal proteins L23 and L15e), and a C-terminal domain consisting of a zinc-binding C4 finger. Archaeal IF2beta is a ribosome-dependent GTPase that stimulates the binding of initiator Met-tRNA(i)(Met) to the ribosomes, even in the absence of other factors. The C-terminal domain of eukaryotic IF5 is involved in the formation of the multi-factor complex (MFC), an important intermediate for the 43S pre-initiation complex assembly. IF5 interacts directly with IF1, IF2beta and IF3c, which together with IF2-bound Met-tRNA(i)(Met) form the MFC.
This entry represents both the N-terminal and zinc-binding domains of IF2, as well as a domain in IF5.
This entry contains proteins from all branches of life. The molecular function of these proteins is unknown, but the human protein Memo (mediator of ErbB2-driven cell motility) is included in this family. It has been suggested that Memo controls cell migration by relaying extracellular chemotactic signals to the microtubule cytoskeleton.
DNA primase synthesizes the RNA primers for the Okazaki fragments in lagging strand DNA synthesis. DNA primase is a heterodimer of large (p60) and small (p50) subunits in eukaryotes. This family represents sequences of the small subunit and the DNA primase sequences of the Archaea. No sequence similarity can be detected between the eukaryotic p50 and p60 subunits and the primases purified from bacteriophage and bacteria.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of proteins of 56 to 96 amino-acid residues that share a highly conserved region located in the N-terminal part.
A small region that overlaps with a nuclear localization signal and binds to the RNA primer contains three aspartates that are essential for catalysis. Sequence and secondary structure comparisons of regions surrounding these aspartates with sequences of other polymerases revealed a significant homology to the palm structure of DNA polymerase beta, terminal deoxynucleotidyltransferase and DNA polymerase IV of Saccharomyces cerevisiae, all members of the family X of polymerases. This homology extends as far as cca: tRNA nucleotidyltransferase and streptomycin adenylyltransferase, an antibiotic resistance factor.
Proteins containing this domain include kanamycin nucleotidyltransferase (KNTase) which is a plasmid-coded enzyme responsible for some types of bacterial resistance to aminoglycosides. KNTase inactivates antibiotics by catalysing the addition of a nucleotidyl group onto the drug. In experiments, Mn2+ strongly stimulated this reaction due to a 50-fold lower Ki for 8-azido-ATP in the presence of Mn2+. Mutations of the highly conserved Asp residues 113, 115, and 167, critical for metal binding in the catalytic domain of bovine poly(A) polymerase, led to a strong reduction of cross-linking efficiency, and Mn2+ no longer stimulated the reaction. Mutations in the region of the "helical turn motif" (a domain binding the triphosphate moiety of the nucleotide) and in the suspected nucleotide-binding helix of bovine poly(A) polymerase impaired ATP binding and catalysis. The results indicate that ATP is bound in part by the helical turn motif and in part by a region that may be a structural analogue of the fingers domain found in many polymerases.
This family includes eukaryotic translation initiation factor 6 (eIF6) as well as presumed archaeal homologues.
The assembly of 80S ribosomes requires joining of the 40S and 60S subunits, which is triggered by the formation of an initiation complex on the 40S subunit. This event is rate-limiting for translation, and depends on external stimuli and the status of the cell. Eukaryotic translation initiation factor 6 (eIF6) binds specifically to the free 60S ribosomal subunit and prevents its association with the 40S ribosomal subunit. Furthermore, eIF6 interacts in the cytoplasm with RACK1, a receptor for activated protein kinase C (PKC). RACK1 is a major component of translating ribosomes, which harbour significant amounts of PKC. Loading 60S subunits with eIF6 caused a dose-dependent translational block and impairment of 80S formation, which are reversed by expression of RACK1 and stimulation of PKC in vivo and in vitro. PKC stimulation leads to eIF6 phosphorylation and its release, promoting 80S ribosome formation. RACK1 thus provides a physical and functional link between PKC signalling and ribosome activation.
Spermidine + [eIF-5A]-lysine = 1,3-diaminopropane + [eIF-5A]-deoxyhypusine
Both the modified version of eIF-5A and deoxyhypusine synthase (DS) are required for eukaryotic cell proliferation. The structure is known for this enzyme in complex with its NAD+ cofactor.
Members of this family include the archaeal protein Alba and a number of eukaryotic proteins with no known function. The DNA/RNA-binding protein Alba binds double-stranded DNA tightly but without sequence specificity. It binds rRNA and mRNA in vivo, and may play a role in maintaining the structural and functional stability of RNA, and, perhaps, ribosomes. It is distributed uniformly and abundantly on the chromosome. Alba has been shown to bind DNA and affect DNA supercoiling in a temperature-dependent manner. It is regulated by acetylation (alba = acetylation lowers binding affinity) by the Sir2 protein. Alba is proposed to play a role in the establishment or maintenance of chromatin architecture and thereby in transcription repression.
Prefoldin (PFD) is a chaperone that interacts exclusively with type II chaperonins, hetero-oligomers lacking an obligate co-chaperonin that are found only in eukaryotes (chaperonin-containing T-complex polypeptide-1 (CCT)) and archaea. Eukaryotic PFD is a multi-subunit complex containing six polypeptides in the molecular mass range of 14-23 kDa. In archaea, on the other hand, PFD is composed of two types of subunits, two alpha and four beta. The six subunits associate to form two back-to-back up-and-down eight-stranded barrels, from which hang six coiled coils. Each subunit contributes one (beta subunits) or two (alpha subunits) beta hairpin turns to the barrels. The coiled coils are formed by the N and C termini of an individual subunit. Overall, this unique arrangement resembles a jellyfish. The eukaryotic PFD hexamer is composed of six different subunits; however, these can be grouped into two alpha-like (PFD3 and -5) and four beta-like (PFD1, -2, -4, and -6) subunits based on amino acid sequence similarity with their archaeal counterparts. Eukaryotic PFD has a six-legged structure similar to that seen in the archaeal homologue. This family contains the archaeal beta subunit, eukaryotic prefoldin subunits 1, 2, 4 and 6.
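The subunit arithmetic for archaeal prefoldin described above can be checked with a short tally. This is only an illustrative sketch: the names (SUBUNITS, HAIRPINS_PER_SUBUNIT, prefoldin_tally) are hypothetical, and it assumes that each beta hairpin contributes two strands to a barrel.

```python
# Archaeal prefoldin composition as stated in the text: two alpha, four beta.
SUBUNITS = {"alpha": 2, "beta": 4}
# Beta hairpins contributed per subunit, as stated in the text.
HAIRPINS_PER_SUBUNIT = {"alpha": 2, "beta": 1}
# Assumption: a beta hairpin is two antiparallel strands.
STRANDS_PER_HAIRPIN = 2
STRANDS_PER_BARREL = 8  # "eight-stranded barrels" in the text


def prefoldin_tally(subunits=SUBUNITS):
    """Count subunits, coiled coils, hairpins, strands and barrels."""
    total_subunits = sum(subunits.values())
    hairpins = sum(n * HAIRPINS_PER_SUBUNIT[kind] for kind, n in subunits.items())
    strands = hairpins * STRANDS_PER_HAIRPIN
    return {
        "subunits": total_subunits,
        "coiled_coils": total_subunits,  # one coiled coil per subunit
        "hairpins": hairpins,
        "strands": strands,
        "barrels": strands // STRANDS_PER_BARREL,
    }


tally = prefoldin_tally()
print(tally)
```

Under these assumptions the six subunits supply eight hairpins, i.e. sixteen strands, which matches the two eight-stranded barrels and six coiled coils given in the description.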
Eukaryotic PFD has been shown to bind both actin and tubulin co-translationally. The chaperone then delivers the target protein to CCT, interacting with the chaperonin through the tips of the coiled coils. No authentic target proteins of any archaeal PFD have been identified, to date.
The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.
This entry represents the SRP19 subunit. The SRP19 protein is unstructured but forms a compact core domain and two extended RNA-binding loops upon binding the signal recognition particle (SRP) RNA.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
This entry includes the eukaryotic ribosomal protein L14, which binds to the 60S ribosomal subunit, and archaebacterial ribosomal protein L14E, which binds to the 50S ribosomal subunit.
This entry contains uncharacterised proteins. Those with structural information consist of two domains: an all-alpha domain with a 3-helical bundle fold, and an alpha-beta domain in 3 layers, alpha/beta/alpha.
The TRAM (after TRM2 and miaB) domain is a 60-70-residue-long module that is found in:
Proteins in this entry are found in archaea, bacteria and eukaryotes. Their function is unknown, but alignment shows several conserved polar residues which are potential catalytic residues. The structure of one of these proteins has been determined and shows homology to heat shock protein 33, which is a chaperone protein that inhibits the aggregation of partially denatured proteins.
This signature defines a diverse group of protein families which include proteins involved in RNA-protein interaction regulation, thiamine biosynthesis, Ras-related signal transduction, and those with protease activity. Examples of annotation are:
This entry represents a 3-layer alpha/beta/alpha domain found as the catalytic domain at the C-terminal in homotetrameric tRNA-intron endonucleases, and as domains 2 and 4 (C-terminal) in the homodimeric enzymes. tRNA-intron endonucleases remove tRNA introns by cleaving pre-tRNA at the 5'- and 3'-splice sites to release the intron. The products are an intron and two tRNA half-molecules bearing 2',3' cyclic phosphate and 5'-hydroxyl termini. These enzymes recognise a pseudosymmetric substrate in which 2 bulged loops of 3 bases are separated by a stem of 4 bp. Although homotetrameric enzymes contain four active sites, only two participate in the cleavage, and the enzyme should therefore be considered a dimer of dimers.
This group of enzymes represents a large metal dependent hydrolase superfamily. The family includes adenine deaminase that hydrolyses adenine to form hypoxanthine and ammonia. The adenine deaminase reaction is important for adenine utilization as a purine and also as a nitrogen source. This family also includes dihydroorotase and N-acetylglucosamine-6-phosphate deacetylases. These enzymes catalyse the reaction:
N-acetyl-D-glucosamine 6-phosphate + H2O = D-glucosamine 6-phosphate + acetate
This family includes dihydroorotase and urease which belong to MEROPS peptidase family M38 (beta-aspartyl dipeptidase, clan MJ), where they are classified as non-peptidase homologs.
This protein family is found in archaea and eukaryota. The human TFAR19 gene encodes a protein which shares significant homology with the corresponding proteins of species ranging from yeast to mice. TFAR19 exhibits a ubiquitous expression pattern and its expression is up-regulated in tumour cells undergoing apoptosis. TFAR19 may play a general role in the apoptotic process. Also included in this family is a DNA-binding protein from the archaeon Methanobacterium thermoautotrophicum.
Proteins containing this entry have no known function and are predicted to be integral membrane proteins. They include the Ccc1 protein from Saccharomyces cerevisiae (Baker's yeast) that may have a role in regulating calcium levels.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesizes ATP, and the V0 or A0 complex forms the membrane-spanning pore. The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases.
This entry represents subunit F found in the V1 complex of V-ATPases (both eukaryotic and bacterial), as well as in the A1 complex of A-ATPases. Subunit F is a 16 kDa protein that is required for the assembly and activity of V-ATPase, and has a potential role in the differential targeting and regulation of the enzyme for specific organelles. This subunit is not necessary for the rotation of the ATPase V1 rotor, but it does promote catalysis.
More information about this protein can be found at Protein of the Month: ATP Synthases.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesizes ATP, and the V0 or A0 complex forms the membrane-spanning pore. The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases.
This entry represents subunit E from the V1 and A1 complexes of V- and A-ATPases, respectively. Subunit E appears to form a tight interaction with subunit G in the F0 complex, which together may act as stators to prevent certain subunits from rotating with the central rotary element, much in the same way as the F0 complex subunit B does in F-ATPases. In addition to its key role in stator structure, subunit E appears to have a role in mediating interactions with putative regulatory subunits.
More information about this protein can be found at Protein of the Month: ATP Synthases.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesizes ATP, and the V0 or A0 complex forms the membrane-spanning pore. The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases.
This entry represents subunit C from the A0 complex of A-ATPases, and subunits C and D from the V0 complex of V-ATPases, all of which are involved in the translocation of protons across a membrane. There is more than one type of D subunit in V-ATPases, where the D1 subunit is ubiquitous, while the D2 subunit has limited tissue expressivity, possibly to account for differential functions, targeting or regulation of V-ATPase activity.
More information about this protein can be found at Protein of the Month: ATP Synthases.
Initiation of eukaryotic mRNA transcription requires melting of promoter DNA with the help of the general transcription factors TFIIE and TFIIH. In higher eukaryotes, the general transcription factor TFIIE consists of two subunits: the large alpha subunit and the small beta. TFIIE beta has been found to bind to the region where the promoter starts to open to be single-stranded upon transcription initiation by RNA polymerase II. The approximately 120-residue central core domain of TFIIE beta plays a role in double-stranded DNA binding of TFIIE.
The TFIIE beta central core DNA-binding domain consists of three helices with a beta hairpin at the C-terminus, resembling the winged helix proteins. It shows a novel double-stranded DNA-binding activity where the DNA-binding surface locates on the opposite side to the previously reported winged helix motif by forming a positively charged furrow.
This entry represents the conserved amino terminal region of eukaryotic TFIIE-alpha and proteins from archaebacteria (TFE) that are also presumed to be TFIIE-alpha subunits.
S-AdoMet + tRNA = S-adenosyl-L-homocysteine + tRNA containing N2-methylguanine
The TRM1 gene of Saccharomyces cerevisiae is necessary for the N2,N2-dimethylguanosine modification of both mitochondrial and cytoplasmic tRNAs. The enzyme is found in both eukaryotes and archaea.
This entry represents the W2 domain (two invariant tryptophans) and is a region of ~165 amino acids which is found in the C-terminus of the following eIFs:
Translation initiation is a sophisticated, well-regulated and highly coordinated cellular process in eukaryotes, in which at least 11 eukaryotic initiation factors (eIFs) are involved.
The W2 domain has a globular fold and is composed exclusively of alpha-helices. The structure can be divided into a structural C-terminal core onto which the two N-terminal helices are attached. The core contains two aromatic/acidic residue-rich regions (AA boxes), which are important for mediating protein-protein interactions.
The entry covers the entire W2 domain.
Retroviral integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains: an N-terminal zinc-binding domain, a central catalytic core and a C-terminal DNA-binding domain. It is often found as part of the POL polyprotein.
The B subunit contains a region of similarity with the yeast protein HAP2. For the B subunit it has been suggested that the N-terminal portion of the conserved region is involved in subunit interaction and the C-terminal region involved in DNA-binding.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
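The difference between the loose HEXXH motif and the more stringent 'abXHEbbHbc' definition can be illustrated with two regular expressions. This is a minimal sketch, not a curated signature: the residue classes chosen for 'b' (uncharged) and 'c' (hydrophobic), the restriction of 'a' to valine or threonine, and the exclusion of proline from these classes are all assumptions made for illustration, and the test sequences are hypothetical.

```python
import re

# Loose motif from the text: His-Glu-X-X-His, where X is any residue.
HEXXH = re.compile(r"HE..H")

# Assumed residue classes for the stricter 'abXHEbbHbc' pattern.
A_SITE = "VT"                  # 'a': "most often" Val or Thr (treated as strict here)
UNCHARGED = "ACFGILMNQSTVWY"   # 'b': assumption, excludes D, E, K, R, H and P
HYDROPHOBIC = "AFILMVWY"       # 'c': assumption, one common hydrophobic set

# Positions: a b X H E b b H b c
STRICT = re.compile(
    f"[{A_SITE}][{UNCHARGED}].HE[{UNCHARGED}][{UNCHARGED}]H[{UNCHARGED}][{HYDROPHOBIC}]"
)


def find_motifs(sequence, pattern):
    """Return 0-based start positions of non-overlapping matches."""
    return [m.start() for m in pattern.finditer(sequence.upper())]


# Hypothetical sequences used only to exercise the two patterns.
candidate = "MKGVVAHELTHAVTDY"
decoy = "AAHEDKHAA"

print(find_motifs(candidate, HEXXH), find_motifs(candidate, STRICT))
print(find_motifs(decoy, HEXXH), find_motifs(decoy, STRICT))
```

The decoy contains HEXXH but has charged residues in the 'b' positions, so only the loose pattern reports it. This reflects the point made above that HEXXH alone is relatively common.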
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of metallopeptidases belongs to the MEROPS peptidase family M18 (clan MH). The proteins have two catalytic zinc ions at the active site, bound by His/Asp, Asp, Glu, Asp/Glu and His. The catalysed reaction involves the release of an N-terminal amino acid, usually neutral or hydrophobic, from a polypeptide.
The type example is aminopeptidase I from Saccharomyces cerevisiae (Baker's yeast), the sequence of which has been deduced, and the mature protein shown to consist of 469 amino acids. A 45-residue presequence contains both positively- and negatively-charged and hydrophobic residues, which could be arranged in an N-terminal amphiphilic alpha-helix. The presequence differs from signal sequences that direct proteins across bacterial plasma membranes and endoplasmic reticulum or into mitochondria. It is unclear how this unique presequence targets aminopeptidase I to yeast vacuoles, and how this sorting utilises classical protein secretory pathways.
These as yet uncharacterised proteins are of 17 to 21 kDa. They contain a conserved region with three histidines at the C terminus. In nearly every bacterium, the family is represented by a single member sequence.
The crystal structure of the protein from the hyperthermophilic bacterium Aquifex aeolicus has been determined. The overall fold consists of one central alpha-helix surrounded by a four-stranded beta-sheet and four other alpha-helices. Structure-based homology analysis reveals a good resemblance to metal-dependent proteinases such as collagenases and gelatinases. However, experimental tests for collagenase and gelatinase-type function show no detectable activity under standard assay conditions.
The post-translational attachment of ubiquitin to proteins (ubiquitinylation) alters the function, location or trafficking of a protein, or targets it to the 26S proteasome for degradation. Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), and a ubiquitin ligase (E3), which work sequentially in a cascade. The E1 enzyme is responsible for activating ubiquitin, the first step in ubiquitinylation. The E1 enzyme hydrolyses ATP and adenylates the C-terminal glycine residue of ubiquitin, and then links this residue to the active site cysteine of E1, yielding a ubiquitin-thioester and free AMP. To be fully active, E1 must non-covalently bind to and adenylate a second ubiquitin molecule. The E1 enzyme can then transfer the thioester-linked ubiquitin molecule to a cysteine residue on the ubiquitin-conjugating enzyme, E2, in an ATP-dependent reaction.
This domain is found twice in each member of the ubiquitin-activating enzymes and is located downstream of the active site cysteine.
Ran is an evolutionarily conserved member of the Ras superfamily of small GTPases that regulates all receptor-mediated transport between the nucleus and the cytoplasm. Import receptors bind their cargos in the cytoplasm where the concentration of RanGTP is low and release their cargos in the nucleus where the concentration of RanGTP is high. Export receptors respond to RanGTP in the opposite manner.
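The opposite responses of import and export receptors to the RanGTP gradient can be written as a small truth table. The sketch below is a toy model only: it reduces the RanGTP concentration to a high/low flag per compartment, and the names RAN_GTP_HIGH and cargo_is_bound are hypothetical.

```python
# RanGTP level per compartment, as stated in the text (low in cytoplasm, high in nucleus).
RAN_GTP_HIGH = {"cytoplasm": False, "nucleus": True}


def cargo_is_bound(receptor, compartment):
    """Toy rule: does a receptor of this kind hold its cargo in this compartment?"""
    ran_gtp_high = RAN_GTP_HIGH[compartment]
    if receptor == "import":
        # Import receptors bind cargo where RanGTP is low and release it where it is high.
        return not ran_gtp_high
    if receptor == "export":
        # Export receptors respond to RanGTP in the opposite manner.
        return ran_gtp_high
    raise ValueError(f"unknown receptor kind: {receptor!r}")


for receptor in ("import", "export"):
    for compartment in ("cytoplasm", "nucleus"):
        state = "binds" if cargo_is_bound(receptor, compartment) else "releases"
        print(f"{receptor} receptor {state} cargo in the {compartment}")
```

In this simplified picture, directionality comes entirely from the gradient: the same receptor rule gives loading on one side of the nuclear envelope and unloading on the other.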
Nuclear transport factor 2 (NTF2) is a homodimer of approximately 14 kDa subunits which stimulates efficient nuclear import of a cargo protein. NTF2 binds to both RanGDP and FxFG repeat-containing nucleoporins. NTF2 binds to RanGDP sufficiently strongly for the complex to remain intact during transport through nuclear pore complexes (NPCs), but the interaction between NTF2 and FxFG nucleoporins is much more transient, which would enable NTF2 to move through the NPC by hopping from one repeat to another.
NTF2 folds into a cone with a deep hydrophobic cavity, the opening of which is surrounded by several negatively charged residues. RanGDP binds to NTF2 by inserting a conserved phenylalanine residue into the hydrophobic pocket of NTF2 and making electrostatic interactions with the conserved negatively charged residues that surround the cavity.
A structurally similar domain appears in other nuclear import proteins.
The "beige" mouse is established as an animal model of Chediak-Higashi Syndrome (CHS). The BEACH domain was described in the BEIGE protein (D1035670) and in the highly homologous CHS protein. It is also found in distantly related proteins, for example factors associated with neutral sphingomyelinase activation.
The BEACH domain is usually followed by a series of WD repeats. The function of the BEACH domain is unknown.
This domain composes the whole protein of methylglyoxal synthetase and the domain is also found in Carbamoyl phosphate synthetase (CPS) where it forms a regulatory domain that binds to the allosteric effector ornithine. This family also includes inosicase. The known structures in this family show a common phosphate binding site.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents UBP-type zinc finger domains, which display some similarity with the Zn-binding domain of the insulinase family. The UBP-type zinc finger domain is found only in a small subfamily of ubiquitin C-terminal hydrolases (deubiquitinases or UBP). All members of this subfamily are isopeptidase-T enzymes, which are known to cleave isopeptide bonds between ubiquitin moieties.
Some of the proteins containing a UBP zinc finger include:
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
DNA-directed RNA polymerases(also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.
RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:
In archaebacteria, there is generally a single form of RNA polymerase, which also consists of an oligomeric assemblage of 10 to 13 polypeptides. It has recently been shown that small subunits of about 15 kDa, found in polymerase types I and II, are highly conserved. These proteins contain a probable zinc finger in their N-terminal region and a C-terminal zinc ribbon domain.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
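As a minimal sketch of how the short HEXXH motif described above might be located in a protein sequence, the Python snippet below scans for His-Glu-X-X-His with a regular expression. The function name `find_hexxh` and the example sequence are hypothetical and chosen only for illustration. Because the bare motif is relatively common, a hit on its own would not be evidence that a protein is a metalloprotease.

```python
import re

# HEXXH: His, Glu, any two residues, His (one-letter amino acid codes).
# A lookahead is used so that overlapping occurrences are also reported.
HEXXH = re.compile(r"(?=(HE[A-Z]{2}H))")

def find_hexxh(sequence):
    """Return (0-based start, matched text) for every HEXXH occurrence."""
    return [(m.start(), m.group(1)) for m in HEXXH.finditer(sequence.upper())]

# Hypothetical example sequence, not taken from any real protein.
print(find_hexxh("MKTAVIHELGHSLGL"))
```

The stricter 'abXHEbbHbc' definition could be layered on top of this by replacing the wildcard positions with explicit residue classes, once a concrete choice of "uncharged" and "hydrophobic" residues has been made.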
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This entry contains metallopeptidases belonging to MEROPS peptidase family M50 (S2P protease family, clan MM).
Members of the M50 metallopeptidase family include: mammalian sterol-regulatory element binding protein (SREBP) site 2 protease, Escherichia coli protease EcfE, stage IV sporulation protein FB and various hypothetical bacterial and eukaryotic homologues. A number of proteins are classified as non-peptidase homologues as they either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity.
AT hooks are DNA-binding motifs with a preference for A/T rich regions. These motifs are found in a variety of proteins, including the high mobility group (HMG) proteins, in DNA-binding proteins from plants and in hBRG1 protein, a central ATPase of the human switching/sucrose non-fermenting (SWI/SNF) remodeling complex.
High mobility group (HMG) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG-I and HMG-Y (HMGA) are proteins of about 100 amino acid residues which are produced by the alternative splicing of a single gene. HMG-I/Y proteins bind preferentially to the minor groove of AT-rich regions in double-stranded DNA in a non-sequence specific manner. It is suggested that these proteins could function in nucleosome phasing and in the 3' end processing of mRNA transcripts. They are also involved in the transcription regulation of genes containing, or in close proximity to, AT-rich regions.
Formin homology (FH) proteins play a crucial role in the reorganization of the actin cytoskeleton, which mediates various functions of the cell cortex including motility, adhesion, and cytokinesis. Formins are multidomain proteins that interact with diverse signalling molecules and cytoskeletal proteins, although some formins have been assigned functions within the nucleus. Formins are characterised by the presence of three FH domains (FH1, FH2 and FH3), although members of the formin family do not necessarily contain all three domains. The proline-rich FH1 domain mediates interactions with a variety of proteins, including the actin-binding protein profilin, SH3 (Src homology 3) domain proteins, and WW domain proteins. The FH2 domain is required for the self-association of formin proteins through the ability of FH2 domains to directly bind each other, and may also act to inhibit actin polymerisation. The FH3 domain is less well conserved and may be important for determining intracellular localisation of formin family proteins. In addition, some formins can contain a GTPase-binding domain (GBD) required for binding to Rho small GTPases, and a C-terminal conserved Dia-autoregulatory domain (DAD).
This entry represents the FH2 domain, which was shown by X-ray crystallography to have an elongated, crescent shape containing three helical subdomains.
This domain has been termed SRA-YDG, for SET and Ring finger Associated, and because of the conserved YDG motif within the domain. Further characteristics of the domain are the conservation of up to 13 evenly spaced glycine residues and a VRV(I/V)RG motif. The domain is found mainly in plants, animals and bacteria. In animals, this domain is associated with the Np95-like ring finger protein and the related gene product Np97, which contains PHD and RING FINGER domains and which is an important determinant in cell cycle progression. Np95 is a chromatin-associated ubiquitin ligase; its binding to histones is direct and shows a remarkable preference for histone H3 and its N-terminal tail. The SRA-YDG domain contained in Np95 is indispensable both for the interaction with histones and for chromatin binding in vivo. In plants the SRA-YDG domain is associated with the SET domain, found in a family of histone methyl transferases, and in bacteria it is found in association with HNH, a non-specific nuclease motif.
The HAT (Half A TPR) repeat has a repetitive pattern characterised by three aromatic residues with a conserved spacing. They are structurally and sequentially similar to TPRs (tetratricopeptide repeats), though they lack the highly conserved alanine and glycine residues found in TPRs. The number of HAT repeats found in different proteins varies between 9 and 12. HAT-repeat-containing proteins appear to be components of macromolecular complexes that are required for RNA processing. The repeats may be involved in protein-protein interactions. The HAT motif has striking structural similarities to HEAT repeats, being of a similar length and consisting of two short helices connected by a loop domain, as in HEAT repeats.
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
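The relationship between the linear order of the triad residues and the clan can be sketched in a few lines of Python. This is only an illustration of the three orderings named above. The function names are hypothetical, and the residue positions used in the example are the conventional chymotrypsin numbering (His57, Asp102, Ser195), which is not stated in the text.

```python
# Linear order of the catalytic triad residues, as given in the text.
TRIAD_ORDER_TO_CLAN = {"HDS": "PA", "DHS": "SB", "SDH": "SC"}

def triad_order(his_pos, asp_pos, ser_pos):
    """Return the triad residues sorted by sequence position, e.g. 'HDS'."""
    residues = sorted([(his_pos, "H"), (asp_pos, "D"), (ser_pos, "S")])
    return "".join(letter for _, letter in residues)

def clan_for_triad(his_pos, asp_pos, ser_pos):
    """Look up the clan with a matching triad order; None if not listed."""
    return TRIAD_ORDER_TO_CLAN.get(triad_order(his_pos, asp_pos, ser_pos))

# Chymotrypsin (conventional numbering): His57, Asp102, Ser195.
print(clan_for_triad(57, 102, 195))
```

A matching order is only suggestive: as the text notes, the arrangement commonly, but not always, reflects clan membership.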
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This signature defines the N-terminal domain of the archaeal, bacterial and eukaryotic lon proteases, which are ATP-dependent serine peptidases belonging to the MEROPS peptidase family S16 (lon protease family, clan SF). In the eukaryotes the majority of the proteins are located in the mitochondrial matrix. In yeast, Pim1 is located in the mitochondrial matrix and is required for mitochondrial function; it is constitutively expressed, but its expression increases after thermal stress, suggesting that Pim1 may play a role in the heat shock response.
The olfactomedin-domain was first identified in olfactomedin, an extracellular matrix protein of the olfactory neuroepithelium. Members of this extracellular domain-family have since been shown to be present in several metazoan proteins, such as latrophilins, myocilins, optimedins and noelins, the latter being involved in the generation of neural crest cells. Myocilin is of considerable interest, as mutations in its olfactomedin-domain can lead to glaucoma. The olfactomedin-domains in myocilin and optimedin are essential for the interaction between these two proteins.
The SWI/SNF family of complexes, which are conserved from yeast to humans, are ATP-dependent chromatin-remodelling proteins that facilitate transcription activation. The mammalian complexes are made up of 9-12 proteins called BAFs (BRG1-associated factors). The BAF60 family have at least three members: BAF60a, which is ubiquitous, BAF60b and BAF60c, which are expressed in muscle and pancreatic tissues, respectively. BAF60b is present in alternative forms of the SWI/SNF complex, including complex B (SWIB), which lacks BAF60a. The SWIB domain is a conserved region found within the BAF60b proteins, and can be found fused to the C-terminus of DNA topoisomerase in Chlamydia.
MDM2 is an oncoprotein that acts as a cellular inhibitor of the p53 tumour suppressor by binding to the transactivation domain of p53 and suppressing its ability to activate transcription. p53 acts in response to DNA damage, inducing cell cycle arrest and apoptosis. Inactivation of p53 is a common occurrence in neoplastic transformations. The core of MDM2 folds into an open bundle of four helices, which is capped by two small 3-stranded beta-sheets. It consists of a duplication of two structural repeats. MDM2 has a deep hydrophobic cleft on which the p53 alpha-helix binds; p53 residues involved in transactivation are buried deep within the cleft of MDM2, thereby concealing the p53 transactivation domain.
The SWIB and MDM2 domains are homologous and share a common fold.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
The N-end rule-based degradation signal, which targets a protein for ubiquitin-dependent proteolysis, comprises a destabilizing amino-terminal residue and a specific internal lysine residue. This entry describes a putative zinc finger in N-recognin, a recognition component of the N-end rule pathway.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Dynamin GTPase effector domain found in proteins related to dynamin.
Dynamin is a GTP-hydrolysing protein that is an essential participant in clathrin-mediated endocytosis by cells. It self-assembles into 'collars' in vivo at the necks of invaginated coated pits; the self-assembly of dynamin being coordinated by the GTPase domain. Mutation studies indicate that dynamin functions as a molecular regulator of receptor-mediated endocytosis.
The glycine-tyrosine-phenylalanine (GYF) domain is a domain of around 60 amino acids which contains a conserved GP[YF]xxxx[MV]xxWxxx[GN]YF motif. It was identified in the human intracellular protein termed CD2 binding protein 2 (CD2BP2), which binds to a site containing two tandem PPPGHR segments within the cytoplasmic region of CD2. Binding experiments and mutational analyses have demonstrated the critical importance of the GYF tripeptide in ligand binding. A GYF domain is also found in several other eukaryotic proteins of unknown function. It has been proposed that the GYF domain found in these proteins could also be involved in proline-rich sequence recognition. Resolution of the structure of the CD2BP2 GYF domain by NMR spectroscopy revealed a compact domain with a beta-beta-alpha-beta-beta topology, where the single alpha-helix is tilted away from the twisted, anti-parallel beta-sheet. The conserved residues of the GYF domain create a contiguous patch of predominantly hydrophobic nature which forms an integral part of the ligand-binding site. There is limited homology within the C-terminal 20-30 amino acids of various GYF domains, supporting the idea that this part of the domain is structurally but not functionally important.
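The motif notation above translates almost directly into a regular expression. The sketch below assumes that 'x' stands for any residue and that bracketed groups list the alternatives allowed at one position. The helper names and the synthetic test sequence are hypothetical and do not come from a real protein.

```python
import re

GYF_MOTIF = "GP[YF]xxxx[MV]xxWxxx[GN]YF"

def motif_to_regex(motif):
    """Translate the notation used in the text: 'x' means any residue."""
    return re.compile(motif.replace("x", "[A-Z]"))

def has_gyf_motif(sequence):
    """True if the sequence contains a match to the conserved GYF motif."""
    return motif_to_regex(GYF_MOTIF).search(sequence.upper()) is not None

# Synthetic sequence assembled position by position to satisfy the motif.
synthetic = "MS" + "GPY" + "AAAA" + "M" + "AA" + "W" + "AAA" + "G" + "YF" + "KK"
print(has_gyf_motif(synthetic))
```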
Potassium channels are the most diverse group of the ion channel family. They are important in shaping the action potential, and in neuronal excitability and plasticity. The potassium channel family is composed of several functionally distinct isoforms, which can be broadly separated into 2 groups: the practically non-inactivating 'delayed' group and the rapidly inactivating 'transient' group.
These are all highly similar proteins, with only small amino acid changes causing the diversity of the voltage-dependent gating mechanism, channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on their type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or other second messengers. In eukaryotic cells, K+ channels are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes. In prokaryotic cells, they play a role in the maintenance of ionic homeostasis.
All K+ channels discovered so far possess a core of alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has been termed the K+ selectivity sequence. In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane. However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains. The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK). The 2TM domain family comprises inward-rectifying K+ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels.
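The classification in the paragraph above can be summarised as a small decision function. This is a sketch that encodes only the rules stated in the text, keyed on the number of P-domains and transmembrane (TM) domains in an alpha subunit. The function name and the wording of the returned labels are hypothetical.

```python
def classify_k_channel_subunit(p_domains, tm_domains):
    """Classify a K+ channel alpha subunit using the rules given in the text."""
    if p_domains == 2:
        # Two P-domains: usually highly regulated K+ selective leak channels.
        return "two-P-domain K+ channel (usually a regulated leak channel)"
    if p_domains == 1 and tm_domains == 6:
        # Six-TM superfamily: Kv, KCNQ, EAG-like and Ca-activated (BK, IK, SK).
        return "six-TM superfamily (Kv, KCNQ, EAG-like or Ca-activated)"
    if p_domains == 1 and tm_domains == 2:
        # Two-TM family: inward-rectifying K+ channels.
        return "two-TM family (inward-rectifying)"
    return "not covered by the description"

print(classify_k_channel_subunit(1, 6))
print(classify_k_channel_subunit(1, 2))
```

Domain counts alone cannot separate the gene families within the six-TM superfamily; that would require sequence similarity and functional data.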
The Kv family can be divided into several subfamilies on the basis of sequence similarity and function. Four of these subfamilies, Kv1 (Shaker), Kv2 (Shab), Kv3 (Shaw) and Kv4 (Shal), consist of pore-forming alpha subunits that associate with different types of beta subunit. Each alpha subunit comprises six hydrophobic TM domains with a P-domain between the fifth and sixth, which partially resides in the membrane. The fourth TM domain has positively charged residues at every third residue and acts as a voltage sensor, which triggers the conformational change that opens the channel pore in response to a displacement in membrane potential. More recently, 4 new electrically-silent alpha subunits have been cloned: Kv5 (KCNF), Kv6 (KCNG), Kv8 and Kv9 (KCNS). These subunits do not themselves possess any functional activity, but appear to form heteromeric channels with Kv2 subunits, and thus modulate Shab channel activity. When highly expressed, they inhibit channel activity, but at lower levels show more specific modulatory actions.
The N-terminal, cytoplasmic tetramerization domain (T1) of voltage-gated potassium channels encodes molecular determinants for subfamily-specific assembly of alpha-subunits into functional tetrameric channels. This domain is found in a subset of a larger group of proteins that contain the BTB/POZ domain.
Thymidylate kinase (dTMP kinase) catalyzes the phosphorylation of thymidine 5'-monophosphate (dTMP) to form thymidine 5'-diphosphate (dTDP) in the presence of ATP and magnesium:
ATP + thymidine 5'-phosphate = ADP + thymidine 5'-diphosphate
Thymidylate kinase is a ubiquitous enzyme of about 25 kDa and is important in the dTTP synthesis pathway for DNA synthesis. Knowledge of the function of dTMP kinase in eukaryotes comes from the study of a cell cycle mutant, cdc8, in Saccharomyces cerevisiae. Structural and functional analyses suggest that the cDNA codes for authentic human dTMP kinase. The mRNA levels and enzyme activities corresponded to cell cycle progression and cell growth stages.
This entry represents known and predicted kinases, and related enzymes such as UMP-CMP kinase.
This C-terminal domain has an SH3-like barrel fold, the function of which is unknown. It is found associated with prokaryotic bifunctional transcriptional repressors and eukaryotic enzymes involved in biotin utilization.
In Escherichia coli the biotin operon repressor (BirA) is a bifunctional protein. BirA acts both as the acetyl-CoA carboxylase biotin holoenzyme synthetase and as the biotin operon repressor. DNA sequence analysis of mutations indicates that the helix-turn-helix DNA binding region is located at the N-terminus while mutations affecting enzyme function, although mapping over a large region, are found mainly in the central part of the protein's primary sequence.
Methylpurine-DNA glycosylase (MPG) is a base excision-repair protein. It is responsible for the hydrolysis of the deoxyribose N-glycosidic bond, excising 3-methyladenine and 3-methylguanine from damaged DNA. Its action is induced by alkylating chemotherapeutics, as well as deaminated and lipid peroxidation-induced purine adducts. MPG without an N-terminal extension excises hypoxanthine with one-third of the efficiency of full-length MPG under similar conditions, suggesting that its function may largely be attributable to the N-terminal extension.
PA28 activator complex (also known as 11S regulator of 20S proteasome) is a ring-shaped hexameric structure of alternating alpha (PA28alpha) and beta (PA28beta) subunits. The catalytic properties of PA28alpha- and PA28beta-activated proteasome are similar. This entry represents the beta subunit. The activator complex binds to the 20S proteasome and stimulates peptidase activity in an ATP-independent manner.
Guanylate-binding protein is a GTPase that is induced by interferon (IFN)-gamma. GTPases induced by IFN-gamma are key to the protective immunity against microbial and viral pathogens. These GTPases are classified into three groups: the small 47 kDa GTPases, the Mx proteins, and the large 65 to 67 kDa GTPases. Guanylate-binding proteins (GBP) fall into the last class. In humans, there are seven GBPs (hGBP1-7). Structurally, hGBP1 consists of two domains: a compact globular N-terminal domain harbouring the GTPase function, and an alpha-helical finger-like C-terminal domain. Human GBP1 is secreted from cells without the need of a leader peptide, and has been shown to exhibit antiviral activity against Vesicular stomatitis virus and Encephalomyocarditis virus, as well as being able to regulate the inhibition of proliferation and invasion of endothelial cells in response to IFN-gamma.
This domain is often found adjacent to the DHH domain, found in the RecJ-like phosphoesterase family, and is called DHHA1 for DHH associated domain. DHHA1 is diagnostic of DHH subfamily 1 members. This domain is also found in other proteins, e.g. alanyl tRNA synthetase, suggesting that it may have an RNA binding function. The domain is about 60 residues long and contains a conserved GG motif.
The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.
This entry represents the 14 kDa SRP14 component. Both SRP9 and SRP14 have the same (beta)-alpha-beta(3)-alpha fold. The heterodimer has pseudo two-fold symmetry and is saddle-like, consisting of a curved six-stranded beta-sheet that has four helices packed on the convex side and an exposed concave surface lined with positively charged residues. The SRP9/SRP14 heterodimer is essential for SRP RNA binding, mediating the pausing of synthesis of ribosome associated nascent polypeptides that have been engaged by the targeting domain of SRP.
This is a group of proteins found primarily in viruses, eukaryotes and in the pathogenic bacterium Chlamydia pneumoniae. In viruses they are annotated as replicase or RNA-dependent RNA polymerase. The eukaryotic sequences are related to the Ovarian Tumour (OTU) gene in Drosophila, cezanne deubiquitinating peptidase and tumor necrosis factor, alpha-induced protein 3 (MEROPS peptidase family C64) and otubain 1 and otubain 2 (MEROPS peptidase family C65).
None of these proteins has a known biochemical function but low sequence similarity with the polyprotein regions of arteriviruses, and conserved cysteine and histidine, and possibly the aspartate, residues suggests that those not yet recognised as peptidases could possess cysteine protease activity.
AAA ATPases (ATPases Associated with diverse cellular Activities) form a large protein family and play a number of roles in the cell including cell-cycle regulation, protein proteolysis and disaggregation, organelle biogenesis and intracellular transport. Some of them function as molecular chaperones, subunits of proteolytic complexes or independent proteases (FtsH, Lon). They also act as DNA helicases and transcription factors.
AAA ATPases belong to the AAA+ superfamily of ring-shaped P-loop NTPases, which act via the energy-dependent unfolding of macromolecules. There are six major clades of AAA domains (proteasome subunits, metalloproteases, domains D1 and D2 of ATPases with two AAA domains, the MSP1/katanin/spastin group and BCS1 and its homologues), as well as a number of deeply branching minor clades.
They assemble into oligomeric assemblies (often hexamers) that form a ring-shaped structure with a central pore. These proteins produce a molecular motor that couples ATP binding and hydrolysis to changes in conformational states that act upon a target substrate, either translocating or remodelling it.
They are found in all living organisms and share the common feature of the presence of a highly conserved AAA domain called the AAA module. This domain is responsible for ATP binding and hydrolysis. It contains 200-250 residues, among which are two classical motifs, Walker A (GX4GKT) and Walker B (HyDE).
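The Walker A shorthand GX4GKT can be read as "G, any four residues, then GKT". The sketch below expands that counted-wildcard notation into a regular expression and searches a sequence with it. The helper names are hypothetical, and the example sequence is an illustrative fragment written for this purpose rather than a quotation from a database entry.

```python
import re

def expand_motif(shorthand):
    """Turn shorthand such as 'GX4GKT' into a regex: 'Xn' means any n residues."""
    return re.sub(r"X(\d+)", lambda m: "[A-Z]{%s}" % m.group(1), shorthand)

WALKER_A = re.compile(expand_motif("GX4GKT"))

def find_walker_a(sequence):
    """Return (0-based start, matched text) of the first Walker A match, or None."""
    m = WALKER_A.search(sequence.upper())
    return None if m is None else (m.start(), m.group(0))

# Illustrative fragment containing a P-loop-like stretch.
print(find_walker_a("LLYGPPGTGKTLLA"))
```

Walker B (HyDE) is not covered here, because a regular expression for it would need an explicit choice of which residues count as hydrophobic.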
The VAT protein of the archaebacterium Thermoplasma acidophilum, like all other members of the Cdc48/p97 family of AAA ATPases, has two ATPase domains and a 185-residue amino-terminal substrate-recognition domain, VAT-N. VAT shows activity in protein folding and unfolding and thus shares the common function of these ATPases in disassembly and/or degradation of protein complexes.
VAT-N is composed of two equally sized subdomains. The amino-terminal subdomain VAT-Nn forms a double-psi beta-barrel whose pseudo-twofold symmetry is mirrored by an internal sequence repeat of 42 residues. The carboxy-terminal subdomain VAT-Nc forms a novel six-stranded beta-clam fold. Together, VAT-Nn and VAT-Nc form a kidney-shaped structure, in close agreement with results from electron microscopy. VAT-Nn is related to numerous proteins including prokaryotic transcription factors, metabolic enzymes, the protease cofactors UFD1 and PrlF, and aspartic proteinases.
This ATPase is involved in the removal of arsenite, antimonite, and arsenate from the cell.
In Escherichia coli an anion-translocating ATPase has been identified as the product of the arsenical resistance operon of resistance plasmid R773. This ATP-driven oxyanion pump catalyses extrusion of the oxyanions arsenite, antimonite and arsenate. Maintenance of a low intracellular concentration of oxyanion produces resistance to the toxic agents. The pump is composed of two polypeptides, the products of the arsA and arsB genes. This two-subunit enzyme produces resistance to arsenite and antimonite. A third gene, arsC, expands the substrate specificity to allow for arsenate pumping and resistance.
The ArsA and ArsB proteins form a membrane-bound pump that functions as an oxyanion-translocating ATPase. The ArsC protein is an arsenate reductase that reduces arsenate to arsenite, which is subsequently pumped out of the cell.
The recessive suppressor of secretory defect in yeast Golgi and yeast actin function belongs to this family. This protein may be involved in the coordination of the activities of the secretory pathway and the actin cytoskeleton.
Human synaptojanin which may be localised on coated endocytic intermediates in nerve terminals also belongs to this family.
This entry represents tRNA (guanine-N-7) methyltransferase, which catalyses the formation of N(7)-methylguanine at position 46 (m7G46) in tRNA. Capping of the pre-mRNA 5' end by addition of a monomethylated guanosine cap (m(7)G) is an essential modification, and the earliest, in the biogenesis of mRNA. The reaction is catalysed by three enzymes: triphosphatase, guanylyltransferase, and tRNA (guanine-N-7) methyltransferase.
Terpenes are among the largest groups of natural products and include compounds such as vitamins, cholesterol and carotenoids. The biosynthesis of all terpenoids begins with one or both of the two C5 precursors of the pathway: isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP). In animals, fungi, and certain bacteria, the synthesis of IPP and DMAPP occurs via the well-known mevalonate pathway, however, a second, nonmevalonate terpenoid pathway has been identified in many eubacteria, algae and the chloroplasts of higher plants.
LytB (IspH) catalyses the conversion of 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate into IPP and DMAPP in this second pathway. The enzyme appears to be responsible for a branch-step in the nonmevalonate pathway, in that IPP and DMAPP are produced in parallel from a single precursor, although the exact mechanism of this is not currently fully understood. Escherichia coli LytB protein had been found to regulate the activity of RelA (guanosine 3',5'-bispyrophosphate synthetase I), which in turn controls the level of a regulatory metabolite. It is involved in penicillin tolerance and the stringent response.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.
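The class assignment listed above lends itself to a simple lookup. The following sketch records the two lists exactly as given in the text and returns the class for a named amino acid. The names `CLASS_I`, `CLASS_II` and `synthetase_class` are hypothetical, and the subclass division (a, b, c) is not modelled because the text does not enumerate it.

```python
# Amino acid specificities of the two synthetase classes, as listed in the text.
CLASS_I = {
    "arginine", "cysteine", "glutamic acid", "glutamine", "isoleucine",
    "leucine", "methionine", "tyrosine", "tryptophan", "valine",
}
CLASS_II = {
    "alanine", "asparagine", "aspartic acid", "glycine", "histidine",
    "lysine", "phenylalanine", "proline", "serine", "threonine",
}

def synthetase_class(amino_acid):
    """Return 'I' or 'II' for the synthetase specific for the given amino acid."""
    name = amino_acid.strip().lower()
    if name in CLASS_I:
        return "I"
    if name in CLASS_II:
        return "II"
    raise ValueError("amino acid not in either list: %r" % amino_acid)

print(synthetase_class("Serine"))
```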
This entry represents the N-terminal domain of seryl-tRNA synthetase, which consists of two helices in a long alpha-hairpin. Seryl-tRNA synthetase exists as a monomer and belongs to class IIa.
A novel antigen of Plasmodium falciparum has been cloned that contains a hydrophobic domain typical of an integral membrane protein. The antigen is designated apical membrane antigen 1 (AMA-1) by virtue of appearing to be located in the apical complex. AMA-1 appears to be transported to the merozoite surface close to the time of schizont rupture.
The 66 kDa merozoite surface antigen (PK66) of Plasmodium knowlesi, a simian malaria parasite, possesses vaccine-related properties believed to originate from a receptor-like role in parasite invasion of erythrocytes. The sequence of PK66 is conserved throughout Plasmodium, and shows high similarity to P. falciparum AMA-1. Following schizont rupture, the distribution of PK66 changes in a coordinate manner associated with merozoite invasion. Prior to rupture, the protein is concentrated at the apical end, following which it distributes itself entirely across the surface of the free merozoite. Immunofluorescence studies suggest that, during invasion, PK66 is excluded from the erythrocyte at, and behind, the invasion interface.
The SMC (structural maintenance of chromosomes) family of proteins exists in virtually all organisms including both bacteria and archaea. The SMC proteins are essential for successful chromosome transmission during replication and segregation of the genome in all organisms and form three types of heterodimer (SMC1-SMC3, SMC2-SMC4, SMC5-SMC6), which are core components of large multiprotein complexes. The best known complexes are cohesin, which is responsible for sister-chromatid cohesion, and condensin, which is required for full chromosome condensation in mitosis.
SMCs are generally present as single proteins in bacteria, and as at least six distinct proteins in eukaryotes. The proteins range in size from approximately 110 to 170 kDa, and share a five-domain structure, with globular N- and C-terminal domains separated by a long (circa 100 nm or 900 residues) coiled coil segment in the centre of which is a globular 'hinge' domain, characterised by a set of four highly conserved glycine residues that are typical of flexible regions in a protein. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif (XXXXD, where X is any hydrophobic residue), and an LSGG motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases.
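The three sequence motifs named above translate directly into regular expressions. This is a minimal sketch for scanning a one-letter protein sequence; the function name `find_smc_motifs` is hypothetical, and the set of residues treated as hydrophobic (A, V, L, I, M, F, W, Y) is an assumption, since the text does not define it.

```python
import re

# Walker A as written in the text: GxxGxGKS/T, where x is any residue.
WALKER_A = re.compile(r"G..G.GK[ST]")

# Walker B-like DA-box: XXXXD with X hydrophobic.
# Assumption: which residues count as hydrophobic is not stated in the text.
HYDROPHOBIC = "AVLIMFWY"
WALKER_B = re.compile(rf"[{HYDROPHOBIC}]{{4}}D")

# ABC-like signature motif.
SIGNATURE = re.compile(r"LSGG")


def find_smc_motifs(sequence: str) -> dict:
    """Return 0-based start positions of each motif in the sequence."""
    seq = sequence.upper()
    return {
        "walker_a": [m.start() for m in WALKER_A.finditer(seq)],
        "walker_b": [m.start() for m in WALKER_B.finditer(seq)],
        "signature": [m.start() for m in SIGNATURE.finditer(seq)],
    }
```

Patterns this short will also match by chance in unrelated proteins, so a hit is only suggestive; in SMC proteins the Walker A motif is expected in the N-terminal domain and the DA-box and LSGG motif in the C-terminal domain.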
All SMC proteins appear to form dimers, either forming homodimers with themselves, as in the case of prokaryotic SMC proteins, or heterodimers between different but related SMC proteins. The dimers are arranged in an antiparallel alignment. This orientation brings the N- and C-terminal globular domains (from either different or identical protomers) together, which unites an ATP binding site (Walker A motif) within the N-terminal domain with a Walker B motif (DA box) within the C-terminal domain, to form a potentially functional ATPase. Protein interaction and microscopy data suggest that SMC dimers form a ring-like structure which might embrace DNA molecules. Non-SMC subunits associate with the SMC amino- and carboxy-terminal domains. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences.
SMCs share not only sequence similarity but also structural similarity with ABC proteins. SMC proteins function together with other proteins in a range of chromosomal transactions, including chromosome condensation, sister-chromatid cohesion, recombination, DNA repair and epigenetic silencing of gene expression.
This domain is found at the N terminus of SMC proteins.
The membrane-embedded multi-protein complexes of mitochondria mediate the transport of nuclear-encoded proteins across and into the outer or inner mitochondrial membranes. The TOM (translocase of the outer mitochondrial membrane) complex consists of cytosol-exposed receptors and a pore-forming core, and mediates the transport of proteins from the cytosol across and into the outer mitochondrial membrane. A novel protein complex in the outer membrane of mitochondria, called the SAM complex (sorting and assembly machinery), is involved in the biogenesis of beta-barrel proteins of the outer membrane. Two translocases of the inner mitochondrial membrane (TIM complexes) mediate protein transport at the inner membrane.
The TIM23 complex (a presequence translocase) mediates the transport of presequence-containing proteins across and into the inner membrane. TIM17 forms a part of this complex, although its role is not yet fully understood. The TIM22 complex (a twin-pore carrier translocase) catalyses the insertion of multi-spanning proteins that have internal targeting signals into the inner membrane, and it uses a membrane potential as an external driving force. The Tim22 subunit of the mitochondrial import inner membrane translocase is included in this family.
Cobalamin (vitamin B12) is a structurally complex cofactor, consisting of a modified tetrapyrrole with a centrally chelated cobalt. Cobalamin is usually found in one of two biologically active forms: methylcobalamin and adocobalamin. Most prokaryotes, as well as animals, have cobalamin-dependent enzymes, whereas plants and fungi do not appear to use it. In bacteria and archaea, these include methionine synthase, ribonucleotide reductase, glutamate and methylmalonyl-CoA mutases, ethanolamine ammonia lyase, and diol dehydratase. In mammals, cobalamin is obtained through the diet, and is required for methionine synthase and methylmalonyl-CoA mutase.
There are at least two distinct cobalamin biosynthetic pathways in bacteria, one aerobic and one anaerobic.
Either pathway can be divided into two parts: (1) corrin ring synthesis (differs in aerobic and anaerobic pathways) and (2) adenosylation of corrin ring, attachment of aminopropanol arm, and assembly of the nucleotide loop (common to both pathways). There are about 30 enzymes involved in either pathway, where those involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Several of these enzymes are pathway-specific: CbiD, CbiG, and CbiK are specific to the anaerobic route of S. typhimurium, whereas CobE, CobF, CobG, CobN, CobS, CobT, and CobW are unique to the aerobic pathway of P. denitrificans.
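The naming convention and the pathway-specific enzymes listed above can be expressed as a small lookup. This is an illustrative sketch only; the function name `cobalamin_pathway` and the returned labels are assumptions, and the prefix rule is just the Cob/Cbi convention described in the text.

```python
# Pathway-specific enzymes exactly as listed in the paragraph above.
ANAEROBIC_SPECIFIC = {"CbiD", "CbiG", "CbiK"}
AEROBIC_SPECIFIC = {"CobE", "CobF", "CobG", "CobN", "CobS", "CobT", "CobW"}


def cobalamin_pathway(enzyme: str) -> str:
    """Classify an enzyme name by pathway (hypothetical helper)."""
    if enzyme in ANAEROBIC_SPECIFIC:
        return "anaerobic-specific"
    if enzyme in AEROBIC_SPECIFIC:
        return "aerobic-specific"
    # Fall back to the naming convention: Cbi = anaerobic, Cob = aerobic.
    if enzyme.startswith("Cbi"):
        return "anaerobic (by prefix)"
    if enzyme.startswith("Cob"):
        return "aerobic (by prefix)"
    return "unknown"
```

An enzyme classified only by prefix may still have a functional counterpart in the other pathway; only the listed names are described in the text as unique to one route.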
CobW proteins are generally found proximal to the trimeric cobaltochelatase subunit CobN, which is essential for vitamin B12 (cobalamin) biosynthesis. They contain a P-loop nucleotide-binding loop in the N-terminal domain and a histidine-rich region in the C-terminal portion suggesting a role in metal binding, possibly as an intermediary between the cobalt transport and chelation systems. CobW might be involved in cobalt reduction leading to cobalt(I) corrinoids.
This entry represents CobW-like proteins, including P47K, a Pseudomonas chlororaphis protein needed for nitrile hydratase expression, and urease accessory protein UreG, which acts as a chaperone in the activation of urease upon insertion of nickel into the active site.
N-linked glycosylation is a ubiquitous protein modification, and is essential for viability in eukaryotic cells. A lipid-linked core-oligosaccharide is assembled at the membrane of the endoplasmic reticulum and transferred to selected asparagine residues of nascent polypeptide chains by the oligosaccharyl transferase (OTase) complex.
This family consists of the oligosaccharyl transferase STT3 subunit and related proteins. The STT3 subunit is part of the oligosaccharyl transferase (OTase) complex of proteins and is required for its activity.
This domain is found in several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
This family currently contains one sequence of known function: human mitochondrial transcription termination factor (mTERF), a multizipper protein that nevertheless binds to DNA as a monomer, with evidence pointing to intramolecular leucine zipper interactions. The precursors contain a mitochondrial targeting sequence, and the mature mTERF exhibits three leucine zippers, of which one is bipartite, and two widely spaced basic domains. Both basic domains and the three leucine zipper motifs are necessary for DNA binding. The leucine zippers are not implicated in a dimerisation role as in other leucine zippers.
The rest of the family consists of hypothetical proteins none of which have any functional information.
This entry represents MECDP (2-C-methyl-D-erythritol 2,4-cyclodiphosphate) synthetase, an enzyme in the non-mevalonate pathway of isoprenoid synthesis, isoprenoids being essential in all organisms. Isoprenoids can also be synthesized through the mevalonate pathway. The non-mevalonate route is used by many bacteria and human pathogens, including Mycobacterium tuberculosis and Plasmodium falciparum. This route appears to involve seven enzymes. MECDP synthetase catalyses the intramolecular attack by a phosphate group on a diphosphate, with cytidine monophosphate (CMP) acting as the leaving group to give the cyclic diphosphate product MECDP. The enzyme is a trimer with three active sites shared between adjacent copies of the protein. The enzyme also has two metal binding sites, the metals playing key roles in catalysis.
A number of proteins from eukaryotes and prokaryotes share this common N-terminal signature and appear to be involved in terpenoid biosynthesis. The ygbB protein is a putative enzyme of this type.
Synonym(s): Steroid 5-alpha-reductase
3-oxo-5-alpha-steroid 4-dehydrogenases catalyse the conversion of 3-oxo-5-alpha-steroid + acceptor to 3-oxo-delta(4)-steroid + reduced acceptor. The steroid 5-alpha-reductase enzyme is responsible for the formation of dihydrotestosterone, a hormone that promotes the differentiation of male external genitalia and the prostate during foetal development. In humans, mutations in this enzyme can cause a form of male pseudohermaphroditism in which the external genitalia and prostate fail to develop normally. A related steroid reductase enzyme, DET2, is found in plants such as Arabidopsis. Mutations in this enzyme cause defects in light-regulated development. This domain is present in both type 1 and type 2 forms.
Maf is a putative inhibitor of septum formation in eukaryotes, bacteria, and archaea. The Maf protein shares substantial amino acid sequence identity with the Escherichia coli OrfE protein.
This homodimeric enzyme appears able to cleave any D-amino acid (and glycine, which does not have distinct D/L forms) from charged tRNA. The name reflects characterization with respect to D-Tyr on tRNA(Tyr) as established in the literature, but substrate specificity seems much broader.
This entry describes proteins of unknown function.
A number of the members of this family have been characterised as a probable N-acetylglucosaminyl-phosphatidylinositol de-N-acetylase, that catalyses the second step in glycosylphosphatidylinositol (GPI) biosynthesis.
This entry describes proteins of unknown function.
In the bacterial cytosol, ATP-dependent protein degradation is performed by several different chaperone-protease pairs, including ClpAP. ClpS directly influences the ClpAP machine by binding to the N-terminal domain of the chaperone ClpA. The degradation of ClpAP substrates, both SsrA-tagged proteins and ClpA itself, is specifically inhibited by ClpS. ClpS modifies ClpA substrate specificity, potentially redirecting degradation by ClpAP toward aggregated proteins.
ClpS is a small alpha/beta protein that consists of three alpha-helices connected to three antiparallel beta-strands. The protein has a globular shape, with a curved layer of three antiparallel alpha-helices over a twisted antiparallel beta-sheet. Dimerization of ClpS may occur through its N-terminal domain. This short extended N-terminal region in ClpS is followed by the central seven-residue beta-strand, which is flanked by two other beta-strands in a small beta-sheet.
This family is involved in the biogenesis of respiratory and photosynthetic systems. In yeast, the SCO1 protein is specifically required for a post-translational step in the accumulation of subunits 1 and 2 of cytochrome c oxidase (COX-I and COX-II). It is a mitochondrion-associated cytochrome c oxidase assembly factor.
The purple nonsulphur photosynthetic eubacterium Rhodobacter capsulatus is a versatile organism that can obtain cellular energy by several means, including the capture of light energy for photosynthesis as well as the use of light-independent respiration, in which molecular oxygen serves as a terminal electron acceptor. The SenC protein is required for optimal cytochrome c oxidase activity in aerobically grown R. capsulatus cells and is involved in the induction of structural polypeptides of the light-harvesting and reaction centre complexes.
The GatB domain, the function of which is uncertain, is associated with aspartyl/glutamyl amidotransferase subunit B and glutamyl amidotransferase subunit E. These are involved in the formation of correctly charged Asn-tRNA(Asn) or Gln-tRNA(Gln) through the transamidation of misacylated Asp-tRNA(Asn) or Glu-tRNA(Gln) in organisms which lack either or both of asparaginyl-tRNA or glutaminyl-tRNA synthetases. The reaction takes place in the presence of glutamine and ATP through an activated phospho-Asp-tRNA(Asn) or phospho-Glu-tRNA(Gln).
This entry describes proteins of unknown function.
This family consists of the SufE-related proteins. These have been implicated in Fe-S metabolism and export.
This entry contains two related enzymes:
SKIP (SKI-interacting protein) is an essential spliceosomal component and transcriptional coregulator, which may provide regulatory coupling of transcription initiation and splicing. SKIP was identified in a yeast 2-hybrid screen, where it was shown to interact with both the cellular and viral forms of SKI through the highly conserved region on SKIP known as the SNW domain. SKIP is now known to interact with a number of other proteins as well. SKIP potentiates the activity of important transcription factors, such as vitamin D receptor, CBF1 (RBP-Jkappa), Smad2/3, and MyoD. It works with Ski in overcoming pRb-mediated cell cycle arrest, and it is targeted by the viral transactivators EBNA2 and E7.
This entry represents the SNW domain.
This entry represents a structural motif found in several DNA repair nucleases, such as Rad1/Mus81/XPF endonucleases, and in ATP-dependent helicases. The XPF/Rad1/Mus81-dependent nuclease family specifically cleaves branched structures generated during DNA repair, replication, and recombination, and is essential for maintaining genome stability. The nuclease domain architecture exhibits remarkable similarity to those of restriction endonucleases.
The N-terminal and internal 5'-3' exonuclease ('53EXO') domains are commonly found together, and are most often associated with 5' to 3' nuclease activities. The XPG protein signatures are never found outside the 53EXO domains. The latter are found in more diverse proteins. The number of amino acids that separate the two 53EXO domains, and the presence of accompanying motifs, allow the diagnosis of several protein families.
In the eubacterial type A DNA polymerases, the N-terminal and internal domains are separated by a few amino acids, usually four. The pattern DNA_POLYMERASE_A is always present towards the C-terminus. Several eukaryotic structure-dependent endonucleases and exonucleases have the 53EXO domains separated by 24 to 27 amino acids, and the XPG protein signatures are always present. In several proteins from Herpesviridae, the two 53EXO domains are separated by 50 to 120 amino acids. These proteins are implicated in the inhibition of the expression of the host genes. Eukaryotic DNA repair proteins with 600 to 700 amino acids between the 53EXO domains all carry the XPG protein signatures.
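The spacing rules above amount to a small decision table keyed on the number of residues between the two 53EXO domains. The sketch below is illustrative only: the function name `diagnose_53exo_family` is hypothetical, the cut-off of 10 residues for "a few amino acids" is an assumption, and a real diagnosis would also check the accompanying signatures (DNA_POLYMERASE_A, XPG) mentioned in the text.

```python
def diagnose_53exo_family(spacer_length: int) -> str:
    """Suggest a protein family from the spacer between the two 53EXO domains.

    Ranges follow the text; the upper bound for 'a few amino acids'
    (10 residues) is an assumption made for this sketch.
    """
    if spacer_length < 0:
        raise ValueError("spacer length cannot be negative")
    if spacer_length <= 10:
        return "eubacterial type A DNA polymerase"
    if 24 <= spacer_length <= 27:
        return "eukaryotic structure-dependent endonuclease/exonuclease"
    if 50 <= spacer_length <= 120:
        return "Herpesviridae host shut-off protein"
    if 600 <= spacer_length <= 700:
        return "eukaryotic DNA repair protein"
    return "unclassified"
```

Spacer lengths outside the ranges given in the text are reported as unclassified rather than forced into the nearest group.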
Proliferating cell nuclear antigen (PCNA), or cyclin, is a non-histone acidic nuclear protein that plays a key role in the control of eukaryotic DNA replication. It acts as a co-factor for DNA polymerase delta, which is responsible for leading strand DNA replication. The sequence of PCNA is well conserved between plants and animals, indicating a strong selective pressure for structure conservation, and suggesting that this type of DNA replication mechanism is conserved throughout eukaryotes. In Saccharomyces cerevisiae (Baker's yeast), POL30 is associated with polymerase III, the yeast analog of polymerase delta.
Homologues of PCNA have also been identified in the archaea (Euryarchaeota and Crenarchaeota) and in Paramecium bursaria Chlorella virus 1 (PBCV-1) and in nuclear polyhedrosis viruses.
This entry includes hydrogenase expression/formation protein HypE, which may be involved in the maturation of NiFe hydrogenase; AIR synthase and FGAM synthase, which are involved in de novo purine biosynthesis; and selenide, water dikinase, an enzyme which synthesizes selenophosphate from selenide and ATP.
S-adenosylmethionine synthetase (MAT) is the enzyme that catalyzes the formation of S-adenosylmethionine (AdoMet) from methionine and ATP. AdoMet is an important methyl donor for transmethylation and is also the propylamino donor in polyamine biosynthesis.
In bacteria there is a single isoform of AdoMet synthetase (gene metK). There are two isoforms in budding yeast (genes SAM1 and SAM2) and in mammals, while in plants there is generally a multigene family.
The sequence of AdoMet synthetase is highly conserved throughout isozymes and species. The active sites of both the Escherichia coli and rat liver MAT reside between two subunits, with contributions from side chains of residues from both subunits, resulting in a dimer as the minimal catalytic entity. The side chains that contribute to the ligand binding sites are conserved between the two proteins. In the structures of complexes with the E. coli enzyme, the phosphate groups have the same positions in the (PPi plus Pi) complex and the (ADP plus Pi) complex, and are located at the bottom of a deep cavity with the adenosyl group nearer the entrance.
A number of enzymes require thiamine pyrophosphate (TPP) (vitamin B1) as a cofactor. It has been shown that some of these enzymes are structurally related. This represents the C-terminal TPP binding domain of TPP enzymes.
A number of enzymes require thiamine pyrophosphate (TPP) (vitamin B1) as a cofactor. It has been shown that some of these enzymes are structurally related. This represents the N-terminal TPP binding domain of TPP enzymes.
Superoxide dismutases (SODs) catalyse the conversion of superoxide radicals to molecular oxygen. Their function is to destroy the radicals that are normally produced within cells and are toxic to biological systems. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one. This family includes both single metal-binding SODs and cambialistic SOD, which can bind either Mn or Fe. Fe/MnSODs are ubiquitous enzymes that are responsible for the majority of SOD activity in prokaryotes, fungi, blue-green algae and mitochondria. Fe/MnSODs are found as homodimers or homotetramers.
The structure of Fe/MnSODs can be divided into two domains, an alpha N-terminal domain and an alpha/beta C-terminal domain, connected by a loop. The structure of the N-terminal domain consists of two helices in an antiparallel hairpin, with a left-handed twist. The structure of the C-terminal domain is of the alpha/beta type, and consists of a three-stranded antiparallel beta-sheet in the order 213, along with four helices in the arrangement alpha/beta(2)/alpha/beta/alpha(2).
Transketolase (TK) catalyzes the reversible transfer of a two-carbon ketol unit from xylulose 5-phosphate to an aldose acceptor, such as ribose 5-phosphate, to form sedoheptulose 7-phosphate and glyceraldehyde 3-phosphate. This enzyme, together with transaldolase, provides a link between the glycolytic and pentose-phosphate pathways. TK requires thiamine pyrophosphate as a cofactor. In most sources where TK has been purified, it is a homodimer of approximately 70 kDa subunits. TK sequences from a variety of eukaryotic and prokaryotic sources show that the enzyme has been evolutionarily conserved. In the peroxisomes of the methylotrophic yeast Pichia angusta (Hansenula polymorpha), there is a highly related enzyme, dihydroxy-acetone synthase (DHAS) (also known as formaldehyde transketolase), which exhibits a very unusual specificity by including formaldehyde amongst its substrates.
1-deoxyxylulose-5-phosphate synthase (DXP synthase) is an enzyme so far found in bacteria (gene dxs) and plants (gene CLA1) which catalyzes the thiamine pyrophosphate-dependent acyloin condensation reaction between carbon atoms 2 and 3 of pyruvate and glyceraldehyde 3-phosphate to yield 1-deoxy-D-xylulose-5-phosphate (DXP), a precursor in the biosynthetic pathway to isoprenoids, thiamine (vitamin B1), and pyridoxol (vitamin B6). DXP synthase is evolutionarily related to TK. The N-terminal section contains a histidine residue which appears to function in proton transfer during catalysis. In the central section there are conserved acidic residues that are part of the active cleft and may participate in substrate binding. This family includes transketolase enzymes and also partially matches the 2-oxoisovalerate dehydrogenase beta subunit. Both these enzymes utilise thiamine pyrophosphate as a cofactor, suggesting there may be common aspects in their mechanism of catalysis.
Glucose-6-phosphate dehydrogenase (G6PDH) is a ubiquitous protein, present in bacteria and all eukaryotic cell types. The enzyme catalyses the first step in the pentose pathway, i.e. the conversion of glucose-6-phosphate to gluconolactone 6-phosphate in the presence of NADP, producing NADPH. The ubiquitous expression of the enzyme gives it a major role in the production of NADPH for the many NADPH-mediated reductive processes in all cells. Deficiency of G6PDH is a common genetic abnormality affecting millions of people worldwide. Many sequence variants, most caused by single point mutations, are known, exhibiting a wide variety of phenotypes.
This entry represents the C-terminal domain of these proteins. It adopts a ribonuclease H-like fold and is structurally related to the N-terminal domain.
Acetyl-CoA carboxylase is found in all animals, plants, and bacteria and catalyzes the first committed step in fatty acid synthesis. It is a multicomponent enzyme containing a biotin carboxylase activity, a biotin carboxyl carrier protein, and a carboxyltransferase functionality. The "B-domain" extends from the main body of the subunit where it folds into two alpha-helical regions and three strands of beta-sheet. Following the excursion into the B-domain, the polypeptide chain folds back into the body of the protein where it forms an eight-stranded antiparallel beta-sheet. In addition to this major secondary structural element, the C-terminal domain also contains a smaller three-stranded antiparallel beta-sheet and seven alpha-helices.
Carbamoyl phosphate synthase (CPSase) is a heterodimeric enzyme composed of a small and a large subunit (with the exception of CPSase III, see below). CPSase catalyses the synthesis of carbamoyl phosphate from bicarbonate, ATP and glutamine or ammonia, and represents the first committed step in pyrimidine and arginine biosynthesis in prokaryotes and eukaryotes, and in the urea cycle in most terrestrial vertebrates. CPSase has three active sites, one in the small subunit and two in the large subunit. The small subunit contains the glutamine binding site and catalyses the hydrolysis of glutamine to glutamate and ammonia. The large subunit has two homologous carboxy phosphate domains, both of which have ATP-binding sites; however, the N-terminal carboxy phosphate domain catalyses the phosphorylation of bicarbonate, while the C-terminal domain catalyses the phosphorylation of the carbamate intermediate. The carboxy phosphate domain found duplicated in the large subunit of CPSase is also present as a single copy in the biotin-dependent enzymes acetyl-CoA carboxylase (ACC), propionyl-CoA carboxylase (PCCase), pyruvate carboxylase (PC) and urea carboxylase.
Most prokaryotes carry one form of CPSase that participates in both arginine and pyrimidine biosynthesis; however, certain bacteria can have separate forms. The large subunit in bacterial CPSase has four structural domains: the carboxy phosphate domain 1, the oligomerisation domain, the carbamoyl phosphate domain 2 and the allosteric domain. CPSase heterodimers from Escherichia coli contain two molecular tunnels: an ammonia tunnel and a carbamate tunnel. These inter-domain tunnels connect the three distinct active sites, and function as conduits for the transport of unstable reaction intermediates (ammonia and carbamate) between successive active sites. The catalytic mechanism of CPSase involves the diffusion of carbamate through the interior of the enzyme from the site of synthesis within the N-terminal domain of the large subunit to the site of phosphorylation within the C-terminal domain.
Eukaryotes have two distinct forms of CPSase: a mitochondrial enzyme (CPSase I) that participates in both arginine biosynthesis and the urea cycle; and a cytosolic enzyme (CPSase II) involved in pyrimidine biosynthesis. CPSase II occurs as part of a multi-enzyme complex along with aspartate transcarbamoylase and dihydroorotase; this complex is referred to as the CAD protein. The hepatic expression of CPSase is transcriptionally regulated by glucocorticoids and/or cAMP. There is a third form of the enzyme, CPSase III, found in fish, which uses glutamine as a nitrogen source instead of ammonia. CPSase III is closely related to CPSase I, and is composed of a single polypeptide that may have arisen from gene fusion of the glutaminase and synthetase domains.
This entry represents the ATP-binding domain found in the large subunit of carbamoyl phosphate synthase, as well as in related proteins.
This entry represents the oligomerisation domain found in the large subunit of carbamoyl phosphate synthases as well as in certain other carboxy phosphate domain-containing enzymes.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophillic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
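The difference between the loose HEXXH motif and the more stringent 'abXHEbbHbc' definition can be illustrated with a short regular-expression sketch. The residue classes used here are assumptions, since the text does not enumerate them: "uncharged" is taken to exclude D, E, K, R and H, "hydrophobic" is taken as A, F, I, L, M, V, W, Y, position 'a' is restricted to Val/Thr although the text only says "most often", and proline is excluded from the 'b' and 'c' positions. The test sequences are invented for illustration.

```python
import re

# Assumed residue classes (not defined in the text).
UNCHARGED = "ACFGILMNQSTVWY"   # no D, E, K, R, H; proline also left out
HYDROPHOBIC = "AFILMVWY"

# Loose motif: His-Glu-X-X-His. The lookahead allows overlapping hits.
LOOSE = re.compile(r"(?=(HE..H))")

# Stricter 'abXHEbbHbc' motif, with 'a' restricted to Val or Thr.
STRICT = re.compile(
    rf"(?=([VT][{UNCHARGED}].HE[{UNCHARGED}]{{2}}H[{UNCHARGED}][{HYDROPHOBIC}]))"
)


def find_motifs(sequence, pattern):
    """Return (0-based start, matched text) for every hit of pattern."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


toy = "MKTVVAHELTHAVGG"        # made-up sequence, not a real protein
with_pro = "MKTVVAHEPTHAVGG"   # same sequence with proline inside the motif

print(find_motifs(toy, LOOSE))
print(find_motifs(toy, STRICT))
print(find_motifs(with_pro, LOOSE))
print(find_motifs(with_pro, STRICT))
```

The proline-containing sequence still satisfies the loose HEXXH pattern but is rejected by the stricter one, which is the kind of discrimination the tighter definition is meant to provide.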
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of metallopeptidases belongs to the MEROPS peptidase family M17 (leucyl aminopeptidase family, clan MF), the type example being leucyl aminopeptidase from Bos taurus (Bovine).
Aminopeptidases are exopeptidases involved in the processing and regular turnover of intracellular proteins, although their precise role in cellular metabolism is unclear. Leucine aminopeptidases cleave leucine residues from the N terminus of polypeptide chains, but substantial rates are evident for all amino acids.
The enzymes exist as homo-hexamers, comprising 2 trimers stacked on top of one another. Each monomer binds 2 zinc ions and folds into 2 alpha/beta-type quasi-spherical globular domains, producing a comma-like shape. The N-terminal 150 residues form a 5-stranded beta-sheet with 4 parallel and 1 anti-parallel strand sandwiched between 4 alpha-helices. An alpha-helix extends into the C-terminal domain, which comprises a central 8-stranded saddle-shaped beta-sheet sandwiched between groups of helices, forming the monomer hydrophobic core. A 3-stranded beta-sheet resides on the surface of the monomer, where it interacts with other members of the hexamer. The two zinc ions and the active site are entirely located in the C-terminal catalytic domain.
In eukaryotes, glutathione S-transferases (GSTs) participate in the detoxification of reactive electrophilic compounds by catalysing their conjugation to glutathione. The GST domain is also found in S-crystallins from squid, and proteins with no known GST activity, such as eukaryotic elongation factors 1-gamma and the HSP26 family of stress-related proteins, which include auxin-regulated proteins in plants and stringent starvation proteins in Escherichia coli. The major lens polypeptide of Cephalopoda is also a GST.
Bacterial GSTs of known function often have a specific, growth-supporting role in biodegradative metabolism: epoxide ring opening and tetrachlorohydroquinone reductive dehalogenation are two examples of the reactions catalysed by these bacterial GSTs. Some regulatory proteins, like the stringent starvation proteins, also belong to the GST family. GST seems to be absent from Archaea, in which gamma-glutamylcysteine substitutes for glutathione as the major thiol.
Soluble GSTs activate glutathione (GSH) to GS-. In many GSTs, this is accomplished by a Tyr at H-bonding distance from the sulphur of GSH. These enzymes catalyse nucleophilic attack by reduced glutathione (GSH) on nonpolar compounds that contain an electrophilic carbon, nitrogen, or sulphur atom.
Glutathione S-transferases form homodimers, but in eukaryotes can also form heterodimers of the A1 and A2 or YC1 and YC2 subunits. The homodimeric enzymes display a conserved structural fold, with each monomer composed of two distinct domains. The N-terminal domain forms a thioredoxin-like fold that binds the glutathione moiety, while the C-terminal domain contains several hydrophobic alpha-helices that specifically bind hydrophobic substrates.
This entry represents the N-terminal domain of GST.
Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) plays an important role in glycolysis and gluconeogenesis by reversibly catalysing the oxidation and phosphorylation of D-glyceraldehyde-3-phosphate to 1,3-diphospho-glycerate. The enzyme exists as a tetramer of identical subunits, each containing 2 conserved functional domains: an NAD-binding domain, and a highly conserved catalytic domain. The enzyme has been found to bind to actin and tropomyosin, and may thus have a role in cytoskeleton assembly. Alternatively, the cytoskeleton may provide a framework for precise positioning of the glycolytic enzymes, thus permitting efficient passage of metabolites from enzyme to enzyme.
GAPDH displays diverse non-glycolytic functions as well, its role depending upon its subcellular location. For instance, the translocation of GAPDH to the nucleus acts as a signalling mechanism for programmed cell death, or apoptosis. The accumulation of GAPDH within the nucleus is involved in the induction of apoptosis, where GAPDH functions in the activation of transcription. The presence of GAPDH is associated with the synthesis of pro-apoptotic proteins like BAX, c-JUN and GAPDH itself.
GAPDH has been implicated in certain neurological diseases: GAPDH is able to bind to the gene products from neurodegenerative disorders such as Huntington's disease, Alzheimer's disease, Parkinson's disease and Machado-Joseph disease through stretches encoded by their CAG repeats. Abnormal neuronal apoptosis is associated with these diseases. Propargylamines such as deprenyl increase neuronal survival by interfering with apoptosis signalling pathways via their binding to GAPDH, which decreases the synthesis of pro-apoptotic proteins.
Beta-ketoacyl-ACP synthase (KAS) is the enzyme that catalyzes the condensation of malonyl-ACP with the growing fatty acid chain. It is found as a component of a number of enzymatic systems, including fatty acid synthetase (FAS), which catalyzes the formation of long-chain fatty acids from acetyl-CoA, malonyl-CoA and NADPH; the multi-functional 6-methylsalicylic acid synthase (MSAS) from Penicillium patulum, which is involved in the biosynthesis of a polyketide antibiotic; polyketide antibiotic synthase enzyme systems; Emericella nidulans multifunctional protein Wa, which is involved in the biosynthesis of conidial green pigment; Rhizobium nodulation protein nodE, which probably acts as a beta-ketoacyl synthase in the synthesis of the nodulation Nod factor fatty acyl chain; and yeast mitochondrial protein CEM1. The condensation reaction is a two-step process: first the acyl component of an activated acyl primer is transferred to a cysteine residue of the enzyme, and it is then condensed with an activated malonyl donor with the concomitant release of carbon dioxide.
This entry represents the C-terminal domain of beta-ketoacyl-ACP synthases. The active site is contained in a cleft between the N- and C-terminal domains, with residues from both domains contributing to substrate binding and catalysis.
Two different types of thiolase are found both in eukaryotes and in prokaryotes: acetoacetyl-CoA thiolase and 3-ketoacyl-CoA thiolase. 3-ketoacyl-CoA thiolase (also called thiolase I) has a broad chain-length specificity for its substrates and is involved in degradative pathways such as fatty acid beta-oxidation. Acetoacetyl-CoA thiolase (also called thiolase II) is specific for the thiolysis of acetoacetyl-CoA and involved in biosynthetic pathways such as poly beta-hydroxybutyrate synthesis or steroid biogenesis.
In eukaryotes, there are two forms of 3-ketoacyl-CoA thiolase: one located in the mitochondrion and the other in peroxisomes.
There are two conserved cysteine residues important for thiolase activity. The first, located in the N-terminal section of the enzymes, is involved in the formation of an acyl-enzyme intermediate; the second, located at the C-terminal extremity, is the active site base involved in deprotonation in the condensation reaction.
Mammalian nonspecific lipid-transfer protein (nsL-TP) (also known as sterol carrier protein 2) is a protein which seems to exist in two different forms: a 14 kDa protein (SCP-2) and a larger 58 kDa protein (SCP-x). The former is found in the cytoplasm or the mitochondria and is involved in lipid transport; the latter is found in peroxisomes. The C-terminal part of SCP-x is identical to SCP-2, while the N-terminal portion is evolutionarily related to thiolases.
The Ubiquitin Interacting Motif (UIM), or 'LALAL-motif', is a stretch of about 20 amino acid residues, which was first described in the 26S proteasome subunit PSD4/RPN-10 that is known to recognise ubiquitin. In addition, the UIM is found, often in tandem or triplet arrays, in a variety of proteins either involved in ubiquitination and ubiquitin metabolism, or known to interact with ubiquitin-like modifiers. Among the UIM proteins are two different subgroups of the UBP (ubiquitin carboxy-terminal hydrolase) family of deubiquitinating enzymes, one F-box protein, one family of HECT-containing ubiquitin-ligases (E3s) from plants, and several proteins containing ubiquitin-associated UBA and/or UBX domains. In most of these proteins, the UIM occurs in multiple copies and in association with other domains such as UBA, UBX, ENTH, EH, VHS, SH3, HECT, VWFA, EF-hand calcium-binding, WD-40, F-box, LIM, protein kinase, ankyrin, PX, phosphatidylinositol 3- and 4-kinase, C2, OTU, dnaJ, RING-finger or FYVE-finger. UIMs have been shown to bind ubiquitin and to serve as a specific targeting signal important for monoubiquitination. Thus, UIMs may have several functions in ubiquitin metabolism each of which may require different numbers of UIMs.
The UIM is unlikely to form an independent folding domain. Instead, based on the spacing of the conserved residues, the motif probably forms a short alpha-helix that can be embedded into different protein folds. Some proteins known to contain a UIM are listed below:
Glutamate, leucine, phenylalanine and valine dehydrogenases are structurally and functionally related. They contain a Gly-rich region containing a conserved Lys residue, which has been implicated in the catalytic activity, in each case a reversible oxidative deamination reaction.
Glutamate dehydrogenases (GluDH) are enzymes that catalyse the NAD- and/or NADP-dependent reversible deamination of L-glutamate into alpha-ketoglutarate. GluDH isozymes are generally involved with either ammonia assimilation or glutamate catabolism. Two separate enzymes are present in yeasts: the NADP-dependent enzyme, which catalyses the amination of alpha-ketoglutarate to L-glutamate; and the NAD-dependent enzyme, which catalyses the reverse reaction. The latter form links the L-amino acids with the Krebs cycle, which provides a major pathway for metabolic interconversion of alpha-amino acids and alpha-keto acids.
Leucine dehydrogenase (LeuDH) is an NAD-dependent enzyme that catalyses the reversible deamination of leucine and several other aliphatic amino acids to their keto analogues. Each subunit of this octameric enzyme from Bacillus sphaericus contains 364 amino acids and folds into two domains, separated by a deep cleft. The nicotinamide ring of the NAD+ cofactor binds deep in this cleft, which is thought to close during the hydride transfer step of the catalytic cycle.
Phenylalanine dehydrogenase (PheDH) is an NAD-dependent enzyme that catalyses the reversible deamination of L-phenylalanine into phenyl-pyruvate.
Valine dehydrogenase (ValDH) is an NADP-dependent enzyme that catalyses the reversible deamination of L-valine into 3-methyl-2-oxobutanoate.
This entry represents the dimerisation region of these enzymes.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.
This family represents subunits called delta (in mitochondrial ATPase) or epsilon (in bacteria or chloroplast ATPase). The interaction site of subunit C of the F0 complex with the delta or epsilon subunit of the F1 complex may be important for connecting the rotor of F1 (gamma subunit) to the rotor of F0 (C subunit). In bacterial species, the delta subunit is the equivalent of the Oligomycin sensitive subunit (OSCP) in metazoans. The C-terminal domain of the epsilon subunit appears to act as an inhibitor of ATPase activity.
More information about this protein can be found at Protein of the Month: ATP Synthases.
The TGS domain is present in a number of enzymes, for example, in threonyl-tRNA synthetase (ThrRS), GTPase, and guanosine-3',5'-bis(diphosphate) 3'-pyrophosphohydrolase (SpoT). The TGS domain is also present at the amino terminus of the uridine kinase from the spirochaete Treponema pallidum (but not any other organism, including the related spirochaete Borrelia burgdorferi).
TGS is a small domain that consists of ~50 amino acid residues and is predicted to possess a predominantly beta-sheet structure. There is no direct information on the functions of the TGS domain, but its presence in two types of regulatory proteins (the GTPases and guanosine polyphosphate phosphohydrolases/synthetases) suggests a ligand (most likely nucleotide)-binding, regulatory role.
The carbohydrate-binding domain (CBD) is a short domain found in many different glycosyl hydrolase enzymes, such as the C-terminal cellulose-binding domain of endoglucanase Z. The domain has a core structure consisting of a 3-stranded meander beta-sheet, which contains six aromatic groups that may be important for binding.
The overall topology of the CBD is structurally similar to the C-terminal chitin-binding domains (ChBD) of chitinase A1 and chitinase B; however, the binding mechanism for the ChBD may be different from that of the CBD.
This entry represents the MI domain (named after MA-3 and eIF4G), a protein-protein interaction module of ~130 amino acids. It appears in several translation factors and is found in:
The MI domain consists of seven alpha-helices, which pack into a globular form. The packing arrangement consists of repeating pairs of antiparallel helices packed one upon the other such that a superhelical axis is generated perpendicular to the alpha-helical axes.
The MI domain has also been named the MA3 domain.
This entry represents a dimerisation domain that is usually found at the C terminus of both class I and class II oxidoreductases, as well as in NADH oxidases and peroxidases.
This entry represents an MIF4G-like domain. MIF4G domains share a common structure but can differ in sequence. This entry is designated "type 3", and is found in nuclear cap-binding proteins, eIF4G, and UPF2.
The MIF4G domain is a structural motif with an ARM (Armadillo) repeat-type fold, consisting of a 2-layer alpha/alpha right-handed superhelix. Proteins usually contain two or more structurally similar MIF4G domains connected by unstructured linkers. MIF4G domains are found in several proteins involved in RNA metabolism, including eIF4G (eukaryotic initiation factor 4-gamma), eIF-2b (translation initiation factor), UPF2 (regulator of nonsense transcripts 2), and nuclear cap-binding proteins (CBP80, CBC1, NCBP1), although the sequence identity between them may be low.
The nuclear cap-binding complex (CBC) is a heterodimer. Human CBC consists of a large CBP80 subunit and a small CBP20 subunit, the latter being critical for cap binding. CBP80 contains three MIF4G domains connected by long linkers, while CBP20 has an RNP (ribonucleoprotein)-type domain that associates with domains 2 and 3 of CBP80. The complex binds to the 5'-cap of eukaryotic RNA polymerase II transcripts, such as mRNA and U snRNA. The binding is important for several mRNA nuclear maturation steps and for nonsense-mediated decay. It is also essential for nuclear export of U snRNAs in metazoans.
Eukaryotic translation initiation factor 4 gamma (eIF4G) plays a critical role in protein expression, and is at the centre of a complex regulatory network. Together with the cap-binding protein eIF4E, it recruits the small ribosomal subunit to the 5'-end of mRNA and promotes the assembly of a functional translation initiation complex, which scans along the mRNA to the translation start codon. The activity of eIF4G in translation initiation could be regulated through intra- and inter-protein interactions involving the ARM repeats. In eIF4G, the MIF4G domain binds eIF4A, eIF3, RNA and DNA.
Nonsense-mediated mRNA decay (NMD) in eukaryotes involves UPF1, UPF2 and UPF3 to accelerate the decay rate of two unique classes of transcripts: (1) nonsense mRNAs that arise through errors in gene expression, and (2) naturally occurring transcripts that lack coding errors but have built-in features that target them for accelerated decay (error-free mRNAs). NMD can trigger decay during any round of translation and can target CBC-bound or eIF-4E-bound transcripts. UPF2 contains MIF4G domains, while UPF3 contains an RNP domain.
L-lactate dehydrogenases are metabolic enzymes which catalyse the conversion of L-lactate to pyruvate, the last step in anaerobic glycolysis. L-lactate dehydrogenase is also found as a lens crystallin in bird and crocodile eyes. L-2-hydroxyisocaproate dehydrogenases are also members of the family. Malate dehydrogenases catalyse the interconversion of malate and oxaloacetate, and participate in the citric acid cycle.
Ribonucleotide reductase (RNR) catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs divide into three classes on the basis of their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, bacteriophage and viruses, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in anaerobic bacteria and bacteriophage, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes.
Ribonucleotide reductase is an oligomeric enzyme composed of a large subunit (700 to 1000 residues) and a small subunit (300 to 400 residues) - class II RNRs are less complex, using the small molecule B12 in place of the small chain.
The reduction of ribonucleotides to deoxyribonucleotides involves the transfer of free radicals; the function of each metallocofactor is to generate an active site thiyl radical. This thiyl radical then initiates the nucleotide reduction process by hydrogen atom abstraction from the ribonucleotide. The radical-based reaction involves five cysteines: two of these are located at adjacent anti-parallel strands in a new type of ten-stranded alpha/beta-barrel; two others reside at the carboxyl end in a flexible arm; and the fifth, in a loop in the centre of the barrel, is positioned to initiate the radical reaction. There are several regions of similarity in the sequence of the large chain of prokaryotes, eukaryotes and viruses spread across 3 domains: an N-terminal domain common to the mammalian and bacterial enzymes; a C-terminal domain common to the mammalian and viral ribonucleotide reductases; and a central domain common to all three.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
This entry represents the alpha and beta subunits found in the F1, V1, and A1 complexes of F-, V- and A-ATPases, respectively (sometimes called the A and B subunits in V- and A-ATPases). The F-ATPases (or F1F0-ATPases), V-ATPases (or V1V0-ATPases) and A-ATPases (or A1A0-ATPases) are composed of two linked complexes: the F1, V1 or A1 complex contains the catalytic core that synthesizes/hydrolyses ATP, and the F0, V0 or A0 complex forms the membrane-spanning pore. The F-, V- and A-ATPases all contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis.
In F-ATPases, there are three copies each of the alpha and beta subunits that form the catalytic core of the F1 complex, while the remaining F1 subunits (gamma, delta, epsilon) form part of the stalks. There is a substrate-binding site on each of the alpha and beta subunits, those on the beta subunits being catalytic, while those on the alpha subunits are regulatory. The alpha and beta subunits form a cylinder that is attached to the central stalk. The alpha/beta subunits undergo a sequence of conformational changes leading to the formation of ATP from ADP, which are induced by the rotation of the gamma subunit, which is itself driven by the movement of protons through the F0 complex C subunit.
In V- and A-ATPases, the alpha/A and beta/B subunits of the V1 or A1 complex are homologous to the alpha and beta subunits in the F1 complex of F-ATPases, except that the alpha subunit is catalytic and the beta subunit is regulatory.
The alpha/A and beta/B subunits can each be divided into three regions, or domains, centred around the ATP-binding pocket, and based on structure and function, where the central region is the nucleotide-binding domain. This entry represents the N-terminal domain of the alpha/A/beta/B subunits, which forms a closed beta-barrel with Greek-key topology.
More information about this protein can be found at Protein of the Month: ATP Synthases.
The alpha-D-phosphohexomutase superfamily is composed of four related enzymes, each of which catalyses a phosphoryl transfer on its sugar substrate: phosphoglucomutase (PGM), phosphoglucomutase/phosphomannomutase (PGM/PMM), phosphoglucosamine mutase (PNGM), and phosphoacetylglucosamine mutase (PAGM). PGM converts D-glucose 1-phosphate into D-glucose 6-phosphate, and participates in both the breakdown and synthesis of glucose. PGM/PMM are primarily bacterial enzymes that use either glucose or mannose as substrate, participating in the biosynthesis of a variety of carbohydrates such as lipopolysaccharides and alginate. Both PNGM and PAGM are involved in the biosynthesis of UDP-N-acetylglucosamine.
Despite differences in substrate specificity, these enzymes share a similar catalytic mechanism, converting 1-phospho-sugars to 6-phospho-sugars via a bisphosphorylated 1,6-phospho-sugar. The active enzyme is phosphorylated at a conserved serine residue and binds one magnesium ion; residues around the active site serine are well conserved among family members. The reaction mechanism involves phosphoryl transfer from the phosphoserine to the substrate to create a bisphosphorylated sugar, followed by a phosphoryl transfer from the substrate back to the enzyme.
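The two phosphoryl transfers described above can be written as a short scheme. This is a sketch only: E-Ser-P denotes the phosphoserine form of the enzyme and E-Ser the dephospho form, both shorthand introduced here rather than taken from the entry.

```latex
\begin{align*}
\text{E-Ser-P} + \text{sugar 1-phosphate}
  &\rightarrow \text{E-Ser} + \text{sugar 1,6-bisphosphate} \\
\text{E-Ser} + \text{sugar 1,6-bisphosphate}
  &\rightarrow \text{E-Ser-P} + \text{sugar 6-phosphate}
\end{align*}
```

The enzyme ends the cycle in the same phosphorylated state in which it started, so the net reaction is the isomerisation of the 1-phospho-sugar to the 6-phospho-sugar.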
The structures of PGM and PGM/PMM have been determined, and were found to be very similar in topology. These enzymes are both composed of four domains and a large central active site cleft, where each domain contains residues essential for catalysis and/or substrate recognition. Domain I contains the catalytic phosphoserine, domain II contains a metal-binding loop to coordinate the magnesium ion, domain III contains the sugar-binding loop that recognises the two different binding orientations of the 1- and 6-phospho-sugars, and domain IV contains a phosphate-binding site required for orienting the incoming phospho-sugar substrate.
This entry represents domain I found in alpha-D-phosphohexomutase enzymes. This domain has a 3-layer alpha/beta/alpha topology.
The alpha-D-phosphohexomutase superfamily is composed of four related enzymes, each of which catalyses a phosphoryl transfer on its sugar substrate: phosphoglucomutase (PGM), phosphoglucomutase/phosphomannomutase (PGM/PMM), phosphoglucosamine mutase (PNGM), and phosphoacetylglucosamine mutase (PAGM). PGM converts D-glucose 1-phosphate into D-glucose 6-phosphate, and participates in both the breakdown and synthesis of glucose. PGM/PMM are primarily bacterial enzymes that use either glucose or mannose as substrate, participating in the biosynthesis of a variety of carbohydrates such as lipopolysaccharides and alginate. Both PNGM and PAGM are involved in the biosynthesis of UDP-N-acetylglucosamine.
Despite differences in substrate specificity, these enzymes share a similar catalytic mechanism, converting 1-phospho-sugars to 6-phospho-sugars via a bisphosphorylated 1,6-phospho-sugar. The active enzyme is phosphorylated at a conserved serine residue and binds one magnesium ion; residues around the active site serine are well conserved among family members. The reaction mechanism involves phosphoryl transfer from the phosphoserine to the substrate to create a bisphosphorylated sugar, followed by a phosphoryl transfer from the substrate back to the enzyme.
The structures of PGM and PGM/PMM have been determined, and were found to be very similar in topology. These enzymes are both composed of four domains and a large central active site cleft, where each domain contains residues essential for catalysis and/or substrate recognition. Domain I contains the catalytic phosphoserine, domain II contains a metal-binding loop to coordinate the magnesium ion, domain III contains the sugar-binding loop that recognises the two different binding orientations of the 1- and 6-phospho-sugars, and domain IV contains a phosphate-binding site required for orienting the incoming phospho-sugar substrate.
This entry represents domain II found in alpha-D-phosphohexomutase enzymes. This domain has a 3-layer alpha/beta/alpha topology.
The alpha-D-phosphohexomutase superfamily is composed of four related enzymes, each of which catalyses a phosphoryl transfer on its sugar substrate: phosphoglucomutase (PGM), phosphoglucomutase/phosphomannomutase (PGM/PMM), phosphoglucosamine mutase (PNGM), and phosphoacetylglucosamine mutase (PAGM). PGM converts D-glucose 1-phosphate into D-glucose 6-phosphate, and participates in both the breakdown and synthesis of glucose. PGM/PMM are primarily bacterial enzymes that use either glucose or mannose as substrate, participating in the biosynthesis of a variety of carbohydrates such as lipopolysaccharides and alginate. Both PNGM and PAGM are involved in the biosynthesis of UDP-N-acetylglucosamine.
Despite differences in substrate specificity, these enzymes share a similar catalytic mechanism, converting 1-phospho-sugars to 6-phospho-sugars via a bisphosphorylated 1,6-phospho-sugar. The active enzyme is phosphorylated at a conserved serine residue and binds one magnesium ion; residues around the active site serine are well conserved among family members. The reaction mechanism involves phosphoryl transfer from the phosphoserine to the substrate to create a bisphosphorylated sugar, followed by a phosphoryl transfer from the substrate back to the enzyme.
The structures of PGM and PGM/PMM have been determined, and were found to be very similar in topology. These enzymes are both composed of four domains and a large central active site cleft, where each domain contains residues essential for catalysis and/or substrate recognition. Domain I contains the catalytic phosphoserine, domain II contains a metal-binding loop to coordinate the magnesium ion, domain III contains the sugar-binding loop that recognises the two different binding orientations of the 1- and 6-phospho-sugars, and domain IV contains a phosphate-binding site required for orienting the incoming phospho-sugar substrate.
This entry represents domain III found in alpha-D-phosphohexomutase enzymes. This domain has a 3-layer alpha/beta/alpha topology.
The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.
This entry represents the N-terminal helical bundle domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7S RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich, lower in complexity and poorly conserved between species.
These proteins include Escherichia coli and Bacillus subtilis ffh protein (P48), which seems to be the prokaryotic counterpart of SRP54; signal recognition particle receptor alpha subunit (docking protein), an integral membrane GTP-binding protein which ensures, in conjunction with SRP, the correct targeting of nascent secretory proteins to the endoplasmic reticulum membrane; bacterial FtsY protein, which is believed to play a similar role to that of the docking protein in eukaryotes; the pilA protein from Neisseria gonorrhoeae, the homolog of ftsY; and bacterial flagellar biosynthesis protein flhF.
Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.
AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.
GGAs (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) are a family of monomeric clathrin adaptor proteins that are conserved from yeasts to humans. GGAs regulate the clathrin-mediated transport of proteins (such as mannose 6-phosphate receptors) from the TGN to endosomes and lysosomes through interactions with TGN-sorting receptors, sometimes in conjunction with AP-1. GGAs bind cargo, membranes, clathrin and accessory factors. GGA1, GGA2 and GGA3 all contain a domain homologous to the ear domain of gamma-adaptin. GGAs are composed of a single polypeptide with four domains: an N-terminal VHS (Vps27p/Hrs/Stam) domain, a GAT (GGA and Tom1) domain, a hinge region, and a C-terminal GAE (gamma-adaptin ear) domain. The VHS domain is responsible for endocytosis and signal transduction, recognising transmembrane cargo through the ACLL sequence in the cytoplasmic domains of sorting receptors. The GAT domain (also found in Tom1 proteins) interacts with ARF (ADP-ribosylation factor) to regulate membrane trafficking, and with ubiquitin for receptor sorting. The hinge region contains a clathrin box for recognition and binding to clathrin, similar to that found in AP adaptins. The GAE domain is similar to the AP gamma-adaptin ear domain, and is responsible for the recruitment of accessory proteins that regulate clathrin-mediated endocytosis.
This entry represents a beta-sandwich structural motif found in the appendage (ear) domain of alpha-, beta- and gamma-adaptin from AP clathrin adaptor complexes, and the GAE (gamma-adaptin ear) domain of GGA adaptor proteins. These domains have an immunoglobulin-like beta-sandwich fold containing 7 or 8 strands in 2 beta-sheets in a Greek key topology. Although these domains share a similar fold, there is little sequence identity between the alpha/beta-adaptins and gamma-adaptin/GAE.
More information about these proteins can be found at Protein of the Month: Clathrin.
Pyruvate kinase (PK) catalyses the final step in glycolysis, the conversion of phosphoenolpyruvate to pyruvate with concomitant phosphorylation of ADP to ATP:
ADP + phosphoenolpyruvate = ATP + pyruvate
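As a quick sanity check on the reaction above, the following minimal sketch confirms that the equation is atom-balanced. It assumes neutral molecular formulas for the four species (an assumption of this example, not stated in the entry); under physiological conditions the species are ionised and a proton also takes part.

```python
from collections import Counter

# Neutral molecular formulas (assumption of this sketch; the
# physiological species are ionised).
FORMULAS = {
    "ADP": {"C": 10, "H": 15, "N": 5, "O": 10, "P": 2},
    "phosphoenolpyruvate": {"C": 3, "H": 5, "O": 6, "P": 1},
    "ATP": {"C": 10, "H": 16, "N": 5, "O": 13, "P": 3},
    "pyruvate": {"C": 3, "H": 4, "O": 3},
}


def atom_totals(species):
    """Sum the atom counts over a list of species names."""
    totals = Counter()
    for name in species:
        totals.update(FORMULAS[name])  # Counter.update adds counts
    return totals


substrates = atom_totals(["ADP", "phosphoenolpyruvate"])
products = atom_totals(["ATP", "pyruvate"])

# True when every element appears the same number of times on both sides.
is_balanced = substrates == products
```

The single phosphorus atom that leaves phosphoenolpyruvate is the one that appears as the third phosphate of ATP, which is why the phosphorus totals match.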
The enzyme, which is found in all living organisms, requires both magnesium and potassium ions for its activity. In vertebrates, there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart and brain), and M2 (early foetal tissue). In plants, PK exists as cytoplasmic and plastid isozymes, while most bacteria and lower eukaryotes have one form, although certain bacteria, such as Escherichia coli, have two isozymes. All isozymes appear to be tetramers of identical subunits of ~500 residues.
PK helps control the rate of glycolysis, along with phosphofructokinase and hexokinase. PK possesses allosteric sites for numerous effectors, yet the isozymes respond differently, in keeping with their different tissue distributions. The activity of L-type (liver) PK is increased by fructose-1,6-bisphosphate (F1,6BP) and lowered by ATP and alanine (a gluconeogenic precursor); therefore, when glucose levels are high, glycolysis is promoted, and when levels are low, gluconeogenesis is promoted. L-type PK is also hormonally regulated, being activated by insulin and inhibited by glucagon, which covalently modifies the PK enzyme. M1-type (muscle, brain) PK is inhibited by ATP, but F1,6BP and alanine have no effect, which correlates with the function of muscle and brain, as opposed to the liver.
The structures of several pyruvate kinases from various organisms have been determined. The protein comprises three or four domains: a small N-terminal helical domain (absent in bacterial PK), a beta/alpha-barrel domain, a beta-barrel domain (inserted within the beta/alpha-barrel domain), and a 3-layer alpha/beta/alpha sandwich domain.
This entry represents the 3-layer alpha/beta/alpha sandwich domain.
This domain (also known as the Brl domain) was named after the yeast Sec63 (or NPL1) protein in which it was found. This protein is required for assembly of functional endoplasmic reticulum translocons. Other yeast proteins containing this domain include pre-mRNA splicing helicase BRR2, HFM1 protein and putative helicases.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents MIZ-type zinc finger domains. Miz1 (Msx-interacting-zinc finger) is a zinc finger-containing protein with homology to the yeast protein, Nfi-1. Miz1 is a sequence-specific DNA-binding protein that can function as a positive-acting transcription factor. Miz1 binds to the homeobox protein Msx2, enhancing the specific DNA-binding ability of Msx2. Other proteins containing this domain include the human PIAS family (protein inhibitor of activated STAT protein).
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents predicted BED-type zinc finger domains. The BED finger, which was named after the Drosophila proteins BEAF and DREF, is found in one or more copies in cellular regulatory factors and transposases from plants, animals and fungi. The BED finger is a domain of about 50 to 60 amino acid residues that contains a characteristic motif with two highly conserved aromatic positions, as well as a shared pattern of cysteines and histidines that is predicted to form a zinc finger. As diverse BED fingers are able to bind DNA, it has been suggested that DNA-binding is the general function of this domain. Some proteins known to contain a BED domain include animal, plant and fungal AC1 and Hobo-like transposases; Caenorhabditis elegans Dpy-20 protein, a predicted cuticular gene transcriptional regulator; Drosophila BEAF (boundary element-associated factor), thought to be involved in chromatin insulation; Drosophila DREF, a transcriptional regulator for S-phase genes; and tobacco 3AF1 and tomato E4/E8-BP1, light- and ethylene-regulated DNA binding proteins that contain two BED fingers.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. It is normally about 70 amino acids in length. It is thought to be an intracellular protein-binding or lipid-binding signalling domain, which has an important function in membrane-associated processes. Mutations in the GRAM domain of myotubularins cause a muscle disease, which suggests that the domain is essential for the full function of the enzyme. Myotubularin-related proteins are a large subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.
This group of proteins contains cysteine peptidases belonging to MEROPS peptidase family C48 (Ulp1 endopeptidase family, clan CE). The protein fold of the peptidase domain for members of this family resembles that of adenain, the type example for clan CE. This group of sequences also contains a number of hypothetical proteins, which have not yet been characterised, and non-peptidase homologues. These are proteins that have either been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity of the peptidases in the family.
The Ulp1 endopeptidase family contains the deubiquitinating enzymes (DUB) that can de-conjugate ubiquitin or ubiquitin-like proteins from ubiquitin-conjugated proteins. They can be classified into 3 families according to sequence homology: ubiquitin carboxyl-terminal hydrolase (UCH), ubiquitin-specific processing protease (UBP), and ubiquitin-like protease (ULP) specific for de-conjugating ubiquitin-like proteins. In contrast to the UBP pathway, which is highly redundant (16 UBP enzymes in yeast), there are few ubiquitin-like proteases (only one in yeast, Ulp1).
Ulp1 catalyses two critical functions in the SUMO/Smt3 pathway via its cysteine protease activity. Ulp1 processes the Smt3 C-terminal sequence (-GGATY) to its mature form (-GG), and it de-conjugates Smt3 from the lysine epsilon-amino group of the target protein.
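The processing step described above (trimming the Smt3 C-terminal tail from -GGATY to -GG) can be illustrated with a toy string model. This is only a sketch: the function names are hypothetical, sequences are treated as plain one-letter strings, and nothing here models the enzyme's actual substrate recognition.

```python
def mature_tail(precursor_tail: str) -> str:
    """Return the tail truncated just after its last Gly-Gly motif."""
    cut = precursor_tail.rfind("GG")
    if cut == -1:
        raise ValueError("no Gly-Gly motif found in tail")
    return precursor_tail[: cut + 2]


def released_peptide(precursor_tail: str) -> str:
    """Return the residues removed by processing (after Gly-Gly)."""
    return precursor_tail[len(mature_tail(precursor_tail)):]


# The precursor tail quoted in the entry.
processed = mature_tail("GGATY")
removed = released_peptide("GGATY")
```

With the tail quoted in the entry, `processed` is the mature `GG` end and `removed` is the `ATY` tripeptide that is cleaved off.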
The crystal structure of yeast Ulp1 bound to Smt3 revealed that the catalytic and interaction interface is situated in a shallow and narrow cleft where conserved residues recognise the Gly-Gly motif at the C-terminal extremity of the Smt3 protein. Ulp1 adopts a novel architecture despite some structural similarity with other cysteine proteases. The secondary structure is composed of seven alpha helices and seven beta strands. The catalytic domain includes the central alpha helix, beta-strands 4 to 6, and the catalytic triad (Cys-His-Asp). This profile is directed against the C-terminal part of ULP proteins that displays full proteolytic activity.
In bacteria two distinct, membrane-bound, enzyme complexes are responsible for the interconversion of fumarate and succinate: fumarate reductase (Frd) is used in anaerobic growth, and succinate dehydrogenase (Sdh) is used in aerobic growth. Both complexes consist of two main components: a membrane-extrinsic component composed of a FAD-binding flavoprotein and an iron-sulphur protein; and a hydrophobic component composed of a membrane anchor protein and/or a cytochrome B.
In eukaryotes mitochondrial succinate dehydrogenase (ubiquinone) is an enzyme composed of two subunits: a FAD flavoprotein and an iron-sulphur protein.
The flavoprotein subunit is a protein of about 60 to 70 kDa in which FAD is covalently bound to a histidine residue located in the N-terminal section of the protein. The sequence around that histidine is well conserved in Frd and Sdh from various bacterial and eukaryotic species.
This family includes members that bind FAD such as the flavoprotein subunits from succinate and fumarate dehydrogenase, aspartate oxidase and the alpha subunit of adenylylsulphate reductase.
Methionyl-tRNA formyltransferase transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA. The formyl group appears to play a dual role in the initiator identity of N-formylmethionyl-tRNA by promoting its recognition by IF2 and by impairing its binding to EFTU-GTP. This family also includes formyltetrahydrofolate dehydrogenases, which produce formate from formyl-tetrahydrofolate. These enzymes contain an N-terminal domain in common with other formyl transferase enzymes. The C-terminal domain has an open beta-barrel fold.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.
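The classification in the two paragraphs above can be written down as a small lookup table. This is a sketch for illustration only: the enzyme names are plain strings taken from the text, and the function name `strands_broken` is a hypothetical helper, not part of any library.

```python
# Classes as stated in the text: type I enzymes break one DNA strand,
# type II enzymes break both strands.
TYPE_I = {"topoisomerase I", "topoisomerase III", "topoisomerase V"}
TYPE_II = {"topoisomerase II", "topoisomerase IV", "topoisomerase VI"}

# Structural subdivision of type I enzymes, as listed in the text.
TYPE_I_SUBTYPES = {
    "IA": [
        "bacterial and archaeal topoisomerase I",
        "topoisomerase III",
        "reverse gyrase",
    ],
    "IB": ["eukaryotic topoisomerase I", "topoisomerase V"],
}

# The only type I enzyme described as ATP-dependent.
ATP_DEPENDENT_TYPE_I = {"reverse gyrase"}


def strands_broken(enzyme: str) -> int:
    """Number of DNA strands transiently broken by the named enzyme."""
    if enzyme in TYPE_I:
        return 1
    if enzyme in TYPE_II:
        return 2
    raise KeyError(f"unknown topoisomerase: {enzyme}")
```

For example, `strands_broken("topoisomerase IV")` gives 2, while `strands_broken("topoisomerase III")` gives 1.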
This entry represents the N-terminal DNA-binding domain found in eukaryotic topoisomerase I, which is a type IB enzyme. To cleave the DNA backbone, these enzymes must make a transient phosphotyrosine bond. The N-terminal domain of human topoisomerase I is thought to coordinate the restriction of free strand rotation during the topoisomerisation step of catalysis. A conserved tryptophan residue may be important for the DNA-interaction ability of the N-terminal domain. Human topoisomerase I has been shown to be inhibited by camptothecin (CPT), a plant alkaloid with antitumour activity. A binding mode for the anticancer drug camptothecin has been proposed on the basis of chemical and biochemical information combined with the three-dimensional structures of topoisomerase I-DNA complexes.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
Glutamyl-tRNA(Gln) amidotransferase subunit B is a microbial enzyme that furnishes a means for formation of correctly charged Gln-tRNA(Gln) through the transamidation of misacylated Glu-tRNA(Gln) in organisms which lack glutaminyl-tRNA synthetase. The reaction takes place in the presence of glutamine and ATP through an activated gamma-phospho-Glu-tRNA(Gln). The enzyme is composed of three subunits: A (an amidase), B and C. It also exists in eukaryotes as a protein targeted to the mitochondria.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents a putative zinc binding domain with four conserved cysteine residues. This domain is found in the human disease protein Deafness Dystonia Protein 1. Members of this family such as Tim9 and Tim10 are involved in mitochondrial protein import. Members of this family seem to be localised to the mitochondrial intermembrane space.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Thioredoxins are small disulphide-containing redox proteins that have been found in all the kingdoms of living organisms. Thioredoxin serves as a general protein disulphide oxidoreductase. It interacts with a broad range of proteins by a redox mechanism based on reversible oxidation of 2 cysteine thiol groups to a disulphide, accompanied by the transfer of 2 electrons and 2 protons. The net result is the covalent interconversion of a disulphide and a dithiol.
Compared to human thioredoxin, human U5 snRNP-specific protein U5-15kD contains 37 additional residues that may cause structural changes which most likely form putative binding sites for other spliceosomal proteins or RNA. Although U5-15kD apparently lacks protein disulphide isomerase activity, it is strictly required for pre-mRNA splicing.
The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.
This entry represents the M domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7S RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich, of lower complexity and poorly conserved between species.
These proteins include Escherichia coli and Bacillus subtilis ffh protein (P48), which seems to be the prokaryotic counterpart of SRP54; signal recognition particle receptor alpha subunit (docking protein), an integral membrane GTP-binding protein which ensures, in conjunction with SRP, the correct targeting of nascent secretory proteins to the endoplasmic reticulum membrane; bacterial FtsY protein, which is believed to play a similar role to that of the docking protein in eukaryotes; the pilA protein from Neisseria gonorrhoeae, the homolog of ftsY; and bacterial flagellar biosynthesis protein flhF.
The HEAT repeat is a tandemly repeated, 37-47 amino acid long module occurring in a number of cytoplasmic proteins, including the four name-giving proteins huntingtin, elongation factor 3 (EF3), the 65 kDa alpha regulatory subunit of protein phosphatase 2A (PP2A) and the yeast PI3-kinase TOR1. Arrays of HEAT repeats consist of 3 to 36 units forming a rod-like helical structure and appear to function as protein-protein interaction surfaces. It has been noted that many HEAT repeat-containing proteins are involved in intracellular transport processes.
In the crystal structure of PP2A PR65/A, the HEAT repeats consist of pairs of antiparallel alpha helices, as had been predicted.
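The figures quoted for HEAT repeats (37-47 residues per repeat, 3 to 36 repeats per array) imply a wide range of array sizes. The short sketch below works out the bounds, under the simplifying assumption that repeats are strictly contiguous with no linker residues between them; the function name is hypothetical.

```python
REPEAT_LENGTH = (37, 47)     # residues per HEAT repeat (min, max), from the text
REPEATS_PER_ARRAY = (3, 36)  # repeats per array (min, max), from the text


def array_length_bounds(repeat_length=REPEAT_LENGTH, repeats=REPEATS_PER_ARRAY):
    """Return (shortest, longest) array length in residues.

    Assumes contiguous repeats with no linkers, which is a
    simplification made for this example.
    """
    shortest = repeat_length[0] * repeats[0]
    longest = repeat_length[1] * repeats[1]
    return shortest, longest


shortest, longest = array_length_bounds()
```

Under these assumptions an array spans between 111 residues (3 x 37) and 1692 residues (36 x 47).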
Many human L1 elements are capable of retrotransposition. Some of these have been shown to exhibit reverse transcriptase (RT) activity, although the function of many is, as yet, unknown.
More information about these proteins can be found at Protein of the Month: Transposase.
Prefoldin (PFD) is a chaperone that interacts exclusively with type II chaperonins, hetero-oligomers lacking an obligate co-chaperonin that are found only in eukaryotes (chaperonin-containing T-complex polypeptide-1 (CCT)) and archaea. Eukaryotic PFD is a multi-subunit complex containing six polypeptides in the molecular mass range of 14-23 kDa. In archaea, on the other hand, PFD is composed of two types of subunits, two alpha and four beta. The six subunits associate to form two back-to-back up-and-down eight-stranded barrels, from which hang six coiled coils. Each subunit contributes one (beta subunits) or two (alpha subunits) beta hairpin turns to the barrels. The coiled coils are formed by the N and C termini of an individual subunit. Overall, this unique arrangement resembles a jellyfish. The eukaryotic PFD hexamer is composed of six different subunits; however, these can be grouped into two alpha-like (PFD3 and -5) and four beta-like (PFD1, -2, -4, and -6) subunits based on amino acid sequence similarity with their archaeal counterparts. Eukaryotic PFD has a six-legged structure similar to that seen in the archaeal homologue. This family contains the archaeal alpha subunit, eukaryotic prefoldin subunits 3 and 5 and the UXT (ubiquitously expressed transcript) family.
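The subunit arithmetic in the paragraph above can be checked directly: two alpha subunits with two hairpins each plus four beta subunits with one hairpin each should account for the two eight-stranded barrels. The sketch below does that bookkeeping; the only assumption added here is that each beta hairpin supplies two strands, and the variable names are illustrative.

```python
# Archaeal prefoldin composition and hairpin contributions, from the text.
SUBUNIT_COUNTS = {"alpha": 2, "beta": 4}
HAIRPINS_PER_SUBUNIT = {"alpha": 2, "beta": 1}

# Assumption of this sketch: a beta hairpin consists of two strands.
STRANDS_PER_HAIRPIN = 2
BARRELS = 2

total_subunits = sum(SUBUNIT_COUNTS.values())
total_hairpins = sum(
    SUBUNIT_COUNTS[kind] * HAIRPINS_PER_SUBUNIT[kind] for kind in SUBUNIT_COUNTS
)
total_strands = total_hairpins * STRANDS_PER_HAIRPIN
strands_per_barrel = total_strands // BARRELS

# One coiled coil per subunit, formed by its own N and C termini.
coiled_coils = total_subunits
```

The totals come out as six subunits, eight hairpins and sixteen strands, that is eight strands per barrel and six coiled coils, matching the description of the hexamer.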
Eukaryotic PFD has been shown to bind both actin and tubulin co-translationally. The chaperone then delivers the target protein to CCT, interacting with the chaperonin through the tips of the coiled coils. No authentic target proteins of any archaeal PFD have been identified to date.
Dynein is a multisubunit microtubule-dependent motor enzyme that acts as the force generating protein of eukaryotic cilia and flagella. The cytoplasmic isoform of dynein acts as a motor for the intracellular retrograde motility of vesicles and organelles along microtubules.
Dynein is composed of a number of ATP-binding large subunits, intermediate size subunits and small subunits. This family represents the C-terminal region of dynein heavy chain. The dynein heavy chain also exhibits ATPase activity and microtubule binding ability and acts as a motor for the movement of organelles and vesicles along microtubules.
Two types of proteins that hydrolyse inorganic pyrophosphate (PPi), very different in both amino acid sequence and structure, have been characterised to date: soluble and membrane-bound proton-pumping pyrophosphatases (sPPases and H(+)-PPases, respectively). sPPases are ubiquitous proteins that hydrolyse PPi to release heat, whereas H+-PPases, so far unidentified in animal and fungal cells, couple the energy of PPi hydrolysis to proton movement across biological membranes. The latter type is represented by this group of proteins. H+-PPases are also called vacuolar-type inorganic pyrophosphatases (V-PPase) or pyrophosphate-energised vacuolar membrane proton pumps. In plants, vacuoles contain two enzymes for acidifying the interior of the vacuole, the V-ATPase and the V-PPase (V is for vacuolar).
Two distinct biochemical subclasses of H+-PPases have been characterised to date: K+-stimulated and K+-insensitive.
DNA is the biological information that instructs cells how to exist in an ordered fashion: accurate replication is thus one of the most important events in the life cycle of a cell. This function is performed by DNA-directed DNA polymerases, which add deoxynucleoside triphosphate (dNTP) residues to the 3'-end of the growing chain of DNA, using a complementary DNA chain as a template. Small RNA molecules are generally used as primers for chain elongation, although terminal proteins may also be used for the de novo synthesis of a DNA chain. Even though there are 2 different methods of priming, these are mediated by 2 very similar polymerase classes, A and B, with similar methods of chain elongation. A number of DNA polymerases have been grouped under the designation of DNA polymerase family B. Six regions of similarity (numbered from I to VI) are found in all or a subset of the B family polymerases. The most conserved region (I) includes a conserved tetrapeptide with two aspartate residues. Its function is not yet known. However, it has been suggested that it may be involved in binding a magnesium ion. All sequences in the B family contain a characteristic DTDS motif, and possess many functional domains, including a 5'-3' elongation domain, a 3'-5' exonuclease domain, a DNA binding domain, and binding domains for both dNTPs and pyrophosphate.
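Locating the characteristic DTDS motif mentioned above is a simple exact-match scan over a protein sequence. The sketch below shows one way to do it; the function name and the toy sequence are hypothetical and do not correspond to any real polymerase.

```python
def find_motif(sequence: str, motif: str = "DTDS") -> list[int]:
    """Return 1-based start positions of every occurrence of motif."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based
        start = sequence.find(motif, start + 1)
    return positions


# Made-up sequence for illustration only.
toy_sequence = "MKLVYGDTDSIFVAAG"
hits = find_motif(toy_sequence)
```

In the toy sequence the motif starts at residue 7. A real analysis would normally use a profile or pattern that also covers the flanking conserved region rather than a bare four-residue match.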
This domain has 3' to 5' exonuclease activity and adopts a ribonuclease H type fold.
The SPX domain is named after SYG1/Pho81/XPR1 proteins. This 180-residue domain is found at the amino terminus of a variety of proteins. In the yeast protein SYG1, the N terminus directly binds to the G-protein beta subunit and inhibits transduction of the mating pheromone signal, suggesting that all the members of this family are involved in G-protein associated signal transduction. The C termini of these proteins often have an EXS domain.
The N-termini of several proteins involved in the regulation of phosphate transport, including the putative phosphate level sensors PHO81 from Saccharomyces cerevisiae and NUC-2 from Neurospora crassa, are also members of this family. NUC-2 contains several ankyrin repeats.
Several members of this family are the XPR1 proteins: the xenotropic and polytropic retrovirus receptor confers susceptibility to infection with Murine leukemia virus (MLV). The similarity between SYG1, phosphate regulators and XPR1 sequences has been previously noted, as has the additional similarity to several predicted proteins, of unknown function, from Drosophila melanogaster, Arabidopsis thaliana, Caenorhabditis elegans, Schizosaccharomyces pombe, and Saccharomyces cerevisiae. In addition, given the similarities between XPR1 and SYG1 and phosphate regulatory proteins, it has been proposed that XPR1 might be involved in G-protein associated signal transduction and may itself function as a phosphate sensor.
This entry includes ABC1 from yeast and AarF from Escherichia coli. These proteins have a nuclear or mitochondrial subcellular location in eukaryotes. The exact molecular functions of these proteins are not clear; however, yeast ABC1 suppresses a cytochrome b mRNA translation defect and is essential for electron transfer in the bc1 complex, and E. coli AarF is required for ubiquinone production. It has been suggested that members of the ABC1 family are novel chaperonins. These proteins are unrelated to the ABC transporter proteins.
These proteins contain a short bi-helical repeat that is related to HEAT. Cyanobacteria and red algae harvest light energy using macromolecular complexes known as phycobilisomes (PBS), peripherally attached to the photosynthetic membrane. The major components of PBS are the phycobiliproteins. These heterodimeric proteins are covalently attached to phycobilins: open-chain tetrapyrrole chromophores, which function as the photosynthetic light-harvesting pigments. Phycobiliproteins differ in sequence and in the nature and number of attached phycobilins to each of their subunits. These proteins include the lyase enzymes that specifically attach particular phycobilins to apophycobiliprotein subunits. The most comprehensively studied of these is the CpcE/F lyase, which attaches phycocyanobilin (PCB) to the alpha subunit of apophycocyanin. Similarly, MpeU/V attaches phycoerythrobilin to phycoerythrin II, while CpeY/Z is thought to be involved in phycoerythrobilin (PEB) attachment to phycoerythrin (PE) I (PEs I and II differ in sequence and in the number of attached molecules of PEB: PE I has five, PE II has six).
All the reactions of the above lyases involve an apoprotein cysteine SH addition to a terminal delta 3,3'-double bond. Such a reaction is not possible in the case of phycoviolobilin (PVB), the phycobilin of alpha-phycoerythrocyanin (alpha-PEC). It is thought that in this case, PCB, not PVB, is first added to apo-alpha-PEC, and is then isomerized to PVB. The addition reaction has been shown to occur in the presence of either of the components of alpha-PEC-PVB lyase PecE or PecF (or both). The isomerisation reaction occurs only when both PecE and PecF components are present, i.e. the PecE/F phycobiliprotein lyase is also a phycobilin isomerase. Another member of this family is the NblB protein, whose similarity to the phycobiliprotein lyases was previously noted. This constitutively expressed protein is not known to have any lyase activity. It is thought to be involved in the coordination of PBS degradation with environmental nutrient limitation. It has been suggested that the similarity of NblB to the phycobiliprotein lyases is due to the ability to bind tetrapyrrole phycobilins via the common repeated motif.
Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). Tubulin-tyrosine ligase (TTL) catalyses the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. The true physiological function of TTL has so far not been established. In normally cycling cells, the tyrosinated form of tubulin predominates. However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumour aggressiveness.
3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis.
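The modification cycle described above can be sketched as a small state model. This is only an illustration: the tail sequence, the function names and the use of "nY" as a symbol for 3-nitrotyrosine are assumptions made for the example, not taken from the source.

```python
# Toy model of the alpha-tubulin C-terminal tyrosination cycle.
# Residues are kept in a list so a non-standard residue can be represented.
# "nY" is a hypothetical symbol for 3-nitrotyrosine; the tail is illustrative.

def detyrosinate(tail):
    """Carboxypeptidase step: remove a C-terminal tyrosine if present."""
    if tail and tail[-1] == "Y":
        return tail[:-1]
    # 3-nitrotyrosine ("nY") is not removed, so the tail is returned unchanged.
    return tail

def ligate(tail, residue="Y"):
    """TTL step: add tyrosine (or 3-nitrotyrosine) to a detyrosinated tail."""
    if tail and tail[-1] in ("Y", "nY"):
        return tail  # already carries a C-terminal (nitro)tyrosine
    return tail + [residue]

tyrosinated = list("GEEY")
detyrosinated = detyrosinate(tyrosinated)   # reversible half of the cycle
retyrosinated = ligate(detyrosinated)       # TTL restores the tyrosine
nitrated = ligate(detyrosinated, "nY")      # TTL incorporates 3-nitrotyrosine
locked = detyrosinate(nitrated)             # carboxypeptidase cannot reverse it
```

In this sketch the ordinary tyrosine cycles on and off, whereas the nitrated tail stays unchanged after the carboxypeptidase step, mirroring the irreversibility noted in the text.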
Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.
EF1A (also known as EF-1alpha or EF-Tu) is a G-protein. It forms a ternary complex of EF1A-GTP-aminoacyl-tRNA. The binding of aminoacyl-tRNA stimulates GTP hydrolysis by EF1A, causing a conformational change in EF1A that causes EF1A-GDP to detach from the ribosome, leaving the aminoacyl-tRNA attached at the A-site. Only the cognate aminoacyl-tRNA can induce the required conformational change in EF1A through its tight anticodon-codon binding. EF1A-GDP is returned to its active state, EF1A-GTP, through the action of another elongation factor, EF1B (also known as EF-Ts or EF-1beta/gamma/delta).
EF1A consists of three structural domains. This entry represents the C-terminal domain, which adopts a beta-barrel structure, and is involved in binding to both charged tRNA and to EF1B (or EF-Ts).
More information about these proteins can be found at Protein of the Month: Elongation Factors.
Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.
EF1A (also known as EF-1alpha or EF-Tu) is a G-protein. It forms a ternary complex of EF1A-GTP-aminoacyl-tRNA. The binding of aminoacyl-tRNA stimulates GTP hydrolysis by EF1A, causing a conformational change in EF1A that causes EF1A-GDP to detach from the ribosome, leaving the aminoacyl-tRNA attached at the A-site. Only the cognate aminoacyl-tRNA can induce the required conformational change in EF1A through its tight anticodon-codon binding. EF1A-GDP is returned to its active state, EF1A-GTP, through the action of another elongation factor, EF1B (also known as EF-Ts or EF-1beta/gamma/delta).
EF1A consists of three structural domains. This entry represents domain 2 of EF2, which adopts a beta-barrel structure, and is involved in binding to charged tRNA. This domain is structurally related to the C-terminal domain of EF2, to which it displays weak sequence matches. This domain is also found in other proteins such as translation initiation factor IF-2 and tetracycline-resistance proteins.
More information about these proteins can be found at Protein of the Month: Elongation Factors.
This is the anticodon binding domain found in some phenylalanyl tRNA synthetases. The domain has a ferredoxin fold, consisting of an alpha+beta sandwich with anti-parallel beta-sheets (beta-alpha-beta x2).
Members of this family have been called SAND proteins although these proteins do not contain a SAND domain. In Saccharomyces cerevisiae a protein complex of Mon1 and Ccz1 functions with the small GTPase Ypt7 to mediate vesicle trafficking to the vacuole. The Mon1/Ccz1 complex is conserved in eukaryotic evolution and members of this family (previously known as DUF254) are distant homologues to domains of known structure that assemble into cargo vesicle adapter (AP) complexes.
This entry represents various uracil-DNA glycosylases and related DNA glycosylases, such as uracil-DNA glycosylase, thermophilic uracil-DNA glycosylase, G:T/U mismatch-specific DNA glycosylase (Mug), and single-strand selective monofunctional uracil-DNA glycosylase (SMUG1). These proteins have a 3-layer alpha/beta/alpha structure. Uracil-DNA glycosylases are DNA repair enzymes that excise uracil residues from DNA by cleaving the N-glycosylic bond, initiating the base excision repair pathway. Uracil in DNA can arise either through the deamination of cytosine to form mutagenic U:G mispairs, or through the incorporation of dUMP by DNA polymerase to form U:A pairs. These aberrant uracil residues are genotoxic. The sequence of uracil-DNA glycosylase is extremely well conserved in bacteria and eukaryotes as well as in herpes viruses. More distantly related uracil-DNA glycosylases are also found in poxviruses. In eukaryotic cells, UNG activity is found in both the nucleus and the mitochondria. Human UNG1 protein is transported to both the mitochondria and the nucleus. The N-terminal 77 amino acids of UNG1 seem to be required for mitochondrial localization, but the presence of a mitochondrial transit peptide has not been directly demonstrated. The most N-terminal conserved region contains an aspartic acid residue which has been proposed, based on X-ray structures to act as a general base in the catalytic mechanism.
This family consists of several LUC7 protein homologues that are restricted to eukaryotes. LUC7 has been shown to be a U1 snRNA associated protein with a role in splice site recognition. The entry contains human and mouse LUC7 like (LUC7L) proteins and human cisplatin resistance-associated overexpressed protein (CROP).
This entry represents the substrate-binding domain of glutathione synthetase (GSS), a homodimeric enzyme that catalyses the conversion of gamma-L-glutamyl-L-cysteine and glycine to phosphate and glutathione in the presence of ATP. This is the second step in glutathione biosynthesis, the first step being catalysed by gamma-glutamylcysteine synthetase. In humans, defects in GSS are inherited in an autosomal recessive way and are the cause of severe metabolic acidosis, 5-oxoprolinuria, and increased rate of haemolysis and defective function of the central nervous system. The substrate-binding domain has a 3-layer alpha/beta/alpha structure.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins.
This entry represents the C subunit that is part of the V1 complex, and is localised to the interface between the V1 and V0 complexes. This subunit does not show any homology with F-ATPase subunits. The C subunit plays an essential role in controlling the assembly of V-ATPase, acting as a flexible stator that holds together the catalytic (V1) and membrane (V0) sectors of the enzyme. The release of subunit C from the ATPase complex results in the dissociation of the V1 and V0 subcomplexes, which is an important mechanism in controlling V-ATPase activity in cells.
More information about this protein can be found at Protein of the Month: ATP Synthases.
ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.
V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins.
This entry represents subunit H (also known as Vma13p) found in the V1 complex of V-ATPases. This subunit has a regulatory function, being responsible for activating ATPase activity and coupling ATPase activity to proton flow. The yeast enzyme contains five motifs similar to the HEAT or Armadillo repeats seen in the importins, and can be divided into two distinct domains: a large N-terminal domain consisting of stacked alpha helices, and a smaller C-terminal alpha-helical domain with a similar superhelical topology to an armadillo repeat.
More information about this protein can be found at Protein of the Month: ATP Synthases.
This entry represents the Yippee-like (YPEL) family of putative zinc-binding proteins, which is highly conserved among eukaryotes. The first protein in this family to be characterised, the Yippee protein from Drosophila, was identified by a yeast interaction trap screen as a protein that physically interacts with moth hemolin. It was subsequently found to be a member of a highly conserved family of proteins found in diverse eukaryotes including plants, animals and fungi. Mammals contain five members of this family, YPEL1 to YPEL5, while other organisms tend to contain only two or three members. The mammalian proteins all appear to localise in the nucleus. YPEL1-4 are located in an unknown structure on or close to the mitotic apparatus in the mitotic phase, whereas in interphase they are located in the nuclei and nucleoli. In contrast, YPEL5 is localised to the centrosome and nucleus during interphase and at the mitotic spindle during mitosis, suggesting a function distinct from that of YPEL1-4. The localisation of the YPEL proteins suggests a novel, though still unknown, function involved in cell division.
RER1 family proteins are involved in the retrieval of some endoplasmic reticulum membrane proteins from the early Golgi compartment. The C terminus of yeast Rer1p interacts with a coatomer complex.
The anaphase-promoting complex (APC) is a multi-subunit E3 protein ubiquitin ligase that is responsible for the metaphase to anaphase transition and the exit from mitosis. Anaphase is initiated when the APC triggers the destruction of securin, thereby allowing the protease, separase, to disrupt sister-chromatid cohesion. Securin ubiquitination by the APC is inhibited by cyclin-dependent kinase 1 (Cdk1)-dependent phosphorylation.
Forkhead Box M1 (FoxM1), which is a transcription factor that is over-expressed in many cancers, is degraded in late mitosis and early G1 phase by the APC/cyclosome (APC/C) E3 ubiquitin ligase. The APC/C targets mitotic cyclins for destruction in mitosis and G1 phase and is then inactivated at S phase. It thereby generates alternating states of high and low cyclin-Cdk activity, which is required for the alternation of mitosis and DNA replication.
APC from Schizosaccharomyces pombe and Saccharomyces cerevisiae was previously thought to have 11 subunits, but more sensitive techniques have identified 13 subunits in both yeasts.
One of the subunits of the APC that is required for ubiquitination activity is APC10, a one-domain protein homologous to a sequence element, termed the DOC domain, found in several hypothetical proteins that may also mediate ubiquitination reactions, because they contain combinations of either RING finger, cullin or HECT domains.
The DOC domain consists of a beta-sandwich, in which a five-stranded antiparallel beta-sheet is packed on top of a three stranded antiparallel beta-sheet, exhibiting a 'jellyroll' fold.
Proteins known to contain a DOC domain include:
This family includes proteins that are about 100 amino acids long and have been shown to be related. Members of this family of proteins are associated with both flagellar outer arm dynein and Drosophila and rat brain cytoplasmic dynein. It is proposed that roadblock/LC7 family members may modulate specific dynein functions. This family also includes Golgi-associated MP1 adapter protein and MglB from Myxococcus xanthus, a protein involved in gliding motility. However the family also includes members from non-motile bacteria such as Streptomyces coelicolor, suggesting that the protein may play a structural or regulatory role.
A group of microtubule-associated proteins called +TIPs (plus end tracking proteins), including EB1 (end-binding protein 1) family proteins, label growing microtubule ends specifically in diverse organisms and are implicated in spindle dynamics, chromosome segregation, and directing microtubules toward cortical sites. EB1 members have a bipartite composition: an N-terminal CH domain that mediates microtubule plus end localization and a C-terminal cargo binding domain (EB1-C) that captures cell polarity determinants. The EB1-C domain comprises a unique EB1-like sequence motif that acts as a binding site for other +TIP proteins. It interacts with the carboxy terminus of the adenomatous polyposis coli (APC) tumor suppressor, a well conserved +TIP phosphoprotein with a pivotal function in cell cycle regulation. Another binding partner of the EB1-C domain is the well conserved +TIP protein dynactin, a component of the large cytoplasmic dynein/dynactin complex.
The ~80-residue EB1-C domain starts with a long smoothly curved helix (alpha1), which is followed by a hairpin connection leading to a short second helix (alpha2) running antiparallel to alpha1. The two parallel alpha1 helices of the EB1-C domain dimer wrap around each other in a slightly left-handed supercoil. The two alpha2 helices run antiparallel to helices alpha1 and form a similar fork in the opposite orientation and rotated by 90°. As a result, two helical segments from each monomer form a four-helix bundle. The side chains forming the hydrophobic core of this bundle are highly conserved.
Some proteins known to contain an EB1-C domain are listed below:
This is a family of eukaryotic proteins which are variously described as either hypothetical protein, developmental protein or related to yeast SNF7. The family contains human CHMP1. CHMP1 (CHromatin Modifying Protein; CHarged Multivesicular body Protein), is encoded by an alternative open reading frame in the PRSM1 gene and is conserved in both complex and simple eukaryotes. CHMP1 contains a predicted bipartite nuclear localisation signal and distributes as distinct forms to the cytoplasm and the nuclear matrix in all cell lines tested.
Human CHMP1 is strongly implicated in multivesicular body formation. A multivesicular body is a vesicle-filled endosome that targets proteins to the interior of lysosomes. Immunocytochemistry and biochemical fractionation localise CHMP1 to early endosomes and CHMP1 physically interacts with SKD1/VPS4, a highly conserved protein directly linked to multivesicular body sorting in yeast. Similar to the action of a mutant SKD1 protein, over expression of a fusion derivative of human CHMP1 dilates endosomal compartments and disrupts the normal distribution of several endosomal markers. Genetic studies in Saccharomyces cerevisiae (Baker's yeast) further support a conserved role of CHMP1 in vesicle trafficking. Deletion of CHM1, the budding yeast homolog of CHMP1, results in defective sorting of carboxypeptidases S and Y and produces abnormal, multi-lamellar prevacuolar compartments. This phenotype classifies CHM1 as a member of the class E vacuolar protein sorting genes.
Named the YEATS family, after 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', this family also contains the GAS41 protein. All these proteins are thought to have a transcription stimulatory activity.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents ZPR1-type zinc finger domains. An orthologous protein found once in each of the completed archaeal genomes corresponds to a zinc finger-containing domain repeated as the N-terminal and C-terminal halves of the mouse protein ZPR1. ZPR1 is an experimentally proven zinc-binding protein that binds the tyrosine kinase domain of the epidermal growth factor receptor (EGFR); binding is inhibited by EGF stimulation and tyrosine phosphorylation, and activation by EGF is followed by some redistribution of ZPR1 to the nucleus. By analogy, other proteins with the ZPR1 zinc finger domain may be regulatory proteins that sense protein phosphorylation state and/or participate in signal transduction.
Deficiencies in ZPR1 may contribute to neurodegenerative disorders. ZPR1 appears to be down-regulated in patients with spinal muscular atrophy (SMA), a disease characterised by degeneration of the alpha-motor neurons in the spinal cord that can arise from mutations affecting the expression of Survival Motor Neurons (SMN). ZPR1 interacts with complexes formed by SMN, and may act as a modifier that affects the severity of SMA.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Members of this family are related to the pre-mRNA splicing factor PRP38 from yeast, so all members of this family could be involved in splicing. This conserved region could be involved in RNA binding. The putative domain is about 180 amino acids in length. PRP38 is a unique component of the U4/U6.U5 tri-small nuclear ribonucleoprotein (snRNP) particle and is necessary for an essential step late in spliceosome maturation.
This domain is found in a large number of proteins including magnesium dependent endonucleases and phosphatases involved in intracellular signalling. Proteins containing this domain include AP endonuclease proteins, DNase I proteins, synaptojanin (an inositol-1,4,5-trisphosphate phosphatase) and sphingomyelinase.
This large family includes diverse proteins involved in large complexes. The alignment contains one highly conserved negatively charged residue and one highly conserved positively charged residue that are probably important for the function of these proteins. The family includes the yeast nuclear export factor Sac3, and mammalian GANP/MCM3-associated proteins, which facilitate the nuclear localisation of MCM3, a protein that associates with chromatin in the G1 phase of the cell-cycle. The 26S protease (or 26S proteasome) is responsible for degrading ubiquitin conjugates. It consists of 19S regulatory complexes associated with the ends of 20S proteasomes. The 19S regulatory complex is composed of about 20 different polypeptides and confers ATP-dependence and substrate specificity to the 26S enzyme. The conserved region occurs at the C-terminal of the Nin1-like regulatory subunit. This family includes several eukaryotic translation initiation factor 3 subunit 11 (eIF-3 p25) proteins. Eukaryotic initiation factor 3 (eIF3) is a multisubunit complex that is required for binding of mRNA to 40 S ribosomal subunits, stabilisation of ternary complex binding to 40 S subunits, and dissociation of 40 and 60 S subunits.
This repeat is found in the tail fibers of phage, for example protein K, but bacterial homologues have also been identified. The repeats are about 40 residues long.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.
This is a group of cysteine peptidases which constitute MEROPS peptidase family C54 (Aut2 peptidase family, clan CA), which are a group of proteins of unknown function.
This entry represents a multi-helical domain composed of two all-alpha subdomains that is found as the C-terminal domain in cryptochrome proteins, as well as at the C-terminal of DNA photolyase where it acts as a FAD-binding domain (the N-terminal of DNA photolyase binds a light-harvesting cofactor).
Photolyases and cryptochromes are related flavoproteins that bind FAD. Photolyases harness the energy of blue light to repair DNA damage by removing pyrimidine dimers. Cryptochromes (CRY1 and CRY2) are blue light photoreceptors that mediate blue light-induced gene expression.
DNA photolyases are DNA repair enzymes that repair mismatched pyrimidine dimers induced by exposure to ultra-violet light. They bind to UV-damaged DNA containing pyrimidine dimers and, upon absorbing a near-UV photon (300 to 500 nm), they catalyse dimer splitting, breaking the cyclobutane ring joining the two pyrimidines of the dimer so as to split them into the constituent monomers; this process is called photoreactivation. DNA photolyases require two chromophore cofactors for their activity. All monomers contain a reduced FAD moiety and, in addition, either a reduced pterin or 8-hydroxy-5-deazaflavin as a second chromophore. Either chromophore may act as the primary photon acceptor, with peak absorptions occurring in the blue region of the spectrum and in the UV-B region, at a wavelength around 290 nm.
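As a rough worked example of the energy available for photoreactivation, the photon energy across the quoted 300 to 500 nm range can be computed from E = hc/λ. The function name is an assumption made for illustration; the constants are the exact SI values.

```python
# Photon energy E = h*c/lambda for the wavelength range quoted above.
PLANCK_J_S = 6.62607015e-34      # Planck constant, J*s (exact SI value)
LIGHT_M_S = 299_792_458.0        # speed of light, m/s (exact SI value)
J_PER_EV = 1.602176634e-19       # joules per electronvolt (exact SI value)

def photon_energy_ev(wavelength_nm):
    """Return the energy of one photon, in eV, for a wavelength in nm."""
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_M_S / wavelength_m / J_PER_EV

# Energies at the two ends of the near-UV/blue range given in the text.
energies = {nm: photon_energy_ev(nm) for nm in (300, 500)}
```

Shorter wavelengths carry more energy per photon: roughly 4.1 eV at 300 nm against roughly 2.5 eV at 500 nm.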
6-Phosphogluconate dehydrogenase (6PGD) catalyses the oxidative decarboxylation of 6-phosphogluconate to ribulose 5-phosphate, with NADP as the electron acceptor. This reaction is a component of the hexose mono-phosphate shunt and pentose phosphate pathways (PPP). Prokaryotic and eukaryotic 6PGD are proteins of about 470 amino acids whose sequences are highly conserved. The protein is a homodimer in which the monomers act independently: each contains a large, mainly alpha-helical domain and a smaller beta-alpha-beta domain, containing a mixed parallel and anti-parallel 6-stranded beta sheet. NADP is bound in a cleft in the small domain, the substrate binding in an adjacent pocket.
This family represents the NAD binding domain of 6-phosphogluconate dehydrogenase, which adopts a Rossmann fold. The C-terminal domain is described in a separate entry.
This domain is found in peptide chain release factors. Peptide chain release factors are important for protein synthesis since they direct the termination of translation in response to peptide chain termination codons. Bacteria contain two such factors: RF1, which responds to UAG and UAA, and RF2, which responds to UGA and UAA. Both contain the PCRF domain.
This domain is found in the release factor eRF1 which terminates protein biosynthesis by recognizing stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 Å from the GGQ motif, is proposed to form the codon recognition site.
This domain is also found in other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification.
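Stop-codon recognition at the A site can be illustrated with a short sketch that reads an mRNA in frame and reports the first stop codon. The function name and the example sequence are hypothetical; the three codons are the stop codons of the standard genetic code.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}  # stop codons of the standard genetic code

def first_stop_codon(mrna, start=0):
    """Return (index, codon) of the first in-frame stop codon, or None."""
    # Step through the sequence one codon (three nucleotides) at a time.
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            return i, codon
    return None

example = "AUGGCUUGGUAAGCU"  # hypothetical open reading frame
hit = first_stop_codon(example)
```

Only codons in the reading frame set by `start` are examined, so stop-like triplets that span two codons are ignored.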
This domain is found in the release factor eRF1 which terminates protein biosynthesis by recognizing stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 Å from the GGQ motif, is proposed to form the codon recognition site.
This domain is also found in other proteins which may also be involved in translation termination.
This domain is found in the release factor eRF1 which terminates protein biosynthesis by recognizing stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 Å from the GGQ motif, is proposed to form the codon recognition site.
This domain is also found in other proteins which may also be involved in translation termination but this awaits experimental verification.
Nonsense-mediated mRNA decay (NMD) is a surveillance mechanism by which eukaryotic cells detect and degrade transcripts containing premature termination codons. Three 'up-frameshift' proteins, UPF1, UPF2 and UPF3, are essential for this process in organisms ranging from yeast and humans to plants. Exon junction complexes (EJCs) are deposited ~24 nucleotides upstream of exon-exon junctions after splicing. Translation normally displaces the EJCs; however, premature translation termination upstream of one or more EJCs triggers the recruitment of UPF1, UPF2 and UPF3 and activates the NMD pathway.
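The EJC rule in the paragraph above can be sketched as a simple positional check. This is a simplified illustration: the function names and exon lengths are hypothetical, the 24-nucleotide offset is treated as exact, and real NMD depends on further factors.

```python
EJC_OFFSET = 24  # nucleotides upstream of each exon-exon junction (approximate)

def ejc_positions(exon_lengths):
    """Return spliced-mRNA coordinates of EJCs, one per exon-exon junction."""
    positions = []
    junction = 0
    for length in exon_lengths[:-1]:   # the last exon has no downstream junction
        junction += length
        positions.append(junction - EJC_OFFSET)
    return positions

def predicts_nmd(exon_lengths, stop_pos):
    """True if translation terminates upstream of at least one EJC."""
    return any(stop_pos < ejc for ejc in ejc_positions(exon_lengths))

exons = [150, 120, 200]               # hypothetical exon lengths
normal = predicts_nmd(exons, 400)     # stop codon in the last exon
premature = predicts_nmd(exons, 100)  # stop codon in the first exon
```

A stop codon in the last exon leaves no EJC downstream, so the sketch predicts no NMD, whereas a stop codon in the first exon lies upstream of both EJCs.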
This family contains UPF3. The crystal structure of the complex between the interacting domains of human UPF2 and UPF3b, which are, respectively, a MIF4G (middle portion of eIF4G) domain and an RNP domain (ribonucleoprotein-type RNA-binding domain), has been determined to 1.95 Å. The protein-protein interface is mediated by highly conserved charged residues in UPF2 and UPF3b and involves the beta-sheet surface of the UPF3b ribonucleoprotein (RNP) domain, which is generally used by these domains to bind nucleic acids. In UPF3b the RNP domain does not bind RNA, whereas the UPF2 construct and the complex do. It is clear that some RNP domains have evolved for specific protein-protein interactions rather than as nucleic acid binding modules.
The ATP-cone is an evolutionarily mobile, ATP-binding regulatory domain which is found in a variety of proteins including ribonucleotide reductases, phosphoglycerate kinases and transcriptional regulators.
In ribonucleotide reductase protein R1 from Escherichia coli this domain is located at the N-terminus, and is composed mostly of helices. It forms part of the allosteric effector region and contains the general allosteric activity site in a cleft located at the tip of the N-terminal region. This site binds either ATP (activating) or dATP (inhibitory), with the base bound in a hydrophobic pocket and the phosphates bound to basic residues. Substrate binding to this site is thought to affect enzyme activity by altering the relative positions of the two subunits of ribonucleotide reductase.
The function of this domain is unknown, it is found inand its relatives. It is found C-terminal to the
This entry represents the B3/B4 domain found in tRNA synthetase beta subunits as well as in some non-tRNA synthetase proteins. This domain has a 3-layer structure with a beta-sandwich fold of unusual topology, and contains a putative tRNA-binding structural motif. In Thermus thermophilus, both the catalytic alpha- and the non-catalytic beta-subunits comprise the characteristic fold of the class II active-site domains. The presence of an RNA-binding domain, similar to that of the U1A spliceosomal protein, in the beta-subunit of tRNA synthetase indicates structural relationships among different families of RNA-binding proteins.
Aminoacyl-tRNA synthetases can catalyse editing reactions to correct errors produced during amino acid activation and tRNA esterification, in order to prevent the attachment of incorrect amino acids to tRNA. The B3/B4 domain of the beta subunit contains an editing site, which lies close to the active site on the alpha subunit. Disruption of this site abolished tRNA editing, a process that is essential for faithful translation of the genetic code.
Domain B5 is found in phenylalanine-tRNA synthetase beta subunits. This domain has been shown to bind DNA through a winged helix-turn-helix motif. Phenylalanine-tRNA synthetase may influence common cellular processes via DNA binding, in addition to its aminoacylation function.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.
This domain is found at the N-terminus of arginyl-tRNA synthetase and is also called additional domain 1 (Add-1). It is about 140 residues long, and it has been suggested that this domain is involved in tRNA recognition.
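The class assignment listed above can be captured as a simple lookup. The sketch below encodes only the two lists given in the text, using three-letter amino acid codes; the function name and the "2'-OH"/"3'-OH" labels are illustrative choices, and the a/b/c subclasses of class I are not modelled because the text does not enumerate them.

```python
# Amino acid specificities of the two synthetase classes, as listed in the text.
CLASS_I = {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
CLASS_II = {"Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"}

# Hydroxyl group of the tRNA to which the aminoacyl group is coupled (preferred site).
PREFERRED_HYDROXYL = {"I": "2'-OH", "II": "3'-OH"}


def synthetase_class(amino_acid):
    """Return 'I' or 'II' for a three-letter amino acid code (hypothetical helper)."""
    if amino_acid in CLASS_I:
        return "I"
    if amino_acid in CLASS_II:
        return "II"
    raise ValueError(f"unknown amino acid code: {amino_acid!r}")


if __name__ == "__main__":
    for aa in ("Arg", "Phe"):
        cls = synthetase_class(aa)
        print(aa, "class", cls, "acylates", PREFERRED_HYDROXYL[cls])
```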
Potassium channels are the most diverse group of the ion channel family. They are important in shaping the action potential, and in neuronal excitability and plasticity. The potassium channel family is composed of several functionally distinct isoforms, which can be broadly separated into 2 groups: the practically non-inactivating 'delayed' group and the rapidly inactivating 'transient' group.
These are all highly similar proteins, with only small amino acid changes causing the diversity of the voltage-dependent gating mechanism, channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on their type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or other second messengers. In eukaryotic cells, K+ channels are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes. In prokaryotic cells, they play a role in the maintenance of ionic homeostasis.
All K+ channels discovered so far possess a core of alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has been termed the K+ selectivity sequence. In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane. However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains. The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK). The 2TM domain family comprises inward-rectifying K+ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels.
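As a rough illustration of how the K+ selectivity sequence (T/SxxTxGxG) might be located in a protein sequence, the sketch below translates the consensus into a regular expression, with x treated as any residue. The test fragment is a toy sequence used only for demonstration, and the function name is hypothetical; a consensus match alone does not establish that a protein is a K+ channel.

```python
import re

# (T/S)xxTxGxG, with x = any residue; the lookahead allows overlapping matches.
SELECTIVITY_PATTERN = re.compile(r"(?=([TS]..T.G.G))")


def find_selectivity_sequences(protein):
    """Return (start, matched_text) pairs for each consensus match (0-based start)."""
    return [(m.start(), m.group(1)) for m in SELECTIVITY_PATTERN.finditer(protein)]


if __name__ == "__main__":
    fragment = "LWWSVETATTVGYGDLYPV"  # toy pore-loop-like fragment, for illustration only
    print(find_selectivity_sequences(fragment))
```

Counting the matches per subunit gives a crude way to separate one-P-domain from two-P-domain candidates, subject to the caveats above.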
Ca2+-activated K+ channels are a diverse group of channels that are activated by an increase in intracellular Ca2+ concentration. They are found in the majority of nerve cells, where they modulate cell excitability and action potential. Three types of Ca2+-activated K+ channel have been characterised, termed small-conductance (SK), intermediate conductance (IK) and large conductance (BK) respectively.
BK channels (also referred to as maxi-K channels) are widely expressed in the body, being found in glandular tissue, smooth and skeletal muscle, as well as in neural tissues. They have been demonstrated to regulate arteriolar and airway diameter, and also neurotransmitter release. Each channel complex is thought to be composed of 2 types of subunit; the pore-forming (alpha) subunits and smaller accessory (beta) subunits.
The alpha subunit of the BK channel was initially thought to share the characteristic 6TM organisation of the voltage-gated K+ channels. However, the molecule is now thought to possess an additional TM domain, with an extracellular N-terminus and intracellular C-terminus. This C-terminal region contains 4 predominantly hydrophobic domains, which are also thought to lie intracellularly. The extracellular N-terminus and the first TM region are required for modulation by the beta subunit. The precise location of the Ca2+-binding site that modulates channel activation remains unknown, but it is thought to lie within the C-terminal hydrophobic domains.
This presumed domain is found at the N terminus of some isoforms of the cytoskeletal muscle protein plectin as well as the ribosomal S10 protein. This domain may be involved in RNA binding.
This entry contains Pob3 which is a subunit of the heterodimeric yeast FACT complex (Spt16p-Pob3p). The FACT complex facilitates RNA Polymerase II transcription elongation through nucleosomes by destabilizing and then reassembling nucleosome structure.
Allantoicase (also known as allantoate amidinohydrolase) is involved in purine degradation, facilitating the utilization of purines as secondary nitrogen sources under nitrogen-limiting conditions. While purine degradation converges to uric acid in all vertebrates, its further degradation varies from species to species. Uric acid is excreted by birds, reptiles, and some mammals that do not have a functional uricase gene, whereas other mammals produce allantoin. Amphibians and microorganisms produce ammonia and carbon dioxide using the uricolytic pathway. Allantoicase performs the second step in this pathway catalyzing the conversion of allantoate into ureidoglycolate and urea.
allantoate + H(2)O = (S)-ureidoglycolate + urea
The structure of allantoicase is best described as being composed of two repeats (the allantoicase repeats: AR1 and AR2), which are connected by a flexible linker. The crystal structure, determined at 2.4 Å resolution, reveals that AR1 has a very similar fold to AR2, both repeats being jelly-roll motifs, composed of four-stranded and five-stranded antiparallel beta-sheets. Each jelly-roll motif has two conserved surface patches that probably constitute the active site.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.
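A quick atom count confirms that the reaction above is balanced. The sketch assumes the molecular formulae of the neutral acid forms (allantoic acid C4H8N4O4 and ureidoglycolic acid C3H6N2O4), which are not stated in the text; the parser is a minimal helper that handles only simple formulae without brackets or charges.

```python
import re
from collections import Counter


def atoms(formula):
    """Count atoms in a simple molecular formula such as 'C4H8N4O4'."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def total(formulae):
    """Sum the atom counts over a list of formulae."""
    result = Counter()
    for formula in formulae:
        result += atoms(formula)
    return result


# Assumed formulae (neutral forms): allantoic acid, water, ureidoglycolic acid, urea.
SUBSTRATES = ["C4H8N4O4", "H2O"]
PRODUCTS = ["C3H6N2O4", "CH4N2O"]

if __name__ == "__main__":
    print(total(SUBSTRATES))
    print(total(PRODUCTS))
    print("balanced:", total(SUBSTRATES) == total(PRODUCTS))
```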
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.
This group of cysteine peptidases belong to MEROPS peptidase family C50 (separase family, clan CD). The active site residues for members of this family and family C14 occur in the same order in the sequence: H,C.
The separases are caspase-like proteases, which play a central role in chromosome segregation. In yeast they cleave the rad21 subunit of the cohesin complex at the onset of anaphase. During most of the cell cycle, separase is inactivated by the securin/cut2 protein, which probably covers its active site.
Members of this family are essential for 40S ribosomal biogenesis. They play a role in the methylation reaction of pre-rRNA processing. The structure of EMG1 has revealed that it is a novel member of the superfamily of alpha/beta knot fold methyltransferases.
Fumble is required for cell division in Drosophila. Mutants lacking fumble exhibit abnormalities in bipolar spindle organisation, chromosome segregation, and contractile ring formation. Analyses have demonstrated that it encodes three protein isoforms, all of which contain a domain with high similarity to the pantothenate kinases of Emericella nidulans and mouse. A role of fumble in membrane synthesis has been proposed.
The movement of lipid and protein components between intracellular organelles requires the regulated interactions of many molecules. Vacuolar protein sorting-associated protein (Vps)5 is a yeast protein that is a subunit of a large multimeric complex, termed the retromer complex, involved in retrograde transport of proteins from endosomes to the trans-Golgi network. Sorting nexin (SNX) 1 and SNX2 are its mammalian orthologs.
To carry out its biological functions, Vps5 forms the retromer complex with at least four other proteins: Vps17, Vps26, Vps29, and Vps35. Vps35 contains a central region of weaker sequence similarity, thought to indicate the presence of at least three domains.
The movement of lipid and protein components between intracellular organelles requires the regulated interactions of many molecules. Vacuolar protein sorting-associated protein (Vps)5 is a yeast protein that is a subunit of a large multimeric complex, termed the retromer complex, involved in retrograde transport of proteins from endosomes to the trans-Golgi network. Sorting nexin (SNX) 1 and SNX2 are its mammalian orthologs.
To carry out its biological functions, Vps5 forms the retromer complex with at least four other proteins: Vps17, Vps26, Vps29, and Vps35. This family of Vps26-proteins also contains Down syndrome critical region 3/A.
This is a family of proteins of unknown function.
Phf5 is a member of a novel murine multigene family that is highly conserved during evolution and belongs to the superfamily of PHD-finger proteins. At least one example, from Mus musculus (Mouse), may act as a chromatin-associated protein. The Schizosaccharomyces pombe (Fission yeast) ini1 gene is essential, required for splicing. It is localised in the nucleus, but not detected in the nucleolus and can be complemented by human ini1. The proteins of this family contain five CXXC motifs.
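The CXXC motifs mentioned above can be counted with a short regular-expression scan. This is a minimal sketch: the sequence is a made-up toy example rather than a real Phf5 sequence, the function name is hypothetical, and matches are counted without overlap, which is an assumption about how the five motifs are defined.

```python
import re

CXXC = re.compile(r"C..C")  # Cys, any two residues, Cys


def cxxc_positions(protein):
    """Return 0-based start positions of non-overlapping CXXC motifs."""
    return [m.start() for m in CXXC.finditer(protein)]


if __name__ == "__main__":
    toy = "MKCAACGGSTCPPCLLKECDDCAVRCEECTTQCHHCW"  # toy sequence, not real data
    positions = cxxc_positions(toy)
    print(len(positions), positions)
```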
This is a small family of mainly hypothetical proteins of unknown function.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
This is a family of proteins related to the 30S ribosomal protein S5P from Sulfolobus acidocaldarius. Ribosomal protein S5 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S5 is known to be important in the assembly and function of the 30S ribosomal subunit. Mutations in S5 have been shown to increase translational error frequencies.
The PH (phosphorolytic) domain is responsible for 3'-5' exoribonuclease activity, although in some proteins this domain has lost its catalytic function. An active PH domain uses inorganic phosphate as a nucleophile, adding it across the phosphodiester bond between the end two nucleotides in order to release ribonucleoside 5'-diphosphate (rNDP) from the 3' end of the RNA substrate.
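The 3'-5' phosphorolysis described above can be sketched as a toy simulation in which each inorganic phosphate consumed releases one ribonucleoside diphosphate from the 3' end of the substrate. The function name and the string representation of the RNA are illustrative assumptions. The loop stops at a single remaining nucleotide because no phosphodiester bond is left to attack, which is a simplification of real enzyme behaviour.

```python
def phosphorolyse(rna, phosphates):
    """Degrade an RNA string (written 5'->3') from its 3' end.

    Each step consumes one inorganic phosphate and releases the 3'-terminal
    nucleotide as a diphosphate (e.g. 'U' -> 'UDP'). Returns the remaining
    RNA and the list of released rNDPs in order of release.
    """
    released = []
    while phosphates > 0 and len(rna) > 1:
        released.append(rna[-1] + "DP")  # 3'-terminal nucleotide leaves as rNDP
        rna = rna[:-1]
        phosphates -= 1
    return rna, released


if __name__ == "__main__":
    print(phosphorolyse("AUGGCU", 3))
```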
PH domains can be found in bacterial/organelle RNases and PNPases (polynucleotide phosphorylases), as well as in archaeal and eukaryotic RNA exosomes, the latter acting as nano-compartments for the degradation or processing of RNA (including mRNA, rRNA, snRNA and snoRNA). Bacterial/organelle PNPases share a common barrel structure with RNA exosomes, consisting of a hexameric ring of PH domains that act as a degradation chamber, and an S1-domain/KH-domain containing cap that binds the RNA substrate (and sometimes accessory proteins) in order to regulate and restrict entry into the degradation chamber. Unstructured RNA substrates feed in through the pore made by the S1 domains, are degraded by the PH domain ring, and exit as nucleotides via the PH pore at the opposite end of the barrel.
This entry represents the phosphorolytic (PH) domain 2, which has a core 3-layer alpha/beta/alpha structure. This domain is found in bacterial/organelle PNPases and in archaeal/eukaryotic exosomes.
More information about these proteins can be found at Protein of the Month: RNA Exosomes.
Hexokinase is an important enzyme that catalyses the ATP-dependent conversion of aldo- and keto-hexose sugars to the hexose-6-phosphate (H6P). The enzyme can catalyse this reaction on glucose, fructose, sorbitol and glucosamine, and as such is the first step in a number of metabolic pathways. The addition of a phosphate group to the sugar acts to trap it in a cell, since the negatively charged phosphate cannot easily traverse the plasma membrane.
The enzyme is widely distributed in eukaryotes. There are three isozymes of hexokinase in yeast (PI, PII and glucokinase): isozymes PI and PII phosphorylate both aldo- and keto-sugars; glucokinase is specific for aldo-hexoses. All three isozymes contain two domains. Structural studies of yeast hexokinase reveal a well-defined catalytic pocket that binds ATP and hexose, allowing easy transfer of the phosphate from ATP to the sugar. Vertebrates contain four hexokinase isozymes, designated I to IV, where types I to III contain a duplication of the two-domain yeast-type hexokinases. Both the N- and C-terminal halves bind hexose and H6P, though in types I and III only the C-terminal half supports catalysis, while both halves support catalysis in type II. The N-terminal half is the regulatory region. Type IV hexokinase is similar to the yeast enzyme in containing only the two domains, and is sometimes incorrectly referred to as glucokinase.
The different vertebrate isozymes differ in their catalysis, localisation and regulation, thereby contributing to the different patterns of glucose metabolism in different tissues. Whereas types I to III can phosphorylate a variety of hexose sugars and are inhibited by glucose-6-phosphate (G6P), type IV is specific for glucose and shows no G6P inhibition. Type I enzyme may have a catabolic function, producing H6P for energy production in glycolysis; it is bound to the mitochondrial membrane, which enables the coordination of glycolysis with the TCA cycle. Types II and III enzyme may have anabolic functions, providing H6P for glycogen or lipid synthesis. Type IV enzyme is found in the liver and pancreatic beta-cells, where it is controlled by insulin (activation) and glucagon (inhibition). In pancreatic beta-cells, type IV enzyme acts as a glucose sensor to modify insulin secretion. Mutations in type IV hexokinase have been associated with diabetes mellitus.
Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.
Elongation factor EF2 (EF-G) is a G-protein. It brings about the translocation of peptidyl-tRNA and mRNA through a ratchet-like mechanism: the binding of GTP-EF2 to the ribosome causes a counter-clockwise rotation in the small ribosomal subunit; the hydrolysis of GTP to GDP by EF2 and the subsequent release of EF2 causes a clockwise rotation of the small subunit back to the starting position. This twisting action destabilises tRNA-ribosome interactions, freeing the tRNA to translocate along the ribosome upon GTP hydrolysis by EF2. EF2 binding also affects the entry and exit channel openings for the mRNA, widening them when bound to enable the mRNA to translocate along the ribosome.
EF2 has five domains. This entry represents domain IV found in EF2 (or EF-G) of both prokaryotes and eukaryotes. The EF2-GTP-ribosome complex undergoes extensive structural rearrangement for tRNA-mRNA movement to occur. Domain IV, which extends from the 'body' of the EF2 molecule much like a lever arm, appears to be essential for the structural transition to take place.
More information about these proteins can be found at Protein of the Month: Elongation Factors.
ArgRIII has been demonstrated to be an inositol polyphosphate kinase which catalyses the reaction
ATP + 1D-myo-inositol 1,4,5-trisphosphate = ADP + 1D-myo-inositol 1,3,4,5-tetrakisphosphate.
TLC is a protein domain with at least 5 transmembrane alpha-helices. Lag1p and Lac1p are essential for acyl-CoA-dependent ceramide synthesis, TRAM is a subunit of the translocon, and the CLN8 gene is mutated in Northern epilepsy syndrome. Proteins containing this domain may possess multiple functions such as lipid trafficking, metabolism, or sensing. Trh homologues possess additional homeobox domains.
Members of this family are components of the mitotic spindle. It has been shown that Ndc80 from yeast is part of a complex called the Ndc80p complex. This complex is thought to bind to the microtubules of the spindle.
Clag (cytoadherence linked asexual gene) is a malaria surface protein which has been shown to be involved in the binding of Plasmodium falciparum infected erythrocytes to host endothelial cells, a process termed cytoadherence. The cytoadherence phenomenon is associated with the sequestration of infected erythrocytes in the blood vessels of the brain, cerebral malaria. Clag is a multi-gene family in P. falciparum with at least 9 members identified to date. Orthologous proteins in the rodent malaria species Plasmodium chabaudi suggest that the gene family is found in other malaria species and may play a more generic role in cytoadherence.
The exchange of macromolecules between the nucleus and cytoplasm takes place through nuclear pore complexes within the nuclear membrane. Active transport of large molecules through these pore complexes requires carrier proteins, called karyopherins (importins and exportins), which shuttle between the two compartments.
Members of the importin-beta (karyopherin-beta) family can bind and transport cargo by themselves, or can form heterodimers with importin-alpha. As part of a heterodimer, importin-beta mediates interactions with the pore complex, while importin-alpha acts as an adaptor protein to bind the nuclear localisation signal (NLS) on the cargo through the classical NLS import of proteins. Importin-beta is a helicoidal molecule constructed from 19 HEAT repeats. Many nuclear pore proteins contain FG sequence repeats that can bind to HEAT repeats within importins, which is important for importin-beta mediated transport.
Ran GTPase helps to control the unidirectional transfer of cargo. The cytoplasm contains primarily RanGDP and the nucleus RanGTP through the actions of RanGAP and RanGEF, respectively. In the nucleus, RanGTP binds to importin-beta within the importin/cargo complex, causing a conformational change in importin-beta that releases it from importin-alpha-bound cargo. As a result, the N-terminal auto-inhibitory region on importin-alpha is free to loop back and bind to the major NLS-binding site, causing the cargo to be released. There are additional release factors as well.
More information about these proteins can be found at Protein of the Month: Importins.
The LCCL domain has been named after the best characterised proteins that were found to contain it, namely Limulus factor C, Coch-5b2 and Lgl1. It is a domain of about 100 amino acids whose C-terminal part contains a highly conserved histidine in a conserved motif YxxxSxxCxAAVHxGVI. The LCCL module is thought to be an autonomously folding domain that has been used for the construction of various modular proteins through exon-shuffling. It has been found in various metazoan proteins in association with complement B-type domains, C-type lectin domains, von Willebrand type A domains, CUB domains, discoidin lectin domains or CAP domains. It has been proposed that the LCCL domain could be involved in lipopolysaccharide (LPS) binding. Secondary structure prediction suggests that the LCCL domain contains six beta strands and two alpha helices.
Some proteins known to contain an LCCL domain include Limulus factor C, an LPS endotoxin-sensitive trypsin-type serine protease which serves to protect the organism from bacterial infection; vertebrate cochlear protein cochlin or coch-5b2 (cochlin is probably a secreted protein; mutations affecting the LCCL domain of coch-5b2 cause the deafness disorder DFNA9 in humans); and mammalian late gestation lung protein Lgl1, which contains two tandem copies of the LCCL domain.
These PAP/25A associated domains are found in uncharacterised eukaryotic proteins, a number of which are described as 'topoisomerase 1-related' though they appear to have little or no homology to topoisomerase 1. The signatures that define this group of sequences often occur towards the C-terminus after the PAP/25A core domain.
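The conserved LCCL motif YxxxSxxCxAAVHxGVI can be turned into a regular expression to locate the conserved histidine in a candidate sequence. The sketch below treats x as any residue and uses a toy sequence invented for illustration; the function name is hypothetical, and real family members may deviate from the strict consensus, so an exact match should be read as a hint rather than a domain assignment.

```python
import re

# YxxxSxxCxAAVHxGVI with x = any residue.
LCCL_MOTIF = re.compile(r"Y...S..C.AAVH.GVI")
HIS_OFFSET = 12  # 0-based position of the conserved His within the motif


def conserved_histidines(protein):
    """Return 0-based positions of the conserved His for each motif match."""
    return [m.start() + HIS_OFFSET for m in LCCL_MOTIF.finditer(protein)]


if __name__ == "__main__":
    toy = "MKYQDGSLTCPAAVHAGVIKD"  # toy sequence, not a real LCCL protein
    for pos in conserved_histidines(toy):
        print(pos, toy[pos])
```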
All proteins in this family for which functions are known are components in a multiprotein endonuclease complex (usually made up of Rad1 and Rad10 homologs). This complex is used primarily for nucleotide excision repair but also for some aspects of recombination repair. In yeast, Rad10 works as a heterodimer with Rad1, and is involved in nucleotide excision repair of DNA damaged with UV light, bulky adducts or cross-linking agents. The complex forms an endonuclease which specifically degrades single-stranded DNA.
Ercc1 and XPF (xeroderma pigmentosum group F-complementing protein) are two structure-specific endonucleases of a class of seven containing an ERCC4 domain. Together they form an obligate complex that functions primarily in nucleotide excision repair (NER), a versatile pathway able to detect and remove a variety of DNA lesions induced by UV light and environmental carcinogens, and secondarily in DNA inter-strand cross-link repair and telomere maintenance. This domain in fact binds simultaneously to both XPF and single-stranded DNA; this ternary complex explains the important role of Ercc1 in targeting its catalytic XPF partner to the NER pre-incision complex.
The YjeF N-terminal domains occur either as single proteins or fusions with other domains and are commonly associated with enzymes. In bacteria and archaea, YjeF N-terminal domains are often fused to a YjeF C-terminal domain with high structural homology to the members of a ribokinase-like superfamily and/or belong to operons that encode enzymes of diverse functions: pyridoxal phosphate biosynthetic protein PdxJ; phosphopantetheine-protein transferase; ATP/GTP hydrolase; and pyruvate-formate lyase 1-activating enzyme. In plants, the YjeF N-terminal domain is fused to a C-terminal putative pyridoxamine 5'-phosphate oxidase. In eukaryotes, proteins that consist of (Sm)-FDF-YjeF N-terminal domains may be involved in RNA processing.
The YjeF N-terminal domains represent a novel version of the Rossmann fold, one of the most common protein folds in nature observed in numerous enzyme families, that has acquired a set of catalytic residues and structural features that distinguish them from the conventional dehydrogenases. The YjeF N-terminal domain is comprised of a three-layer alpha-beta-alpha sandwich with a central beta-sheet surrounded by helices. The conservation of the acidic residues in the predicted active site of the YjeF N-terminal domains is reminiscent of the presence of such residues in the active sites of diverse hydrolases.
Prokaryotes contain a single DNA-dependent RNA polymerase (RNAP) that is responsible for the transcription of all genes, while eukaryotes have three classes of RNAPs (I-III) that transcribe different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. Certain subunits of RNAPs, including RPB5 (POLR2E in mammals), are common to all three eukaryotic polymerases. RPB5 plays a role in the transcription activation process. Eukaryotic RPB5 has a bipartite structure consisting of a unique N-terminal region, plus a C-terminal region that is structurally homologous to the prokaryotic RPB5 homologue, subunit H (gene rpoH).
This entry represents the N-terminal domain of eukaryotic RPB5, which has a core structure consisting of 3 layers alpha/beta/alpha. The N-terminal domain is involved in DNA binding and is part of the jaw module in the RNA pol II structure. This module is important for positioning the downstream DNA.
The eukaryotic RNA polymerase subunits RPB4 and RPB7 form a heterodimer that reversibly associates with the RNA polymerase II core. Archaeal cells contain a single RNAP made up of about 12 subunits, displaying considerable homology to the eukaryotic RNAPII subunits. The RPB4 and RPB7 homologs are called subunits F and E, respectively, and have been shown to form a stable heterodimer. While the RPB7 homolog is reasonably well conserved, the similarity between the eukaryotic RPB4 and the archaeal F subunit is barely detectable.
Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including vitamin B12, haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin.
The first stage in tetrapyrrole synthesis is the synthesis of 5-aminolaevulinic acid (ALA) via two possible routes: (1) condensation of succinyl CoA and glycine (C4 pathway) using ALA synthase, or (2) decarboxylation of glutamate (C5 pathway) via three different enzymes, glutamyl-tRNA synthetase to charge a tRNA with glutamate, glutamyl-tRNA reductase to reduce glutamyl-tRNA to glutamate-1-semialdehyde (GSA), and GSA aminotransferase to catalyse a transamination reaction to produce ALA.
The second stage is to convert ALA to uroporphyrinogen III, the first macrocyclic tetrapyrrolic structure in the pathway. This is achieved by the action of three enzymes in one common pathway: porphobilinogen (PBG) synthase (or ALA dehydratase) to condense two ALA molecules to generate porphobilinogen; hydroxymethylbilane synthase (or PBG deaminase) to polymerise four PBG molecules into preuroporphyrinogen (tetrapyrrole structure); and uroporphyrinogen III synthase to link two pyrrole units together (rings A and D) to yield uroporphyrinogen III.
Uroporphyrinogen III is the first branch point of the pathway. To synthesise cobalamin (vitamin B12), sirohaem, and coenzyme F430, uroporphyrinogen III needs to be converted into precorrin-2 by the action of uroporphyrinogen III methyltransferase. To synthesise haem and chlorophyll, uroporphyrinogen III needs to be decarboxylated into coproporphyrinogen III by the action of uroporphyrinogen III decarboxylase.
This entry represents hydroxymethylbilane synthase (or porphobilinogen deaminase), which functions during the second stage of tetrapyrrole biosynthesis. This enzyme catalyses the polymerisation of four PBG molecules into the tetrapyrrole structure, preuroporphyrinogen, with the concomitant release of four molecules of ammonia. This enzyme uses a unique dipyrromethane cofactor made from two molecules of PBG, which is covalently attached to a cysteine side chain. The tetrapyrrole product is synthesised in an ordered, sequential fashion, by initial attachment of the first pyrrole unit (ring A) to the cofactor, followed by subsequent additions of the remaining pyrrole units (rings B, C, D) to the growing pyrrole chain. The link between the pyrrole ring and the cofactor is broken once all the pyrroles have been added. This enzyme is folded into three distinct domains that enclose a single, large active site that makes use of an aspartic acid as its one essential catalytic residue, acting as a general acid/base during catalysis. A deficiency of hydroxymethylbilane synthase is implicated in the neuropathic disease Acute Intermittent Porphyria (AIP).
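The stoichiometry described for the second stage (two ALA per PBG, four PBG per tetrapyrrole, four ammonia released) can be tallied with a short sketch. This is only an illustration of the arithmetic: the function and constant names are hypothetical, and the two PBG units that make up the enzyme-bound dipyrromethane cofactor are deliberately not counted, on the assumption that the cofactor stays attached to the enzyme.

```python
# Illustrative bookkeeping only; names are hypothetical, numbers come from the text above.
ALA_PER_PBG = 2            # PBG synthase condenses two ALA molecules into one PBG
PBG_PER_TETRAPYRROLE = 4   # hydroxymethylbilane synthase polymerises four PBG units
NH3_PER_PBG = 1            # four ammonia released per four PBG added, i.e. one each


def precursor_demand(n_tetrapyrroles: int) -> dict:
    """Return the ALA and PBG consumed, and ammonia released, for n tetrapyrroles.

    Assumption: the dipyrromethane cofactor (two extra PBG units) is retained
    by the enzyme and is therefore excluded from the count.
    """
    if n_tetrapyrroles < 0:
        raise ValueError("n_tetrapyrroles must be non-negative")
    pbg = n_tetrapyrroles * PBG_PER_TETRAPYRROLE
    return {
        "ALA": pbg * ALA_PER_PBG,
        "PBG": pbg,
        "NH3_released": pbg * NH3_PER_PBG,
    }


print(precursor_demand(1))
```

Under these assumptions, each molecule of preuroporphyrinogen accounts for eight molecules of ALA.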
Members of this family are mannosyltransferase enzymes. At least some members are localised in the endoplasmic reticulum and are involved in GPI anchor biosynthesis. In yeast, SMP3 (YOR149C) has been implicated in plasmid stability.
Sec20 is a membrane glycoprotein associated with the secretory pathway.
The BSD domain is a domain of about 60 residues named after the BTF2-like transcription factors, Synapse-associated proteins and DOS2-like proteins in which it is found. It is also found in several hypothetical proteins. The BSD domain occurs in one or two copies in a variety of species ranging from primitive protozoa to humans. It can be found associated with other domains, such as the BTB domain or the U-box, in multidomain proteins. The function of the BSD domain is as yet unknown.
Secondary structure prediction indicates the presence of three predicted alpha helices, which probably form a three-helical bundle in small domains. The third predicted helix contains neighbouring phenylalanine and tryptophan residues - less common amino acids that are invariant in all the BSD domains identified and that are the most striking sequence features of the domain.
Some proteins known to contain one or two BSD domains are listed below:
This domain is present in the CCAAT-binding protein, which is essential for growth and necessary for 60S ribosomal subunit biogenesis. Other proteins containing this domain stimulate transcription from the HSP70 promoter.
This entry represents eukaryotic glutathione synthetase (GSS), a homodimeric enzyme that catalyses the conversion of gamma-L-glutamyl-L-cysteine and glycine to phosphate and glutathione in the presence of ATP. This is the second step in glutathione biosynthesis, the first step being catalysed by gamma-glutamylcysteine synthetase. In humans, defects in GSS are inherited in an autosomal recessive way and are the cause of severe metabolic acidosis, 5-oxoprolinuria, and increased rate of haemolysis and defective function of the central nervous system.
This domain is found at the C terminus of the mRNA capping enzyme. The mRNA capping enzyme in yeasts is composed of two separate chains: alpha, an mRNA guanylyltransferase, and beta, an RNA 5'-triphosphatase. X-ray crystallography reveals a large conformational change during guanylyl transfer by mRNA capping enzymes. Binding of the enzyme to nucleotides is specific to the GMP moiety of GTP. The viral mRNA capping enzyme is a monomer that transfers a GMP cap onto the end of mRNA that terminates with a 5'-diphosphate tail.
SKP1 (together with SKP2) was identified as an essential component of the cyclin A-CDK2 S phase kinase complex. It was found to bind several F-box containing proteins (e.g., Cdc4, Skp2, cyclin F) and to be involved in the ubiquitin protein degradation pathway. A yeast homologue of SKP1 (P52286) was identified in the centromere bound kinetochore complex and is also involved in the ubiquitin pathway. In Dictyostelium discoideum (Slime mold) FP21 was shown to be glycosylated in the cytosol and has homology to SKP1.
This entry represents a POZ domain with a core structure consisting of beta(2)/alpha(2)/beta(2)/alpha(2) in two layers, alpha/beta. This domain is found at the N-terminal of SKP1 proteins as well as in subunit D of the centromere DNA-binding protein complex Cbf3.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
The N-terminal domain appears to be specific to the eukaryotic ribosomal proteins L25, L23, and L23a.
Ribosomal protein L11 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L11 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, plant chloroplast, red algal chloroplast, cyanelle and archaebacterial L11; and mammalian, plant and yeast L12 (YL15). L11 is a protein of 140 to 165 amino-acid residues. In E. coli, the C-terminal half of L11 has been shown to be in an extended and loosely folded conformation and is likely to be buried within the ribosomal structure.
Ribosomal protein L2 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L2 is known to bind to the 23S rRNA and to have peptidyltransferase activity. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.
Glutamyl-tRNA synthetase is a class Ic synthetase and shows several similarities with glutaminyl-tRNA synthetase concerning structure and catalytic properties. It is an alpha2 dimer. To date one crystal structure of a glutamyl-tRNA synthetase (Thermus thermophilus) has been solved. The molecule has the form of a bent cylinder and consists of four domains. The N-terminal half (domains 1 and 2) contains the 'Rossmann fold' typical for class I synthetases and resembles the corresponding part of Escherichia coli GlnRS, whereas the C-terminal half exhibits a GluRS-specific structure.
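The class assignment given above lends itself to a simple lookup. The following is a minimal sketch that encodes only the amino acid lists and hydroxyl preferences stated in the text; the names `CLASS_I`, `CLASS_II`, `PREFERRED_HYDROXYL` and `synthetase_class` are hypothetical, and the subclasses a, b and c are not modelled because the text does not list their members.

```python
# Amino acid specificities per synthetase class, exactly as listed in the text.
CLASS_I = frozenset({
    "arginine", "cysteine", "glutamic acid", "glutamine", "isoleucine",
    "leucine", "methionine", "tyrosine", "tryptophan", "valine",
})
CLASS_II = frozenset({
    "alanine", "asparagine", "aspartic acid", "glycine", "histidine",
    "lysine", "phenylalanine", "proline", "serine", "threonine",
})

# tRNA hydroxyl to which the aminoacyl group is coupled, per the text.
PREFERRED_HYDROXYL = {"I": "2'-hydroxyl", "II": "3'-hydroxyl"}


def synthetase_class(amino_acid: str) -> str:
    """Return 'I' or 'II' for the synthetase specific to the given amino acid."""
    name = amino_acid.strip().lower()
    if name in CLASS_I:
        return "I"
    if name in CLASS_II:
        return "II"
    raise KeyError(f"no class listed for {amino_acid!r}")


cls = synthetase_class("Glutamic acid")
print(cls, PREFERRED_HYDROXYL[cls])
```

Looking up glutamic acid returns class I, in line with the description of glutamyl-tRNA synthetase as a class Ic enzyme.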
Glutamine synthetase (GS) plays an essential role in the metabolism of nitrogen by catalyzing the condensation of glutamate and ammonia to form glutamine.
There seem to be three different classes of GS:
While the three classes of GS are clearly structurally related, their sequence similarities are not extensive.
Enolase (2-phospho-D-glycerate hydrolase) is an essential glycolytic enzyme that catalyses the interconversion of 2-phosphoglycerate and phosphoenolpyruvate. In vertebrates, there are three different, tissue-specific isoenzymes, designated alpha, beta and gamma. Alpha is present in most tissues, beta is localised in muscle tissue, and gamma is found only in nervous tissue. The functional enzyme exists as a dimer of any two isoforms. In immature organs and in adult liver, it is usually an alpha homodimer; in adult skeletal muscle, a beta homodimer; and in adult neurons, a gamma homodimer. In developing muscle, it is usually an alpha/beta heterodimer, and in the developing nervous system, an alpha/gamma heterodimer. The tissue-specific forms display minor kinetic differences. Tau-crystallin, one of the major lens proteins in some fish, reptiles and birds, has been shown to be evolutionarily related to enolase.
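Because the functional enzyme is described as a dimer of any two of the three isoforms, the possible pairings can be enumerated directly. The sketch below is illustrative only: the names `ISOFORMS`, `REPORTED_TISSUE` and `possible_dimers` are hypothetical, and the tissue table contains just the five assignments stated in the text.

```python
from itertools import combinations_with_replacement

ISOFORMS = ("alpha", "beta", "gamma")

# Tissue assignments as stated in the text; any pairing not listed is left unassigned.
REPORTED_TISSUE = {
    ("alpha", "alpha"): "immature organs and adult liver",
    ("beta", "beta"): "adult skeletal muscle",
    ("gamma", "gamma"): "adult neurons",
    ("alpha", "beta"): "developing muscle",
    ("alpha", "gamma"): "developing nervous system",
}


def possible_dimers(isoforms=ISOFORMS):
    """Enumerate unordered pairs of isoforms, homodimers included."""
    return list(combinations_with_replacement(isoforms, 2))


for dimer in possible_dimers():
    tissue = REPORTED_TISSUE.get(dimer, "no tissue named in the text")
    print("/".join(dimer), "->", tissue)
```

Three isoforms give six unordered pairings; the text names a typical tissue for five of them and none for the beta/gamma pairing.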
Neuron-specific enolase is released in a variety of neurological diseases, such as multiple sclerosis and after seizures or acute stroke. Several tumour cells have also been found positive for neuron-specific enolase. Beta-enolase deficiency is associated with glycogenosis type XIII defect.
This domain is found in the tubulin alpha, beta and gamma chains, as well as the bacterial FtsZ family of proteins. These proteins are GTPases and are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of the cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea. This is the C-terminal domain.
This family of short proteins has no known function. The bacterial members are about 60-70 amino acids in length and the eukaryotic examples are about 120 amino acids in length. The C terminus contains the strongest conservation.
ATPase family gene 1 (AFG1) ATPase is a 377 amino acid putative protein with an ATPase motif typical of the protein family including SEC18p, PAS1, CDC48-VCP and TBP. AFG1 also has substantial homology to these proteins outside the ATPase domain. This family of proteins contains a P-loop motif.
Proteins in this entry belong to the Atg3 group of proteins and the Atg3 conjugation enzymes.
Autophagy is a degradative transport pathway that delivers cytosolic proteins to the lysosome (vacuole) and is induced by starvation. Cytosolic proteins appear inside the vacuole enclosed in autophagic vesicles. Autophagy significantly differs from other transport pathways by using double membrane layered transport intermediates, called autophagosomes. The breakdown of vesicular transport intermediates is a unique feature of autophagy. Autophagy can also function in the elimination of invading bacteria and antigens.
Atg3 is the E2 enzyme for the LC3 lipidation process. It is essential for autophagocytosis. The super protein complex, the Atg16L complex, consists of multiple Atg12-Atg5 conjugates. Atg16L has an E3-like role in the LC3 lipidation reaction. The activated intermediate, LC3-Atg3 (E2), is recruited to the site where the lipidation takes place.
Atg3 catalyses the conjugation of Atg8 and phosphatidylethanolamine (PE). Atg3 has an alpha/beta-fold, and its core region is topologically similar to canonical E2 enzymes. Atg3 has two regions inserted in the core region and another with a long alpha-helical structure that protrudes from the core region as far as 30 Å. It interacts with Atg8 through an intermediate thioester bond between Cys-288 and the C-terminal Gly of Atg8. It also interacts with the C-terminal region of the E1-like Atg7 enzyme.
Autophagocytosis is a starvation-induced process responsible for transport of cytoplasmic proteins to the lysosome/vacuole. Atg3 is a ubiquitin-like modifier that is topologically similar to the canonical E2 enzyme. It catalyses the conjugation of Atg8 and phosphatidylethanolamine.
This domain is the N-terminal of Atg3 while the C-terminal is represented by
Autophagocytosis is a starvation-induced process responsible for transport of cytoplasmic proteins to the vacuole. The cysteine residue within the HPC motif is the putative active-site residue for recognition of the Apg5 subunit of the autophagosome complex.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils.
This entry represents the beta-pinwheel repeat found at the C-terminal end of subunit A of topoisomerase IV (ParC) and subunit A of DNA gyrase (GyrA). DNA gyrase is the topoisomerase II found primarily in bacteria and archaea that consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. This is distinct from the topoisomerase II found in most eukaryotes, which consists of a single polypeptide, with the N- and C-terminal regions corresponding to gyrB and gyrA, respectively, and which is not represented in this entry.
The ability of DNA gyrase to introduce negative supercoils into DNA is mediated in part by the C-terminal domain of subunit A, which forms a beta-pinwheel fold that is similar to a beta-propeller but with a different blade topology, and which forms a superhelical spiral domain. This beta-pinwheel is capable of bending DNA by over 180 degrees over a 40 bp region, possibly by wrapping the DNA around the GyrA C-terminal beta-pinwheel domain.
In topoisomerase IV, although the C-terminal domain forms a similar superhelical spiral to that of DNA gyrase A, it assembles as a broken form of a beta-pinwheel as distinct from that of gyrA, due to the absence of a DNA gyrase-specific GyrA box motif. This difference may account for parC being less efficient than gyrA in mediating DNA-bending, leading to their divergence in terms of activity, where topoisomerase IV acts to relax positive supercoils, and DNA gyrase acts to introduce negative supercoils.
More information about this protein can be found at Protein of the Month: DNA Topoisomerase.
A large ribonuclear protein complex is required for the processing of the small-ribosomal-subunit rRNA - the small-subunit (SSU) processome. This preribosomal complex contains the U3 snoRNA and at least 40 proteins, which have the following properties:
There appears to be a linkage between polymerase I transcription and the formation of the SSU processome, as some, but not all, of the SSU processome components are required for pre-rRNA transcription initiation. These SSU processome components have been termed t-Utps. They form a pre-complex with pre-18S rRNA in the absence of snoRNA U3 and other SSU processome components. It has been proposed that the t-Utp complex proteins are both rDNA- and rRNA-binding proteins that are involved in the initiation of pre-18S rRNA transcription, initially binding to rDNA and then associating with the 5' end of the nascent pre-18S rRNA. The t-Utp complex forms the nucleus around which the rest of the SSU processome components, including snoRNA U3, assemble. From electron microscopy, the SSU processome may correspond to the terminal knobs visualised at the 5' ends of nascent 18S rRNA.
This entry contains Utp11, a large ribonuclear protein that associates with snoRNA U3.
This family contains Utp3 and LCP5, which are components of the U3 ribonucleoprotein complex. It also includes the Homo sapiens (Human) C1D protein and Saccharomyces cerevisiae (Baker's yeast) YHR081W (rrp47), an exosome-associated protein required for the 3' processing of stable RNAs, and Sas10, which has been identified as a regulator of chromatin silencing. This entry also includes the human protein Neuroguidin, an initiation factor 4E (eIF4E)-binding protein.
This domain is found at the C terminus of proteins containing WD40 repeats. These proteins are part of the U3 ribonucleoprotein; the yeast protein is called Utp12 or DIP2. Utp12 specifically interacts with snoRNA U3 and with MPP10.
This domain is found in a family of proteins of unknown function. It appears to be found in eukaryotes and archaebacteria, and occurs associated with a potential metal-binding region in RNase L inhibitor, RLI.
PSP is a proline-rich domain of unknown function found in spliceosome associated proteins.
TRAPP plays a key role in the targeting and/or fusion of ER-to-Golgi transport vesicles with their acceptor compartment. TRAPP is a large multimeric protein that contains at least 10 subunits. This family contains many TRAPP family proteins. The Bet3 subunit is one of the better characterised TRAPP proteins and has a dimeric structure with hydrophobic channels. The channel entrances are located on a putative membrane-interacting surface that is distinctively flat, wide and decorated with positively charged residues. Bet3 is proposed to localise TRAPP to the Golgi.
Proteins synthesised on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer. While clathrin mediates endocytic protein transport, and transport from ER to Golgi, coatomers primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins. For example, the coatomer COPI (coat protein complex I) is responsible for reverse transport of recycled proteins from Golgi and pre-Golgi compartments back to the ER, while COPII buds vesicles from the ER to the Golgi. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and budding from Golgi membranes. Activated small guanosine triphosphatases (GTPases) attract coat proteins to specific membrane export sites, thereby linking coatomers to export cargos. As coat proteins polymerise, vesicles are formed and budded from membrane-bound organelles. Coatomer complexes also influence Golgi structural integrity, as well as the processing, activity, and endocytic recycling of LDL receptors. In mammals, coatomer complexes can only be recruited by membranes associated with ADP-ribosylation factors (ARFs), which are small GTP-binding proteins. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta', gamma, delta, epsilon and zeta subunits.
This entry represents the WD-associated region found in the coatomer alpha, beta and beta' subunits. The alpha subunit (RET1P) of the coatomer complex in Saccharomyces cerevisiae (Baker's yeast) participates in membrane transport between the endoplasmic reticulum and Golgi apparatus. The protein contains six WD-40 repeat motifs in its N-terminal region.
More information about these proteins can be found at Protein of the Month: Clathrin.
The Ccr4-Not complex is a global regulator of gene expression that is conserved from yeast to human. It affects genes positively and negatively and is thought to regulate transcription factor IID function. In Saccharomyces cerevisiae, it exists in two prominent forms and consists of at least nine core subunits: the five Not proteins (Not1p to Not5p), Caf1p, Caf40p, Caf130p and Ccr4p. The Ccr4-Not complex regulates many different cellular functions, including RNA degradation and transcription initiation. It may be a regulatory platform that senses nutrient levels and stress. Caf1p and Ccr4p are directly involved in mRNA deadenylation, and Caf1p is associated with Dhh1p, a putative RNA helicase thought to be a component of the decapping complex. Pop2, a component of the Ccr4-Not complex, functions as a deadenylase.
The Ccr4-Not complex is a global regulator of transcription that affects genes positively and negatively and is thought to regulate transcription factor TFIID.
Radical SAM proteins catalyse diverse reactions, including unusual methylations, isomerisation, sulphur insertion, ring formation, anaerobic oxidation and protein radical formation. Evidence exists that these proteins generate a radical species by reductive cleavage of S-adenosylmethionine (SAM) through an unusual Fe-S centre.
Ssl1-like proteins are 40 kDa subunits of the transcription factor II H complex. This domain is often found associated with the C2H2 type Zn-finger.
This RNA recognition motif 2 is found in Meiosis protein mei2. It is found C-terminal to the RNA-binding region RNP-1.
This entry represents a group of leucine carboxymethyltransferases which methylate the carboxyl group of leucine residues to form alpha-leucine ester residues. It includes LCMT1, which regulates the activity of serine/threonine phosphatase 2A (PP2A) through methylation of the C-terminal leucine residue of the catalytic subunit of PP2A. This affects the heteromultimeric composition of PP2A, which in turn affects protein recognition and substrate specificity. Like many other methyltransferases, LCMT1 uses S-adenosylmethionine (SAM) as the methyl donor. LCMT1 contains the common SAM-dependent methyltransferase core fold, with various insertions and additions creating a specific PP2A binding site. This entry also contains LCMT2, a homologue of LCMT1 which is not necessary for PP2A methylation and whose function is not clear.
Rcd1 (Required for cell differentiation 1)-like proteins are found among a wide range of organisms. Rcd1 was initially identified as an essential factor in nitrogen starvation-invoked differentiation in fission yeast. This results largely from a defect in nitrogen starvation-invoked induction of ste11+, a key transcriptional factor gene required for the onset of sexual development. It is one of the most conserved proteins in eukaryotes, and its mammalian homologue is expressed in a variety of differentiating tissues. The mammalian Rcd1 is a novel transcriptional cofactor and is critical for retinoic acid-induced differentiation of F9 mouse teratocarcinoma cells, at least in part, via forming complexes with retinoic acid receptor and activating transcription factor 2 (ATF-2). Two of the members in this family have been characterised as being involved in regulation of Ste11-regulated sex genes.
The alpha/beta hydrolase fold is common to several hydrolytic enzymes of widely differing phylogenetic origin and catalytic function. The core of each enzyme is similar: an alpha/beta sheet (not a barrel) of eight beta-strands connected by alpha-helices. This entry describes a closely associated region, which is found in a number of lipases.
All DNA replication initiation is driven by a single conserved eukaryotic initiator complex termed the origin recognition complex (ORC). The ORC is a six-protein complex. This entry is subunit 2, which binds the origin of replication. It plays a role in chromosome replication and mating type transcriptional silencing.
The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.
In prokaryotes, the SR receptor is a monomer consisting of the loosely membrane-associated SR-alpha homologue FtsY, while the eukaryotic SR receptor is a heterodimer of SR-alpha (70 kDa) and SR-beta (25 kDa), both of which contain a GTP-binding domain. SR-alpha regulates the targeting of SRP-ribosome-nascent polypeptide complexes to the translocon. SR-alpha binds to the SRP54 subunit of the SRP complex. The SR-beta subunit is a transmembrane GTPase that anchors the SR-alpha subunit (a peripheral membrane GTPase) to the ER membrane. SR-beta interacts with the N-terminal SRX-domain of SR-alpha, which is not present in the bacterial FtsY homologue. SR-beta also functions in recruiting the SRP-nascent polypeptide to the protein-conducting channel.
This entry represents the alpha subunit of the SR receptor.
The 22 kDa peroxisomal membrane protein (PMP22) is a major component of peroxisomal membranes. PMP22 seems to be involved in pore-forming activity and may contribute to the unspecific permeability of the organelle membrane. PMP22 is synthesised on free cytosolic ribosomes and then directed to the peroxisome membrane by specific targeting information. Mpv17 is a closely related peroxisomal protein involved in the development of early-onset glomerulosclerosis.
A member of this family found in Saccharomyces cerevisiae (Baker's yeast) is an integral membrane protein of the inner mitochondrial membrane and has been suggested to play a role in mitochondrial function during heat shock.
This entry represents the C-terminal domain found in DNA/pantothenate metabolism flavoproteins, which affects synthesis of DNA and pantothenate metabolism. These proteins contain ATP, phosphopantothenate, and cysteine binding sites. The structure of this domain has been determined in human phosphopantothenoylcysteine (PPC) synthetase and as the PPC synthase domain (CoaB) from the Escherichia coli coenzyme A bifunctional protein CoaBC. This domain adopts a 3-layer alpha/beta/alpha fold with mixed beta-sheets, which topologically resembles a combination of Rossmann-like and ribokinase-like folds. The structure of these proteins predicts a ping pong mechanism with initial formation of an acyladenylate intermediate, followed by release of pyrophosphate and attack by cysteine to form the final products PPC and AMP.
DNA replication in eukaryotes results from a highly coordinated interaction between proteins, often as part of protein complexes, and the DNA template. One of the key early steps leading to DNA replication is formation of the prereplication complex, or pre-RC. The pre-RC is formed by the sequential binding of the origin recognition complex (ORC), Cdc6 and Cdt1 proteins, and the MCM complex. Activation of the pre-RC into the initiation complex (IC) is achieved via the action of S-phase kinases, eventually leading to the loading of the replication machinery.
Recently, a novel replication complex, GINS (for Go, Ichi, Nii, and San; five, one, two, and three in Japanese), has been identified. The precise function of GINS is not known. However, genetic and two-hybrid interactions indicate that it mediates the loading of the enzymatic replication machinery at a step after the action of the S-phase kinases. Furthermore, GINS may be a part of the replication machinery itself, since it is found associated with replicating DNA. Electron microscopy of GINS shows that it forms a ring-like structure, reminiscent of the structure of PCNA, the DNA polymerase delta replication clamp. This observation, coupled with the observed interactions for GINS, indicates that the complex may represent the replication clamp for DNA polymerase epsilon.
The GINS complex is essential for initiation of DNA replication in Xenopus egg extracts. This 100 kDa stable complex includes Sld5, Psf1, Psf2, and Psf3. Homologues of these components are found also in other eukaryotes. This family of proteins represents the Psf2 component.
Members of this family are spindle pole body (SPB) components such as Spc97, Spc98 and gamma-tubulin. The SPB functions as the microtubule-organising centre in yeast, with the microtubule cytoskeleton playing an essential role in chromosome segregation, cellular organisation and vesicle trafficking in eukaryotic cells. In most cells, the centrosome is the primary microtubule-organising centre that nucleates and organises microtubules. Gamma-tubulin localises to centrosomes and is required for microtubule nucleation. In Saccharomyces cerevisiae, gamma-tubulin forms a stable complex with Spc97 and Spc98.
The redox active metal copper is an essential cofactor in critical biological processes such as respiration, iron transport, oxidative stress protection, hormone production, and pigmentation. A widely conserved family of high-affinity copper transport proteins (Ctr proteins) mediates copper uptake at the plasma membrane. A series of clustered methionine residues in the hydrophilic extracellular domain, and an MXXXM motif in the second transmembrane domain, are important for copper uptake. These methionines probably coordinate copper during the process of metal transport.
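As a minimal illustration, an MXXXM motif of the kind described above can be located in a protein sequence with a regular expression. The sketch below assumes the sequence is a plain string of one-letter amino acid codes; the function name `find_mxxxm` and the toy sequence are hypothetical and are not taken from any real Ctr protein.

```python
import re

# Met, any three residues, Met. The lookahead lets overlapping
# occurrences be reported as separate hits.
MXXXM = re.compile(r"(?=(M[A-Z]{3}M))")


def find_mxxxm(sequence):
    """Return (1-based start position, matched residues) for each MXXXM hit."""
    return [(m.start() + 1, m.group(1)) for m in MXXXM.finditer(sequence.upper())]


# Hypothetical toy sequence, for illustration only.
toy_sequence = "AAMGLLMAVLMCC"
for position, residues in find_mxxxm(toy_sequence):
    print(position, residues)
```

A pattern match of this kind only flags candidate sites; it says nothing about whether the residues lie in a transmembrane domain or actually take part in copper coordination.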
Yeast transcription factor IIIC (TFIIIC) is a multisubunit protein complex that interacts with two control elements of class III promoters called the A and B blocks. This family represents the subunit within TFIIIC involved in B-block binding. Although defined as a yeast protein, it is also found in a number of other organisms.
A large ribonuclear protein complex is required for the processing of the small-ribosomal-subunit rRNA - the small-subunit (SSU) processome. This preribosomal complex contains the U3 snoRNA and at least 40 proteins, which have the following properties:
There appears to be a linkage between polymerase I transcription and the formation of the SSU processome, as some, but not all, of the SSU processome components are required for pre-rRNA transcription initiation. These SSU processome components have been termed t-Utps. They form a pre-complex with pre-18S rRNA in the absence of snoRNA U3 and other SSU processome components. It has been proposed that the t-Utp complex proteins are both rDNA and rRNA binding proteins that are involved in the initiation of pre-18S rRNA transcription, initially binding to rDNA and then associating with the 5' end of the nascent pre-18S rRNA. The t-Utp complex forms the nucleus around which the rest of the SSU processome components, including snoRNA U3, assemble. From electron microscopy, the SSU processome may correspond to the terminal knobs visualized at the 5' ends of nascent 18S rRNA.
Utp21 is a component of the SSU processome, which is required for pre-18S rRNA processing. It interacts with Utp18.
PDCD2 is localized predominantly in the cytosol of cells situated at the opposite pole of the germinal centre from the centroblasts as well as in cells in the mantle zone. It has been shown to interact with BCL6, an evolutionarily conserved Kruppel-type zinc finger protein that functions as a strong transcriptional repressor and is required for germinal centre development. The rat homologue, Rp8, is associated with programmed cell death in thymocytes.
The MIT domain is found in vacuolar sorting proteins, spastin (probable ATPase involved in the assembly or function of nuclear protein complexes), and a sorting nexin, which may play a role in intracellular trafficking.
This family of proteins are predicted to be alpha/beta-knot SAM-dependent RNA methyltransferases.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described.
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.
This group of sequences contains aspartic endopeptidases belonging to MEROPS peptidase family A22 (presenilin family, clan AD): subfamily A22B.
The peptidases were originally classified by hierarchical homology to the most conserved member - IMPAS 1. They are also known as signal peptide peptidase (SPP). They belong to the I-CliP family of peptidases. SPP cleaves remnant signal peptides left behind in the membrane by the action of signal peptidase and also plays key roles in immune surveillance and the maturation of certain viral proteins. SPPs do not require cofactors, as demonstrated by expression in bacteria and purification of a proteolytically active form. The C-terminal region defines the functional domain, which is in itself sufficient for proteolytic activity.
Tim44 is an essential component of the machinery that mediates the translocation of nuclear-encoded proteins across the mitochondrial inner membrane. Tim44 is thought to bind phospholipids of the mitochondrial inner membrane both by electrostatic interactions and by penetrating the polar head group region.
The amino-terminal module of the poxvirus D6R/NIR proteins defines a novel conserved DNA-binding domain (the KilA-N domain) that is found in a wide range of proteins of large bacterial and eukaryotic DNA viruses. Putative proteins with homology to the KilA-N domain have also been identified in Maverick transposable elements of the parabasalid protozoan Trichomonas vaginalis. The KilA-N domain has been suggested to be homologous to the fungal DNA-binding APSES domain. In all proteins shown to contain the KilA-N domain, it occurs at the extreme amino terminus accompanied by a wide range of distinct carboxy-terminal domains. These carboxy-terminal modules may be enzymes, such as the nuclease domains, or might mediate additional, specific interactions with nucleic acids or proteins, like the RING or CCCH fingers in the poxviruses. The KilA-N domain is predicted to adopt an alpha-beta fold with four conserved strands and at least two conserved helices. Some proteins known to contain a KilA-N domain are listed below:
Iron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S]. FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems.
The ISC system is conserved in eubacteria and eukaryotes (mitochondria), and has broad specificity, targeting general FeS proteins. It is encoded by the isc operon (iscRSUA-hscBA-fdx-iscX). IscS is a cysteine desulphurase, which obtains S from cysteine (converting it to alanine) and serves as a S donor for FeS cluster assembly. IscU and IscA act as scaffolds to accept S and Fe atoms, assembling clusters and transferring them to recipient apoproteins. HscA is a molecular chaperone and HscB is a co-chaperone. Fdx is a [2Fe-2S]-type ferredoxin. IscR is a transcription factor that regulates expression of the isc operon. IscX (also known as YfhJ) appears to interact with IscS and may function as an Fe donor during cluster assembly.
The SUF system is an alternative pathway to the ISC system that operates under iron starvation and oxidative stress. It is found in eubacteria, archaea and eukaryotes (plastids). The SUF system is encoded by the suf operon (sufABCDSE), and the six encoded proteins are arranged into two complexes (SufSE and SufBCD) and one protein (SufA). SufS is a pyridoxal-phosphate (PLP) protein displaying cysteine desulphurase activity. SufE acts as a scaffold protein that accepts S from SufS and donates it to SufA. SufC is an ATPase with an unorthodox ATP-binding cassette (ABC)-like component. No specific functions have been assigned to SufB and SufD. SufA is homologous to IscA, acting as a scaffold protein in which Fe and S atoms are assembled into [FeS] cluster forms, which can then easily be transferred to apoprotein targets.
In the NIF system, NifS and NifU are required for the formation of metalloclusters of nitrogenase in Azotobacter vinelandii, and other organisms, as well as in the maturation of other FeS proteins. Nitrogenase catalyses the fixation of nitrogen. It contains a complex cluster, the FeMo cofactor, which contains molybdenum, Fe and S. NifS is a cysteine desulphurase. NifU binds one Fe atom at its N-terminal, assembling an FeS cluster that is transferred to nitrogenase apoproteins. Nif proteins involved in the formation of FeS clusters can also be found in organisms that do not fix nitrogen.
This entry represents IscX proteins (also known as hypothetical protein YfhJ) that are part of the ISC system. IscX is active as a monomer. The structure of YfhJ is an orthogonal alpha-bundle. YfhJ is a small acidic protein that binds IscS, and contains a modified winged helix motif that is usually found in DNA-binding proteins. YfhJ/IscX can bind Fe, and may function as an Fe donor in the assembly of FeS clusters.
Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPases) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation. The PTP superfamily can be divided into four subfamilies:
Based on their cellular localisation, PTPases are also classified as:
All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel beta-sheet with flanking alpha-helices containing a beta-loop-alpha-loop that encompasses the PTP signature motif. Functional diversity between PTPases is endowed by regulatory domains and subunits.
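The C(X)5R signature mentioned above reads as a cysteine, any five residues, then an arginine. As a rough sketch, it can be expressed as a regular expression and scanned against a one-letter amino acid string. The function name `find_ptp_signature` and both toy sequences are hypothetical and chosen only to illustrate the pattern; they are not real PTPase sequences.

```python
import re

# Cys, any five residues, Arg: the C(X)5R pattern.
# The lookahead reports overlapping candidates as separate hits.
PTP_SIGNATURE = re.compile(r"(?=(C[A-Z]{5}R))")


def find_ptp_signature(sequence):
    """Return (1-based start position, matched residues) for each C(X)5R hit."""
    return [(m.start() + 1, m.group(1)) for m in PTP_SIGNATURE.finditer(sequence.upper())]


# Hypothetical toy sequences, for illustration only.
with_arginine = "GGCKLMNPRTT"   # C at position 3, R six residues later
with_proline = "GGCKLMNPPTT"    # same stretch with P in place of the R

print(find_ptp_signature(with_arginine))
print(find_ptp_signature(with_proline))
```

A hit only indicates that the residue spacing is compatible with the signature; catalytic activity also depends on the surrounding fold, so a match should be treated as a candidate site rather than evidence of a functional phosphatase.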
This family includes the mammalian protein tyrosine phosphatase-like protein, PTPLA. A significant variation of PTPLA from other protein tyrosine phosphatases is the presence of proline instead of catalytic arginine at the active site. It is thought that PTPLA proteins have a role in the development, differentiation, and maintenance of a number of tissue types.
The Brix domain is found in a number of eukaryotic proteins including some from Saccharomyces cerevisiae and Homo sapiens, Arabidopsis thaliana Peter Pan-like protein and several hypothetical proteins.
There are six (one archaean and five eukaryotic) protein families which have a similar domain architecture with a central globular Brix domain. They have optional N-terminal and obligatory C-terminal segments, both of which have charged low-complexity regions.
Proteins from the Imp4/Brix superfamily appear to be involved in ribosomal RNA processing, which is essential for the functioning of all cells. The N- and C-terminal halves of a member of the superfamily, Mil, show significant structural similarity to one another. This suggests an origin by means of an ancestral duplication. Both halves have the same fold as the anticodon-binding domain of class IIa aminoacyl-tRNA synthetases, with greater conservation seen in the N-terminal half. Structural evidence suggests that the Imp4/Brix superfamily proteins could bind single-stranded segments of RNA along a concave surface formed by the N-terminal half of their beta-sheet and a central alpha-helix.
The SWIRM domain is a small alpha-helical domain of about 85 amino acid residues found in eukaryotic chromosomal proteins. It is named after the proteins SWI3, RSC8 and MOIRA in which it was first recognised. This domain is predicted to mediate protein-protein interactions in the assembly of chromatin-protein complexes. The SWIRM domain can be linked to different domains, such as the ZZ-type zinc finger, the Myb DNA-binding domain, the HORMA domain, the amino-oxidase domain, the chromo domain, and the JAB1/PAD1 domain.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents the SWIM (SWI2/SNF2 and MuDR) zinc-binding domain, which is found in a variety of prokaryotic and eukaryotic proteins, such as mitogen-activated protein kinase kinase kinase 1 (or MEKK1). It is also found in the related protein MEX (MEKK1-related protein X), a testis-expressed protein that acts as an E3 ubiquitin ligase through the action of E2 ubiquitin-conjugating enzymes in the proteasome degradation pathway; the SWIM domain is critical for MEX ubiquitination. SWIM domains are also found in the homologous recombination protein Sws1, as well as in several hypothetical proteins.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
This entry represents the HIT-type zinc finger, which contains 7 conserved cysteines and one histidine that can potentially coordinate two zinc atoms. It has been named after the first protein that originally defined the domain: the yeast HIT1 protein. The HIT-type zinc finger displays some sequence similarities to the MYND-type zinc finger. The function of this domain is unknown but it is mainly found in nuclear proteins involved in gene regulation and chromatin remodeling. This domain is also found in the thyroid receptor interacting protein 3 (TRIP-3) that specifically interacts with the ligand binding domain of the thyroid receptor.
The Thg1 protein from Saccharomyces cerevisiae (Baker's yeast) is responsible for adding a GMP residue to the 5' end of tRNA His.
Methyltransferases (Mtases) are responsible for the transfer of methyl groups between two molecules. The transfer of the methyl group from the ubiquitous S-adenosyl-L-methionine (AdoMet) to either nitrogen, oxygen or carbon atoms is frequently employed in diverse organisms. The reaction is catalyzed by Mtases and modifies DNA, RNA, proteins or small molecules, such as catechol, for regulatory purposes. Proteins in this entry belong to the RsmE family of Mtases; this is supported by crystal structure studies, which show a close structural homology to other known methyltransferases.
This entry contains RsmE of Escherichia coli, which specifically methylates the uridine in position 1498 of 16S rRNA in the fully assembled 30S ribosomal subunit.
The endoplasmic reticulum (ER) of the yeast Saccharomyces cerevisiae (Baker's yeast) contains a proteolytic system able to selectively degrade misfolded lumenal secretory proteins. For examination of the components involved in this degradation process, mutants were isolated. They could be divided into four complementation groups. The mutations led to stabilisation of two different substrates for this process, and the classes were called der for degradation in the ER. DER1 was cloned by complementation of the der1-2 mutation. The DER1 gene codes for a novel, hydrophobic protein that is localized to the ER. Deletion of DER1 abolished degradation of the substrate proteins, suggesting that the function of the Der1 protein may be specifically required for the degradation process associated with the ER. Interestingly this family seems distantly related to the Rhomboid family of membrane peptidases. This family may also mediate degradation of misfolded proteins.
This protein previously of unknown biochemical function is essential in Escherichia coli. It has now been characterised as 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase, which converts 2C-methyl-D-erythritol 2,4-cyclodiphosphate (ME-2,4CPP) into 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate in the sixth step of nonmevalonate terpenoid biosynthesis. The family is restricted to bacteria, where it is widely but not universally distributed. No homology can be detected between this family and other proteins.
RNA polymerases catalyse the DNA-dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Rpb2 is the second largest subunit of the RNA polymerase. This domain comprises the structural domains anchor and clamp. The clamp region (C-terminal) contains a zinc-binding motif. The clamp region is named for its interaction with the clamp domain found in Rpb1. The domain also contains a region termed switch 4. The switches within the polymerase are thought to signal different stages of transcription.
RNA polymerases catalyse the DNA-dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Rpb2 is the second largest subunit of the RNA polymerase. This domain forms one of the two distinctive lobes of the Rpb2 structure. This domain is also known as the lobe domain. DNA has been demonstrated to bind to the concave surface of the lobe domain, and plays a role in maintaining the transcription bubble. Many of the bacterial members contain large insertions within this domain, a region known as dispensable region 1 (DRI).
RNA polymerases catalyse the DNA-dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain forms one of the two distinctive lobes of the Rpb2 structure. This domain is also known as the protrusion domain. The other lobe, RNA polymerase Rpb2, domain 2, is nested within this domain.
Quality control of intracellular proteins is essential for cellular homeostasis. Molecular chaperones recognise and contribute to the refolding of misfolded or unfolded proteins, whereas the ubiquitin-proteasome system mediates the degradation of such abnormal proteins. Ubiquitin-protein ligases (E3s) determine the substrate specificity for ubiquitylation and have been classified into HECT and RING-finger families. More recently, however, U-box proteins, which contain a domain (the U box) of about 70 amino acids that is conserved from yeast to humans, have been identified as a new type of E3.
Members of the U-box family of proteins constitute a class of ubiquitin-protein ligases (E3s) distinct from the HECT-type and RING finger-containing E3 families. Using yeast two-hybrid technology, all mammalian U-box proteins have been reported to interact with molecular chaperones or co-chaperones, including Hsp90, Hsp70, DnaJc7, EKN1, CRN, and VCP. This suggests that the function of U box-type E3s is to mediate the degradation of unfolded or misfolded proteins in conjunction with molecular chaperones as receptors that recognise such abnormal proteins.
Unlike the RING finger domain that is stabilised by Zn2+ ions coordinated by the cysteines and a histidine, the U-box scaffold is probably stabilised by a system of salt-bridges and hydrogen bonds. The charged and polar residues that participate in this network of bonds are more strongly conserved in the U-box proteins than in classic RING fingers, which supports their role in maintaining the stability of the U box. Thus, the U box appears to have evolved from a RING finger domain by appropriation of a new set of residues required to stabilise its structure, concomitant with the loss of the original, metal-chelating residues.
Sedlin is a 140 amino-acid protein with a putative role in endoplasmic reticulum-to-Golgi transport. Several missense mutations and deletion mutations in the SEDL gene, which result in protein truncation by frame shift, are responsible for spondyloepiphyseal dysplasia tarda, a progressive skeletal disorder (OMIM:313400).
This family contains proteins from the Eukaryota; functionally they are uncharacterised.
This region is found in many but not all ATP-dependent DNA ligase enzymes. It is thought to be involved in DNA binding and in catalysis. In human DNA ligase I, and in Saccharomyces cerevisiae (Baker's yeast), this region was necessary for catalysis, and was separated from the amino terminus by targeting elements. In Vaccinia virus this region was not essential for catalysis, but its deletion decreased the affinity for nicked DNA and decreased the rate of strand joining at a step subsequent to enzyme-adenylate formation.
This group of sequences contains a conserved C-terminal domain which is found in the Schizosaccharomyces pombe (Fission yeast) protein CwfJ. CwfJ is part of the Cdc5p complex involved in mRNA splicing. This domain is found in association with another conserved domain, which is generally N-terminal and adjacent to this domain.
This group of sequences contains a conserved C-terminal domain which is found in the Schizosaccharomyces pombe (Fission yeast) protein CwfJ. CwfJ is part of the Cdc5p complex involved in mRNA splicing. This domain is found in association with another conserved domain, which is generally C-terminal and adjacent to this domain.
This region is found in many but not all ATP-dependent DNA ligase enzymes. It is thought to constitute part of the catalytic core of ATP dependent DNA ligase.
These proteins contain a conserved region found in the yeast YLR168C gene MSF1 product. The function of this protein is unknown, though it is thought to be involved in intra-mitochondrial protein sorting. GFP-tagged MSF1 localizes to mitochondria and is required for wild-type respiratory growth. This region is also found in a number of other eukaryotic proteins. The PRELI/MSF1 domain is a eukaryotic protein module which occurs in stand-alone form in several proteins, including the human PRELI protein and the yeast MSF1 protein, and as an amino-terminal domain in an orthologous group of proteins typified by human SEC14L1, which is conserved in all animals. In this group of proteins, the PRELI/MSF1 domain co-occurs with the CRAL-TRIO and the GOLD domains. The PRELI/MSF1 domain is approximately 170 residues long and is predicted to assume a globular alpha + beta fold with six beta strands and four alpha helices. It has been suggested that the PRELI/MSF1 domain may have a function associated with cellular membranes.
This family includes the yeast and human ASF1 proteins. These proteins have histone chaperone activity. ASF1 participates in both the replication-dependent and replication-independent pathways. The three-dimensional structure has been determined as a compact immunoglobulin-like beta sandwich fold topped by three helical linkers.
Proteins synthesised on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer. While clathrin mediates endocytic protein transport, and transport from ER to Golgi, coatomers primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins. For example, the coatomer COPI (coat protein complex I) is responsible for reverse transport of recycled proteins from Golgi and pre-Golgi compartments back to the ER, while COPII buds vesicles from the ER to the Golgi. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and budding from Golgi membranes. Activated small guanine triphosphatases (GTPases) attract coat proteins to specific membrane export sites, thereby linking coatomers to export cargos. As coat proteins polymerise, vesicles are formed and budded from membrane-bound organelles. Coatomer complexes also influence Golgi structural integrity, as well as the processing, activity, and endocytic recycling of LDL receptors. In mammals, coatomer complexes can only be recruited by membranes associated with ADP-ribosylation factors (ARFs), which are small GTP-binding proteins. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta', gamma, delta, epsilon and zeta subunits.
This entry represents the epsilon subunit of the coatomer complex, which is involved in the regulation of intracellular protein trafficking between the endoplasmic reticulum and the Golgi complex.
More information about these proteins can be found at Protein of the Month: Clathrin.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
COPII (coat protein complex II)-coated vesicles carry proteins from the endoplasmic reticulum (ER) to the Golgi complex. COPII-coated vesicles form on the ER by the stepwise recruitment of three cytosolic components: Sar1-GTP to initiate coat formation, Sec23/24 heterodimer to select SNARE and cargo molecules, and Sec13/31 to induce coat polymerisation and membrane deformation.
Sec23p and Sec24p are structurally related, folding into five distinct domains: a beta-barrel, a zinc-finger, an alpha/beta trunk domain, an all-helical region, and a C-terminal gelsolin-like domain. This entry describes an approximately 55-residue Sec23/24 zinc-binding domain, which lies against the beta-barrel at the periphery of the complex.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
COPII (coat protein complex II)-coated vesicles carry proteins from the endoplasmic reticulum (ER) to the Golgi complex. COPII-coated vesicles form on the ER by the stepwise recruitment of three cytosolic components: Sar1-GTP to initiate coat formation, Sec23/24 heterodimer to select SNARE and cargo molecules, and Sec13/31 to induce coat polymerisation and membrane deformation.
Sec23p and Sec24p are structurally related, folding into five distinct domains: a beta-barrel, a zinc-finger, an alpha/beta trunk domain, an all-helical region, and a C-terminal gelsolin-like domain. This entry describes the Sec23/24 alpha/beta trunk domain, which is formed from a single, approximately 250-residue segment plugged into the beta-barrel between strands beta-1 and beta-19. The trunk has an alpha/beta fold with a vWA topology, and it forms the dimer interface, primarily involving strand beta-14 on Sec23 and Sec24; in addition, the trunk domain of Sec23 contacts Sar1.
COPII (coat protein complex II)-coated vesicles carry proteins from the endoplasmic reticulum (ER) to the Golgi complex. COPII-coated vesicles form on the ER by the stepwise recruitment of three cytosolic components: Sar1-GTP to initiate coat formation, Sec23/24 heterodimer to select SNARE and cargo molecules, and Sec13/31 to induce coat polymerisation and membrane deformation.
Sec23p and Sec24p are structurally related, folding into five distinct domains: a beta-barrel, a zinc-finger, an alpha/beta trunk domain, an all-helical region, and a C-terminal gelsolin-like domain. This entry describes the all-helical domain, which forms an approximately 105-residue segment with the C-terminal 30 residues. The linker between alpha-M and alpha-N contacts Sar1.
There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements, as summarised below:
Type I restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. Type I enzymes have three different subunits - M (modification), S (specificity) and R (restriction) - that form multifunctional enzymes with restriction, methylase and ATPase activities. The S subunit is required for both restriction and modification and is responsible for recognition of the DNA sequence specific for the system. The M subunit is necessary for modification, and the R subunit is required for restriction. These enzymes use S-Adenosyl-L-methionine (AdoMet) as the methyl group donor in the methylation reaction, and have a requirement for ATP. They recognise asymmetric DNA sequences split into two domains of specific sequence, one 3-4 bp long and another 4-5 bp long, separated by a nonspecific spacer 6-8 bp in length. Cleavage occurs a considerable distance from the recognition sites, rarely less than 400 bp away and up to 7000 bp away. Adenosyl residues are methylated, one on each strand of the recognition sequence. These enzymes are widespread in eubacteria and archaea. In enteric bacteria they have been subdivided into four families: types IA, IB, IC and ID.
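The split recognition site described above (two specific half-sites separated by a nonspecific spacer) can be expressed as a pattern search. The following is a minimal sketch, not a tool used by the database: the function names are hypothetical, and the half-sites "AAC" and "GTGC" with a 6 bp spacer are illustrative placeholders chosen to fit the stated length ranges.

```python
import re

def bipartite_site_regex(left_half, right_half, min_spacer=6, max_spacer=8):
    """Compile a pattern: left half-site, nonspecific spacer, right half-site."""
    # The lookahead lets candidate sites that overlap each other all be reported.
    pattern = (
        f"(?=({left_half}[ACGT]{{{min_spacer},{max_spacer}}}{right_half}))"
    )
    return re.compile(pattern)

def find_sites(dna, site_regex):
    """Return (0-based start, matched sequence) for each candidate site."""
    return [(m.start(), m.group(1)) for m in site_regex.finditer(dna.upper())]

# Illustrative placeholder half-sites (3 bp and 4 bp) with a fixed 6 bp spacer.
site = bipartite_site_regex("AAC", "GTGC", min_spacer=6, max_spacer=6)
hits = find_sites("ttAACctgatcGTGCaa", site)
print(hits)
```

Only one strand is scanned here; because the sites are asymmetric, a fuller search would also scan the reverse complement.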
Type III restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. Type III enzymes are hetero-oligomeric, multifunctional proteins composed of two subunits, Res and Mod. The Mod subunit recognises the DNA sequence specific for the system and is a modification methyltransferase; as such it is functionally equivalent to the M and S subunits of type I restriction endonuclease. Res is required for restriction, although it has no enzymatic activity on its own. Type III enzymes recognise short 5-6 bp long asymmetric DNA sequences and cleave 25-27 bp downstream to leave short, single-stranded 5' protrusions. They require the presence of two inversely oriented unmethylated recognition sites for restriction to occur. These enzymes methylate only one strand of the DNA, at the N-6 position of adenosyl residues, so newly replicated DNA will have only one strand methylated, which is sufficient to protect against restriction. Type III enzymes belong to the beta-subfamily of N6 adenine methyltransferases, containing the nine motifs that characterise this family, including motif I, the AdoMet binding pocket (FXGXG), and motif IV, the catalytic region (S/D/N (PP) Y/F).
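The requirement for two inversely oriented recognition sites can be checked by searching both orientations of an asymmetric site. This is a rough sketch under stated assumptions: the function names are hypothetical, the site "CAGCAG" is an illustrative placeholder, and methylation status (which the text says also matters) is not modelled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase or lowercase DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def find_all(dna, site):
    """Start positions of every (possibly overlapping) occurrence of site."""
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions

def site_orientations(dna, site):
    """Positions of the site as written and in the inverse orientation."""
    dna, site = dna.upper(), site.upper()
    forward = find_all(dna, site)
    inverted = find_all(dna, reverse_complement(site))
    return forward, inverted

def has_inversely_oriented_pair(dna, site):
    """True if the site occurs at least once in each orientation."""
    forward, inverted = site_orientations(dna, site)
    return bool(forward) and bool(inverted)

# Placeholder asymmetric site, present once in each orientation.
print(site_orientations("AACAGCAGTTTTCTGCTGAA", "CAGCAG"))
```

Two copies of the site in the same orientation would return False, which mirrors the stated requirement that the sites be inversely oriented.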
This entry represents the R subunit (HsdR) of type I restriction endonucleases, the Res subunit of type III endonucleases, and the B subunit of excinuclease ABC (uvrB).
Glutamate synthase (GltS) is a key enzyme in the early stages of the assimilation of ammonia in bacteria, yeasts, and plants. In bacteria, L-glutamate is involved in osmoregulation, is the precursor for other amino acids, and can be the precursor for haem biosynthesis. In plants, GltS is especially essential in the reassimilation of ammonia released by photorespiration. On the basis of the amino acid sequence and the nature of the electron donor, three different classes of GltS can be defined as follows: 1) ferredoxin-dependent GltS (Fd-GltS), 2) NADPH-dependent GltS (NADPH-GltS), and 3) NADH-dependent GltS (properties of the three classes have been reviewed extensively). The enzyme is a complex iron-sulphur flavoprotein catalysing the reductive transfer of the amido nitrogen from L-glutamine to 2-oxoglutarate to form two molecules of L-glutamate via intramolecular channelling of ammonia from the amidotransferase domain to the FMN-binding domain.
Reaction of amidotransferase domain:
L-glutamine + H2O = L-glutamate + NH3
Reactions of FMN-binding domain:
2-oxoglutarate + NH3 = 2-iminoglutarate + H2O
2e + FMNox = FMNred
2-iminoglutarate + FMNred = L-glutamate + FMNox
The central domain of glutamate synthase connects the N-terminal amidotransferase domain with the FMN-binding domain and has an alpha/beta overall topology.
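The four partial reactions listed above can be summed to confirm that the channelled intermediates cancel and that two molecules of L-glutamate are formed overall. The sketch below is only a bookkeeping illustration: the species names follow the reactions as written, protons are not tracked (they are not listed in the reactions above), and the function name is hypothetical.

```python
from collections import Counter

# Negative coefficients are consumed, positive coefficients are produced.
PARTIAL_REACTIONS = [
    {"L-glutamine": -1, "H2O": -1, "L-glutamate": 1, "NH3": 1},
    {"2-oxoglutarate": -1, "NH3": -1, "2-iminoglutarate": 1, "H2O": 1},
    {"e-": -2, "FMNox": -1, "FMNred": 1},
    {"2-iminoglutarate": -1, "FMNred": -1, "L-glutamate": 1, "FMNox": 1},
]

def net_reaction(reactions):
    """Sum stoichiometric coefficients and drop species that cancel out."""
    totals = Counter()
    for reaction in reactions:
        for species, coefficient in reaction.items():
            totals[species] += coefficient
    return {species: c for species, c in totals.items() if c != 0}

net = net_reaction(PARTIAL_REACTIONS)
print(net)
```

Ammonia, water, 2-iminoglutarate and both FMN states cancel, leaving L-glutamine, 2-oxoglutarate and two electrons on the consumed side and two L-glutamate on the produced side.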
This entry appears to represent a novel family of basic helix-loop-helix (bHLH) proteins that control differentiation and development of a variety of organs.
Human Nulp1 is a basic helix-loop-helix protein expressed broadly during early embryonic organogenesis. Overexpression of human Nulp1 in COS-7 cells inhibits the transcriptional activity of serum response factor (SRF), suggesting that Nulp1 may act as a novel bHLH transcriptional repressor in the SRF signalling pathway to mediate cellular functions.
In eukaryotes, polyadenylation of pre-mRNA plays an essential role in the initiation step of protein synthesis, as well as in the export and stability of mRNAs. Poly(A) polymerase, the enzyme at the heart of the polyadenylation machinery, is a template-independent RNA polymerase that specifically incorporates ATP at the 3' end of mRNA. The crystal structure of bovine poly(A) polymerase bound to an ATP analogue at 2.5 Å resolution has been determined. The structure revealed expected and unexpected similarities to other proteins. As expected, the catalytic domain of poly(A) polymerase shares substantial structural homology with other nucleotidyl transferases such as DNA polymerase beta and kanamycin transferase.
The C-terminal domain unexpectedly folds into a compact domain reminiscent of the RNA-recognition motif fold. The three invariant aspartates of the catalytic triad ligate two of the three active site metals. One of these metals also contacts the adenine ring. Furthermore, conserved, catalytically important residues contact the nucleotide. These contacts, taken together with metal coordination of the adenine base, provide a structural basis for ATP selection by poly(A) polymerase.
In eukaryotes, polyadenylation of pre-mRNA plays an essential role in the initiation step of protein synthesis, as well as in the export and stability of mRNAs. Poly(A) polymerase, the enzyme at the heart of the polyadenylation machinery, is a template-independent RNA polymerase that specifically incorporates ATP at the 3' end of mRNA. The crystal structure of bovine poly(A) polymerase bound to an ATP analogue at 2.5 Å resolution has been determined. The structure revealed expected and unexpected similarities to other proteins. As expected, the catalytic domain of poly(A) polymerase shares substantial structural homology with other nucleotidyl transferases such as DNA polymerase beta and kanamycin transferase.
The central domain of Poly(A) polymerase shares structural similarity with the allosteric activity domain of ribonucleotide reductase R1, which comprises a four-helix bundle and a three-stranded mixed beta-sheet. Even though the two enzymes bind ATP, the ATP-recognition motifs are different.
This is a family of eukaryotic ribosomal biogenesis regulatory proteins.
This conserved region is found in a number of eukaryotic proteins, including the ribosome biogenesis protein (BMS) which may act as a molecular switch during maturation of the 40S ribosomal subunit in the nucleolus.
RNA polymerases catalyse the DNA-dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 1, represents the clamp domain, which is a mobile domain involved in positioning the DNA, maintenance of the transcription bubble and positioning of the nascent RNA strand.
RNA polymerase II is one of the three forms of RNA polymerase that exist in eukaryotic nuclei. The C-terminal region of the largest subunit of this oligomeric enzyme consists of the tandem repeat of a conserved heptapeptide. The number of repeats varies according to the species (for example there are 17 in Plasmodium, 26 in yeast, 44 in Drosophila, and 52 in mammals). The region containing these repeats is essential for the function of polymerase II. This repeated heptapeptide (called CT7n or CTD) is rich in hydroxyl groups. It probably projects out of the globular catalytic domain and may interact with the acidic activator domains of transcriptional regulatory proteins. It is also known to bind by intercalation to DNA. RNA polymerase II is activated by phosphorylation. The serine and threonine residues in the CT7n repeats are the target of such phosphorylation.
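Counting tandem heptapeptide repeats and estimating how hydroxyl-rich the region is can be sketched as below. The consensus heptad YSPTSPS is an assumption brought in for illustration (the text above does not give the repeat sequence), the function names are hypothetical, and exact matching will undercount real C-terminal regions, whose repeats often deviate from the consensus.

```python
import re

# Assumed consensus heptad; not stated in the entry text above.
CONSENSUS_HEPTAD = "YSPTSPS"
# Residues carrying a side-chain hydroxyl group.
HYDROXYL_RESIDUES = set("STY")

def count_tandem_repeats(sequence, unit=CONSENSUS_HEPTAD):
    """Number of units in the longest uninterrupted run of exact repeats."""
    runs = re.findall(f"(?:{unit})+", sequence.upper())
    return max((len(run) // len(unit) for run in runs), default=0)

def hydroxyl_fraction(sequence):
    """Fraction of residues that are Ser, Thr or Tyr (0.0 for empty input)."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return sum(res in HYDROXYL_RESIDUES for res in sequence) / len(sequence)

# Synthetic sequence: 26 exact repeats flanked by unrelated residues.
toy_ctd = "MK" + CONSENSUS_HEPTAD * 26 + "EE"
print(count_tandem_repeats(toy_ctd), hydroxyl_fraction(CONSENSUS_HEPTAD))
```

A tolerant matcher (for example, allowing one substitution per heptad) would be needed to approach the repeat counts quoted above for real sequences.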
This presumed domain is found at the C terminus of lariat debranching enzyme. This domain is always found in association with a metallo-phosphoesterase domain. RNA lariat debranching enzyme is capable of digesting a variety of branched nucleic acid substrates and multicopy single-stranded DNAs. The enzyme degrades intron lariat structures during splicing.
This region is found in a number of histone lysine methyltransferases (HMTase), N-terminal to the SET domain; it is generally described as the pre-SET domain.
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines and in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone target and the roles of the invariant residues in the SET domain in determining the methylation specificities.
The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These nine cysteines coordinate three zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together two long segments of random coil and stabilizing the SET domain.
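The stoichiometry above (nine cysteines, three zinc ions, four cysteine ligands per zinc) implies that some cysteines must be shared between zinc ions. A short worked count follows, assuming that every cysteine ligates either one zinc ion (terminal, t) or two (bridging, b); this assumption is not stated in the text above.

```latex
\begin{align*}
\text{Zn-S contacts} &= 3 \times 4 = 12 \\
t + b &= 9 \\
t + 2b &= 12 \\
\Rightarrow \quad b &= 3, \qquad t = 6
\end{align*}
```

Under this assumption, six cysteines each bind a single zinc ion and three are shared between pairs of zinc ions, which is consistent with a triangular arrangement.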
The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity.
This domain is associated with eukaryotic proteins of unknown function, which are hydrolase-like.
This family contains the lanthionine synthetase C-like proteins 1 and 2, which are related to the bacterial lanthionine synthetase components C (LanC). LANCL1 (P40 seven-transmembrane-domain protein) and LANCL2 (testes-specific adriamycin sensitivity protein) are thought to be peptide-modifying enzyme components in eukaryotic cells. Both proteins are produced in large quantities in the brain and testes and may have a role in the immune surveillance of these organs.
In Arabidopsis thaliana (Mouse-ear cress) GCR2 is a plasma-membrane abscisic acid receptor, which interacts with GPA1 to mediate all known ABA responses in A. thaliana.
The Kri1 protein is also known as KRR1-interacting protein 1. The Saccharomyces cerevisiae member of this family is found to be required for the assembly of preribosomal 40S subunits in the nucleolus. KRR1 is highly expressed in dividing cells and its expression ceases almost completely when cells enter the stationary phase.
This entry represents a subgroup of the KRR1 interacting protein 1.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents the Zim17-type zinc finger motif thought to bind zinc. This domain is found in a number of eukaryotic proteins and is named after a short C-terminal motif of D(N/H)L. The domain is found in proteins having a novel zinc-finger essential for protein import into mitochondria.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
This motif is about 40 residues long and is probably formed of two alpha-helices. It is found in the Dpy-30 proteins, hence the motif's name. Dpy-30 from Caenorhabditis elegans is an essential component of dosage compensation machinery and loss of dpy-30 activity results in XX-specific lethality; in XO animals, Dpy-30 is required for developmental processes other than dosage compensation. In yeast, the homologue of DPY-30, Saf19p, functions as part of the Set1 complex that is necessary for the methylation of histone H3 at lysine residue 4; Set1 is a key part of epigenetic developmental control. There is also a human homologue of Dpy-30. This Dpy-30 region may be a dimerisation motif analogous to that found in the regulatory (R) subunit of type II cAMP-dependent protein kinase (PKA).
Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.
MutS is a modular protein with a complex structure, and is composed of:
Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.
This entry represents the connector domain (domain 2) found in proteins of the MutS family. The structure of the MutS connector domain consists of a parallel beta-sheet surrounded by four alpha helices, which is similar to the structure of the Holliday junction resolvase ruvC.
Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.
MutS is a modular protein with a complex structure, and is composed of:
Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.
This entry represents the clamp domain (domain 4) found in proteins of the MutS family. The clamp domain is inserted within the core domain at the top of the lever helices. It has a beta-sheet structure.
Adenylate kinases (ADK) are phosphotransferases that catalyse the Mg-dependent reversible conversion of ATP and AMP to two molecules of ADP, an essential reaction for many processes in living cells. In large variants of adenylate kinase, the AMP and ATP substrates are buried in a domain that undergoes conformational changes from an open to a closed state when bound to substrate; the ligand is then contained within a highly specific environment required for catalysis. Adenylate kinase is a 3-domain protein consisting of a large central CORE domain flanked by a LID domain on one side and the AMP-binding NMPbind domain on the other. The LID domain binds ATP and covers the phosphates at the active site. The substrates first bind the CORE domain, followed by closure of the active site by the LID and NMPbind domains.
Comparisons of adenylate kinases have revealed a particular divergence in the active site lid. In some organisms, particularly the Gram-positive bacteria, residues in the lid domain have been mutated to cysteines and these cysteine residues (two CX(n)C motifs) are responsible for the binding of a zinc ion. The bound zinc ion in the lid domain is clearly structurally homologous to zinc-finger domains. However, it is unclear whether the adenylate kinase lid is a novel zinc-finger DNA/RNA binding domain, or whether the lid-bound zinc serves a purely structural function.
Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.
MutS is a modular protein with a complex structure, and is composed of:
Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.
This entry represents the core domain (domain 3) found in proteins of the MutS family. The core domain of MutS adopts a multi-helical structure comprised of two subdomains, which are interrupted by the clamp domain. Two of the helices in the core domain comprise the levers that extend towards the DNA.
Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
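The difference between the loose HEXXH motif and the more stringent 'abXHEbbHbc' definition can be illustrated with two patterns. This is a sketch under several assumptions that go beyond the text: 'a' is restricted to valine or threonine (the text says only "most often"), "uncharged" is taken to exclude D, E, K and R, the hydrophobic set is a common but arbitrary choice, and proline is excluded from every variable position. The test peptides are illustrative, and the function name is hypothetical.

```python
import re

# Residue classes; all three are assumptions made for this illustration.
UNCHARGED = "ACFGHILMNQSTVWY"        # no D, E, K, R; proline also excluded
HYDROPHOBIC = "AFILMVWY"
ANY_BUT_PRO = "ACDEFGHIKLMNQRSTVWY"

LOOSE = re.compile(r"HE..H")
STRICT = re.compile(
    f"[VT][{UNCHARGED}][{ANY_BUT_PRO}]HE"
    f"[{UNCHARGED}][{UNCHARGED}]H[{UNCHARGED}][{HYDROPHOBIC}]"
)

def find_motifs(sequence, pattern):
    """Return (0-based start, match) pairs, including overlapping matches."""
    overlapping = re.compile(f"(?=({pattern.pattern}))")
    return [(m.start(), m.group(1)) for m in overlapping.finditer(sequence.upper())]

peptide = "GIDVVAHELTHAVTDY"   # illustrative peptide that satisfies both patterns
decoy = "TVAHEDKHAV"           # has HEXXH, but charged residues at 'b' positions

print(find_motifs(peptide, LOOSE), find_motifs(peptide, STRICT))
print(find_motifs(decoy, LOOSE), find_motifs(decoy, STRICT))
```

The decoy shows why the stringent form is useful: the loose pattern matches it, while the stringent pattern rejects it.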
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
These metallopeptidases belong to MEROPS peptidase family M16 (clan ME). They include proteins classified as non-peptidase homologues, which either have been found experimentally to be without peptidase activity or lack amino acid residues that are believed to be essential for catalytic activity.
The peptidases in this group of sequences include:
These proteins do not share many regions of sequence similarity; the most noticeable is in the N-terminal section. This region includes a conserved histidine followed, two residues later, by a glutamate and another histidine. In pitrilysin, it has been shown that this H-x-x-E-H motif is involved in enzymatic activity; the two histidines bind zinc and the glutamate is necessary for catalytic activity. The mitochondrial processing peptidase consists of two structurally related domains. One is the active peptidase whereas the other, the C-terminal region, is inactive. The two domains hold the substrate like a clamp.
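As a rough illustration of the H-x-x-E-H motif described above, the sketch below locates the motif in a sequence and labels the residues by the roles stated in the text (two zinc-binding histidines and a catalytic glutamate). The function name, the returned field names and the test peptide are hypothetical, and a sequence match alone does not show that a protein is an active peptidase.

```python
import re

# H, any two residues, E, H. The lookahead allows overlapping hits.
HXXEH = re.compile(r"(?=(H..EH))")


def annotate_hxxeh(sequence):
    """Return 1-based residue positions for each H-x-x-E-H occurrence."""
    hits = []
    for match in HXXEH.finditer(sequence.upper()):
        first = match.start(1) + 1  # convert to 1-based numbering
        hits.append({
            "motif": match.group(1),
            "zinc_ligand_his": (first, first + 4),
            "catalytic_glu": first + 3,
        })
    return hits


# Made-up peptide for illustration only.
example = "AAGLSHFLEHMLF"
for hit in annotate_hxxeh(example):
    print(hit)
```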
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
This entry represents a probable zinc-binding motif that contains four cysteines and may chelate zinc, known as the DPH-type after the diphthamide (DPH) biosynthesis protein in which it was first characterised, including the proteins DPH3 and DPH4. This domain is also found associated with the N-terminal domain of the heat shock protein DnaJ.
Diphthamide is a unique post-translationally modified histidine residue found only in translation elongation factor 2 (eEF-2). It is conserved from archaea to humans and serves as the target for diphtheria toxin and Pseudomonas exotoxin A. These two toxins catalyse the transfer of ADP-ribose to diphthamide on eEF-2, thus inactivating eEF-2, halting cellular protein synthesis, and causing cell death. The biosynthesis of diphthamide is dependent on at least five proteins, DPH1 to -5, and a still unidentified amidating enzyme. DPH3 and DPH4 share a conserved region, which encodes a putative zinc finger, the DPH-type or CSL-type (after the conserved motif of the final cysteine) zinc finger. The function of this motif is unknown.
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Alanine dehydrogenases and pyridine nucleotide transhydrogenase have been shown to share regions of similarity. Alanine dehydrogenase catalyses the NAD-dependent reversible reductive amination of pyruvate into alanine. Pyridine nucleotide transhydrogenase catalyses the reduction of NADP+ to NADPH with the concomitant oxidation of NADH to NAD+. This enzyme is located in the plasma membrane of prokaryotes and in the inner membrane of the mitochondria of eukaryotes. The transhydrogenation between NADH and NADP is coupled with the translocation of a proton across the membrane. In prokaryotes the enzyme is composed of two different subunits, an alpha chain (gene pntA) and a beta chain (gene pntB), while in eukaryotes it is a single-chain protein. The sequences of alanine dehydrogenases from several bacterial species are related to those of the alpha subunit of bacterial pyridine nucleotide transhydrogenase and of the N-terminal half of the eukaryotic enzyme. The two most conserved regions correspond respectively to the N-terminal extremity of these proteins, represented in this entry, and to a central glycine-rich region which is part of the NAD(H)-binding site.
This putative domain is found in the MoeZ protein and the MoeB protein. The domain has two CXXC motifs that are only partly conserved. MoeZ is necessary for the synthesis of pyridine-2,6-bis(thiocarboxylic acid), a small secreted metabolite that has a high affinity for transition metals, increases iron uptake efficiency by 20% in Pseudomonas stutzeri, has the ability to reduce both soluble and mineral forms of iron, and has antimicrobial activity towards several species of bacteria. MoeB is the molybdopterin synthase activating enzyme in the molybdopterin cofactor biosynthesis pathway. Both these enzymes are members of a superfamily consisting of related but structurally distinct proteins that are members of pathways involved in the transfer of sulphur-containing moieties to metabolites and both also contain the UBA/THIF-type NAD/FAD binding fold.
This family consists of several uncharacterised eukaryotic proteins.
This family consists of several eukaryotic AAR2-like proteins. The Saccharomyces cerevisiae protein AAR2 is involved in splicing pre-mRNA of the a1 cistron and other genes that are important for cell growth.
This entry contains the Baculovirus immediate-early protein IE-0.
Trophinin and tastin form a cell adhesion molecule complex that potentially mediates an initial attachment of the blastocyst to uterine epithelial cells at the time of implantation. Trophinin and tastin bind to an intermediary cytoplasmic protein called bystin. Bystin may be involved in implantation and trophoblast invasion because it is found with trophinin and tastin in the cells at human implantation sites and also in the intermediate trophoblasts at the invasion front in the placenta from early pregnancy. This family also includes the Saccharomyces cerevisiae protein ENP1. ENP1 is an essential protein in S. cerevisiae and is localised in the nucleus. It is thought that ENP1 plays a direct role in the early steps of rRNA processing, as enp1-defective S. cerevisiae cannot synthesise 20S pre-rRNA and hence 18S rRNA, which leads to reduced formation of 40S ribosomal subunits.
This alignment represents the conserved core region of a ~90 residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to Hyalin and the PKD domain suggest an Ig-like fold so this family may be similar in function to the and protein families.
This family consists of eukaryotic membrane proteins. It was previously annotated as including a putative receptor for human cytomegalovirus gH, but this has since been disputed. Analysis of the mouse Tapt1 protein (transmembrane anterior posterior transformation 1) has shown it to be involved in patterning of the vertebrate axial skeleton.
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
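The relationship between the linear order of the triad residues and the clan can be sketched as a simple lookup. This is an illustration of the three examples given above only, not a clan-assignment method: the function names are hypothetical, other clans and residue orders exist, and the residue numbers used in the usage lines are the commonly cited numbering for chymotrypsin and subtilisin, included here as assumptions.

```python
# Triad orders and clans taken from the three examples in the text above.
TRIAD_ORDER_TO_CLAN = {"HDS": "PA", "DHS": "SB", "SDH": "SC"}


def triad_order(ser, asp, his):
    """Return the one-letter codes of the triad residues in sequence order."""
    positions = {"S": ser, "D": asp, "H": his}
    return "".join(sorted(positions, key=positions.get))


def clan_from_triad(ser, asp, his):
    """Map a triad order to a clan, or None if the order is not listed."""
    return TRIAD_ORDER_TO_CLAN.get(triad_order(ser, asp, his))


# Assumed, commonly cited residue numbering:
# chymotrypsin His57/Asp102/Ser195, subtilisin Asp32/His64/Ser221.
print(triad_order(ser=195, asp=102, his=57), clan_from_triad(195, 102, 57))
print(triad_order(ser=221, asp=32, his=64), clan_from_triad(221, 32, 64))
```

Returning None for an unlisted order keeps the sketch from implying a clan where the text gives no basis for one.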
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This signature defines the C-terminal proteolytic domain of the archaeal, bacterial and eukaryotic lon proteases, which are ATP-dependent serine peptidases belonging to the MEROPS peptidase family S16 (lon protease family, clan SF). In the eukaryotes the majority of the proteins are located in the mitochondrial matrix. In yeast, Pim1 is located in the mitochondrial matrix, is required for mitochondrial function, and is constitutively expressed, with expression increasing after thermal stress, suggesting that Pim1 may play a role in the heat shock response.
Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
(Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).
Pirh2 is a eukaryotic ubiquitin protein ligase, which has been shown to promote p53 degradation in mammals. Pirh2 physically interacts with p53 and promotes ubiquitination of p53 independently of MDM2. Like MDM2, Pirh2 is thought to participate in an autoregulatory feedback loop that controls p53 function. Pirh2 proteins contain three distinct zinc fingers: the CHY-type, the CTCHY-type (which is C-terminal to the CHY-type zinc finger) and a RING finger. The CHY-type zinc finger has no currently known function.
As well as Pirh2, the CHY-type zinc finger is also found in the following proteins:
The solution structure of this zinc finger has been solved; the zinc finger binds 3 zinc atoms, as shown in the following schematic representation:
More information about these proteins can be found at Protein of the Month: Zinc Fingers.
A number of Fe-S cluster-containing hydro-lyases share a conserved motif, including argininosuccinate lyase, adenylosuccinate lyase, aspartase, class I fumarate hydratase (fumarase), and tartrate dehydratase. Proteins in this group represent a subset of closely related proteins or modules, including the Escherichia coli tartrate dehydratase alpha chain and the N-terminal region of the class I fumarase (where the C-terminal region is homologous to the tartrate dehydratase beta chain). The activity of the archaeal proteins in this group is unknown.
A number of Fe-S cluster-containing hydro-lyases share a conserved motif, including argininosuccinate lyase, adenylosuccinate lyase, aspartase, class I fumarate hydratase (fumarase), and tartrate dehydratase. Proteins in this group represent a subset of closely related proteins or modules, including the Escherichia coli tartrate dehydratase beta chain and the C-terminal region of the class I fumarase (where the N-terminal region is homologous to the tartrate dehydratase alpha chain). The activity of the archaeal proteins in this group is unknown.
The process of vesicular fusion with target membranes depends on a set of SNAREs (SNAP-Receptors), which are associated with the fusing membranes. Target SNAREs (t-SNAREs) are localised on the target membrane and belong to two different families, the syntaxin-like family and the SNAP-25-like family. One member of each family, together with a v-SNARE localised on the vesicular membrane, is required for fusion.
The syntaxins are type-I transmembrane proteins that contain several regions with coiled-coil propensity in their cytosolic part, the SNARE motif. SNAP-25 is a protein consisting of two coiled-coil regions, which is associated with the membrane by lipid anchors. SNARE motifs assemble into parallel four-helix bundles stabilised by the burial of the hydrophobic helix faces in the bundle core. Monomeric SNARE motifs are disordered, so this assembly reaction is accompanied by a dramatic increase in alpha-helical secondary structure. The parallel arrangement of SNARE motifs within complexes brings the transmembrane anchors, and the two membranes, into close proximity. Recently, it was shown that the two coiled-coil regions of SNAP-25 and one of the coiled-coil regions of the syntaxins are related. This domain is found in both the syntaxin and SNAP-25 families as well as in other proteins.
The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cogn