Volkhard Helms
Saarland University, Center for Bioinformatics, Saarland Informatics Campus, Postfach 15 11 50, 66041 Saarbrücken, Germany
Protein sizes range from very small proteins, such as the 20-amino acid miniprotein Trp cage, to the largest protein in the human body, titin, which consists of about 27,000 amino acids and has a molecular weight of about 3 million daltons. Generally, when speaking of typical proteins, we refer to compact proteins of about 80 to 500 amino acids (residues) in size. Tiessen et al. reported that archaeal proteins had the smallest average size (283 aa), followed by bacterial proteins (320 aa) and eukaryotic proteins (472 aa) [1]. Among eukaryotes, plant proteins (392 aa) had a smaller size, whereas animal proteins (486 aa) and proteins from fungi (487 aa) were larger.
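The relation between chain length and molecular weight quoted above can be checked with a back-of-the-envelope calculation. The following is a minimal sketch that assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from the text; the function name `estimate_mass_da` is hypothetical.

```python
# Rough protein mass estimate from chain length.
# ASSUMPTION: ~110 Da average mass per amino acid residue (rule of thumb,
# not a value given in the chapter).
AVG_RESIDUE_MASS_DA = 110.0


def estimate_mass_da(n_residues: int, avg_residue_mass: float = AVG_RESIDUE_MASS_DA) -> float:
    """Return an approximate molecular mass in daltons for a chain of n_residues."""
    if n_residues < 0:
        raise ValueError("number of residues must be non-negative")
    return n_residues * avg_residue_mass


if __name__ == "__main__":
    # Chain lengths as quoted in the text.
    for name, length in [("Trp cage", 20), ("titin", 27_000)]:
        mass = estimate_mass_da(length)
        print(f"{name}: {length} aa, roughly {mass / 1000:.1f} kDa")
```

Under this assumption, titin comes out at close to 3 million daltons, consistent with the figure given above.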
The larger a single protein gets, the higher the chance that it will be composed of multiple structurally distinct "domains." These are typically sequential parts of the protein sequence with a characteristic length between 100 and 200 amino acids [2]. For example, the protein Src kinase consists of an SH3 domain (that binds to proline-rich peptides), an SH2 domain (that binds to phosphorylated tyrosine residues), and the catalytic kinase domain (see Figure 1.1). In the inactive state, the SH3 domain holds on to the proline-containing linker that connects the SH2 and catalytic domains, and the SH2 domain holds on to a phosphorylated tyrosine in the C-terminal tail of the catalytic domain. Thereby, all three domains are locked in a conformationally restricted state. Once the kinase is activated by dephosphorylation of this tyrosine, these contacts are released, and the catalytic domain can undergo the characteristic Pacman-type opening/closing motion of protein kinases, enabling the binding of adenosine triphosphate (ATP). In the closed conformation, the active site residues catalyze transfer of the terminal γ-phosphate of ATP to a nearby tyrosine of a substrate protein bound on the Src kinase surface. The catalytic domain of kinases itself consists of two domain-like "lobes," a smaller N-terminal lobe (of about 80 aa) and a larger C-terminal lobe (of about 180 aa).
Figure 1.1 X-ray structure (PDB code 1AD5) of human Src kinase. The peptide sequence starts with an SH3 domain (top left), followed by an SH2 domain (bottom left) and then leads to the catalytic kinase domain (right). ATP is bound between small (top) and large lobe (bottom) of the kinase domain.
Source: Figure generated with NGL viewer.
Although multi-domain proteins exist in all life forms, more complex organisms (having a larger number of unique cell types) contain more unique domains and a larger fraction of multi-domain proteins: eukaryotes have more multi-domain proteins than prokaryotes, and animals have more multi-domain proteins than unicellular eukaryotes [3].
The composition of a protein depends on its environment and its posttranslational modifications, such as phosphorylation and sumoylation. For example, the extracellular domains of cell membrane proteins are often extensively glycosylated. Here, we will focus on the varying mixture of the 20 commonly occurring amino acids that make up most existing proteins. Water-soluble proteins possess a rather hydrophobic core and a polar surface that is in contact with the cytoplasm. This clear organizational principle provides the main driving force for the folding of water-soluble domains via the "hydrophobic effect."
Prokaryotic proteins contain more than 10% leucine and about 9% alanine residues, but rather few (only 1-2%) cysteine, tryptophan, histidine, and methionine residues [4]. Brüne et al. compared the amino acid composition of prokaryotic and eukaryotic proteins [5]. Eukaryotes have the highest variability for proline, cysteine, and asparagine. Amino acids showing high variability across species are lysine, alanine, and isoleucine, whereas histidine, tryptophan, and methionine vary the least. Cysteine is more common in eukaryotes than in archaea and bacteria, whereas isoleucine is less abundant in eukaryotes. The authors also analyzed the differential usage of amino acids in domains and linkers. Proline and glutamine and, to a smaller extent, polar and charged amino acids are more common in linkers, which are rather exposed to the surrounding water. Globular domains contain larger fractions of hydrophobic amino acids, such as leucine and valine, and aromatic ones, such as phenylalanine and tyrosine.
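Composition statistics of this kind are obtained by counting residues in one-letter sequences. The sketch below illustrates the idea only and is not the method used in the cited studies; the two sequences are made-up toy examples (not real proteins), and the function name `composition` is hypothetical.

```python
from collections import Counter

# One-letter codes of the 20 commonly occurring amino acids.
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence: str) -> dict[str, float]:
    """Return the fraction of each standard amino acid in a one-letter sequence."""
    seq = sequence.upper()
    unknown = set(seq) - set(STANDARD_AMINO_ACIDS)
    if not seq or unknown:
        raise ValueError(f"empty sequence or non-standard residues: {sorted(unknown)}")
    counts = Counter(seq)
    return {aa: counts[aa] / len(seq) for aa in STANDARD_AMINO_ACIDS}


if __name__ == "__main__":
    # HYPOTHETICAL toy sequences, chosen only to mimic the trends described above.
    toy_linker = "GSPQPQSPEK"   # rich in proline, glutamine, and polar residues
    toy_domain = "MLVLAFVLIY"   # rich in hydrophobic and aromatic residues
    for label, seq in [("linker", toy_linker), ("domain", toy_domain)]:
        comp = composition(seq)
        top = sorted(comp.items(), key=lambda item: item[1], reverse=True)[:3]
        print(label, [(aa, round(frac, 2)) for aa, frac in top])
```

Applied to whole proteomes rather than toy strings, the same counting yields per-species percentages of the kind quoted above.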
Folded proteins contain two types of secondary structure elements, α-helices and β-sheets. α-Helices have lengths between 9 and 37 residues with a peak at 11 amino acids [6]. The β-strands that make up β-sheets are considerably shorter, being 2-17 residues long with a peak at 5 residues [7]. The secondary structure content of proteins ranges from purely helical proteins, such as myoglobin, containing six α-helices (see Figure 1.2), over mixed α/β proteins to so-called β-barrels, such as green fluorescent protein (GFP) (see Figure 1.3) or Omp membrane pores in the outer membranes of gram-negative bacteria. Secondary structure elements provide stability to the protein structure and serve, e.g., to anchor the catalytic residues of the active site at precise positions relative to each other (see below). α-Helices are also the structural basis of coiled coils (see Figure 1.4), because the helices can nicely pack against each other. α-Helices are frequently used by transcription factors, such as GCN4, at the DNA-binding interface, where the α-helices can intercalate in the major or minor grooves of the DNA double helix.
Active sites of enzymes are locations where substrate molecules undergo chemical modifications while being bound to the enzyme. Figure 1.5 shows the active site of the serine protease chymotrypsinogen A with the characteristic catalytic residues serine, histidine, and aspartic acid. In principle, discussing enzymatic mechanisms is out of scope for this book, which mostly deals with interactions that proteins engage in. Some multienzyme complexes having multiple active sites assemble to enable the product of one reaction to be passed from one active site to another, where it becomes the substrate of a follow-up chemical reaction. Generally, access to active sites should not be precluded by binding to other interaction partners, although, in some cases, binding patches need to be close to the active site, e.g., when a kinase binds its substrate on a patch on the surface of the large lobe so that a phosphate group can be transferred from bound ATP to a serine residue of the bound substrate as mentioned before.
Figure 1.2 X-ray structure (PDB code 1MBN) of myoglobin from Physeter catodon. The porphyrin cofactor is anchored between six alpha helices.
Figure 1.3 X-ray structure of the green fluorescent protein from Aequorea victoria (PDB code 1EMA). The barrel-shaped structure is formed by 11 beta-strands surrounding a central alpha-helix holding the chromophore.
Source: Figure generated with UCSF Chimera.
Figure 1.4 X-ray structure of GCN4 dimer from S. cerevisiae forming a so-called coiled coil and bound here to DNA (PDB code 1YSA).
Figure 1.5 Catalytic triad - aspartic acid, histidine, serine - in the active site of a serine protease.
Source: European Molecular Biology Laboratory (EMBL).
Often, the active sites of enzymes are located on the protein surface, so that substrates can easily bind while remaining partially solvent exposed. A frequent structural motif is a flexible protein loop that reaches over the bound substrate, e.g., in HIV protease (see Figure 1.6). In other cases, the active site is located inside the protein, such as for cytochrome P450 enzymes or acetylcholine esterase. There, substrates need to pass into the protein structure through a channel that may be up to several nanometers long (see Figure 1.7). The main purpose of such an arrangement is to place the substrate in a low-dielectric cavity that enables complicated chemical reactions to take place. Note that the strength of electrostatic interactions is inversely proportional to the dielectric constant of the environment. In a low-dielectric environment, charged protein residues can exert stronger electron-pulling or -pushing effects on the substrate. Enzyme active sites, ligand binding sites, or translocation pores of ion channels can either reside in individual protein units or in between the interfaces of multimers.
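The inverse dependence of electrostatic interactions on the dielectric constant can be made concrete with Coulomb's law for two point charges in a uniform medium. The sketch below is a simplified illustration under stated assumptions: the dielectric constants of about 80 for water and about 4 for a protein interior are commonly used approximate values that are not taken from the text, and the function name `coulomb_energy` is hypothetical.

```python
# Coulomb constant in units common in molecular modeling:
# kcal * Angstrom / (mol * e^2)
COULOMB_KCAL_A = 332.06


def coulomb_energy(q1: float, q2: float, distance_angstrom: float, dielectric: float) -> float:
    """Coulomb energy (kcal/mol) of two point charges (in units of e) in a uniform medium."""
    if distance_angstrom <= 0 or dielectric <= 0:
        raise ValueError("distance and dielectric constant must be positive")
    return COULOMB_KCAL_A * q1 * q2 / (dielectric * distance_angstrom)


if __name__ == "__main__":
    # ASSUMED illustrative dielectric constants (not values from the chapter).
    eps_water = 80.0
    eps_protein_interior = 4.0
    # Two opposite unit charges separated by 5 Angstrom.
    e_water = coulomb_energy(+1.0, -1.0, 5.0, eps_water)
    e_cavity = coulomb_energy(+1.0, -1.0, 5.0, eps_protein_interior)
    print(f"in water:       {e_water:.2f} kcal/mol")
    print(f"in the cavity:  {e_cavity:.2f} kcal/mol")
    print(f"ratio cavity/water: {e_cavity / e_water:.1f}")
```

With these assumed values, the same pair of charges interacts about 20 times more strongly in the low-dielectric cavity than in water, because the energy scales with the inverse of the dielectric constant.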
Figure 1.6 X-ray...