Review
Epigenetic regulation of glycosylation is the quantum mechanics of biology

https://doi.org/10.1016/j.bbagen.2013.08.017

Highlights

  • The majority of proteins are glycosylated.

  • Glycan parts of proteins perform numerous structural and functional roles.

  • There are no genetic templates for glycans; instead, glycans are defined by dynamic interaction between genes and environment.

  • Epigenetic changes enable adaptation to variations in environment.

  • Epigenetic regulation of glyco-genes is a powerful evolutionary tool.

Abstract

Background

Most proteins are glycosylated, with glycans being integral structural and functional components of a glycoprotein. In contrast to polypeptides, which are fully encoded by the corresponding gene, glycans result from a dynamic interaction between the environment and a network of hundreds of genes.

Scope of review

Recent developments in glycomics, genomics and epigenomics are discussed in the context of an evolutionary advantage for higher eukaryotes over microorganisms, conferred by the complexity and adaptability which glycosylation adds to their proteome.

Major conclusions

Inter-individual variation of glycome composition in the human population is large; glycome composition is affected by both genes and environment; epigenetic regulation of “glyco-genes” has been demonstrated; and several mechanisms for transgenerational inheritance of epigenetic marks have been documented.

General significance

Epigenetic recording of acquired characteristics and their transgenerational inheritance could be important mechanisms used by higher organisms to compete or collaborate with microorganisms.

Introduction

With the ability to sequence genomes in a matter of hours, along with the accompanying technical advances in other “omics,” biology is ripe for a scientific revolution analogous to the one which transformed the field of physics in the early 20th century. Newton's laws of motion are still as useful today as they were in the 17th century when first formulated. However, certain properties of matter could not be explained until a complete paradigm shift took place with the introduction of quantum mechanics. Biology today faces a similar challenge: with the theory of Darwinian evolution by natural selection still undisputed as a cornerstone of modern biology, certain aspects of adaptation to selection pressure cannot be adequately explained by changes in single protein structures alone. Rather, the complexity, which lies inside as well as outside of the genome itself and in the intricate network of interactions belonging to other “omics,” begins to emerge as an important evolutionary force. Two frequently overlooked “omics” – glycomics and epigenomics – are the missing pieces of the puzzle and a key to better understanding of biology, which might soon prove as important to that discipline as the introduction of quantum mechanics had been for physics. Despite the scarcity of hard evidence, the big picture is already emerging from the recent studies, which makes this exciting new field, i.e. epigenetics of glycosylation, ripe to be reviewed in the context of evolution.

One major point that the simple Darwinian model of evolution fails to explain adequately is the huge difference in the rate of reproduction between prokaryotic microorganisms and higher eukaryotes, plants and animals in particular. For example, the majority of animals will have at most – and only if we consider extreme examples – several thousand surviving offspring in their lifetimes, while a single bacterium can generate billions of progeny bacteria in a single day. Clearly, if higher eukaryotes are to keep up with this evolutionary arms race without getting overrun by sheer numbers, they have to look for a source of diversity and rapid adaptation somewhere other than in their reproductive capacity. This matching of the evolutionary rate (and the speed of general adaptation) is achieved in higher eukaryotes by modifying their proteins not only by direct change in the amino acid sequence – which takes a full generation to be established – but also by attaching other molecules, such as glycans, to their surface, thereby changing their function and enormously increasing diversity, which compensates for their slower reproduction rate.
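The scale of this gap can be illustrated with simple doubling arithmetic. The sketch below assumes idealized, unconstrained binary fission with a fixed doubling time; the function name and the 30-minute doubling time are illustrative assumptions rather than values taken from this review, and real populations are limited by nutrients long before such numbers are reached.

```python
def progeny_after(hours: float, doubling_time_min: float) -> int:
    """Cells descended from one founder after `hours` of unconstrained binary fission."""
    # Number of completed doublings in the given time window.
    doublings = int(hours * 60 // doubling_time_min)
    return 2 ** doublings


# Hypothetical doubling time of 30 minutes (assumption for illustration only).
cells_one_day = progeny_after(24, 30)
print(f"{cells_one_day:.2e}")  # 48 doublings, i.e. 2**48 cells
```

Even under this deliberately crude model, a single day yields far more descendants than the several thousand offspring quoted above for the most prolific animals over a whole lifetime.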

Another point, which Darwinian evolution does not explain adequately, is the shaping of development and functional integration of trillions of cells in a multicellular organism. The way cells are organized into higher-order structures (tissues, organs) is written in the genome, but in a way that is not nearly as explicit as how the structure of a protein is encoded. An intricate and both functionally and structurally complex system such as the human brain is fully defined by a set of slightly more than 20,000 protein-coding genes [1], but it is obvious that the complexity required to produce such a delicate structure must go well beyond the simple, straightforward action of 20,000 elements. Indeed, recent progress in genomics paints a new picture, showing the protein-coding genes as blueprints for basic tools which a living cell uses to maintain homeostasis and highlights the importance of regulatory elements [2]. While features of a tool (protein structure) do reflect on the final product it was used to make, it is the application of that tool which mainly defines what is built. Very similar sets of genes can serve as blueprints for different organisms, and the main difference is how the (almost identical) genes are used (cf. human vs. chimpanzee). Recent results support this view by showing that a large number of regulatory elements in humans is lineage-specific [3], highlighting the importance of regulatory elements (vs. protein-coding genes) in defining the blueprint of an organism.

The genome, with its structural (protein-coding) and regulatory elements, defines an organism by serving as a template giving rise to a complex network of interacting biological molecules. Such modular biological networks capture some key properties of life: robustness and evolvability [4]. Much of the complexity required to adapt to the environment and organize cells into intricate assemblies is encoded in such networks, and therefore only indirectly in the genome. Major components of those biological networks integrating cellular processes across all the different “omics” are epigenetics, which adds to the genome another layer of information about when, where and how a coding sequence will be read, and glycosylation, with its capacity to significantly alter protein structure and function.

Posttranslational modifications alter and enrich protein structure and function. By far the most complex of these modifications is glycosylation. The vast majority of human proteins are glycosylated [5], with most proteins targeted to the cell's membrane system getting the core glycan attached during their synthesis in the endoplasmic reticulum (N-linked glycosylation), with further processing and O-linked glycosylation occurring in the Golgi apparatus. Glycan parts of proteins perform numerous important structural and functional roles [6]. Actually, once the glycan part is added to the polypeptide backbone, it becomes completely irrelevant whether -OH, -NH2 or -COO groups belong to the polypeptide or the glycan part. They all together form the integral molecular structure (Fig. 1) that performs specific physiological functions [7]. However, the big difference between a polypeptide and a glycoprotein is that there is no direct genetic template for glycan parts of glycoproteins. In contrast to polypeptides, which are fully defined by nucleotide sequence in the corresponding genes, glycans are defined by a large dynamic network of both genetic and environmental factors [8], [9]. In addition to genetic polymorphisms in the participating genes, regulation of gene expression, posttranslational modifications, and the activity of the corresponding proteins work together to determine the structure of a glycan. Through this process, the environment participates in shaping the final structure of a glycoprotein.

Biosynthesis of glycans requires many monosaccharide building blocks, and their availability significantly affects the structure of glycans and the composition of the glycome [10]. Altered pH in the Golgi [11], [12], oxygen concentration [13] and many other external factors also affect protein glycosylation. Subcellular localization of enzymes, activated monosaccharide donor substrates and glycan acceptor substrates can also affect the final outcome [14]. We are only beginning to understand the details of the intricate enzymatic network which controls the manner in which proteins are glycosylated [15]. Recently initiated genome-wide association studies (GWAS) of the human glycome [16], [17], [18] have started to identify new and unexpected genes which are involved in this process, and further progress in this field is expected to map the complex network of genes which regulates protein glycosylation [19].

All these “glyco-genes” (glycosyltransferases, glycosidases and other genes involved in complex biosynthetic pathways of glycans) are regulated on the transcriptional level not only by general transcription factors, but also by chromatin-modifying activities including ATP-dependent remodeling complexes as well as histone modifying complexes, which add/remove covalent groups (phosphate, acetyl and methyl groups, etc.) to/from histone tails. These chromatin-modifying activities act in concert with DNA methylation to create epigenetic information, which not only determines gene transcription status, but also changes this status in response to external and intrinsic signals, in order to achieve appropriate functional change in protein glycosylation (Fig. 2). The mediator role of epigenetic mechanisms between genes, environment [20] and the final glycoprotein structure and function has great potential for the evolution of multicellular life [9]. For example, the repertoire of glycan structures that can be produced by epigenetic changes in glyco-gene expression can be very large [21]. The addition of glycans to polypeptide backbones increases the complexity of the proteome by several orders of magnitude. This increased structural capacity and its dynamic flexibility enable complex eukaryotes to perform numerous complex functions. For example, fine tuning of IgG function and the regulation of the cell surface half-life of membrane proteins seem to be regulated, at least in large part, by alternative glycosylation [22], [23]. The role of alternative glycosylation in the function of the important developmental regulator Notch has also been well documented [24]. Recent population studies of both the total plasma glycome [25] and the glycome of an individual protein [26] revealed great inter-individual differences in glycome composition, while individual glycome composition was remarkably stable [27], [28]. Up to 50% of the observed variations were heritable [25], with limited effects of directly acting environmental factors on the majority of glycans [29].
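The statement that glycans increase proteome complexity by orders of magnitude can be made concrete with a simple count. The sketch below assumes, purely for illustration, that each glycosylation site on a polypeptide is independently either unoccupied or occupied by one of a fixed number of alternative glycan structures; the function name and the site and structure counts are hypothetical, and real glycosylation sites are neither independent nor equally variable.

```python
def glycoform_count(sites: int, glycans_per_site: int) -> int:
    """Distinct glycoforms of one polypeptide when each site is either
    unoccupied or carries one of `glycans_per_site` alternative structures."""
    return (glycans_per_site + 1) ** sites


# Hypothetical protein: 3 glycosylation sites, 30 alternative glycans per site.
print(glycoform_count(3, 30))  # 31**3 = 29791
```

Under these assumptions a single gene product with only three sites already corresponds to tens of thousands of structurally distinct molecules, without any change in the amino acid sequence.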

A particular gene expression pattern is established by epigenetic marks and then memorized, meaning inherited through cell divisions [30]. However, the epigenome also provides the genome with a certain plasticity, owing to the ability of epigenetic marks to change rapidly in response to the environment (the so-called epigenetic on/off switch) and to the reversible nature of this change. These short-term memory epigenetic effects [31] are mostly achieved by quick alterations in histone marks, rather than changes in DNA methylation. DNA methylation also changes during a lifetime, either stochastically or in response to environmental factors. However, this change is less rapid and represents long-term memory effects [32], since the addition of a methyl group to a cytosine is a more stable epigenetic mark than are histone modifications. In order to have an impact on evolution, the newly established alterations in the gene expression pattern should be passed through gametes to the next generation, with alterations in DNA methylation being a plausible mechanism. The body of evidence demonstrating transgenerational epigenetic inheritance in both plants and animals is growing rapidly [33], [34], [35], [36], [37], [38], [39], [40]; this phenomenon is developing into an exciting topic in the field of epigenetics. In their extensive review, Jablonka and Raz [41] gave an impressive table of over a hundred examples of inherited epigenetic variations for organisms ranging from Caenorhabditis elegans to humans. Molecular mechanisms for transgenerational epigenetic inheritance have been extensively studied, but full molecular characterization of epigenetic transfer through gametes to the next generation/generations is not yet available for any organism.

Most of the studies performed in animals have identified incomplete epigenetic resetting of DNA methylation as the most probable mechanism for transfer of epigenetic information through gametes [42], [43], [44]. In order to pass to the next generation, DNA methylation variations (i.e. epialleles) have to slip through the two waves of epigenetic reprogramming, during gametogenesis and early embryogenesis [42]. There are some valuable examples of DNA methylation-mediated transgenerational inheritance by incomplete erasure of methylation marks [34], [43], [45], [46]. The recent prominent study of Skinner and coworkers [40] has shown how subtle environmentally induced changes in cytosine methylation can have a dramatic effect on the transcriptome of different tissues mediated by “epigenetic control regions,” even in the F3 generation. DNA methylation is mechanistically interrelated with other chromatin components such as histone modifications and/or action of small non-coding RNA molecules. However, data are scarce for histone-mediated transgenerational inheritance by incomplete replacement of histones by protamines [47], retention of the centromeric histone H3 variant CENP-A in mammalian sperm [48], or by direct modifications to sperm chromatin [39]. These, and other epigenetic inheritance systems, such as self-sustaining feedback loops, structural inheritance and small RNAs [41], have nevertheless not been as rigorously explored in animal models as they have been in plants [49], [50], [51].

An exciting (albeit the least explored) epigenetic inheritance system is the action of small non-coding RNAs of various origins. Evidence is accumulating that this mechanism can be responsible for epigenetic effects lasting through multiple generations. In rats and some other mammals, epigenetic effects mediated through non-coding RNAs are recorded within 3–4 generations [44], [52], [53] and in some insects even for 10–15 generations [41]. A recent outstanding study in C. elegans has shown that small interfering virus-derived viRNAs are involved in transgenerational epigenetic inheritance through 30 generations in the absence of the genetic template and even in the absence of the functional small RNA-generating machinery [35]. Mammalian spermatocytes and oocytes are filled with piwi-interacting RNAs (piRNAs) [54], [55], responsible for silencing of retrotransposons and other repetitive elements in germ line cells. Therefore, these and some other similar, yet undiscovered, RNA molecules could be candidates for the transgenerational epigenetic inheritance through germ-line cells in humans.

Glycans are synthesized through complex biochemical pathways in which many genes are involved. The final glycan structure is influenced as much by genetic polymorphisms as by environmental factors, with epigenetic mechanisms playing a mediator role between the environment and glyco-gene expression. Indeed, many glyco-genes with a role in normal development [56], [57] are epigenetically regulated (Table 1). These glyco-genes show different epigenetic regulation in normal cells and in cancer [58], [59], a connection which is sometimes established through the influence of epigenetically controlled glyco-genes on other cellular processes such as apoptosis [60]. Cancer-specific glycans are expressed in many types of cancer, such as colon cancer, where they are the products of epigenetic deregulation either by promoter methylation [61], [62] or by histone modifications [63]. Other examples include bladder [64], ovary [65], gastric [62] and pancreatic [66] cancer. Also, epigenetic deregulation of other glycosylation-related genes, such as transcription factors, has been shown to have an effect on glycome composition and the disease outcome [67]. Treatments of cells in culture with epigenetic inhibitors reveal that N-glycome profiles change drastically, which is an indication that many glyco-genes and glycosylation-related genes are regulated both by DNA methylation and histone modifications [65], [68], [69]. Finally, tissue-specific epigenetic control of glyco-genes has been recently found in brain [57], which implies a role of glycosylation in development. Hard evidence is thus accumulating to support the very important role of epigenetically controlled glycosylation in differentiation and adaptation. Also, epigenetic regulation of protein glycosylation might represent an important road from homeostasis to complex diseases such as diabetes [67], cardiovascular diseases, or cancer.

Evolutionary significance of epigenetic variations and epimutations has been widely discussed by Jablonka and Raz [41]. By combining these mechanisms for the inheritance of “acquired” characteristics with the power of glycosylation machinery to create novel structures, higher organisms could have generated a powerful mechanism for creation of large structural variability through environmentally mediated, transgenerationally inherited epigenetic changes. The evolutionary impact of this mechanism could be immense. Glycans are the main receptors for virtually all pathogenic and commensal microorganisms, and higher organisms have complex mechanisms to modulate these interactions [70], [71]. Human populations exposed to pathogens develop resistance mechanisms, which are poorly understood, but the presence of these mechanisms is clearly evident from devastating effects of relatively benign diseases like smallpox, chicken pox, or measles which decimated Native American populations after being transferred from Europe [72]. It is tempting to speculate that the European resistance to diseases endemic to the Old World resulted from gene expression patterns developed as an adaptation to specific pathogens, leading to adaptive glycosylation in the immune system, which was passed to the next generations by epigenetic mechanisms. This speculation is not as far-fetched as it might sound at first: glycans have a well-known role in modulating immunity, especially the IgG-class antibodies. Through stable epigenetic alteration in glyco-gene expression, complex organisms could develop and maintain novel structural features without introducing probably deleterious changes (such as mutations) in their genomes (Fig. 2). If epigenetic inheritance systems are able to transmit information for newly created structures to the next generation/generations through germ-line cells, this would give complex organisms a powerful tool to compete with the high speed of evolution of pathogenic microorganisms. For example, at the moment it is only fine details of glycan structures which make humans resistant to H5N1 avian influenza virus [73].

With large and currently unmanageable amounts of data generated in the research of various “omics,” a paradigm shift is beginning to take place in the field of biology, where reductionism is giving way to the study of life as a complex system. While we begin to make sense of the vast amounts of genomic, epigenomic, transcriptomic, proteomic, glycomic, metabolomic, lipidomic and other data, a big picture emerges and we begin to understand the biological networks that hide most of the complexity of life. Epigenetic regulation of glycosylation is beginning to move into the spotlight because of the prominent role in that network, where it generates the diversity that higher eukaryotes require to assemble complex structures, adapt to the ever changing environment and interact with microorganisms. By looking beyond the raw genome data, we are beginning to see a new, more insightful picture of life — as that picture crystallizes before us, it promises to truly become the “quantum mechanics of biology.”

References (75)

  • G.W. van der Heijden et al. Transmission of modified nucleosomes from the mouse male germline to the zygote and subsequent remodeling of paternal chromatin. Dev. Biol. (2006)

  • U. Syrbe et al. Differential regulation of P-selectin ligand expression in naive versus memory CD4+ T cells: evidence for epigenetic regulation of involved glycosyltransferase genes. Blood (2004)

  • Y. Kizuka et al. Brain-specific expression of N-acetylglucosaminyltransferase IX (GnT-IX) is regulated by epigenetic histone modifications. J. Biol. Chem. (2011)

  • Y.S. Kim et al. Aberrant expression of carbohydrate antigens in cancer: the role of genetic and epigenetic regulation. Gastroenterology (2008)

  • A. Caretti et al. DNA methylation and histone modifications modulate the beta1,3 galactosyltransferase beta3Gal-T5 native promoter in cancer cells. Int. J. Biochem. Cell Biol. (2012)

  • Y. Chihara et al. Loss of blood group A antigen expression in bladder cancer caused by allelic loss and/or methylation of the ABO gene. Lab. Invest. (2005)

  • Y. Ide et al. Aberrant expression of N-acetylglucosaminyltransferase-IVa and IVb (GnT-IVa and b) in pancreatic cancer. Biochem. Biophys. Res. Commun. (2006)

  • T. Horvat et al. Epigenetic modulation of the HeLa cell membrane N-glycome. Biochim. Biophys. Acta (2012)

  • L.M. Chen et al. In vitro evolution of H5N1 avian influenza virus toward human-type receptor specificity. Virology (2012)

  • J. Serpa et al. Expression of Lea in gastric cancer cell lines depends on FUT3 expression regulated by promoter methylation. Cancer Lett. (2006)

  • International Human Genome Sequencing Consortium. Finishing the euchromatic sequence of the human genome. Nature (2004)

  • E. Pennisi. Genomics. ENCODE project writes eulogy for junk DNA. Science (2012)

  • L.D. Ward et al. Evidence of abundant purifying selection in humans for recently acquired regulatory functions. Science (2012)

  • A. Hintze et al. Evolution of complex modular biological networks. PLoS Comput. Biol. (2008)

  • K.W. Moremen et al. Vertebrate protein glycosylation: diversity, synthesis and function. Nat. Rev. Mol. Cell Biol. (2012)

  • R.T. Lee et al. Glycoproteomics: protein modifications for versatile functions. EMBO Rep. (2005)

  • V. Zoldoš et al. Genomics and epigenomics of the human glycome. Glycoconj. J. (2013)

  • V. Zoldoš et al. Epigenetic regulation of protein glycosylation. Biomol. Concepts (2010)

  • A. Rivinoja et al. Elevated Golgi pH impairs terminal N-glycosylation by inducing mislocalization of Golgi glycosyltransferases. J. Cell. Physiol. (2009)

  • G. Lauc et al. Complex genetic regulation of protein glycosylation. Mol. Biosyst. (2010)

  • J.E. Huffman et al. Polymorphisms in B3GAT1, SLC9A9 and MGAT5 are associated with variation within the human plasma N-glycome of 3533 European adults. Hum. Mol. Genet. (2011)

  • G. Lauc et al. Genomics meets glycomics – the first GWAS study of human N-glycome identifies HNF1alpha as a master regulator of plasma protein fucosylation. PLoS Genet. (2010)

  • G. Lauc et al. Loci associated with N-glycosylation of human immunoglobulin G show pleiotropy with autoimmune diseases and haematological cancers. PLoS Genet. (2013)

  • R. Jaenisch et al. Epigenetic regulation of gene expression: how the genome integrates intrinsic and environmental signals. Nat. Genet. (2003)

  • G. Lauc et al. Protein glycosylation – an evolutionary crossroad between genes and environment. Mol. Biosyst. (2010)

  • J.W. Dennis et al. Adaptive regulation at the cell surface by N-glycosylation. Traffic (2009)

  • A. Knežević et al. Variability, heritability and environmental determinants of human plasma N-glycome. J. Proteome Res. (2009)