  • Letters to Nature

Sequence and analysis of chromosome 1 of the plant Arabidopsis thaliana

Abstract

The genome of the flowering plant Arabidopsis thaliana has five chromosomes (refs 1, 2). Here we report the sequence of the largest, chromosome 1, in two contigs of around 14.2 and 14.6 megabases. The contigs extend from the telomeres to the centromeric borders, regions rich in transposons, retrotransposons and repetitive elements such as the 180-base-pair repeat. The chromosome represents 25% of the genome and contains about 6,850 open reading frames, 236 transfer RNAs (tRNAs) and 12 small nuclear RNAs. There are two clusters of tRNA genes at different places on the chromosome. One consists of 27 tRNA(Pro) genes and the other contains 27 tandem repeats of tRNA(Tyr)-tRNA(Tyr)-tRNA(Ser) genes. Chromosome 1 contains about 300 gene families with clustered duplications. There are also many repeat elements, representing 8% of the sequence.
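
The figures quoted in the abstract can be combined into a few derived quantities. The following is a minimal sketch, not part of the published analysis: the variable and function names are illustrative, the inputs are the rounded values from the abstract, and the derived numbers (for example the implied genome size, or the gene count in the tandem cluster, which assumes every repeat unit is complete) are back-of-the-envelope estimates only.

```python
# Approximate figures quoted in the abstract (rounded values).
CONTIG_MB = (14.2, 14.6)      # lengths of the two contigs, in megabases
ORF_COUNT = 6850              # approximate number of open reading frames
REPEAT_FRACTION = 0.08        # repeat elements as a share of the sequence
GENOME_SHARE = 0.25           # chromosome 1 as a share of the genome
TANDEM_REPEAT_UNITS = 27      # tRNA(Tyr)-tRNA(Tyr)-tRNA(Ser) repeat units
GENES_PER_UNIT = 3            # assumption: each unit holds three intact genes


def summarize():
    """Return rough quantities derived from the abstract's figures."""
    total_mb = sum(CONTIG_MB)
    return {
        "total_mb": total_mb,
        "kb_per_orf": total_mb * 1000 / ORF_COUNT,
        "repeat_mb": total_mb * REPEAT_FRACTION,
        "implied_genome_mb": total_mb / GENOME_SHARE,
        "tandem_cluster_trna_genes": TANDEM_REPEAT_UNITS * GENES_PER_UNIT,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.1f}")
```

Under these assumptions the two contigs total about 28.8 megabases, or roughly one open reading frame per 4.2 kilobases.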

Figure 1: Density of various features along chromosome 1.
Figure 2: Clusters of tRNA genes in chromosome 1.
Figure 3

References

  1. Goodman, H., Ecker, J. R. & Dean, C. The genome of Arabidopsis thaliana. Proc. Natl Acad. Sci. USA 93, 10831–10835 (1995).

  2. Meyerowitz, E. M. in Arabidopsis (eds Meyerowitz, E. M. & Somerville, C.) 21–36 (Cold Spring Harbor Press, Cold Spring Harbor, NY, 1994).

  3. Goffeau, A. et al. Life with 6000 genes. Science 274, 546–567 (1996).

  4. C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: A platform for investigating biology. Science 282, 2012–2046 (1998).

  5. Adams, M. D. et al. The genome sequence of Drosophila melanogaster. Science 287, 2185–2195 (2000).

  6. Lin, X. et al. Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana. Nature 402, 761–768 (1999).

  7. Mayer, K. et al. Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana. Nature 402, 769–777 (1999).

  8. Mozo, T. et al. A complete BAC-based physical map of the Arabidopsis thaliana genome. Nature Genet. 22, 271–275 (1999).

  9. Marra, M. et al. A map for sequence analysis of the Arabidopsis thaliana genome. Nature Genet. 22, 265–270 (1999).

  10. Creusot, F. et al. The CIC library: a large insert YAC library for genome mapping in Arabidopsis thaliana. Plant J. 8, 763–770 (1995).

  11. Ewens, W. J. et al. Genome mapping with anchored clones: theoretical aspects. Genomics 11, 799–805 (1991).

  12. Venter, J. C., Smith, H. O. & Hood, L. A new strategy for sequencing. Nature 381, 364–366 (1996).

  13. Choi, S., Creelman, R. A., Mullet, J. E. & Wing, R. Construction and characterization of a bacterial artificial chromosome library of Arabidopsis thaliana. Plant Mol. Biol. Rep. 13, 124–128 (1995).

  14. Mozo, T., Fischer, S., Shizuya, H. & Altmann, T. Construction and characterization of the IGF Arabidopsis BAC library. Mol. Gen. Genet. 258, 562–570 (1998).

  15. Round, E. K., Flowers, S. K. & Richards, E. J. Arabidopsis thaliana centromere regions: genetic map positions and repetitive DNA structure. Genome Res. 7, 1045–1053 (1997).

  16. Richards, E. J., Chao, S., Vongs, A. & Yang, J. Characterization of Arabidopsis thaliana telomeres isolated in yeast. Nucleic Acids Res. 20, 4039–4046 (1992).

  17. Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997).

  18. Lister, C. & Dean, C. Recombinant inbred lines for mapping RFLP and phenotypic markers in Arabidopsis thaliana. Plant J. 4, 745–750 (1993).

  19. The Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796–815 (2000).

  20. Beier, D., Stange, N., Gross, H. J. & Beier, H. Nuclear tRNA(Tyr) genes are highly amplified at a single chromosomal site in the genome of Arabidopsis thaliana. Mol. Gen. Genet. 225, 72–80 (1991).

  21. Copenhaver, G. P. et al. Genetic definition and sequence analysis of Arabidopsis centromeres. Science 286, 2468–2474 (1999).

  22. Conner, J. A., Conner, P., Nasrallah, M. E. & Nasrallah, J. B. Comparative mapping of the Brassica S locus region and its homolog in Arabidopsis: implications for the evolution of mating systems in the Brassicaceae. Plant Cell 10, 801–812 (1998).

  23. Rottmann, W. E. et al. 1-aminocyclopropane-1-carboxylate synthase in tomato is encoded by a multigene family whose transcription is induced during fruit and floral senescence. J. Mol. Biol. 222, 937–961 (1991).

  24. Salanoubat, M. et al. Sequence and analysis of chromosome 3 of the plant Arabidopsis thaliana. Nature 408, 820–822 (2000).

  25. Tabata, S. et al. Sequence and analysis of chromosome 5 of the plant Arabidopsis thaliana. Nature 408, 823–826 (2000).

  26. Chory, J. et al. National Science Foundation-sponsored workshop report: “The 2010 Project” functional genomics and the virtual plant. A blueprint for understanding how plants are built and how to improve them. Plant Physiol. 123, 423–426 (2000).

  27. Lockhart, D. J. & Winzeler, E. A. Genomics, gene expression and DNA arrays. Nature 405, 827–836 (2000).

  28. Ecker, J. R. PFGE and YAC analysis of the Arabidopsis genome. Methods 1, 186–194 (1990).

  29. Oefner, P. J. et al. Efficient random subcloning of DNA sheared in a recirculating point-sink flow system. Nucleic Acids Res. 24, 3879–3886 (1996).

  30. Dietrich, F. S. et al. The nucleotide sequence of Saccharomyces cerevisiae chromosome V. Nature (Suppl.) 387, 78–81 (1997).

  31. Marziali, A., Willis, T. D., Federspiel, N. A. & Davis, R. W. An automated sample preparation system for large-scale DNA sequencing. Genome Res. 9, 457–462 (1999).

  32. Ewing, B., Hillier, L., Wendl, M. C. & Green, P. Base-calling of automated sequencer traces using Phred. I. Accuracy assessment. Genome Res. 8, 175–185 (1998).

  33. Ewing, B. & Green, P. Base-calling of automated sequencer traces using Phred. II. Error probabilities. Genome Res. 8, 186–194 (1998).

  34. Gordon, D., Abajian, C. & Green, P. Consed: a graphical tool for sequence finishing. Genome Res. 8, 195–202 (1998).

  35. Uberbacher, E. C. & Mural, R. J. Locating protein-coding regions in human DNA sequences by a multiple sensor-neural network approach. Proc. Natl Acad. Sci. USA 88, 11261–11265 (1991).

  36. Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78–94 (1997).

  37. Salzberg, S. L., Pertea, M., Delcher, A. L., Gardner, M. J. & Tettelin, H. Interpolated Markov models for eukaryotic gene finding. Genomics 59, 24–31 (1999).

  38. Lukashin, A. V. & Borodovsky, M. GeneMark.hmm: new solutions for gene finding. Nucleic Acids Res. 26, 1107–1115 (1998).

  39. Hebsgaard, S. M. et al. Splice site prediction in Arabidopsis thaliana pre-mRNA by combining local and global sequence information. Nucleic Acids Res. 24, 3430–3452 (1996).

  40. Huang, X., Adams, M. D., Zhou, H. & Kerlavage, A. R. A tool for analyzing and annotating genomic sequences. Genomics 46, 37–45 (1997).

  41. Frishman, D. & Mewes, H.-W. PEDANTic genome analysis. Trends Genet. 13, 415–416 (1997).

  42. Lowe, T. M. & Eddy, S. R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955–964 (1997).

  43. Emanuelsson, O., Nielsen, H., Brunak, S. & von Heijne, G. Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J. Mol. Biol. 300, 1005–1016 (2000).

Acknowledgements

We thank K. Mayer and H. Schoof of MIPS for discussions; S. Rhee and E. Huala of TAIR for sequences for the RI markers; and R. Wells for editing the manuscript. This work was funded by National Science Foundation/US Department of Energy/US Department of Agriculture (NSF/DOE/USDA) grants to the SPP Consortium and TIGR.

Author information

Correspondence to Athanasios Theologis or Joseph R. Ecker.

About this article

Cite this article

Theologis, A., Ecker, J., Palm, C. et al. Sequence and analysis of chromosome 1 of the plant Arabidopsis thaliana. Nature 408, 816–820 (2000). https://doi.org/10.1038/35048500

  • DOI: https://doi.org/10.1038/35048500
