Abstract
Clustering has become one of the fundamental tools for analyzing gene expression and producing gene classifications. Clustering models reveal patterns of similarity that help in understanding gene function, gene regulation, cellular processes, and cell subtypes. The clustering results, however, have to be combined with sequence data or knowledge about gene functionality in order to draw biologically meaningful conclusions. In this work, we explore a new model that integrates gene expression with sequence or text information.
Additional information
The text was submitted by the author in English.
Cite this article
Angelova, M., Ellman, J. Combined clustering models for the analysis of gene expression. Phys. Atom. Nuclei 73, 242–246 (2010). https://doi.org/10.1134/S1063778810020067