Abstract
We review the present status of studies of DNA sequences using methods of statistical physics. We present evidence, based on systematic studies of the entire GenBank database, supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range, i.e., base pairs separated by thousands of base pairs are correlated. We do not find such a long-range correlation in the coding regions of the DNA. We discuss the mechanisms of molecular evolution that may lead to the presence of long-range power-law correlations in noncoding DNA and their absence in coding DNA. One such mechanism is simple repeat expansion, which has recently attracted the attention of the biological community in connection with genetic diseases. We also review new tools, such as detrended fluctuation analysis, that are useful for studies of complex hierarchical DNA structure.
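As a rough illustration of the detrended fluctuation analysis mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes a purine/pyrimidine mapping for the DNA walk (A, G to +1; C, T to -1), linear detrending in non-overlapping boxes, and NumPy; the function names `dna_walk`, `dfa` and `dfa_exponent` are hypothetical.

```python
import numpy as np

def dna_walk(seq):
    """Build a 'DNA walk' profile from a nucleotide string.

    Assumed mapping (illustrative only): purines A, G -> +1,
    pyrimidines C, T -> -1. The mean step is removed before summing.
    """
    steps = np.array([1.0 if b in "AG" else -1.0 for b in seq.upper()])
    return np.cumsum(steps - steps.mean())

def dfa(profile, box_sizes):
    """Return the fluctuation F(n) for each box size n.

    The profile is cut into non-overlapping boxes of length n, a
    least-squares line is fitted and subtracted in each box, and the
    root-mean-square of the residuals is taken over all boxes.
    """
    fluct = []
    for n in box_sizes:
        n_boxes = len(profile) // n
        # One column per box, one row per position inside the box.
        boxes = profile[: n_boxes * n].reshape(n_boxes, n).T
        x = np.arange(n)
        slope, intercept = np.polyfit(x, boxes, 1)  # fits all boxes at once
        trend = np.outer(x, slope) + intercept
        fluct.append(np.sqrt(np.mean((boxes - trend) ** 2)))
    return np.array(fluct)

def dfa_exponent(seq, box_sizes):
    """Estimate the scaling exponent from the slope of log F(n) vs log n."""
    f = dfa(dna_walk(seq), box_sizes)
    slope, _ = np.polyfit(np.log(box_sizes), np.log(f), 1)
    return slope
```

For an uncorrelated random sequence the estimated exponent should come out close to 0.5; values clearly above 0.5 over a wide range of box sizes are the signature of long-range power-law correlations in this kind of analysis.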
Original language | English |
---|---|
Pages (from-to) | 430-438 |
Number of pages | 9 |
Journal | Physica A: Statistical Mechanics and its Applications |
Volume | 249 |
Issue number | 1-4 |
State | Published - 2 Jan 1998 |
Bibliographical note
Funding Information: We are grateful to many individuals, including R. Mantegna, M.E. Matsa, S.M. Ossadnik, F. Sciortino and M. Simons for major contributions to those results reviewed here that represent collaborative research efforts. We also wish to thank C. Cantor, C. DeLisi, M. Frank-Kamenetskii, A.Yu. Grosberg, I. Labat, L. Liebovitch, G.S. Michaels, P. Munson, R. Nussinov, R.D. Rosenberg, E.I. Shakhnovich, M.F. Shlesinger and E.N. Trifonov for valuable discussions. Partial support was provided by the National Science Foundation, National Institutes of Health (Human Genome Project), the G. Harold and Leila Y. Mathers Charitable Foundation, the Israel–USA Binational Science Foundation, Israel Academy of Sciences, and (to C.-K.P.) by an NIH/NIMH First Award.
Keywords
- DNA
- Long-range correlations
- Mutations