Analysis of DNA sequences using methods of statistical physics

S. V. Buldyrev, N. V. Dokholyan, A. L. Goldberger, S. Havlin, C. K. Peng, H. E. Stanley, G. M. Viswanathan

Research output: Contribution to journal › Article › peer-review

150 Scopus citations

Abstract

We review the present status of the studies of DNA sequences using methods of statistical physics. We present evidence, based on systematic studies of the entire GenBank database, supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range, i.e., base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the DNA. We discuss the mechanisms of molecular evolution that may lead to the presence of long-range power-law correlations in noncoding DNA and their absence in coding DNA. One such mechanism is the simple repeat expansion, which recently has attracted the attention of the biological community in conjunction with genetic diseases. We also review new tools - e.g., detrended fluctuation analysis - that are useful for studies of complex hierarchical DNA structure.
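The abstract names detrended fluctuation analysis (DFA) as a tool for detecting long-range power-law correlations. The following is a minimal sketch of first-order DFA applied to a "DNA walk", offered as an illustration and not as the authors' implementation. Several things here are assumptions not stated in the abstract: the purine/pyrimidine mapping of bases to steps of +1 and -1, the linear (first-order) detrending, the box sizes, and all function names, which are hypothetical. The fluctuation F(n) is the root-mean-square residual of the integrated walk around a least-squares line in each box of length n, and the exponent alpha is the slope of log F(n) against log n.

```python
import math
import random


def dna_walk(seq):
    """Map a DNA string to +1/-1 steps.

    Assumed rule: pyrimidine (C, T) = +1, purine (A, G) = -1.
    Any other symbol (e.g. N) is skipped.
    """
    steps = []
    for base in seq.upper():
        if base in "CT":
            steps.append(1.0)
        elif base in "AG":
            steps.append(-1.0)
    return steps


def linear_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_fluctuation(steps, box):
    """Root-mean-square detrended fluctuation F(n) for box length n = box."""
    mean_step = sum(steps) / len(steps)
    # Integrated, mean-subtracted profile of the walk.
    profile = []
    total = 0.0
    for s in steps:
        total += s - mean_step
        profile.append(total)
    n_boxes = len(profile) // box  # trailing remainder is ignored
    xs = list(range(box))
    squared = 0.0
    for b in range(n_boxes):
        segment = profile[b * box:(b + 1) * box]
        slope, intercept = linear_fit(xs, segment)
        squared += sum((y - (slope * x + intercept)) ** 2
                       for x, y in zip(xs, segment))
    return math.sqrt(squared / (n_boxes * box))


def dfa_exponent(steps, boxes):
    """Slope of log F(n) versus log n over the given box sizes."""
    log_n = [math.log(n) for n in boxes]
    log_f = [math.log(dfa_fluctuation(steps, n)) for n in boxes]
    slope, _ = linear_fit(log_n, log_f)
    return slope


# Synthetic, uncorrelated test sequence (not real GenBank data).
random.seed(0)
seq = "".join(random.choice("ACGT") for _ in range(20000))
alpha = dfa_exponent(dna_walk(seq), [8, 16, 32, 64, 128, 256])
print(alpha)
```

For an uncorrelated sequence such as the synthetic one above, the exponent is expected to lie near 0.5; values clearly above 0.5 over a wide range of box sizes would indicate the kind of long-range correlation the abstract reports for noncoding regions.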

Original language: English
Pages (from-to): 430-438
Number of pages: 9
Journal: Physica A: Statistical Mechanics and its Applications
Volume: 249
Issue number: 1-4
State: Published - 2 Jan 1998

Bibliographical note

Funding Information:
We are grateful to many individuals, including R. Mantegna, M.E. Matsa, S.M. Ossadnik, F. Sciortino and M. Simons for major contributions to those results reviewed here that represent collaborative research efforts. We also wish to thank C. Cantor, C. DeLisi, M. Frank-Kamenetskii, A.Yu. Grosberg, I. Labat, L. Liebovitch, G.S. Michaels, P. Munson, R. Nussinov, R.D. Rosenberg, E.I. Shakhnovich, M.F. Shlesinger and E.N. Trifonov for valuable discussions. Partial support was provided by the National Science Foundation, National Institutes of Health (Human Genome Project), the G. Harold and Leila Y. Mathers Charitable Foundation, the Israel–USA Binational Science Foundation, Israel Academy of Sciences, and (to C.-K.P.) by an NIH/NIMH First Award.

Funding

Funders
NIH/NIMH
National Science Foundation
National Institutes of Health
G. Harold and Leila Y. Mathers Charitable Foundation
Israel Academy of Sciences and Humanities

    Keywords

    • DNA
    • Long-range correlations
    • Mutations
