Linguistic features of noncoding DNA sequences

R. N. Mantegna, S. V. Buldyrev, A. L. Goldberger, S. Havlin, C. K. Peng, M. Simons, H. E. Stanley

Research output: Contribution to journal › Article › peer-review

251 Scopus citations

Abstract

We extend the Zipf approach to analyzing linguistic texts to the statistical study of DNA base pair sequences and find that the noncoding regions are more similar to natural languages than the coding regions. We also adapt the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function, and demonstrate that noncoding regions in eukaryotes display a smaller entropy and larger redundancy than coding regions, supporting the possibility that noncoding regions of DNA may carry biological information.
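
As a rough illustration of the two measurements named in the abstract, the following Python sketch counts overlapping n-letter "words" in a base sequence, estimates a Zipf exponent from the rank-frequency relation, and computes a Shannon block entropy with the associated redundancy. It is a minimal sketch under stated assumptions, not the authors' procedure: the function names, the sliding-window word definition, the least-squares fit over all ranks, and the redundancy formula R = 1 - H_n / (n log2 4) are illustrative choices that the abstract does not specify.

```python
from collections import Counter
import math


def ngram_counts(seq, n):
    """Count overlapping n-letter 'words' with a sliding window (assumed word definition)."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def zipf_exponent(counts):
    """Negated least-squares slope of log(frequency) against log(rank)."""
    freqs = sorted(counts.values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct words to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


def block_entropy(counts):
    """Shannon entropy, in bits, of the empirical word distribution."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def redundancy(seq, n, alphabet_size=4):
    """Assumed definition: 1 - H_n / (n * log2(alphabet_size))."""
    h_n = block_entropy(ngram_counts(seq, n))
    return 1 - h_n / (n * math.log2(alphabet_size))


if __name__ == "__main__":
    toy = "ACGTTTACGGACGTACGATTACGT"  # toy sequence, not real data
    words = ngram_counts(toy, 3)
    print("Zipf exponent:", zipf_exponent(words))
    print("Redundancy (n=3):", redundancy(toy, 3))
```

In this sketch a larger fitted exponent corresponds to a steeper rank-frequency curve, and a redundancy closer to 1 corresponds to a lower entropy relative to a uniformly random sequence over four bases. Estimates from short sequences are biased, so a toy input like the one above only exercises the code.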

Original language: English
Pages (from-to): 3169-3172
Number of pages: 4
Journal: Physical Review Letters
Volume: 73
Issue number: 23
State: Published - 5 Dec 1994
Externally published: Yes
