Markov processes: Linguistics and Zipf's Law

Research output: Contribution to journal › Article › peer-review

90 Scopus citations

Abstract

It is shown that a 2-parameter random Markov process constructed with N states and biased random transitions gives rise to a stationary distribution where the probabilities of occurrence of the states, $P(k)$, $k = 1, \ldots, N$, exhibit the following three universal behaviors which characterize biological sequences and texts in natural languages: (a) the rank-ordered frequencies of occurrence of words are given by Zipf's law $P(k) \propto 1/k^{\rho}$, where $\rho(k)$ is slowly increasing for small $k$; (b) the frequencies of occurrence of letters are given by $P(k) = A - D \ln(k)$; and (c) long-range correlations are observed over long but finite intervals, as a result of the quasiergodicity of the Markov process.
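
The kind of computation the abstract describes can be illustrated with a minimal sketch. The abstract does not spell out the construction, so everything specific here is an assumption made for illustration: each state is given two randomly chosen successors, taken with probabilities `bias` and `1 - bias`, and the names `build_chain`, `stationary_distribution`, and `rank_ordered` are hypothetical. The `bias` parameter is a stand-in and should not be read as the paper's actual two-parameter model.

```python
import random


def build_chain(n_states, bias, seed=0):
    """Assumed construction (not taken from the paper): every state gets
    two random successors, chosen with probabilities bias and 1 - bias."""
    rng = random.Random(seed)
    chain = []
    for _ in range(n_states):
        first = rng.randrange(n_states)
        second = rng.randrange(n_states)
        chain.append(((first, bias), (second, 1.0 - bias)))
    return chain


def stationary_distribution(chain, n_iter=2000):
    """Approximate a stationary distribution by lazy power iteration,
    starting from the uniform distribution."""
    n = len(chain)
    p = [1.0 / n] * n
    for _ in range(n_iter):
        nxt = [0.0] * n
        for state, successors in enumerate(chain):
            for target, prob in successors:
                nxt[target] += p[state] * prob
        # Lazy step: averaging with the previous vector damps periodic
        # oscillations while leaving stationary distributions unchanged.
        p = [0.5 * old + 0.5 * new for old, new in zip(p, nxt)]
    return p


def rank_ordered(p):
    """Sort state probabilities from most to least frequent, so that
    index k - 1 holds the probability of the rank-k state."""
    return sorted(p, reverse=True)


chain = build_chain(n_states=200, bias=0.8)
ranked = rank_ordered(stationary_distribution(chain))

# Print the probabilities of the five top-ranked states.
for k, prob in enumerate(ranked[:5], start=1):
    print(k, prob)
```

Plotting `ranked` against rank on log-log axes would be the natural way to compare against a Zipf-like form. A random chain built this way need not be irreducible, so the vector returned depends on the uniform starting point, and nothing in this sketch guarantees the scaling reported in the paper.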

Original language: English
Pages (from-to): 4559-4562
Number of pages: 4
Journal: Physical Review Letters
Volume: 74
Issue number: 22
State: Published - 1995
