Identifying universals of text translation

Ido Kanter, Haggai Kfir, Brenda Malkiel, Miriam Shlesinger

Research output: Contribution to journal › Article › peer-review

13 Scopus citations


Straightforward quantitative analyses of authentic texts have allowed linguists and translation scholars to discern patterns in individual languages as well as features which set translations apart from originals (Baker, 1993; Chesterman, 2004). A language can also be studied statistically, an approach epitomized by the application of Zipf's Law (Zipf, 1949), which states that word-frequency distributions follow an almost identical curve regardless of language. To date, no universal behaviour governing the joint probability distribution of words in two or more languages has been either proposed or observed. This study identifies new universals which characterize the mutual overlaps between a corpus of original English and three corpora of translated English. Specifically, it suggests a remarkable similarity in (a) the number of types unique to each translated corpus, and (b) the number of types common to the original-English corpus and each of the translated corpora. We argue that these universal behaviours can be used both to determine the ontological status of an unidentified language (whether it is an original or a translation) and to identify the source language of a translation.
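The two quantities named in the abstract, (a) types unique to each translated corpus and (b) types shared with the original-English corpus, can be illustrated with simple set operations. The following is a minimal sketch and not the authors' procedure: the tokenizer, the function names `types` and `overlap_profile`, the reading of "unique" as "absent from every other corpus", and the toy sentences are all assumptions made for illustration.

```python
import re


def types(text):
    """Return the set of distinct word forms (types) in a text.

    Assumption: lowercase alphabetic tokenization; the study's actual
    tokenization is not described in the abstract.
    """
    return set(re.findall(r"[a-z']+", text.lower()))


def overlap_profile(original, translations):
    """For each translated corpus, count (a) types found in no other corpus
    and (b) types shared with the original corpus."""
    orig_types = types(original)
    trans_types = {name: types(text) for name, text in translations.items()}
    profile = {}
    for name, own in trans_types.items():
        # Everything seen in the original or in any *other* translated corpus.
        others = set(orig_types)
        for other_name, other_types in trans_types.items():
            if other_name != name:
                others |= other_types
        profile[name] = {
            "unique": len(own - others),
            "shared_with_original": len(own & orig_types),
        }
    return profile


# Hypothetical toy corpora, far too small to show any statistical regularity.
original = "the cat sat on the mat"
translations = {
    "from_a": "the cat sat upon a rug",
    "from_b": "the dog sat on a mat",
}
profile = overlap_profile(original, translations)
print(profile)
```

On real corpora the inputs would be full texts of comparable size; the abstract's claim concerns how similar these counts are across the translated corpora, which toy data of this kind cannot demonstrate.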

Original language: English
Pages (from-to): 35-43
Number of pages: 9
Journal: Journal of Quantitative Linguistics
Issue number: 1
State: Published - 2006

