Abstract
Straightforward quantitative analyses of authentic texts have allowed linguists and
translation scholars to discern patterns in individual languages as well as features
which set translations apart from originals [1,2]. A language can also be studied
statistically, an approach epitomized by the application of Zipf's Law [3], which states
that word-frequency distributions follow an almost identical curve regardless of
language. To date, no universal law governing the joint probability distribution of
words in two or more languages has been either proposed or observed. This study
identifies new universal behaviours which characterize the mutual overlaps between
a corpus of original English and three corpora of translated English. Specifically, it
suggests a remarkable similarity in (a) the number of types unique to each
translated corpus, and (b) the number of types common to the original-English
corpus and each of the translated corpora. We argue that these universal
behaviours can be used both to determine the ontological status of an unidentified
language (whether it is an original or a translation) and to identify the source
language of a translation.
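
The two quantities named in (a) and (b) can be illustrated with a minimal sketch. It is not the authors' procedure: the tokenizer, the function names `types` and `overlap_counts`, and the toy sentences are all hypothetical, and "unique" is assumed here to mean a type found in no other corpus, original or translated.

```python
import re


def types(text):
    """Return the set of distinct lowercased word types in a text."""
    # Assumption: a type is any run of letters or apostrophes.
    return set(re.findall(r"[a-z']+", text.lower()))


def overlap_counts(original, translated):
    """Count unique and shared types for each translated corpus.

    `translated` maps a hypothetical source-language label to corpus text.
    """
    original_types = types(original)
    all_types = {label: types(text) for label, text in translated.items()}
    counts = {}
    for label, current in all_types.items():
        # Every type seen outside the current translated corpus.
        others = set().union(
            original_types,
            *(t for other, t in all_types.items() if other != label),
        )
        counts[label] = {
            # (a) types found only in this translated corpus
            "unique": len(current - others),
            # (b) types this corpus has in common with the original corpus
            "shared_with_original": len(current & original_types),
        }
    return counts


# Toy data, invented purely for illustration.
toy_original = "the cat sat on the mat"
toy_translated = {
    "A": "the cat rested on a rug",
    "B": "the dog sat upon the mat",
}
counts = overlap_counts(toy_original, toy_translated)
print(counts)
```

On real corpora the counts depend heavily on corpus size and tokenization, so any comparison across translated corpora would presumably need corpora of comparable length.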
| Original language | American English |
|---|---|
| Journal | Journal of Quantitative Linguistics |
| Issue number | 13(1) |
| State | Published - 2006 |
| Title | Identifying Universal Laws of Text Translation |