TY - JOUR
T1 - Identifying Universal Laws of Text Translation
AU - Kanter, I.
AU - Kfir, H.
AU - Malkiel, B.
AU - Shlesinger, M.
PY - 2006
Y1 - 2006
N2 - Straightforward quantitative analyses of authentic texts have allowed linguists and translation scholars to discern patterns in individual languages as well as features which set translations apart from originals [1,2]. A language can also be studied statistically, an approach epitomized by the application of Zipf's Law [3], which states that word-frequency distributions follow an almost identical curve regardless of language. To date, no universal law governing the joint probability distribution of words in two or more languages has been either proposed or observed. This study identifies new universal behaviours which characterize the mutual overlaps between a corpus of original English and three corpora of translated English. Specifically, it suggests a remarkable similarity in (a) the number of types unique to each translated corpus, and (b) the number of types common to the original-English corpus and each of the translated corpora. We argue that these universal behaviours can be used both to determine the ontological status of an unidentified language (whether it is an original or a translation) and to identify the source language of a translation.
AB - Straightforward quantitative analyses of authentic texts have allowed linguists and translation scholars to discern patterns in individual languages as well as features which set translations apart from originals [1,2]. A language can also be studied statistically, an approach epitomized by the application of Zipf's Law [3], which states that word-frequency distributions follow an almost identical curve regardless of language. To date, no universal law governing the joint probability distribution of words in two or more languages has been either proposed or observed. This study identifies new universal behaviours which characterize the mutual overlaps between a corpus of original English and three corpora of translated English. Specifically, it suggests a remarkable similarity in (a) the number of types unique to each translated corpus, and (b) the number of types common to the original-English corpus and each of the translated corpora. We argue that these universal behaviours can be used both to determine the ontological status of an unidentified language (whether it is an original or a translation) and to identify the source language of a translation.
UR - https://scholar.google.co.il/scholar?q=Identifying+Universal+Laws+of+Text+Translation&btnG=&hl=en&as_sdt=0%2C5
M3 - Article
JO - Journal of Quantitative Linguistics
JF - Journal of Quantitative Linguistics
VL - 13
IS - 1
ER -