TY - JOUR
T1 - Identifying universals of text translation
AU - Kanter, Ido
AU - Kfir, Haggai
AU - Malkiel, Brenda
AU - Shlesinger, Miriam
PY - 2006
Y1 - 2006
AB - Straightforward quantitative analyses of authentic texts have allowed linguists and translation scholars to discern patterns in individual languages as well as features which set translations apart from originals (Baker, 1993; Chesterman, 2004). A language can also be studied statistically, an approach epitomized by the application of Zipf's Law (Zipf, 1949), which states that word-frequency distributions follow an almost identical curve regardless of language. To date, no universal behaviour governing the joint probability distribution of words in two or more languages has been either proposed or observed. This study identifies new universals which characterize the mutual overlaps between a corpus of original English and three corpora of translated English. Specifically, it suggests a remarkable similarity in (a) the number of types unique to each translated corpus, and (b) the number of types common to the original-English corpus and each of the translated corpora. We argue that these universal behaviours can be used both to determine the ontological status of an unidentified language (whether it is an original or a translation) and to identify the source language of a translation.
UR - http://www.scopus.com/inward/record.url?scp=43249166704&partnerID=8YFLogxK
U2 - 10.1080/09296170500500983
DO - 10.1080/09296170500500983
M3 - Article
AN - SCOPUS:43249166704
SN - 0929-6174
VL - 13
SP - 35
EP - 43
JO - Journal of Quantitative Linguistics
JF - Journal of Quantitative Linguistics
IS - 1
ER -