A-to-I RNA editing is apparently the most abundant post-transcriptional modification in primates. Virtually all editing sites reside within the repetitive Alu SINEs. Alu sequences are the dominant repeats in the human genome and thus are likely to pair with neighboring reversely oriented repeats and form double-stranded RNA structures that are bound by ADAR enzymes. Editing levels vary considerably between different adenosine sites within Alu repeats. Part of the variability has been explained by local sequence and structural motifs. Here, we focus on global characteristics that affect the editability at the Alu level. We use large RNA-seq data sets to analyze the editing levels in 203 798 Alu repeats residing within human genes. The most important factor affecting Alu editability is its distance to the closest reversely oriented neighbor: average editability decays exponentially with this distance, with a typical distance of ∼800 bp. This effect alone accounts for 28% of the total variance in editability. In addition, the number of Alu repeats on the same and reverse strands in the genomic vicinity, the expressed strand of the Alu, the Alu's length and subfamily, and the occurrence of a reversely oriented neighbor in the same intron/exon all contribute, to a lesser extent, to the Alu editability.
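The exponential distance dependence described above can be illustrated with a minimal sketch. It assumes that the "typical distance" of ∼800 bp is the e-folding length of the decay and that editability is normalized to 1 at zero distance; neither assumption is stated in the abstract, and the function names `relative_editability` and `halving_distance` are hypothetical, chosen only for illustration.

```python
import math

# Assumption: the "typical distance" of ~800 bp is read as the e-folding
# length of the decay. The abstract does not define it more precisely.
DECAY_LENGTH_BP = 800.0


def relative_editability(distance_bp, decay_length_bp=DECAY_LENGTH_BP):
    """Average editability relative to its value at zero distance.

    distance_bp is the distance to the closest reversely oriented Alu.
    The amplitude is normalized to 1 here; the real prefactor is not
    given in the abstract.
    """
    if distance_bp < 0:
        raise ValueError("distance_bp must be non-negative")
    return math.exp(-distance_bp / decay_length_bp)


def halving_distance(decay_length_bp=DECAY_LENGTH_BP):
    """Distance over which the modeled editability drops by half."""
    return decay_length_bp * math.log(2)


if __name__ == "__main__":
    for d in (0, 400, 800, 1600, 3200):
        print(f"{d:>5} bp -> {relative_editability(d):.3f}")
    print(f"halving distance: {halving_distance():.0f} bp")
```

Under these assumptions, the modeled editability falls to 1/e of its initial value at 800 bp and halves roughly every 555 bp. The sketch describes only the average trend; by the abstract's own figure, this factor accounts for 28% of the variance, so individual Alu repeats can deviate substantially from it.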
Bibliographical note
Funding Information:
European Research Council; I-CORE Program of the Planning and Budgeting Committee and the Israel Science Foundation [41/11]; Legacy Heritage Biomedical Science Partnership, Israel Science Foundation [1466/10]; Israel Science Foundation [379/12 to E.E.]. Source of open access funding: Israel Science Foundation [379/12 to E.E.]. Conflict of interest statement. None declared.