Simple sequence repeats in Escherichia coli: Abundance, distribution, composition, and polymorphism

Riva Gur-Arie, Cyril J. Cohen, Yuval Eitan, Leora Shelef, Eric M. Hallerman, Yechezkel Kashi

Research output: Contribution to journal › Article › peer-review

188 Scopus citations

Abstract

Computer-based genome-wide screening of the DNA sequence of Escherichia coli strain K12 revealed tens of thousands of tandem simple sequence repeat (SSR) tracts, with motifs ranging from 1 to 6 nucleotides. SSRs were well distributed throughout the genome. Mononucleotide SSRs were over-represented in noncoding regions and under-represented in open reading frames (ORFs). Nucleotide composition of mono- and dinucleotide SSRs, both in ORFs and in noncoding regions, differed from that of the genomic region in which they occurred, with 93% of all mononucleotide SSRs proving to be of A or T. Computer-based analysis of the fine position of every SSR locus in the noncoding portion of the genome relative to downstream ORFs showed SSRs located in areas that could affect gene regulation. DNA sequences at 14 arbitrarily chosen SSR tracts were compared among E. coli strains. Polymorphisms of SSR copy number were observed at four of seven mononucleotide SSR tracts screened, with all polymorphisms occurring in noncoding regions. SSR polymorphism could prove important as a genome-wide source of variation, both for practical applications (including rapid detection, strain identification, and detection of loci affecting key phenotypes) and for evolutionary adaptation of microbes.
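
The kind of screening the abstract describes can be illustrated with a minimal sketch. This is not the authors' software: the function name `find_ssrs`, the regular-expression approach, and the `min_copies=3` threshold are assumptions made for illustration, since the abstract does not state the minimum tract length used. Only the motif range of 1 to 6 nucleotides comes from the abstract.

```python
import re

def find_ssrs(seq, max_motif_len=6, min_copies=3):
    """Return sorted (start, motif, copies) tuples for tandem repeat tracts.

    Hypothetical helper: min_copies is an assumed threshold, not a value
    taken from the paper. Positions are 0-based.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif_len + 1):
        # A k-nucleotide motif followed by at least (min_copies - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA" is
            # just "A" twice), so each tract is reported once.
            if any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

if __name__ == "__main__":
    # Toy sequence, not E. coli data: one mononucleotide and one dinucleotide tract.
    print(find_ssrs("GGCAAAAAATTCGACACACACGT"))
    # prints [(3, 'A', 6), (13, 'AC', 4)]
```

Matches are non-overlapping within each motif length, so this sketch can miss a tract that overlaps one it has already consumed. A real genome-wide screen would also record whether each tract lies in an ORF or in a noncoding region, which requires annotation data not shown here.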

Original language: English
Pages (from-to): 62-71
Number of pages: 10
Journal: Genome Research
Volume: 10
Issue number: 1
State: Published - Jan 2000
Externally published: Yes
