Comparative in silico analysis of EST-SSRs in angiosperm and gymnosperm tree genera
21 Oct 2014
BACKGROUND: Simple Sequence Repeats (SSRs) derived from Expressed Sequence Tags (ESTs) belong to the expressed fraction of the genome and are important for gene regulation, recombination, DNA replication, cell cycle and mismatch repair. Here, we present a comparative analysis of the SSR motif distribution in the 5′UTR, ORF and 3′UTR fractions of ESTs across selected genera of woody trees representing gymnosperms (17 species from seven genera) and angiosperms (40 species from eight genera).

RESULTS: Our analysis supports a modest contribution of EST-SSR length to genome size in gymnosperms, while EST-SSR density was not associated with genome size in either angiosperms or gymnosperms. Multiple factors seem to have contributed to the lower abundance of EST-SSRs in gymnosperms, which has resulted in a non-linear relationship with genome size diversity. The AG/CT motif was found to be the most abundant in SSRs of both angiosperms and gymnosperms, with a relative increase in AT/AT in the latter. Our data also reveal a higher abundance of hexamers across the gymnosperm genera.

CONCLUSIONS: Our analysis provides the foundation for future comparative studies at the species level to unravel the evolutionary processes that control SSR genesis and divergence between angiosperm and gymnosperm tree species.
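
For readers unfamiliar with motif labels such as AG/CT and AT/AT, the following is a minimal Python sketch of how perfect SSRs might be located in an EST and grouped into strand-independent motif classes. It is an illustration only, not the pipeline used in this study: the minimum repeat thresholds in MIN_REPEATS, the exclusion of mononucleotide repeats, and all function names are assumptions made for the example.

```python
import re
from collections import Counter

# Assumed minimum repeat counts per unit length (di- to hexanucleotide).
# These thresholds are illustrative and are NOT taken from the study.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 3, 6: 3}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def smallest_rotation(motif):
    """Alphabetically smallest cyclic rotation (GA -> AG, TC -> CT)."""
    return min(motif[i:] + motif[:i] for i in range(len(motif)))


def motif_class(motif):
    """Strand-independent label, e.g. GA, AG, TC and CT all give 'AG/CT'."""
    fwd = smallest_rotation(motif)
    rev = smallest_rotation(reverse_complement(motif))
    return "/".join(sorted((fwd, rev)))


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq):
    """Return (start, length, motif_class) for each perfect SSR found."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A unit of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):  # skip e.g. AGAG, counted as AG instead
                length = match.end() - match.start()
                hits.append((match.start(), length, motif_class(motif)))
    return sorted(hits)


def ssr_density_per_kb(hits, seq_len):
    """Number of SSRs per 1000 bases of sequence."""
    return 1000.0 * len(hits) / seq_len


# Toy sequence containing (GA)7 and (AT)6; not real EST data.
example = "TTG" + "GA" * 7 + "CCATG" + "AT" * 6 + "GGC"
hits = find_ssrs(example)
class_counts = Counter(label for _, _, label in hits)
```

On the toy sequence, the (GA)7 tract is reported under AG/CT and the (AT)6 tract under AT/AT, because each motif is reduced to its smallest rotation and paired with that of its reverse complement. Summing the reported lengths per species would give an SSR length total, and ssr_density_per_kb gives a simple density measure; both are only stand-ins for the quantities compared with genome size in the abstract.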