Benchmark Proposal #1

Benchmark for quality assessment of de novo repeat identification and genome annotation


Florian Maumus and Hadi Quesneville



Benchmark_Proposal_URGI (revision 22-Aug-14)



This dataset aims to help assess the sensitivity and specificity of de novo repeat detection and annotation tools using the A. thaliana genome. It uses the coverage of known repeats as a proxy for sensitivity and the coverage of a simulated genome as a proxy for specificity.
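As a rough illustration of the two proxies described above, the sketch below computes them from lists of genomic intervals. It is not the authors' implementation: the function names, the half-open single-sequence coordinate convention, and the choice to express the specificity proxy as one minus the covered fraction of the simulated genome are all assumptions made here for the example. The original Word document remains the reference for the exact definitions.

```python
from typing import List, Tuple

# Assumption: intervals are half-open [start, end) on a single sequence.
Interval = Tuple[int, int]


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent intervals so no base is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_bases(intervals: List[Interval]) -> int:
    """Number of distinct bases covered by the intervals."""
    return sum(end - start for start, end in merge(intervals))


def overlap_bases(a: List[Interval], b: List[Interval]) -> int:
    """Number of bases covered by both interval sets."""
    a, b = merge(a), merge(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def sensitivity_proxy(known: List[Interval], predicted: List[Interval]) -> float:
    """Fraction of known-repeat bases recovered by the tool's annotation."""
    known_bp = covered_bases(known)
    return overlap_bases(known, predicted) / known_bp if known_bp else 0.0


def specificity_proxy(predicted_on_simulated: List[Interval],
                      simulated_length: int) -> float:
    """Hypothetical proxy: 1 minus the fraction of the simulated genome
    annotated as repeats (any hit there is treated as a false positive)."""
    return 1.0 - covered_bases(predicted_on_simulated) / simulated_length


if __name__ == "__main__":
    known = [(0, 100), (200, 300)]       # toy known repeats
    predicted = [(50, 150), (250, 260)]  # toy tool output on the real genome
    print(sensitivity_proxy(known, predicted))
    print(specificity_proxy([(0, 10), (5, 20)], 1000))
```

In practice the intervals would be parsed per chromosome from the tool's annotation output and from the known-repeat annotation, and the per-chromosome base counts summed before taking the ratios.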



Description      Comments
Type             Real and simulated
Primary Uses     Measure sensitivity and specificity
Taxa             Arabidopsis thaliana
Source           F. Maumus
Documentation    In progress
Version          v0.1



Please see the original document (Word format) for details: Benchmark_Proposal_URGI.



Files are available for download from here : .



One Response to Benchmark Proposal #1

  1. Douglas Hoen says:

    Thanks Florian! Are the “blue files” available for download?

