AsmVar: tools and exemplar data.
================================

Liu, S; Huang, S; Rao, J; Ye, W; GenomeDK Consortium; Krogh, A; Wang, J; Schierup, M, H; Villesen, P; Xu, X; Li, N; Kristiansen, K; Soerensen, T, I; Hansen, T; Pedersen, O; Brunak, S; Gupta, R; Rasmussen, S; Lund, O; Bolund, L; Borglum, A, D; Eiberg, H; Flindt, E, N; Xu, R; Sun, J; Liu, H; Besenbacher, S; Grove, J; Als, T, D; Lescai, F; Mailund, T; Friborg, R, M; Pedersen, C, N; Chang, Y; Li, S; Guo, X; Cao, H; Ye, C; Maretty, L; Sibbesen, J, A; Albrechtsen, A; Bork-Jensen, J; Have, C, T; Izarzugaza, J, M; Belling, K; Yadav, R (2015): AsmVar: tools and exemplar data. GigaScience Database. http://dx.doi.org/10.5524/100173

Summary:
--------

This software has been released under the MIT License. Copyright 2014-2015.

Here we present a novel approach, implemented in a single software package, AsmVar, to discover, genotype and characterize different forms of structural variants and novel sequence in population-scale de novo assemblies at single-nucleotide resolution. Additional information is available in "DemoPipelineGuideline.sh" in the AsmVar software package hosted on GitHub (https://github.com/ShujiaHuang/AsmVar).

Files:
------

AsmVar-master.zip - archived copy of the GitHub repository taken at the time of publication; please see the current project page for the most recent updates: https://github.com/ShujiaHuang/AsmVar
example_data_NA12878.zip - compressed archive of example data and example result files for the public sample NA12878; see below for descriptions of the component files.

Files within example_data_NA12878.zip archive:
----------------------------------------------

NA12878.APLG.20.vcf - raw variant calls BEFORE realignment using the AGE algorithm, from Module A "Global assembly-vs-assembly alignment and local realignment".
NA12878.APLG.AltAlign.20.vcf.gz - raw variant calls AFTER realignment using the AGE algorithm, from Module A "Global assembly-vs-assembly alignment and local realignment".
NA12878.APLG.Genotyping.20.vcf - contains the variants and their genotypes, based on Module C.
NA12878.APLG.Recal.20.vcf - adds a variant quality score to each variant, based on Module D.
NA12878.APLG.recal.fs.Figure.png - displays the distribution of the technical feature measurements as a function of the quality score for both the positive and the negative training sets.
NA12878.APLG.Recal.ROC.Figure.pdf - displays the ROC curve; select the quality score threshold based on this curve.
NA12878.APLG.svqc.20.vcf - contains the final variant set from AsmVar after filtering out unqualified variants.
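
Example: inspecting the example VCF files
-----------------------------------------

As a quick sanity check after unpacking example_data_NA12878.zip, the VCF files listed above can be summarized with a few lines of Python. This is a minimal sketch and not part of AsmVar: the function names open_vcf and summarize_vcf are hypothetical, and the code relies only on the fixed column layout defined by the VCF format (FILTER is the seventh tab-separated column). It makes no assumption about AsmVar-specific INFO or FORMAT fields.

```python
import gzip
import sys
from collections import Counter


def open_vcf(path):
    """Open a plain-text or gzip-compressed VCF file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def summarize_vcf(path):
    """Return (record count, Counter of FILTER column values) for a VCF file.

    Hypothetical helper for illustration; it only uses the standard
    VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO ...
    """
    n_records = 0
    filters = Counter()
    with open_vcf(path) as handle:
        for line in handle:
            # Skip meta-information lines, the header line and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 7:
                raise ValueError("Malformed VCF record: %r" % line)
            n_records += 1
            filters[fields[6]] += 1
    return n_records, filters


if __name__ == "__main__":
    # Usage: python summarize_vcf.py NA12878.APLG.20.vcf NA12878.APLG.svqc.20.vcf
    for vcf_path in sys.argv[1:]:
        count, filter_counts = summarize_vcf(vcf_path)
        print(vcf_path, count, dict(filter_counts))
```

Running the script on the files from the successive modules gives a rough view of how many records each step retains; the FILTER values it reports are whatever the files themselves contain.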