The Avian phylogenomics project: Comparative phylogenomics.
============================================================

The files deposited in GigaDB are the results and intermediate files associated with the phylogenomics paper published by the Avian Genome Consortium and its members in the area of comparative phylogenomics. These data are provided for transparency of research findings and to aid others in further studies. Please cite the GigaDB reference below if you use these data in full or in part:

Jarvis, ED; Mirarab, S; Aberer, A; Houde, P; Li, C; Ho, S; Faircloth, BC; Nabholz, B; Howard, JT; Suh, A; Weber, CC; Fonseca, RR; Alfaro-Nunez, A; Narula, N; Liu, L; Burt, D; Ellegren, H; Edwards, SV; Stamatakis, A; Mindell, DP; Cracraft, J; Braun, EL; Warnow, T; Jun, W; Gilbert, MTP; Zhang, G (2014): Phylogenomic analyses data of the avian phylogenomics project. GigaScience Database. http://dx.doi.org/10.5524/101041

The data are divided into 5 directories on the FTP server. Each directory can also be downloaded in its entirety as a compressed archive file (.tar.gz). Most of the subdirectories also contain another readme file describing the content of that directory; for full details, refer to the associated manuscripts, which can be found on the GigaDB page (http://dx.doi.org/10.5524/101041).

1 - Concatenated_alignments
2 - FASTA_files_of_loci_datasets
3 - Newick_tree_files
4 - Scripts
5 - Transposable_elements

------------------------------------------------------------

1- Concatenated_alignments
==========================

These are the concatenated alignments that have been used in various ExaML and RAxML analyses.
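Several of the concatenated alignments in this directory are accompanied by partition files (.model or .part). The following is a minimal sketch of how such a file might be read, assuming it follows the RAxML-style partition syntax ("MODEL, name = start-end", with optional comma-separated ranges and a "\3"-style stride); this README does not confirm the exact layout, so check a file before relying on it. The names Partition, parse_partitions and partition_length are illustrative only and are not part of the deposited scripts.

```python
import re
from typing import List, NamedTuple, Tuple


class Partition(NamedTuple):
    """One line of an (assumed) RAxML-style partition file."""
    model: str                            # e.g. "DNA" or a protein model name
    name: str                             # partition label
    ranges: List[Tuple[int, int, int]]    # (start, end, stride), 1-based, inclusive


# Matches "12", "1-300" or "3-900\3" (a backslash introduces the stride).
_RANGE = re.compile(r"^(\d+)(?:-(\d+))?(?:\\(\d+))?$")


def parse_partition_line(line: str) -> Partition:
    """Parse a single 'MODEL, name = range[, range...]' line."""
    left, right = line.split("=", 1)
    model, name = (part.strip() for part in left.split(",", 1))
    ranges = []
    for chunk in right.split(","):
        match = _RANGE.match(chunk.strip())
        if match is None:
            raise ValueError(f"unrecognised site range: {chunk!r}")
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) else start
        stride = int(match.group(3)) if match.group(3) else 1
        ranges.append((start, end, stride))
    return Partition(model, name, ranges)


def parse_partitions(text: str) -> List[Partition]:
    """Parse every non-empty line of a partition file's contents."""
    return [parse_partition_line(line) for line in text.splitlines() if line.strip()]


def partition_length(partition: Partition) -> int:
    """Number of alignment columns covered by a partition."""
    return sum(len(range(start, end + 1, stride))
               for start, end, stride in partition.ranges)


if __name__ == "__main__":
    # Hypothetical two-line example, not taken from the deposited files.
    example = "DNA, gene1 = 1-300\nDNA, gene2 = 301-750\n"
    for part in parse_partitions(example):
        print(part.name, partition_length(part))
```

Summing partition_length over all partitions and comparing the total with the column count in the header of the matching alignment file is a quick consistency check after download.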

./Clocklike exon alignments
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Concatenated c12 DNA sequence alignments from the 1156 clocklike genes used for the dating analyses:

/c12.DNA.alignment.1156.clockliketxt.txt
/c12.DNA.alignment.clocklike.readme.txt
/c12.DNA.alignment.1156.clocklike.zip
/c12.DNA.alignment.clocklike.txt.zip

./Indel_sequence_alignments
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Indel encoding for exons, introns, and UCEs is given. A README file describes the content:

/README
/exon.indel.matrix.tar.gz
/intron400.nuc-indel.matrices.tar.gz
/intron.exon.uce.indel.matrices.tar.gz
/intron.indel.matrices.tar.gz
/uce.indel.matrices.tar.gz

./RAxML Concatenation
~~~~~~~~~~~~~~~~~~~~~

UCE concatenated alignments with and without the alligator:

/uce-filtered-alignments-w-gator-concatenated.phylip.gz
/uce-filtered-alignments-without-gator-concatenated.phylip.gz

./examl-concatenations
~~~~~~~~~~~~~~~~~~~~~~

Alignments used for all the ExaML concatenation analyses. .phy files are the alignments and .model files give the partitions:

/Exon.AminoAcid.ExaML.partitioned/aln.model
/Exon.AminoAcid.ExaML.partitioned/aln.phy
/Exon.c123.ExaML.partitioned/exon-c123-rebuilt.model
/Exon.c123.ExaML.partitioned/exon-c123-rebuilt.phy
/Exon.c123.ExaML.unpartititoned/exon-c123-rebuilt.phy
/Exon.c123-RY.ExaML.unpartitioned/aln.model
/Exon.c123-RY.ExaML.unpartitioned/aln.phy
/Exon.c12.ExaML.partitioned/exon-c12-rebuilt.model
/Exon.c12.ExaML.partitioned/exon-c12-rebuilt.phy
/Exon.c12.ExaML.unpartitioned/aln.phy
/Exon.c1.ExaML.unpartitioned/aln.phy.c1.phy
/Exon.c2.ExaML.unpartitioned/aln.phy.c2.phy
/Exon.c3.ExaML.unpartitioned/aln.phy.c3.phy
/Intron/aln.model
/Intron/aln.phy
/TEIT.RAxML/intron.exon.uce.indels.concat.noout.len1.phy
/TENT+c3.ExaML/exon-c123.model
/TENT+c3.ExaML/exon-c123.phy
/TENT.ExaML.100%/ALL.aln.reduced
/TENT.ExaML.100%/ALL.part.reduced
/TENT.ExaML.25%/ALL.aln
/TENT.ExaML.25%/ALL.part
/TENT.ExaML.50%/ALL.aln
/TENT.ExaML.50%/ALL.part
/TENT+outgroup.ExaML/aln.model
/TENT+outgroup.ExaML/aln.phy
/TENT.RAxML.75%/ALL.aln
/TENT.RAxML.75%/ALL.part
/WGT.ExaML/ALL.aln

./High_and_low_variance_exons_and_introns
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Alignments of exons and introns with highest and lowest variance, as described in the paper.

/High_variance_exons/Exon.heterogenous.c12
/High_variance_exons/Exon.heterogenous.c123
/High_variance_introns/concatIntronNooutMSAhigh.fasta.gz
/Low_variance_exons/Exon.homogenous.c12
/Low_variance_exons/Exon.homogenous.c123
/Low_variance_introns/concatIntronNooutMSAlow.fasta.gz

------------------------------------------------------------

2- FASTA_files_of_loci_datasets
===============================

This directory includes filtered and unfiltered alignments of individual loci.

./Filtered_sequence_alignments
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This directory includes filtered alignments. These are the alignments used in all the phylogenetic analyses.

/2516_Introns/introns-filtered-sate-alignments-with-and-without-outgroups.tar.gz  # Includes both alignments with and without outgroups
/3769_UCEs_+_1000bp_flanking/uce-probes-used.fasta.gz  # Probes targeting UCE loci shared among vertebrate taxa
/3769_UCEs_+_1000bp_flanking/uce-assembled-loci-from-probe-matches.tar  # UCE loci assembled from probe+flank slices from each genome
/3769_UCEs_+_1000bp_flanking/uce-raw-genome-slices-of-probe-matches.tar  # Probe+flank slices around locations matching probes targeting UCE loci
/3769_UCEs_+_1000bp_flanking/uce-raw-lastz-results-of-probe-matches.tar  # LASTZ results of mapping probes onto genome assemblies
/3769_UCEs_+_1000bp_flanking/uce-filtered-alignments-w-gator.tar.gz  # UCE individual alignments with outgroups
/3769_UCEs_+_1000bp_flanking/uce-filtered-alignments-without-gator.tar.gz  # UCE individual alignments without outgroups
/8295_Amino_Acids/pep-filtered-sate-alignments-noout.tar.gz  # Amino acid alignments with outgroups removed
/8295_Amino_Acids/pep-filtered-sate-alignments-original.zip  # Amino acid alignments with outgroups included
/8295_Exons/pep2cds-filtered-sate-alignments-original.zip  # DNA alignments (amino acid alignments translated to DNA) with outgroups included
/8295_Exons/pep2cds-filtered-sate-alignments-noout.tar.gz  # DNA alignments (amino acid alignments translated to DNA) without outgroups
/8295_Exons/42-exon-genes-removed.txt
/Supergenes_generated_from_statistical_binning/supergene-alignments.tar.bz2  # Supergene alignments with partition files showing the genes put in each bin and their boundaries in the concatenated alignment

./Unfiltered_sequence_alignments
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This directory includes unfiltered alignments. These alignments were not used in the phylogenetic analyses, but are provided for completeness.

/uce-unfiltered-alignments-w-gator.tar.gz  # UCE alignments with outgroup
/pep2cds-unfiltered-alignemtns-original.zip  # Unfiltered SATe+Prank alignments of exons (translated from AA to DNA) used for the filtering step
/pep-unfiltered-alignemtns-original.zip  # Unfiltered SATe+Prank alignments of exons (AA) used for the filtering step
/introns-unfiltered-alignments-original.zip  # Intron SATe alignments before filtering, with outgroups included
/introns-unfiltered-alignments-noout.zip  # Intron SATe alignments before filtering, with outgroups removed
/Unfiltered_whole_genome_alignment_README.txt  # Reference for downloading the unfiltered whole genome alignment

------------------------------------------------------------

3- Newick_tree_files
====================

/Newick_gene_tree_files
~~~~~~~~~~~~~~~~~~~~~~~

The subdirectories here provide all the gene trees and supergene trees (i.e. binned) used in the paper. See the README file in this directory for more details about each of these files.

/README
/Bootstrap_replicates_of_ML_gene_trees/bootstrap-genetrees.tar.gz
/Bootstrap_replicates_of_supergene_trees_used_in_MP-EST_analyses/bootstrap-supergenetrees.tar.gz
/ML-bestML-supergene_trees_used_in_MP-EST_analyses/ml-supergenetrees.tar.gz
/Partition_files_loci_bins_for_MP-EST_analyses/partitions-supergenetrees.tar.gz
/ML-bestML-gene_trees/ml-genetrees.tar.gz

/Newick_species_tree_files
~~~~~~~~~~~~~~~~~~~~~~~~~~

Newick files for 35 species trees using different genomic partitions and methods. Individual bootstrap files are also given in an archive file.

/Exon.AminoAcid.ExaML.partitioned.tre
/Exon.c1.ExaML.unpartitioned.tre
/Exon.c2.ExaML.unpartitioned.tre
/Exon.c3.ExaML.unpartitioned.tre
/Exon.c12.ExaML.partitioned.tre
/Exon.c12.ExaML.unpartitioned.tre
/Exon.c123-RY.ExaML.unpartitioned.tre
/Exon.c123.ExaML.partitioned.tre
/Exon.c123.ExaML.unpartitioned.tre
/Exon.RAxML.Heterogenous.c12.tre
/Exon.RAxML.Heterogenous.c123.tre
/Exon.RAxML.Homogenous.c12.tre
/Exon.RAxML.Homogenous.c123.tre
/Intron.MP-EST.binned.tre
/Intron.MP-EST.unbinned.tre
/Intron.RAxML.partitioned.tre
/Intron.RAxML.unpartitioned.tre
/Literature.DNAxDNA.SibleyAhlquist.tre
/Literature.Mitochondrial.Pacheco.tre
/Literature.Morphology.LivezeyZusi.tre
/Literature.Nuclear_genes.Hackett.tre
/TEIT.RAxML.tre
/TENT.ExaML.25%.tre
/TENT.ExaML.50%.tre
/TENT.ExaML.75%.tre
/TENT.ExaML.tre
/TENT.MP-EST.binned.tre
/TENT.MP-EST.unbinned.tre
/TENT+c3.ExaML.tre
/TENT+outgroup.ExaML.tre
/UCE.RAxML.unpartitioned.tre
/WGT.ExaML.alternative.tre
/WGT.ExaML.best.tre
/Intron.RAxML.Homogenous.tre
/Intron.RAxML.Heterogenous.tre
/bootstrap-replicates.zip  # The individual bootstrap replicates for the ML trees that we estimated

/Newick_timetree_species_files
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Newick files of the 11 timetrees (chronograms) listed in Table 1.

/Chronogram01.TENT.ExAML.tre
/Chronogram02.TENT.ExAML.max865.tre
/Chronogram03.TENT.ExAML.Allig247.tre
/Chronogram04.TENT.ExAML.no-outgroup.tre
/Chronogram05.TENT.ExAML.no-outgroup.max865.tre
/Chronogram06.TENT.MP-EST.tre
/Chronogram07.WGT.ExAML.alternative.tre
/Chronogram08.WGT.ExAML.best.tre
/Chronogram09.Intron.ExAML.unpartitioned.tre
/Chronogram10.UCE.RAxML.tre
/Chronogram11.Exon.c123.RaXML.partitioned.tre

------------------------------------------------------------

4- Scripts
==========

A selection of scripts used in the generation of this dataset.

Files
-----

./Scripts/namemap/name.csv  # Used for mapping species names from 5-letter codes to complete names
./Scripts/namemap/mapsequences.py  # Used for mapping species names from 5-letter codes to complete names
./Scripts/namemap/README.mapping  # Details of the namemap scripts
./Scripts/filtering-dna/README.txt  # Used for filtering DNA alignments
./Scripts/filtering-dna/filter_alignment_fasta_v1.3B.pl  # Used for filtering DNA alignments
./Scripts/filtering-dna/filter_alignment_maf_v1.1B.pl  # Used for filtering DNA alignments
./Scripts/filtering-aa/spotProblematicSeqsModules.py  # Used for filtering AA alignments
./Scripts/filtering-aa/spotProblematicSeqsBase-gaps_allSeqs.py  # Used for filtering AA alignments
./Scripts/filtering-aa/spotProblematicSeqsBase-W12S4.py  # Used for filtering AA alignments
./Scripts/filtering-aa/blosum62.txt
./Scripts/indels/avianphylogenome_indelanalysis_20141112.tar.gz  # Used for handling the indel analysis

------------------------------------------------------------

5- Transposable_elements
========================

These are the transposable elements of the owl, as described in the paper.

Files
-----

./Transposable_elements/owl_TE_marker_Table.txt
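
Many of the files listed in this README are distributed as .tar or .tar.gz archives. The sketch below shows one way to inspect such an archive and read a single member without unpacking everything to disk, using only the Python standard library. The function names list_archive_files and read_archive_text are hypothetical helpers written for this illustration; they are not part of the deposited scripts, and the .zip and .tar.bz2 files may need different handling.

```python
import tarfile
from typing import List


def list_archive_files(archive_path: str) -> List[str]:
    """Return the names of all regular files inside a tar archive.

    mode="r:*" lets tarfile detect the compression (none, gzip, bz2, xz).
    """
    with tarfile.open(archive_path, mode="r:*") as archive:
        return [member.name for member in archive.getmembers() if member.isfile()]


def read_archive_text(archive_path: str, member_name: str,
                      encoding: str = "utf-8") -> str:
    """Read one member of the archive into memory as text."""
    with tarfile.open(archive_path, mode="r:*") as archive:
        handle = archive.extractfile(member_name)  # KeyError if the name is absent
        if handle is None:
            raise ValueError(f"{member_name!r} is not a regular file")
        with handle:
            return handle.read().decode(encoding)


if __name__ == "__main__":
    import sys

    # Usage: python inspect_archive.py <archive.tar.gz>
    for name in list_archive_files(sys.argv[1]):
        print(name)
```

Listing members first is useful for the larger archives, since it shows the directory layout before any file is extracted.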