Supporting data for "Clusterflock: A Flocking Algorithm for Isolating Congruent Phylogenomic Datasets"
======================================================================================================

Narechania, A; Baker, R; DeSalle, R; Mathema, B; Kolokotronis, S; Kreiswirth, B; Planet, P J. (2016) GigaScience Database. http://dx.doi.org/10.5524/100247

Summary:
-------

Clusterflock is an open source tool that can be used to discover horizontally transferred genes, recombined areas of chromosomes, and the phylogenetic "core" of a genome. Though we use it in an evolutionary context, it is generalizable to any clustering problem. Users can write extensions to calculate any distance metric on the unit interval and use these distances to "flock" any type of data.

Clusterflock remains under active development, and users are encouraged to obtain the most up-to-date version of the software from the GitHub page: https://github.com/narechan/clusterflock

Files:
------

clusterflock_data_archive.tar.gz - This archive contains sequences, LD matrices, and R scripts used to construct the simulation curves in Figures 3 and 4, and the S. aureus data used to test clusterflock's performance on a known large-scale hybridization event.

clusterflock-master.zip - GitHub archival copy, downloaded 30-09-2016. Please see the GitHub repository for the most recent updates: https://github.com/narechan/clusterflock

flocking_vid.mp4 - Auto-detected Flocks per Frame. Here we show the average number of flocks detected at each point along a 1000-frame simulation of the S. aureus dataset. The OPTICS spatial clustering algorithm was used to auto-detect flocks in the 100 replicate frames at each point along the simulation. Seed clusters form very early and later move to intercept one another. Congruent flocks absorb one another, while incongruent flocks repel.
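
The "flocks per frame" curve described for flocking_vid.mp4 can be illustrated with a minimal Python sketch. This is not the authors' code and it does not use OPTICS: as a simple stand-in, it groups points that are linked by chains of neighbours within a fixed radius, then averages the flock count over replicates at each frame. The function names, the `radius` parameter, and the toy coordinates are all assumptions made for illustration.

```python
import math
from statistics import mean


def count_flocks(points, radius):
    """Count groups of points linked by chains of neighbours within `radius`.

    Hypothetical stand-in for OPTICS: plain distance-threshold grouping.
    """
    unvisited = set(range(len(points)))
    flocks = 0
    while unvisited:
        # Start a new flock from any point not yet assigned.
        stack = [unvisited.pop()]
        flocks += 1
        while stack:
            i = stack.pop()
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= radius]
            unvisited.difference_update(near)
            stack.extend(near)
    return flocks


def mean_flocks_per_frame(replicates, radius):
    """replicates[r][f] holds the (x, y) positions in frame f of replicate r."""
    n_frames = len(replicates[0])
    return [
        mean(count_flocks(rep[f], radius) for rep in replicates)
        for f in range(n_frames)
    ]


# Two toy replicates with two frames each (made-up coordinates).
replicates = [
    [[(0, 0), (5, 5), (10, 10)], [(0, 0), (0.5, 0), (10, 10)]],
    [[(0, 0), (0.4, 0), (10, 10)], [(0, 0), (0.5, 0), (0.9, 0)]],
]
curve = mean_flocks_per_frame(replicates, radius=1.0)
print(curve)  # [2.5, 1.5]
```

In the toy data the average drops from 2.5 to 1.5 flocks between frames, mirroring the way flocks merge as a simulation proceeds. A density-based method such as OPTICS would replace `count_flocks` in a real analysis.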