DTscore Web Page
DTscore is a distance-based tandem duplication tree reconstruction algorithm. It is based on a simple tandem duplication model, which assumes unequal recombination (crossover) as the only duplication mechanism. Its only input is a matrix of distances between copies. In this matrix, the rows and columns must be ordered in the same way as the copies are ordered on the locus. DTscore can be applied to relatively large data sets (more than a hundred copies). Distances can be calculated using programs such as DNADIST (nucleotide sequences) or PROTDIST (protein sequences) from the PHYLIP package. Heterogeneous rates of substitution among sites can be handled using, for example, the GAMMA method.
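To illustrate the ordering requirement, here is a minimal sketch in C that fills a square distance matrix from aligned sequences listed in locus order. It uses the plain proportion of differing sites (p-distance) as a simple stand-in for the corrected distances that DNADIST or PROTDIST would compute; it is not part of DTscore, and the function names p_distance and fill_distance_matrix are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Proportion of differing sites between two aligned sequences.
 * Gap characters are compared like any other character (an assumption
 * made for brevity). Returns -1.0 if the sequences are empty or have
 * different lengths. */
double p_distance(const char *a, const char *b)
{
    size_t len = strlen(a);
    size_t diff = 0;
    size_t i;

    if (len == 0 || strlen(b) != len)
        return -1.0;
    for (i = 0; i < len; i++)
        if (a[i] != b[i])
            diff++;
    return (double)diff / (double)len;
}

/* Fill the n x n row-major matrix d so that row i and column i both
 * refer to copies[i]. The caller must list the copies in the order in
 * which they appear on the locus. Returns 0 on success, -1 if any pair
 * of sequences cannot be compared. */
int fill_distance_matrix(const char *const *copies, size_t n, double *d)
{
    size_t i, j;

    for (i = 0; i < n; i++) {
        d[i * n + i] = 0.0;
        for (j = i + 1; j < n; j++) {
            double dist = p_distance(copies[i], copies[j]);
            if (dist < 0.0)
                return -1;
            d[i * n + j] = dist; /* keep the matrix symmetric */
            d[j * n + i] = dist;
        }
    }
    return 0;
}
```

Because row i and column i are both taken from copies[i], listing the sequences in locus order is enough to give the matrix the required ordering.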
DTscore was developed by Olivier Elemento; comments are welcome.
Elemento O. and Gascuel O. 2002. A fast and accurate distance algorithm to reconstruct tandem duplication trees. Bioinformatics 18:S92-S99.
Gascuel O., Hendy M., Jean-Marie A., McLachlan R. 2003. The Combinatorics of Tandem Duplication Trees. Systematic Biology, in press.
Elemento O., Gascuel O., Lefranc M.P. 2002. Reconstructing the duplication history of tandemly repeated genes. Molecular Biology and Evolution 19:278-288.
Elemento O., Gascuel O., Lefranc M.P. 2001. Reconstruction de l'histoire de duplication de gènes répétés en tandem. In Actes des Journées Ouvertes Biologie Informatique Mathématiques. pp. 9-11.
Executables, C source code, and test files
This is the C source code of DTscore, compatible with most Unix systems, Windows, and PowerMac.
On Unix, it can be compiled using the following command:
cc DTscore.c -o DTscore
It reads the distance matrix (or matrices) in PHYLIP square format, and returns a single unrooted tree in standard Newick format.
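As an illustration of the input format, the following C sketch writes a distance matrix in PHYLIP square layout: a first line with the number of copies, then one line per copy holding a name in a 10-character field followed by that copy's row of distances. The function name write_phylip_square and the numeric field width are assumptions made for this example, and how strictly DTscore's own parser checks the layout is not stated here.

```c
#include <stdio.h>

/* Write the n x n row-major matrix d to out in PHYLIP square format.
 * names[i] labels the i-th copy; rows must already be in locus order.
 * Returns 0 on success, -1 on invalid arguments or a write error. */
int write_phylip_square(FILE *out, const char *const *names,
                        const double *d, int n)
{
    int i, j;

    if (out == NULL || names == NULL || d == NULL || n <= 0)
        return -1;

    /* First line: the number of copies. */
    if (fprintf(out, "%5d\n", n) < 0)
        return -1;

    for (i = 0; i < n; i++) {
        /* Names occupy a fixed 10-character field:
         * short names are padded, long names are truncated. */
        if (fprintf(out, "%-10.10s", names[i]) < 0)
            return -1;
        for (j = 0; j < n; j++)
            if (fprintf(out, " %8.5f", d[i * n + j]) < 0)
                return -1;
        if (fputc('\n', out) == EOF)
            return -1;
    }
    return 0;
}
```

Calling write_phylip_square(stdout, names, d, n) prints the matrix to the terminal; passing a file opened with fopen produces a file that can be given to the program as input.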
This is the Windows binary executable of DTscore. It runs on Windows 98 and later (and possibly on Windows 95). The distance matrix must be in PHYLIP square format. If no argument is given, the program prompts for input and output file names.
This is the Linux binary executable of DTscore. It should run on any recent Linux distribution. The distance matrix must be in PHYLIP square format. If no argument is given, the program prompts for input and output file names.
This is the TRGV data set described in (Elemento et al. 2001, 2002). It is a small 9-copy data set that you can use to test the program.
Simply type:
This is the duplication tree that you should obtain with the test data set above.