Resources:
A list of the resources developed during the Phyl-ARIANE project.
Databases
HOGENOM
HOGENOM is a database of homologous genes from fully sequenced organisms (bacteria, archaea and eukarya), structured under the ACNUC sequence database management system. It allows users to select sets of homologous genes among species and to visualize multiple alignments and phylogenetic trees. It is also possible to search for orthologous genes in a wide range of taxa. HOGENOM is thus particularly useful for comparative sequence analysis, phylogeny and molecular evolution studies. More generally, HOGENOM gives an overall view of what is known about a particular gene family. Note that HOGENOM is split into two databases: HOGENOM contains the protein sequences, while HOGENOMDNA contains the nucleotide sequences. The protein sequences of HOGENOM were generated by translating the CDS of HOGENOMDNA and using the associated cross-references to generate the annotations.
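The translation step from CDS to protein can be illustrated with a minimal sketch. It assumes the standard genetic code (NCBI translation table 1) and a CDS that starts in frame; the function name `translate_cds` is hypothetical, and the actual HOGENOM pipeline may use organism-specific genetic codes and handle partial sequences differently.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_cds(cds):
    """Translate an in-frame CDS into a protein, stopping at the first stop codon."""
    cds = cds.upper().replace("U", "T")
    protein = []
    # Ignore a trailing incomplete codon, if any.
    for start in range(0, len(cds) - len(cds) % 3, 3):
        # Codons with ambiguous bases are reported as X (unknown residue).
        amino_acid = CODON_TABLE.get(cds[start:start + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate_cds("ATGGCCAAATAA"))  # MAK
```

Building the table from the 64-letter amino acid string keeps the sketch short; swapping in another genetic code only requires replacing that string.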
ORTHOMAM
Molecular data play a key role in phylogenetic inference. Mammalian systematics provides a clear example, with several previously open evolutionary questions that can now be answered. However, molecular studies have so far used only a handful of classic markers and have not attempted to utilise the information contained within the increasingly large pool of mammalian genome sequences. Identifying and utilising potentially new informative markers from this pool can help to further resolve the mammalian phylogenetic tree. The EnsEMBL database was used to select a set of single-copy orthologous markers from the available mammalian genomes. Exons of reasonable length for further amplification from genomic DNA and sequencing in additional species were then selected. The phylogenetic utility and the evolutionary characteristics of these candidate markers were then evaluated using an in-house bioinformatics pipeline. The resulting OrthoMaM database can be queried through this website. The current OrthoMaM release is based on EnsEMBL v54. It now includes 6447 exons and 12958 CDS candidate markers for up to 33 taxa.
Software
MPR
Tree reconciliation is a computational approach that explains the discrepancies between a gene tree and a species tree (or between a parasite tree and a host tree) by postulating a number of evolutionary events such as speciations, duplications, transfers and losses. This approach has important applications in ecology and biogeography, but also in genomics, to estimate the evolutionary scenarios that shaped the content of contemporary genomes and to infer orthology relationships between sequences. MPR is a fast and exact algorithm for reconciling a gene tree and a dated species tree. It optimizes a parsimony criterion according to a model that incorporates all the above-mentioned events (speciation, duplication, loss and transfer). In our experimental study (based on simulated data), parsimony is shown to be an effective criterion to recover the macro-evolutionary events in realistic zones of the parameter space.
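To give a concrete idea of what a reconciliation computes, here is a minimal sketch of the classic LCA mapping, which handles speciations and duplications only. It is not the MPR algorithm: it ignores transfers and dates, and it does not count losses. Trees are assumed to be binary and written as nested 2-tuples with string leaves; all function names and the `species_of` mapping are hypothetical.

```python
def leaves(tree):
    """Leaf labels of a tree given as nested 2-tuples with string leaves."""
    if isinstance(tree, str):
        return frozenset([tree])
    return leaves(tree[0]) | leaves(tree[1])


def lca_clade(species_tree, targets):
    """Leaf set of the smallest species-tree clade containing all targets."""
    node = species_tree
    while not isinstance(node, str):
        for child in node:
            if targets <= leaves(child):
                node = child  # descend into the child that holds every target
                break
        else:
            break  # no single child holds every target: node is the LCA
    return leaves(node)


def count_duplications(gene_tree, species_tree, species_of):
    """Return (clade the gene node maps to, number of duplication nodes)."""
    if isinstance(gene_tree, str):
        return frozenset([species_of[gene_tree]]), 0
    left_clade, left_dups = count_duplications(gene_tree[0], species_tree, species_of)
    right_clade, right_dups = count_duplications(gene_tree[1], species_tree, species_of)
    clade = lca_clade(species_tree, left_clade | right_clade)
    # A gene node mapped to the same clade as one of its children is a duplication.
    is_duplication = clade in (left_clade, right_clade)
    return clade, left_dups + right_dups + int(is_duplication)


species_tree = (("A", "B"), "C")
gene_tree = ((("a1", "b1"), ("a2", "b2")), "c1")
species_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}
root_clade, duplications = count_duplications(gene_tree, species_tree, species_of)
print(sorted(root_clade), duplications)  # ['A', 'B', 'C'] 1
```

In this example the two copies found in A and B are explained by a single duplication in the ancestor of A and B. A model with transfers, such as the one optimized by MPR, can propose other scenarios for the same pair of trees.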
PhySIC_IST
Supertree methods combine phylogenies with overlapping sets of taxa into a single larger one. Topological conflicts frequently arise among source trees for methodological or biological reasons, such as long branch attraction, lateral gene transfers, gene duplication/loss or deep gene coalescence. When topological conflicts occur among source trees, liberal methods infer supertrees containing the most frequent alternative, while veto methods infer supertrees that do not contradict any source tree, i.e. they discard all conflicting resolutions. When the source trees host a significant number of topological conflicts or have a small taxon overlap, supertree methods of both kinds can propose poorly resolved, hence uninformative, supertrees. To overcome this problem, PhySIC_IST proposes to infer non-plenary supertrees, i.e. supertrees that do not necessarily contain all the taxa present in the source trees, discarding those whose position greatly differs among source trees or for which insufficient information is provided.
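One common way to make the notion of topological conflict concrete is to compare the rooted triplets (three-taxon subtrees) induced by each source tree. The sketch below is only an illustration of that idea, not the PhySIC_IST algorithm itself: it lists the sets of three taxa on which two rooted source trees disagree. Trees are assumed to be nested tuples with string leaves, and all function names are hypothetical.

```python
from itertools import combinations


def leaf_set(tree):
    """Leaf labels of a rooted tree given as nested tuples with string leaves."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def rooted_triplets(tree):
    """Set of triplets ab|c, encoded as (frozenset({a, b}), c), induced by the tree."""
    found = set()
    if isinstance(tree, str):
        return found
    child_leaves = [leaf_set(child) for child in tree]
    for i, child in enumerate(tree):
        found |= rooted_triplets(child)
        outside = frozenset().union(
            *(ls for j, ls in enumerate(child_leaves) if j != i)
        )
        # Two leaves below the same child are closer to each other
        # than to any leaf below another child of this node.
        for a, b in combinations(sorted(child_leaves[i]), 2):
            for c in outside:
                found.add((frozenset((a, b)), c))
    return found


def conflicting_taxa(tree1, tree2):
    """Sets of three taxa that the two trees resolve differently."""
    resolved = {pair | {out}: (pair, out) for pair, out in rooted_triplets(tree1)}
    conflicts = set()
    for pair, out in rooted_triplets(tree2):
        other = resolved.get(pair | {out})
        if other is not None and other != (pair, out):
            conflicts.add(pair | {out})
    return conflicts


source1 = ((("a", "b"), "c"), "d")
source2 = ((("a", "c"), "b"), "e")
print([sorted(taxa) for taxa in conflicting_taxa(source1, source2)])  # [['a', 'b', 'c']]
```

Here the two source trees overlap on a, b and c only, and they disagree on which two of these taxa are sister groups. A veto method would leave this part unresolved, whereas a liberal method would need more source trees to pick the most frequent alternative.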
ScripTree
There are many tools for the interactive display of phylogenetic trees. However, there is a shortage of tools for automating tree rendering. Scripting phylogenetic graphics makes it possible to save graphical analyses involving numerous and complex tree handling operations and to automate repetitive tasks. ScripTree is a tool intended to fill this gap. It is an interpreter to be used in batch mode. Phylogenetic graphics instructions are stored in a text file and executed sequentially. These instructions cover both tree rendering and tree annotation. ScripTree is written in Tcl/Tk, making it a cross-platform application suitable for Windows and Unix-like systems, including OS X. It can be used either as a standalone package or included in a bioinformatics pipeline and linked to an HTTP server.
Generators - Building level-k generators
Level-k generators were introduced by van Iersel and his coauthors in their RECOMB 2008 paper to describe the structure of simple level-k networks. A case analysis to build all level-2 generators is presented in the detailed version of that article; it was later generalized by Steven Kelk into a brute-force algorithm, followed by a digraph isomorphism test, to build all 65 level-3 generators. We generalized this work by showing how general level-k networks can be decomposed into generators of level at most k, and we provided rules to build level-(k+1) generators recursively from level-k generators. A program to automatically build generators is available for download.
Dendroscope - Galled Networks
In collaboration with Daniel Huson and Regula Rupp, we proposed a new method for building phylogenetic networks from clusters contained in rooted phylogenetic trees. This method is implemented in Dendroscope, a program that can efficiently visualize rooted phylogenetic trees and networks on thousands of taxa, and that also contains the algorithmic machinery to compute consensus trees and rooted phylogenetic networks from a set of trees.
TreeCloud
In collaboration with Jean Véronis, we developed a tool that creates tree clouds to represent textual data. This new visualization is based on phylogenetic trees and extends tag clouds: the most important words of a text are displayed with different sizes and colors around a tree that reflects their proximity in the text. TreeCloud is free software written in Python.
Web sites
Querying reconciliations
This website is an alpha version of the reconciliation bank browser, which will be one of the final products of the Phyl-ARIANE project. Its main goal is to provide interactive tools to query a databank of evolutionary events: gene duplications, gene losses and horizontal gene transfers. The underlying database is currently divided into events concerning two separate groups: gamma and beta proteobacteria. After choosing a set of target species and graphically placing some macro-evolutionary events (among transfers, duplications and losses), the user can query the database using tree pattern matching algorithms.
PhyloExplorer
PhyloExplorer is a tool to facilitate assessment and management of phylogenetic tree collections. Given an input collection of rooted trees, PhyloExplorer provides facilities for obtaining statistics describing the collection, correcting invalid taxon names, extracting taxonomically relevant parts of the collection using a dedicated query language, and identifying related trees in the TreeBASE database.
Phylogenetic Networks
This website is a database of articles about Phylogenetic Networks.