Results on simulated datasets

The table above reports results obtained from simulations under various conditions of evolution, along the lines of (Kuhner and Felsenstein 94). We generated rooted phylogenies on 10 taxa by randomly choosing their structure from the Yule-Harding distribution and fixing their edge lengths according to a Poisson model.
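As an illustration of the topology sampling described above, here is a minimal sketch of drawing a rooted tree shape from the Yule-Harding distribution: start from a single node, repeatedly split a uniformly chosen leaf, then assign taxon labels at random. The function names and the nested-tuple representation are assumptions made for this example and are not taken from the original study; edge lengths are left out because the parameters of the Poisson model are not given here.

```python
import random


def yule_harding_tree(n_taxa, rng):
    """Sample a rooted binary tree on taxa 0..n_taxa-1 as nested tuples."""
    children = {}      # internal node id -> (left child id, right child id)
    leaves = [0]       # ids of the current leaves; node 0 is the root
    next_id = 1
    while len(leaves) < n_taxa:
        # Pick one current leaf uniformly at random and split it in two.
        parent = leaves.pop(rng.randrange(len(leaves)))
        left, right = next_id, next_id + 1
        next_id += 2
        children[parent] = (left, right)
        leaves.extend([left, right])

    # Assign the taxon labels to the leaves uniformly at random.
    labels = list(range(n_taxa))
    rng.shuffle(labels)
    label_of = dict(zip(leaves, labels))

    def build(node):
        if node in children:
            left, right = children[node]
            return (build(left), build(right))
        return label_of[node]

    return build(0)


def leaf_set(tree):
    """Return the set of taxon labels found at the leaves of a tree."""
    if isinstance(tree, tuple):
        return leaf_set(tree[0]) | leaf_set(tree[1])
    return {tree}


def count_internal_nodes(tree):
    """Count internal nodes (including the root) of a nested-tuple tree."""
    if isinstance(tree, tuple):
        return 1 + count_internal_nodes(tree[0]) + count_internal_nodes(tree[1])
    return 0


if __name__ == "__main__":
    print(yule_harding_tree(10, random.Random(1999)))
```

A rooted binary tree on 10 taxa built this way always has 9 internal nodes, whatever its shape.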

The evolution of molecular sequences was simulated along the edges of these phylogenies, from the root to the leaves, according to a given condition of evolution (fast evolutionary rates on all edges, medium rates on all edges, slow rates on all edges, fast/slow rates on half of the edges, fast/slow rates on half of the sites) and to the Kimura 2-parameter model of evolution (transition/transversion ratio of 2). In this way, we generated 25,000 data sets for each condition of evolution.
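The per-edge step of such a simulation can be sketched as follows, using the standard closed-form transition probabilities of the Kimura 2-parameter model. This is only a sketch under stated assumptions: the ratio of 2 is read here as kappa = alpha/beta (transition rate over the rate of each transversion type), the edge length `d` is taken to be the expected number of substitutions per site, and the function names are hypothetical. The original simulations may have used a different parameterization.

```python
import math
import random

# Transition partner (purine <-> purine, pyrimidine <-> pyrimidine).
TRANSITION = {"A": "G", "G": "A", "C": "T", "T": "C"}
# The two possible transversion targets for each base.
TRANSVERSIONS = {"A": "CT", "G": "CT", "C": "AG", "T": "AG"}


def k2p_probs(d, kappa=2.0):
    """Return (P[no change], P[transition], P[one given transversion]).

    d is the edge length in expected substitutions per site, and
    kappa = alpha/beta is an assumed reading of the ratio in the text.
    """
    beta_t = d / (kappa + 2.0)      # since d = (alpha + 2*beta) * t
    alpha_t = kappa * beta_t
    e1 = math.exp(-4.0 * beta_t)
    e2 = math.exp(-2.0 * (alpha_t + beta_t))
    p_ts = 0.25 + 0.25 * e1 - 0.5 * e2
    p_tv = 0.25 - 0.25 * e1         # probability of each transversion type
    p_same = 1.0 - p_ts - 2.0 * p_tv
    return p_same, p_ts, p_tv


def evolve(seq, d, rng, kappa=2.0):
    """Evolve a DNA string along one edge of length d, site by site."""
    p_same, p_ts, p_tv = k2p_probs(d, kappa)
    out = []
    for base in seq:
        u = rng.random()
        if u < p_same:
            out.append(base)
        elif u < p_same + p_ts:
            out.append(TRANSITION[base])
        elif u < p_same + p_ts + p_tv:
            out.append(TRANSVERSIONS[base][0])
        else:
            out.append(TRANSVERSIONS[base][1])
    return "".join(out)


if __name__ == "__main__":
    rng = random.Random(42)
    root = "".join(rng.choice("ACGT") for _ in range(60))
    print(root)
    print(evolve(root, 0.1, rng))
```

Applying `evolve` recursively from the root sequence down to the leaves, once per edge, yields one simulated data set; conditions with unequal rates among edges or among sites would correspond to varying `d` per edge or per site.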

We applied the Qstar method independently to each data set, measuring each time the number of incorrect edges inferred by the method (which may be seen as false positives), as well as the size of the tree T* it outputs. The same process was applied to the NJ method. NJ always inferred a fully resolved tree, so that each time it inferred a wrong edge, it also missed a correct edge (which may be seen as a false negative). Depending on the condition of evolution, the sequence length and the data set, some edges of the model phylogeny T did not carry any substitution. As a result, data sets did not always contain information for each edge of T. In these cases, a method had no support to infer the corresponding edges. To account for this phenomenon, we also measured, for each data set, the number e_R of "realized" internal edges (i.e., edges which supported at least one substitution, cf. Kumar 96).
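The edge counts described above can be illustrated by comparing the sets of bipartitions (splits) induced by the internal edges of two trees. The sketch below is an assumption about how such a comparison might be coded, not the procedure actually used in the study: trees are nested tuples, a node with more than two children stands for an unresolved part of the tree, and the function names are hypothetical. Counting realized edges would additionally require the simulated substitution history and is not shown.

```python
def internal_splits(tree, taxa):
    """Return the nontrivial splits of a tree given as nested tuples.

    Each split is stored as the side that does not contain a fixed
    reference taxon, so the rooted and unrooted views of an edge agree.
    """
    taxa = frozenset(taxa)
    ref = min(taxa)
    splits = set()

    def visit(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = taxa - below if ref in below else below
        # Keep internal edges only: both sides need at least two taxa.
        if 2 <= len(side) <= len(taxa) - 2:
            splits.add(side)
        return below

    visit(tree)
    return splits


def edge_errors(true_tree, inferred_tree, taxa):
    """Return (false positives, false negatives, inferred tree size)."""
    true_splits = internal_splits(true_tree, taxa)
    inferred_splits = internal_splits(inferred_tree, taxa)
    false_pos = len(inferred_splits - true_splits)   # wrong edges inferred
    false_neg = len(true_splits - inferred_splits)   # correct edges missed
    return false_pos, false_neg, len(inferred_splits)


if __name__ == "__main__":
    taxa = range(6)
    model = (((0, 1), (2, 3)), (4, 5))
    partially_resolved = ((0, 1), 2, 3, (4, 5))
    fully_resolved = (((0, 2), (1, 3)), (4, 5))
    print(edge_errors(model, partially_resolved, taxa))
    print(edge_errors(model, fully_resolved, taxa))
```

In this toy example the partially resolved tree makes no false positive but misses one edge, while the fully resolved tree has as many false negatives as false positives, which is the trade-off the text describes for NJ.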

The results confirm that the Qstar method usually produces trees which contain almost only safe edges. More precisely, it inferred fewer than one wrong edge per ten trees (average of 1.3% incorrect edges) over all conditions of evolution. Even for the most difficult condition considered, i.e., unequal rates of evolution among different sites (which violates an assumption of the Kimura model and thus lowers the accuracy of the distance corrections), the Qstar method inferred only $\approx 3.9\%$ incorrect edges on average.

As a consequence of inferring almost only safe edges, Qstar usually produces trees which are only partially resolved. This implies that some correct edges were not inferred. However, fewer than one third of the correct edges were missing on average. Moreover, we can see from the table that there is a clear correlation between %e_R and %e_T, meaning that the Qstar method does not try to randomly resolve edges for which the data set does not contain any information. This behavior contrasts with that of most other methods, which infer fully resolved trees but usually with a non-negligible percentage of unsafe edges. For example, the NJ tree contained on average more than one wrong edge per tree (15.3% incorrect edges). Thus, the resulting tree usually contains some edges that reflect the particular data set rather than the species' history. The Qstar method is one of the few methods which tries to avoid this overfitting effect (see Berry (98) for other methods designed in that sense).
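The two ways of stating the error rates (percentage of incorrect edges and wrong edges per tree) can be related by a short calculation. It assumes that the percentages are taken relative to the $n - 3 = 7$ internal edges of a fully resolved unrooted tree on $n = 10$ taxa; this denominator is an assumption made here for illustration and is not stated in the text.

```latex
\begin{align*}
\text{Qstar:}\quad & 0.013 \times 7 \approx 0.09 \text{ wrong edges per tree}
  \quad (\text{fewer than one per ten trees}),\\
\text{NJ:}\quad & 0.153 \times 7 \approx 1.07 \text{ wrong edges per tree}
  \quad (\text{more than one per tree}).
\end{align*}
```

Under that assumption, both verbal statements agree with the reported percentages.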

Last modified: Thu Mar 11 10:51:25 MET DST 1999