The genome can be modeled as a set of strings (chromosomes) of
distinguished elements called genes. Genome duplication is an
important source of new gene functions and novel physiological
pathways. Originally (ancestrally), a duplicated genome contains two
identical copies of each chromosome, but this simple doubled
structure is disrupted by two kinds of rearrangement mutation:
reciprocal translocation (prefix and/or suffix exchanges between
chromosomes) and substring reversal. At the time
of observation, each of the chromosomes resulting from the
accumulation of rearrangements can be decomposed into a succession
of conserved segments, such that each segment appears exactly twice
in the genome. We present exact algorithms for reconstructing the
ancestral doubled genome in linear time, minimizing the number of
rearrangement mutations required to derive the observed order of
genes along the present-day chromosomes. Somewhat different
techniques are required for three models: a translocations-only
model and a translocations/reversals model, both in the
multichromosomal context (eukaryotic nuclear genomes), and a
reversals-only model for single-chromosome prokaryotic and
organellar genomes. We apply
these methods to the yeast genome, which is thought to have doubled,
and to the liverwort mitochondrial genome, whose duplicate genes are
unlikely to have arisen by genome doubling.
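
As an illustration of the model only, the following minimal sketch
represents a genome as a list of chromosomes, each a list of genes, and
applies one reciprocal translocation and one reversal to a doubled
ancestor. It does not implement the reconstruction algorithms. The
function names are hypothetical, and two choices are assumptions of the
sketch rather than statements from the text: genes carry an orientation
sign that a reversal flips, and the translocation exchanges suffixes
only.

```python
from collections import Counter

def reversal(chrom, i, j):
    """Reverse the substring chrom[i:j]; flipping signs is an assumption."""
    return chrom[:i] + [-g for g in reversed(chrom[i:j])] + chrom[j:]

def translocation(a, b, i, j):
    """Exchange the suffixes a[i:] and b[j:] between two chromosomes."""
    return a[:i] + b[j:], b[:j] + a[i:]

def is_rearranged_duplicated(genome):
    """True if every gene (ignoring orientation) occurs exactly twice."""
    counts = Counter(abs(g) for chrom in genome for g in chrom)
    return all(c == 2 for c in counts.values())

# Ancestral doubled genome: two identical copies of each chromosome.
ancestor = [[1, 2, 3], [1, 2, 3], [4, 5], [4, 5]]

# One reciprocal translocation between chromosomes 0 and 2,
# then one reversal on chromosome 1.
c0, c2 = translocation(ancestor[0], ancestor[2], 1, 1)
c1 = reversal(ancestor[1], 0, 2)
observed = [c0, c1, c2, ancestor[3]]

print(observed)
print(is_rearranged_duplicated(observed))
```

The observed genome no longer consists of identical chromosome pairs,
yet every gene still appears exactly twice, which is the input
condition the abstract describes.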