Positions after PhD


Research director


Jan 2016 – Present Montpellier, France

Responsibilities include:

  • Head of Institute of Computational Biology (2015-2019)
  • Head of French Molecular Bioinformatics network (2010-2016)
  • Head of computer science dpt (2007-2010)



Jan 2008 – Oct 1999 Montpellier, France
Bioinformatics and algorithmics research.

Postdoctoral fellow

German Cancer Research Center (Deutsches Krebsforschung Zentrum - DKFZ)

Sep 1996 – Oct 1999 Heidelberg, Germany
Algorithms for transcriptomics.


News about software, results, publications, or collaborations

Both the INSERM and CNRS issued press news on their websites.

Charlotte Dubec joins the team for an internship in collaboration with SpyGen company.

Epitranscriptomic study published in Nature Communications





Algorithms for shortest superstring questions


Software for efficient metagenomics and applications to virus metagenomics


Information fuelled biophysical models for the control of gene expression

Recent Publications

Cancer stem cells (CSCs) are a small but critical cell population for cancer biology since they display inherent resistance to standard therapies and give rise to metastases. Despite accruing evidence establishing a link between deregulation of epitranscriptome-related players and tumorigenic process, the role of messenger RNA (mRNA) modifications in the regulation of CSC properties remains poorly understood. Here, we show that the cytoplasmic pool of fat mass and obesity-associated protein (FTO) impedes CSC abilities in colorectal cancer through its N6,2’-O-dimethyladenosine (m6Am) demethylase activity. While m6Am is strategically located next to the m7G-mRNA cap, its biological function is not well understood and has not been addressed in cancer. Low FTO expression in patient-derived cell lines elevates m6Am level in mRNA which results in enhanced in vivo tumorigenicity and chemoresistance. Inhibition of the nuclear m6Am methyltransferase, PCIF1/CAPAM, fully reverses this phenotype, stressing the role of m6Am modification in stem-like properties acquisition. FTO-mediated regulation of m6Am marking constitutes a reversible pathway controlling CSC abilities. Altogether, our findings bring to light the first biological function of the m6Am modification and its potential adverse consequences for colorectal cancer management.

Novel recombinant viruses may have important medical and evolutionary significance, as they sometimes display new traits not present in the parental strains. This is particularly concerning when the new viruses combine fragments coming from phylogenetically distinct viral types. Here, we consider the task of screening large collections of sequences for such novel recombinants. A number of methods already exist for this task. However, these methods rely on complex models and heavy computations that are not always practical for a quick scan of a large number of sequences.We have developed SHERPAS, a new program to detect novel recombinants and provide a first estimate of their parental composition. Our approach is based on the precomputation of a large database of ‘phylogenetically-informed k-mers’, an idea recently introduced in the context of phylogenetic placement in metagenomics. Our experiments show that SHERPAS is hundreds to thousands of times faster than existing software, and enables the analysis of thousands of whole genomes, or long-sequencing reads, within minutes or seconds, and with limited loss of accuracy.The source code is freely available for download at https://github.com/phylo42/sherpas.Supplementary data are available at Bioinformatics online.

The hierarchical overlap graph (HOG for short) is an overlap encoding graph that efficiently represents overlaps from a given set $P$ of $n$ strings. A previously known algorithm constructs the HOG in $O(|| P || + n^2)$ time and $O(|| P || +n times min (n,max |s|:s ın P))$ space, where $|| P ||$ is the sum of lengths of the $n$ strings in $P$. We present a new algorithm of $O(|| P || łog n)$ time and $O(|| P || )$ space to compute the HOG, which exploits the segment tree data structure. We also propose an alternative algorithm using $O(|| P || fracłog nłog łog n)$ time and $O(|| P ||)$ space in the word RAM model of computation.

Abstract Motivation Phylogenetic placement (PP) is a process of taxonomic identification for which several tools are now available. However, it remains difficult to assess which tool is more adapted to particular genomic data or a particular reference taxonomy. We developed PEWO, the first benchmarking tool dedicated to PP assessment. Its automated workflows can evaluate PP at many levels, from parameter optimisation for a particular tool, to the selection of the most appropriate genetic marker when PP-based species identifications are targeted. Our goal is that PEWO will become a community effort and a standard support for future developments and applications of phylogenetic placement. Availability https://github.com/phylo42/PEWO Supplementary information Supplementary data are available at Bioinformatics online.

Given a set of finite words, the Overlap Graph (OG) is a complete weighted digraph where each word is a node and where the weight of an arc equals the length of the longest overlap of one word onto the other (Overlap is an asymmetric notion). The OG serves to assemble DNA fragments or to compute shortest superstrings, which are a compressed representation of the input. The OG requires space that is quadratic in the number of words, which limits its scalability. The Hierarchical Overlap Graph (HOG) is an alternative graph that also encodes all maximal overlaps, but uses space that is linear in the sum of the lengths of the input words. We propose the first algorithm to build the HOG in linear space for words of equal length.

Popular Topics

Aho-Corasik algebraic technique algorithm algorithms alignment alignment score alphabet size Anchor-based strategy ancient DNA Approximability approximate match approximate pattern matching approximate repeats approximation Approximation algorithm approximation algorithms APX Assembly autocorrelation award bacteria Bacterial genomes Basic Period binary alphabet Binary Vector binding binding site bioinformatics biophysics BLAST bounds Burrows-Wheeler cancer cDNA character Characterisation chromatin chromosome circular permutation cluster analysis clustering clustering algorithms coding coiled coil Collinear fragment chaining common word Comparative genomics complexity compressed data structures compression compression algorithms compression gain computer science Concat-Cycles concensus string conformation Connectivity cross-over cyclic cover Cyclic string cyclic strings Cytoplasmic Male Sterility Data compression data structure Data structures database DCJ de Bruijn graph discrete line DNA dominance order double cut and join duplication dynamic programming edit distance encoding enumeration epitranscriptome equality EST Eulerian tour evaluation evolution Exact Match exponential filtration Gapped seed genetics genome genome rearrangement genome sequencing genomics Golomb ruler graph greedy greedy algorithm Greedy conjecture Haemophilus influenzae Hamiltonian path heuristic algorithms Hi-C homologous sequences human hybrid zone Hypergraph incomplete lineage sorting indexing information content information theory input string INS insulin integer sequence internal duplication intragenic recombination irreducible factor kinship Kolmogorov complexity lattice LCS Levenshtein distance linear superstring linear time linear time algorithm Longest common subsequence mapping tool matroid maximal chain Maximum coverage Maximum independent set Maximum stable set medecine memory metagenome metagenomics microorganisms microsatellite evolution Minimum assignment minisatellite minisatellite locus minisatellites MIS monkey test motif motif size mouse mRNA MS-Align multiple alignment multiple read mutation MVR N-gram NGS NP-complete NP-hard On-line algorithms optimal coding oryza line overlap overlap graph Pairwise alignment parameterized complexity path pattern Pattern recognition pattern search perfect detection periods Permutation phylogenetic profile Polynomial Time Approximation Scheme proportional length protein domain PWM radish genome random-access memory random text Read mapping rearrangement Recognition recombinant regular expression regularity detection relative compression repeats reverse complementary sequence RLE RNA search algorithm seed segment tree seminar sequence sequence alignment sequence classification sequence comparison Sequence graph short tandem repeats Shortest cyclic cover of strings shortest DNA cyclic cover problem Shortest Superstring Problem similarity similarity metrics software spaced-seed String matching stringology Stringology Text Algorithms Indexing Data Structures De Bruijn Graph Assembly Space Complexity Dynamic Update strings student sturgeon phylogeny subset system suffix array suffix tree superstring sweep line tandem duplication tandem repeat tandem repeat alignment tandem repeats text text compression Text indexing Tiling time complexity tool training transcription factor transcriptome translation tree tree alignment validation score virus VNTR W[1]-hard web resource web server Whole genome alignment word enumeration word RAM model Yakuts zebra fish


Connect with me

  • (33) 04 67 41 86 64
  • LIRMM - UMR 5506 & University Montpellier (CC 05016) 860 rue de St Priest - 34095 Montpellier cedex 5 FRANCE