C3G (MASTODONS project)

Date : 08-03-2018

1 Scientific context

New sequencing technologies (NGS) have revolutionized biology and life sciences since their advent in 2005. Third Generation Sequencing (3GS) devices are now accessible and in use. Compared to second generation sequencing (2GS), these new technologies yield much longer reads (up to tens of kilobases), but at the cost of much lower precision: their error rates still lie above 10%. The increase in length promises numerous benefits for sequencing projects, especially for genome and transcriptome studies. Correcting such long reads is therefore crucial for downstream analysis and, more generally, for leveraging 3GS data.
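To give a rough sense of what an error rate of 10% means for a long read, the following minimal sketch computes the expected number of errors in a read and the probability that a window of k consecutive bases contains no error. It assumes, for illustration only, that errors hit each base independently with the same probability; real 3GS error profiles are more complex. The function names and the values (a 10 kb read, k = 21) are hypothetical choices, not taken from the C3G project.

```python
def expected_errors(read_length: int, error_rate: float) -> float:
    """Expected number of erroneous bases in a read.

    Assumes every base is wrong with the same probability `error_rate`.
    """
    return read_length * error_rate


def error_free_window_probability(k: int, error_rate: float) -> float:
    """Probability that k consecutive bases are all correct.

    Assumes errors are independent from one base to the next
    (a simplification made for this illustration).
    """
    return (1.0 - error_rate) ** k


if __name__ == "__main__":
    # Illustrative values only: a 10 kb read at a 10% error rate, and k = 21.
    print(expected_errors(10_000, 0.10))
    print(error_free_window_probability(21, 0.10))
```

Under these assumptions, a read of tens of kilobases carries on the order of a thousand errors or more, and only a minority of short windows are error-free, which is why correction matters before downstream analysis.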

We built the C3G consortium to tackle 3GS error correction, which remains a difficult computational task. The C3G project aims at developing bioinformatics methods and algorithms to correct sequence data produced by so-called Third Generation Sequencing technologies, and at testing and applying these methods on real-life data.

Our project is funded by CNRS Mission pour l'Interdisciplinarité - http://www.cnrs.fr/mi/


2 Meetings

Meetings of the C3G project, in chronological order.

2.1 In 2017

  1. Videoconference, 9th June 2017
  2. GATB programming training day in Montpellier
  3. Lille @ JOBIM meeting, 5th July 2017
  4. Paris, 5-6 September 2017 - program and abstracts on this webpage
  5. Rennes, 4th-6th December 2017

2.2 In 2018

  1. Colloquium Mastodons, 1st-2nd February 2018, CNRS Campus, Paris
  2. Videoconference meeting, 26th February
  3. Working group on correction evaluation, 26th February - 2nd March, Rouen

3 Consortium and Participants


Figure 1: Logos of Institutes

3.1 Labs & teams

Team | Coordinator | Institute / Lab | University / Institute
MAB | Eric Rivals | LIRMM | CNRS, INS2I, Université Montpellier
IPN | Pierre Abad | Institut Sophia Agrobiotech (ISA) | INRA, Université Côte d’Azur, CNRS
SPIBOC | Karine Hugot | Institut Sophia Agrobiotech (ISA) | INRA, Université Côte d’Azur, CNRS
TIBS | Thierry Lecroq | LITIS EA 4108 | Univ. Rouen Normandie
EEZ | Guillaume Castel and Nathalie Charbonnel | CBGP | INRA, IRD, CIRAD, SUPAGRO
EEG | Jean-François Flot | Evolutionary Biology & Ecology | Univ. Libre de Bruxelles

3.2 Permanent researchers

Firstname Name | Position | Lab or structure | Email
Etienne Danchin | Research | IPN - Ins. Sophia Agrobiotech (ISA) | etienne.danchin@inra.fr
Marc Bailly-Bechet | Ass. Prof. | IPN - Ins. Sophia Agrobiotech (ISA) | marc.bailly-bechet@inra.fr
Martine Da Rocha | Engineer | SPIBOC - Ins. Sophia Agrobiotech (ISA) | martine.da-rocha@inra.fr
Corinne Rancurel | Engineer | SPIBOC - Ins. Sophia Agrobiotech (ISA) | corinne.rancurel@inra.fr
Eric Rivals | Research | LIRMM | rivals@lirmm.fr
Guillaume Castel | Research | CBGP – INRA | guillaume.castel@inra.fr
Maxime Galan | Engineer | CBGP – INRA | maxime.galan@inra.fr
Caroline Tatard | Tech. | CBGP – INRA | caroline.tatard@inra.fr
Arnaud Lefebvre | Ass. Prof. | Univ. Rouen Normandie | arnaud.lefebvre@univ-rouen.fr
Thierry Lecroq | Prof. | Univ. Rouen Normandie | thierry.lecroq@univ-rouen.fr
Jean-François Flot | Prof. | Univ. Libre de Bruxelles | jflot@ulb.ac.be
Dominique Lavenier | Research | IRISA / INRIA Rennes | lavenier@irisa.fr
Claire Lemaitre | Research | IRISA / INRIA Rennes | Claire.lemaitre@inria.fr
Pierre Peterlongo | Research | IRISA / INRIA Rennes | Pierre.Peterlongo@inria.fr
Fabrice Legeai | Engineer | IGEPP – INRA Rennes | Fabrice.legeai@inra.fr
Stéphanie Robin | Engineer | IGEPP – INRA Rennes | Stephanie.robin@inra.fr
Denis Tagu | Research | IGEPP – INRA Rennes | denis.tagu@inra.fr

French job titles and their translation

  • DR / CR: Research (senior or junior researcher)
  • MCF: Ass. Prof.
  • IE: Engineer

3.3 Doctoral students, post-doctoral fellows, engineers, visitors

Firstname Name | Position | Lab or structure | Email
Bastien Cazaux | Postdoc | LIRMM - MAB (now @ Univ. Helsinki) | bastien.cazaux@laposte.net
Julien Veyssier | Engineer | LIRMM - MAB Montpellier | julien.veyssier@lirmm.fr
Mathias Weller | Postdoc | LIRMM - MAB (now @ IGM Marne La Vallée) | mathias.weller@lirmm.fr
Georgios Koutsovoulos | Postdoc | Institut Sophia Agrobiotech (ISA) | georgios.koutsovoulos@inra.fr
Pierre Morisse | PhD stud. | LITIS - TIBS Rouen | pierre.morisse2@univ-rouen.fr
Antoine Limasset | Postdoc | Université Libre de Bruxelles | antoine.limasset@gmail.com
Lolita Lecompte | PhD stud. | IRISA/INRIA - GenScale | lolita.lecompte@inria.fr
Camille Marchet | PhD stud. | IRISA/INRIA - GenScale | camille.marchet@irisa.fr

4 Diffusion - publications - communications

4.1 Publications

  1. The cacao Criollo genome v2.0: an improved version of the genome for genetic and functional genomic studies. X. Argout, G. Martin, G. Droc, O. Fouet, K. Labadie, E. Rivals, J.M. Aury, C. Lanaud. BMC Genomics 18:730, DOI: 10.1186/s12864-017-4120-9 2017.

4.2 Preprints

  1. P. Morisse, T. Lecroq, and A. Lefebvre. Hybrid correction of highly noisy Oxford Nanopore long reads using a variable-order de Bruijn graph. bioRxiv, doi: https://doi.org/10.1101/238808, 2017.
  2. B. Cazaux, E. Rivals. Hierarchical Overlap Graph. HAL Archives ouvertes <lirmm-01674319> 2017.
  3. A. Limasset, J.-F. Flot, P. Peterlongo: Toward perfect reads. CoRR abs/1711.03336 (2017)

4.3 Communications

  1. P. Morisse, T. Lecroq, and A. Lefebvre. HG-CoLoR: Hybrid graph for the error correction of long reads. In C. Lhoussaine and H. Touzet, editors, Actes des Journées Ouvertes Biologie Informatique Mathématiques, pages 67-74, Lille, France, 2017.
  2. G. Castel. Projet PIRATE. Étude de la diversité intrahôte du virus Puumala chez le campagnol roussâtre par séquençage haut-débit. Journées annuelles 2017 du groupe Rongeurs du CBGP, Montpellier, France, 2017.
  3. A. Limasset, J.-F. Flot, P. Peterlongo: Toward perfect reads. Workshop SeqBio 2017

5 Software and tools

List of tools and software developed or maintained within the framework of the C3G project.

5.1 Hybrid correction

  1. HG-CoLoR: Hybrid correction of highly noisy Oxford Nanopore long reads using a variable-order de Bruijn graph. Access: https://github.com/morispi/HG-CoLoR ; preprint: https://www.biorxiv.org/content/early/2017/12/22/238808
  2. LoRDEC: a hybrid error correction method based on a de Bruijn graph. Access: https://gite.lirmm.fr/lordec/lordec-releases ; documentation: http://www.lirmm.fr/~rivals/lordec/FAQ
  3. BCOOL: Hybrid error correction using a compacted de Bruijn graph. Access: https://github.com/Malfoy/BCOOL

5.2 Self-correction

  1. LoRMA: a self-correction program for long reads (PacBio, Nanopore). Access: http://www.cs.helsinki.fi/u/lmsalmel/LoRMA

5.3 Evaluation based on simulated read data

  1. ELECTOR: pipeline for comparing correction tools using simulated read data. Access: git repository https://github.com/kamimrcht/ELECTOR ; command for download:
git clone --recursive https://github.com/kamimrcht/ELECTOR.git

