dipwmsearch

Preprint and new Python package available

Finding binding site motifs in long DNA or RNA sequences is a common bioinformatics task. We designed a new algorithm that handles dinucleotide Position Weight Matrices (di-PWMs for short) as the motif representation. It adapts an enumeration-based search strategy to di-PWMs. The HOCOMOCO database, for instance, collects di-PWMs for human and mouse transcription factor binding sites.
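
To make the idea concrete, here is a minimal Python sketch of what searching with a di-PWM means. It is an illustration only, not the dipwmsearch implementation or its API: the toy matrix values, the function names, and the simple pruning bound are all assumptions made for this example. A di-PWM for a motif of length m has m-1 columns, one per pair of adjacent positions, and the score of a window is the sum of the weights of its overlapping dinucleotides. The sketch compares a naive scan of every window with a simple enumeration-based search, which first lists all words reaching the score threshold and then looks them up in the text.

```python
from itertools import product

ALPHABET = "ACGT"


def make_column(favoured, high=2.0, low=-1.0):
    """Toy column: weight `high` for one dinucleotide, `low` for all others."""
    return {a + b: (high if a + b == favoured else low)
            for a, b in product(ALPHABET, repeat=2)}


# Hypothetical di-PWM (made-up weights, not taken from HOCOMOCO).
# Three columns describe a motif of length four.
TOY_DIPWM = [make_column("AC"), make_column("CG"), make_column("GT")]


def window_score(dipwm, word):
    """Sum the weights of the overlapping dinucleotides of `word`."""
    return sum(col[word[i:i + 2]] for i, col in enumerate(dipwm))


def naive_search(dipwm, text, threshold):
    """Score every window of the text and keep those reaching the threshold."""
    m = len(dipwm) + 1
    hits = []
    for pos in range(len(text) - m + 1):
        word = text[pos:pos + m]
        score = window_score(dipwm, word)
        if score >= threshold:
            hits.append((pos, word, score))
    return hits


def enumerate_valid_words(dipwm, threshold):
    """List every word scoring >= threshold, pruning hopeless prefixes."""
    # best_rest[i]: upper bound on the score obtainable from columns i onwards.
    best_rest = [0.0] * (len(dipwm) + 1)
    for i in range(len(dipwm) - 1, -1, -1):
        best_rest[i] = best_rest[i + 1] + max(dipwm[i].values())

    valid = {}

    def extend(prefix, score):
        i = len(prefix) - 1  # index of the next column to apply
        if i == len(dipwm):
            valid[prefix] = score
            return
        for c in ALPHABET:
            s = score + dipwm[i][prefix[-1] + c]
            # Extend only if the best possible completion can reach the threshold.
            if s + best_rest[i + 1] >= threshold:
                extend(prefix + c, s)

    for c in ALPHABET:
        extend(c, 0.0)
    return valid


def enumeration_search(dipwm, text, threshold):
    """Enumerate valid words once, then look up each window of the text."""
    valid = enumerate_valid_words(dipwm, threshold)
    m = len(dipwm) + 1
    hits = []
    for pos in range(len(text) - m + 1):
        word = text[pos:pos + m]
        if word in valid:
            hits.append((pos, word, valid[word]))
    return hits


if __name__ == "__main__":
    text = "TTACGTACGAACGT"
    for hit in enumeration_search(TOY_DIPWM, text, threshold=3.0):
        print(hit)
```

Both functions report the same occurrences. The enumeration pays a one-time cost to list the valid words, which is worthwhile when the threshold is selective and the text is long. In this sketch the lookup is a plain dictionary test on each window, chosen only for brevity.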

We provide a new Python package, called dipwmsearch, which offers functions to search for a di-PWM in any DNA or RNA input sequence. It is easy to install via PyPI or conda, and is documented online (see the links under Access below). Try it out. Feedback is very welcome.

A preprint presenting the algorithm behind dipwmsearch is freely available on HAL and bioRxiv:

dipwmsearch: a python package for searching di-PWM motifs

bioRxiv 2022 doi:10.1101/2022.11.08.515647

Marie Mille, Julie Ripoll, Bastien Cazaux, and I are the co-authors.

Access

  1. Python package: https://pypi.org/project/dipwmsearch/
  2. Documentation: https://rivals.lirmm.net/dipwmsearch/
  3. Conda package: https://anaconda.org/atgc-montpellier/dipwmsearch
  4. Source code: https://gite.lirmm.fr/rivals/dipwmsearch

Eric Rivals
CNRS Research Director in Computer Science and Bioinformatics

My research interests include string algorithms, bioinformatics, and genomics.