
 

Software SearchRepeats

What it does

 

Here you can download the executable of the SearchRepeats software. SearchRepeats searches for exact, non-overlapping repeats in nucleotide (DNA) sequences and outputs them in a text report. For a fuller description, see our publication.

Publication

 

Fast Discerning Repeats in DNA Sequences with a Compression Algorithm
E. Rivals, M. Dauchet, J-P. Delahaye, O. Delgrange
Extended abstract in the 8th Workshop on Genome and Informatics (GIW97)
Tokyo, 12-13 Dec 1997

How to use it?

 

usage : SearchRepeats  <filename>  [Min Factor Length]

Parameters:

filename
name of the input file that contains the sequence to analyse; several formats are accepted, including FASTA, GenBank, EMBL, ...
Min Factor Length
optional parameter; an integer that gives the minimal length of a repeated word to search for. It should be set to [formula not recovered] (sequence's length).
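
A minimal sketch of calling the program from a Python script, following the usage line above. It assumes the executable is on the PATH under the name SearchRepeats and that the report is written to standard output; neither is stated on this page. The helper names build_command and run_search_repeats are hypothetical.

```python
import subprocess


def build_command(filename, min_factor_length=None, executable="SearchRepeats"):
    """Build the argument list for: SearchRepeats <filename> [Min Factor Length]."""
    command = [executable, filename]
    if min_factor_length is not None:
        # The optional second argument must be an integer.
        command.append(str(int(min_factor_length)))
    return command


def run_search_repeats(filename, min_factor_length=None):
    """Run the program and return its text report (assumed to go to stdout)."""
    result = subprocess.run(
        build_command(filename, min_factor_length),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For instance, build_command("seq.fasta", 20) yields the list ["SearchRepeats", "seq.fasta", "20"]; the file name and the value 20 are placeholders, not recommended settings.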

A factor is defined by a position and a length in the sequence: it is a subword of the sequence that occurs at a given position.
A zone is a pair of factors that have the same sequence: the first one is the reference occurrence, the second is the encoded occurrence.
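
The two definitions above can be sketched as small data structures. This is an illustration only, not the program's internal representation: the class names, the 0-based positions, and the is_valid check are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Factor:
    position: int  # start in the sequence (0-based indexing is an assumption)
    length: int

    def word(self, sequence):
        """Return the subword of the sequence covered by this factor."""
        return sequence[self.position:self.position + self.length]


@dataclass(frozen=True)
class Zone:
    reference: Factor  # reference occurrence
    encoded: Factor    # encoded occurrence

    def is_valid(self, sequence):
        """Both factors spell the same word and do not overlap."""
        same_word = self.reference.word(sequence) == self.encoded.word(sequence)
        first, second = sorted(
            (self.reference, self.encoded), key=lambda f: f.position
        )
        no_overlap = first.position + first.length <= second.position
        return same_word and no_overlap


sequence = "ACGTTTACGT"
zone = Zone(reference=Factor(0, 4), encoded=Factor(6, 4))
print(zone.is_valid(sequence))  # True: "ACGT" occurs at 0 and at 6
```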

Output

 

First come some general figures: the number of factors, zones, etc. Then comes a list that describes each zone on one line. A factor occurrence can be referenced several times: this defines the TYPE of the zone. The first time a factor is referenced, the zone is of TYPE_2; further references to an already referenced factor give zones of TYPE_N. The table below gives the meaning of the other columns.

[Table: meaning of the output columns]
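
The TYPE rule can be sketched as follows. The sketch assumes that zones are processed in report order and that a reference factor is identified by its (position, length) pair; assign_types is a hypothetical name and not part of the program.

```python
def assign_types(reference_factors):
    """Label a zone TYPE_2 on the first use of its reference factor, TYPE_N after."""
    seen = set()
    types = []
    for factor in reference_factors:  # each factor is a (position, length) pair
        types.append("TYPE_N" if factor in seen else "TYPE_2")
        seen.add(factor)
    return types


# Made-up reference factors: the third zone reuses the first reference factor.
print(assign_types([(10, 5), (40, 8), (10, 5)]))
# ['TYPE_2', 'TYPE_2', 'TYPE_N']
```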

Example:

Nb_large_factors_in_seq 31231 Nb_encoded_zones   210 Nb_encoded_factors   209
Gain_evaluation_bits     55776 Encoded_char     33365 Code_length_evaluation    123537
    1  138920 TYPE_2      82   90953
    2  139002 TYPE_2     101   91138
    3  139123 TYPE_2     132   91259
    4  151734 TYPE_2      81  151634
...
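
A report like the one above could be read into Python with a sketch along these lines. The header lines alternate names and integer values, so they map directly to a dictionary. The names given to the zone columns are guesses from the example, not documented meanings: in the rows shown, the second column plus the fourth column of row 1 equals the second column of row 2, which suggests a position and a length. The function name parse_report is hypothetical.

```python
def parse_report(text):
    """Split a report into header counters and zone rows."""
    header, zones = {}, []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[2].startswith("TYPE_"):
            # Column names other than "type" are guesses from the example.
            zones.append({
                "index": int(fields[0]),
                "encoded_position": int(fields[1]),
                "type": fields[2],
                "length": int(fields[3]),
                "reference_position": int(fields[4]),
            })
        elif len(fields) % 2 == 0 and all(v.isdigit() for v in fields[1::2]):
            # Header lines alternate a counter name and its integer value.
            for name, value in zip(fields[0::2], fields[1::2]):
                header[name] = int(value)
    return header, zones


report = """\
Nb_large_factors_in_seq 31231 Nb_encoded_zones   210 Nb_encoded_factors   209
Gain_evaluation_bits     55776 Encoded_char     33365 Code_length_evaluation    123537
    1  138920 TYPE_2      82   90953
    2  139002 TYPE_2     101   91138
...
"""
header, zones = parse_report(report)
print(header["Nb_encoded_zones"], len(zones), zones[0]["type"])  # 210 2 TYPE_2
```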

Download executable files

 

Sun Solaris 2.7 executable


Eric RIVALS, LIRMM, 161 rue Ada, F-34392 Montpellier Cedex 5