Research in Genomics

Eric Rivals
LIRMM (Computer Science Department)
CNRS - Université de Montpellier
France
rivals_AT_lirmm.fr
http://www.lirmm.fr/~rivals


1  Supplementary material for "Combining SAGE tags to predict genomic transcribed regions"

Rivals-tandem-sage-job-mm.pdf

1  Tandem SAGE tags approach for transcriptome annotation

In a collaboration between the Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier (LIRMM), the Institut de Génétique Humaine (I.G.H.), and the Helsinki University of Technology (H.U.T.), we propose a new method to annotate transcriptionally active regions on any mammalian genome. For this purpose, we developed with the team of J. Tarhio (H.U.T.) a program that searches the genome for pairs of SAGE or MPSS tags in tandem. A tag anchored by Sau3AI is called a G-tag, while a tag anchored by NlaIII is termed a C-tag.
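The tag types can be told apart by the recognition site that starts each tag: CATG for a C-tag (NlaIII) and GATC for a G-tag. The following is a minimal sketch of that rule, assuming tags are given as plain strings that begin with the anchoring site; the function name `classify_tag` is hypothetical and is not part of the distributed programs.

```python
def classify_tag(tag):
    """Return 'C', 'G', or None depending on the tag's leading anchoring site.

    Assumption: the tag string starts with the 4-base anchoring site.
    """
    tag = tag.lower()
    if tag.startswith("catg"):   # NlaIII site, so a C-tag
        return "C"
    if tag.startswith("gatc"):   # GATC site, so a G-tag
        return "G"
    return None                  # neither site: not a recognised tag type


print(classify_tag("catgctatttagtt"))  # C
print(classify_tag("gatcagggctgagg"))  # G
```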

Two programs, named SearchTandemSAGE-CG and SearchTandemSAGE-GC, are available as Linux executables (32-bit architecture) and as Mac OS X executables; they can be downloaded free of charge by academic users.
SearchTandemSAGE-CG
: searches for C-G type pairs where, for a given G-tag, the associated C-tag is the nearest 5' C-tag that belongs to the input list of C-tags.
SearchTandemSAGE-GC
: performs the dual search, that is to say it searches for G-C type pairs where for a given C-tag, the associated G-tag is the nearest 5' G-tag that belongs to the input list of G-tags.
Usage is similar for both programs, so we detail it for only one of them.
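The C-G pairing rule described above can be illustrated with a naive Python sketch. It is not the algorithm used by SearchTandemSAGE-CG, whose implementation is not described here. It assumes a single forward-strand sequence, exact matching, 0-based positions, and no limit on the distance between the two tags; the function name `find_cg_pairs` is hypothetical.

```python
def find_cg_pairs(genome, c_tags, g_tags):
    """For each G-tag occurrence, pair it with the nearest upstream (5') C-tag.

    Assumptions: forward strand only, exact matches, 0-based positions.
    Each hit is a tuple (position, index of the tag in its list, tag).
    """
    genome = genome.lower()

    def occurrences(tags):
        hits = []
        for idx, tag in enumerate(tags, start=1):
            tag = tag.lower()
            start = genome.find(tag)
            while start != -1:
                hits.append((start, idx, tag))
                start = genome.find(tag, start + 1)
        return sorted(hits)  # sorted by position

    c_hits = occurrences(c_tags)
    pairs = []
    for g_hit in occurrences(g_tags):
        # C-tag occurrences located strictly 5' of this G-tag occurrence
        upstream = [c_hit for c_hit in c_hits if c_hit[0] < g_hit[0]]
        if upstream:
            pairs.append((upstream[-1], g_hit))  # last one is the nearest
    return pairs


genome = "aaaa" + "catgctatttagtt" + "gatcagggctgagg" + "tt"
for c_hit, g_hit in find_cg_pairs(genome, ["catgctatttagtt"], ["gatcagggctgagg"]):
    print(c_hit, g_hit)
```

The dual G-C search would swap the roles of the two lists: for each C-tag occurrence, keep the nearest 5' G-tag occurrence.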

1.1  Usage

1.2  Examples

Input files: tagsC.txt, tagsG.txt, dna.fa

Commands:
  1. Usage / Help
    > ./SearchTandemSAGE-CG -h
    Usage: ./SearchTandemSAGE-CG -h  print this helps
    Usage: ./SearchTandemSAGE-CG -pc <pattern-file-Ctype> -pg <pattern-file-Gtype> -d <sequence-file-FASTA>
    Default: <pattern-file-Ctype> tagsC.txt
    Default: <pattern-file-Gtype> tagsG.txt
    Default: <sequence-file-FASTA> dna.fa
    
    > SearchTandemSAGE-GC -h
    Usage: SearchTandemSAGE-GC -h    print this helps
    Usage: SearchTandemSAGE-GC -pc <pattern-file-Ctype> -pg <pattern-file-Gtype> -d <sequence-file-FASTA>
    Default: <pattern-file-Ctype> tagsC.txt
    Default: <pattern-file-Gtype> tagsG.txt
    Default: <sequence-file-FASTA> dna.fa
    


  2. Search C-G pairs without redirection
    ./SearchTandemSAGE-CG -pc tagsC.txt -pg tagsG.txt -d dna.fa 
    >testdna
    2;C;catgctatttagtt;16;3;G;gatcagggctgagg;30;
    5;C;catgtcagtttgga;46;4;G;gatcttctacttgc;60;
    6;C;catgtgggcgcctt;76;1;G;gatcctcagcctcc;90;
    3;C;catgcgctgtgtgc;106;2;G;gatcaaataaaaaa;120;
    


  3. Search C-G pairs with redirection
    ./SearchTandemSAGE-CG -pc tagsC.txt -pg tagsG.txt -d dna.fa > dna.resCG
    
    Output file: dna.resCG

  4. Search for G-C pairs with redirection
    ./SearchTandemSAGE-GC -pc tagsC.txt -pg tagsG.txt -d dna.fa > dna.resGC
    
    Output file: dna.resGC
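The result lines shown in example 2 are semicolon-separated, so a result file such as dna.resCG can be post-processed with a short script. The sketch below assumes that each line holds two groups of four fields (a tag number, the tag type, the tag sequence, and a position in the sequence) and that lines starting with ">" name the sequence; these field meanings are inferred from the sample output and are not documented here. The names `TagHit`, `parse_pair_line`, and `read_pairs` are hypothetical.

```python
from typing import NamedTuple


class TagHit(NamedTuple):
    index: int      # assumed: number of the tag in its pattern file
    kind: str       # 'C' or 'G'
    sequence: str   # tag sequence
    position: int   # assumed: position of the tag in the sequence


def parse_pair_line(line):
    """Parse one line such as '2;C;catgctatttagtt;16;3;G;gatcagggctgagg;30;'."""
    fields = line.strip().rstrip(";").split(";")
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
    first = TagHit(int(fields[0]), fields[1], fields[2], int(fields[3]))
    second = TagHit(int(fields[4]), fields[5], fields[6], int(fields[7]))
    return first, second


def read_pairs(lines):
    """Yield tag pairs, skipping blank lines and '>' sequence-name lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue
        yield parse_pair_line(line)


sample = [">testdna", "2;C;catgctatttagtt;16;3;G;gatcagggctgagg;30;"]
for first, second in read_pairs(sample):
    print(first.kind, first.position, second.kind, second.position)
```

To process a real result file, pass an open file object instead of the sample list, for example `read_pairs(open("dna.resCG"))`.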

1.3  Download

Linux 32-bit
: executables for SearchTandemSAGE-CG and SearchTandemSAGE-GC
Mac OS X (> 10.3)
: executables for SearchTandemSAGE-CG and SearchTandemSAGE-GC
