Gene Trajectories Visualizations

Context: visualizing gene trajectory extraction


What is the input data?

The input data consists of DNA microarray results for different HIV strains. Each DNA microarray captures gene expression values under HIV infection at several timestamps (4h, 8h, 24h, 48h, 72h) after infection by a given strain of the virus. The expression level is measured as a log fold change (logFC), obtained by comparing the values with reference ones.
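
As an illustration of the logFC measure, here is a minimal sketch. It assumes the usual base-2 definition, log2(measured / reference); the source does not state the base or the nature of the reference, and the intensity values below are made up for the example.

```python
import math


def log_fold_change(expression: float, reference: float) -> float:
    """Return log2(expression / reference); both values must be positive."""
    if expression <= 0 or reference <= 0:
        raise ValueError("expression values must be positive")
    return math.log2(expression / reference)


# Hypothetical intensities for one gene at the five timestamps (not real data).
reference = 200.0
measured = {"4h": 200.0, "8h": 400.0, "24h": 100.0, "48h": 800.0, "72h": 50.0}

# A positive logFC means over-expression relative to the reference,
# a negative one means under-expression, and 0 means no change.
log_fc = {t: log_fold_change(v, reference) for t, v in measured.items()}
```

With this convention, a logFC of 1 corresponds to a doubling of the expression level and a logFC of -1 to a halving.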

What is the project aim?

Analyzing DNA microarray results remains complex for biologists, and classic data mining approaches have shown their limits on such data. Data mining specialists from the Tatoo team at the LIRMM laboratory and from the UMR TETIS team, together with expert biologists from the IGMM laboratory, have been involved in a project applying the recent trajectory extraction method GeT_Move [1]. Such an approach can answer the question: "Which genes are expressed together at the same timestamps?"

What is the data mining process used?

1 - Clusters Visualization

Access the visualization >>

What is the aim?

The aim is to visualize how categories (clusters of genes) evolve over time by means of coloured ribbons. At each timestamp, ribbons split or merge according to the cluster values (the average expression value of the genes in each cluster).

What is the input data?

CATEGORIES BY TIMESTAMPS FOR EACH GENE

{
4h,8h,24h,48h,72h
average Log FC of the cluster where the gene is at each timestamp...
}

How to read it?

TIMESTAMPS Each horizontal line represents a timestamp. Time must be read from top to bottom (4h to 72h).

RIBBONS The width of a ribbon is proportional to the number of elements in a category, i.e. the number of genes in a cluster. The genes that belong to a given cluster at a timestamp share the same cluster value (e.g. -0.096928 at 8h). At the next timestamp their values may change in different ways (e.g. -0.515559 or 0.213290 at 24h): the ribbon then splits into as many branches as there are distinct values taken by the genes of the previous cluster.

COLORS Each 4h category is assigned a colour, so that a ribbon path (i.e. the evolution of a cluster) can be followed from timestamp to timestamp.
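
The reading rules above can be sketched in a few lines. This is only an illustration, not the code behind the visualization: the gene names and most of the values are hypothetical (only -0.096928, -0.515559 and 0.213290 come from the text), and `ribbon_widths` and `ribbon_links` are names introduced for the example.

```python
from collections import Counter

TIMESTAMPS = ["4h", "8h", "24h", "48h", "72h"]

# Hypothetical records: gene -> average logFC of its cluster at each timestamp.
genes = {
    "gene_a": [0.12, -0.096928, -0.515559, -0.40, -0.30],
    "gene_b": [0.12, -0.096928, 0.213290, 0.25, 0.31],
    "gene_c": [0.12, -0.096928, 0.213290, 0.25, 0.31],
}


def ribbon_widths(genes, index):
    """Number of genes per cluster value at one timestamp (the ribbon width)."""
    return Counter(values[index] for values in genes.values())


def ribbon_links(genes, index):
    """Number of genes moving from each cluster at `index` to each cluster at `index + 1`."""
    return Counter((values[index], values[index + 1]) for values in genes.values())


i_8h = TIMESTAMPS.index("8h")
widths_8h = ribbon_widths(genes, i_8h)      # one ribbon holding the three genes
links_8h_24h = ribbon_links(genes, i_8h)    # that ribbon splits into two branches
```

Here the single 8h ribbon of width 3 splits at 24h into a branch of width 1 and a branch of width 2.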

2 - Gene Trajectories Visualization

Access the visualization >>

What is the aim?

The aim is to help biologists analyze the results of gene trajectory extraction in a more intuitive way. Initially, every trajectory found is composed of genes and timestamps. Through gene annotation, each trajectory is associated with a list of possible biological functions in place of its list of genes, since a biological function is much more meaningful to biologists than a DNA microarray gene identifier or a gene name. The most frequent biological function is assumed to be the best representative of the trajectory. The application nevertheless gives access to the complete list of possible functions, sorted by frequency.
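
The ranking of functions for one trajectory could be sketched as follows. The source does not define how a frequency is computed; this sketch assumes it is the share of the trajectory's genes annotated with the function. The gene names, function names and the helper `function_frequencies` are hypothetical.

```python
from collections import Counter


def function_frequencies(annotations):
    """Rank functions by the share of genes annotated with them (assumed definition).

    annotations: dict mapping each gene of a trajectory to its list of functions.
    Returns a list of (function name, frequency) pairs, most frequent first.
    """
    counts = Counter(
        name for functions in annotations.values() for name in set(functions)
    )
    total = len(annotations)
    return [(name, count / total) for name, count in counts.most_common()]


# Hypothetical annotation of the four genes of one trajectory.
annotations = {
    "gene_a": ["f1", "f2"],
    "gene_b": ["f1"],
    "gene_c": ["f1", "f3"],
    "gene_d": ["f2"],
}

ranked = function_frequencies(annotations)
main_function, main_frequency = ranked[0]  # representative of the trajectory
```

The first entry plays the role of the representative function, while the full `ranked` list corresponds to the complete list sorted by frequency.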

What is the input data?

LIST OF TRAJECTORY PROFILES

{
id,
timestamps,
MainlyFunctionName,
MainlyFunctionFrequency,
NumberOfTrajectories
}

LIST OF COLOURS FOR FUNCTIONS

{
FunctionName,
Colour
}

LIST OF TRAJECTORIES FREQUENCIES BY PROFILE

{
id,
listOfTrajectoriesFrequencies :
[{ functionName,
frequency}, ...]
}

How to read it?

TIMELINE GRAPH Each line of rectangles is a trajectory profile, grouping all trajectories that share the same most frequent function and the same timestamps. The line height is proportional to the number of trajectories belonging to the profile. The line width shows the timestamps covered by the profile. The label "f1 : 60-97% (10)" means that the main function of this profile is "f1", with a frequency between 60% and 97% depending on which of the 10 trajectories composing the profile is considered.
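
As a minimal sketch of how such profiles and labels could be derived, the example below groups trajectories by timestamps and main function, then formats the label. The trajectories are hypothetical, and `build_profiles` and `profile_label` are names introduced for the illustration, not part of the application.

```python
from collections import defaultdict

# Hypothetical trajectories: (timestamps covered, main function, its frequency in %).
trajectories = [
    (("4h", "8h"), "f1", 60.0),
    (("4h", "8h"), "f1", 97.0),
    (("4h", "8h"), "f1", 75.0),
    (("8h", "24h", "48h"), "f2", 80.0),
]


def build_profiles(trajectories):
    """Group frequencies by (timestamps, main function): one entry per profile."""
    profiles = defaultdict(list)
    for timestamps, main_function, frequency in trajectories:
        profiles[(timestamps, main_function)].append(frequency)
    return dict(profiles)


def profile_label(main_function, frequencies):
    """Format a label such as 'f1 : 60-97% (3)' from a profile's frequencies."""
    low, high = min(frequencies), max(frequencies)
    return f"{main_function} : {low:.0f}-{high:.0f}% ({len(frequencies)})"


profiles = build_profiles(trajectories)
labels = {key: profile_label(key[1], freqs) for key, freqs in profiles.items()}
```

The number of frequencies in a profile is also what would drive the line height, and the tuple of timestamps what would drive its width.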

COLORS Each biological function name is assigned a colour. The colour assignment is the same in all visualizations.

FREQUENCIES A trajectory profile has more than one associated function. To see all of them and their frequencies per trajectory, click on a profile in the timeline graph.

LINEAR GRAPH This graph gives, for each timestamp, the total number of occurrences found for a given function (see the colours in the legend).
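
The counts behind the linear graph might be computed along these lines. The source does not specify what counts as an occurrence; this sketch assumes that a trajectory contributes one occurrence of each of its functions at every timestamp it covers. The data and the helper `occurrences_by_timestamp` are hypothetical.

```python
from collections import Counter


def occurrences_by_timestamp(trajectories):
    """Count (function, timestamp) pairs over all trajectories (assumed definition)."""
    counts = Counter()
    for timestamps, functions in trajectories:
        for timestamp in timestamps:
            for name in functions:
                counts[(name, timestamp)] += 1
    return counts


# Hypothetical trajectories: (timestamps covered, associated functions).
trajectories = [
    (("4h", "8h"), ["f1", "f2"]),
    (("8h", "24h"), ["f1"]),
]

counts = occurrences_by_timestamp(trajectories)
# One curve per function: its count at each timestamp, in chronological order.
f1_curve = [counts[("f1", t)] for t in ["4h", "8h", "24h", "48h", "72h"]]
```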





Project collaborators