2018
RESPI: System Robustness
By Bardou Antony (DALI - UPVD) on 2018-07-12
- The RESPI project aims to measure human respiratory volume by means of markers placed on the patient's thorax and abdomen. This talk details the methods used to carry out a proper robustness analysis of such a system. After a review of how the system works and of the optimizations we applied to it, we present the concepts and algorithms developed and implemented for the analysis.
High Level Transforms for SIMD and low-level computer vision algorithms
By Lacassagne Lionel (LIP6, Sorbonne Université) on 2018-05-18
NOTE! Unusual time slot: Friday morning, 9:30 am
- This paper presents a review of algorithmic transforms, called High Level Transforms, for IBM, Intel and ARM SIMD multi-core processors, to accelerate the implementation of low-level image processing algorithms. We show that these optimizations provide a significant acceleration. A first evaluation of the 512-bit SIMD Xeon Phi is also presented. We stress that the combination of optimizations leading to the best execution time cannot be predicted, and thus systematic benchmarking is mandatory. Once the best configuration is found for each architecture, a comparison of their performance is presented. The Harris point detection operator is selected as representative of low-level image processing and computer vision algorithms. Being composed of five convolutions, it is more complex than a simple filter and offers more opportunities to combine optimizations. The presented work can scale across a wide range of codes using 2D stencils and convolutions.
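The five-convolution structure of the Harris operator can be made concrete. Below is a minimal, unoptimized NumPy sketch of that pipeline (naive valid-mode convolutions, the usual kappa = 0.04; function names are ours, not from the paper):

```python
import numpy as np

def conv2d(img, k):
    # Naive 'valid' 2D convolution, enough for a sketch.
    kh, kw = k.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

def harris(img, kappa=0.04):
    # Two gradient convolutions (Sobel)...
    sx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    sy = sx.T
    ix = conv2d(img, sx)
    iy = conv2d(img, sy)
    # ...plus three smoothing convolutions on the gradient products,
    # hence the "five convolutions" mentioned in the abstract.
    g = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], float) / 16.0
    sxx = conv2d(ix * ix, g)
    syy = conv2d(iy * iy, g)
    sxy = conv2d(ix * iy, g)
    # Harris response: det(M) - kappa * trace(M)^2
    return sxx * syy - sxy * sxy - kappa * (sxx + syy) ** 2
```

It is exactly this fusable chain of stencils that the High Level Transforms restructure; the sketch shows the baseline before any such transform is applied.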
How to implement a VHDL design on an FPGA?
By Posso Julien (IRT St Exupery, Toulouse) on 2018-04-12
- I will describe what an FPGA is and how to implement VHDL code on one, as well as the goals of my upcoming work.
First optimizations in CTA
By Arrabito Luisa (LUPM, CNRS) on 2018-03-29
- This talk reviews the first optimizations introduced into the corsika processing of the simulation mode of CTA (Cherenkov Telescope Array). This work is carried out in collaboration with L. Arrabito, J. Bregeon (LUPM), D. Parello, G. Revy and Ph. Langlois (DALI/LIRMM).
High Performance Computing for gamma ray detection
By Aubert Pierre (LAPP/IN2P3-CNRS) on 2018-03-22
- Ground-based gamma-ray detectors such as H.E.S.S., MAGIC and VERITAS use two kinds of analysis methods. The most common is based on moment computations over the images (Hillas-type methods), but it does not reject noise well. The other is to use precomputed images (called templates) of the signal expected in the telescopes, which allows better stereoscopic reconstruction and discrimination.
- The precision of this method implies a huge computation time before the analysis, and an expensive analysis as well, because the images of the same event are compared pixel by pixel simultaneously.
- Unfortunately, the CTA experiment will produce too much data for such an analysis, 169 GB/s, and the amount of data makes it impossible to order the telescopes by event. This implies that stereoscopic reconstruction from telescope images is unachievable.
- The first part shows how to deeply optimize the Hillas parameter computation.
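As background, the Hillas parameters are the first and second amplitude-weighted moments of the image: the center of gravity, and the length and width given by the eigenvalues of the second-moment matrix. A plain-Python sketch of the unoptimized computation (hypothetical signature) is:

```python
import math

def hillas(xs, ys, amps):
    # xs, ys: pixel coordinates; amps: per-pixel amplitudes
    s = sum(amps)
    # first moments: center of gravity of the shower image
    cx = sum(a * x for a, x in zip(amps, xs)) / s
    cy = sum(a * y for a, y in zip(amps, ys)) / s
    # second central moments
    mxx = sum(a * (x - cx) ** 2 for a, x in zip(amps, xs)) / s
    myy = sum(a * (y - cy) ** 2 for a, y in zip(amps, ys)) / s
    mxy = sum(a * (x - cx) * (y - cy) for a, x, y in zip(amps, xs, ys)) / s
    # eigenvalues of [[mxx, mxy], [mxy, myy]] give length and width
    t = mxx + myy
    d = math.sqrt((mxx - myy) ** 2 + 4 * mxy * mxy)
    length = math.sqrt((t + d) / 2)
    width = math.sqrt(max((t - d) / 2, 0.0))
    return cx, cy, length, width
```

The talk's optimizations target this kind of per-telescope loop, which is trivially data-parallel across pixels and events.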
- The second part explains how to compare images by extracting their relevant information in a massively parallel way, which allows a stereoscopic reconstruction using the physical parameters of the best templates through a likelihood computation once the data are sufficiently reduced.
- This algorithm uses the Singular Value Decomposition (SVD) to extract each image's information into one or several singular values. The computation time spent on image comparison thus decreases thanks to the simplification allowed by the singular values. On the other hand, the time needed to generate the templates increases.
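The reduction step can be sketched with NumPy's SVD: keeping only the k largest singular values yields a rank-k approximation of the image, so templates can be compared through a handful of values instead of pixel by pixel (function name is ours; this is the generic rank-k truncation, not the experiment's actual code):

```python
import numpy as np

def rank_k(img, k):
    # Truncated SVD: keep only the k largest singular values.
    u, s, vt = np.linalg.svd(img, full_matrices=False)
    # Best rank-k approximation (in the least-squares sense).
    return u[:, :k] * s[:k] @ vt[:k, :]
```

For an image that is close to low-rank, the first few singular values carry almost all the information, which is what makes the singular-value comparison cheap; the extra cost moves to the SVD performed when the templates are generated.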
Computing on several cores: the question of determinism
By Goossens Bernard (DALI/LIRMM) on 2018-03-08
- In this talk, we propose a new model of parallel computation on multi-core platforms that guarantees determinism of the result, determinism of the placement of computations on the cores, and determinism of latencies. We show how much this model simplifies parallel programming, making it barely different from sequential programming. We propose a hardware architecture implementing this model. A 64-core prototype is under development on FPGA: the model has been synthesized; it remains to port it to a Xilinx Zynq board.
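The model itself is a hardware architecture, but a loose software analogy may help with the "determinism of the result" point: if the partition of the work and the order in which partial results are combined are fixed functions of the input alone, a floating-point reduction returns bit-identical results whatever the number of workers. A sketch (our illustration, not the authors' model):

```python
import math
from concurrent.futures import ThreadPoolExecutor

def det_sum(xs, workers):
    # Static block partition: chunk boundaries depend only on len(xs),
    # not on the number of workers or on scheduling order.
    n_chunks = 8  # fixed, hypothetical partition size
    step = math.ceil(len(xs) / n_chunks)
    chunks = [xs[i:i + step] for i in range(0, len(xs), step)]
    with ThreadPoolExecutor(max_workers=workers) as ex:
        # ex.map preserves input order, so partials come back in a fixed order.
        partials = list(ex.map(sum, chunks))
    # Combine in a fixed left-to-right order: the rounding errors,
    # and hence the result, are identical for any worker count.
    return sum(partials)
```

A dynamically scheduled reduction, by contrast, can combine partial sums in a different order on every run and produce different floating-point results.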
Salsa: An Automatic Tool to Improve the Numerical Accuracy of Programs
By Damouche Nasrine (LAMPS, UPVD) on 2018-02-08
- Salsa is an automatic tool that improves the accuracy of the floating-point computations performed in numerical codes. Based on static analysis by abstract interpretation, the tool takes an original program as input, applies a set of transformations to it, and then generates an optimized program that is more accurate than the initial one. The original and transformed programs are written in the same imperative language. This article is a concise description of former work on the techniques implemented in Salsa, extended with a presentation of the main software architecture, the inputs and outputs of the tool, as well as experimental results obtained by applying the tool to a set of sample programs from embedded systems and numerical analysis.
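As a toy illustration of the kind of accuracy-improving rewriting such a tool performs (a generic reassociation example, not Salsa's actual output), consider summing many small terms into one large one:

```python
def naive_sum(big, smalls):
    # Left-to-right: each small term is absorbed by the big accumulator
    # whenever it falls below half an ulp of it.
    s = big
    for x in smalls:
        s += x
    return s

def reassociated_sum(big, smalls):
    # Transformed program: accumulate the small terms first,
    # then add the big one; the small terms are no longer lost.
    t = 0.0
    for x in smalls:
        t += x
    return big + t
```

With big = 1e16 and a thousand terms of 0.5, the naive version returns 1e16 exactly (0.5 is below half an ulp of 1e16, so every addition is absorbed), while the reassociated version returns 1e16 + 500. Both programs are written in the same imperative style, which is the setting in which Salsa's source-to-source transformations operate.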