Marta Mattoso: Exploring Provenance in High Performance Scientific Computing
Thursday, December 6, 2012, Room 127, La Galera Building, LIRMM
Marta Mattoso, COPPE/UFRJ, Rio de Janeiro, Brazil
Title: Exploring Provenance in High Performance Scientific Computing
Large-scale scientific computations are often organized as a composition of many computational tasks linked through data flows. After a computational scientific experiment completes, a scientist has to analyze its outcome, for instance by checking the inputs and outputs of the computational tasks that are part of the experiment. This analysis can be automated using provenance management systems that describe, for instance, the production and consumption relationships between data artifacts, such as files, and the computational tasks that compose the scientific application. Due to their exploratory nature, large-scale experiments often involve iterations that evaluate a large space of parameter combinations. In this case, scientists need to analyze partial results during execution and dynamically intervene in the next steps of the simulation. Features such as user steering of workflows, which lets scientists track, evaluate and adapt the execution, need to be designed to support iterative methods. In this talk we show examples of iterative methods, such as uncertainty quantification, reduced-order models, CFD simulations and bioinformatics. We discuss challenges in gathering, storing and querying provenance as structured data enriched with information about the runtime behavior of computational tasks in high performance computing environments. We also show how provenance can enable interesting and useful queries that correlate computational resource usage, scientific parameters, and data set derivation. We briefly describe how the provenance of many-task scientific computations is specified and coordinated by current workflow systems on large clusters and clouds.
Last update on 19/06/2013