9:00 - Pr. Jesus Labarta - "On the OmpSs road: from the latency to the throughput age"
The OmpSs programming model and its runtime support
The talk will present a vision of how parallel computer architectures are evolving, along with some of the research at the Barcelona Supercomputing Center (BSC) driven by that vision. We consider that the evolution towards increasing complexity, scale and variability in our systems gives two technologies a very important role in future parallel computing, which, with the advent of multicores, means computing in general. On one side, performance analysis tools with very detailed analytics capabilities are key to understanding the actual behavior of our systems. On the other, programming models that hide the actual complexity of the underlying hardware are needed to ensure the programming productivity and performance portability on which the economic sustainability of programming efforts depends. We will present the OmpSs programming model and its development at BSC. OmpSs is a task-based model for homogeneous and heterogeneous systems that acts as a forerunner for OpenMP and targets multicores, accelerators and clusters in a uniform way. We will describe features of the NANOS++ runtime on which OmpSs is implemented, focusing on its dynamic scheduling capabilities and load-balancing support. We will also present the BSC tools environment, including trace visualization capabilities and specific features for understanding the actual behavior of the NANOS++ runtime and OmpSs programs.
10:30 - Pr. Raymond Namyst - "Can We Really Taskify the World? Challenges for Task-Based Runtime Systems Designers"
To fully tap into the potential of heterogeneous manycore machines, the use of runtime systems capable of dynamically scheduling tasks over the pool of underlying computing resources has become increasingly popular.
Such runtime systems expect applications to generate a graph of tasks of sufficient "width" so as to keep every processing unit busy. However, not every application can exhibit enough task-based parallelism to occupy the tremendous number of processing units of upcoming supercomputers. Nor can they generate tasks of appropriate granularity for CPUs and accelerators.
Exploiting the inner parallelism of tasks and co-scheduling parallel tasks simultaneously are means to cope with these issues. These techniques rely on runtime mechanisms such as hierarchical scheduling and resource negotiation. This talk will give some insights into how task-based applications can benefit from such features on manycore architectures.