

The Faculty of Informatics is pleased to announce a seminar given by Gerhard Wellein



Performance Engineering for HPC: Models generating insights
Speaker: Gerhard Wellein
University of Erlangen-Nuremberg, Germany
Date: Wednesday, March 29, 2017
Place: USI Lugano Campus, room SI-006, informatics building (Via G. Buffi 13)
Time: 13:30



We consider Performance Engineering (PE) as a structured, iterative process for code optimization and parallelization. The key ingredient is a white-box performance model, which provides insights into the interaction between the code and the hardware. The model identifies the actual performance-limiting factors (“bottlenecks”), allowing for a selection of appropriate code changes. Once the impact of the code changes is validated, the process restarts with a new bottleneck identified by the performance model. Since this model-based approach provides a thorough understanding of the impact of hardware features on code performance, it is also useful in various other areas such as performance reproducibility, performance prediction for future architectures, or education and training.

The talk will first introduce our PE concept and survey basic “white-box” performance models. Focusing on work performed in the “Equipping Sparse Solvers for Exascale” (ESSEX) project, we will demonstrate various aspects of PE in the context of sparse eigenvalue solvers for quantum physics applications. Here a thorough understanding of modern hardware concepts led to the proposal of a new sparse matrix data format, which delivers high performance for many matrix structures on all modern HPC compute devices (multicore CPUs, Intel Xeon Phi, Nvidia GPGPUs). The benefit of using (simple) analytic models in performance optimization is demonstrated for a Kernel Polynomial Method (KPM) based solver, which computes the spectral density of large sparse matrices. By designing specific kernel operations and applying blocking on interleaved vectors, this sparse KPM solver has been accelerated by a factor of 3-4 on the node level, delivering about 10% of peak performance on various generations of modern Intel CPUs and Nvidia GPGPUs. These improvements finally enabled us to achieve sustained performance in the PetaFlop/s range for large-scale (heterogeneous) KPM calculations on sparse matrices from quantum physics applications.

This work was supported by the German Research Foundation (DFG) through the Priority Program 1648 “Software for Exascale Computing” under project ESSEX (see https://blogs.fau.de/essex/).



Gerhard Wellein holds a PhD in Physics from the University of Bayreuth and is a regular Professor at the Department of Computer Science at the University of Erlangen-Nuremberg. He heads the HPC group at the Erlangen Regional Computing Center (RRZE) and has more than ten years of experience in teaching HPC techniques to students and scientists from Computational Science and Engineering. His research interests include solving large sparse eigenvalue problems, novel parallelization approaches, performance engineering, and architecture-specific optimization.


Host: Prof. Olaf Schenk


Faculty of Informatics
Università della Svizzera italiana
Via Giuseppe Buffi 13
CH-6904 Lugano
Tel.: +41 (0)58 666 46 90
Fax: +41 (0)58 666 45 36
Email: decanato.inf@usi.ch
Web: www.inf.usi.ch
Twitter: @USI_INF

