Theory and practice of efficient supercomputer management

Vadim Voevodin (Moscow State Univ.)

Many modern supercomputing systems operate at low efficiency, leaving a lot of valuable computing resources idle. The reasons vary widely: inefficient user applications, misconfigured system software, hardware errors, and much more. Detecting these causes in a timely manner requires a holistic approach that makes it possible to analyze various aspects of a supercomputer's operation. This talk is devoted to a set of software tools being developed at Moscow State University to address these issues.