2015-07 ISC'15: Taking advantage of node power variation in homogenous HPC systems to save energy

Overview

research paper: Torsten Wilde, Axel Auweter, Hayk Shoukourian, Arndt Bode. Taking advantage of node power variation in homogenous HPC systems to save energy

published by Springer in: High Performance Computing: 30th International Conference, ISC High Performance 2015, Frankfurt, Germany, July 12-16, 2015, Proceedings, Pages 376-393


Abstract: Saving energy and, therefore, reducing the Total Cost of Ownership (TCO) for High Performance Computing (HPC) data centers has increasingly generated attention in light of rising energy costs and the technical hurdles imposed when powering multi-MW data centers. The broadest impact on data center energy efficiency can be achieved by techniques that do not require application specific tuning. Improving the Power Usage Effectiveness (PUE), for example, benefits everything that happens in a data center. Less broad but still better than individual application tuning would be to improve the energy efficiency of the HPC system itself. One property of homogeneous HPC systems that hasn't been considered so far is the existence of node power variation.

This paper discusses existing node power variations in two HPC systems. It introduces three energy-saving techniques that take advantage of this variation: node power aware scheduling, node power aware system partitioning, and node ranking based on power variation. It quantifies the possible savings for each technique and shows that node power aware system partitioning and node ranking based on power variation save energy with minimal effort over the lifetime of the system. All three techniques are also relevant for distributed and cloud environments.
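
As a rough illustration of the node ranking idea mentioned in the abstract, the following minimal sketch orders nodes by measured power and fills a job from the low-power end of the list. It is not the paper's implementation: the function names, the node names, and the wattage values are hypothetical and chosen only for this example, and it assumes that one representative power reading per node is available.

```python
from typing import Dict, List


def rank_nodes_by_power(node_power_w: Dict[str, float]) -> List[str]:
    """Return node names ordered from lowest to highest measured power.

    Ties are broken by node name so the ranking is deterministic.
    """
    return sorted(node_power_w, key=lambda name: (node_power_w[name], name))


def select_nodes(node_power_w: Dict[str, float], count: int) -> List[str]:
    """Pick the `count` lowest-power nodes for a job."""
    if count < 0 or count > len(node_power_w):
        raise ValueError("count must be between 0 and the number of nodes")
    return rank_nodes_by_power(node_power_w)[:count]


def power_saving_vs_average(node_power_w: Dict[str, float],
                            selected: List[str]) -> float:
    """Estimate the power saved (in watts) compared with running the job
    on the same number of nodes drawing the system-wide mean power."""
    mean_power = sum(node_power_w.values()) / len(node_power_w)
    selected_power = sum(node_power_w[name] for name in selected)
    return mean_power * len(selected) - selected_power


# Hypothetical per-node readings in watts (illustrative, not from the paper).
example_power_w = {"n01": 212.0, "n02": 198.0, "n03": 205.0, "n04": 221.0}

chosen = select_nodes(example_power_w, 2)
saving_w = power_saving_vs_average(example_power_w, chosen)
print(chosen, saving_w)
```

In this sketch the saving is an estimate relative to a random placement that would, on average, draw the mean node power. A real system would also have to consider node availability, topology, and how stable the per-node power readings are across workloads.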

Keywords: HPC, energy-efficiency, energy-saving, data center