High Performance Reproducible Computing
Authors: Zhang, Zhang, Intel Corporation; Rosenquist, Todd, Intel Corporation; Moffat, Kent, Intel Corporation
The call for reproducible computational results in scientific research has grown louder in recent years. Because much research relies on mathematical software and on modern high-performance computers for numerical computation, reproducible floating-point results are fundamental to ensuring that the research itself is reproducible.
It is well understood that, generally, operations involving IEEE floating-point numbers are not associative. For example, (a+b)+c may not equal a+(b+c). Different orders of operations may lead to different results. But exploiting parallelism in modern performance-oriented computer systems has typically implied out-of-order execution. This poses a great challenge to researchers who need exactly the same numerical results from run to run, and across different systems.
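The effect can be seen in a few lines of plain Python. The sketch below is illustrative only and is not taken from the talk: the helper names `sequential_sum` and `chunked_sum` are hypothetical, and the chunked version merely imitates how a parallel reduction might split work across threads. Explicit loops are used instead of the built-in `sum` so that every addition happens in a known order.

```python
# Floating-point addition is not associative.
left = (0.1 + 0.2) + 0.3    # 0.6000000000000001
right = 0.1 + (0.2 + 0.3)   # 0.6
print(left == right)        # False


def sequential_sum(values):
    """Add the values strictly left to right."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, n_chunks):
    """Imitate a parallel reduction: sum contiguous chunks, then combine.

    Hypothetical helper; n_chunks stands in for a thread count.
    """
    size = -(-len(values) // n_chunks)  # ceiling division
    partials = [
        sequential_sum(values[i:i + size])
        for i in range(0, len(values), size)
    ]
    return sequential_sum(partials)


values = [1e16, 1.0, -1e16, 1.0]
print(sequential_sum(values))   # 1.0
print(chunked_sum(values, 2))   # 0.0
```

The same four numbers give 1.0 or 0.0 depending only on how the work is partitioned, which is why a change in thread count or vector width can change a result.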
This talk describes how to use tools such as Intel® Math Kernel Library (Intel® MKL) and Intel® compilers to build numerical reproducibility into Python-based tools. Intel® MKL includes a feature called Conditional Numerical Reproducibility that allows users to get reproducible floating-point results when calling functions from the library. Intel® compilers provide broader solutions to ensure that compiler-generated code produces reproducible results. We demonstrate that scientific computing with Python can be numerically reproducible without losing much of the performance offered by modern computers. Our discussion focuses on the different levels of control available for obtaining reproducibility on the same system, across multiple generations of Intel architectures, and across Intel architectures and Intel-compatible architectures. The performance impact of each level of control is discussed in detail. Our conclusion is that there is usually some trade-off between reproducibility and performance. The approach we take gives end users many choices for balancing the requirement of reproducible results against the speed of computing.
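As a rough illustration of one such control, Conditional Numerical Reproducibility can be requested through the `MKL_CBWR` environment variable, which has to be set before Intel® MKL is loaded. The sketch below is not the talk's own code: the helpers `cnr_environment` and `run_with_cnr` are hypothetical names, and it assumes the child Python process uses a NumPy/SciPy build linked against Intel® MKL. Launching a fresh process is one way to make sure the setting is in place before the library initializes.

```python
import os
import subprocess
import sys


def cnr_environment(branch="COMPATIBLE", base_env=None):
    """Return a copy of the environment with MKL_CBWR set.

    Hypothetical helper. 'COMPATIBLE' is the most portable CNR setting;
    values such as 'AVX2' pin a specific code path instead.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MKL_CBWR"] = branch
    return env


def run_with_cnr(script_args, branch="COMPATIBLE"):
    """Run a Python script in a child process with CNR requested.

    Assumes the script imports an MKL-linked NumPy/SciPy; with another
    BLAS backend the variable has no effect.
    """
    return subprocess.run(
        [sys.executable, *script_args],
        env=cnr_environment(branch),
        check=True,
    )
```

A call such as `run_with_cnr(["analysis.py"], branch="AVX2")`, where `analysis.py` is a placeholder script name, would trade some portability across architectures for speed, while `"COMPATIBLE"` favors consistent results across Intel and Intel-compatible processors at a higher performance cost.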
This talk uses NumPy/SciPy as an example, but the principles and methodologies presented apply to any Python tool for scientific computing.