Quadruple Precision BLAS Routines for GPU


The latest release, QPBLAS-GPU 1.0, is now available for download.


Source code



Japanese version


English version



The routines are provided as open-source software under the BSD 2-Clause License.

Old version

version 0.9

Source code

Manual (Japanese version)

Purpose and Overview of the Program Development

In principle, a large-scale simulation can be run on a parallel computer, a collection of processors that work cooperatively. Pooling the memory available to each processor yields an extremely large memory space overall, which makes massively parallel simulation possible. However, because a computer calculates with a finite number of digits, every operation introduces a rounding error, and these errors accumulate as the number of operations grows. Until recently, simulations were not large enough for the accumulated error to be a problem. For simulations that require the K computer, or the very-large-scale parallel computers of the future, to run at maximum performance, however, the accuracy of the calculation is expected to degrade significantly in some cases.
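The accumulation of rounding errors described above can be seen even in a very small example. The following is a minimal sketch in Python, not part of QPBLAS-GPU: the data (one million copies of 0.1) and the helper name naive_sum are assumptions chosen only for illustration. It compares a plain left-to-right double precision sum with math.fsum, which returns the correctly rounded sum of the same inputs.

```python
import math

def naive_sum(values):
    """Add values left to right, rounding to double precision after every addition."""
    total = 0.0
    for v in values:
        total += v
    return total

# Illustrative data only (an assumption, not taken from the QPBLAS-GPU documentation).
values = [0.1] * 1_000_000

naive = naive_sum(values)
reference = math.fsum(values)          # correctly rounded sum of the same inputs
naive_error = abs(naive - reference)   # error accumulated over one million additions

print(f"naive sum     = {naive!r}")
print(f"reference sum = {reference!r}")
print(f"accumulated error = {naive_error:.3e}")
```

Each individual addition is accurate to roughly 16 decimal digits, yet the error in the final result is many times larger than a single rounding error, and it grows further as more terms are added.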
Therefore, the Center for Computational Science & e-Systems of the Japan Atomic Energy Agency (JAEA), in cooperation with Toshiyuki Imamura, a team leader at the RIKEN Advanced Institute for Computational Science, has extended BLAS (Basic Linear Algebra Subprograms), a library of routines that perform basic operations and are frequently used in computer simulations, to quadruple precision.
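To give a feel for what a higher-precision BLAS-style operation involves, here is a minimal sketch of a dot product that carries each intermediate value as an unevaluated pair of doubles (hi, lo), a technique often called double-double arithmetic. This is an assumption made for illustration: this page does not state how QPBLAS or QPBLAS-GPU represent quadruple precision values, and the function names two_sum, split, two_prod, dd_add and dd_dot are hypothetical, not the library's API.

```python
def two_sum(a, b):
    """Return (s, e) with s = fl(a + b) and s + e == a + b exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e

def split(a):
    """Veltkamp split: a == hi + lo, each half fitting in about 26 bits."""
    c = 134217729.0 * a            # 2**27 + 1
    hi = c - (c - a)
    return hi, a - hi

def two_prod(a, b):
    """Return (p, e) with p = fl(a * b) and p + e == a * b (barring overflow)."""
    p = a * b
    ah, al = split(a)
    bh, bl = split(b)
    e = ((ah * bh - p) + ah * bl + al * bh) + al * bl
    return p, e

def dd_add(x, y):
    """Add two (hi, lo) pairs and renormalise the result."""
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    hi = s + e
    return hi, e - (hi - s)

def dd_dot(xs, ys):
    """Dot product accumulated in (hi, lo) pairs; hypothetical helper."""
    acc = (0.0, 0.0)
    for x, y in zip(xs, ys):
        acc = dd_add(acc, two_prod(x, y))
    return acc

# Illustrative input where plain double precision loses the small term entirely.
xs = [2.0 ** 53, 1.0, -(2.0 ** 53)]
ys = [1.0, 1.0, 1.0]

naive = 0.0
for x, y in zip(xs, ys):
    naive += x * y

result = dd_dot(xs, ys)
print("double precision:", naive)
print("double-double   :", result)
```

In this input the exact answer is 1, but the plain double precision loop returns 0.0 because 2**53 + 1 is not representable as a double. The pair-based accumulation keeps the lost bit in the low word and recovers it.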
Building on these R&D activities, and in cooperation with Professor Hiroshi Okuda at the University of Tokyo, the Center for Computational Science & e-Systems has developed quadruple precision BLAS routines that run on GPUs, whose computational performance has increased significantly in recent years. By simply replacing conventional BLAS routines with the corresponding QPBLAS-GPU routines, users can perform quadruple precision calculations at higher speed. Not only the basic operations themselves but also applications that support GPUs can make use of quadruple precision, so users can easily carry out quadruple precision calculations on GPUs. The need for quadruple precision is likely to become an issue common to all large-scale simulations in the future. To contribute to future progress in computational science and to strengthen Japan's technical base, this website makes the results of JAEA's research and development available to researchers around the world.


[2] MPACK (multiple precision arithmetic versions of BLAS and LAPACK): http://mplapack.sourceforge.net
(High-precision linear algebra math library by M. Nakata at RIKEN)


Toshiyuki Imamura, the R&D Office of Simulation Technology, the Center for Computational Science (currently at RIKEN, Japan); Hiroshi Okuda (The University of Tokyo)

