User:Aisha/BLAS LAPACK dev guide

BLAS, CBLAS, LAPACK and LAPACKE are some of the most important parts of scientific computing toolkits, and having optimized versions present is a must for fully utilizing the CPU's capabilities.

This document contains recommendations and guidelines for developers and packagers on linking with the BLAS/CBLAS/LAPACK/LAPACKE libraries and their 64-bit API counterparts, in the hope of creating a consistent and uniform interface to these libraries.

Overview
Netlib's specifications are very precise in terms of API, but they lack the guidance developers need on how to package these libraries.

Two important things are missing from a software development point of view:
 * Shared library naming: There is no consistent nomenclature for linking a BLAS/LAPACK library. In fact, no guideline mandates any particular file name for these libraries at all. It is up to the OS's package maintainer to make sure that a package which uses these libraries is linked to the BLAS library provided by the OS.
 * Symbol presence: Netlib's guidelines do not require that BLAS be provided by a single library. It is possible to split the library into three chunks, one for each level of the BLAS routines, and link with all of them at compile time. This model is particularly useful on GPU-based architectures, where Level 1 and 2 functions can run on the CPU while the larger Level 3 functions should run on the GPU. Yet again, it is up to the package maintainer to ensure that all libraries get linked.

The recommendations and guarantees in this article should help all package maintainers and developers deal with BLAS/LAPACK-dependent packages in a consistent manner. These are the same guarantees provided by the Debian packages, so writing and porting code between Gentoo and Debian (and derivatives) should need minimal tweaking.

Shared library usage
Traditionally, it was only important to have the 32-bit implementations present, but new and upcoming software has options to take advantage of the 64-bit computation models and use the new interfaces. For the longest time, it has not been possible to have both models present at the same time. With the new nomenclature there should be little doubt about which API a package is using.

Library naming and linking
The libraries that need to be linked will always be named accordingly:



If using the corresponding library, it is guaranteed that this library can be linked at compile time with the appropriate linker flag.

Runtime usage
Linking to the specified library guarantees that, at runtime, the dynamic linker will resolve it to a BLAS/LAPACK provider.

This situation happens when using Intel's MKL libraries as a substitute for BLAS/LAPACK, hence a fair warning has been issued.

It is guaranteed that the symbols will be resolvable by the runtime linker and that every API function will have an implementation at runtime. There will be no unresolvable symbols.

Do not expect all functions to come from the same provider. BLAS and LAPACK may be resolved from different providers.
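A packager can check the symbol guarantee before shipping a package. The following is a minimal sketch in Python using `ctypes`, not an official tool. The helper names `missing_symbols` and `open_process_namespace` are hypothetical, and the Fortran-style names with a trailing underscore in the usage note are an assumption about the common naming convention, not something this guide specifies.

```python
import ctypes


def missing_symbols(lib, names):
    """Return the entries of `names` that cannot be resolved through `lib`.

    `lib` is a ctypes.CDLL handle. Attribute lookup on a CDLL asks the
    dynamic loader for the symbol and raises AttributeError on failure,
    which hasattr() reports as False.
    """
    return [name for name in names if not hasattr(lib, name)]


def open_process_namespace():
    """Handle for the symbols already loaded into this process (POSIX only)."""
    return ctypes.CDLL(None)
```

For instance, with `lib` being the `ctypes.CDLL` handle of the library a package links against, `missing_symbols(lib, ["dgemm_", "dgesv_"])` should return an empty list, no matter which provider ends up backing each symbol.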

Symbol naming conflicts
Each library exposes the same function names in its API as its 64-bit counterpart; the difference is that the data is stored in incompatible types.

Maintainers must make sure that if a package expects to use the 64-bit API, it is linked to the correct providing library.
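To see why mixing the two APIs is unsafe, consider a caller built for 64-bit integers handing a dimension to a routine that expects a 32-bit integer. The sketch below is only an illustration using Python's `struct` module. It assumes a little-endian machine and an argument passed by reference (as Fortran-style interfaces do), so the callee reads only the first four bytes. The function name `read_as_int32` is hypothetical.

```python
import struct


def read_as_int32(value_int64):
    """Simulate a 32-bit routine reading an argument stored as a 64-bit integer.

    The caller stores `value_int64` in 8 bytes (little-endian, an assumption);
    the callee interprets only the first 4 bytes as a signed 32-bit integer.
    """
    raw = struct.pack("<q", value_int64)      # 8 bytes written by the caller
    return struct.unpack("<i", raw[:4])[0]    # 4 bytes read by the callee


# A small dimension survives by accident on little-endian machines ...
print(read_as_int32(1000))        # 1000
# ... but values outside the 32-bit range are silently changed.
print(read_as_int32(2**32 + 5))   # 5
print(read_as_int32(2**31))       # -2147483648
```

The reverse mismatch is no better: a routine expecting a 64-bit integer would read four extra bytes of whatever happens to sit next to a 32-bit argument in memory.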

Runtime switching
It is possible (and almost assured) that the library that was linked at compile time is not the one that is used at run time.

Gentoo has a Blas-lapack-switch mechanism that allows the provider to be changed at runtime.

This can lead to errors in code that selects provider-specific optimizations or code paths at compile time. Maintainers should ensure that this (very rare) situation does not arise for their package.

This situation is most common when packages have a flag for building with MKL. In those cases, the recommendation is to build packages with only one of BLAS or MKL and not both.
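One way to avoid baking provider-specific assumptions in at build time is to probe the loaded library when the program runs. The following is a minimal sketch, assuming each provider can be recognised by a marker symbol that only it exports. The helper `detect_provider` and the marker table are hypothetical; `openblas_get_config` appears purely as an example marker.

```python
import ctypes


def detect_provider(lib, markers):
    """Return the first provider whose marker symbol resolves through `lib`.

    `lib` is a ctypes.CDLL handle and `markers` maps a provider name to a
    symbol that only that provider is assumed to export. Returns None when
    no marker is found, for example with a reference implementation.
    """
    for provider, symbol in markers.items():
        if hasattr(lib, symbol):
            return provider
    return None


# Hypothetical marker table; adjust it to the providers a package supports.
EXAMPLE_MARKERS = {"openblas": "openblas_get_config"}
```

Because the check runs every time the program starts, it keeps giving the right answer after the provider has been switched, which a compile-time test cannot do.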

Threading model
If using Intel's, its USE flag tbb should be enabled globally so that every library supporting it is built with it. Mixing threading backends gives inconsistent behaviour and can blow up resource usage.

It is recommended that all libraries select the same OpenMP library, whether that is the GNU, Intel or Clang one. This is not always possible, because the libraries have incompatible APIs and most packages use API extensions provided by one of them. In such cases, try to minimize the contact between consumer programs to prevent esoteric error conditions.
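Whether a process has ended up with more than one OpenMP runtime can be checked on Linux by looking at the shared objects mapped into it. The sketch below is an illustration under stated assumptions: it parses text in the format of `/proc/<pid>/maps` and looks for the usual file names of the GNU (`libgomp`), Intel (`libiomp5`) and LLVM/Clang (`libomp`) runtimes. The function name `loaded_openmp_runtimes` is hypothetical, and renamed or bundled copies of these libraries would not be detected.

```python
import os
import re

# Usual shared-object base names of the three OpenMP runtimes.
OPENMP_RUNTIMES = {"libgomp": "GNU", "libiomp5": "Intel", "libomp": "LLVM/Clang"}


def loaded_openmp_runtimes(maps_text):
    """Return the sorted OpenMP vendors found in /proc/<pid>/maps text."""
    found = set()
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # anonymous mapping without a file path
        filename = os.path.basename(fields[-1])
        match = re.match(r"(lib[A-Za-z0-9]+)\.so", filename)
        if match and match.group(1) in OPENMP_RUNTIMES:
            found.add(OPENMP_RUNTIMES[match.group(1)])
    return sorted(found)
```

On Linux, `loaded_openmp_runtimes(open("/proc/self/maps").read())` inspects the current process; more than one entry in the result indicates that threading runtimes are being mixed.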