Blas-lapack-switch


The BLAS/LAPACK runtime switching mechanism was created as a GSoC 2019 project. It is based on an ld.so feature and produces a result similar to Debian's update-alternatives. The classical numerical linear algebra libraries BLAS and LAPACK play important roles in scientific computing, and the varied demands placed on them pose non-trivial challenges for system management and Linux distribution development. A mechanism that lets users switch BLAS and LAPACK libraries smoothly addresses these problems. This project introduces the mechanism into Gentoo's eselect framework to manage BLAS and LAPACK, providing functionality equivalent to or better than Debian's update-alternatives.

User guide

Disabling the feature

This feature is disabled by default. Users who do not care about it can simply ignore the eselect-ldso USE flag and install packages with the default settings as before. Users who never read any documentation will not run into trouble with this default.

Enabling the feature

First install the skeleton of the mechanism:

FILE /etc/portage/make.conf
USE="${USE} eselect-ldso"
root #emerge --ask --verbose ">=virtual/blas-3.8" ">=virtual/lapack-3.8"

These virtual packages pull in the reference BLAS/LAPACK implementation and the customized eselect modules for BLAS and LAPACK, i.e. >=sci-libs/lapack-3.8.0, >=app-eselect/eselect-blas-0.2, and >=app-eselect/eselect-lapack-0.2. After the installation finishes, the status of the BLAS/LAPACK selections can be checked:

root #eselect blas list
Available BLAS/CBLAS (lib64) candidates:
  [1]   reference *
root #eselect lapack list
Available LAPACK (lib64) candidates:
  [1]   reference *

That means all binaries linked against libblas.so.3 or libcblas.so.3 will use the reference BLAS implementation; those linked against liblapack.so.3 will use the reference LAPACK implementation.
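To see which file the dynamic loader resolves for a given program, the output of ldd can be filtered. The following is a minimal sketch: the helper name blas_resolution and the program ./foobar are hypothetical and only serve as illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads `ldd` output on stdin and prints only the
# BLAS/CBLAS/LAPACK lines as "<soname> -> <resolved path>".
blas_resolution() {
    awk '$1 ~ /^lib(c?blas|lapack)\.so\.3$/ && $2 == "=>" { print $1, "->", $3 }'
}

# Usage (./foobar stands for any dynamically linked program):
#   ldd ./foobar | blas_resolution
```

Running it before and after an eselect blas set call should make a change of provider visible without touching the program itself.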

The reference implementation is very slow, which is unacceptable for some users (e.g. in scientific computing). Gentoo's main repository offers several optimized BLAS/LAPACK implementations, for example BLIS and OpenBLAS. They are registered with the mechanism automatically as long as the eselect-ldso USE flag is enabled during installation. For example:

root #emerge --ask --verbose ">=sci-libs/blis-0.6.0" ">=sci-libs/openblas-0.3.5"

Note that without the eselect-ldso flag, these packages are not registered with the mechanism and do not install the extra libraries at all. After installation with the feature enabled, the BLAS/LAPACK implementation can be switched like so:

root #eselect blas set openblas
root #eselect lapack set openblas

Run your program again and see whether it runs faster. Thanks to this mechanism, no recompilation is required. For more details on eselect blas or eselect lapack usage, consult the manual page or the help messages.
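A quick comparison of candidates can be scripted. The sketch below is only an illustration: bench_blas is a hypothetical helper, it must run as root because it changes the system-wide selection, and it measures wall-clock time in whole seconds only.

```shell
#!/bin/sh
# Hypothetical helper: run the same command under several candidates.
# Usage: bench_blas "<candidate> [<candidate> ...]" <command> [args...]
bench_blas() {
    local candidates="$1" impl start end
    shift
    for impl in $candidates; do
        # Switch both BLAS and LAPACK to the same provider.
        eselect blas set "$impl" || return 1
        eselect lapack set "$impl" || return 1
        start=$(date +%s)
        "$@"
        end=$(date +%s)
        echo "${impl}: $((end - start)) s"
    done
}

# Usage (./foobar stands for your own program):
#   bench_blas "reference openblas" ./foobar
```

The last candidate in the list stays selected after the loop, so set your preferred one again afterwards if needed.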

In Gentoo's main repository, the BLAS/LAPACK providers that support the eselect-ldso mechanism are sci-libs/lapack, sci-libs/blis, sci-libs/openblas, and sci-libs/mkl-rt. The recommended choice is blas=openblas lapack=openblas. If non-free software is acceptable to you, blas=mkl-rt lapack=mkl-rt is also a decent choice. Advanced users can explore the remaining combinations, but note that the blas=blis lapack=openblas combination is discouraged.

If you are migrating from the old BLAS/LAPACK packages and encounter package conflicts, keep >=virtual/{blas,cblas,lapack}-3.8, >=sci-libs/lapack-3.8, and >=app-eselect/eselect-{blas,lapack}-0.2. The old sci-libs/{blas,cblas,lapack,lapacke}-reference packages should be removed.

Important
Do not use pthreads and OpenMP at the same time, since this may cause a significant performance drop due to excessive thread creation. This can happen when some libraries linked into an application use OpenMP threading while others use pthreads.
Important
Do not use GNU OpenMP (libgomp.so) and Intel/LLVM OpenMP (libiomp.so) at the same time, as the symbol clash between them may lead to silent computation errors. This can happen when MKL uses Intel/LLVM OpenMP while other libraries linked into the same application use GNU OpenMP.

During migration from sci-libs/blas-reference to sci-libs/lapack (see bug #700176), /etc/env.d/blas/lib64/reference may be deleted when sci-libs/blas-reference is removed. If eselect blas list shows "no entry", reinstall sci-libs/lapack or register the candidate with app-eselect/eselect-blas again, for example on amd64:

root #eselect blas add lib64 /usr/lib64/blas/reference reference

Developer guide

Providers

For any BLAS/LAPACK implementation, it is necessary to provide extra shared objects with the proper SONAMEs. For example, do not use libopenblas.so.0 (SONAME=libopenblas.so.0) as the BLAS/CBLAS provider by simply symlinking it to libblas.so{,.3} and libcblas.so{,.3}: any program linked against BLAS (-lblas) or CBLAS (-lcblas) would then end up linked against libopenblas.so.0 (verify this with readelf -d foobar), which breaks the runtime switching mechanism. The current solution is to patch upstream build systems and build customized shared objects with the proper SONAMEs.
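The readelf check mentioned above can be narrowed to the relevant entries. This is a minimal sketch with a hypothetical helper name, elf_names; it only filters the text that readelf -d prints and does not inspect the ELF file itself.

```shell
#!/bin/sh
# Hypothetical helper: reads `readelf -d` output on stdin and prints the
# SONAME and NEEDED entries as "<tag> <library name>".
elf_names() {
    awk '/\((SONAME|NEEDED)\)/ {
        tag = ($2 == "(SONAME)") ? "SONAME" : "NEEDED"
        match($0, /\[.*\]/)                      # library name is in [brackets]
        print tag, substr($0, RSTART + 1, RLENGTH - 2)
    }'
}

# Usage:
#   readelf -d /usr/lib64/blas/openblas/libblas.so.3 | elf_names
#   readelf -d foobar | elf_names
```

For a correctly packaged provider the first command should report SONAME libblas.so.3, and a program built against it should list NEEDED libblas.so.3 rather than an implementation-specific name such as libopenblas.so.0.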

To package a BLAS/LAPACK provider with the runtime switching feature enabled, the maintainer should pay attention to the following points:

  • Patch upstream build systems and provide extra shared objects in a private library directory. Specifically, a new BLAS/CBLAS implementation, say "myblas", should install 4 files to the /usr/lib64/blas/<myblas>/ directory:
    1. libblas.so.3 (ELF shared object, providing the Fortran BLAS ABI, SONAME=libblas.so.3);
    2. libblas.so (symlink pointing to libblas.so.3);
    3. libcblas.so.3 (ELF shared object, providing the C BLAS ABI, SONAME=libcblas.so.3);
    4. libcblas.so (symlink pointing to libcblas.so.3).
  • Similarly, a new LAPACK implementation, say "mylapack", should install 2 files to the /usr/lib64/lapack/<mylapack>/ directory:
    1. liblapack.so.3 (ELF shared object, providing the Fortran LAPACK ABI, SONAME=liblapack.so.3);
    2. liblapack.so (symlink pointing to liblapack.so.3).
  • Register an alternative with eselect blas add ... during postinst.
  • Remove an alternative with eselect blas validate during postrm.
  • Guard the code associated with all the above points with the eselect-ldso USE flag.
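The registration and removal steps from the list above might look roughly like this in an ebuild. This is a sketch, not a copy of any ebuild in the tree: the candidate name myblas is hypothetical, and the argument order of eselect blas add follows the reference example shown in the user guide.

```shell
# Sketch of the registration hooks in a hypothetical "myblas" ebuild.
# `use` and `get_libdir` are provided by the package manager environment.
pkg_postinst() {
    # Do nothing unless the user opted in to runtime switching.
    use eselect-ldso || return 0
    local libdir
    libdir=$(get_libdir)
    # Arguments: <libdir> <directory holding the shared objects> <name>
    eselect blas add "${libdir}" "${EROOT}/usr/${libdir}/blas/myblas" myblas
}

pkg_postrm() {
    use eselect-ldso || return 0
    # Per this article, `validate` takes care of removing the alternative.
    eselect blas validate
}
```

A LAPACK provider would use eselect lapack in the same places.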

For real examples, see the latest ebuild files for sci-libs/lapack, sci-libs/blis, or sci-libs/openblas.

Reverse dependencies

If a package needs to link against the reference (a.k.a. netlib) BLAS and LAPACK, it should depend on the virtual packages, i.e. virtual/{blas,cblas,lapack,lapacke}, instead of a specific implementation. In this case the package must assume the standard (reference) API and ABI from the virtual package. Otherwise, list the specific implementation in the dependencies and avoid linking against -l{,c}blas or -llapack.
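As an illustration of the first case, the dependency block of a hypothetical ebuild that relies only on the standard BLAS/LAPACK interface could be written as follows. Which virtuals to list depends on the package; the selection here is an assumption.

```shell
# Fragment of a hypothetical ebuild for a package that only needs the
# standard (reference) BLAS/LAPACK API and ABI and links with
# -lblas -llapack.
RDEPEND="
	virtual/blas
	virtual/lapack
"
DEPEND="${RDEPEND}"

# A package that needs implementation-specific features would instead name
# that implementation (e.g. sci-libs/openblas) and link against its own
# library rather than -lblas or -llapack.
```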

Implementation details

The core of the implementation involves >=sci-libs/lapack-3.8.0, >=app-eselect/eselect-blas-0.2 and >=app-eselect/eselect-lapack-0.2, where eselect-blas controls both the (Fortran) BLAS and CBLAS alternatives at the same time.

sci-libs/lapack is the code base of the reference (or standard) BLAS, CBLAS, LAPACK, and LAPACKE. BLAS and LAPACK each define a stable Fortran API/ABI. CBLAS and LAPACKE are thin wrappers around BLAS and LAPACK respectively, providing the C API/ABI. In the BLAS/LAPACK runtime switching mechanism, every candidate must provide every API/ABI that the reference implementation provides. Thanks to this API/ABI stability, the backend libraries (e.g. libblas.so.3) can be changed without recompiling applications, as long as the new one provides a compatible ABI.

As a temporary solution, users can switch the libraries by adjusting the LD_LIBRARY_PATH environment variable. For system-level library switching, two custom eselect modules (eselect-blas, eselect-lapack) are provided. They manipulate configuration files under the /etc/ld.so.conf.d/ directory, telling ld.so where to find the BLAS/LAPACK libraries.
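The temporary LD_LIBRARY_PATH approach can be wrapped in a small function. This is a minimal sketch: with_blas is a hypothetical helper, it covers the BLAS/CBLAS directory only, and it assumes the provider is installed under /usr/lib64/blas/<name>/ as described in the developer guide.

```shell
#!/bin/sh
# Hypothetical helper: run one command with a chosen BLAS candidate,
# leaving the system-wide selection untouched.
# Usage: with_blas <candidate> <command> [args...]
with_blas() {
    local name="$1"
    shift
    # Prepend the candidate's private directory; keep any existing value.
    LD_LIBRARY_PATH="/usr/lib64/blas/${name}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}" "$@"
}

# Usage (./foobar stands for your own program):
#   with_blas openblas ./foobar
```

The override only affects the command that is started this way, which makes it convenient for one-off comparisons by unprivileged users.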

As a consequence, this solution depends on ld.so.conf support in the system C library. For further details, reading the code is recommended.

Code: app-eselect/eselect-blas app-eselect/eselect-lapack sci-libs/lapack sci-libs/blis sci-libs/openblas sci-libs/mkl-rt

Frequently asked questions

Q: I disabled this feature when installing a bunch of packages, but now I have changed my mind and want to enable runtime switching. How can I do this?

A: Simply reinstall the virtual packages and your favorite BLAS/LAPACK providers with the eselect-ldso flag enabled. The rest of the dependency tree does not need to be rebuilt, as a rebuild is expected to make no difference.

Q: Some BLAS/LAPACK implementations support 64-bit array indexing, which provides functions such as sasum(int64_t N, float* X, int64_t INCX). How does this mechanism deal with such a feature?

A: The “BLAS64” (or “BLAS-ILP64”) ABI differs from the “BLAS32” (or “BLAS-LP64”) ABI. Mixing them leads to unpredictable results, so the “BLAS64” feature is not integrated into the mechanism. Currently this feature is only provided in the sci-libs/openblas package, for Julia's use. In addition, the generic switching mechanism for BLAS64/LAPACK64 is still experimental in Debian. Once demand for “BLAS64” is common enough or the experiment in Debian succeeds, it could be provided in Gentoo as well.

Q: How can a customized implementation be added to this mechanism?

A: Take MKL as an example. First install MKL to /path/to/mkl and create the symlinks /path/to/mkl/lib{,c}blas.so{,.3} pointing to /path/to/mkl/libmkl_rt.so. Then register it with eselect blas add lib64 /path/to/mkl/ mkl. Note that building programs while MKL is selected is discouraged; the reason is explained in the Providers part of the developer guide.

A real example of adding and setting Intel MKL as the backend library:

user $pip install mkl --user
user $cd ~/.local/lib/
user $ln -s libmkl_rt.so libblas.so.3
user $ln -s libmkl_rt.so libblas.so
user $ln -s libmkl_rt.so libcblas.so.3
user $ln -s libmkl_rt.so libcblas.so
user $ln -s libmkl_rt.so liblapack.so.3
user $ln -s libmkl_rt.so liblapack.so
root #eselect blas add lib64 $(pwd) mkl
root #eselect lapack add lib64 $(pwd) mkl
root #eselect blas set mkl
root #eselect lapack set mkl

To remove the MKL candidate, or any other customized library, remove the corresponding files under the /etc/env.d/blas/ and /etc/env.d/lapack/ directories, then select another candidate. Note that the sci-libs/mkl-rt package can do all of the above steps for you.
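The removal step can be sketched as a small function. The helper name unregister_candidate is hypothetical, and the file layout /etc/env.d/{blas,lapack}/<libdir>/<name> is an assumption inferred from the /etc/env.d/blas/lib64/reference path mentioned in the user guide; check the actual files on your system before deleting anything.

```shell
#!/bin/sh
# Hypothetical helper: delete the registration files of a custom candidate.
# Assumed layout: /etc/env.d/{blas,lapack}/<libdir>/<name>
# Usage: unregister_candidate <name> [<libdir>] [<root prefix, for testing>]
unregister_candidate() {
    local name="$1" libdir="${2:-lib64}" root="${3:-}"
    rm -f -- "${root}/etc/env.d/blas/${libdir}/${name}" \
             "${root}/etc/env.d/lapack/${libdir}/${name}"
}

# Usage (as root), followed by selecting another candidate:
#   unregister_candidate mkl
#   eselect blas set openblas
#   eselect lapack set openblas
```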

Reference

  1. GSoC Project Link
  2. [gentoo-science] GSoC Proposal: Improvements to the BLAS / LAPACK and their reverse-dependencies https://archives.gentoo.org/gentoo-science/message/4d0186acdce6df538a2740e0f1146ae6
  3. [gentoo-dev] RFC: BLAS and LAPACK runtime switching https://archives.gentoo.org/gentoo-dev/message/d917547f7a9e1226fca63632a1e02026
  4. [gentoo-dev] [PATCH 0/2] RFC: Introducing ldso switching to BLAS/LAPACK https://archives.gentoo.org/gentoo-dev/message/95beba3dc1c0f684ce1ec82d51988fc8
  5. [gentoo-science] On BLAS and LAPACK int64 ABI https://archives.gentoo.org/gentoo-science/message/8e3b9567297de5a1809feb28c62be633
  6. Hasan ÇALIŞIR (Gentoo Proxy Maintainer) wrote an “openblas” script for a similar switching purpose. However, the implementation is neither generic nor simple enough. See https://github.com/gentoo/gentoo/pull/11700/files
  7. Zongyu Zhang fixed a bug in the numpy ebuild so that numpy could make use of the switching mechanism correctly.
  8. Some positive user feedback: https://github.com/gentoo/sci/issues/805#issuecomment-510469206 https://github.com/gentoo/sci/issues/805#issuecomment-512097570

Related pull requests:

  1. https://github.com/gentoo/gentoo/pull/12316
  2. https://github.com/gentoo/gentoo/pull/12318
  3. https://github.com/gentoo/gentoo/pull/12319
  4. https://github.com/gentoo/gentoo/pull/12322
  5. https://github.com/gentoo/gentoo/pull/12323
  6. https://github.com/gentoo/gentoo/pull/12356
  7. https://github.com/gentoo/gentoo/pull/12357
  8. https://github.com/gentoo/gentoo/pull/12358
  9. https://github.com/gentoo/gentoo/pull/12381
  10. https://github.com/gentoo/gentoo/pull/12382
  11. https://github.com/gentoo/gentoo/pull/12405
  12. https://github.com/gentoo/gentoo/pull/12409
  13. https://github.com/gentoo/gentoo/pull/12420
  14. https://github.com/gentoo/gentoo/pull/12422
  15. https://github.com/gentoo/gentoo/pull/12423
  16. https://github.com/gentoo/gentoo/pull/12475
  17. https://github.com/gentoo/gentoo/pull/12742

Maintainers

Author: Mo Zhou lumin@debian.org
GSoC Mentor: Benda Xu heroxbd@gentoo.org
