Clang

From Gentoo Wiki

Clang is an "LLVM native" C/C++/Objective-C compiler that uses LLVM as its backend and optimizer. It aims to be GCC compatible yet stricter, offers fast compile times with low memory usage, and produces useful error and warning messages that make build failures easier to troubleshoot.

Installation

Prerequisites

One of the goals of the Clang project is to be compatible with GCC, both in language dialect and on the command line (see #Differences from GCC). Even so, some packages will occasionally fail to build with it, and some may build successfully but segfault when executed. Other packages contain GCC-specific code and also fail to compile. In these cases, GCC has to be used as a fallback.

USE flags

Some packages are aware of the clang USE flag.

FILE /etc/portage/make.conf
USE="clang"

Emerge

root #emerge --ask --update --deep --changed-use sys-devel/clang

Configuration

GCC fallback environments

Create a configuration file containing a set of environment variables in Portage's built-in /etc/portage/env directory. It will override the defaults for any package that fails to compile with Clang. The name used below is just an example, so feel free to choose any name for the fallback environment. Be sure to substitute the chosen name for the one used in the examples in this article.

FILE /etc/portage/env/compiler-gcc (Environment named compiler-gcc)
CC="gcc"
CXX="g++"

The above are the most basic environment variables needed. You can change them to suit your needs, for example by enabling or disabling link-time optimization or by setting alternative AR, NM, and RANLIB tools. Here are two examples:

FILE /etc/portage/env/compiler-gcc-lto (Environment named compiler-gcc-lto)
CC="gcc"
CXX="g++"
CFLAGS="-flto=$N -march=native -O2 -pipe"    #replace $N with the number of threads to use during LTO, usually the value printed by nproc
CXXFLAGS="${CFLAGS}"
AR="gcc-ar"
NM="gcc-nm"
RANLIB="gcc-ranlib"

FILE /etc/portage/env/compiler-gcc (Environment named compiler-gcc)
CC="gcc"
CXX="g++"
CFLAGS="-march=native -O2 -pipe"
CXXFLAGS="${CFLAGS}"
AR="ar"
NM="nm"
RANLIB="ranlib"

Basically, copy over your current working GCC config from your make.conf in the event we need to use it as a fallback. If you choose to use LLVM's implementation of AR, NM, and RANLIB as detailed later in the article, be sure to set them back to the GNU versions for your GCC fallback environments as shown in the above example. If you choose not to, you can ignore the AR, NM, and RANLIB variables. If you want to use link-time optimization it's a good idea to have two separate environments like the above examples.

If you have to use the GCC fallback environment(s), add the appropriate entries to the /etc/portage/package.env file.

FILE /etc/portage/package.env (Falling back to GCC for app-foo/bar and app-bar/baz)
app-foo/bar compiler-gcc-lto        #compiled using GCC with link-time optimization since package bar compiles using lto
app-bar/baz compiler-gcc     #compiled using GCC with no link-time optimization since package baz fails using lto
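
Once several overrides accumulate, it can be handy to check which environment a package is currently assigned. The following is a minimal sketch of such a lookup; the function name envs_for_package is hypothetical, and it assumes package.env is a single file with one "category/package environment..." entry per line (Portage also accepts a directory here, which the sketch does not handle).

```shell
# envs_for_package ATOM [FILE]: print the environment names listed for ATOM.
# Assumption: FILE is a single package.env-style file, not a directory.
envs_for_package() {
    awk -v atom="$1" '
        { sub(/#.*/, "") }                                   # strip trailing comments
        $1 == atom { for (i = 2; i <= NF; i++) print $i }    # one environment per line
    ' "${2:-/etc/portage/package.env}"
}

# Example against a scratch copy of the entries shown above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
app-foo/bar compiler-gcc-lto        #compiled using GCC with link-time optimization
app-bar/baz compiler-gcc     #compiled using GCC with no link-time optimization
EOF
envs_for_package app-foo/bar "$sample"    # prints compiler-gcc-lto
```

Called with only a package name, the function reads /etc/portage/package.env.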

Clang environments

Now that we've set up a safe fallback, we can proceed to enable Clang in Gentoo. There are two ways to do this: system-wide using /etc/portage/make.conf, or per package via environment files like the one(s) we created for the GCC fallback.

We'll use the same process as we did earlier in the article for setting up GCC fallbacks.

FILE /etc/portage/env/compiler-clang (Environment named compiler-clang)
CC="clang"
CXX="clang++"

You can now use Clang on a per-package basis by referencing the compiler-clang environment you created.

FILE /etc/portage/package.env (Using the Clang compiler for app-foo/bar and app-bar/baz)
app-foo/bar compiler-clang
app-bar/baz compiler-clang

The setup of a clang + LTO environment is described later in the article.

Global configuration via make.conf

When attempting to use Clang system-wide, the system absolutely must have a GCC fallback! This cannot be stressed enough, because not everything can be compiled with Clang at the moment; GCC itself is one example. Gentoo maintains a bug tracker for packages that fail to build with Clang. Configuring Gentoo to use Clang system-wide is simple: change the CC and CXX variables in /etc/portage/make.conf to reference the Clang equivalents. No further configuration is necessary.

FILE /etc/portage/make.conf (Setting the system compiler to Clang)
CC="clang"
CXX="clang++"

Packages that must use GCC for compiling can be handled with one of the fallback environments created earlier.
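
To illustrate how a fallback environment takes precedence over the global setting, the sketch below reads make.conf-style files in order inside a subshell and prints the resulting value of one variable. It is only an approximation: Portage parses these files itself rather than through the shell, the helper name effective_var is hypothetical, and the scratch files merely stand in for /etc/portage/make.conf and /etc/portage/env/compiler-gcc.

```shell
# effective_var VAR FILE...: print the value VAR has after reading each FILE in order.
# Assumption: the files contain only simple VAR="value" assignments, so
# sourcing them with the shell approximates how Portage layers them.
effective_var() {
    (
        var_name=$1
        shift
        for conf_file in "$@"; do
            . "$conf_file"
        done
        eval "printf '%s\n' \"\${$var_name}\""
    )
}

# Scratch copies standing in for make.conf and env/compiler-gcc.
scratch=$(mktemp -d)
printf 'CC="clang"\nCXX="clang++"\n' > "$scratch/make.conf"
printf 'CC="gcc"\nCXX="g++"\n' > "$scratch/compiler-gcc"

effective_var CC "$scratch/make.conf"                          # prints clang
effective_var CC "$scratch/make.conf" "$scratch/compiler-gcc"  # prints gcc
```

The later file wins, which is the behavior the fallback environments rely on.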

Usage

Bootstrapping the Clang toolchain

Mixing clang and its toolchain / libraries with the gcc toolchain / libraries (especially the linker) will often lead to issues like linker errors during emerge. To prevent this, the clang toolchain is built first with gcc and then with itself to get a self-hosted compiler.

Prepare the environment for the Clang toolchain (see above), e.g.

FILE /etc/portage/env/compiler-clang
CC="clang"
CXX="clang++"
LDFLAGS="-fuse-ld=lld -rtlib=compiler-rt -unwindlib=libunwind"

This example replaces not only the compiler but also the GNU linker ld.bfd with the LLVM linker lld. lld is a drop-in replacement that is significantly faster than the bfd linker.

Set USE flags default-compiler-rt default-lld llvm-libunwind for clang. Then emerge clang llvm compiler-rt llvm-libunwind lld with the default gcc environment:

root #emerge clang llvm compiler-rt llvm-libunwind lld
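
The USE flags named above have to be in place before this emerge runs. One possible way to set them per package is sketched below; the file name "clang" is hypothetical, and a scratch directory stands in for /etc/portage/package.use so the snippet can be tried without root.

```shell
# Scratch directory standing in for /etc/portage/package.use; point use_dir
# at the real directory (as root) to apply the setting.
use_dir=$(mktemp -d)

# Hypothetical file name "clang"; the flags are the three named above.
echo 'sys-devel/clang default-compiler-rt default-lld llvm-libunwind' > "$use_dir/clang"
cat "$use_dir/clang"
```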

You can also add the default-libcxx USE flag to use LLVM's C++ standard library (libc++) with clang, however this is HEAVILY discouraged as libstdc++ and libc++ are not ABI compatible. A program built against libstdc++ will likely break when using a library built against libc++, and vice versa.

Note that sys-libs/llvm-libunwind avoids the linking issues that sys-libs/libunwind has, so it is the preferred choice and replaces the non-LLVM libunwind package if that is installed (it builds with -lgcc_s to resolve undefined __register_frame / __deregister_frame symbols).

Enable the clang environment for these packages now:

FILE /etc/portage/package.env
sys-devel/llvm compiler-clang
sys-libs/libcxx compiler-clang
sys-libs/libcxxabi compiler-clang
sys-libs/compiler-rt compiler-clang
sys-libs/compiler-rt-sanitizers compiler-clang
sys-libs/llvm-libunwind compiler-clang
sys-devel/lld compiler-clang
sys-devel/clang compiler-clang

Repeat the emerge step with the new environment. The toolchain will now be rebuilt with itself instead of gcc.

root #emerge clang llvm libcxx libcxxabi compiler-rt llvm-libunwind lld

You are now free to use clang with other packages.

Link-time optimizations with Clang

The link-time optimization feature defers optimizing the resulting executables to the linking phase. This can result in better optimization of packages but isn't standard behavior in Gentoo yet. Clang uses lld for LTO.

Note: Clang can also do LTO via the gold linker, however this is discouraged by llvm since gold is effectively dead upstream. To use gold with clang + LTO, you must first emerge llvm with the gold USE flag, and then set -fuse-ld=gold in the following examples.

Environment

Clang supports two types of link time optimization:

  • Full LTO, which is the traditional approach also used by gcc where the whole link unit is analyzed at once. Using it is no longer recommended.
  • ThinLTO, where the link unit is scanned and split up into multiple parts.[1] With ThinLTO, the final compilation units only contain the code that is relevant to the current scope, which speeds up compilation, lowers the footprint and allows for more parallelism at (mostly) no cost. ThinLTO is the recommended LTO mode when using Clang.

If you need to use full LTO for some reason, replace -flto=thin with -flto in the following examples. There should be no compatibility differences between Full LTO and ThinLTO. Additionally, if you did not build Clang with the default-lld USE flag, you will have to add -fuse-ld=lld to the following LDFLAGS.

FILE /etc/portage/env/compiler-clang-lto (Environment named compiler-clang-lto)
CC="clang"                            
CXX="clang++"                         
CFLAGS="${CFLAGS} -flto=thin"              
CXXFLAGS="${CXXFLAGS} -flto=thin"          
LDFLAGS="-Wl,-O2 -Wl,--as-needed"    #-O2 refers to binary size optimization during linking, it is NOT related to the -O levels of the compiler

As an alternative, LLVM provides its own ar, nm, and ranlib. You're free to use them, and they may or may not work better than the standard ar, nm, and ranlib, since they're intended to handle the LLVM bitcode which Clang produces when using the -flto flag.

FILE /etc/portage/env/compiler-clang-lto (Environment named compiler-clang-lto)
CC="clang"                            
CXX="clang++"                         
CFLAGS="${CFLAGS} -flto=thin"              
CXXFLAGS="${CXXFLAGS} -flto=thin"          
LDFLAGS="-Wl,-O2 -Wl,--as-needed"    #-O2 refers to binary size optimization during linking, it is NOT related to the -O levels of the compiler
AR="llvm-ar"
NM="llvm-nm"
RANLIB="llvm-ranlib"
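
Because -flto makes Clang emit LLVM bitcode instead of native object code, it can help to tell the two apart when a tool rejects an object file. The sketch below checks for the raw bitcode magic bytes ('B', 'C', 0xC0, 0xDE) at the start of a file. The function name is_llvm_bitcode and the scratch files are hypothetical, and wrapped bitcode variants are deliberately not covered.

```shell
# is_llvm_bitcode FILE: succeed if FILE starts with the raw LLVM bitcode magic
# bytes 'B' 'C' 0xC0 0xDE. Wrapped bitcode and native objects are not matched.
is_llvm_bitcode() {
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    [ "$magic" = "4243c0de" ]
}

# Scratch files standing in for an -flto object and an ordinary ELF object.
demo_dir=$(mktemp -d)
printf '\102\103\300\336rest-of-bitcode' > "$demo_dir/lto.o"
printf '\177ELFrest-of-object' > "$demo_dir/native.o"

for obj in "$demo_dir"/*.o; do
    if is_llvm_bitcode "$obj"; then
        echo "${obj##*/}: LLVM bitcode"
    else
        echo "${obj##*/}: not raw bitcode"
    fi
done
```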

Now you can set /etc/portage/package.env overrides using Clang with LTO enabled.

FILE /etc/portage/package.env (Enabling LTO for app-foo/bar and app-bar/baz)
app-foo/bar compiler-clang-lto
app-bar/baz compiler-clang-lto

Global configuration

Similar to what we covered earlier in the article, we can set up Clang with LTO system-wide by changing our /etc/portage/make.conf file.

FILE /etc/portage/make.conf (Setting the system compiler to Clang)
CC="clang"                            
CXX="clang++"                         
CFLAGS="${CFLAGS} -flto=thin"              
CXXFLAGS="${CXXFLAGS} -flto=thin"          
LDFLAGS="-Wl,-O2 -Wl,--as-needed"    #-O2 refers to binary size optimization during linking, it is NOT related to the -O levels of the compiler
AR="llvm-ar"
NM="llvm-nm"
RANLIB="llvm-ranlib"

Again, it's up to you if you want to set the AR, NM, and RANLIB to the LLVM implementations. Since earlier in the article we set up compiler environments using Clang without LTO, GCC without LTO, and GCC with LTO, we can pick and choose which is best on a per package basis. Since the goal is to compile packages system wide with Clang using LTO and not every package will successfully compile using it, we'll have to fall back to Clang with LTO disabled or GCC. Your /etc/portage/package.env may look like this:

FILE /etc/portage/package.env (Example package.env setup)
app-foo/bar compiler-clang   #compiled using Clang with no link-time optimization since package bar fails using -flto
app-bar/baz compiler-gcc     #compiled using GCC with no link-time optimization since package baz fails using -flto
app-baz/foo compiler-gcc-lto        #compiled using GCC with link-time optimization since package foo compiles using -flto

distcc

In order to use clang on a distcc client, additional symlinks have to be created in /usr/lib*/distcc/bin:

root #ln -s /usr/bin/distcc /usr/lib/distcc/bin/clang
root #ln -s /usr/bin/distcc /usr/lib/distcc/bin/clang++
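
The two commands above can also be wrapped in a small helper, which is convenient when more than one /usr/lib*/distcc/bin directory exists. This is a sketch only: the function name link_distcc_wrappers is hypothetical, and the demonstration runs against a scratch directory so it does not need root.

```shell
# link_distcc_wrappers DIR [DISTCC]: create clang and clang++ symlinks in DIR
# that point at the distcc binary, mirroring the two ln commands above.
link_distcc_wrappers() {
    wrapper_dir=$1
    distcc_bin=${2:-/usr/bin/distcc}
    for name in clang clang++; do
        ln -sf "$distcc_bin" "$wrapper_dir/$name"
    done
}

# Dry run against a scratch directory; on a real client DIR would be
# one of the /usr/lib*/distcc/bin directories and the call needs root.
scratch_bin=$(mktemp -d)
link_distcc_wrappers "$scratch_bin"
ls -l "$scratch_bin"
```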

ccache

With `>=ccache-3.9-r3`, Clang support is set up automatically when Clang is emerged.

Troubleshooting

The main place for looking up known failures with clang is bug #408963. If you hit one not reported on our Bugzilla already, please open a new bug report and make it block 408963.

Compile errors when using Clang with -flto

If the packages you're installing are failing, check your logs. Packages with errors like the following often need LTO disabled, which is done by assigning them the compiler-clang environment.

FILE /var/log/portage/sys-apps:less-483-r1:20160712-034715.log
/usr/bin/x86_64-pc-linux-gnu-ld: error: version.o:1:3: invalid character
/usr/bin/x86_64-pc-linux-gnu-ld: error: version.o:1:3: syntax error, unexpected $end
/usr/bin/x86_64-pc-linux-gnu-ld: error: version.o: not an object or archive

You will also most likely see this error in every LTO failure case.

FILE /var/log/portage/sys-apps:less-483-r1:20160712-034715.log
x86_64-pc-linux-gnu-clang-3.8: error: linker command failed with exit code 1 (use -v to see invocation)

Simply add the failing package to your /etc/portage/package.env. In this case it's sys-apps/less, so we'll apply the proper override.

FILE /etc/portage/package.env (Example package.env setup)
sys-apps/less compiler-clang   #compiled using Clang with no link-time optimization since package less fails using lto

Sometimes a package will fail to compile even when disabling LTO because it requires another package which was compiled using -flto and works incorrectly. You may see an error like this:

FILE /var/log/portage/dev-libs:boehm-gc-7.4.2:20160713-085706.log
/usr/lib64/libatomic_ops.a: error adding symbols: Archive has no index; run ranlib to add one

In this case libatomic_ops is causing boehm-gc to fail compiling. Recompile the package causing the failure with your non-LTO environment, then recompile the package that failed. Here boehm-gc also fails when using LTO, so we'll add both of them to our /etc/portage/package.env file to build them without LTO.

FILE /etc/portage/package.env (Example package.env setup)
dev-libs/boehm-gc		compiler-clang
dev-libs/libatomic_ops		compiler-clang
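
The linker messages quoted in this section can be searched for in a build log before deciding to move a package to the non-LTO environment. The sketch below does that with grep; the function name lto_failure_hint is hypothetical, and a match is only a hint, since "linker command failed" also appears in failures unrelated to LTO.

```shell
# lto_failure_hint LOG: print the lines of LOG that match the linker messages
# quoted in this section; exit status is 0 only if at least one line matched.
lto_failure_hint() {
    grep -E 'not an object or archive|Archive has no index|linker command failed' "$1"
}

# Scratch log built from two of the messages shown above.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
/usr/bin/x86_64-pc-linux-gnu-ld: error: version.o: not an object or archive
x86_64-pc-linux-gnu-clang-3.8: error: linker command failed with exit code 1 (use -v to see invocation)
EOF

if lto_failure_hint "$sample_log" >/dev/null; then
    echo "possible LTO failure: retry with the compiler-clang environment"
fi
```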

Use of GNU extensions without proper -std=

Some packages use GNU extensions in their code without specifying -std= appropriately. GCC allows that usage, yet Clang disables some of the more specific GNU extensions by default.

If a particular package relies on such extensions being available, you will need to append the correct -std= flag to it:

  • -std=gnu89 for C89/C90 with GNU extensions,
  • -std=gnu99 for C99 with GNU extensions,
  • -std=gnu++98 for C++98 with GNU extensions.

A common symptom of this problem is multiple definitions of inline functions, like this:

FILE /var/log/portage/ (Example package error in example log)
/usr/bin/x86_64-pc-linux-gnu-ld: error: ../mpi/.libs/libmpi.a(mpi-bit.o): multiple definition of '_gcry_mpih_add'
/usr/bin/x86_64-pc-linux-gnu-ld: ../mpi/.libs/libmpi.a(mpi-add.o): previous definition here
/usr/bin/x86_64-pc-linux-gnu-ld: error: ../mpi/.libs/libmpi.a(mpi-bit.o): multiple definition of '_gcry_mpih_add_1'
/usr/bin/x86_64-pc-linux-gnu-ld: ../mpi/.libs/libmpi.a(mpi-add.o): previous definition here

This is because Clang uses C99 inline rules by default which do not work with gnu89 code. To work around it, you most likely have to pass -std=gnu89 or set one of your environmental overrides to use GCC to compile the failing package if passing the right -std= flag doesn't work.
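
One possible way to pass -std=gnu89 to a single package is a dedicated environment, following the same pattern as the LTO environments earlier in this article. In the sketch below the environment name compiler-clang-gnu89 is hypothetical, app-foo/bar is the placeholder package used throughout, and a scratch directory stands in for /etc/portage so the snippet can be tried without root.

```shell
# Scratch directory standing in for /etc/portage; point conf_root at the
# real directory (as root) to apply the change.
conf_root=$(mktemp -d)
mkdir -p "$conf_root/env"

# Hypothetical environment name: Clang with gnu89 inline rules for C code.
# The flag is added to CFLAGS only, since it does not apply to C++.
cat > "$conf_root/env/compiler-clang-gnu89" <<'EOF'
CC="clang"
CXX="clang++"
CFLAGS="${CFLAGS} -std=gnu89"
EOF

# app-foo/bar is the placeholder package used throughout this article.
echo 'app-foo/bar compiler-clang-gnu89' >> "$conf_root/package.env"
```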

Since both current (2020) GCC and Clang default to -std=gnu17 with C99 inline rules, chances are the problems have already been spotted by a GCC user.

Differences from GCC

Clang's optimizer is different from GCC's. As a result, the command-line semantics are different.

  • The -O flags will work, but mean slightly different things.
    • Clang also vectorizes on -O2 and -Os, albeit more conservatively in terms of code size than -O3.
    • In older Clang releases -O4 was an alias of -O3 -flto rather than being the same as -O3; current releases treat -O4 as equivalent to -O3.
  • The compatibility of -f flags is limited, as some of them are simply meaningless to Clang.
  • The -m and related flags are supposed to work identically, but Clang may not know about certain options. There are also Clang-only options not known by GCC.
  • The PGO in clang is a bit different as it requires post-processing the sample with llvm-profdata.
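
The llvm-profdata post-processing step mentioned in the last item can be sketched as follows. This shows the instrumentation-based workflow, assuming clang and llvm-profdata are in PATH; all file names are hypothetical, and the snippet reports "skipped" when the tools are missing.

```shell
# Instrumentation-based PGO sketch. Assumes clang and llvm-profdata are in
# PATH; all file names below are hypothetical.
pgo_status=skipped
if command -v clang >/dev/null 2>&1 && command -v llvm-profdata >/dev/null 2>&1; then
    work=$(mktemp -d)
    cat > "$work/demo.c" <<'EOF'
#include <stdio.h>
int main(void) {
    long sum = 0;
    for (int i = 0; i < 1000000; i++) sum += i % 7;
    printf("%ld\n", sum);
    return 0;
}
EOF
    # 1. Build with instrumentation and run once to record a raw profile.
    clang -O2 -fprofile-instr-generate "$work/demo.c" -o "$work/demo-instr" &&
    LLVM_PROFILE_FILE="$work/demo.profraw" "$work/demo-instr" >/dev/null &&
    # 2. Post-process the raw profile with llvm-profdata.
    llvm-profdata merge -output="$work/demo.profdata" "$work/demo.profraw" &&
    # 3. Rebuild using the merged profile.
    clang -O2 -fprofile-instr-use="$work/demo.profdata" "$work/demo.c" -o "$work/demo-pgo" &&
    pgo_status=built || pgo_status=failed
fi
echo "PGO demo: $pgo_status"
```

GCC, by contrast, reads the profile data written by the instrumented program directly, without a separate merge step.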

The differences in language are documented by the project itself.[2]

References

External resources