Clang

From Gentoo Wiki

Clang is an "LLVM native" C/C++/Objective-C compiler that uses LLVM as its backend and optimizer. It aims to be GCC compatible yet stricter, offers fast compile times with low memory usage, and produces useful error and warning messages that make build failures easier to troubleshoot.

Installation

Prerequisites

One of the goals of the Clang project is to be compatible with GCC, both in dialect and on the command line (see #Differences from GCC). Even so, some packages fail to build with Clang, and others build successfully but segfault when executed. Packages that contain GCC-specific code will also fail to compile. In these cases, GCC must be used as a fallback.

USE flags

Some packages provide a clang USE flag. It typically exists because the package must use specific flags internally in order to compile or work correctly at runtime:

FILE /etc/portage/make.conf
USE="clang"

Packages that offer the clang USE flag usually do not need any environment variables set, because they select the compiler themselves.

Emerge

Emerge clang:

root #emerge --ask --update --deep --changed-use sys-devel/clang

Configuration

There are many possible configurations.

Users may wish to default to Clang and selectively use GCC or vice-versa.

There are two ways to do this:

  1. System wide using /etc/portage/make.conf or,
  2. via environment variables like the one(s) created for the GCC fallback.

Basic setup

This example is for defaulting to Clang but using GCC per-package for those which fail to build with Clang.

Clang environment

Warning
Clang versions prior to 14.0.0 did not have a default-pie option similar to GCC's. With those versions, -fPIC is needed in CFLAGS and -pie in LDFLAGS.

When attempting to use Clang system wide the system absolutely must have a GCC fallback! This cannot be stressed enough as the system will not be able to compile everything using Clang at the moment, such as glibc or wine-vanilla.

Gentoo maintains a bug tracker for packages that fail to build with Clang. Configuring Gentoo to use Clang system wide is simple. Change the CC and CXX variables in /etc/portage/make.conf to reference the Clang equivalents. No further configuration is necessary.

FILE /etc/portage/make.conf
# Normal settings here
COMMON_FLAGS="-O2 -march=native"
CFLAGS="${COMMON_FLAGS}"
CXXFLAGS="${COMMON_FLAGS}"

CC="clang"
CXX="clang++"
AR="llvm-ar"
NM="llvm-nm"
RANLIB="llvm-ranlib"

LDFLAGS="${LDFLAGS} -fuse-ld=lld -rtlib=compiler-rt -unwindlib=libunwind -Wl,--as-needed"

# Hardening which isn't (yet?) done by default for Clang, unlike GCC.
_HARDENING_FLAGS="-fstack-protector-strong -D_FORTIFY_SOURCE=2"
CFLAGS="${CFLAGS} ${_HARDENING_FLAGS}"
CXXFLAGS="${CXXFLAGS} ${_HARDENING_FLAGS}"
LDFLAGS="${LDFLAGS} -Wl,-z,relro,-z,now"

Alternatively, the same contents could be put in e.g. /etc/portage/env/compiler-clang. This would allow using Clang on a per package basis by invoking the compiler-clang environment file if desired:

FILE /etc/portage/package.env Using the Clang compiler for app-foo/bar and app-bar/baz
app-foo/bar compiler-clang
app-bar/baz compiler-clang

The setup of a clang + LTO environment is described later in the article.
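Before relying on the settings above, it may be worth confirming that the tools they name are actually on PATH. The following is a minimal sketch, not part of Portage: check_tools is a hypothetical helper name, and the tool list simply mirrors the CC, CXX, AR, NM, RANLIB, and -fuse-ld=lld values from the example make.conf. It does not check for the compiler-rt or libunwind libraries.

```shell
# Report which of the given tools can be found on PATH.
# check_tools is a hypothetical helper, not a Portage command.
check_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
        fi
    done
}

# Tool names taken from the example make.conf (ld.lld backs -fuse-ld=lld).
check_tools clang clang++ llvm-ar llvm-nm llvm-ranlib ld.lld
```

Any line starting with "missing:" suggests the corresponding package still needs to be installed before switching the system compiler.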

GCC fallback environment

Create a configuration file with a set of environment variables in Portage's built-in /etc/portage/env directory. This file will override the defaults for any packages that fail to compile with Clang.

The name used below is just an example; any name can be chosen for the fallback environment. Be sure to substitute the chosen name for the one used in the examples in this article.

The most basic example is:

FILE /etc/portage/env/compiler-gcc Environment named compiler-gcc
CC="gcc"
CXX="g++"

AR="${CHOST}-ar"
NM="${CHOST}-nm"
RANLIB="${CHOST}-ranlib"

COMMON_FLAGS="-O2 -march=native"
CFLAGS="${COMMON_FLAGS}"
CXXFLAGS="${COMMON_FLAGS}"
LDFLAGS="-Wl,--as-needed"

To fall back to GCC for a package, list it in the /etc/portage/package.env file together with the name of the environment to apply:

FILE /etc/portage/package.env Falling back to GCC for app-foo/bar and app-bar/baz
# Compiled using GCC with no link-time optimization since package baz fails using LTO
app-bar/baz compiler-gcc
# Compiled using GCC with link-time optimization since package bar compiles using LTO
# (the compiler-gcc-lto environment is defined under Advanced examples)
app-foo/bar compiler-gcc-lto

Advanced examples

Adjust the following /etc/portage/env entries to suit the desired needs, such as enabling/disabling link-time optimizations, alternative AR, NM, RANLIB, and so on.

For enabling LTO:

FILE /etc/portage/env/compiler-gcc-lto Environment named compiler-gcc-lto
# $N is a placeholder for the number of threads used during LTO; one option is the value printed by nproc
CFLAGS="-flto=$N -march=native -O2 -pipe"
CXXFLAGS="${CFLAGS}"

CC="gcc"
CXX="g++"
AR="gcc-ar"
NM="gcc-nm"
RANLIB="gcc-ranlib"
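Since $N above is a placeholder, the thread count has to end up in the file as a literal number. The sketch below assumes that Portage's make.conf-style parser does not run command substitution such as $(nproc), so the value is expanded in a normal shell first and the resulting line is then pasted into the environment file. lto_cflags_line is a hypothetical helper name.

```shell
# Print a CFLAGS line for the env file with the thread count already expanded.
# lto_cflags_line is a hypothetical helper, not a Portage tool.
lto_cflags_line() {
    # Fall back to getconf, then to 1, if nproc is unavailable.
    threads="$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
    printf 'CFLAGS="-flto=%s -march=native -O2 -pipe"\n' "$threads"
}

lto_cflags_line
```

The printed line can replace the CFLAGS line in the compiler-gcc-lto example.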

Non-LTO GCC fallback option:

FILE /etc/portage/env/compiler-gcc Environment named compiler-gcc
CFLAGS="-march=native -O2 -pipe"
CXXFLAGS="${CFLAGS}"

CC="gcc"
CXX="g++"
AR="ar"
NM="nm"
RANLIB="ranlib"

Basically, copy the current working GCC configuration from make.conf so that it can be used as a fallback when needed.

When choosing to use LLVM's implementation of AR, NM, and RANLIB as detailed later in the article, be sure to set them back to the GNU versions for the GCC fallback environments as shown in the above example.

When not using LTO, the AR, NM, and RANLIB variables can be ignored. To keep using link-time optimization where it works, it is a good idea to have two separate environments like the examples above.

In the event the GCC fallback environment is needed, set the appropriate flags in the /etc/portage/package.env file:

FILE /etc/portage/package.env Falling back to GCC for app-foo/bar and app-bar/baz
# Compiled using GCC with link-time optimization since package bar compiles using lto
app-foo/bar compiler-gcc-lto
# Compiled using GCC with no link-time optimization since package baz fails using lto
app-bar/baz compiler-gcc

Usage

This covers more advanced usage than described above for configuration.

Bootstrapping the Clang toolchain

For a "pure" Clang toolchain, one can build the whole LLVM stack using itself.

This is detailed in a subpage: Clang/Bootstrapping.

Link-time optimizations with Clang

The link-time optimization (LTO) feature defers optimization of the resulting executables to the linking phase. This can result in better-optimized packages but is not standard behavior in Gentoo yet. Clang uses the lld linker for LTO.

Environment

Clang supports two types of link time optimization:

  • Full LTO, which is the traditional approach also used by GCC, where the whole link unit is analyzed at once. Using it is no longer recommended.
  • ThinLTO, where the link unit is scanned and split up into multiple parts.[1] With ThinLTO, the final compilation units only contain the code that is relevant to the current scope, which speeds up compilation, lowers the memory footprint, and allows for more parallelism at (mostly) no cost. ThinLTO is the recommended LTO mode when using Clang.

For full LTO, replace -flto=thin with -flto in the following examples. There should be no compatibility differences between full LTO and thin LTO. Additionally, if Clang was not built with the default-lld USE flag, add the -fuse-ld=lld value to the following LDFLAGS.

FILE /etc/portage/env/compiler-clang-lto Environment named compiler-clang-lto
CFLAGS="${CFLAGS} -flto=thin"
CXXFLAGS="${CXXFLAGS} -flto=thin"
# -O2 in LDFLAGS refers to binary size optimization during linking, it is NOT related to the -O levels of the compiler
LDFLAGS="${LDFLAGS} -Wl,-O2 -Wl,--as-needed"

CC="clang"
CXX="clang++"

As an alternative, LLVM provides its own ar, nm, and ranlib implementations. They are intended to handle the LLVM bitcode that Clang produces when using the -flto flag, though results may vary compared to the standard ar, nm, and ranlib.

FILE /etc/portage/env/compiler-clang-lto Environment named compiler-clang-lto
CFLAGS="${CFLAGS} -flto=thin"
CXXFLAGS="${CXXFLAGS} -flto=thin"
# -O2 in LDFLAGS refers to binary size optimization during linking, it is NOT related to the -O levels of the compiler
LDFLAGS="${LDFLAGS} -Wl,-O2 -Wl,--as-needed"

CC="clang"
CXX="clang++"
AR="llvm-ar"
NM="llvm-nm"
RANLIB="llvm-ranlib"

Now set /etc/portage/package.env overrides using Clang with LTO enabled:

FILE /etc/portage/package.env Enabling LTO for app-foo/bar and app-bar/baz
app-foo/bar compiler-clang-lto
app-bar/baz compiler-clang-lto
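Before applying the LTO environment to real packages, a quick link test can show whether ThinLTO works at all on the machine. This is a minimal sketch under stated assumptions: try_thinlto is a hypothetical helper, the test program is a trivial stand-in, and -fuse-ld=lld is passed explicitly in case Clang was not built with the default-lld USE flag.

```shell
# Try to compile and link a trivial C program with ThinLTO.
# try_thinlto is a hypothetical helper; the compiler name is an optional argument.
try_thinlto() {
    cc="${1:-clang}"
    if ! command -v "$cc" >/dev/null 2>&1; then
        echo "skip: $cc not found"
        return 0
    fi
    tmp="$(mktemp -d)" || return 1
    printf 'int main(void) { return 0; }\n' > "$tmp/t.c"
    if "$cc" -O2 -flto=thin -fuse-ld=lld -o "$tmp/t" "$tmp/t.c" 2>/dev/null; then
        echo "ok: ThinLTO link works"
    else
        echo "fail: ThinLTO link failed"
    fi
    rm -rf "$tmp"
}

try_thinlto
```

A "fail" result usually points at a missing lld or LTO plugin rather than at any particular package.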

Global configuration

Similar to what was covered earlier in the article, a system wide Clang with LTO enabled can be done by changing the /etc/portage/make.conf file:

FILE /etc/portage/make.conf Setting the system compiler to Clang
CFLAGS="${CFLAGS} -flto=thin"
CXXFLAGS="${CXXFLAGS} -flto=thin"
# -O2 in LDFLAGS refers to binary size optimization during linking, it is NOT related to the -O levels of the compiler
LDFLAGS="${LDFLAGS} -Wl,-O2 -Wl,--as-needed"

CC="clang"
CXX="clang++"
AR="llvm-ar"
NM="llvm-nm"
RANLIB="llvm-ranlib"

Again, AR, NM, and RANLIB can be set to the LLVM implementations. Because environments for Clang without LTO, GCC without LTO, and GCC with LTO were set up earlier in the article, the best one can now be chosen on a per package basis. The goal is to compile packages system wide with Clang using LTO, but not every package will compile successfully that way, so fall back to Clang with LTO disabled or to GCC where needed. The /etc/portage/package.env file may look like the following:

FILE /etc/portage/package.env Example package.env setup
# Compiled using Clang with no link-time optimization since package bar fails using -flto
app-foo/bar compiler-clang
# Compiled using GCC with no link-time optimization since package baz fails using -flto
app-bar/baz compiler-gcc
# Compiled using GCC with link-time optimization since package foo compiles using -flto
app-baz/foo compiler-gcc-lto

distcc

In order to use Clang on a distcc client, additional symlinks must be created in /usr/lib*/distcc/bin:

root #ln -s /usr/bin/distcc /usr/lib/distcc/bin/clang
root #ln -s /usr/bin/distcc /usr/lib/distcc/bin/clang++

ccache

ccache support is automatic once Clang is emerged.

Kernel

The Linux kernel can be compiled with Clang/LLVM. These steps are mentioned in bug #786405.
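For a manual build without genkernel, the same make options used in the genkernel configuration in this article (LLVM=1 LLVM_IAS=1) can be passed to make directly. The following is a rough sketch, not a tested recipe: kernel_make_llvm is a hypothetical helper, and /usr/src/linux is assumed to be the location of a configured kernel source tree.

```shell
# Build a kernel tree with the LLVM toolchain, if the tree exists.
# kernel_make_llvm is a hypothetical helper; the default path is an assumption.
kernel_make_llvm() {
    src="${1:-/usr/src/linux}"
    if [ ! -f "$src/Makefile" ]; then
        echo "skip: no kernel source at $src"
        return 0
    fi
    # LLVM=1 selects clang and the LLVM binutils; LLVM_IAS=1 selects the integrated assembler.
    make -C "$src" LLVM=1 LLVM_IAS=1 -j"$(nproc)"
}
```

Calling kernel_make_llvm with no argument builds the tree at the assumed default path; a different path can be given as the first argument.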

Genkernel

When using genkernel, edit the /etc/genkernel.conf by substituting the following "Low Level Compile Settings" and adding the additional MAKEOPTS:

FILE /etc/genkernel.conf Sample Clang/LLVM genkernel.conf
# =========Low Level Compile Settings=========
#
# Additional make options
MAKEOPTS="LLVM=1 LLVM_IAS=1 ${MAKEOPTS}"

# Assembler to use for the kernel.  See also the --kernel-as command line
# option.
## FIXME: llvm-as may not be a compatible tool
## KERNEL_AS="llvm-as"

# Archiver to use for the kernel.  See also the --kernel-ar command line
# option.
KERNEL_AR="llvm-ar"

# Compiler to use for the kernel (e.g. distcc).  See also the --kernel-cc
# command line option.
KERNEL_CC="clang"

# Linker to use for the kernel.  See also the --kernel-ld command line option.
KERNEL_LD="ld.lld"

# NM utility to use for the kernel.  See also the --kernel-nm command line option.
KERNEL_NM="llvm-nm"

# GNU Make to use for kernel.  See also the --kernel-make command line option.
#KERNEL_MAKE="make"

# not exposed in default config
KERNEL_OBJCOPY="llvm-objcopy"
KERNEL_OBJDUMP="llvm-objdump"
KERNEL_READELF="llvm-readelf"
KERNEL_STRIP="llvm-strip"
KERNEL_RANLIB="llvm-ranlib"

# Assembler to use for the utilities.  See also the --utils-as command line
# option.
## FIXME: llvm-as may not be a compatible tool
# it reportedly broke building util-linux for some users
##UTILS_AS="llvm-as"

# Archiver to use for the utilities.  See also the --utils-ar command line
# option.
UTILS_AR="llvm-ar"

# C Compiler to use for the utilities (e.g. distcc).  See also the --utils-cc
# command line option.
UTILS_CC="clang"

# C++ Compiler to use for the utilities (e.g. distcc).  See also the --utils-cxx
# command line option.
UTILS_CXX="clang++"

# Linker to use for the utilities.  See also the --utils-ld command line
# option.
UTILS_LD="ld.lld"

# NM utility to use for the utilities.  See also the --utils-nm command line option.
UTILS_NM="llvm-nm"

# GNU Make to use for the utilities.  See also the --utils-make command line
# option.
#UTILS_MAKE="make"

# not exposed in default config
UTILS_OBJCOPY="llvm-objcopy"
UTILS_OBJDUMP="llvm-objdump"
UTILS_READELF="llvm-readelf"
UTILS_STRIP="llvm-strip"
UTILS_RANLIB="llvm-ranlib"

# Target triple (i.e. aarch64-linux-gnu) to build for. If you do not
# cross-compile, leave blank for auto detection.
#CROSS_COMPILE=""

# Override default make target (bzImage). See also the --kernel-target
# command line option. Useful to build a uImage on arm.
#KERNEL_MAKE_DIRECTIVE_OVERRIDE="fooImage"

# Override default kernel binary path. See also the --kernel-binary
# command line option. Useful to install a uImage on arm.
#KERNEL_BINARY_OVERRIDE="arch/foo/boot/bar"

After that use genkernel as usual:

root #genkernel all

Modules

Additionally, the same options will have to be provided for any kernel modules:

FILE /etc/portage/package.env Use clang for any kernel module packages
# Compiled using clang like kernel itself
app-foo/bar compiler-clang

FILE /etc/portage/env/compiler-clang Environment named compiler-clang
# This is added to make options by linux-mod.eclass
BUILD_FIXES="LLVM=1 LLVM_IAS=1"

# CC/CXX and other tools must match genkernel config

Further, once clang becomes the default compiler, it might be possible to use portageq envvar and make things DRY.
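As a rough illustration of the portageq envvar idea, the sketch below prints the compiler-related variables that Portage currently resolves, so that they can be compared against the genkernel settings. show_portage_toolchain is a hypothetical helper name; the -v option asks portageq to print each variable as NAME=value.

```shell
# Print the toolchain variables as Portage sees them, if portageq is available.
# show_portage_toolchain is a hypothetical helper, not part of Portage.
show_portage_toolchain() {
    if ! command -v portageq >/dev/null 2>&1; then
        echo "skip: portageq not found"
        return 0
    fi
    portageq envvar -v CC CXX AR NM RANLIB
}

show_portage_toolchain
```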

Troubleshooting

The main place for looking up known failures with Clang is the tracker bug #408963. If hitting an issue not reported on Gentoo's Bugzilla already, please open a new bug report and make it block the linked tracker.

Compile errors when using Clang with -flto

If the packages being installed are failing, check the logs. Often, packages with errors like the following will need to disable LTO by invoking the compiler-clang environment.

FILE /var/log/portage/sys-apps:less-483-r1:20160712-034715.log
/usr/bin/x86_64-pc-linux-gnu-ld: error: version.o:1:3: invalid character
/usr/bin/x86_64-pc-linux-gnu-ld: error: version.o:1:3: syntax error, unexpected $end
/usr/bin/x86_64-pc-linux-gnu-ld: error: version.o: not an object or archive

The following error may be seen in every LTO failure case:

FILE /var/log/portage/sys-apps:less-483-r1:20160712-034715.log
x86_64-pc-linux-gnu-clang-3.8: error: linker command failed with exit code 1 (use -v to see invocation)

Simply add the failing package to /etc/portage/package.env. In this case it is the sys-apps/less package, so apply the override as follows:

FILE /etc/portage/package.env Example package.env setup
# Compiled using Clang with no link-time optimization since the package 'less' fails using lto
sys-apps/less compiler-clang

Sometimes a package will fail to compile even when disabling LTO because it requires another package which was compiled using -flto and works incorrectly. Something like the following error may be seen:

FILE /var/log/portage/dev-libs:boehm-gc-7.4.2:20160713-085706.log
/usr/lib64/libatomic_ops.a: error adding symbols: Archive has no index; run ranlib to add one

In this case libatomic_ops, which was built with LTO, is causing boehm-gc to fail to compile. Recompile the package causing the failure using the non-LTO environment, then recompile the package that failed. Here boehm-gc also fails when using LTO, so add both of them to the /etc/portage/package.env file to build them without LTO:

FILE /etc/portage/package.env Example package.env setup
dev-libs/boehm-gc compiler-clang
dev-libs/libatomic_ops compiler-clang

Use of GNU extensions without proper -std=

Some packages tend to use GNU extensions in their code without specifying -std= appropriately. GCC allows that usage, yet Clang disables some of the more specific GNU extensions by default.

If a particular package relies on such extensions being available, then append the correct -std= flag to it:

  • -std=gnu89 for C89/C90 with GNU extensions,
  • -std=gnu99 for C99 with GNU extensions,
  • -std=gnu++98 for C++98 with GNU extensions.

A common symptom of this problem is multiple-definition errors for inline functions.

This is because Clang uses C99 inline rules by default, which do not work with gnu89 code. To work around it, pass -std=gnu89; if passing the right -std= flag does not help, use one of the environment overrides to compile the failing package with GCC.
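One way to pass the flag to a single package is a small environment file, following the same package.env pattern used elsewhere in this article. This is a sketch only: the file name std-gnu89 and the package app-foo/bar are placeholders.

```shell
# Hypothetical file: /etc/portage/env/std-gnu89 (the name is only an example)
# Appends the dialect flag to whatever CFLAGS are already in effect.
CFLAGS="${CFLAGS} -std=gnu89"

# Matching entry for /etc/portage/package.env (not shell syntax, shown as a comment):
#   app-foo/bar std-gnu89
```

Appending rather than replacing keeps the optimization flags from make.conf intact for that package.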

Since both current (2020) GCC and Clang default to -std=gnu17 with C99 inline rules, chances are the problems have already been spotted by a GCC user.

Differences from GCC

Clang's optimizer is different from GCC's. As a result, the command-line semantics are different:

  • The -O flags will work, but mean slightly different things.
    • Clang also vectorizes on -O2 and -Os, albeit more conservatively in terms of code size than -O3.
    • Clang accepts -O4, but current releases treat it as equivalent to -O3; older releases made it an alias of -O3 -flto.
  • The compatibility of -f flags is limited, as some of them are simply meaningless to Clang.
  • The -m and related flags are supposed to work identically, but Clang may not know about certain options. There are also Clang-only options not known by GCC.
  • PGO in Clang is a bit different, as it requires post-processing the collected profile with llvm-profdata.
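The PGO difference mentioned in the list can be sketched as a short instrument, run, merge, rebuild cycle. This is an illustration under stated assumptions rather than a recommended Portage setup: pgo_demo is a hypothetical helper, demo.c is a throwaway test program, and the flags used are Clang's instrumentation-based PGO options (-fprofile-instr-generate and -fprofile-instr-use).

```shell
# Walk through Clang's instrumentation-based PGO steps on a throwaway program.
# pgo_demo is a hypothetical helper; it skips itself when the tools are missing.
pgo_demo() {
    command -v clang >/dev/null 2>&1 && command -v llvm-profdata >/dev/null 2>&1 \
        || { echo "skip: clang or llvm-profdata not found"; return 0; }
    work="$(mktemp -d)" || return 1
    (
        cd "$work" || exit 1
        printf 'int main(void) { int s = 0; for (int i = 0; i < 1000; i++) s += i; return s == 499500 ? 0 : 1; }\n' > demo.c
        # 1. Build an instrumented binary.
        clang -O2 -fprofile-instr-generate demo.c -o demo-instr &&
        # 2. Run it to collect a raw profile.
        LLVM_PROFILE_FILE=demo.profraw ./demo-instr &&
        # 3. Post-process the raw profile with llvm-profdata.
        llvm-profdata merge -output=demo.profdata demo.profraw &&
        # 4. Rebuild using the merged profile.
        clang -O2 -fprofile-instr-use=demo.profdata demo.c -o demo-pgo &&
        echo "ok: PGO build finished"
    ) || echo "fail: a PGO step failed"
    rm -rf "$work"
}

pgo_demo
```

Step 3 is the part with no GCC counterpart: GCC reads its profile files directly, while Clang expects the merged .profdata file.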

The differences in language are documented by the project itself.[2]

References

External resources