Distcc

From Gentoo Wiki

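Distcc is a program designed to distribute compilation tasks across a network to participating hosts. It consists of a server, distccd, and a client program, distcc. Distcc works well with ccache and Portage, and needs only a small amount of setup for Automake.

When planning to use distcc to help install a Gentoo system, make sure to read the "To bootstrap" section first.

Installation

Before configuring distcc, let's first look at installing the sys-devel/distcc package on all hosts.

Requirements for all hosts

To use distcc, all computers on the network must have the same version of GCC. For example, mixing 3.3.x versions (where x varies) is fine, but mixing 3.3.x with 3.2.x may result in compilation errors or runtime errors.

Verify that all systems use the same version of binutils (eselect binutils list); otherwise many packages will fail to link with various errors such as text relocations.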
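
The version rule above can be sketched as a small shell check. This is only an illustration: the helper name same_gcc_series is hypothetical, the version strings are hard-coded rather than collected from real hosts, and it assumes that "same version" means the same major.minor series, following the 3.3.x example.

```shell
# Hypothetical helper: succeed when two GCC version strings share the
# same major.minor series (for example 3.3.1 and 3.3.6).
same_gcc_series() {
    series_a=$(printf '%s\n' "$1" | cut -d. -f1,2)
    series_b=$(printf '%s\n' "$2" | cut -d. -f1,2)
    [ "$series_a" = "$series_b" ]
}

# Illustrative version strings only; on a real network these would be
# gathered from each participating host.
if same_gcc_series "3.3.1" "3.3.6"; then
    echo "3.3.1 and 3.3.6: same series"
fi
if ! same_gcc_series "3.3.1" "3.2.3"; then
    echo "3.3.1 and 3.2.3: different series"
fi
```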

USE flags

USE flags for sys-devel/distcc: Distribute compilation of C code across several machines on a network

gnome      Add GNOME support
gssapi     Enable support for net-libs/libgssglue
gtk        Add support for x11-libs/gtk+ (The GIMP Toolkit)
hardened   Activate default security enhancements for toolchain (gcc, glibc, binutils)
ipv6       Add support for IP version 6
selinux    !!internal use only!! Security Enhanced Linux support, this must be set by the selinux profile or breakage will occur
xinetd     Add support for the xinetd super-server
zeroconf   Support for DNS Service Discovery (DNS-SD)

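Emerge

Distcc ships with a graphical monitor for watching the compilation tasks that the computer sends out. This monitor is enabled when the gtk USE flag is set.

After configuring the USE settings, install the sys-devel/distcc package: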

root #emerge --ask sys-devel/distcc
Important
Remember to install sys-devel/distcc on all participating machines.

Configuration

Service

To have distccd start automatically, follow the instructions below.

OpenRC

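Edit /etc/conf.d/distccd and make sure the --allow directive is set to allow only trusted clients. For added security, use the --listen directive to tell the distccd daemon which IP to listen on (for multi-homed systems). More information about distcc security can be found in the Distcc security notes.

Warning
Anyone who can connect to the distcc server port can run arbitrary commands on that machine as the distccd user.

The following example allows the distcc clients running at 192.168.0.4 and 192.168.0.5 to connect to the locally running distccd server:

FILE /etc/conf.d/distccd Allowing specific clients to connect to distccd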
DISTCCD_OPTS="--port 3632 --log-level notice --log-file /var/log/distccd.log -N 15 --allow 192.168.0.4 --allow 192.168.0.5"
Important
Using --allow and --listen is important. Read the distccd man page or the security document mentioned above for more information.
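
When the list of trusted clients is longer, the --allow part of DISTCCD_OPTS can be generated rather than typed by hand. The following is a minimal sketch: the helper name build_allow_opts is hypothetical, and the remaining options are simply copied from the example above.

```shell
# Hypothetical helper: print one --allow option per trusted client address.
build_allow_opts() {
    opts=""
    for client in "$@"; do
        # Add a separating space only when opts is already non-empty.
        opts="${opts:+$opts }--allow $client"
    done
    printf '%s\n' "$opts"
}

# Assemble the same option string as in the example file above.
DISTCCD_OPTS="--port 3632 --log-level notice --log-file /var/log/distccd.log -N 15 $(build_allow_opts 192.168.0.4 192.168.0.5)"
printf 'DISTCCD_OPTS="%s"\n' "$DISTCCD_OPTS"
```

The printed line can then be compared with, or pasted into, /etc/conf.d/distccd.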

Now start the distccd daemon on all participating computers:

root #rc-update add distccd default
root #rc-service distccd start

systemd

Edit the /etc/systemd/system/distccd.service.d/00gentoo.conf file to add the allowed clients in CIDR format. The following example adds all IP addresses in the 192.168.1.xxx range:

FILE /etc/systemd/system/distccd.service.d/00gentoo.conf Setting ALLOWED_SERVERS
Environment="ALLOWED_SERVERS=192.168.1.0/24"
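Note
The name "ALLOWED_SERVERS" is rather confusing here, because it refers to the clients that are allowed to connect to the local distccd server. Nevertheless, this variable is used in the distccd service as the value of the --allow option; see the /usr/lib/systemd/system/distccd.service file for more information.

After making such changes, reload the unit files: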

root #systemctl daemon-reload

Configure distccd to start automatically, then start the service:

root #systemctl enable distccd
root #systemctl start distccd

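Specifying participating hosts

Use the distcc-config command to set the list of hosts.

The following is an example list of host definitions. In most cases, variants of lines 1 and 2 suffice. Line 2 uses the /limit syntax to tell distcc the maximum number of jobs to launch on that node. More information about the syntax used in lines 3 and 4 can be found in the distcc man page.

CODE Example host definitions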
192.168.0.1          192.168.0.2                       192.168.0.3
192.168.0.1/2        192.168.0.2                       192.168.0.3/10
192.168.0.1:4000/2   192.168.0.2/1                     192.168.0.3:3632/4
@192.168.0.1         @192.168.0.2:/usr/bin/distccd     192.168.0.3
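
To make the HOST[:PORT][/LIMIT] form used in lines 1 to 3 more concrete, the following sketch splits such a definition into its parts. The helper name parse_host_spec is hypothetical, the sketch does not handle the SSH form from line 4, and it assumes port 3632 (the port used in the distccd examples on this page) when no port is given.

```shell
# Hypothetical helper: split a TCP host definition HOST[:PORT][/LIMIT].
# A missing port is shown as 3632 and a missing limit as "default".
parse_host_spec() {
    rest=$1
    limit="default"
    port="3632"
    case $rest in
        */*) limit=${rest##*/}; rest=${rest%/*} ;;   # strip and keep /LIMIT
    esac
    case $rest in
        *:*) port=${rest##*:}; rest=${rest%%:*} ;;   # strip and keep :PORT
    esac
    printf 'host=%s port=%s limit=%s\n' "$rest" "$port" "$limit"
}

for definition in 192.168.0.1 192.168.0.3/10 192.168.0.1:4000/2; do
    parse_host_spec "$definition"
done
```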

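There are several other ways to set up hosts. See the distcc man page (man distcc) for more details.

If compilation should also take place on the local machine, put localhost in the hosts list. Conversely, if the local machine should not be used for compilation, omit it from the hosts list. On a slow machine, using localhost may actually slow things down. Make sure to test the settings for performance.

Let's configure distcc to use the hosts mentioned on the first line of the example: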

root #/usr/bin/distcc-config --set-hosts "192.168.0.1 192.168.0.2 192.168.0.3"

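Distcc also supports a pump mode, which is used by invoking the pump command. This may significantly reduce build time when multiple files are compiled in parallel. It caches preprocessed headers on the server side and therefore avoids repeatedly uploading and preprocessing these header files.

To configure a host for pump mode, add the ,cpp,lzo suffix to the host definition. Pump mode requires both the cpp and lzo flags (regardless of whether the files are C or C++).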

root #/usr/bin/distcc-config --set-hosts "192.168.0.1,cpp,lzo 192.168.0.2,cpp,lzo 192.168.0.3,cpp,lzo"
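
For longer host lists, the suffix can be added programmatically. This is a minimal sketch with a hypothetical helper name (pump_hosts); it only builds the string and leaves calling distcc-config to the user.

```shell
# Hypothetical helper: append the ,cpp,lzo suffix to every host given.
pump_hosts() {
    list=""
    for host in "$@"; do
        list="${list:+$list }${host},cpp,lzo"
    done
    printf '%s\n' "$list"
}

pump_hosts 192.168.0.1 192.168.0.2 192.168.0.3
# prints: 192.168.0.1,cpp,lzo 192.168.0.2,cpp,lzo 192.168.0.3,cpp,lzo
```

The result could then be passed on, for example as distcc-config --set-hosts "$(pump_hosts 192.168.0.1 192.168.0.2 192.168.0.3)".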

Usage

With Portage

Setting up Portage to use distcc is easy. It is a matter of enabling the distcc feature, and setting a decent value for the number of simultaneous build jobs (as distcc increases the amount of build resources).

Set the MAKEOPTS variable and FEATURES variable as shown below.

A common strategy is to

  • set the value of N to twice the number of total (local + remote) CPU cores + 1, and
  • set the value of M to the number of local CPU cores

The use of -lM in the MAKEOPTS variable will prevent spawning too many tasks when some of the distcc cluster hosts are unavailable (increasing the amount of simultaneous jobs on the other systems) or when an ebuild is configured to disallow remote builds (such as with gcc). This is accomplished by refusing to start additional jobs when the system load is at or above the value of M.

FILE /etc/portage/make.conf Setting MAKEOPTS and FEATURES
# Replace N and M with the right value as calculated previously
MAKEOPTS="-jN -lM"
FEATURES="distcc"
Warning
distcc-pump is known to break multiple packages in unpredictable ways. Do not ever use it system-wide. Bug reports filed with distcc-pump enabled may be rejected.

For instance, when there are two quad-core host PCs running distccd and the local PC has a dual core CPU, then the MAKEOPTS variable could look like this:

FILE /etc/portage/make.conf MAKEOPTS example for 2 quad-core (remote) and one dual core (local) PC
# 2 remote hosts with 4 cores each = 8 cores remote
# 1 local host with 2 cores = 2 cores local
# total number of cores is 10, so N = 2*10+1 and M=2
MAKEOPTS="-j21 -l2"
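
The arithmetic behind this example can be sketched as a small shell function. The name suggest_makeopts is hypothetical, and the core counts are passed in by hand rather than detected.

```shell
# Hypothetical helper: print a MAKEOPTS line from core counts.
#   $1 = total remote CPU cores, $2 = local CPU cores
#   N  = 2 * (remote + local) + 1, M = local
suggest_makeopts() {
    remote_cores=$1
    local_cores=$2
    jobs=$(( 2 * (remote_cores + local_cores) + 1 ))
    printf 'MAKEOPTS="-j%s -l%s"\n' "$jobs" "$local_cores"
}

# Two quad-core remote hosts (8 cores) and a dual-core local machine.
suggest_makeopts 8 2
# prints: MAKEOPTS="-j21 -l2"
```

On the local machine, the second argument could be taken from nproc, as in suggest_makeopts 8 "$(nproc)".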

CFLAGS and CXXFLAGS

While editing the make.conf file, make sure that it does not have -march=native in the CFLAGS or CXXFLAGS variables. distcc will not distribute work to other machines if -march is set to native. An approximate set of -march= and machine flags can be obtained by running the following command:

user $gcc -v -E -x c -march=native -mtune=native - < /dev/null 2>&1 | grep cc1 | perl -pe 's/^.* - //g;'

See Inlining -march=native for distcc for more information.
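
A quick way to catch a leftover native setting is to scan a flag string before starting a build. The following sketch uses a hypothetical helper name (has_native_flags) and takes the flags as an argument; it does not read make.conf itself.

```shell
# Hypothetical helper: succeed when the given flag string contains
# -march=native or -mtune=native.
has_native_flags() {
    # $1 is deliberately unquoted so that it is split into single flags.
    for flag in $1; do
        case $flag in
            -march=native|-mtune=native) return 0 ;;
        esac
    done
    return 1
}

# Illustrative flag strings; real values would come from make.conf.
if has_native_flags "-O2 -march=native -pipe"; then
    echo "native flags found: distcc will not distribute this work"
fi
if ! has_native_flags "-O2 -march=haswell -pipe"; then
    echo "no native flags found"
fi
```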

A GCC bug has recently been fixed in the 8.0 dev tree which facilitates a more reliable and succinct mechanism for extrapolating appropriate machine flags. The fix has been backported to the 6 and 7 branches and should be released fairly soon. Some processing is still required and a script can be found in the distccflags repo, or via wget:

Warning
Downloading scripts and executing them without any validation is a security risk. Before executing such scripts, take a good look at what they want to accomplish and refrain from executing it when the content or behavior is not clear and purposeful.
user $chmod +x distccflags
user $./distccflags -march=native

With automake

This is, in some cases, easier than the Portage setup. All that is needed is to update the PATH variable to include /usr/lib/distcc/bin/ in front of the directory that contains gcc (/usr/bin/). However, there is a caveat. If ccache is used, then put the distcc location after the ccache one:

root #export PATH="/usr/lib/ccache/bin:/usr/lib/distcc/bin:${PATH}"

Put this in the user's ~/.bashrc or equivalent file to have the PATH set every time the user logs in, or set it globally through an /etc/env.d/ file.

Instead of calling make alone, add in -jN (where N is an integer). The value of N depends on the network and the types of computers that are used to compile. A heuristic approach to the right value is given earlier in this article.

To bootstrap

Using distcc to bootstrap (i.e. build a working toolchain before installing the remainder of the system) requires some additional steps.

Step 1: Configure Portage

Boot the new box with a Gentoo Linux LiveCD and follow the installation instructions, while keeping track of the instructions in the Gentoo FAQ for information about bootstrapping. Then configure Portage to use distcc:

FILE /etc/portage/make.conf Configure Portage to use distcc
FEATURES="distcc"
MAKEOPTS="-jN"

Update the PATH variable in the installation session as well:

root #export PATH="/usr/lib/ccache/bin:/usr/lib/distcc/bin:${PATH}"

Step 2: Getting distcc

Install sys-devel/distcc:

root #USE='-*' emerge --nodeps sys-devel/distcc

Step 3: Setting up distcc

Run distcc-config --set-hosts to set up distcc; substitute the host# entries in the example with the IP addresses or hostnames of the participating nodes.

root #/usr/bin/distcc-config --set-hosts "localhost host1 host2 host3 ..."

Distcc is now set up to bootstrap! Continue with the proper installation instructions and do not forget to run emerge distcc after running emerge @system. This is to make sure that all of the necessary dependencies are installed.

Note
During bootstrap and emerge @system distcc may not appear to be used. This is expected as some ebuilds do not work well with distcc, so they intentionally disable it.

Extras

The distcc application has additional features and applications to support working in a distcc environment.

Monitoring utilities

Distcc ships with two monitoring utilities. The text-based monitoring utility is always built and is called distccmon-text. Running it for the first time can be a bit confusing, but it is really quite easy to use. If the program is run with no parameter it will run just once. However, if it is passed a number it will update every N seconds, where N is the argument that was passed.

user $distccmon-text 10

The other monitoring utility is only enabled when the gtk USE flag is set. This one is GTK+ based, runs in an X environment, and it is quite lovely. For Gentoo, the GUI monitor has been renamed to distccmon-gui to make it less confusing (it is originally called distccmon-gnome).

user $distccmon-gui

To monitor Portage's distcc usage:

root #DISTCC_DIR="/var/tmp/portage/.distcc/" distccmon-text 10
root #DISTCC_DIR="/var/tmp/portage/.distcc/" distccmon-gui
Important
If the distcc directory is elsewhere, change the DISTCC_DIR variable accordingly.

A trick is to set DISTCC_DIR in environment variables:

root #echo 'DISTCC_DIR="/var/tmp/portage/.distcc/"' >> /etc/env.d/02distcc

Now update the environment:

root #env-update
root #source /etc/profile

Finally, start the GUI application:

root #distccmon-gui

SSH for communication

Setting up distcc via SSH includes some pitfalls. First, generate an SSH key pair without a passphrase. Be aware that Portage compiles programs as the Portage user (or as root if FEATURES="userpriv" is not set). The home folder of the Portage user is /var/tmp/portage/, which means the keys need to be stored in /var/tmp/portage/.ssh/

root #ssh-keygen -b 2048 -t rsa -f /var/tmp/portage/.ssh/id_rsa

Second, create a section for each host in the SSH configuration file:

FILE /var/tmp/portage/.ssh/config Add per-host sections
Host test1
    HostName 123.456.789.1
    Port 1234
    User UserName
 
Host test2
    HostName 123.456.789.2
    Port 1234
    User UserName

Send the public key to each compilation node:

root #ssh-copy-id -i /var/tmp/portage/.ssh/id_rsa.pub UserName@CompilationNode

Also make sure that each host is available in the known_hosts file:

root #ssh-keyscan -t rsa <compilation-node-1> <compilation-node-2> [...] > /var/tmp/portage/.ssh/known_hosts

Fix the file ownership as follows:

root #chown -R portage:portage /var/tmp/portage/.ssh/

To set up the hosts test1 and test2, run:

root #/usr/bin/distcc-config --set-hosts "@test1 @test2"

Please note the @ (@ sign), which specifies ssh hosts for distcc.

Finally, tell distcc which SSH binary to use:

FILE /etc/portage/make.conf
DISTCC_SSH="ssh"

It is not necessary to run the distccd initscript on the hosts when distcc communicates via SSH.

Testing

To test distcc, write a simple Hello distcc program and run distcc in verbose mode to see if it communicates properly.

FILE main.c
#include <stdio.h>
 
int main() {
    printf("Hello distcc!\n");
    return 0;
}

Next, turn on verbose mode, compile the program using distcc and link the generated object file into an executable:

user $export DISTCC_VERBOSE=1
user $distcc gcc -c main.c -o main.o # or 'pump distcc <...>'
user $gcc main.o -o main
Note
Replace the distcc command with pump distcc to use pump mode.

There should be a bunch of output about distcc finding its configuration, selecting the host to connect to, starting to connect to it, and ultimately compiling main.c. If the output does not list the desired distcc hosts, check the configuration.

Finally, ensure the compiled program works properly. To test each host, enumerate each compile host in the hosts file.

user $./main
Hello distcc!
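
As an alternative to editing the hosts file for each test, the host list can be overridden for a single invocation through the DISTCC_HOSTS environment variable. The sketch below only prints the commands to run, one per host; the helper name print_host_tests is hypothetical and the addresses are the example hosts used earlier.

```shell
# Hypothetical helper: print one test command per compile host.
# DISTCC_HOSTS overrides the configured host list for that invocation.
print_host_tests() {
    for host in "$@"; do
        printf 'DISTCC_HOSTS="%s" distcc gcc -c main.c -o main.o\n' "$host"
    done
}

print_host_tests 192.168.0.1 192.168.0.2
```

Running each printed command with DISTCC_VERBOSE=1 set shows whether that particular host accepted the job.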

Troubleshooting

If a problem occurs while using distcc, then this section might help in resolving the problem.

ERROR: failed to open /var/log/distccd.log

As of January 22nd, 2015, emerging fails to create the proper distccd.log file in /var/log/. This apparently only affects version 3.1-r8 of distcc. This bug is in the process of being corrected (see bug #477630). It is possible to work around this by manually creating the log file, giving it proper ownership, and restarting the distccd daemon:

root #mkdir -p /var/log/distcc
root #touch /var/log/distcc/distccd.log
root #chown distcc:daemon /var/log/distcc/distccd.log

Next update the /var/log path of the distccd configuration file in /etc/conf.d/distccd to the distcc directory created in the step before:

FILE /etc/conf.d/distccd Updating log path
DISTCCD_OPTS="--port 3632 --log-level notice --log-file /var/log/distcc/distccd.log -N 15"

Finally, restart the distccd service:

root #/etc/init.d/distccd restart

Some packages do not use distcc

As various packages are installed, users will notice that some of them aren't being distributed (and aren't being built in parallel). This may happen because the package's Makefile doesn't support parallel operations, or the maintainer of the ebuild has explicitly disabled parallel operations due to a known problem.

Sometimes distcc might cause a package to fail to compile. If this happens, please report it.

Mixed GCC versions

If the environment hosts different GCC versions, there will likely be very weird problems. The solution is to make certain all hosts have the same GCC version.

Recent Portage updates have made Portage use ${CHOST}-gcc (minus gcc) instead of gcc. This means that if i686 machines are mixed with other types (i386, i586) then the builds will run into troubles. A workaround for this may be to run:

root #export CC='gcc' CXX='c++'

It is also possible to set the CC and CXX variables in /etc/portage/make.conf to the values listed in the command above.

Important
Doing this explicitly redefines some behavior of Portage and may have some weird results in the future. Only do this if mixing CHOSTs is unavoidable.
Note
Having the right version of gcc as a slot on a server isn't enough. Portage uses distcc as a replacement for the compiler referenced by the CHOST variable (e.g. x86_64-pc-linux-gnu) and distccd invokes it by exactly the same name. The right version of gcc should be the system's default compiler on all involved compilation hosts.

-march=native

Starting with GCC 4.3.0, the compiler supports the -march=native option which turns on CPU auto-detection and optimizations that are worth being enabled on the processor on which GCC is running. This creates a problem when using distcc because it allows the mixing of code optimized for different processors. For example, running distcc with -march=native on a system that has an AMD Athlon processor and doing the same on another system that has an Intel Pentium processor will mix code compiled on both processors together.

Heed the following warning:

Warning
Do not use -march=native or -mtune=native in the CFLAGS or CXXFLAGS variables of make.conf when compiling with distcc.

See the CFLAGS and CXXFLAGS section and Inlining -march=native for distcc for more information.

Get more output from emerge logs

It is possible to obtain more logging by enabling verbose mode. This is accomplished by adding DISTCC_VERBOSE to /etc/portage/bashrc:

FILE /etc/portage/bashrc Enabling verbose logging
export DISTCC_VERBOSE=1

The verbose logging can then be found in /var/tmp/portage/$CATEGORY/$PF/temp/build.log.

Keep in mind that the first distcc invocation visible in build.log isn't necessarily the first distcc call during a build process. For example, a build server can get a one-minute backoff period during the configuration stage, when some checks are performed using a compiler (distcc sets a backoff period when compilation on a remote server has failed; it doesn't matter whether it failed on the local machine or not).

Dig into the /var/tmp/portage/$CATEGORY/$PF/work/ directory to investigate such situations. Find other logs, or call make explicitly from within the working directory.

Another interesting variable to use is DISTCC_SAVE_TEMPS. When set, it saves the standard output/error from a remote compiler which, for Portage builds, results in files in the /var/tmp/portage/$CATEGORY/$PF/temp/ directory.

FILE /etc/portage/bashrc Saving temporary output
export DISTCC_SAVE_TEMPS=1

See also

  • The DistCC Cross-compiling guide explains how using one architecture to build programs for another architecture is done through distcc. This can be as simple as using an Athlon (i686) to build a program for a K6-2 (i586), or using a SPARC to build a program for a PowerPC.


This page is based on a document formerly found on our main website gentoo.org.
The following people contributed to the original document: Lisa Seelye, Mike Gilbert (floppym), Erwin, Sven Vermeulen (SwifT), Lars Weiler, Tiemo Kieft, and
They are listed here because wiki history does not allow for any external attribution. If you edit the wiki article, please do not add yourself here; your contributions are recorded on each article's associated history page.