Distcc

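Distcc is a program designed to distribute compilation tasks across a network to participating hosts. It consists of a server, distccd, and a client program, distcc. Distcc can work transparently with ccache and Portage, and with Automake after a small amount of setup.

If you plan to use distcc to help bootstrap a Gentoo installation, be sure to read Using distcc to bootstrap first.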

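Installation
Before configuring distcc, first look at installing the package on all hosts.

Requirements for all hosts
To use distcc, all computers on the network need to have the same GCC version. For example, mixing 3.3.x releases (where x varies) is fine, but mixing 3.3.x with 3.2.x may result in compilation or runtime errors.

Verify that all systems use the same version of binutils (eselect binutils list); otherwise many packages will fail to link with various errors such as text relocations.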

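Emerge
Distcc ships with a graphical monitor for watching the compilation tasks that the computer sends out. This monitor is enabled when the corresponding USE flag is set.

After configuring the USE setting, install the package:

Service
To have distccd start automatically, follow the instructions below.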

OpenRC
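Edit the distccd configuration file and make sure the --allow directive permits only trusted clients. For added security, use the --listen directive to tell the daemon which IP address to listen on (for multi-homed systems). For more information about security, see the Distcc security notes.

The following example allows two specific distcc clients to connect to the distccd server running locally: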
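As a sketch of what such a setting might look like on an OpenRC system, the fragment below assumes that the daemon options are kept in a DISTCCD_OPTS variable in /etc/conf.d/distccd. The file path, the log file location, and the two client addresses (192.168.0.4 and 192.168.0.5) are illustrative assumptions and should be adjusted to the local setup.

```shell
# Assumed location: /etc/conf.d/distccd (read by the init script as shell).
# 192.168.0.4 and 192.168.0.5 are placeholder addresses of trusted clients.
DISTCCD_OPTS="--port 3632 --log-level notice --log-file /var/log/distccd.log -N 15"
DISTCCD_OPTS="${DISTCCD_OPTS} --allow 192.168.0.4 --allow 192.168.0.5"
```

Each trusted client gets its own --allow entry; anything not listed is refused by the daemon.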

When logging to a file, create the log file and give it appropriate permissions:

Now start the distccd daemon on all participating computers:

systemd
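Edit the service configuration file to add the allowed clients in CIDR format. The following example adds all IP addresses in the 192.168.1.xxx range:

Alternatively, an example with multiple clients and a manually specified log level: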

To set the proper environment variables for, place them into. For example,

For workaround, you need to edit distccd.service by running the following command.

This will open up an editor. Change the line with  directive to

Alternatively, you could also write a shell script wrapper for.

After making such changes, reload the unit files:

Enable automatic startup, then start the service:

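Specifying participating hosts
Use the distcc-config command to set the list of hosts.

The following is an example list of host definitions. In most cases, variants of lines 1 and 2 suffice. Line 2 uses the /limit syntax to tell distcc the maximum number of jobs to launch on that node. For more information about the syntax used in lines 3 and 4, see the distcc manual page.

There are several other ways to set up hosts. See the distcc manual page for more details.

If compilation should also take place on the local machine, put localhost in the hosts list. Conversely, if the local machine is not to be used for compiling, omit it from the hosts list. On a slow machine, using localhost may actually slow things down. Make sure to test the settings for performance.

Now configure distcc to use the hosts mentioned on the first line of the example: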
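A minimal sketch of this step follows. The three addresses are placeholders rather than values taken from this article, and the snippet only calls distcc-config when it is actually installed, so it can be tried safely on a machine without distcc.

```shell
# Placeholder addresses for the participating nodes.
hosts="192.168.0.1 192.168.0.2 192.168.0.3"

if command -v distcc-config >/dev/null 2>&1; then
    # Writing the system-wide host list normally requires root privileges.
    distcc-config --set-hosts "$hosts" || echo "distcc-config failed" >&2
else
    echo "distcc-config not found; would set hosts to: $hosts"
fi
```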

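Distcc also supports a pump mode, invoked through the pump command. This may significantly reduce build time when many files are compiled in parallel. It caches preprocessed headers on the server side, which avoids repeatedly uploading and preprocessing those header files.

To configure a host for pump mode, add the ,cpp,lzo suffix to the host definition. Pump mode requires both the cpp and lzo flags (regardless of whether the files are C or C++).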

Hosts also need to be in:

Optionally, to set the maximum number of threads used by a host, add a forward slash "/" after each host:

The same applies to the command. If the maximum threads number is not specified, it will default to 4.
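To make the limit syntax concrete, the helper below extracts the job limit from a single host definition and falls back to 4 when none is given, following the default stated above. The function name host_limit is hypothetical, the addresses are placeholders, and the parsing is a simplified sketch that covers plain, port-qualified, SSH-style, and option-suffixed entries only.

```shell
# Print the job limit encoded in one host definition (simplified sketch).
host_limit() {
    spec=${1%%,*}                  # drop option suffixes such as ",cpp,lzo"
    case "$spec" in
        *@*) spec=${spec%%:*} ;;   # SSH-style host: drop the ":command" part
    esac
    case "$spec" in
        */*) printf '%s\n' "${spec##*/}" ;;   # text after the slash is the limit
        *)   printf '4\n' ;;                  # no limit given: default of 4
    esac
}

host_limit 192.168.0.1/2            # prints 2
host_limit 192.168.0.2              # prints 4
host_limit 192.168.0.3/10,cpp,lzo   # prints 10
```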

With Portage
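Setting up Portage to use distcc is easy. It is a matter of enabling the distcc feature and setting a suitable value for the number of simultaneous build jobs (since distcc increases the amount of build resources).

Set the MAKEOPTS and FEATURES variables as shown below.

A common strategy is to
 * set the job count (the -j value) to twice the number of total (local + remote) CPU cores plus 1, and
 * set the load limit (the -l value) to the number of local CPU cores

The use of -l in the MAKEOPTS variable prevents spawning too many tasks when some of the distcc cluster hosts are unavailable (which increases the number of simultaneous jobs on the other systems) or when an ebuild is configured to disallow remote builds (such as with gcc). This is accomplished by refusing to start additional jobs when the system load is at or above the -l value.

For instance, when there are two quad-core host PCs running distccd and the local PC has a dual-core CPU, the MAKEOPTS variable could look like this: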
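The arithmetic behind that strategy can be sketched as a small helper. The function name suggest_makeopts is hypothetical; it simply applies the rule of thumb described above (jobs = 2 x total cores + 1, load limit = local cores) and prints a candidate MAKEOPTS line.

```shell
# Usage: suggest_makeopts LOCAL_CORES REMOTE_CORES
suggest_makeopts() {
    local_cores=$1
    remote_cores=$2
    # Twice the total number of cores, plus one.
    jobs=$(( 2 * (local_cores + remote_cores) + 1 ))
    printf 'MAKEOPTS="-j%s -l%s"\n' "$jobs" "$local_cores"
}

# Two quad-core remote hosts (8 cores) plus a dual-core local machine.
suggest_makeopts 2 8   # prints MAKEOPTS="-j21 -l2"
```

The result is only a starting point; the best values depend on the network and the hosts involved.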

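CFLAGS and CXXFLAGS
While editing make.conf, make sure that it does not have -march=native in the CFLAGS or CXXFLAGS variables. If -march is set to native, distcc will not distribute work to other machines. An approximately equivalent set of explicit -march and machine flags can be obtained by running the following command:

See Inlining -march=native for distcc for more information.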
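As a quick sanity check, a flag string can be scanned for the problematic option before it goes into the configuration. The helper name has_march_native and the example flag string are illustrative assumptions; the check only handles space-separated flags and does not read any real configuration file.

```shell
# Return success if the given flag string contains -march=native.
has_march_native() {
    case " $1 " in
        *" -march=native "*) return 0 ;;
        *) return 1 ;;
    esac
}

flags="-O2 -march=native -pipe"   # example value, not read from make.conf
if has_march_native "$flags"; then
    echo "warning: -march=native found in: $flags"
fi
```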

With automake
This is, in some cases, easier than the Portage setup. All that is needed is to update the PATH variable to include the distcc directory in front of the directory that contains gcc. However, there is a caveat. If ccache is used, then put the distcc location after the ccache one:

Put this in the user's shell startup file (or an equivalent) to have the PATH set every time the user logs in, or set it globally through a system-wide environment file.

Instead of calling make alone, add -jN (where N is an integer). The value of N depends on the network and the types of computers that are used to compile. A heuristic approach to the right value is given earlier in this article.
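The ordering can be sketched as follows. Both directory names are assumptions about where the compiler wrapper links typically live on a Gentoo system and are not taken from this article; verify them locally before use.

```shell
# Assumed wrapper directories; adjust to the local installation.
CCACHE_BIN=/usr/lib/ccache/bin
DISTCC_BIN=/usr/lib/distcc/bin

# ccache first, then distcc, then the regular search path.
export PATH="${CCACHE_BIN}:${DISTCC_BIN}:${PATH}"
```

With the path in place, a build would then be started with something like make -j8, where 8 stands in for the chosen N.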

With ccache
To make ccache work with distcc, some prerequisites must be fulfilled:
 * Ccache is successfully set up locally
 * Distcc is successfully set up on the desired hosts

The following setup will work as follows:

Configure distccd
In order to let the daemon use, it must masquerade the path  with. Furthermore, when it uses, should use the prefix  :

Additionally must be aware of the environment variables DISTCC_DIR and CCACHE_DIR :

Next, update the environment variables:

Finally, restart the daemon to apply all changes:

Configure ccache
First, prepare the cache directories:

The second command will create the first level directories from  to,  to  and. The following loop will then look for the first level directories, excluding the current directory  and. It then descends into each of them, creates the second level directories from to  and  to   and goes back to the previous directory , which is.

When the preparation is done, every directory, including the cache directory itself, must be owned by the user distcc:

Configure portage
To use Portage with distcc and ccache, make sure that both features are enabled and that CCACHE_DIR is set in the Portage configuration:

It might be redundant to set CCACHE_DIR here, since it is already defined in, mentioned here. But to make absolutely sure, configure it like that.

Remote
First enable verbose logging by setting  to   in :

After that, restart the daemon to apply the changes:

Also check whether any directories in the cache directory (including the directory itself) are not owned by the user distcc, and correct their ownership:

Client
Make sure that the following environment variables are present in the current shell:

After that, navigate to a temporary directory within and compile the example mentioned below:

This will provide a verbose output, while also keeping temporary files receiving from the remote site in by default:

Any errors occurring on the remote site are also saved.

If the compilation was successful, the following line will be shown.

On the remote site, it will look like this:

The important part here, is, that any symlink of is a save symlink to.

Also, on the remote site, there should be the cached file in, assuming, the example with its filename was copied from this wiki article. Generally, one can monitor the ccache size using, while compiling.

Testing distcc with ccache using emerge
Check, if necessary environment variables are present for the current shell, see here and that was configured properly, see here.

To produce some cached files on the remote site, one can compile small packages like  and   on the client:

Future usage
Make sure that the following environment variables are always set in the desired shell:

To bootstrap
Using distcc to bootstrap (i.e. build a working toolchain before installing the remainder of the system) requires some additional steps.

Step 1: Configure Portage
Boot the new box with a Gentoo Linux LiveCD and follow the installation instructions, while keeping track of the instructions in the Gentoo FAQ for information about bootstrapping. Then configure Portage to use distcc:

Update the PATH variable in the installation session as well:

Step 2: Getting distcc
Install distcc:

Step 3: Setting up distcc
Run distcc-config to set up distcc; substitute the placeholder hosts in the example with the IP addresses or hostnames of the participating nodes.

Distcc is now set up to bootstrap! Continue with the proper installation instructions and do not forget to run after running. This is to make sure that all of the necessary dependencies are installed.

Extras
The distcc application has additional features and applications to support working in a distcc environment.

Monitoring utilities
Distcc ships with two monitoring utilities. The text-based monitoring utility is always built and is called distccmon-text. Running it for the first time can be a bit confusing, but it is really quite easy to use. If the program is run with no parameter, it runs just once. However, if it is passed a number, it updates every N seconds, where N is the argument that was passed.

The other monitoring utility is only enabled when the corresponding USE flag is set. This one is GTK based, runs in an X environment, and it is quite lovely. For Gentoo, the GUI monitor has been renamed to distccmon-gui to make it less confusing (it is originally called distccmon-gnome).

To monitor Portage's distcc usage:

A trick is to set DISTCC_DIR in environment variables:

Now update the environment:

Finally, start the GUI application:

SSH for communication
Setting up distcc via SSH includes some pitfalls. First, generate an SSH key pair without password setup. Be aware that portage compiles programs as the Portage user (or as root if  is not set). The home folder of the Portage user is, which means the keys need to be stored in

Second, create a section for each host in the SSH configuration file:

Send the public key to each compilation node:

Also make sure that each host is available in the file:

Fix the file ownership as follows:

To set up the hosts  and , run:

Please note the @ sign, which specifies SSH hosts for distcc.

Finally, tell distcc which SSH binary to use:

It is not necessary to run the distccd initscript on the hosts when distcc communicates via SSH.

Reverse SSH
As an alternative to distcc's built-in SSH solution, a compiling server can connect to the distcc client via SSH, redirecting the client's distcc TCP port to the compiling server. There is no need for password-less SSH keys on the client.

Note that distcc uses localhost as a literal keyword with a special purpose, so 127.0.0.1 has to be used instead. For multiple compiling servers, each needs its own port redirection on the client (e.g. 127.0.0.1:4000, 127.0.0.1:4001, etc.). Make sure that these IP addresses and ports are listed in the hosts configuration on the client.
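The per-server port pattern above can be generated rather than typed by hand. The helper below is a hypothetical sketch: reverse_hosts is not part of distcc, and the starting port of 4000 simply follows the example in the text.

```shell
# Usage: reverse_hosts SERVER_COUNT [FIRST_PORT]
# Prints one 127.0.0.1:PORT entry per compiling server, space-separated.
reverse_hosts() {
    count=$1
    base_port=${2:-4000}
    i=0
    list=""
    while [ "$i" -lt "$count" ]; do
        list="${list}${list:+ }127.0.0.1:$((base_port + i))"
        i=$((i + 1))
    done
    printf '%s\n' "$list"
}

reverse_hosts 2   # prints 127.0.0.1:4000 127.0.0.1:4001
```

Each compiling server would then open its own remote forward to the client, for example with something like ssh -R 4000:127.0.0.1:3632 user@client, where user@client is a placeholder and 3632 is distcc's default port.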

Testing
To test distcc, write a simple Hello distcc program and run distcc in verbose mode to see if it communicates properly.

Next, turn on verbose mode, compile the program using distcc and link the generated object file into an executable:
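A minimal sketch of such a test run is shown below. The file name main.c and the temporary working directory are arbitrary choices, and the compile step is skipped when distcc or gcc is not installed, so the snippet stays harmless on an unprepared machine.

```shell
# Write a tiny test program into a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/main.c" <<'EOF'
#include <stdio.h>

int main(void) {
    printf("Hello distcc!\n");
    return 0;
}
EOF

# Ask distcc to report what it is doing.
export DISTCC_VERBOSE=1

if command -v distcc >/dev/null 2>&1 && command -v gcc >/dev/null 2>&1; then
    # Compile through distcc, link locally, then run the result.
    (cd "$workdir" && distcc gcc -c main.c -o main.o && gcc main.o -o main && ./main) \
        || echo "test build failed; inspect the verbose output above" >&2
else
    echo "distcc or gcc not found; source written to $workdir/main.c"
fi
```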

There should be a bunch of output about distcc finding its configuration, selecting the host to connect to, starting to connect to it, and ultimately compiling. If the output does not list the desired hosts, check the configuration.

Finally, ensure the compiled program works properly. To test each host, enumerate each compile host in the hosts file.

Troubleshooting
If a problem occurs while using distcc, then this section might help in resolving the problem.

ERROR: failed to open
As of January 22nd, 2015, emerging distcc fails to create the proper log file. This apparently only affects version 3.1-r8 of distcc, and the bug is in the process of being corrected. It is possible to work around this by manually creating the log file, giving it proper ownership, and restarting the distccd daemon:

Next update the path of the  configuration file in  to the  directory created in the step before:

Finally, restart the distccd service:

Some packages do not use distcc
As various packages are installed, users will notice that some of them aren't being distributed (and aren't being built in parallel). This may happen because the package's build system doesn't support parallel operations, or because the maintainer of the ebuild has explicitly disabled parallel operations due to a known problem.

Sometimes distcc might cause a package to fail to compile. If this happens, please report it.

The Rust package is known to cause excessive I/O utilization, as --local-load is ignored and --jobs is usually too high for local build resources. A package.env entry needs to be provisioned with non-distcc MAKEOPTS values to work around this behaviour.

Mixed GCC versions
If the environment hosts different GCC versions, there will likely be very weird problems. The solution is to make certain all hosts have the same GCC version.

Recent Portage updates have made Portage use  (minus gcc) instead of. This means that if i686 machines are mixed with other types (i386, i586) then the builds will run into troubles. A workaround for this may be to run:

It is also possible to set the CC and CXX variables to the values listed in the command above.

-march=native
Starting with GCC 4.3.0, the compiler supports the -march=native option, which turns on CPU auto-detection and the optimizations worth enabling on the processor on which GCC is running. This creates a problem when using distcc because it allows the mixing of code optimized for different processors. For example, running distcc with -march=native on a system that has an AMD Athlon processor and doing the same on another system that has an Intel Pentium processor will mix code compiled on both processors together.

Heed the following warning:

See the CFLAGS and CXXFLAGS section and Inlining -march=native for distcc for more information.

Network is unreachable
Due to the network restrictions introduced by Portage's network-sandbox feature, you may run into this issue. Since distcc conflicts with this security feature, it has to be disabled:

Get more output from emerge logs
It is possible to obtain more logging by enabling verbose mode. This is accomplished by adding DISTCC_VERBOSE to :

The verbose logging can then be found in.

Keep in mind that the first distcc invocation visible in the log isn't necessarily the first distcc call during a build process. For example, a build server can get a one-minute backoff period during the configuration stage, when some checks are performed using a compiler (distcc sets a backoff period when compilation on a remote server failed; it doesn't matter whether it also failed on the local machine or not).

Dig into the directory to investigate such situations. Find other logs, or call explicitly from within the working directory.

Another interesting variable to use is DISTCC_SAVE_TEMPS. When set, it saves the standard output/error from a remote compiler which, for Portage builds, results in files in the directory.

Failed to create directory /dev/null/.cache/ccache/tmp: Not a directory
This error can be discovered from the standard error output file on the server if DISTCC_SAVE_TEMPS is set. It only occurs when using distcc with ccache.

Most likely, CCACHE_DIR is not properly set, or is not passed correctly to distccd. ccache will then default to $HOME/.cache/ccache as its cache folder. However, ccache is run by distccd under the user distcc, which is a non-login account. See the systemd section and the With ccache section for setting CCACHE_DIR.

Portage build failing with errors that are apparently not connected with distcc at all
When builds are failing with errors that do not seem to be connected to distcc, but the build works with FEATURES="-distcc", it has been reported that builds sometimes fail because of DISTCC_VERBOSE=1. Try the build with DISTCC_VERBOSE=0.

External resources

 * Inlining -march=native for distcc
 * Distcc on Github
 * Distcc homepage