Distcc

Distcc is a program designed to distribute compiling tasks across a network to participating hosts. It consists of a server, distccd, and a client program, distcc. Distcc can work transparently with ccache, Portage, and Automake with a little setup.

When planning on using distcc to help bootstrap a Gentoo installation, make sure to read Using distcc to bootstrap.

Setup
Before configuring distcc, let's first look into the installation of the package on all hosts.

Dependencies
In order to use distcc, all of the computers on the network need to have the same GCC versions. For example, mixing 3.3.x (where the x varies) is okay, but mixing 3.3.x with 3.2.x may result in compilation errors or runtime errors.

Installing distcc
Distcc ships with a graphical monitor that shows the tasks a computer is sending away for compilation. This monitor is only built when the corresponding USE flag is set.

After configuring the USE setting, install the distcc package:

Auto-starting the distcc daemon
In order to have distccd started automatically, follow the next set of instructions, depending on the init system used.

Using OpenRC
Edit the distccd service configuration and be sure to set the directive that restricts access to trusted hosts only. For added security on multi-homed systems, also use the directive that tells the distccd daemon which IP address to listen on. More information on distcc security can be found at Distcc security notes.

The following example allows the distcc clients running at 192.168.0.4 and 192.168.0.5 to connect to the distccd server running locally:
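As a minimal sketch of what such a setting could look like, the snippet below assembles an option string that permits only the two clients named above. It assumes that access control is expressed with distccd's --allow option, as described in the distccd manual page. The function and variable names are hypothetical; check the comments in the actual configuration file for the variable it expects.

```shell
#!/bin/sh
# Build a distccd option string that allows only the listed clients.
# ASSUMPTION: --allow is the distccd option used for access control.
# The function and variable names are hypothetical, for illustration only.
build_allow_opts() {
    opts=""
    for client in "$@"; do
        # Append one --allow per client, separated by single spaces.
        opts="${opts:+$opts }--allow $client"
    done
    printf '%s\n' "$opts"
}

EXAMPLE_OPTS="$(build_allow_opts 192.168.0.4 192.168.0.5)"
echo "$EXAMPLE_OPTS"
```

The resulting string would be combined with whatever other options the daemon is already started with.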

Now start the distccd daemon on all the participating computers:

Using systemd
Edit the service configuration and add the allowed clients in CIDR format. Here is an example:

Reload the unit files after making such changes:

Enable auto-starting distccd and then start the service:

Configuration
Let's now look into the configuration of distcc.

Specifying participating hosts
Use the distcc-config command to set the list of hosts.

The following is an example list of host definitions. In most cases, variants of lines 1 and 2 suffice. More information about the syntax used in lines 3 and 4 can be found in the distcc manual page.

There are also several other methods of setting up hosts. See the distcc man page (man distcc) for more details.

If compilations should also occur on the local machine, put localhost in the hosts list. Conversely, if the local machine is not to be used to compile, omit it from the hosts list. On a slow machine, using localhost may actually slow things down. Make sure to test the settings for performance.

Let's configure distcc to use the hosts mentioned on the first line in the example:

Setting up Portage to use distcc
Setting up Portage to use distcc is easy. Execute the following steps on each system that should participate in the distributed compiling.

Next, set the two Portage variables as shown below.

A common strategy is to
 * set the value of N to twice the number of total (local + remote) CPU cores + 1, and
 * set the value of M to the number of local CPU cores

The use of -lM prevents spawning too many tasks when some of the distcc cluster hosts are unavailable (which increases the number of simultaneous jobs on the remaining systems) or when an ebuild is configured to disallow remote builds (such as with gcc). This is accomplished by refusing to start additional jobs when the system load is at or above the value of M.

For instance, when there are two quad-core host PCs running distccd and the local PC has a dual-core CPU, then the setting could look like this:
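The arithmetic behind this scenario can be sketched as follows. The snippet only applies the heuristic described above (N is twice the total core count plus one, M is the local core count) and prints the resulting -jN -lM pair. The function name is hypothetical, and the assumption that these options end up in Portage's MAKEOPTS variable is mine rather than something stated here.

```shell
#!/bin/sh
# Compute the -jN and -lM values from core counts, using the heuristic:
#   N = 2 * (local cores + remote cores) + 1
#   M = local cores
# The function name is hypothetical.
distcc_jobs() {
    local_cores=$1
    remote_cores=$2
    n=$(( 2 * ($local_cores + $remote_cores) + 1 ))
    m=$local_cores
    printf '%s\n' "-j${n} -l${m}"
}

# Two quad-core helpers (8 remote cores) plus a dual-core local machine.
distcc_jobs 2 8    # prints: -j21 -l2
```

With 10 cores in total, N becomes 2 * 10 + 1 = 21, while M stays at the 2 local cores.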

While editing the file, make sure that it does not have -march=native in the compiler flag variables. distcc will not distribute work to other machines if -march is set to native. The appropriate -march value can be obtained by running the following command:

See Inlining -march=native for distcc for more information.

Setting up distcc to work with automake
This is, in some cases, easier than the Portage setup. All that is needed is to update the PATH variable to include the distcc location in front of the directory that contains gcc. However, there is a caveat: if ccache is used, then put the distcc location after the ccache one:
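The ordering rule can be illustrated with a small sketch. The directory names below are placeholders, not the real installation paths, and the function name is hypothetical; substitute the locations used by the installed ccache and distcc packages.

```shell
#!/bin/sh
# Prepend compiler wrapper directories to a search path.
# The ccache directory goes first so that ccache is consulted before distcc.
# All directory arguments are placeholders for illustration.
prepend_wrappers() {
    ccache_dir=$1
    distcc_dir=$2
    base_path=$3
    printf '%s\n' "${ccache_dir}:${distcc_dir}:${base_path}"
}

# Hypothetical locations, shown only to illustrate the ordering.
prepend_wrappers /path/to/ccache/bin /path/to/distcc/bin /usr/bin:/bin

# A real session would then export the result, for example:
#   PATH="$(prepend_wrappers /path/to/ccache/bin /path/to/distcc/bin "$PATH")"
#   export PATH
```

Because the shell searches PATH from left to right, the ccache entry is found first and the distcc entry second.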

Put this in the user's shell startup file (or equivalent) to have PATH set every time the user logs in, or set it globally through a system-wide environment file.

Instead of calling just make, add -jN (where N is an integer). The value of N depends on the network and the types of computers that are used to compile. A heuristic approach to the right value is given earlier in this article.

Using distcc to bootstrap
Using distcc to bootstrap (i.e. build a working toolchain before installing the remainder of the system) requires some additional steps.

Step 1: configure Portage
Boot the new box with a Gentoo Linux LiveCD and follow the installation instructions, while keeping track of the instructions in the Gentoo FAQ for information about bootstrapping. Then configure Portage to use distcc:

Update the  variable in the installation session as well:

Step 2: getting distcc
Install distcc:

Step 3: setting up distcc
Run distcc-config --install to set up distcc; substitute the placeholders in the example with the IP addresses or hostnames of the participating nodes.

Distcc is now set up to bootstrap! Continue with the proper installation instructions and do not forget to run emerge distcc after running emerge @system. This is to make sure that all of the necessary dependencies are installed.

Distcc extras
The distcc application has additional features and applications to support working in a distcc environment.

Distcc monitors
Distcc ships with two monitoring utilities. The text-based monitoring utility is always built and is called distccmon-text. Running it for the first time can be a bit confusing, but it is really quite easy to use. If the program is run with no parameter, it will run just once. However, if it is passed a number, it will update every N seconds, where N is the argument that was passed.

The other monitoring utility is only enabled when the corresponding USE flag is set. This one is GTK+ based, runs in an X environment, and it is quite lovely. For Gentoo, the GUI monitor has been renamed to distccmon-gui to make it less confusing (it is originally called distccmon-gnome).

To monitor Portage's distcc usage:

A trick is to set  in environment variables:

Now update the environment:

Finally, start the GUI application:

Setting up distcc to work with ssh
Setting up distcc via SSH has some pitfalls. First, generate an SSH key pair without a passphrase. Be aware that Portage compiles programs as the portage user, so the keys need to be stored in that user's home directory.

Second, create a section for each host in the SSH configuration file:

Send the public key to each compilation node:

Also make sure that each host is listed in the SSH known hosts file, and append the public key to the authorized keys file on each host.

Fix the file permissions as follows:

To set up the hosts test1 and test2, run:

Please note the @ sign in front of each host name, which marks it as an SSH host for distcc.
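As a small illustration of the @ convention, the sketch below turns plain host names into SSH host entries by prefixing each with @. The function name is hypothetical, and the snippet only builds the host list string; it does not show how the list is handed to distcc.

```shell
#!/bin/sh
# Turn plain host names into distcc SSH host entries by prefixing "@".
# The function name is hypothetical.
ssh_hosts() {
    out=""
    for h in "$@"; do
        # Separate entries with single spaces.
        out="${out:+$out }@$h"
    done
    printf '%s\n' "$out"
}

ssh_hosts test1 test2    # prints: @test1 @test2
```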

Troubleshooting
If a problem occurs while using distcc, then this section might help in resolving the problem.

ERROR: failed to open
As of January 22nd, 2015, emerging distcc fails to create the proper log file. This apparently only affects version 3.1-r8 of distcc, and the bug is in the process of being corrected. It is possible to work around this by manually creating the log file, giving it proper ownership, and restarting the distccd daemon:

Next, update the log path in the distccd configuration file so that it points to the directory created in the previous step:

Finally, restart the distccd service:

Some packages don't use distcc
As various packages are installed, users will notice that some of them aren't being distributed (and aren't being built in parallel). This may happen because the package doesn't support parallel operations, or because the maintainer of the ebuild has explicitly disabled parallel operations due to a known problem.

Sometimes distcc might cause a package to fail to compile. If this happens, please report it.

Mixed GCC versions
If the environment hosts different GCC versions, there will likely be very weird problems. The solution is to make certain all hosts have the same GCC version.

Recent Portage updates have made Portage use a host-prefixed compiler name instead of plain gcc. This means that if i686 machines are mixed with other types (i386, i586), the builds will run into trouble. A workaround for this may be to run export CC='gcc' CXX='c++' as root in a terminal, or to set these variables persistently.

-march=native
Starting with GCC 4.3.0, the compiler supports the -march=native option, which turns on CPU auto-detection and the optimizations that are worth enabling on the processor on which GCC is running. This creates a problem when using distcc because it allows the mixing of code optimized for different processors. For example, running distcc with -march=native on a system that has an AMD Athlon processor and doing the same on another system that has an Intel Pentium processor will mix code compiled on both processors together.

Heed the following warning:

To know the flags that GCC would enable when called with -march=native, execute the following:
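One way to inspect this, offered as a sketch and not necessarily the command this page originally showed, is to ask GCC to list its target options with -march=native applied. It relies on GCC's -Q --help=target options and requires gcc to be installed on an architecture where -march=native is supported. The function name is hypothetical.

```shell
#!/bin/sh
# List the target options GCC reports when -march=native is in effect.
# -Q --help=target prints each target option together with its setting.
# The function name is hypothetical.
native_target_flags() {
    gcc -march=native -Q --help=target 2>/dev/null
}

if command -v gcc >/dev/null 2>&1; then
    # Show only the lines that mention the resolved -march/-mtune values.
    native_target_flags | grep -E -- '-(march|mtune)='
else
    echo "gcc not found" >&2
fi
```

The value reported for -march can then be used in place of native, so that every host in the cluster compiles for the same processor type.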

Cross-compiling
Cross-compiling is using one architecture to build programs for another architecture. This can be as simple as using an Athlon (i686) to build a program for a K6-2 (i586), or using a SPARC to build a program for a PowerPC. This is documented in the DistCC Cross-compiling guide.

External resources

 * Inlining -march=native for distcc
 * Distcc homepage