Distcc

From Gentoo Wiki
Revision as of 11:15, 10 April 2014 by Sreich (Talk | contribs)

Distcc is a program designed to distribute compiling tasks across a network to participating hosts. It consists of a server, distccd, and a client program, distcc. Distcc can work transparently with ccache, Portage, and Automake with a little setup.

If you are planning on using distcc to help you bootstrap a Gentoo installation, make sure you read the section Using Distcc to Bootstrap.

Setup

Dependencies

In order to use Distcc, all of the computers on your network need to run the same GCC version. For example, mixing 3.3.x releases (where the x varies) is okay, but mixing 3.3.x with 3.2.x may result in compilation errors or runtime errors.
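The version rule above can be checked with a small helper before adding a host. This is a minimal sketch, not part of distcc: gcc_series and same_gcc_series are hypothetical names, and it assumes the version strings were collected by running gcc -dumpversion on each machine.

```shell
# gcc_series VERSION: print the major.minor part, e.g. 3.3.6 -> 3.3
gcc_series() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# same_gcc_series V1 V2: succeed only if both versions share major.minor
same_gcc_series() {
    [ "$(gcc_series "$1")" = "$(gcc_series "$2")" ]
}

# Example with the versions mentioned in the text (hard-coded for illustration).
if same_gcc_series 3.3.1 3.3.6; then
    echo "3.3.1 and 3.3.6: okay to mix"
fi
if ! same_gcc_series 3.3.6 3.2.3; then
    echo "3.3.6 and 3.2.3: do not mix"
fi
```

On a real network the two arguments would come from each host, for example by running gcc -dumpversion locally and over ssh on the helper.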

Installing Distcc

There are a couple of options you should be aware of before you start installing distcc.

Distcc ships with a graphical monitor to monitor tasks that a computer is sending away for compilation, enabled with the gtk USE flag.

root # emerge --ask distcc
Important
Remember, you must be sure to install distcc on all of your participating machines.

Setting up Portage to use Distcc

Setting up Portage to use distcc is easy. Execute the following steps on each system that should participate in the distributed compiling:

root # emerge --ask distcc

Now set the MAKEOPTS and FEATURES variables as shown below. A common strategy is to set N to twice the total number of (local + remote) CPUs plus 1, and M to the number of local CPUs. The -lM flag limits the load average at which make starts new jobs, which prevents spawning too many tasks when distcc hosts are unavailable or when an ebuild requires a local-only compile (e.g. gcc).

root # nano -w /etc/portage/make.conf
MAKEOPTS="-jN -lM"
FEATURES="distcc"
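The N and M values can be worked out with a few lines of shell. This is only a sketch of the strategy described above, reading it as N = 2 × (local + remote) + 1; suggest_makeopts is a hypothetical helper, and the remote CPU count of 8 is an assumed example value.

```shell
# suggest_makeopts LOCAL_CPUS REMOTE_CPUS
# Prints a MAKEOPTS line with N = 2 * (local + remote) + 1 and M = local.
suggest_makeopts() {
    local_cpus="$1"
    remote_cpus="$2"
    jobs=$(( 2 * ($local_cpus + $remote_cpus) + 1 ))
    printf 'MAKEOPTS="-j%s -l%s"\n' "$jobs" "$local_cpus"
}

# Example: this machine's CPU count (falls back to 1 if nproc is missing)
# plus 8 remote CPUs (assumed for illustration).
suggest_makeopts "$(nproc 2>/dev/null || echo 1)" 8
```

With 4 local and 8 remote CPUs this yields MAKEOPTS="-j25 -l4". Treat the result as a starting point and test other values for your own network.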

Specifying Participating Hosts

Use the distcc-config command to set the list of hosts. Here is an example of some hosts that might be in your list:

Code: Examples of host definitions

192.168.0.1          192.168.0.2                       192.168.0.3
192.168.0.1/2        192.168.0.2                       192.168.0.3/10
192.168.0.1:4000/2   192.168.0.2/1                     192.168.0.3:3632/4
@192.168.0.1         @192.168.0.2:/usr/bin/distccd     192.168.0.3

There are also several other methods of setting up hosts. See the distcc manpage for more details.

If you wish to compile on the local machine you should put 'localhost' in the hosts list. Conversely if you do not wish to use the local machine to compile (which is often the case) omit it from the hosts list. On a slow machine using localhost may actually slow things down. Make sure to test your settings for performance.

It may all look complicated, but in most cases a variant of line 1 or 2 will work.

Since most people won't be using lines 3 or 4, I'll refer to the distcc docs (man distcc) for more information, which includes being able to run distcc over an SSH connection.

For instance, to set the first line in the previous example:

root # /usr/bin/distcc-config --set-hosts "192.168.0.1 192.168.0.2 192.168.0.3"
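Entries with a job limit or a non-default port, as in lines 2 and 3 of the host definitions, follow the pattern HOST[:PORT][/LIMIT]. The sketch below assembles such entries; host_spec is a hypothetical helper, not a distcc command, and the limits are example values.

```shell
# host_spec HOST [LIMIT] [PORT]
# Prints HOST, optionally followed by :PORT and /LIMIT.
host_spec() {
    spec="$1"
    if [ -n "${3:-}" ]; then spec="${spec}:$3"; fi
    if [ -n "${2:-}" ]; then spec="${spec}/$2"; fi
    printf '%s\n' "$spec"
}

# Rebuild line 2 of the host definitions.
hosts="$(host_spec 192.168.0.1 2) $(host_spec 192.168.0.2) $(host_spec 192.168.0.3 10)"
echo "$hosts"
# prints: 192.168.0.1/2 192.168.0.2 192.168.0.3/10

# The result would then be passed on as root:
#   /usr/bin/distcc-config --set-hosts "$hosts"
```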

Edit /etc/conf.d/distccd to your needs and be sure to set the --allow directive to allow only hosts you trust. For added security, you should also use the --listen directive to tell the distcc daemon what IP to listen on (for multi-homed systems). More information on distcc security can be found at Distcc Security Design.

Important
It is important to use --allow and --listen. Please read the distccd manpage or the above security document for more information.
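As an illustration of the --allow and --listen directives, a fragment of /etc/conf.d/distccd might look like the following. It assumes the file passes its options to the daemon through a DISTCCD_OPTS variable; check the file shipped on your system. The 192.168.0.0/24 network and the 192.168.0.4 listen address are example values.

```shell
# /etc/conf.d/distccd (fragment; addresses are examples only)

# Default distccd port.
DISTCCD_OPTS="--port 3632"

# Accept jobs only from the trusted local network.
DISTCCD_OPTS="${DISTCCD_OPTS} --allow 192.168.0.0/24"

# On a multi-homed machine, listen only on the internal address.
DISTCCD_OPTS="${DISTCCD_OPTS} --listen 192.168.0.4"
```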

Now start the distcc daemon on all the participating computers:

root # rc-update add distccd default
root # /etc/init.d/distccd start

Setting up Distcc to Work With Automake

This is, in some cases, easier than the Portage setup. What you have to do is update your PATH variable to include /usr/lib/distcc/bin in front of the directory that contains gcc (/usr/bin). However, there is a caveat. If you use ccache you have to put distcc after the ccache part:

root # export PATH="/usr/lib/ccache/bin:/usr/lib/distcc/bin:${PATH}"

You can put this in your ~/.bashrc or equivalent file to have the PATH set every time you log in.
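When the export lives in ~/.bashrc, every nested shell prepends the directories again. A small guard avoids that. This is a sketch only: prepend_once is a hypothetical helper, and it assumes that either neither or both of the directories are already on the PATH.

```shell
# prepend_once DIR PATHSTRING
# Prints PATHSTRING with DIR in front, unless DIR is already in it.
prepend_once() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;
        *)        printf '%s\n' "$1:$2" ;;
    esac
}

# Add distcc first and ccache second, so that ccache ends up
# ahead of distcc in the final PATH, as required above.
new_path="$(prepend_once /usr/lib/distcc/bin "$PATH")"
new_path="$(prepend_once /usr/lib/ccache/bin "$new_path")"
echo "$new_path"

# In ~/.bashrc this would be followed by:
#   export PATH="$new_path"
```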

Then, where you would normally type make, you would type make -jN (where N is an integer). The value of N depends on your network and the types of computers you are using to compile. Test your own settings to find the number that yields the best performance.

Setting up Distcc to Work With ssh

Setting up distcc via ssh includes some pitfalls. First, generate an SSH key pair without a passphrase. Be aware that Portage compiles programs as the user portage. The home folder of the user portage is /var/tmp/portage, which means the keys need to be stored in /var/tmp/portage/.ssh.

root # ssh-keygen -b 2048 -t rsa -f /var/tmp/portage/.ssh/id_rsa

Second, generate a section for each host in the ssh config file:

root # nano -w /var/tmp/portage/.ssh/config
Host test1
    HostName 123.456.789.1
    Port 1234
    User UserName

Host test2
    HostName 123.456.789.2
    Port 1234
    User UserName

Also make sure that each host is available in the known_hosts file and append your public key to the authorized_keys file of the hosts. To set up the hosts test1 and test2, run:

root # /usr/bin/distcc-config --set-hosts "@test1 @test2"

Please note the '@'-sign, which specifies ssh hosts for distcc.
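One further pitfall: the key and config file above are created as root, while Portage connects as the user portage, and ssh refuses private keys that other users can read. The sketch below tightens the permissions and hands the directory over; fix_ssh_perms is a hypothetical helper, and the portage:portage owner in the usage comment assumes the default Portage user and group.

```shell
# fix_ssh_perms DIR [OWNER]
# Restricts DIR to its owner, makes the private key and config
# owner-only, and optionally changes ownership (requires root).
fix_ssh_perms() {
    dir="$1"
    owner="${2:-}"
    chmod 700 "$dir"
    for f in "$dir/id_rsa" "$dir/config"; do
        if [ -f "$f" ]; then chmod 600 "$f"; fi
    done
    if [ -n "$owner" ]; then chown -R "$owner" "$dir"; fi
}

# Usage, as root:
#   fix_ssh_perms /var/tmp/portage/.ssh portage:portage
```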

Cross-Compiling

Cross-compiling is using one architecture to build programs for another architecture. This can be as simple as using an Athlon (i686) to build a program for a K6-2 (i586), or using a Sparc to build a program for a ppc. This is documented in our DistCC Cross-compiling Guide.

Using Distcc to Bootstrap

Step 1: Configure Portage

Boot your new box with a Gentoo Linux LiveCD and follow the installation instructions up until the bootstrapping part. (See the Gentoo FAQ for more information about bootstrapping.) Then configure Portage to use distcc:

root # nano -w /etc/portage/make.conf
FEATURES="distcc"
MAKEOPTS="-jN"
root # export PATH="/usr/lib/ccache/bin:/usr/lib/distcc/bin:${PATH}"

Step 2: Getting Distcc

Install distcc:

root # USE='-*' emerge --nodeps sys-devel/distcc

Step 3: Setting Up Distcc

Run distcc-config --set-hosts to set up distcc; substitute host* with the IP addresses or hostnames of the participating DistCC nodes.

root # /usr/bin/distcc-config --set-hosts "localhost host1 host2 host3 ..."

Distcc is now set up to bootstrap! Continue with the official installation instructions and do not forget to re-emerge distcc after emerge system. This is to make sure that all of the dependencies you want are installed as well.

Note
During bootstrap and emerge system distcc may not appear to be used. This is expected as some ebuilds do not work well with distcc, so they intentionally disable it.

Troubleshooting

Some Packages Don't Use Distcc

As you emerge various packages, you'll notice that some of them aren't being distributed (and aren't being built in parallel). This may happen because the package's Makefile doesn't support parallel operations or the maintainer of the ebuild has explicitly disabled parallel operations due to a known problem.

Sometimes distcc might cause a package to fail to compile. If this happens for you, please report it to us.

Mixed GCC Versions

If you have different GCC versions on your hosts, there will likely be very weird problems. The solution is to make certain all hosts have the same GCC version.

Recent Portage updates have made Portage use ${CHOST}-gcc instead of gcc. This means that if you're mixing i686 machines with other types (i386, i586) you will run into problems. A workaround for this may be to export CC='gcc' CXX='c++' or to put it in /etc/portage/make.conf.

Important
Doing this explicitly redefines some behavior of Portage and may have some weird results in the future. Only do this if you're mixing CHOSTs.

-march=native

Starting with GCC 4.3.0, the compiler supports the -march=native switch, which turns on CPU autodetection and enables the optimizations that are worthwhile for the processor GCC is running on. This is a problem with distcc because each participating host resolves native against its own CPU, so the result can mix code optimized for different processors (like AMD Athlon and Intel Pentium). Don't use -march=native or -mtune=native in your CFLAGS or CXXFLAGS when compiling with distcc.

To know the flags that GCC would enable when called with -march=native, execute the following:

user $ gcc -march=native -E -v - </dev/null 2>&1 | grep cc1
/usr/libexec/gcc/x86_64-pc-linux-gnu/4.7.3/cc1 -E -quiet -v - -march=corei7-avx \
  -mcx16 -msahf -mno-movbe -mno-aes -mpclmul -mpopcnt -mno-abm -mno-lwp -mno-fma \
  -mno-fma4 -mno-xop -mno-bmi -mno-bmi2 -mno-tbm -mavx -mno-avx2 -msse4.2 -msse4.1 \
  -mno-lzcnt -mno-rdrnd -mno-f16c -mno-fsgsbase --param l1-cache-size=32 \
  --param l1-cache-line-size=64 --param l2-cache-size=6144 -mtune=corei7-avx
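To pull the explicit -march value out of that output, for use in CFLAGS in place of native, a short filter is enough. This is a sketch; extract_march is a hypothetical helper, and the sample line is shortened from the output above.

```shell
# extract_march: read a cc1 command line on stdin and print
# the first -march=... option it contains.
extract_march() {
    tr ' ' '\n' | grep '^-march=' | head -n 1
}

# On a real system:
#   gcc -march=native -E -v - </dev/null 2>&1 | grep cc1 | extract_march

# Demonstration with a shortened copy of the output shown above.
sample='/usr/libexec/gcc/x86_64-pc-linux-gnu/4.7.3/cc1 -E -quiet -v - -march=corei7-avx -mcx16 -msahf -mtune=corei7-avx'
printf '%s\n' "$sample" | extract_march
# prints: -march=corei7-avx
```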

Distcc Extras

Distcc Monitors

Distcc ships with two monitors. The text-based one is always built and is called distccmon-text. Running it for the first time can be a bit confusing, but it is really quite easy to use. If you run the program with no parameter it will run once. However, if you pass it a number it will update every N seconds, where N is the argument you passed.

The other monitor is only enabled through the gtk USE flag. This one is GTK+ based, runs in an X environment and it is quite lovely. For Gentoo, the GUI monitor has been called distccmon-gui for less confusion. Elsewhere it may be referred to as distccmon-gnome.

root # distccmon-text N

or run distccmon-gui:

root # distccmon-gui

To monitor Portage's distcc usage you can use:

root # DISTCC_DIR="/var/tmp/portage/.distcc/" distccmon-text N
root # DISTCC_DIR="/var/tmp/portage/.distcc/" distccmon-gui
Important
If your distcc directory is elsewhere, change the DISTCC_DIR variable accordingly.

Acknowledgements

We would like to thank the following authors and editors for their contributions to this guide:

  • Lisa Seelye
  • Mike Frysinger
  • Erwin
  • Sven Vermeulen
  • Lars Weiler
  • Tiemo Kieft
  • nightmorph