User:Capezotte/s6 on Gentoo

This aims to be a guide on how to set up a Gentoo system with a full s6 stack (s6-linux-init, s6-svscan+s6-supervise, s6-rc), ultimately allowing you to replace sysvinit and OpenRC.

Included in the LEGO set

 * s6-svscan and s6-supervise (and associated tools) are the workhorses of system management. They start services, ensure they are reachable and that there's only one instance of each, and restart them automatically if they crash. Together they form a supervision suite, like runit (without runit-init).
 * s6-linux-init performs the bare minimum of system initialization needed for s6-svscan, executes it as PID 1, and handles shutdown. It's an init process, like sysvinit or runit-init.
 * s6-rc enhances s6-svscan with dependency management and the ability to run oneshots (scripts that do one thing on system startup/shutdown). It's a service manager, like OpenRC.

Getting started
You'll want to emerge s6 (s6-svscan+s6-supervise), s6-rc and s6-linux-init. Avoid emerging the latter with so you can fall back to sysvinit+OpenRC if things go awry.

Introduction to service definitions
Creating services for s6-rc will feel familiar if you've already used runit, but s6-rc also supports one-shot scripts, which let us perform the core system initialization and start the actual long-running services (what runit calls, respectively, stage 1 and 2) with the same tool.

With s6-rc, service definitions are folders with at least one file called type, whose contents are one of these three strings:
 * longrun: what we usually think of as services: programs that run for the lifetime of the machine, providing some sort of functionality (e.g. device management, usually udev). If you choose this type, there must also be an executable file called run. It's usually a script that performs the necessary setup and then execs into the program, which must be in foreground mode (otherwise, s6 will lose track of it). Again, runit users will be familiar with this requirement.
 * oneshot: a small script performing setup before or after a service is started. For instance, mounting filesystems (i.e. calling mount -a), or calling udev to recognize currently plugged devices (i.e. udevadm trigger). If you choose this type, you must provide an file, which is an execline script. For very simple scripts, it will be exactly like shell, but you can't use single quotes. If you aren't willing to fully learn it, no worries -- just write the script in whichever language you like the most, mark it as executable and write its filename to.
 * bundle: a set of longruns, oneshots or even other bundles. You write the name of each item, one per line, to a file named, which is mandatory.

Both longrun and oneshot can have an additional file, which lists, one per line, what other longruns, oneshots or bundles must be working before this service is started.
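As a rough sketch of the three types, the function below lays out one definition of each kind in a source folder. The file names up, contents and dependencies are taken from the s6-rc-compile documentation rather than from this page (newer s6-rc releases also accept contents.d/ and dependencies.d/ directories), and the service names and /usr/bin/hellod are made up for illustration.

```shell
# make_examples DIR: populate DIR with one definition of each s6-rc type.
# Every service name and /usr/bin/hellod are hypothetical.
make_examples() {
    src=$1

    # oneshot: "type" plus an execline command line in "up" (assumed name)
    mkdir -p "$src/hello-oneshot" || return
    echo oneshot > "$src/hello-oneshot/type"
    echo 'echo hello from a oneshot' > "$src/hello-oneshot/up"

    # longrun: "type" plus an executable "run" that stays in the foreground
    mkdir -p "$src/hello-daemon" || return
    echo longrun > "$src/hello-daemon/type"
    printf '%s\n' '#!/bin/sh' 'exec /usr/bin/hellod --foreground' \
        > "$src/hello-daemon/run"
    chmod +x "$src/hello-daemon/run"
    # start only after the oneshot has succeeded (assumed file name)
    echo hello-oneshot > "$src/hello-daemon/dependencies"

    # bundle: "type" plus the list of members in "contents" (assumed name)
    mkdir -p "$src/hello-bundle" || return
    echo bundle > "$src/hello-bundle/type"
    printf '%s\n' hello-oneshot hello-daemon > "$src/hello-bundle/contents"
}
```

Calling it on a scratch folder first (for example make_examples /tmp/s6-src) lets you inspect the result before copying anything into the real service database.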

This is just a very simplistic introduction. For the full description, read the docs.

Creating a service database
Now you need to pick a folder in your system to hold the service definitions (the service database). You can pick any folder for this; I personally prefer /etc/s6-rc/src, but you can put , , or. Create subfolders and start working on the service definitions inside of them.

The bare minimum of core initialization required for most Linux systems is a series of oneshots:


 * Mounting the essential virtual filesystems
 * Parsing
 * Creating the {u,b,w}tmp files (especially if you use ).
 * Reading a few settings from the file system (sysctls, hostname...)
 * Cold-plugging the device manager.

For reference, you can see how Artix implements core s6 services. The ones that do the above steps are:


 * mount-devfs
 * mount-filesystems
 * mount-net
 * mount-procfs
 * mount-sysfs
 * mount-tmpfs
 * sysctl
 * cleanup
 * hostname
 * udevadm
 * udevd-{srv,log}

In addition, to actually get a login prompt, the agetty-* longruns are needed.

For a quick start, you can copy-paste them. Notice that udevd-srv needs to be adjusted as Gentoo and Artix place the udev binary in different folders.

Let's look into a few examples.

Example 1. mount-procfs
This will be familiar to you if you had to set up a Linux container/chroot, or borked your system so hard you had to resort to. On a shell, you'd run:

This is run only once and you're set for the machine's lifetime. This calls for a oneshot type service, with 's contents set to the above command. That's it, this service is good enough to mount /proc on your machine at boot.

However, if this script is called again for some reason, /proc will be mounted again (and on Linux, mounts can stack). This will leave us with two entries. To fix this, we should write the service so that restarting it changes nothing unless it's necessary (idempotence).

The command for checking if a folder is already a mountpoint is mountpoint. So we should only mount /proc if this command fails.

In execline, this translates to.

More information on execline's if here.

Final layout:
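A minimal sketch of that layout, written as a shell function that creates it. It assumes the oneshot's script file is named up (per the s6-rc documentation) and that the command line mirrors the /sys one from the next example; make_mount_procfs is just an illustrative name.

```shell
# make_mount_procfs DIR: create DIR/mount-procfs with its two files.
make_mount_procfs() {
    svc=$1/mount-procfs
    mkdir -p "$svc" || return
    echo oneshot > "$svc/type"
    # Only mount /proc when it is not already a mountpoint (idempotent).
    echo 'if -nt { mountpoint -q /proc } mount -t proc proc /proc' > "$svc/up"
}
```

After make_mount_procfs /etc/s6-rc/src, the folder mount-procfs would hold exactly two files: type (containing oneshot) and up (containing the execline command line).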

Example 2. mount-sysfs
If you were able to understand the previous example, you'd easily arrive at the following command line:

if -nt { mountpoint -q /sys } mount -t sysfs sys /sys

However, within /sys there are other mountpoints that must be handled:
 * securityfs
 * efivarfs.

If it's possible to mount them, the folders will be present. Therefore, the script must check if they're there (not considering it a failure for them to be absent), and, if so, mount them.

if -t { test -d /sys/kernel/security } if -nt { mountpoint -q /sys/kernel/security } mount -t securityfs securityfs /sys/kernel/security

This command line just got fairly long, so let's take advantage of the fact that, in execline, newlines are equivalent to spaces:

if -t { test -d /sys/kernel/security }
if -nt { mountpoint -q /sys/kernel/security }
mount -t securityfs securityfs /sys/kernel/security

But wait. If newlines are equivalent to spaces, how do I chain multiple command lines together? No, not with a semicolon, but with a block.

foreground {
  if -nt { mountpoint -q /sys }
  mount -t sysfs sys /sys
}
if -t { test -d /sys/kernel/security }
if -nt { mountpoint -q /sys/kernel/security }
mount -t securityfs securityfs /sys/kernel/security

Adding efivarfs to the mix, by putting securityfs's command line inside another foreground block, yields the final result.
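One way that final result might look, sketched as a function that writes the whole oneshot. The efivarfs mount point /sys/firmware/efi/efivars is the usual kernel location but is an assumption here, as are the up file name and the function name.

```shell
# write_mount_sysfs DIR: create DIR/mount-sysfs as a oneshot.
write_mount_sysfs() {
    svc=$1/mount-sysfs
    mkdir -p "$svc" || return
    echo oneshot > "$svc/type"
    # Quoted heredoc delimiter: the execline text is written verbatim.
    cat > "$svc/up" <<'EOF'
foreground {
  if -nt { mountpoint -q /sys }
  mount -t sysfs sys /sys
}
foreground {
  if -t { test -d /sys/kernel/security }
  if -nt { mountpoint -q /sys/kernel/security }
  mount -t securityfs securityfs /sys/kernel/security
}
if -t { test -d /sys/firmware/efi/efivars }
if -nt { mountpoint -q /sys/firmware/efi/efivars }
mount -t efivarfs efivarfs /sys/firmware/efi/efivars
EOF
}
```

Each foreground block runs to completion before the next command line starts, and the last command line needs no block because nothing follows it.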

Example 3. sshd
The SSH daemon is a program that will stay up for the lifetime of the machine, so it's a clear longrun.

For the run file, all you need to do is write a script that execs sshd in the foreground (which, by default, it doesn't, so the -D option needs to be given). OpenSSH's sshd also insists on being started through an absolute path.

#!/bin/execlineb -P
/usr/sbin/sshd -D

Unlike a shell script for runit, writing exec is not necessary; that's implied in execline (see the note above).

As a convenience for the user, we could run before  itself, which generates SSH host keys if necessary.
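As a sketch of that convenience, the function below writes an sshd definition whose run script calls ssh-keygen -A (which creates any missing host keys) before starting the daemon. The /usr/sbin/sshd path and the make_sshd name are assumptions; adjust them to your system.

```shell
# make_sshd DIR: create DIR/sshd as a longrun.
make_sshd() {
    svc=$1/sshd
    mkdir -p "$svc" || return
    echo longrun > "$svc/type"
    cat > "$svc/run" <<'EOF'
#!/bin/execlineb -P
foreground { ssh-keygen -A }
/usr/sbin/sshd -D
EOF
    chmod +x "$svc/run"
}
```

The foreground block makes execline wait for ssh-keygen to finish; sshd is then exec'd as the last command, so s6-supervise ends up watching sshd itself.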

Turning the service database into something useful
When you're done, you need to compile the database. s6-rc will build a dependency list and make your s6-rc services ready to be plugged into s6-svscan, and complain if the dependencies/types are wrong. For convenience later (we'll explain in the s6-linux-init part), we'll make it compile to a folder under /etc/s6-rc with a unique name, then symlink /etc/s6-rc/compiled to it:
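A hedged sketch of that compile-and-link step, modelled on the update script shown later on this page. The timestamp naming scheme comes from that script; the compile_db name and the S6_RC_COMPILE override (handy for dry runs) are assumptions.

```shell
# compile_db SRC BASE: compile SRC into BASE/<timestamp>, then point
# BASE/compiled at the result.
compile_db() {
    src=$1     # service definitions, e.g. /etc/s6-rc/src
    base=$2    # where compiled databases live, e.g. /etc/s6-rc
    name=$(date +%s)
    # The compiler refuses to overwrite, hence the unique name.
    "${S6_RC_COMPILE:-s6-rc-compile}" -- "$base/$name" "$src" &&
        ln -sfT -- "$name" "$base/compiled"
}
```

Before the first boot into s6 this would be run as compile_db /etc/s6-rc/src /etc/s6-rc. The symlink is relative, so it keeps working if the whole folder is moved. Once s6-rc is live, the link must not be swapped this way; see the s6-rc-update section further down.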

Setting up s6-linux-init
s6-linux-init will be the first process of the system, and a piece bundled with it will remain operational for the lifetime of the machine, waiting for the fateful shutdown command. It's configured through files in /etc/s6-linux-init/current.

First, let's use included program to create the  directory (though it's still non-functional):


 * With , we can specify an emergency service that will always be started. In this example, it's a getty on tty12 (Ctrl+Alt+F12) which you can use to login to your system even if your s6-rc config is borked.
 * sends system initialization logs to the tty as well as to the file. If this is not specified, only  will have the logs.

More details here.

Of the folders in /etc/s6-linux-init/current, scripts is the most important; the scripts inside are called after s6-svscan is set up (rc.init), and before (rc.shutdown) and after (rc.shutdown.final) every process other than init is killed. Those are the moments where every piece of the LEGO set comes together.


 * rc.init: uncomment  line. This copies the s6-rc → s6-svscan translated service definitions from  (remember him?) to  (where   expects services to be, by default, when spawned from s6-linux-init). However, it won't start any of them.
 * rc.shutdown: uncomment the exec . This will bring all of our services down before the system is powered off.
 * runlevel: uncomment . When we call a runlevel change through   (or on system initialization, as we'll see later), it will be translated into a call to , i.e. start the service/bundle $RUNLEVEL, and stop everything else.
 * rc.shutdown.final can be left alone.

Earlier I told you s6-rc-init won't start any services. So how are we supposed to bring up the services on system boot? If you read down, you'll find:

exec /etc/s6-linux-init/current/scripts/runlevel "$rl"

This is what actually starts our services on boot. If you didn't specify any runlevels on the kernel command line,  will be default. Usually, default will be a bundle with the services you want most of the time. If you didn't create a bundle called default, take the opportunity to remove the symlink and perform that set of commands again (the fact that there's this whole compiling dance on s6-rc is one of its drawbacks, unfortunately).

I recommend not making a single giant default bundle, but rather working with layers. For instance, create a boot bundle with just the filesystems, utmp, ttys and the device manager, and then include this boot bundle inside of default, alongside services like CUPS, bluetoothd, elogind, etc. Adding boot to the command line could then act as a sort of "safe mode", with no potentially misbehaving services, in addition to making  useful.
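A minimal sketch of such layering, assuming bundle members are listed in a file named contents (as in the s6-rc-compile documentation). The member names are borrowed from the Artix list and the sshd example above, except agetty-tty1, which is a hypothetical instance name.

```shell
# make_runlevels DIR: create a "boot" bundle and a "default" bundle
# that includes it.
make_runlevels() {
    src=$1
    mkdir -p "$src/boot" "$src/default" || return

    # Inner layer: just enough to get a login prompt.
    echo bundle > "$src/boot/type"
    printf '%s\n' mount-procfs mount-sysfs mount-devfs mount-filesystems \
        hostname udevadm agetty-tty1 > "$src/boot/contents"

    # Outer layer: everything in boot, plus the extras.
    echo bundle > "$src/default/type"
    printf '%s\n' boot sshd > "$src/default/contents"
}
```

Because default only names boot rather than repeating its members, a service added to boot later shows up in both runlevels after the next recompile.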

Trying it out
To make this setup bootable, you need to symlink the contents of to. You can use the following script:
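The exact source and destination are not spelled out above, so the following is only a sketch of the general shape: a function that symlinks every file from one folder into another, with an optional prefix so existing sysvinit binaries are not clobbered. The link_bins name, the prefix idea and the paths in the usage note are all assumptions.

```shell
# link_bins SRC DST [PREFIX]: symlink every file in SRC into DST,
# optionally prepending PREFIX to each link name.
link_bins() {
    src=$1 dst=$2 prefix=${3-}
    for f in "$src"/*; do
        [ -e "$f" ] || continue    # empty folder: the glob stays literal
        ln -sf -- "$f" "$dst/$prefix${f##*/}" || return
    done
}
```

A hypothetical invocation would be link_bins /etc/s6-linux-init/current/bin /sbin s6-, which would leave /sbin/init untouched and add prefixed links next to it.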

Now, you can put  on your kernel command line, and use / to perform power management on your s6 session. If anything goes wrong, go to the emergency tty we set up, or reboot with OpenRC.

Managing s6 within s6
Boot into s6. So far, we've learned what the database is and how to compile it, and hopefully we have a working one.

We're just getting started.

Bringing services up or down
Just in case you have made it this far and haven't read the documentation of s6-rc, here's a cheatsheet:


 * to bring a service or a bundle of services up.
 * to bring a service or a bundle of services down.

s6-rc can also take a -p option, which either means "Stop everything else and bring these up" (-u + -p) or "Stop these and the services that depend on them, and bring up everything else" (-d + -p).

Changing /etc/s6-rc/compiled in place
Simply re-linking /etc/s6-rc/compiled when we want to add services, as we did before booting into s6, will bring s6-rc to an inconsistent state where you can't bring services up or down without errors - if you think "waiting for session C2 of user X" on systemd was bad, you haven't accidentally overwritten /etc/s6-rc/compiled on an s6 install.

This doesn't mean we can't change service definitions without rebooting - there's a tool that switches s6-rc, in place and dynamically, from the current database to another one: s6-rc-update.

We first need to compile a database with a unique name (as we've been doing):

Now, tell s6-rc to use the new database:

Now that s6-rc is looking away, we can safely overwrite /etc/s6-rc/compiled.

You'll be doing this a lot, so it's recommended to make it a script. Let's say, with the following contents:

NAME=$(date +%s) &&
s6-rc-compile -- "/etc/s6-rc/$NAME" /etc/s6-rc/src &&
s6-rc-update -- "/etc/s6-rc/$NAME" &&
ln -sfT -- "$NAME" /etc/s6-rc/compiled

Readiness notification
Most non-trivial services take a certain amount of time before they're actually ready to perform their duty. This means that if the service manager is fast enough (and s6-rc is fast), dependent services might start before the "dependee" is actually ready. To account for this case, s6 implements a simple readiness notification mechanism: the daemon writes a newline to a pipe (on the file descriptor named in a file called notification-fd), and s6-supervise takes that to mean the daemon is ready to communicate with other processes and broadcasts this information to anyone who asks. s6-rc asks, and takes it into account when ordering services.

Most programs included in s6 have an option for this, and many other programs have options that, although not even intended for systems using s6, can work just as well for this purpose (such as DBus's --print-address= and Xorg's -displayfd). The latter case -- fitting in places where it wasn't even expected -- is a sign of a well thought out system, in my opinion.
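To make the protocol concrete, here is a toy daemon body, assuming file descriptor 3 was chosen for notification. The fake_daemon name and the sleep that stands in for slow initialization are purely illustrative.

```shell
# fake_daemon: pretend to initialize, then announce readiness on fd 3.
fake_daemon() {
    sleep 1            # stand-in for slow start-up work
    printf '\n' >&3    # the whole protocol: one newline on the chosen fd
    # ...a real daemon would now go on serving requests...
}
```

Outside of s6 you can watch it work with fake_daemon 3>/tmp/ready, which leaves a one-newline file behind. Under s6-supervise, fd 3 would instead be the write end of the notification pipe.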

For example, a definition of an instance with readiness notification, which will be relevant for the next section, might look like.

Note that the argument for option -d must be the same as the content of the notification-fd file, as it must for non-s6 programs that are either intentionally or accidentally compatible with s6's notifications.

Polling
Another example of how s6's notification protocol is simple yet extensible is the s6-notifyoncheck program. For programs that don't offer a command-line option usable with s6's readiness notification protocol, it implements a polling mechanism that feeds back into it.

Create a folder named inside the service folder and write a script named check, that only exits successfully if the service is up. If the program doesn't have a dedicated utility for pinging, using one to query information from it (think,  , etc.) with its output redirected to  should also work.

Once check is written, assuming you chose 3 as the notification-fd, replace

program

in run with

s6-notifyoncheck -d -3 3 program

Notice two options were given: -d (double-fork, so s6-notifyoncheck doesn't become a "zombie process") and -3 3 (the location of the notification pipe, here file descriptor 3).
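Putting the polling pieces together, a sketch of a complete definition might look as follows. The folder name data is where s6-notifyoncheck looks for check by default according to its documentation; polledd and polledctl are hypothetical programs standing in for your daemon and its query tool.

```shell
# make_polled_service DIR: create DIR/polledd, a longrun whose readiness
# is detected by polling.
make_polled_service() {
    svc=$1/polledd
    mkdir -p "$svc/data" || return
    echo longrun > "$svc/type"
    echo 3 > "$svc/notification-fd"

    # check must exit 0 only once the daemon actually answers.
    printf '%s\n' '#!/bin/sh' 'exec polledctl ping >/dev/null 2>&1' \
        > "$svc/data/check"
    chmod +x "$svc/data/check"

    # The fd given to -3 matches the notification-fd file above.
    printf '%s\n' '#!/bin/execlineb -P' \
        's6-notifyoncheck -d -3 3 polledd --foreground' > "$svc/run"
    chmod +x "$svc/run"
}
```

s6-notifyoncheck runs check repeatedly in the background and writes the newline on the daemon's behalf the first time it succeeds.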

Logging chain
Syslog is not natively supported under s6-rc. Instead, the preferred mechanism is sending the daemon's standard output and error to a second logger daemon, s6-log. Again, runit users will be familiar with this, as it requires a similar design with in place of. However, how it is set up is quite different.

Let's say you want to log a daemon called verbosed. Create two service definition folders.

The first one - named verbosed-srv, if you're following Artix's conventions - should be populated as normal, with run, type, notification-fd, etc., for the daemon itself. The only difference is that you should write  (shell) or   (execline) in  so error messages go to standard output.

A second folder - conventionally, verbosed-log - is then populated with another service, preferably s6-log writing to a unique location (conventionally, ), and with readiness notification.

Now, you can use s6-rc's pipeline mechanism. It can supervise entire equivalents of shell script pipelines, but the one pipeline in particular we want to supervise is verbosed-srv | verbosed-log.

The steps we take are:


 * Write verbosed-log to verbosed-srv/producer-for - so verbosed-srv's standard output gets connected to verbosed-log's standard input.
 * Write verbosed-srv to verbosed-log/consumer-for - this confirms the above on verbosed-log's side.
 * Write verbosed to verbosed-log/pipeline-name - the file that contains the name of the bundle with verbosed-srv | verbosed-log.

After recompiling, your database will contain a verbosed bundle, that will start the service and its logger. If this sounds like the kind of boilerplate you'd want to automate with a script, that's because it is.

#!/bin/sh
# Usage: pass the service name as the first argument.
SERVICE=${1:?Need service.}

mkdir -- "$SERVICE-srv" "$SERVICE-log" || exit

(
  cd -- "$SERVICE-srv" || exit
  touch run # write it
  echo longrun > type
  echo "$SERVICE-log" > producer-for
)
(
  cd -- "$SERVICE-log" || exit
  printf '%s\n' '#!/bin/execlineb -P' "s6-log -d 3 -- /var/log/$SERVICE" > run
  echo 3 > notification-fd
  echo longrun > type
  echo "$SERVICE-srv" > consumer-for
  echo "$SERVICE" > pipeline-name
)

Services without a dedicated logger will have their logs sent to, and, if  was given in the   step, to the console.

Replacing OpenRC
Preferably, do these outside of OpenRC + sysvinit, so you can shut down your system without the good (?) ol' Alt+PrintScreen+REISUB.

Removing OpenRC and Sysvinit
Deselect and. Gentoo packages usually won't have an explicit OpenRC dependency just because of the init script. The init scripts will be kept around after we switch, which will come in quite handy when you want to rewrite services for your new init. Eventually, should agree to remove OpenRC and sysvinit. If you're in a hurry, just unmerge both immediately and deal with the fallout (including updates trying to reinstall them) later.

Reemerging sys-apps/s6-linux-init
Re-emerge with. Rename your symlink to, or create it now if you haven't.

A note on rewriting init scripts
OpenRC, in most cases, is an absent parent which would rather not deal with children nagging them, so it usually avoids passing arguments that would make daemons be in the foreground, or even actively passes arguments that make programs go to the background.

Like runit (and to be fair, supervise-daemon), however, s6 considers processes to have failed when they exit, and tries to restart them in this case.

This means that, if you just blindly copy-paste the command line used by OpenRC, a conflict might happen: first, s6 spawns the program with OpenRC flags. Then the program spawned by s6 will spawn the actual service, and leave s6 in the dust. After that, s6 keeps trying to restart the first program, which will fail due to there already being a PID file, there already being something waiting for commands on the same location, etc.

For instance, if you make a "literal" translation of, you'll end up with a file that calls , which will go to the background and cause s6 to repeatedly start elogind. Just omitting the  option makes it work properly.

On the other hand, programs like, say,, by default have the behavior OpenRC expects, so you'll have to read the manual and look for a "do not fork", "run in foreground", "no detaching/backgrounding", "supervised", "debug", etc. option and apply it to your service definition. For reference, you can look at Artix Linux's s6 implementation or at runit init scripts for the service you're trying to port, which have the same restriction.

As a last resort, if the service is really obnoxious about not being watched by anything (let's say it's called obnoxiousd), you can write  to. s6-fghack is, well, a hackish program that will try to stay alive for as long as there are processes spawned by, giving s6-svscan and s6-supervise "a bone to chew", so to speak. However, stopping s6-fghack (which we've tricked s6 into thinking is the service) won't stop the actual service process, which was spawned by the obnoxiousd command and left without a trace.

Therefore, you'll then have to use whatever the author's intended way to stop the service is as the script (which is run after the service -- in this case, s6-fghack -- is stopped), be it , or.

WIP
More/better service examples (are they even a good thing, or is a separate execline guide warranted)?

Explain even more s6 concepts.