Runit

Runit is a daemontools-inspired process supervision suite that also provides a program suitable for running as process 1. It can be used as an alternative to sysvinit or systemd, either by itself or in conjunction with OpenRC. It can also be used as a helper for supervising OpenRC services.

Environment variables

 * SVDIR - Directory will search for the services specified as arguments.
 * SVWAIT - Time will wait for a service to reach its desired state before timing out or killing it with a   signal.

Files

 * - Directory will search for the services specified as arguments if SVDIR is empty or unset.
 * - File will execute when the machine boots.
 * - File will execute and supervise when  exits.
 * - File will execute when the machine shuts down.
 * - File will execute when receiving a   signal.
 * - Used by to decide whether it should initiate machine shutdown when receiving a   signal or not.
 * - Used by to decide whether it should halt or reboot the machine.
 * - Symbolic link to 's scan directory when using =sys-process/runit-2.1.2.
 * - Service directory repository when using >=sys-process/runit-2.1.2.
 * - 's scan directory when using OpenRC's runit integration feature.
 * - Symbolic link to 's scan directory when using <sys-process/runit-2.1.2.

OpenRC
See here.

Process supervision
For in-depth information about the process supervision aspects of runit, see daemontools-encore. A summary follows.

The program implementing the supervisor features in runit is, and just like daemontools' , it takes the (absolute or relative to the working directory) pathname of a service directory (or servicedir) as an argument. A runit service directory must contain at least an executable file named, and can contain an optional, regular file named , and an optional subdirectory or symbolic link to a directory named , all of which work like their daemontools counterparts. The service directory can also contain an optional, executable file named, that can be used to perform cleanup actions each time the supervised process stops, possibly depending on its exit status information. calls with two arguments: the first one is 's exit code, or -1 if  didn't exit normally, and the second one is the least significant byte of the exit status as determined by POSIX. For instance, the second argument is 0 if exited normally, and the signal number if  was terminated by a signal. If or  exit immediately,  waits 1 second before starting  or restarting, so that it does not loop too quickly. A supervised process will run in its parent's session; making it a session leader requires using the  program with a   option inside. If receives a   signal, it behaves as if an sv exit command naming the corresponding service directory had been used (see later).
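The two arguments described above can be inspected from within a cleanup script. The following is only a minimal sketch, not upstream's or Gentoo's code: describe_exit is a hypothetical helper name, and masking the second argument with 127 assumes a Linux-style wait status in which the high bit of the low byte flags a core dump.

```shell
#!/bin/sh
# Hypothetical helper for a cleanup script.
# $1: exit code of the supervised process, or -1 if it did not exit normally
# $2: least significant byte of its wait status
describe_exit() {
    code=$1
    status=$2
    if [ "$code" -eq -1 ]; then
        # Assumption: bits 0-6 of the low byte hold the signal number.
        echo "terminated by signal $((status & 127))"
    else
        echo "exited with code $code"
    fi
}

# A cleanup script would forward its own two arguments, e.g.:
#   describe_exit "$1" "$2"
```

Keeping the logic in a function makes it easy to try by hand before wiring it into a service directory.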

Just like daemontools', keeps control files in a subdirectory of the servicedir, named , and if it finds a symbolic link to a directory with that name,  will follow it and use the linked-to directory for its control files. Unlike daemontools, also keeps human-readable files in the  directory, named  and, containing status information about the supervised process. For further information please consult the man page.

The program allows supervising a set of processes running in parallel using a scan directory (or scandir), just like daemontools', so it will be the supervision tree's root. It also checks at least every 5 seconds the time of last modification, the inode, or the device, of the scandir, and performs a scan if any of them has changed, launching child processes for each new servicedir it finds, or old servicedir for which it finds its  process has exited, and sending a   signal to all  children for which their corresponding servicedir is no longer present. Unlike daemontools', accepts a second argument after the scan directory's pathname, that must be at least seven characters long, and works like daemontools 's last argument: it sets the number of characters of an automatically rotated log that  keeps in memory, and can be seen in the output of the  utility. The first 5 characters will remain as specified in the argument, the rest will right shift as new messages are sent to 's standard error. also writes a dot to the log every 15 minutes so that old messages expire. If a  option is passed as an argument,  makes its  children leaders of new sessions using the POSIX  call. For further information please consult the man page.

is the logger program provided by the runit package. It supports automatically rotated logging directories (or logdirs) in the same way daemontools' program does, but its user interface is quite different. Logging directory pathnames are supplied as arguments and don't need to start with a dot ('.') or slash ('/'). To prepend a timestamp in external TAI64N format to logged lines, must be invoked with a   option. A  option prepends a UTC timestamp of the form YYYY-MM-DD_HH:MM:SS.xxxxx, and a   option prepends a UTC timestamp of the form YYYY-MM-DDTHH:MM:SS.xxxxx. Other actions performed by on text lines read on its standard input can be specified in a file inside the logging directory, named. Empty lines in or lines that start with '#' are ignored; every other line specifies a single action. Actions are carried out sequentially in line order. Actions starting with s, n, !, + and - behave like their daemontools counterparts. Patterns in + and - actions have the same syntax as those from Bernstein daemontools', except that runit's also accepts a plus sign ('+') as a special character that matches the next character in the pattern one or more times, and that prepended timestamps are not considered for matching against the patterns. can be forced to perform a rotation if it receives a  signal, and rereads the  files in the logdirs (after closing and reopening all logs) if it receives a   signal. For the full description of 's functionality please consult the respective man page.
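As an illustration of such an action file, the sketch below writes one into a logging directory. It assumes the file is called config, as in upstream's logger documentation (the name is not shown in the text above). The helper name write_svlogd_config and the chosen size, count and pattern are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: writes an example action file into the logdir given as $1.
# Assumption: the logger reads its actions from a file named "config".
write_svlogd_config() {
    cat > "$1/config" <<'EOF'
# lines starting with '#' are ignored
# s: rotate once the current log reaches this many bytes
s1000000
# n: number of rotated logs to keep
n10
# - and +: deselect every line, then reselect lines beginning with "ERROR:"
-*
+ERROR:*
EOF
}
```

The actions are applied in file order, so the deselect-all line has to come before the line that selects again.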

is a chain loading program that can be used to modify a supervised process' execution state. It accepts a set of options that specify what to do; some of them work like daemontools', , , , and , and others are runit-specific. For example, chpst -n increments or decrements the nice value of the process (using POSIX ), chpst -/ changes the root directory before executing the next program in the chain (using Linux on Gentoo), and chpst -b newname executes the next program in the chain as if it was invoked with the name newname (i.e. performs  substitution). This is useful for programs that have different behaviours depending on the name they are invoked with. If itself is invoked with the names, , , ,  or , it behaves as those daemontools programs. For the full description of 's functionality please consult the respective man page.

is runit's program for controlling supervised processes and querying status information about them. It accepts a subcommand and a set of service directory pathnames as arguments. Unless a pathname starts with a dot ('.') or slash ('/'), it is assumed to be relative to the directory specified as the value of the SVDIR environment variable, or to if SVDIR is empty or unset. The subcommand tells what to do. The up, down, once and exit subcommands behave like daemontools' svc -u, svc -d, svc -o and svc -dx commands, respectively. The status subcommand is similar to daemontools', it displays whether the supervised process is running ('run') or not ('down'), or if its file is currently running ('finish'), whether it is transitioning to the desired state or already there ('want up' or 'want down'), its process ID (PID) if it is up (or 's PID if it is currently running), how long it has been in the current state, and whether its current up or down status matches the presence or absence of a  file in the servicedir ('normally up' or 'normally down'). It also shows if the supervised process is paused (because of a  signal) or has been sent a   signal and  is waiting for its effect. Other subcommands allow reliably sending signals to the supervised process. In particular, sv alarm can be used to send a  signal to a supervised  process to force it to perform a rotation, and sv hup can be used to send it a   signal to make it reread the logging directories'  files.
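The pathname rule described above can be expressed in a few lines of shell. This is a sketch of the rule as stated here, not of the program's actual code: resolve_servicedir is a hypothetical name, and the fallback directory is passed in as a second argument because the built-in default is not shown in the text.

```shell
#!/bin/sh
# $1: service argument as it would be given on the command line
# $2: fallback directory used when SVDIR is empty or unset (hypothetical parameter)
resolve_servicedir() {
    case $1 in
        # Names starting with '.' or '/' are used as given.
        .*|/*) echo "$1" ;;
        # Anything else is looked up under SVDIR, or under the fallback.
        *)     echo "${SVDIR:-$2}/$1" ;;
    esac
}
```

For example, with SVDIR set to /tmp/sv, the call resolve_servicedir foo /fallback prints /tmp/sv/foo, while resolve_servicedir ./foo /fallback prints ./foo unchanged.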

also accepts a set of subcommands resembling LSB init script actions:


 * The sv start, sv stop and sv shutdown commands behave like sv up, sv down and sv exit, respectively, except that they wait for their actions to be completed, and then print the process' status as if an sv status command had been used. The wait period's duration is the value of the SVWAIT environment variable (in seconds), or 7 seconds if SVWAIT is empty or unset. It can also be specified in a  option passed as argument to . The status line starts with 'ok:' if the supervised process reached the desired state during the wait period, and with 'timeout:' if it did not.
 * The sv force-stop and sv force-shutdown commands behave like sv stop and sv shutdown, respectively, except that if the supervised process didn't reach the desired state during the wait period, it will be sent a  signal as if an sv kill command had been used. The status line will start with 'kill:' in that case.
 * The sv reload command behaves like sv hup (i.e. sends a  signal to the supervised process), except that it prints the status line afterwards.
 * The sv try-restart command behaves like sv term followed by sv cont (i.e. sends a  signal and then a   signal), except that it waits for its actions to be completed, timing out after the wait period expires if they didn't, and prints the status line afterwards, just like sv start, sv stop and sv shutdown do.
 * The sv restart command behaves like sv term followed by sv cont followed by sv up, except that it waits for its actions to be completed, timing out after the wait period expires if they didn't, and prints the status line afterwards. That is, it behaves like sv try-restart, but with an extra sv up action.
 * The sv force-reload and sv force-restart commands behave like sv try-restart and sv restart, except that the supervised process will be sent a  signal if it didn't reach the desired state during the wait period, just like sv force-stop and sv force-shutdown do.

's LSB init script action-like subcommands consider that the effect of their actions is complete based on the state considers the supervised process to be in (as sv status would report it). This behaviour can be extended by including an executable file named in the service directory. Subcommands that include in their actions the equivalent of an sv up, sv term or sv kill command will make execute the  file. The supervised process is considered to be up if considers it to be up, and if 's exit code is 0. also supports a check subcommand, that performs no action on the supervised process, but makes execute the  file. If its exit code is 0, it will print a status line starting with 'ok:', otherwise, it will print a status line starting with 'timeout:' after the wait period expires.
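The interplay between the wait period and a check program can be modelled in plain shell. This is only a sketch of the behaviour described above, not the control program's implementation: wait_for is a hypothetical name, the one-second polling interval is an arbitrary choice, and the command passed as arguments stands in for the check file.

```shell
#!/bin/sh
# Runs the given command repeatedly until it succeeds or the wait period
# (SVWAIT seconds, 7 if empty or unset) has elapsed.
wait_for() {
    limit=${SVWAIT:-7}
    elapsed=0
    while [ "$elapsed" -lt "$limit" ]; do
        if "$@"; then
            echo "ok:"
            return 0
        fi
        sleep 1   # assumption: poll once per second
        elapsed=$((elapsed + 1))
    done
    echo "timeout:"
    return 1
}
```

Calling wait_for with a command that keeps failing prints 'timeout:' only after the whole wait period has passed, which mirrors the delay described for the check subcommand.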

For the full description of 's functionality please consult the respective man page.

Sample runit scan directory with and  files, as well as a symbolic link to a  directory elsewhere:

It is assumed is a program that ignores the   signal.

Resulting supervision tree when is run on this scandir as a background process in an interactive shell, assuming it is a subdirectory named in the working directory (i.e. launched with runsvdir scan &):

subdirectory contents:

Messages sent by the supervised processes to 's standard output when manually starting :

After enough seconds have elapsed:

Reliably sending a  signal to :

Reliably sending a  signal afterwards:

The signal doesn't have any effect yet because the supervised process is stopped. To resume it, a  signal is needed:

Since the process is supervised, after being killed executes, and then restarts the process by executing.

Messages sent by the supervised processes to 's standard output when manually stopping :

This shows that stopped  by killing it with a   signal (signal 15).

Manually starting using the 's LSB-like interface:

Manually stopping and  using the 's LSB-like interface:

This shows that could be stopped ('ok:') but  couldn't (because it ignores  ), so after the default 7 seconds wait period,  gives up ('timeout:'). Forcibly stopping using the 's LSB-like interface:

This shows that because didn't stop during the default 7 seconds wait,  sends it a   signal (signal 9), so it is now stopped:

From OpenRC
As of version 0.22, OpenRC provides a service script that can launch with -style logging, also named. On Gentoo, the scan directory will be. This script exists to support the OpenRC-runit integration feature, but can be used to just launch a runit supervision tree. Thus, it can be started when the machine boots by adding it to an OpenRC runlevel using :

Or it can also be started manually:

Because the service script calls using an absolute path, a symlink to the correct path must be created if using >=sys-process/runit-2.1.2:

Alternatively, OpenRC's service could be used to start the supervision tree when entering OpenRC's 'default' runlevel, by placing '.start' and '.stop' files in  (please read  for more details) that perform actions similar to those of the  service script:

The  signal makes  send a   signal to all its  children before exiting, which, in turn, makes them stop their supervised processes and exit. The  signal that  sends by default would just make  exit.

From sysvinit
Following upstream's suggestion, Gentoo's packaging of runit provides a symbolic link to , that allows  to be launched and supervised by sysvinit by adding a 'respawn' line for it in. Used in this way, the supervision tree becomes rooted in process 1, which cannot die without crashing the machine.

Gentoo users wanting to use in this way will need to manually edit, and then call :

This will make sysvinit launch and supervise when entering runlevels 1 to 5.

The logging chain
A supervision tree where all leaf processes have a logger can be arranged into what the author of s6 calls the logging chain, which he considers to be technically superior to the traditional syslog-based centralized approach.

Since processes in a supervision tree are created using the POSIX call, all of them will inherit 's standard input, output and error. A logging chain arrangement using runit is as follows:


 * Leaf processes should normally have a logger, so their standard output and error connect to their logger's standard input. Therefore, all their messages are collected and stored in dedicated, per-service logs by their logger. Some programs might need to be invoked with certain options passed as arguments to make them send messages to their standard error, and redirection of stderr to stdout (i.e. 2>&1 in a shell script) must be performed in the servicedir's file.
 * Leaf processes with a controlling terminal are an exception: their standard input, output and error connect to the terminal.
 * , the loggers, and leaf processes that exceptionally don't have a logger for some reason, inherit their standard input, output and error from, so their messages are sent wherever the ones from are.
 * Leaf processes that still unavoidably report their messages using have them collected and logged by a (possibly supervised) syslog server.
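The redirection mentioned in the first item can be tried without any supervisor. In the sketch below, noisy_daemon is a hypothetical stand-in for a program that writes to both streams, run_service mimics what a service's startup script would do, and sed plays the role of the logger at the other end of the pipe.

```shell
#!/bin/sh
# Hypothetical stand-in for a daemon that reports on both streams.
noisy_daemon() {
    echo "to stdout"
    echo "to stderr" >&2
}

# What a startup script would do: merge stderr into stdout, then start the program.
run_service() {
    exec 2>&1
    noisy_daemon
}

# The subshell keeps the redirection local; sed stands in for the logger.
( run_service ) | sed 's/^/logged: /'
```

Without the exec 2>&1 line, only the message written to standard output would travel through the pipe; the other one would go wherever the caller's standard error points.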

If runit is used as the init system, and was invoked with no second argument, its standard input, output and error will be redirected to. If was invoked with a second argument, -like logging is turned on and messages sent to 's standard error will go to the log and can be seen using.

Runit as the init system
The runit package provides a program capable of running as process 1, also called, and a helper program,. If detects it is running as process 1, it replaces itself with  using the POSIX  call. Therefore, to use as the system's init, a   parameter can be added to kernel's command line using the bootloader's available mechanisms (e.g. a  command in some 'Gentoo with runit' menu entry for GRUB2). It is possible to go back to sysvinit + OpenRC at any time by reverting the change.

When the machine starts booting (if an initramfs is being used, after it passes control to the 'main' init), executes the  file as a child process, in a foreground process group with  as the controlling terminal, and waits for it to finish. This file is usually a shell script, and is expected to perform all one-time initialization tasks needed to bring the machine to its stable, normal 'up and running' state. Gentoo's file is quite minimal: it only calls the  program to enter OpenRC's 'sysinit' runlevel, and then its 'boot' runlevel, emulating Gentoo's sysvinit  setup.

When exits,  then executes the  file as a child process, makes it a session leader with the POSIX  call, and supervises it: if  is killed by a signal or its exit code is 111, then  will restart it, after sending a   signal to every remaining process in its process group. Gentoo's file is upstream's suggested one with minimal modifications. It is a shell script that uses the builtin utility to replace itself with, so this creates a supervision tree rooted in process 1. The scan directory will be for =sys-process/runit-2.1.2. The environment will be empty, except for the PATH variable, set to a known value in the script. will use -like logging, and, for >=sys-process/runit-2.1.2, is also passed the  option.

Gentoo's packaging of runit expects for =sys-process/runit-2.1.2, to be a repository of service directories. Services that need to be started when the machine boots require a symbolic link in the scan directory to the corresponding servicedir in that repository. Gentoo only provides service directories for 6 parallel supervised processes (with their symlinks in the scan directory); this allows users to get to a text console login, like with Gentoo's sysvinit  setup. Service directories for anything else must be created by the administrator, either from scratch or taken from somewhere else (e.g. alternative ebuild repositories).

Runit doesn't directly support any runlevel-like concept, but if the machine contains a set of directories, each one with a scan directory structure, then it is possible to have a behaviour similar to 'changing runlevels' if the scan directory argument of is actually a symbolic link. The software package's author proposes creating a symbolic link to directory pointing to one of the aforementioned directories, which then becomes the current scan directory. Runit provides a program that can atomically modify this symlink, thereby changing the current scan directory, and 's next periodic rescan would take care of starting and killing the appropriate  processes. For further details on, please consult the respective man page. Gentoo's packaging of runit version 2.1.1 supports this model: 's scan directory argument is a symbolic link, that points to. The latter in turn is also a symlink that points to, but can be modified later using. Gentoo's packaging of more recent versions of runit does away with this runlevel-like setup.

If receives a   signal, and the file  exists and has the execute by owner permission set, it will kill  (first by sending it a   signal and waiting, then by sending it a   signal) and then execute the  file. This file is usually a shell script, and is expected to perform all tasks needed to shut the machine down. If is killed by a signal or its exit code is 100,  skips execution of  and executes. will also execute if  exits (with an exit code other than 111). If exits,  will send a   signal to all remaining processes, and then check if the file  exists, to decide what to do next. If the file exists and has the execute by owner permission set, it reboots the machine. In any other case, it will power off the machine, or halt it if it can't power it off.

Gentoo's file performs an sv shutdown for =sys-process/runit-2.1.2, on every servicedir of 's scan directory, and then calls the  program to enter OpenRC's 'shutdown' or 'reboot' runlevels, depending on whether a poweroff or reboot operation was requested to  via.

If receives a   signal (which is usually configured to happen when the key combination Ctrl+Alt+Del is pressed), and the file  exists and has the execute by owner permission set, it will execute it as a child process, and when it exits, behave as if it had received a   signal. Gentoo's prints a "System is going down in 14 seconds..." message using the  utility, makes sure file  exists and has the execute by owner permission set, waits 14 seconds and then exits. The result is that  will either halt or reboot the machine after 14 seconds, depending on.

All 's children run initially with their standard input, output and error redirected to.

Reboot and shutdown
The program can be used to shut the machine down when  is running as process 1. Unless is running as process 1, it accepts one argument, which can be either 0 or 6:
 * If it is 0, it will create the and  files if any of them does not exist, set the execute by owner permission for the former, unset it for the latter, and send a   signal to process 1.
 * If it is 6, it will create the and  files if any of them does not exist, set the execute by owner permission for both of them, and send a   signal to process 1.

Therefore, if process 1 is, then runit-init 0 will power off the machine, and runit-init 6 will reboot it.
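The flag-file handshake described in this section can be rehearsed safely in a scratch directory. The sketch below is an illustration under stated assumptions, not runit's code: the helper names are hypothetical, the two files are called stopit and reboot as in upstream's documentation (the names are not shown in the text above), and the step that signals process 1 is deliberately left out.

```shell
#!/bin/sh
# $1: directory holding the two flag files; $2: 0 (power off) or 6 (reboot)
request_shutdown() {
    case $2 in
        0|6) ;;
        *) return 1 ;;
    esac
    # Create both files if missing; the first one always gets the owner execute bit.
    touch "$1/stopit" "$1/reboot"
    chmod u+x "$1/stopit"
    if [ "$2" -eq 6 ]; then
        chmod u+x "$1/reboot"
    else
        chmod u-x "$1/reboot"
    fi
}

# Prints what process 1 would do at the end of shutdown, based on the second file.
final_action() {
    # -perm -100 matches when the owner execute bit is set.
    if [ -n "$(find "$1/reboot" -prune -perm -100 2>/dev/null)" ]; then
        echo reboot
    else
        echo poweroff
    fi
}
```

Running request_shutdown on a temporary directory with argument 6 and then final_action on the same directory prints reboot; with argument 0 it prints poweroff.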

This means that is not directly compatible with sysvinit's, , , , and  commands. However, many programs (e.g. desktop environments) expect to be able to call programs with those names during operation, so if such thing is needed, it is possible to use compatibility shell scripts:
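One possible shape for such a compatibility script is sketched below. It is an assumption-laden illustration rather than the scripts Gentoo or upstream ship: compat_command is a hypothetical name, the command names halt, poweroff and reboot are assumed, and mapping halt to a power-off is a simplification.

```shell
#!/bin/sh
# $1: the name the wrapper was invoked as.
# Prints the equivalent runit-init command, or fails for unknown names.
compat_command() {
    case $1 in
        halt|poweroff) echo "runit-init 0" ;;
        reboot)        echo "runit-init 6" ;;
        *)             return 1 ;;
    esac
}

# A wrapper installed under one of those names could end with:
#   exec $(compat_command "$(basename "$0")")
```

Printing the command instead of running it keeps the sketch harmless to try on a running machine.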

Runit and service management
Runit doesn't have service manager features, i.e. it does not provide mechanisms for specifying dependencies, service ordering constraints, etc. like OpenRC does using  functions in service scripts. If such things are needed, the wanted behaviour must be explicitly enforced in the code of files; the software package's author provides some tips on how to do that. Sometimes, just doing nothing might be enough: if simply exits with an error status when there is an unmet required condition, and, perhaps with help from a  file that analyzes the exit code, the state the machine was in before  was executed is restored, the supervisor would just keep restarting the service until, after some convergence period, all its required conditions are met. The author of nosh calls this "the thundering herd solution".
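The "fail and let the supervisor retry" approach can be sketched as a small guard. The function name run_when_ready and both of its arguments are hypothetical; in a real service script the guard would exit instead of returning, and the program would be started with exec so that it replaces the shell.

```shell
#!/bin/sh
# $1: a command (without arguments) that succeeds once the prerequisite is met
# remaining arguments: the program to start, with its own arguments
run_when_ready() {
    check=$1
    shift
    # Unmet condition: give up with an error status; the supervisor retries later.
    "$check" || return 1
    "$@"
}
```

For example, run_when_ready true echo started prints started, whereas run_when_ready false echo started prints nothing and returns a non-zero status.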

Nevertheless, OpenRC and runit do not interfere with each other, so it is possible to use OpenRC-managed services on a machine where the init system is runit. In particular, once the supervision tree rooted in process 1 is launched, it is still possible to manually start individual OpenRC services using, or even entering OpenRC's 'default' runlevel manually:

Services from OpenRC's 'default' runlevel could be started automatically on boot using the existing service, moving it to the 'boot' runlevel:

Alternatively, can be modified to add the corresponding  invocation:

Note however that OpenRC services will not be supervised by runit.

Runit can be used without OpenRC's service management, but this requires an alternative implementation of the functionality of its service scripts, especially those executed upon entering the 'sysinit', 'boot' and 'shutdown' runlevels, and replacing the Gentoo-provided and  files with custom ones, since they call the  program. It can be useful to study those from runit-based distributions (e.g. see Void Linux's ones in their void-runit package sources).

OpenRC's runit integration feature
Starting with version 0.22, OpenRC can launch supervised long-lived processes using the runit package as a helper. This is an alternative to 'classic' unsupervised long-lived processes launched using the program. It should be noted that service scripts that don't contain  and   functions implicitly use.

OpenRC services that want to use runit supervision need both a service script in and a runit service directory. The service script must contain a  variable assignment to turn the feature on, and must have a 'need' dependency on the  service in its   function, to make sure the  program is launched (see here). It can contain neither a  function, nor a   function (but their   and   variants are OK), nor a   function; OpenRC internally invokes  when the service script is called with a 'start', 'stop' or 'status' argument.

The runit service directory can be placed anywhere in the filesystem, and have any name, as long as the service script (or the service-specific configuration file in ) assigns the servicedir's absolute path to the runit_service variable. If runit_service is not assigned to, the runit servicedir must have the same name as the OpenRC service script, and will be searched in the >=sys-process/runit-2.1.2 service directory repository,. The scan directory when using this feature is, and OpenRC will create a symlink to the service directory when the service is started, and delete it when the service is stopped.

Sample setup for a hypothetical supervised test-daemon service, with and without a dedicated logger.

The service directories:

This launches program with effective user daemon and the maximum number of open file descriptors set to 5. This is the same as if performed a   call itself with   set to 5, provided that value does not exceed the corresponding hard limit. The program also periodically sends a message of the form "Logged message #N" to its standard error.
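The effect of the descriptor limit can be observed with the shell's ulimit builtin, which adjusts the same resource limit. This is a sketch for experimentation only, not the service's startup script: limit_in_subshell is a hypothetical name, and it relies on the shell supporting ulimit -n, as bash and dash do.

```shell
#!/bin/sh
# Lowers the open file descriptor limit to $1 inside a subshell and prints
# the limit in effect there; the calling shell's own limit is left untouched.
limit_in_subshell() {
    ( ulimit -n "$1" && ulimit -n )
}

limit_in_subshell 5
```

Because the limit is lowered inside a subshell, any program started from that subshell would inherit it, which is the same inheritance a chain loading program relies on.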

The redirection of 's standard error to standard output allows logging its messages using runit's. An automatically rotated logging directory named logdir will be used, and messages will have a UTC timestamp prepended to them.

Manually starting :

Make OpenRC's notion of the service's state catch up:

The resulting supervision tree so far:

Messages from the process with PID 2155 go to the logging directory:

Manually starting :

Make OpenRC's notion of the service's state catch up because of the service startup bug:

The scan directory:

Final supervision tree:

Since the process with PID 2250 doesn't have a dedicated logger, its messages go to 's standard error, which are logged -style and show up in ' output (for process 1931 in this case).

Unmerge
All scan directories, service directories, the symlink to, etc. must be manually deleted if no longer wanted after uninstalling the package. Also, all modifications to sysvinit's must be manually reverted: lines for  must be deleted, and a telinit q command must be used afterwards. And obviously, if runit is being used as the init system, an alternative one must be installed in parallel, and the machine rebooted to use it (possibly by reconfiguring the bootloader), before the package is uninstalled, or otherwise the machine will become unbootable.

External resources

 * Runit article on the Void Linux Wiki (a runit-based GNU/Linux distribution).
 * Runit article on the Arch Linux Wiki.
 * A thread about runit on the Gentoo Forums.
 * The flussence overlay, providing an alternative runit packaging, and accompanying runit-scripts repository.
 * The powerman overlay, providing an alternative runit packaging, service directory files for many services, and runit boot scripts.
 * Avery Payne's supervision-scripts project, compatible with runit.