Celebrating Daemontools
I basically always use some program in the daemontools family on my computers. My home laptop and desktop are booted with an init system (runit) based on daemontools, while many of the systems I set up elsewhere boot a vanilla distribution but immediately set up a daemontools service directory as a secondary service management tool. Quite frankly, it's one of the best examples of good Unix design and at this point I wouldn't want to go without it.
This is a high-level introduction to the idea of daemontools rather than a full tutorial: to learn how to set it up in practice, djb's own site as well as a handful[1] of others are better references.
What is Daemontools?
The core of daemontools is just two programs: svscan and supervise. They're very straightforward: svscan takes a single optional argument, and supervise takes a single mandatory one.
svscan watches a directory (if none is specified, then it will watch the current working directory) and checks to see if new directories have been added. Any time a new directory is added, it starts an instance of supervise pointing at that new directory[2].
And that's all that svscan does.
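As a rough mental model of that scan, here is a single pass written as a shell function. This is a toy sketch and not how svscan is actually implemented: the name `scan_once`, the bookkeeping file of names already seen, and the directory names are all made up for illustration, and the function only prints what it would start instead of launching supervise.

```shell
#!/bin/sh
# Toy model of one svscan pass (hypothetical helper, not real svscan code).
# scan_once DIR SEEN_FILE: report subdirectories of DIR not recorded yet.
scan_once() {
    for d in "$1"/*/; do
        [ -d "$d" ] || continue            # the glob matched nothing
        name=$(basename "$d")
        if ! grep -qxF -e "$name" "$2"; then
            echo "$name" >> "$2"           # remember it for the next pass
            echo "would start: supervise $name"
        fi
    done
}

root=$(mktemp -d)   # stands in for the watched directory
seen=$(mktemp)      # names handled on earlier passes

mkdir "$root/alpha"
first=$(scan_once "$root" "$seen")    # alpha is new

mkdir "$root/beta"
second=$(scan_once "$root" "$seen")   # only beta is new now

echo "$first"
echo "$second"
```

Running the pass in a loop with a pause between iterations would give the polling behaviour; the second call reports only the directory added since the first.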
supervise switches to the supplied directory and runs a script there called ./run. If ./run stops running for any reason, it will be started again (after a short pause, to avoid hammering the system.) It will also not start the ./run script if a file called ./down exists in the same directory. Extra data about the running process gets stored in a subdirectory called ./supervise, and a few other tools can be used to prod and modify that data: for example, to send certain signals to kill the running program, to temporarily stop it, or to see how long it has been running.
And that's almost all that supervise does.
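The restart behaviour can be modelled in a few lines of shell. This is only a sketch of the idea, not supervise's real implementation: `supervise_sketch` is a hypothetical name, the `MAX_RUNS` argument exists purely so the demonstration terminates, and none of the status bookkeeping in ./supervise is modelled.

```shell
#!/bin/sh
# Toy model of supervise's restart loop (not the real implementation).
# supervise_sketch DIR MAX_RUNS: MAX_RUNS is a hypothetical bound that
# exists only so this demonstration terminates.
supervise_sketch() (
    cd "$1" || exit 111
    [ -e ./down ] && exit 0       # ./down: do not start the service
    runs=0
    while [ "$runs" -lt "$2" ]; do
        ./run                     # blocks until the service exits
        runs=$((runs + 1))
        sleep 1                   # short pause before restarting
    done
)

svc=$(mktemp -d)
cat > "$svc/run" <<'EOF'
#!/bin/sh
# a "service" that exits immediately, forcing a restart
echo started >> ./starts
EOF
chmod +x "$svc/run"

supervise_sketch "$svc" 3
starts=$(grep -c started "$svc/starts")

touch "$svc/down"
supervise_sketch "$svc" 3         # returns at once, nothing is started
starts_after_down=$(grep -c started "$svc/starts")
echo "$starts $starts_after_down"
```

The function body runs in a subshell, so the `cd` does not leak into the caller, much as each real supervise process has its own working directory.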
One extra minor wrinkle is that if supervise is pointed at a directory that also contains a subdirectory called ./log, and ./log/run also exists, then it will monitor that executable as well and point the stdout of ./run to the stdin of ./log/run. This allows you to build a custom logging solution for your services if you'd like. The ./log directory is optional.
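To make the layout concrete, here is a sketch that builds such a service directory in a temporary location. The service name `demo` and the command `some-daemon --foreground` are placeholders I made up; `multilog` is the logger shipped with daemontools, used here in a common form (`t` prepends a timestamp to each line, `./main` is the log directory it writes to). The script only creates the files and does not start anything.

```shell
#!/bin/sh
# Sketch: build a service directory with an optional ./log companion.
# "demo" and "some-daemon" are hypothetical names for illustration.
set -eu

svc_root=$(mktemp -d)
svc="$svc_root/demo"
mkdir -p "$svc/log"

cat > "$svc/run" <<'EOF'
#!/bin/sh
exec 2>&1
exec some-daemon --foreground
EOF

cat > "$svc/log/run" <<'EOF'
#!/bin/sh
# reads the service's output on stdin and writes timestamped
# lines into the ./main log directory
exec multilog t ./main
EOF

chmod +x "$svc/run" "$svc/log/run"

# show the resulting layout
find "$svc" | sort
```

Any program that reads stdin would do as ./log/run; that is the whole contract between the service and its logger.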
So, how does this run a system? Well, you point svscan at a directory that contains a subdirectory for each service you want to run. Those services are generally small shell scripts that call the appropriate daemon in such a way that it will stay in the foreground. For example, a script to run sshd might look like:
#!/bin/sh
# redirecting stderr to stdout
exec 2>&1
# the -D option keeps sshd in the foreground
# and the -e option writes log information to stderr
exec /usr/sbin/sshd -D -e
And your directory structure might look like:
- service/
  |- ngetty/
  |  |- run
  |  |- log/
  |     |- run
  |- sshd/
  |  |- run
  |  |- log/
  |     |- run
  |- crond/
     |- run
     |- log/
        |- run
Once you point svscan at this, you end up having a process tree where svscan is managing multiple supervise instances which in turn manage their respective services and logging services:
-svscan-+-supervise-+-ngetty
        |           `-log-service
        +-supervise-+-sshd
        |           `-log-service
        +-supervise-+-crond
                    `-log-service
This design has some pretty amazing practical advantages, many of which are attributable to the fact that daemontools is written in terms of Unix idioms. The “Unix way” gets a fair amount of derision (some well-deserved, some not) but daemontools is a good example of how embracing the idioms of your system can produce better, more flexible software. Consider the following problems and their daemontools solutions:
Testing a Service Before You Start It
The ./run script is a plain executable. If it runs and stays in the foreground, doing what it should do, it's correct. If it doesn't, then there's a problem. That's also the only code path, which is a sharp contrast to the infamously difficult-to-write sysvinit scripts, where start and stop and status and so forth must all be tested in various system states[3].
Starting and Stopping a Service
All you do is create or delete a service directory. The most common way of doing this is to create the service directory elsewhere, and then create a symlink to it inside the directory that svscan watches. This lets you delete the symlink without deleting the main directory, and furthermore ensures that the 'creation' of the directory is atomic.
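The symlink dance looks roughly like this. The sketch uses two throwaway temporary directories as stand-ins for a definitions directory and the scanned directory, so the paths are assumptions for illustration and nothing is actually supervised.

```shell
#!/bin/sh
# Sketch with temp directories standing in for the real locations.
set -eu
defs=$(mktemp -d)      # hypothetical stand-in for e.g. /etc/sv
scanned=$(mktemp -d)   # hypothetical stand-in for e.g. /service

# prepare the complete service directory out of the scanner's sight
mkdir "$defs/sshd"
printf '#!/bin/sh\nexec 2>&1\nexec /usr/sbin/sshd -D -e\n' > "$defs/sshd/run"
chmod +x "$defs/sshd/run"

# "start": the finished directory appears in a single step
ln -s "$defs/sshd" "$scanned/sshd"
[ -x "$scanned/sshd/run" ] && echo "visible to the scanner"

# "stop": remove only the link; the definition itself survives
rm "$scanned/sshd"
```

One caveat: with djb's original daemontools, a supervise process that is already running keeps running after its link disappears until it is told to exit (for example with svc -dx), whereas runit's runsvdir sends the corresponding runsv a TERM signal when the directory goes away.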
Another tool, svc, lets you send signals to the running processes (e.g. svc -p sends a STOP signal, and svc -d sends a TERM signal and tells supervise not to restart the service afterwards).
Express Service Dependencies
The daemontools design allows for various helper tools. One of them is svok, which checks whether supervise is successfully running for a given service directory. This is just another Unix program that will exit with either 0 if it is, or 100 if it is not. That means we can write
#!/bin/sh
# braces, not parentheses: exit inside ( ... ) would only leave a subshell
svok postgres || { echo "waiting for postgres..."; exit 1; }
exec 2>&1
exec python2 some-web-app.py
and the script will die (prompting supervise to wait a moment and then restart it) unless postgres is already running.
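One shell detail matters for a guard of this shape: `exit` inside parentheses only terminates the subshell, so the script carries on to the next line, whereas `exit` inside a brace group terminates the script itself. The sketch below shows the difference, using `false` as a stand-in for a failing svok check; the messages are placeholders.

```shell
#!/bin/sh
# `false` stands in for a failing `svok` check in both variants.

# Parentheses: exit 1 ends only the subshell, so "started" is still printed.
out_subshell=$(sh -c 'false || (echo "waiting" && exit 1); echo "started"')

# Braces: exit 1 ends the whole script, so "started" is never reached.
out_group=$(sh -c 'false || { echo "waiting"; exit 1; }; echo "started"')
status_group=$?

echo "subshell variant printed: $out_subshell"
echo "brace variant printed: $out_group (exit status $status_group)"
```

Only the brace variant gives the die-and-get-restarted behaviour that the dependency check relies on.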
Express Resource Limits
daemontools has several other applications that can enforce various resource limits or permissions. These are not part of the service mechanism. Instead, they simply modify the current process and then exec some other command. That means that you can easily incorporate them into a service script:
#!/bin/sh
exec 2>&1
# change to the user 'sample', and then limit the stack segment
# to 2048 bytes, the number of open file descriptors to 3, and
# the number of processes to 1:
exec setuidgid sample \
  softlimit -s 2048 -o 3 -p 1 \
  some-small-daemon -n
These aren't actually special, and don't have anything to do with the daemontools service mechanism. Any shell script can incorporate setuidgid or softlimit, even if those scripts have nothing to do with service management!
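The same modify-then-exec pattern can be tried without daemontools installed, because a few standard utilities work the same way. In this sketch `env` adds a variable and `nice` lowers the scheduling priority before each hands control to the rest of the command line; `GREETING` is a made-up variable name, and neither tool is part of daemontools.

```shell
#!/bin/sh
# Chain loading with standard tools: each program adjusts the process
# state and then runs the remainder of its command line.
# GREETING is a hypothetical variable used only for this demonstration.
out=$(env GREETING=hello nice -n 5 sh -c 'echo "$GREETING"')
echo "$out"
```

Swapping `env` and `nice` for `setuidgid` and `softlimit` changes what gets adjusted, not how the chain is put together.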
Allow User-Level Services
If I want a given user to have their own services that are run as that user, all I need to do is have another svscan running as that user and pointing at another directory, which I can run as another top-level service:
#!/bin/sh
exec 2>&1
exec setuidgid user \
/usr/sbin/svscan /home/user/service
Variations
What I described above was vanilla daemontools. Other systems are designed for booting entire systems with this kind of service management. Variations on this basic design add various features:
- The runit package extends supervise with the ability to execute a ./finish script if the ./run script fails, to do various kinds of cleanup. (runit renames svscan and supervise to runsvdir and runsv, respectively.)
- The s6 package adds even more options to both core programs (which are here named s6-svscan and s6-supervise) to e.g. limit the maximum number of services or modify how often scanning is done. It additionally allows control of an s6-supervise instance through a directory of FIFOs called ./event.
- The daemontools-encore package adds even more optional scripts: a ./start script which is run before the main ./run script and a ./stop script after the service is disabled, a ./notify script which is invoked when the service changes, and a few others.
- The nosh package is designed as a drop-in replacement for systemd on platforms where systemd cannot run (i.e. any Unix that is not a modern Linux) and so has a lot of utilities that superficially emulate systemd as well as tools which can convert systemd units into nosh service directories. nosh is the most radically divergent of the bunch, but is clearly a daemontools descendant (and incorporates most of the changes from daemontools-encore, as well.)
Additionally, all these (except for daemontools-encore) have other capabilities used to set up a Unix system before starting the service-management portion. They also generally include other tools for running services (e.g. runit includes the swiss-army-knife chpst for modifying a process's state; s6 includes a plethora of other service helpers and tools for doing things like file-system locking or socket activation) while keeping the same guiding principles of daemontools intact.
The Takeaway
The whole daemontools family has two properties which I really appreciate:
- A strong commitment to never parsing anything.
- A strong commitment to using Unix as a raw material.
Why avoid parsing?
Parsing is a surprisingly difficult thing to get right. Techniques for writing parsers vary wildly in terms of how difficult they are, and parsing bugs are a common source of weird machines in computer security. Various techniques can make parsing easier and less bug-prone, but it's a dangerous thing to rely on.
One way to get around this is to just skip parsing altogether. This is difficult in Unix, where most tools consume and emit plain text (or plain binary.) In other systems, such as in individual programming environments or systems like Windows PowerShell, the everything-is-plain-text requirement is relaxed, allowing tools to exchange structured data without reserializing and reparsing.
The way to avoid parsing in Unix is to use various kinds of structure to your advantage. Take the file system: it can, used correctly, emulate a tree-like structure or a key-value store. For example, one supplementary daemontools utility is envdir, which reads in environment variables not by parsing a string of name=value pairs, but by looking at a directory and turning the filename-to-file-contents mapping into a variable-name-to-variable-content mapping.
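The directory-as-key-value-store idea is small enough to approximate in shell. The function below is a simplified sketch of what envdir does, not its actual implementation: `envdir_sketch` and the variable names are hypothetical, only the first line of each file is used, an empty file removes the variable, and details such as trailing-whitespace handling are left out.

```shell
#!/bin/sh
# Simplified model of envdir (hypothetical helper, not the real tool).
# envdir_sketch DIR CMD...: file name -> variable name,
# first line of the file -> value, empty file -> variable removed.
envdir_sketch() (
    dir=$1
    shift
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        name=${f##*/}
        if [ -s "$f" ]; then
            IFS= read -r value < "$f" || true   # tolerate a missing newline
            export "$name=$value"
        else
            unset "$name"
        fi
    done
    exec "$@"
)

conf=$(mktemp -d)
printf 'production\n' > "$conf/APP_MODE"
printf '8080\n' > "$conf/APP_PORT"
: > "$conf/APP_DEBUG"          # empty file: remove the variable

APP_DEBUG=1
export APP_DEBUG
result=$(envdir_sketch "$conf" \
    sh -c 'echo "$APP_MODE:$APP_PORT:${APP_DEBUG-unset}"')
echo "$result"
```

No value is ever split on `=` or unquoted here; the file system already keeps names and contents apart.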
You might argue that this is silly. After all, parsing an environment variable declaration is as easy as name=value! Could a system really introduce a security bug in parsing something as simple as that? As it happens, the answer is yes.
So daemontools avoids parsing by using directories as an organizing principle, rather than using configuration files.[4] This makes an entire class of bugs and vulnerabilities impossible, which is always a good design choice.
What is “Unix as a raw material”?
The building blocks of daemontools are the parts of Unix which are common to every modern Unix variant: directories and executables and Unix processes and (in some of its descendants) FIFOs. This means you have a universe of actions you can perform outside of the daemontools universe:
- Your scripts can be written in anything you'd like, not just a shell language. You could even drop a compiled executable in, at the cost of later maintainability.
- Similarly, daemontools services are trivially testable, because they're just plain ol' executables.
- Lots of details get moved out of service management because they can be expressed in terms of other building blocks of the system. There's no need for a 'which user do I run as' configuration flag, because that can get moved into a script. (Although that script can also consult an external configuration for that, if you'd like!)
- Your directories can be arranged in various ways, being split up or put back together however you'd like.[5]
In contrast, service management with upstart or systemd requires special configuration files and uses various other RPC mechanisms, which means that interacting with them requires using the existing tools and isn't really otherwise possible. Testing a service with upstart or systemd requires some kind of special testing tool in order to parse the service description and set up the environment it requests. Dependency management must be built in, and couldn't have been added in afterwards. The same goes for resource limits or process isolation, and so forth.
“Unix design” has sometimes been used to justify some very poor design choices. On the other hand, it's possible to embrace Unix design and still build elegant systems. A well-built Unix system has some aspects in common with a well-built functional program: small components with well-defined semantics and scope and a clear delineation of the side-effects of any given part, all of which can easily be used together or apart. The daemontools family of programs is a perfect example of Unix design done well.
1. This one is about runit, not daemontools, but they are similar enough in principle.
2. It does this not using inotify or some other mechanism, but rather just by waking up every five seconds and doing a quick traversal of everything in the directory. This is less efficient, but also makes fewer assumptions about the platform it's running on, which means daemontools can run just about anywhere.
3. Of course, your daemon might still rely on state, but that's the fault of your daemon, and no longer inherent in the service mechanism. Contrast this to sysvinit-style scripts, where the only possible API is a stateful one in which the script does different things depending on the process state.
4. One might argue that this is a little bit disingenuous: after all, you're still invoking shell scripts! If one part of your system avoids parsing, but then you call out to a piece of software as infamously complicated and buggy as bash, all that other security is for naught. But there's no reason that you have to write your scripts in bash, and in fact, the creator of s6 has built a new shell replacement for that purpose: namely, execline, which is designed around both security and performance concerns. If you wanted, you could replace all those shell scripts with something else, perhaps something more like the shill language. Luckily, the daemontools way is agnostic as to what it's executing, so it is easy to adopt these tools as well!
5. I personally tend to have a system-level /etc/sv for some services and a user-level /home/gdritter/sv for other services, regardless of whether those services are run in my user-level service tree in /home/gdritter/service or the root-level tree in /service.