Index | Archives | Atom Feed | RSS Feed

systemd Status Update

It has been a while since my last status update on systemd. Here's another short, incomprehensive status update on what we worked on for systemd since then.

  • Fedora F15 (Rawhide) now includes a split up /etc/init.d/rc.sysinit (Bill Nottingham). This allows us to keep only a minimal compatibility set of shell scripts around, and boot otherwise a system without any shell scripts at all. In fact, shell scripts during early boot are only used in exceptional cases, i.e. when you enabled autoswapping (bad idea anyway), when a full SELinux relabel is necessary, during the first boot after initialization, if you have static kernel modules to load (which are not configured via the systemd-native way to do that), if you boot from a read-only NFS server, or when you rely on LVM/RAID/Multipath. If nothing of this applies to you can easily disable these parts of early boot and save several seconds on boot. How to do this I will describe in a later blog story.
  • We have a fully C coded shutdown logic that kills all remaining processes, unmounts all remaining file systems, detaches all loop devices and DM volumes and does that in the right way to ensure that all these things are properly teared down even if they depend on each other in arbitrary ways. This is not only considerably faster then the traditional shell hackery for this, but also a lot safer, since we try to unmount/remount the remaining file systems with a little bit of brains. This feature is available via systemctl --force poweroff to the administrator. The --force controls whether the usual shutdown of all services is run or whether this is skipped and we immediately shall enter this final C shutdown logic. Using --force hence is a much safer replacement for the old /sbin/reboot -f and does not leave dirty file systems behind. (Thanks to Fabiano Fidencio has his colleagues from ProFUSION for this).
  • systemd now includes a minmalistic readahead implementation, based on fanotify(), fadvise() and mincore(). It supports btrfs defragmentation and both SSD and HDD disks. While the effect on boots that are anyway fast (such as most stuff involving SSD) is minimal, slower and older machines benefit from this more substantially.
  • We now control fsck and quota during early boot with a C tool that ensure maximum parallelization but properly implements the necessary high-level administration logic.
  • Every service, every user and every user session now gets its own cgroup in the 'cpu' hierarchy thus creating better fairness between the logged in users and their sessions.
  • We now provide /dev/log logging from early boot to late shutdown. If no syslog daemon is running the output is passed on to kmsg. As soon as a proper syslog daemon starts up the kmsg buffer is flushed to syslog, and hence we will have complete log coverage in syslog even for early boot.
  • systemctl kill was introduced, an easy command to send a signal to all processes of a service. Expect a blog story with more details about this shortly.
  • systemd gained the ability to load the SELinux policy if necessary, thus supporting non-initrd boots and initrd boots from the same binary with no duplicate work. This is in fact (and surprisingly) a first among Linux init systems.
  • We now initialize and set the system locale inside PID 1 to be inherited by all services and users.
  • systemd has native support for /etc/crypttab and can activate encrypted LUKS/dm-crypt disks both at boot-up and during runtime. A minimal password querying infrastructure is available, where multiple agents can be used to present the password to the user. During boot the password is queried either via Plymouth or directly on the console. If a system crypto disk is plugged in after boot you are queried for the password via a GNOME agent, or a wall(1) agent. Finally, while you run systemctl start (or a similar command) a minimal TTY password agent is available which asks you for passwords right-away if this is necessary. The password querying logic is very simple, additional agents can be implemented in a trivial amount of code (Yupp, KDE folks, you can add an agent for this, too). Note that the password querying logic in systemd is only for non-user passwords, i.e. passwords that have no relation to a specific user, but rather to specific hardware or system software. In future we hope to extend this so that this can be used to query the password of SSL certificates when Apache or other servers start.
  • We offer a minimal interface that external projects can use to extend the dependency graph systemd manages. In fact, the cryptsetup logic mentioned above is implemented via this 'plugin'-like system. Since we did not want to add code that deals with cryptographic disks into the systemd process itself we introduced this interface (after all cryptographic volumes are not an essential feature of a minimal OS, and unncessary on most embedded uses; also the future might bring us STC which might make this at least partially obsolete). Simply by dropping a generator binary into /lib/systemd/system-generators which should write out systemd unit files into a temporary directory third-party packages may extend the systemd dependency tree dynamically. This could be useful for example to automatically create a systemd service for each KVM machine or LXC container. With that in place those containers/machines could be managed and supervised with the same tools as the usual system services.
  • We integrated automatic clean-up of directories such as /tmp into the tmpfiles logic we already had in place that recreates files and directories on volatile file systems such as /var/run, /var/lock or /tmp.
  • We now always measure and write to the log files the system startup time we measured, broken up into how many time was spent on the kernel, the initrd and the initialization of userspace.
  • We now safely destroy all user session before going down. This is a feature long missing on Linux: since user processes were not killed until the very last moment the unhealthy situation that user code was running at a time where no other daemon was remaining was a normal part of shutdown.
  • systemd now understands an 'extreme' form of disabling a service: if you symlink a service name in /etc/systemd/system to /dev/null then systemd will mark it as masked and completely refuse starting it, regardless if this is requested manually or automaticallly. Normally it should be sufficient to simply call systemctl disable to disable a service which still allows manual activation but no automatic activation. Masking a service goes one step further.
  • There's now a simple condition syntax in places which allows skipping or enabling units depending on the existance of a file, whether a directory is empty or whether a kernel command line option is set.
  • In addition to normal shutdowns for reboot, halt or poweroff we now similarly support a kexec reboot, that reboots the machine without going though the BIOS code again.
  • We have bash completion support for systemctl. (Ran Benita)
  • Andrew Edmunds contributed basic support to boot Ubuntu with systemd.
  • Michael Biebl and Tollef Fog Heen have worked on the systemd integration into Debian to a level that it is now possible to boot a system without having the old initscripts packaged installed. For more details see the Debian Wiki. Michael even tested this integration on an Ubuntu Natty system and as it turns out this works almost equally well on Ubuntu already. If you are interesting in playing around with this, ping Michael.

And that's it for now. There's a lot of other stuff in the git commits, but most of it is smaller and I will it thus spare you.

We have come quite far in the last year. systemd is about a year old now, and we are now able to boot a system without legacy shell scripts remaining, something that appeared to be a task for the distant future.

All of this is available in systemd 13 and in F15/Rawhide as I type this. If you want to play around with this then consider installing Rawhide (it's fun!).


27C3 Fudfest

I really wonder why on earth the 27C3 accepted a nonsensical paper like this into their programme. So .. stupid. You read half the proposal and it's already kinda obvious that the presenter has no idea what he is talking of. Fundamental errors, obvious misinterpretations, outdated issues: this is just FUD.

And apparently this talk even is anonymous? Such a coward! FUDing around anonymously is acceptable at the CCC?


Linux Plumbers Conference/Gnome Summit Recap

Last week LPC and GS 2010 took place in Cambridge, MA. Like the last years, LPC showed again that -- at least for me -- it is one of the most relevant Linux conferences in existence, if not the single most relevant one.

Here's a terse, incomprehensive report of the different discussions I took part in with various folks at the conference, in no particular order:

The Boot and Init track led by Kay Sievers (Suse) was a great success. We had exciting talks which I think helped quite a bit in clearing a few things up, and hopefully helps us in consolidating the full Linux boot process among all the components involved. We had talks covering everything from the BIOS boot, to initrds, graphical boot splashes and systemd. Kay Sievers and I spoke about systemd, also covering the state of it in the Fedora and openSUSE distributions. Gustavo Barbieri (ProFUSION, Gentoo) and Michael Biebl (Debian) gave interesting talks about systemd adoption in their respective distributions. I was particularly interested in the various statistics Michael showed about SysV/LSB init script usage in Debian, because this gives an idea how much work we have in front of us in the long run. A longer discussion about the future of initrds and the logic necessary to find the root file system on boot was quite enlightening. I think this track was helpful to increase the unification and consolidation of the way Linux systems boot up and are maintained during runtime.

Kay and I and some other folks sat down with Arjan van de Ven (Intel), to talk about the prospects of systemd in Meego. The discussions were very positive. In particular Arjan hat some great suggestions regarding use of the Simple Boot Flag in systemd (expect this in one of the next versions) and readahead. Before systemd can find adoption in Meego we'd have to add a short number of features to systemd first, most of them should be easy to add.

Similarly, I sat down with Martin Pitt and James Hunt (both Canonical) and discussed systemd in relation to Ubuntu. I think we managed to clear a lot of things up, and have a good chance to improve cooperation between Ubuntu and systemd in relation to APIs and maybe even more.

We talked to Thomas Gleixner regarding userspace notifications when the wallclock time jumps relative to the monotonic clock. This is important to systemd so that we can schedule calendar jobs similar to cron, but without having to wake up periodically to check whether the wallclock time changed relatively to the monotonic clock so that we can recalculate the next point in time a calendar event is triggered. There has been previous work in this area in the kernel world, but nothing got merged. Thomas' suggestion how to add this facility should be much easier than anything proposed so far.

I also tried to talk Andreas Grünbacher into supporting file system user extended attributes in various virtual file systems such as procfs, cgroupfs, sysfs and tmpfs. I hope I convinced him that this would be a good idea, since this would allow setting externally accessible attributes to all kinds of kernel objects, such as processes and devices. This would not only have uses in systemd (where we could easily store all meta information systemd needs to know about a service in the cgroupfs via xattrs, so that systemd could even crash or go away at any time and we still can read all runtime information necessary beyond mere cgrouping from the file system when systemd comes to live again) but also in the desktop environments, so that we could for example attach the human readable application name, an icon or a desktop file to the processes currently running, in a simple way where the data we attach follows the lifecycle of the process itself.

The Audio track went really well, too. I was particularly excited about Pierre-Louis Bossart's (Intel) plans regarding AC3 (and other codecs) support in PulseAudio, and the simplicity of his approach. Also great was hearing about Laurent Pinchart's project to expose audio and video device routing to userspace. Finally, I really enjoyed David Henningsson's and Luke Yelavich's (both Canonical) talk regarding tracking down audio bugs on Ubuntu. I was really impressed by the elaborate tools they created to test audio drivers on users machines. Pretty cool stuff. Maybe this can be extended into a test suite for driver writers, because the current approach for driver writers (i.e. "If PulseAudio works correctly, your driver is correct") doesn't really scale (although I like the idea and take it as a compliment...). I also liked the timechart profiling results Pierre showed me that he generated for PulseAudio. Seems PulseAudio is behaving quite nicely these days.

Together with Harald Hoyer I got a demo of David Zeuthen's disk assembly daemon (stc), which makes RAID/MD/LVM assembly more dynamic. Great stuff, and I think we convinced him to leave actual mounting of file systems to systemd instead of doing it himself.

Harald and I also hashed out a few things to make integration between dracut and systemd nicer (i.e. passing along profiling information between the two, and information regarding the root fsck).

I also hope I convinced Ray Strode to make Plymouth actively listen to udev for notifications about DRM devices, so that further synchronization between udev and plymouth won't be necessary, which both makes things more robust and a little bit faster.

Kay and I talked to Greg Kroah-Hartman regarding the brokeness of VT_WAITEVENT in kernel TTY layer, and discussed what to do about this. After returning from the US Kay now did the necessary hacking work to provide a minimal sysfs based solution that allows userspace query to which TTYs /dev/console and /dev/tty0 currently point, and get notifications when this changes. This should allow us to greatly simplify ConsoleKit and make it possible to add console-triggered activation to systemd (think: getty gets started the moment you switch to its virtual terminal, not already at boot).

I also spent some time discussing the upcoming deadline scheduling kernel logic with Dario, Dhaval and Tommaso regarding its possible use in PulseAudio. I believe deadline schedule is a useful tool to hand out real-time scheduling to applications securely. As an easy path to supporting deadline scheduling in PulseAudio I suggested patching RealtimeKit to optionally use deadline scheduling for its clients. This would magically teach PA (and other clients) to use deadline scheduling without further patching in the clients.

At GNOME Summit I sat down with Ryan Lortie and Will Thompson to discuss the the future of the D-Bus session bus and how we can move to a machine/user bus instead in a nice way. We managed to come to a nice agreement here, and this should enable us to introduce systemd for session management soonishly. Now we only need to convince the other folks having stakes in D-Bus that what we discussed is actually a good idea, expect more about this soon on dbus-devel. Ryan and I also hashed out our remaining differences regarding the exact semantics of XDG_RUNTIME_DIR, the result of which you can already see on the XDG mailing list. Ryan already did the GLib work to introduce XDG_RUNTIME_DIR and systemd already supports this inofficially since a few versions.

I quite appreciate how Michael Meeks quoted me in his final keynote. ;-)

There was a lot of other stuff going on at the conference, and what I wrote above is in no way complete. And of course, besides all the technical stuff, it was great meeting all the good Linux folks again, especially my colleagues from Red Hat.

I am still amazed how systemd is received so positively and with open arms all across the board. It's particularly amazing that systemd at this point in time has already been adopted by various companies in the automotive and aviation industry.


Off to LPC 2010, Boston

Later this week the Linux Plumbers Conference 2010 will take place at the Hyatt Regency in Cambridge.

Together with Mark Brown I'll be running the conference track about Audio, and I believe we managed to put together quite a nice schedule with various interesting talks covering many areas of what Audio on Linux is about.

I'll also be around at the Boot and Init Systems track which Kay Sievers is running. Together with Kay I'll do a session about systemd, everybody's favourite system and session manager. We also managed to convince a number of distribution maintainers of systemd to do short presentations about the state of systemd adoption in their respective distributions: Michael Biebl from Debian, Gustavo Barbieri from Gentoo, Kay for openSUSE and yours truly for Fedora.

Because there never can be enough systemd coverage at a conference I'll do another talk about systemd, in Vincent Untz' Desktop track, this time focussing less on how to boot and maintain a system, but more on doing the same for desktop sessions, in particular GNOME.

I'll also stick around for the the first two days of the GNOME Boston Summit.

See you in Cambridge!


FOSS.in CFP Deadline Approaching!

I just submitted my paper[1] for FOSS.in 2010 in Bangalore/India. Don't forget to submit yours! The CFP closes on 10th of October. That's this Sunday! Hurry up, before it is too late!

FOSS.in is one of the most amazing Free Software conferences this world has to offer (hey, and I think I can say that because I have presented at quite a few). A dedicated audience, flawless organization, magic hospitality, and all this in incredible India! Both the technical programme and everything around it are impressive. Which other conference can offer you a concert of one of India's greatest acts as part of the schedule? Which other international conference host city can be such a positive attack on your senses as Bangalore (see that endless sea of flowers below)? And where else do they serve pure silver as part of the conference catering?

Bangalore Market

Read the CFP! Or, go straight to submitting a paper.

Footnotes

[1] About systemd.


systemd for Administrators, Part III

Here's the third installment of my ongoing series about systemd for administrators.

How Do I Convert A SysV Init Script Into A systemd Service File?

Traditionally, Unix and Linux services (daemons) are started via SysV init scripts. These are Bourne Shell scripts, usually residing in a directory such as /etc/rc.d/init.d/ which when called with one of a few standardized arguments (verbs) such as start, stop or restart controls, i.e. starts, stops or restarts the service in question. For starts this usually involves invoking the daemon binary, which then forks a background process (more precisely daemonizes). Shell scripts tend to be slow, needlessly hard to read, very verbose and fragile. Although they are immensly flexible (after all, they are just code) some things are very hard to do properly with shell scripts, such as ordering parallized execution, correctly supervising processes or just configuring execution contexts in all detail. systemd provides compatibility with these shell scripts, but due to the shortcomings pointed out it is recommended to install native systemd service files for all daemons installed. Also, in contrast to SysV init scripts which have to be adjusted to the distribution systemd service files are compatible with any kind of distribution running systemd (which become more and more these days...). What follows is a terse guide how to take a SysV init script and translate it into a native systemd service file. Ideally, upstream projects should ship and install systemd service files in their tarballs. If you have successfully converted a SysV script according to the guidelines it might hence be a good idea to submit the file as patch to upstream. How to prepare a patch like that will be discussed in a later installment, suffice to say at this point that the daemon(7) manual page shipping with systemd contains a lot of useful information regarding this.

So, let's jump right in. As an example we'll convert the init script of the ABRT daemon into a systemd service file. ABRT is a standard component of every Fedora install, and is an acronym for Automatic Bug Reporting Tool, which pretty much describes what it does, i.e. it is a service for collecting crash dumps. Its SysV script I have uploaded here.

The first step when converting such a script is to read it (surprise surprise!) and distill the useful information from the usually pretty long script. In almost all cases the script consists of mostly boilerplate code that is identical or at least very similar in all init scripts, and usually copied and pasted from one to the other. So, let's extract the interesting information from the script linked above:

  • A description string for the service is "Daemon to detect crashing apps". As it turns out, the header comments include a redundant number of description strings, some of them describing less the actual service but the init script to start it. systemd services include a description too, and it should describe the service and not the service file.
  • The LSB header[1] contains dependency information. systemd due to its design around socket-based activation usually needs no (or very little) manually configured dependencies. (For details regarding socket activation see the original announcement blog post.) In this case the dependency on $syslog (which encodes that abrtd requires a syslog daemon), is the only valuable information. While the header lists another dependency ($local_fs) this one is redundant with systemd as normal system services are always started with all local file systems available.
  • The LSB header suggests that this service should be started in runlevels 3 (multi-user) and 5 (graphical).
  • The daemon binary is /usr/sbin/abrtd

And that's already it. The entire remaining content of this 115-line shell script is simply boilerplate or otherwise redundant code: code that deals with synchronizing and serializing startup (i.e. the code regarding lock files) or that outputs status messages (i.e. the code calling echo), or simply parsing of the verbs (i.e. the big case block).

From the information extracted above we can now write our systemd service file:

[Unit]
Description=Daemon to detect crashing apps
After=syslog.target

[Service]
ExecStart=/usr/sbin/abrtd
Type=forking

[Install]
WantedBy=multi-user.target

A little explanation of the contents of this file: The [Unit] section contains generic information about the service. systemd not only manages system services, but also devices, mount points, timer, and other components of the system. The generic term for all these objects in systemd is a unit, and the [Unit] section encodes information about it that might be applicable not only to services but also in to the other unit types systemd maintains. In this case we set the following unit settings: we set the description string and configure that the daemon shall be started after Syslog[2], similar to what is encoded in the LSB header of the original init script. For this Syslog dependency we create a dependency of type After= on a systemd unit syslog.target. The latter is a special target unit in systemd and is the standardized name to pull in a syslog implementation. For more information about these standardized names see the systemd.special(7). Note that a dependency of type After= only encodes the suggested ordering, but does not actually cause syslog to be started when abrtd is -- and this is exactly what we want, since abrtd actually works fine even without syslog being around. However, if both are started (and usually they are) then the order in which they are is controlled with this dependency.

The next section is [Service] which encodes information about the service itself. It contains all those settings that apply only to services, and not the other kinds of units systemd maintains (mount points, devices, timers, ...). Two settings are used here: ExecStart= takes the path to the binary to execute when the service shall be started up. And with Type= we configure how the service notifies the init system that it finished starting up. Since traditional Unix daemons do this by returning to the parent process after having forked off and initialized the background daemon we set the type to forking here. That tells systemd to wait until the start-up binary returns and then consider the processes still running afterwards the daemon processes.

The final section is [Install]. It encodes information about how the suggested installation should look like, i.e. under which circumstances and by which triggers the service shall be started. In this case we simply say that this service shall be started when the multi-user.target unit is activated. This is a special unit (see above) that basically takes the role of the classic SysV Runlevel 3[3]. The setting WantedBy= has little effect on the daemon during runtime. It is only read by the systemctl enable command, which is the recommended way to enable a service in systemd. This command will simply ensure that our little service gets automatically activated as soon as multi-user.target is requested, which it is on all normal boots[4].

And that's it. Now we already have a minimal working systemd service file. To test it we copy it to /etc/systemd/system/abrtd.service and invoke systemctl daemon-reload. This will make systemd take notice of it, and now we can start the service with it: systemctl start abrtd.service. We can verify the status via systemctl status abrtd.service. And we can stop it again via systemctl stop abrtd.service. Finally, we can enable it, so that it is activated by default on future boots with systemctl enable abrtd.service.

The service file above, while sufficient and basically a 1:1 translation (feature- and otherwise) of the SysV init script still has room for improvement. Here it is a little bit updated:

[Unit]
Description=ABRT Automated Bug Reporting Tool
After=syslog.target

[Service]
Type=dbus
BusName=com.redhat.abrt
ExecStart=/usr/sbin/abrtd -d -s

[Install]
WantedBy=multi-user.target

So, what did we change? Two things: we improved the description string a bit. More importantly however, we changed the type of the service to dbus and configured the D-Bus bus name of the service. Why did we do this? As mentioned classic SysV services daemonize after startup, which usually involves double forking and detaching from any terminal. While this is useful and necessary when daemons are invoked via a script, this is unnecessary (and slow) as well as counterproductive when a proper process babysitter such as systemd is used. The reason for that is that the forked off daemon process usually has little relation to the original process started by systemd (after all the daemonizing scheme's whole idea is to remove this relation), and hence it is difficult for systemd to figure out after the fork is finished which process belonging to the service is actually the main process and which processes might just be auxiliary. But that information is crucial to implement advanced babysitting, i.e. supervising the process, automatic respawning on abnormal termination, collectig crash and exit code information and suchlike. In order to make it easier for systemd to figure out the main process of the daemon we changed the service type to dbus. The semantics of this service type are appropriate for all services that take a name on the D-Bus system bus as last step of their initialization[5]. ABRT is one of those. With this setting systemd will spawn the ABRT process, which will no longer fork (this is configured via the -d -s switches to the daemon), and systemd will consider the service fully started up as soon as com.redhat.abrt appears on the bus. This way the process spawned by systemd is the main process of the daemon, systemd has a reliable way to figure out when the daemon is fully started up and systemd can easily supervise it.

And that's all there is to it. We have a simple systemd service file now that encodes in 10 lines more information than the original SysV init script encoded in 115. And even now there's a lot of room left for further improvement utilizing more features systemd offers. For example, we could set Restart=restart-always to tell systemd to automatically restart this service when it dies. Or, we could use OOMScoreAdjust=-500 to ask the kernel to please leave this process around when the OOM killer wreaks havoc. Or, we could use CPUSchedulingPolicy=idle to ensure that abrtd processes crash dumps in background only, always allowing the kernel to give preference to whatever else might be running and needing CPU time.

For more information about the configuration options mentioned here, see the respective man pages systemd.unit(5), systemd.service(5), systemd.exec(5). Or, browse all of systemd's man pages.

Of course, not all SysV scripts are as easy to convert as this one. But gladly, as it turns out the vast majority actually are.

That's it for today, come back soon for the next installment in our series.

Footnotes

[1] The LSB header of init scripts is a convention of including meta data about the service in comment blocks at the top of SysV init scripts and is defined by the Linux Standard Base. This was intended to standardize init scripts between distributions. While most distributions have adopted this scheme, the handling of the headers varies greatly between the distributions, and in fact still makes it necessary to adjust init scripts for every distribution. As such the LSB spec never kept the promise it made.

[2] Strictly speaking, this dependency does not even have to be encoded here, as it is redundant in a system where the Syslog daemon is socket activatable. Modern syslog systems (for example rsyslog v5) have been patched upstream to be socket-activatable. If such a init system is used configuration of the After=syslog.target dependency is redundant and implicit. However, to maintain compatibility with syslog services that have not been updated we include this dependency here.

[3] At least how it used to be defined on Fedora.

[4] Note that in systemd the graphical bootup (graphical.target, taking the role of SysV runlevel 5) is an implicit superset of the console-only bootup (multi-user.target, i.e. like runlevel 3). That means hooking a service into the latter will also hook it into the former.

[5] Actually the majority of services of the default Fedora install now take a name on the bus after startup.


Xiph Video

Don't miss Monty's awesome video.


systemd for Administrators, Part II

Here's the second installment of my ongoing series about systemd for administrators.

Which Service Owns Which Processes?

On most Linux systems the number of processes that are running by default is substantial. Knowing which process does what and where it belongs to becomes increasingly difficult. Some services even maintain a couple of worker processes which clutter the "ps" output with many additional processes that are often not easy to recognize. This is further complicated if daemons spawn arbitrary 3rd-party processes, as Apache does with CGI processes, or cron does with user jobs.

A slight remedy for this is often the process inheritance tree, as shown by "ps xaf". However this is usually not reliable, as processes whose parents die get reparented to PID 1, and hence all information about inheritance gets lost. If a process "double forks" it hence loses its relationships to the processes that started it. (This actually is supposed to be a feature and is relied on for the traditional Unix daemonizing logic.) Furthermore processes can freely change their names with PR_SETNAME or by patching argv[0], thus making it harder to recognize them. In fact they can play hide-and-seek with the administrator pretty nicely this way.

In systemd we place every process that is spawned in a control group named after its service. Control groups (or cgroups) at their most basic are simply groups of processes that can be arranged in a hierarchy and labelled individually. When processes spawn other processes these children are automatically made members of the parents cgroup. Leaving a cgroup is not possible for unprivileged processes. Thus, cgroups can be used as an effective way to label processes after the service they belong to and be sure that the service cannot escape from the label, regardless how often it forks or renames itself. Furthermore this can be used to safely kill a service and all processes it created, again with no chance of escaping.

In today's installment I want to introduce you to two commands you may use to relate systemd services and processes. The first one, is the well known ps command which has been updated to show cgroup information along the other process details. And this is how it looks:

$ ps xawf -eo pid,user,cgroup,args
  PID USER     CGROUP                              COMMAND
    2 root     -                                   [kthreadd]
    3 root     -                                    \_ [ksoftirqd/0]
[...]
 4281 root     -                                    \_ [flush-8:0]
    1 root     name=systemd:/systemd-1             /sbin/init
  455 root     name=systemd:/systemd-1/sysinit.service /sbin/udevd -d
28188 root     name=systemd:/systemd-1/sysinit.service  \_ /sbin/udevd -d
28191 root     name=systemd:/systemd-1/sysinit.service  \_ /sbin/udevd -d
 1096 dbus     name=systemd:/systemd-1/dbus.service /bin/dbus-daemon --system --address=systemd: --nofork --systemd-activation
 1131 root     name=systemd:/systemd-1/auditd.service auditd
 1133 root     name=systemd:/systemd-1/auditd.service  \_ /sbin/audispd
 1135 root     name=systemd:/systemd-1/auditd.service      \_ /usr/sbin/sedispatch
 1171 root     name=systemd:/systemd-1/NetworkManager.service /usr/sbin/NetworkManager --no-daemon
 4028 root     name=systemd:/systemd-1/NetworkManager.service  \_ /sbin/dhclient -d -4 -sf /usr/libexec/nm-dhcp-client.action -pf /var/run/dhclient-wlan0.pid -lf /var/lib/dhclient/dhclient-7d32a784-ede9-4cf6-9ee3-60edc0bce5ff-wlan0.lease -
 1175 avahi    name=systemd:/systemd-1/avahi-daemon.service avahi-daemon: running [epsilon.local]
 1194 avahi    name=systemd:/systemd-1/avahi-daemon.service  \_ avahi-daemon: chroot helper
 1193 root     name=systemd:/systemd-1/rsyslog.service /sbin/rsyslogd -c 4
 1195 root     name=systemd:/systemd-1/cups.service cupsd -C /etc/cups/cupsd.conf
 1207 root     name=systemd:/systemd-1/mdmonitor.service mdadm --monitor --scan -f --pid-file=/var/run/mdadm/mdadm.pid
 1210 root     name=systemd:/systemd-1/irqbalance.service irqbalance
 1216 root     name=systemd:/systemd-1/dbus.service /usr/sbin/modem-manager
 1219 root     name=systemd:/systemd-1/dbus.service /usr/libexec/polkit-1/polkitd
 1242 root     name=systemd:/systemd-1/dbus.service /usr/sbin/wpa_supplicant -c /etc/wpa_supplicant/wpa_supplicant.conf -B -u -f /var/log/wpa_supplicant.log -P /var/run/wpa_supplicant.pid
 1249 68       name=systemd:/systemd-1/haldaemon.service hald
 1250 root     name=systemd:/systemd-1/haldaemon.service  \_ hald-runner
 1273 root     name=systemd:/systemd-1/haldaemon.service      \_ hald-addon-input: Listening on /dev/input/event3 /dev/input/event9 /dev/input/event1 /dev/input/event7 /dev/input/event2 /dev/input/event0 /dev/input/event8
 1275 root     name=systemd:/systemd-1/haldaemon.service      \_ /usr/libexec/hald-addon-rfkill-killswitch
 1284 root     name=systemd:/systemd-1/haldaemon.service      \_ /usr/libexec/hald-addon-leds
 1285 root     name=systemd:/systemd-1/haldaemon.service      \_ /usr/libexec/hald-addon-generic-backlight
 1287 68       name=systemd:/systemd-1/haldaemon.service      \_ /usr/libexec/hald-addon-acpi
 1317 root     name=systemd:/systemd-1/abrtd.service /usr/sbin/abrtd -d -s
 1332 root     name=systemd:/systemd-1/getty@.service/tty2 /sbin/mingetty tty2
 1339 root     name=systemd:/systemd-1/getty@.service/tty3 /sbin/mingetty tty3
 1342 root     name=systemd:/systemd-1/getty@.service/tty5 /sbin/mingetty tty5
 1343 root     name=systemd:/systemd-1/getty@.service/tty4 /sbin/mingetty tty4
 1344 root     name=systemd:/systemd-1/crond.service crond
 1346 root     name=systemd:/systemd-1/getty@.service/tty6 /sbin/mingetty tty6
 1362 root     name=systemd:/systemd-1/sshd.service /usr/sbin/sshd
 1376 root     name=systemd:/systemd-1/prefdm.service /usr/sbin/gdm-binary -nodaemon
 1391 root     name=systemd:/systemd-1/prefdm.service  \_ /usr/libexec/gdm-simple-slave --display-id /org/gnome/DisplayManager/Display1 --force-active-vt
 1394 root     name=systemd:/systemd-1/prefdm.service      \_ /usr/bin/Xorg :0 -nr -verbose -auth /var/run/gdm/auth-for-gdm-f2KUOh/database -nolisten tcp vt1
 1495 root     name=systemd:/user/lennart/1             \_ pam: gdm-password
 1521 lennart  name=systemd:/user/lennart/1                 \_ gnome-session
 1621 lennart  name=systemd:/user/lennart/1                     \_ metacity
 1635 lennart  name=systemd:/user/lennart/1                     \_ gnome-panel
 1638 lennart  name=systemd:/user/lennart/1                     \_ nautilus
 1640 lennart  name=systemd:/user/lennart/1                     \_ /usr/libexec/polkit-gnome-authentication-agent-1
 1641 lennart  name=systemd:/user/lennart/1                     \_ /usr/bin/seapplet
 1644 lennart  name=systemd:/user/lennart/1                     \_ gnome-volume-control-applet
 1646 lennart  name=systemd:/user/lennart/1                     \_ /usr/sbin/restorecond -u
 1652 lennart  name=systemd:/user/lennart/1                     \_ /usr/bin/devilspie
 1662 lennart  name=systemd:/user/lennart/1                     \_ nm-applet --sm-disable
 1664 lennart  name=systemd:/user/lennart/1                     \_ gnome-power-manager
 1665 lennart  name=systemd:/user/lennart/1                     \_ /usr/libexec/gdu-notification-daemon
 1670 lennart  name=systemd:/user/lennart/1                     \_ /usr/libexec/evolution/2.32/evolution-alarm-notify
 1672 lennart  name=systemd:/user/lennart/1                     \_ /usr/bin/python /usr/share/system-config-printer/applet.py
 1674 lennart  name=systemd:/user/lennart/1                     \_ /usr/lib64/deja-dup/deja-dup-monitor
 1675 lennart  name=systemd:/user/lennart/1                     \_ abrt-applet
 1677 lennart  name=systemd:/user/lennart/1                     \_ bluetooth-applet
 1678 lennart  name=systemd:/user/lennart/1                     \_ gpk-update-icon
 1408 root     name=systemd:/systemd-1/console-kit-daemon.service /usr/sbin/console-kit-daemon --no-daemon
 1419 gdm      name=systemd:/systemd-1/prefdm.service /usr/bin/dbus-launch --exit-with-session
 1453 root     name=systemd:/systemd-1/dbus.service /usr/libexec/upowerd
 1473 rtkit    name=systemd:/systemd-1/rtkit-daemon.service /usr/libexec/rtkit-daemon
 1496 root     name=systemd:/systemd-1/accounts-daemon.service /usr/libexec/accounts-daemon
 1499 root     name=systemd:/systemd-1/systemd-logger.service /lib/systemd/systemd-logger
 1511 lennart  name=systemd:/systemd-1/prefdm.service /usr/bin/gnome-keyring-daemon --daemonize --login
 1534 lennart  name=systemd:/user/lennart/1        dbus-launch --sh-syntax --exit-with-session
 1535 lennart  name=systemd:/user/lennart/1        /bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session
 1603 lennart  name=systemd:/user/lennart/1        /usr/libexec/gconfd-2
 1612 lennart  name=systemd:/user/lennart/1        /usr/libexec/gnome-settings-daemon
 1615 lennart  name=systemd:/user/lennart/1        /usr/libexec/gvfsd
 1626 lennart  name=systemd:/user/lennart/1        /usr/libexec//gvfs-fuse-daemon /home/lennart/.gvfs
 1634 lennart  name=systemd:/user/lennart/1        /usr/bin/pulseaudio --start --log-target=syslog
 1649 lennart  name=systemd:/user/lennart/1         \_ /usr/libexec/pulse/gconf-helper
 1645 lennart  name=systemd:/user/lennart/1        /usr/libexec/bonobo-activation-server --ac-activate --ior-output-fd=24
 1668 lennart  name=systemd:/user/lennart/1        /usr/libexec/im-settings-daemon
 1701 lennart  name=systemd:/user/lennart/1        /usr/libexec/gvfs-gdu-volume-monitor
 1707 lennart  name=systemd:/user/lennart/1        /usr/bin/gnote --panel-applet --oaf-activate-iid=OAFIID:GnoteApplet_Factory --oaf-ior-fd=22
 1725 lennart  name=systemd:/user/lennart/1        /usr/libexec/clock-applet
 1727 lennart  name=systemd:/user/lennart/1        /usr/libexec/wnck-applet
 1729 lennart  name=systemd:/user/lennart/1        /usr/libexec/notification-area-applet
 1733 root     name=systemd:/systemd-1/dbus.service /usr/libexec/udisks-daemon
 1747 root     name=systemd:/systemd-1/dbus.service  \_ udisks-daemon: polling /dev/sr0
 1759 lennart  name=systemd:/user/lennart/1        gnome-screensaver
 1780 lennart  name=systemd:/user/lennart/1        /usr/libexec/gvfsd-trash --spawner :1.9 /org/gtk/gvfs/exec_spaw/0
 1864 lennart  name=systemd:/user/lennart/1        /usr/libexec/gvfs-afc-volume-monitor
 1874 lennart  name=systemd:/user/lennart/1        /usr/libexec/gconf-im-settings-daemon
 1903 lennart  name=systemd:/user/lennart/1        /usr/libexec/gvfsd-burn --spawner :1.9 /org/gtk/gvfs/exec_spaw/1
 1909 lennart  name=systemd:/user/lennart/1        gnome-terminal
 1913 lennart  name=systemd:/user/lennart/1         \_ gnome-pty-helper
 1914 lennart  name=systemd:/user/lennart/1         \_ bash
29231 lennart  name=systemd:/user/lennart/1         |   \_ ssh tango
 2221 lennart  name=systemd:/user/lennart/1         \_ bash
 4193 lennart  name=systemd:/user/lennart/1         |   \_ ssh tango
 2461 lennart  name=systemd:/user/lennart/1         \_ bash
29219 lennart  name=systemd:/user/lennart/1         |   \_ emacs systemd-for-admins-1.txt
15113 lennart  name=systemd:/user/lennart/1         \_ bash
27251 lennart  name=systemd:/user/lennart/1             \_ empathy
29504 lennart  name=systemd:/user/lennart/1             \_ ps xawf -eo pid,user,cgroup,args
 1968 lennart  name=systemd:/user/lennart/1        ssh-agent
 1994 lennart  name=systemd:/user/lennart/1        gpg-agent --daemon --write-env-file
18679 lennart  name=systemd:/user/lennart/1        /bin/sh /usr/lib64/firefox-3.6/run-mozilla.sh /usr/lib64/firefox-3.6/firefox
18741 lennart  name=systemd:/user/lennart/1         \_ /usr/lib64/firefox-3.6/firefox
28900 lennart  name=systemd:/user/lennart/1             \_ /usr/lib64/nspluginwrapper/npviewer.bin --plugin /usr/lib64/mozilla/plugins/libflashplayer.so --connection /org/wrapper/NSPlugins/libflashplayer.so/18741-6
 4016 root     name=systemd:/systemd-1/sysinit.service /usr/sbin/bluetoothd --udev
 4094 smmsp    name=systemd:/systemd-1/sendmail.service sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
 4096 root     name=systemd:/systemd-1/sendmail.service sendmail: accepting connections
 4112 ntp      name=systemd:/systemd-1/ntpd.service /usr/sbin/ntpd -n -u ntp:ntp -g
27262 lennart  name=systemd:/user/lennart/1        /usr/libexec/mission-control-5
27265 lennart  name=systemd:/user/lennart/1        /usr/libexec/telepathy-haze
27268 lennart  name=systemd:/user/lennart/1        /usr/libexec/telepathy-logger
27270 lennart  name=systemd:/user/lennart/1        /usr/libexec/dconf-service
27280 lennart  name=systemd:/user/lennart/1        /usr/libexec/notification-daemon
27284 lennart  name=systemd:/user/lennart/1        /usr/libexec/telepathy-gabble
27285 lennart  name=systemd:/user/lennart/1        /usr/libexec/telepathy-salut
27297 lennart  name=systemd:/user/lennart/1        /usr/libexec/geoclue-yahoo

(Note that this output is shortened, I have removed most of the kernel threads here, since they are not relevant in the context of this blog story)

In the third column you see the cgroup systemd assigned to each process. You'll find that the udev processes are in the name=systemd:/systemd-1/sysinit.service cgroup, which is where systemd places all processes started by the sysinit.service service, which covers early boot.

My personal recommendation is to set the shell alias psc to the ps command line shown above:

alias psc='ps xawf -eo pid,user,cgroup,args'

With this service information of processes is just four keypresses away!

A different way to present the same information is the systemd-cgls tool we ship with systemd. It shows the cgroup hierarchy in a pretty tree. Its output looks like this:

$ systemd-cgls
+    2 [kthreadd]
[...]
+ 4281 [flush-8:0]
+ user
| \ lennart
|   \ 1
|     +  1495 pam: gdm-password
|     +  1521 gnome-session
|     +  1534 dbus-launch --sh-syntax --exit-with-session
|     +  1535 /bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session
|     +  1603 /usr/libexec/gconfd-2
|     +  1612 /usr/libexec/gnome-settings-daemon
|     +  1615 /ushr/libexec/gvfsd
|     +  1621 metacity
|     +  1626 /usr/libexec//gvfs-fuse-daemon /home/lennart/.gvfs
|     +  1634 /usr/bin/pulseaudio --start --log-target=syslog
|     +  1635 gnome-panel
|     +  1638 nautilus
|     +  1640 /usr/libexec/polkit-gnome-authentication-agent-1
|     +  1641 /usr/bin/seapplet
|     +  1644 gnome-volume-control-applet
|     +  1645 /usr/libexec/bonobo-activation-server --ac-activate --ior-output-fd=24
|     +  1646 /usr/sbin/restorecond -u
|     +  1649 /usr/libexec/pulse/gconf-helper
|     +  1652 /usr/bin/devilspie
|     +  1662 nm-applet --sm-disable
|     +  1664 gnome-power-manager
|     +  1665 /usr/libexec/gdu-notification-daemon
|     +  1668 /usr/libexec/im-settings-daemon
|     +  1670 /usr/libexec/evolution/2.32/evolution-alarm-notify
|     +  1672 /usr/bin/python /usr/share/system-config-printer/applet.py
|     +  1674 /usr/lib64/deja-dup/deja-dup-monitor
|     +  1675 abrt-applet
|     +  1677 bluetooth-applet
|     +  1678 gpk-update-icon
|     +  1701 /usr/libexec/gvfs-gdu-volume-monitor
|     +  1707 /usr/bin/gnote --panel-applet --oaf-activate-iid=OAFIID:GnoteApplet_Factory --oaf-ior-fd=22
|     +  1725 /usr/libexec/clock-applet
|     +  1727 /usr/libexec/wnck-applet
|     +  1729 /usr/libexec/notification-area-applet
|     +  1759 gnome-screensaver
|     +  1780 /usr/libexec/gvfsd-trash --spawner :1.9 /org/gtk/gvfs/exec_spaw/0
|     +  1864 /usr/libexec/gvfs-afc-volume-monitor
|     +  1874 /usr/libexec/gconf-im-settings-daemon
|     +  1882 /usr/libexec/gvfs-gphoto2-volume-monitor
|     +  1903 /usr/libexec/gvfsd-burn --spawner :1.9 /org/gtk/gvfs/exec_spaw/1
|     +  1909 gnome-terminal
|     +  1913 gnome-pty-helper
|     +  1914 bash
|     +  1968 ssh-agent
|     +  1994 gpg-agent --daemon --write-env-file
|     +  2221 bash
|     +  2461 bash
|     +  4193 ssh tango
|     + 15113 bash
|     + 18679 /bin/sh /usr/lib64/firefox-3.6/run-mozilla.sh /usr/lib64/firefox-3.6/firefox
|     + 18741 /usr/lib64/firefox-3.6/firefox
|     + 27251 empathy
|     + 27262 /usr/libexec/mission-control-5
|     + 27265 /usr/libexec/telepathy-haze
|     + 27268 /usr/libexec/telepathy-logger
|     + 27270 /usr/libexec/dconf-service
|     + 27280 /usr/libexec/notification-daemon
|     + 27284 /usr/libexec/telepathy-gabble
|     + 27285 /usr/libexec/telepathy-salut
|     + 27297 /usr/libexec/geoclue-yahoo
|     + 28900 /usr/lib64/nspluginwrapper/npviewer.bin --plugin /usr/lib64/mozilla/plugins/libflashplayer.so --connection /org/wrapper/NSPlugins/libflashplayer.so/18741-6
|     + 29219 emacs systemd-for-admins-1.txt
|     + 29231 ssh tango
|     \ 29519 systemd-cgls
\ systemd-1
  + 1 /sbin/init
  + ntpd.service
  | \ 4112 /usr/sbin/ntpd -n -u ntp:ntp -g
  + systemd-logger.service
  | \ 1499 /lib/systemd/systemd-logger
  + accounts-daemon.service
  | \ 1496 /usr/libexec/accounts-daemon
  + rtkit-daemon.service
  | \ 1473 /usr/libexec/rtkit-daemon
  + console-kit-daemon.service
  | \ 1408 /usr/sbin/console-kit-daemon --no-daemon
  + prefdm.service
  | + 1376 /usr/sbin/gdm-binary -nodaemon
  | + 1391 /usr/libexec/gdm-simple-slave --display-id /org/gnome/DisplayManager/Display1 --force-active-vt
  | + 1394 /usr/bin/Xorg :0 -nr -verbose -auth /var/run/gdm/auth-for-gdm-f2KUOh/database -nolisten tcp vt1
  | + 1419 /usr/bin/dbus-launch --exit-with-session
  | \ 1511 /usr/bin/gnome-keyring-daemon --daemonize --login
  + getty@.service
  | + tty6
  | | \ 1346 /sbin/mingetty tty6
  | + tty4
  | | \ 1343 /sbin/mingetty tty4
  | + tty5
  | | \ 1342 /sbin/mingetty tty5
  | + tty3
  | | \ 1339 /sbin/mingetty tty3
  | \ tty2
  |   \ 1332 /sbin/mingetty tty2
  + abrtd.service
  | \ 1317 /usr/sbin/abrtd -d -s
  + crond.service
  | \ 1344 crond
  + sshd.service
  | \ 1362 /usr/sbin/sshd
  + sendmail.service
  | + 4094 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
  | \ 4096 sendmail: accepting connections
  + haldaemon.service
  | + 1249 hald
  | + 1250 hald-runner
  | + 1273 hald-addon-input: Listening on /dev/input/event3 /dev/input/event9 /dev/input/event1 /dev/input/event7 /dev/input/event2 /dev/input/event0 /dev/input/event8
  | + 1275 /usr/libexec/hald-addon-rfkill-killswitch
  | + 1284 /usr/libexec/hald-addon-leds
  | + 1285 /usr/libexec/hald-addon-generic-backlight
  | \ 1287 /usr/libexec/hald-addon-acpi
  + irqbalance.service
  | \ 1210 irqbalance
  + avahi-daemon.service
  | + 1175 avahi-daemon: running [epsilon.local]
  + NetworkManager.service
  | + 1171 /usr/sbin/NetworkManager --no-daemon
  | \ 4028 /sbin/dhclient -d -4 -sf /usr/libexec/nm-dhcp-client.action -pf /var/run/dhclient-wlan0.pid -lf /var/lib/dhclient/dhclient-7d32a784-ede9-4cf6-9ee3-60edc0bce5ff-wlan0.lease -cf /var/run/nm-dhclient-wlan0.conf wlan0
  + rsyslog.service
  | \ 1193 /sbin/rsyslogd -c 4
  + mdmonitor.service
  | \ 1207 mdadm --monitor --scan -f --pid-file=/var/run/mdadm/mdadm.pid
  + cups.service
  | \ 1195 cupsd -C /etc/cups/cupsd.conf
  + auditd.service
  | + 1131 auditd
  | + 1133 /sbin/audispd
  | \ 1135 /usr/sbin/sedispatch
  + dbus.service
  | +  1096 /bin/dbus-daemon --system --address=systemd: --nofork --systemd-activation
  | +  1216 /usr/sbin/modem-manager
  | +  1219 /usr/libexec/polkit-1/polkitd
  | +  1242 /usr/sbin/wpa_supplicant -c /etc/wpa_supplicant/wpa_supplicant.conf -B -u -f /var/log/wpa_supplicant.log -P /var/run/wpa_supplicant.pid
  | +  1453 /usr/libexec/upowerd
  | +  1733 /usr/libexec/udisks-daemon
  | +  1747 udisks-daemon: polling /dev/sr0
  | \ 29509 /usr/libexec/packagekitd
  + dev-mqueue.mount
  + dev-hugepages.mount
  \ sysinit.service
    +   455 /sbin/udevd -d
    +  4016 /usr/sbin/bluetoothd --udev
    + 28188 /sbin/udevd -d
    \ 28191 /sbin/udevd -d

(This too is shortened, the same way)

As you can see, this command shows the processes by their cgroup and hence service, as systemd labels the cgroups after the services. For example, you can easily see that the auditing service auditd.service spawns three individual processes, auditd, audisp and sedispatch.

If you look closely you will notice that a number of processes have been assigned to the cgroup /user/1. At this point let's simply leave it at that systemd not only maintains services in cgroups, but user session processes as well. In a later installment we'll discuss in more detail what this about.

So much for now, come back soon for the next installment!


Video Interview With Yours Truly

Google just published a video interview with yours truly. Watch it! Oh, and Vincent, I even put on a red shirt for you!


systemd Status Update

It has been a while since my original announcement of systemd. Here's a little status update, on what happened since then. For simplicity's sake I'll just list here what we worked on in a bulleted list, with no particular order and without trying to cover this comprehensively:

  • systemd has been accepted as Feature for Fedora 14, and as it looks right now everything worked out nicely and we'll ship F14 with systemd as init system.
  • We added a number of additional unit types: .timer for cron-style timer-based activation of services, .swap exposes swap files and partitions the same way we handle mount points, and .path can be used to activate units dependending on the existance/creation of files or fill status of spool directories.
  • We hooked systemd up to SELinux: systemd is now capabale of properly labelling directories, sockets and FIFOs it creates according to the SELinux policy for the services we maintain.
  • We hooked systemd up to the Linux auditing subsystem: as first init system at all systemd now generates auditing records for all services it starts/stops, including their failure status.
  • We hooked systemd up to TCP wrappers, for all socket connections it accepts.
  • We hooked systemd up to PAM, so that optionally, when systemd runs a service as a different user it initializes the usual PAM session setup and teardown hooks.
  • We hooked systemd up to D-Bus, so that D-Bus passes activation requests to systemd and systemd becomes the central point for all kinds of activation, thus greatly extending the control of the execution environment of bus activated services, and making them accessible through the same utilities as SysV services. Also, this enables us to do race-free parallelized start-up for D-Bus services and their clients, thus speeding up things even further.
  • systemd is now able to handle various Debian and OpenSUSE-specific extensions to the classic SysV init script formats natively, on top of the Fedora extensions we already parse.
  • The D-Bus coverage of the systemd interface is now complete, allowing both introspection of runtime data and of parsed configuration data. It's fun now to introspect systemd with gdbus or d-feet.
  • We added a systemd PAM module, which assigns the processes of each user session to its own cgroup in the systemd cgroup tree. This also enables reliable killing of all processes associated with a session when the user logs out. This also manages a secure per-user /var/run-style directory which is supposed to be used for sockets and similar files that shall be cleaned up when the user logs out.
  • There's a new tool systemd-cgls, which plots a pretty process tree based on the systemd cgroup hierarchy. It's really pretty. Try it!
  • We now have our own cgroup hierarchy beneath /cgroup/systemd (though is will move to /sys/fs/ before the F14 release).
  • We have pretty code that automatically spawns a getty on a serial port when the kernel console is redirected to a serial TTY.
  • systemctl got beefed up substantially (it can even draw dependency graphs now, via dot!), and the SysV compatiblity tools were extended to more completely and correctly support what was historically provided by SysV. For example, we'll now warn the user when systemd service files have changed but systemd was not asked to reload its configuration. Also, you can now use systemd's native client tools to reboot or shut-down an Upstart or sysvinit system, to facilitate upgrades.
  • We provide a reference implementation for the socket activation and other APIs for nicer interaction with systemd.
  • We have a pretty complete set of documentation now, some of it even extending to areas not directly related to systemd itself.
  • Quite a number of upstream packages now ship with systemd service files out-of-the-box now, that work across all distributions that have adopted systemd. It is our intention to unify the boot and service management between distributions with systemd, and this shows fruits already. Furthermore a number of upstream packages now ship our patches for socket-based activation.
  • Even more options that control the process execution environment or the sockets we create are now supported.
  • Earlier today I began my series of blog stories on systemd for administrators.
  • We reimplemented almost all boot-up and shutdown scripts of the standard Fedora install in much smaller, simpler and faster C utilities, or in systemd itself. Most of this will not be enabled in F14 however, even though it is shipped with systemd upstream. With this enabled the entire Linux system gains a completely new feeling as the number of shells we spawn approaches zero, and the PID of the first user terminal is way < 500 now, and the early boot-up is fully parallelized. We looked at the boot scripts of Fedora, OpenSUSE and Debian and distilled from this a list of functionality that makes up the early boot process and reimplemented this in C, if possible following the bahaviour of one of the existing implementations from these three distributions. This turned out to be much less effort than anticipated, and we are actually quite excited about this. Look forward to the fruits of this work in F15, when we might be able to present you a shell-less boot at least for standard desktop/laptop systems.
  • We spent some time reinvestigating the current syslog logic, and came up with an elegant and simple scheme to provide /dev/log compatible logging right from the time systemd is first initialized right until the time the kernel halts the machine. Through the wonders of socket based activation we first connect the /dev/log socket with a minimal bridge to the kernel log buffer (kmsg) and then, as soon as the real syslog is started up as part of the later bootup phase, we dynamically replace this minimal bridge by the real syslog daemon -- without losing a single log message. Since one of the first things the real syslog daemon does is flushing the kernel log buffer into log files, all logged messages will sooner or later be stored on disk, regardless whether they have been generated during early boot, late boot or system runtime. On top of that if the syslog daemon terminates or is shut down during runtime, the bridge becomes active again and log output is written to kmsg again. The same applies when the system goes down. This provides a simple an robust way how we can ensure that no logs will ever be lost again, and logging is available from the beginning of boot-up to the end of shut-down. Plymouth will most likely adopt a similar scheme for initrd logging, thus ensuring that everything ever logged on the system will properly end up in the log files, whether it comes from the kernel, from the initrd, from early-boot, from runtime or shutdown. And if syslogd is not around, dmesg will provide you with access to the log messages. While this bridge is part of systemd upstream, we'll most likely enable this bridge in Fedora only starting with F15. Also note that embedded systems that have no interest in shipping a full syslogd solution can simply use this syslog bridge during the entire runtime, and thus making the kernel log buffer the centralized log storage, with all the advantages this offers: zero disk IO at runtime, access to serial and netconsole logging, and remote debug access to the kernel log buffer.
  • We now install autofs units for many "API" kernel virtual file systems by default, such as binfmt_misc or hugetlbfs. That means that the file system access is readily available, client code no longer has to manually load the respective kernel modules, as they are autoloaded on first access of the file system. This has many advantages: it is not only faster to set up during boot, but also simpler for applications, as they can just assume the functionality is available. On top of that permission problems for the initialization go away, since manual module loading requires root privileges.
  • Many smaller fixes and enhancements, all across the board, which if mentioned here would make this blog story another blog novel. Suffice to say, we did a lot of polishing to ready systemd for F14.

All in all, systemd is progressing nicely, and the features we have been working on in the last months are without exception features not existing in any other of the init systems available on Linux and our feature set already was far ahead of what the older init implementations provide. And we have quite a bit planned for the future. So, stay tuned!

Also note that I'll speak about systemd at LinuxKongress 2010 in Nuremberg, Germany. Later this year I'll also be speaking at the Linux Plumbers Conference in Boston, MA. Make sure to drop by if you want to learn about systemd or discuss exiciting new ideas or features with us.

© Lennart Poettering. Built using Pelican. Theme by Giulio Fidente on github. .