systemd Status Update

It has been a while since my original announcement of systemd. Here's a little status update on what happened since then. For simplicity's sake I'll just list here what we worked on in a bulleted list, in no particular order and without trying to cover this comprehensively:

  • systemd has been accepted as Feature for Fedora 14, and as it looks right now everything worked out nicely and we'll ship F14 with systemd as init system.
  • We added a number of additional unit types: .timer for cron-style timer-based activation of services, .swap exposes swap files and partitions the same way we handle mount points, and .path can be used to activate units depending on the existence/creation of files or fill status of spool directories.
  • We hooked systemd up to SELinux: systemd is now capable of properly labelling directories, sockets and FIFOs it creates according to the SELinux policy for the services we maintain.
  • We hooked systemd up to the Linux auditing subsystem: as the first init system ever, systemd now generates auditing records for all services it starts/stops, including their failure status.
  • We hooked systemd up to TCP wrappers, for all socket connections it accepts.
  • We hooked systemd up to PAM, so that optionally, when systemd runs a service as a different user it initializes the usual PAM session setup and teardown hooks.
  • We hooked systemd up to D-Bus, so that D-Bus passes activation requests to systemd and systemd becomes the central point for all kinds of activation, thus greatly extending the control of the execution environment of bus activated services, and making them accessible through the same utilities as SysV services. Also, this enables us to do race-free parallelized start-up for D-Bus services and their clients, thus speeding up things even further.
  • systemd is now able to handle various Debian and OpenSUSE-specific extensions to the classic SysV init script formats natively, on top of the Fedora extensions we already parse.
  • The D-Bus coverage of the systemd interface is now complete, allowing both introspection of runtime data and of parsed configuration data. It's fun now to introspect systemd with gdbus or d-feet.
  • We added a systemd PAM module, which assigns the processes of each user session to its own cgroup in the systemd cgroup tree. This also enables reliable killing of all processes associated with a session when the user logs out. This also manages a secure per-user /var/run-style directory which is supposed to be used for sockets and similar files that shall be cleaned up when the user logs out.
  • There's a new tool systemd-cgls, which plots a pretty process tree based on the systemd cgroup hierarchy. It's really pretty. Try it!
  • We now have our own cgroup hierarchy beneath /cgroup/systemd (though it will move to /sys/fs/ before the F14 release).
  • We have pretty code that automatically spawns a getty on a serial port when the kernel console is redirected to a serial TTY.
  • systemctl got beefed up substantially (it can even draw dependency graphs now, via dot!), and the SysV compatibility tools were extended to more completely and correctly support what was historically provided by SysV. For example, we'll now warn the user when systemd service files have changed but systemd was not asked to reload its configuration. Also, you can now use systemd's native client tools to reboot or shut down an Upstart or sysvinit system, to facilitate upgrades.
  • We provide a reference implementation for the socket activation and other APIs for nicer interaction with systemd.
  • We have a pretty complete set of documentation now, some of it even extending to areas not directly related to systemd itself.
  • Quite a number of upstream packages now ship with systemd service files out-of-the-box that work across all distributions that have adopted systemd. It is our intention to unify the boot and service management between distributions with systemd, and this is already bearing fruit. Furthermore a number of upstream packages now ship our patches for socket-based activation.
  • Even more options that control the process execution environment or the sockets we create are now supported.
  • Earlier today I began my series of blog stories on systemd for administrators.
  • We reimplemented almost all boot-up and shutdown scripts of the standard Fedora install in much smaller, simpler and faster C utilities, or in systemd itself. Most of this will not be enabled in F14 however, even though it is shipped with systemd upstream. With this enabled the entire Linux system gains a completely new feeling as the number of shells we spawn approaches zero, and the PID of the first user terminal is way < 500 now, and the early boot-up is fully parallelized. We looked at the boot scripts of Fedora, OpenSUSE and Debian and distilled from this a list of functionality that makes up the early boot process and reimplemented this in C, if possible following the behaviour of one of the existing implementations from these three distributions. This turned out to be much less effort than anticipated, and we are actually quite excited about this. Look forward to the fruits of this work in F15, when we might be able to present you a shell-less boot at least for standard desktop/laptop systems.
  • We spent some time reinvestigating the current syslog logic, and came up with an elegant and simple scheme to provide /dev/log compatible logging right from the time systemd is first initialized right until the time the kernel halts the machine. Through the wonders of socket based activation we first connect the /dev/log socket with a minimal bridge to the kernel log buffer (kmsg) and then, as soon as the real syslog is started up as part of the later bootup phase, we dynamically replace this minimal bridge by the real syslog daemon -- without losing a single log message. Since one of the first things the real syslog daemon does is flushing the kernel log buffer into log files, all logged messages will sooner or later be stored on disk, regardless of whether they have been generated during early boot, late boot or system runtime. On top of that if the syslog daemon terminates or is shut down during runtime, the bridge becomes active again and log output is written to kmsg again. The same applies when the system goes down. This provides a simple and robust way to ensure that no logs will ever be lost again, and that logging is available from the beginning of boot-up to the end of shut-down. Plymouth will most likely adopt a similar scheme for initrd logging, thus ensuring that everything ever logged on the system will properly end up in the log files, whether it comes from the kernel, from the initrd, from early-boot, from runtime or shutdown. And if syslogd is not around, dmesg will provide you with access to the log messages. While this bridge is part of systemd upstream, we'll most likely enable this bridge in Fedora only starting with F15.
Also note that embedded systems that have no interest in shipping a full syslogd solution can simply use this syslog bridge during the entire runtime, thus making the kernel log buffer the centralized log storage, with all the advantages this offers: zero disk IO at runtime, access to serial and netconsole logging, and remote debug access to the kernel log buffer.
  • We now install autofs units for many "API" kernel virtual file systems by default, such as binfmt_misc or hugetlbfs. That means that the file system access is readily available, client code no longer has to manually load the respective kernel modules, as they are autoloaded on first access of the file system. This has many advantages: it is not only faster to set up during boot, but also simpler for applications, as they can just assume the functionality is available. On top of that permission problems for the initialization go away, since manual module loading requires root privileges.
  • Many smaller fixes and enhancements, all across the board, which if mentioned here would make this blog story another blog novel. Suffice to say, we did a lot of polishing to ready systemd for F14.
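To give an idea of how small the socket activation API mentioned in the list above is, here is a minimal sketch in Python. It assumes the environment-variable protocol used by sd_listen_fds() in the reference implementation (LISTEN_PID and LISTEN_FDS, with passed descriptors starting at fd 3). The helper name listen_fds and its parameters are made up for illustration and are not part of systemd.

```python
import os

SD_LISTEN_FDS_START = 3  # first passed descriptor, as defined in sd-daemon.h


def listen_fds(environ=None, pid=None):
    """Return the file descriptors handed to this process by the init system.

    Hypothetical helper that mirrors the basic checks of sd_listen_fds():
    both variables must be set, and LISTEN_PID must name this very process.
    """
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    try:
        if int(environ["LISTEN_PID"]) != pid:
            return []  # the sockets were meant for some other process
        count = int(environ["LISTEN_FDS"])
    except (KeyError, ValueError):
        return []  # not socket activated at all
    if count < 0:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))
```

A daemon would typically wrap each returned descriptor with socket.socket(fileno=fd) and fall back to creating its own listening socket when the list is empty.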

All in all, systemd is progressing nicely. The features we have been working on in the last months are, without exception, features that exist in none of the other init systems available on Linux, and our feature set was already far ahead of what the older init implementations provide. And we have quite a bit planned for the future. So, stay tuned!

Also note that I'll speak about systemd at LinuxKongress 2010 in Nuremberg, Germany. Later this year I'll also be speaking at the Linux Plumbers Conference in Boston, MA. Make sure to drop by if you want to learn about systemd or discuss exciting new ideas or features with us.


systemd for Administrators, Part 1

As many of you know, systemd is the new Fedora init system, starting with F14, and it is also on its way to being adopted in a number of other distributions (for example, OpenSUSE). For administrators systemd provides a variety of new features and changes and enhances the administrative process substantially. This blog story is the first part of a series of articles I plan to post roughly every week for the next months. In every post I will try to explain one new feature of systemd. Many of these features are small and simple, so these stories should be interesting to a broader audience. However, from time to time we'll dive a little bit deeper into the great new features systemd provides you with.

Verifying Bootup

Traditionally, when booting up a Linux system, you see a lot of little messages passing by on your screen. As we work on speeding up and parallelizing the boot process these messages are becoming visible for a shorter and shorter time only and are less and less readable -- if they are shown at all, given we use graphical boot splash technology like Plymouth these days. Nonetheless the information of the boot screens was and still is very relevant, because it shows you for each service that is being started as part of bootup, whether it managed to start up successfully or failed (with those green or red [ OK ] or [ FAILED ] indicators). To improve the situation for machines that boot up fast and parallelized and to make this information more nicely available during runtime, we added a feature to systemd that tracks and remembers for each service whether it started up successfully, whether it exited with a non-zero exit code, whether it timed out, or whether it terminated abnormally (by segfaulting or similar), both during start-up and runtime. By simply typing systemctl in your shell you can query the state of all services, both systemd native and SysV/LSB services:

[root@lambda] ~# systemctl
UNIT                                          LOAD   ACTIVE       SUB          JOB             DESCRIPTION
dev-hugepages.automount                       loaded active       running                      Huge Pages File System Automount Point
dev-mqueue.automount                          loaded active       running                      POSIX Message Queue File System Automount Point
proc-sys-fs-binfmt_misc.automount             loaded active       waiting                      Arbitrary Executable File Formats File System Automount Point
sys-kernel-debug.automount                    loaded active       waiting                      Debug File System Automount Point
sys-kernel-security.automount                 loaded active       waiting                      Security File System Automount Point
sys-devices-pc...0000:02:00.0-net-eth0.device loaded active       plugged                      82573L Gigabit Ethernet Controller
[...]
sys-devices-virtual-tty-tty9.device           loaded active       plugged                      /sys/devices/virtual/tty/tty9
-.mount                                       loaded active       mounted                      /
boot.mount                                    loaded active       mounted                      /boot
dev-hugepages.mount                           loaded active       mounted                      Huge Pages File System
dev-mqueue.mount                              loaded active       mounted                      POSIX Message Queue File System
home.mount                                    loaded active       mounted                      /home
proc-sys-fs-binfmt_misc.mount                 loaded active       mounted                      Arbitrary Executable File Formats File System
abrtd.service                                 loaded active       running                      ABRT Automated Bug Reporting Tool
accounts-daemon.service                       loaded active       running                      Accounts Service
acpid.service                                 loaded active       running                      ACPI Event Daemon
atd.service                                   loaded active       running                      Execution Queue Daemon
auditd.service                                loaded active       running                      Security Auditing Service
avahi-daemon.service                          loaded active       running                      Avahi mDNS/DNS-SD Stack
bluetooth.service                             loaded active       running                      Bluetooth Manager
console-kit-daemon.service                    loaded active       running                      Console Manager
cpuspeed.service                              loaded active       exited                       LSB: processor frequency scaling support
crond.service                                 loaded active       running                      Command Scheduler
cups.service                                  loaded active       running                      CUPS Printing Service
dbus.service                                  loaded active       running                      D-Bus System Message Bus
getty@tty2.service                            loaded active       running                      Getty on tty2
getty@tty3.service                            loaded active       running                      Getty on tty3
getty@tty4.service                            loaded active       running                      Getty on tty4
getty@tty5.service                            loaded active       running                      Getty on tty5
getty@tty6.service                            loaded active       running                      Getty on tty6
haldaemon.service                             loaded active       running                      Hardware Manager
hdapsd@sda.service                            loaded active       running                      sda shock protection daemon
irqbalance.service                            loaded active       running                      LSB: start and stop irqbalance daemon
iscsi.service                                 loaded active       exited                       LSB: Starts and stops login and scanning of iSCSI devices.
iscsid.service                                loaded active       exited                       LSB: Starts and stops login iSCSI daemon.
livesys-late.service                          loaded active       exited                       LSB: Late init script for live image.
livesys.service                               loaded active       exited                       LSB: Init script for live image.
lvm2-monitor.service                          loaded active       exited                       LSB: Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling
mdmonitor.service                             loaded active       running                      LSB: Start and stop the MD software RAID monitor
modem-manager.service                         loaded active       running                      Modem Manager
netfs.service                                 loaded active       exited                       LSB: Mount and unmount network filesystems.
NetworkManager.service                        loaded active       running                      Network Manager
ntpd.service                                  loaded maintenance  maintenance                  Network Time Service
polkitd.service                               loaded active       running                      Policy Manager
prefdm.service                                loaded active       running                      Display Manager
rc-local.service                              loaded active       exited                       /etc/rc.local Compatibility
rpcbind.service                               loaded active       running                      RPC Portmapper Service
rsyslog.service                               loaded active       running                      System Logging Service
rtkit-daemon.service                          loaded active       running                      RealtimeKit Scheduling Policy Service
sendmail.service                              loaded active       running                      LSB: start and stop sendmail
sshd@172.31.0.53:22-172.31.0.4:36368.service  loaded active       running                      SSH Per-Connection Server
sysinit.service                               loaded active       running                      System Initialization
systemd-logger.service                        loaded active       running                      systemd Logging Daemon
udev-post.service                             loaded active       exited                       LSB: Moves the generated persistent udev rules to /etc/udev/rules.d
udisks.service                                loaded active       running                      Disk Manager
upowerd.service                               loaded active       running                      Power Manager
wpa_supplicant.service                        loaded active       running                      Wi-Fi Security Service
avahi-daemon.socket                           loaded active       listening                    Avahi mDNS/DNS-SD Stack Activation Socket
cups.socket                                   loaded active       listening                    CUPS Printing Service Sockets
dbus.socket                                   loaded active       running                      dbus.socket
rpcbind.socket                                loaded active       listening                    RPC Portmapper Socket
sshd.socket                                   loaded active       listening                    sshd.socket
systemd-initctl.socket                        loaded active       listening                    systemd /dev/initctl Compatibility Socket
systemd-logger.socket                         loaded active       running                      systemd Logging Socket
systemd-shutdownd.socket                      loaded active       listening                    systemd Delayed Shutdown Socket
dev-disk-by\x1...x1db22a\x1d870f1adf2732.swap loaded active       active                       /dev/disk/by-uuid/fd626ef7-34a4-4958-b22a-870f1adf2732
basic.target                                  loaded active       active                       Basic System
bluetooth.target                              loaded active       active                       Bluetooth
dbus.target                                   loaded active       active                       D-Bus
getty.target                                  loaded active       active                       Login Prompts
graphical.target                              loaded active       active                       Graphical Interface
local-fs.target                               loaded active       active                       Local File Systems
multi-user.target                             loaded active       active                       Multi-User
network.target                                loaded active       active                       Network
remote-fs.target                              loaded active       active                       Remote File Systems
sockets.target                                loaded active       active                       Sockets
swap.target                                   loaded active       active                       Swap
sysinit.target                                loaded active       active                       System Initialization

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.
JOB    = Pending job for the unit.

221 units listed. Pass --all to see inactive units, too.
[root@lambda] ~#

(I have shortened the output above a little, and removed a few lines not relevant for this blog post.)

Look at the ACTIVE column, which shows you the high-level state of a service (or in fact of any kind of unit systemd maintains, which can be more than just services, but we'll have a look at this in a later blog posting), whether it is active (i.e. running), inactive (i.e. not running) or in any other state. If you look closely you'll see one item in the list that is marked maintenance and highlighted in red. This informs you about a service that failed to run or otherwise encountered a problem. In this case this is ntpd. Now, let's find out what actually happened to ntpd, with the systemctl status command:

[root@lambda] ~# systemctl status ntpd.service
ntpd.service - Network Time Service
	  Loaded: loaded (/etc/systemd/system/ntpd.service)
	  Active: maintenance
	    Main: 953 (code=exited, status=255)
	  CGroup: name=systemd:/systemd-1/ntpd.service
[root@lambda] ~#

This shows us that NTP terminated during runtime (when it ran as PID 953), and tells us exactly the error condition: the process exited with an exit status of 255.

In a later systemd version, we plan to hook this up to ABRT, as soon as this enhancement request is fixed. Then, if systemctl status shows you information about a service that crashed it will direct you right away to the appropriate crash dump in ABRT.

Summary: use systemctl and systemctl status as modern, more complete replacements for the traditional boot-up status messages of SysV services. systemctl status not only captures in more detail the error condition but also shows runtime errors in addition to start-up errors.
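If you would rather not scan the listing by eye, the ACTIVE column is easy to filter. The following is a minimal sketch in Python, assuming the column layout shown in the listing above (UNIT, LOAD, ACTIVE, SUB, optional JOB, DESCRIPTION). The function name problem_units and the abbreviated SAMPLE text are made up for illustration, and the output format of other systemd versions may differ.

```python
def problem_units(listing):
    """Return (unit, active, sub) for each loaded unit that is not 'active'.

    Assumes the layout shown above: UNIT LOAD ACTIVE SUB [JOB] DESCRIPTION.
    """
    problems = []
    for line in listing.splitlines():
        fields = line.split()
        # Unit lines have at least four columns, a unit name with a type
        # suffix (".service", ".mount", ...) and "loaded" in the LOAD column.
        # This skips the header, the legend and the summary line.
        if len(fields) < 4 or "." not in fields[0] or fields[1] != "loaded":
            continue
        unit, _load, active, sub = fields[:4]
        if active != "active":
            problems.append((unit, active, sub))
    return problems


# A few lines taken from the listing above.
SAMPLE = """\
UNIT                     LOAD   ACTIVE       SUB          JOB  DESCRIPTION
NetworkManager.service   loaded active       running           Network Manager
ntpd.service             loaded maintenance  maintenance       Network Time Service
polkitd.service          loaded active       running           Policy Manager

221 units listed. Pass --all to see inactive units, too.
"""

print(problem_units(SAMPLE))
```

For the sample this prints a single entry for ntpd.service, which is the unit you would then inspect with systemctl status.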

That's it for this week, make sure to come back next week, for the next posting about systemd for administrators!


Dear Lazy Web,

does anybody know how to decode those Lenovo ThinkPad model IDs? I am interested in the T410s. For example, there's the model NUK3AGE, and there's NUHFXGE, and there's NUHYXGE. Some web sites claim NUK3AGE has Nvidia graphics, others claim VGA is Intel-only. Some web sites claim it has a touch screen, others say the contrary. The Lenovo web site isn't helpful to figure out the differences between the models and what the feature set of the various models really is. I figured out the GE suffix indicates a German keyboard, but what about the remaining code? Does anybody know how to decipher those IDs or know a reliable source explaining their feature set?

Love,

Lennart


Me too!

I too forgot to mention that my accommodation at GUADEC was sponsored by the GNOME Foundation. Thanks guys!


Dear Canonical,

Today I came across this blog post of your design team. In the context of the recent criticism you had to endure regarding upstream contributions I am disappointed that you have not bothered to ping anybody from the upstream freedesktop sound theme (for example yours truly) about this in advance. No, you went to cook your own soup. What really disappoints me is that we have asked multiple times for help and support and contributions for the sound theme, to only very little success, and I even asked some of the Canonical engineers about this topic and in particular regarding some clarifications of the licensing of the old Ubuntu sound theme. I am sorry, but if you had listened, or looked, or asked you would have been aware that we were looking for somebody to maintain this actively, upstream -- and because we didn't have the time to maintain this we only did the absolute minimum work necessary and we only maintain this ourselves because no one else wanted to.

It should be upstream first, downstream second.

I am sorry if I sound like an always complaining prick to you. But believe me, I am not saying this because I wouldn't like you or anything like that. I am just saying this because I believe you could do things oh so much better.

Please fix this. We want your contributions. Upstream.


Beating a Dead Horse

I guess it's a bit beating a dead horse, but I had a good laugh today when I learned that I alone contributed more to GNOME than the entirety of Canonical, and that only 800 additional commits separate me from being more awesome than Nokia.

/me is amused


Interview With Yours Truly

Here's a podcast interview with yours truly where I speak a little about PulseAudio and systemd. Seek to 64:43 for my lovely impetuous voice. There's also an interview with Owen just before mine.


Linux Plumbers Conference 2010 CFP Ending Soon!

The Call for Papers for the Linux Plumbers Conference (LPC) in November in Cambridge, Massachusetts is ending soon, on July 19th 2010 (that's the upcoming Monday!). It's a conference about the core infrastructure of Linux systems: the part of the system where userspace and the kernel interface. It's the only conference where the focus is specifically on getting together the kernel people who work on the userspace interfaces and the userspace people who have to deal with kernel interfaces. It's supposed to be a place where all the people doing infrastructure work sit down and talk, so that both parties understand better what the requirements and needs of the other are, and where we can work towards fixing the major problems we currently have with our lower-level infrastructure and APIs.

The two previous LPCs were hugely successful (as reported on LWN on various occasions), and this time we hope to repeat that.

Like the previous years, I will be running the Audio conference track of LPC, this time together with Mark Brown. Audio infrastructure on Linux has been steadily improving over the last years all over the place, but there's still a lot to do. Join us at the LPC to discuss the next steps and help improve Linux audio further! If you are doing audio infrastructure work on Linux, make sure to attend and submit a paper!

Sign up soon! Send in your paper quickly! Only three days left to the end of the CFP!

(I am also planning to do a presentation there about systemd, together with Kay. Make sure to attend if you are interested in that topic.)

See you in Boston!


Addendum on the Brokenness of File Locking

I forgot to mention another central problem in my blog story about file locking on Linux:

Different machines have access to different features of the same file system. Here's an example: let's say you have two machines in your home LAN. You want them to share their $HOME directory, so that you (or your family) can use either machine and have access to all your (or their) data. So you export /home on one machine via NFS and mount it from the other machine.

So far so good. But what happens to file locking now? Programs on the first machine see a fully-featured ext3 or ext4 file system, where all kinds of locking works (even though the API might suck as mentioned in the earlier blog story). But what about the other machine? If you set up lockd properly then POSIX locking will work on both. If you didn't, one machine can use POSIX locking properly, the other cannot. And it gets even worse: as mentioned, recent NFS implementations on Linux transparently convert client-side BSD locking into POSIX locking on the server side. Now, if the same application uses BSD locking on both the client and the server side from two instances they will end up with two orthogonal locks and although both sides think they have properly acquired a lock (and they actually did) they will overwrite each other's data, because those two locks are independent. (And one wonders why the NFS developers implemented this brokenness nonetheless...).

This basically means that locking cannot be used unless it is verified that everyone accessing a file system can make use of the same file system feature set. If you use file locking on a file system you should do so only if you are sufficiently sure that nobody using a broken or weird NFS implementation might want to access and lock those files as well. And practically that is impossible. Even if fpathconf() was improved so that it could inform the caller whether it can successfully apply a file lock to a file, this would still not give any hint if the same is true for everybody else accessing the file. But that is essential when speaking of advisory (i.e. cooperative) file locking.

And no, this isn't easy to fix. So again, the recommendation: forget about file locking on Linux, it's nothing more than a useless toy.

Also read Jeremy Allison's (Samba) take on POSIX file locking. It's an interesting read.


On the Brokenness of File Locking

It's amazing how far Linux has come without providing for proper file locking that works and is usable from userspace. A little overview why file locking is still in a very sad state:

To begin with, there's a plethora of APIs, and all of them are awful:

  • POSIX File locking as available with fcntl(F_SETLK): the POSIX locking API is the most portable one and in theory works across NFS. It can do byte-range locking. So much on the good side. On the bad side there's a lot more however: locks are bound to processes, not file descriptors. That means that this logic cannot be used in threaded environments unless combined with a process-local mutex. This is hard to get right, especially in libraries that do not know the environment they are run in, i.e. whether they are used in threaded environments or not. The worst part however is that POSIX locks are automatically released if a process calls close() on any (!) of its open file descriptors for that file. That means that when one part of a program locks a file and another by coincidence accesses it too for a short time, the first part's lock will be broken and it won't be notified about that. Modern software tends to load big frameworks (such as Gtk+ or Qt) into memory as well as arbitrary modules via mechanisms such as NSS, PAM, gvfs, GTK_MODULES, Apache modules, GStreamer modules where one module seldom can control what another module in the same process does or accesses. The effect of this is that POSIX locks are unusable in any non-trivial program where it cannot be ensured that a file that is locked is never accessed by any other part of the process at the same time. Example: a user managing daemon wants to write /etc/passwd and locks the file for that. At the same time in another thread (or from a stack frame further down) something calls getpwuid() which internally accesses /etc/passwd and causes the lock to be released, the first thread (or stack frame) not knowing that. Furthermore should two threads use the locking fcntl()s on the same file they will interfere with each other's locks and reset the locking ranges and flags of each other. On top of that locking cannot be used on any file that is publicly accessible (i.e. has the R bit set for groups/others, i.e. more access bits on than 0600), because that would otherwise effectively give arbitrary users a way to indefinitely block execution of any process (regardless of the UID it is running under) that wants to access and lock the file. This is generally not an acceptable security risk. Finally, while POSIX file locks are supposedly NFS-safe they are not always so in practice, as there are still many NFS implementations around where locking is not properly implemented, and NFS tends to be used in heterogeneous networks. The biggest problem about this is that there is no way to properly detect whether file locking works on a specific NFS mount (or any mount) or not.
  • The other API for POSIX file locks: lockf() is another API for the same mechanism and suffers from the same problems. One wonders why there are two APIs for the same messed up interface.
  • BSD locking based on flock(). The semantics of this kind of locking are much nicer than for POSIX locking: locks are bound to file descriptors, not processes. This kind of locking can hence be used safely between threads and can even be inherited across fork() and exec(). Locks are only automatically broken on the close() call for the one file descriptor they were created with (or the last duplicate of it). On the other hand this kind of locking does not offer byte-range locking and suffers from the same security problems as POSIX locking, and works in even fewer cases on NFS than POSIX locking (i.e. on BSD and Linux < 2.6.12 they were NOPs returning success). And since BSD locking is not as portable as POSIX locking this is sometimes an unsafe choice. Some OSes even find it funny to make flock() and fcntl(F_SETLK) control the same locks. Linux treats them independently -- except for the cases where it doesn't: on Linux NFS they are transparently converted to POSIX locks, too, now. What a chaos!
  • Mandatory locking is available too. It's based on the POSIX locking API but not portable in itself. It's dangerous business and should generally be avoided in cleanly written software.
  • Traditional lock file based file locking. This is how things were done traditionally, based around known atomicity guarantees of certain basic file system operations. It's a cumbersome thing, and requires polling of the file system to get notifications when a lock is released. Also, on Linux NFS < 2.6.5 it doesn't work properly, since O_EXCL isn't atomic there. And of course the client cannot really know what the server is running, so again this brokenness is not detectable.
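The close() problem of POSIX locks described in the first item is easy to reproduce. Here's a minimal sketch in Python, assuming a local Linux file system and Python's fcntl.lockf(), which is a wrapper around the fcntl() locking calls. The helper names lock_is_free and posix_lock_demo are made up for illustration.

```python
import fcntl
import os
import tempfile


def lock_is_free(path):
    """Fork a child and report whether it could take an exclusive POSIX lock."""
    pid = os.fork()
    if pid == 0:  # child: a different process, so the parent's lock conflicts
        status = 1
        try:
            with open(path, "r+") as f:
                fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
            status = 0
        except OSError:
            pass  # lock is held by somebody else
        finally:
            os._exit(status)  # never return into the parent's code path
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 0


def posix_lock_demo():
    """Return (held_before, held_after) around an unrelated open()/close()."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "r+") as locked:
            fcntl.lockf(locked, fcntl.LOCK_EX)
            held_before = not lock_is_free(path)
            # Some unrelated code in the same process opens and closes the file ...
            open(path).close()
            # ... and the lock taken through `locked` is silently gone.
            held_after = not lock_is_free(path)
        return held_before, held_after
    finally:
        os.unlink(path)
```

On a local Linux file system posix_lock_demo() is expected to return (True, False): the lock is held until another descriptor for the same file is closed, and the code holding `locked` never learns about the loss.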

The Disappointing Summary

File locking on Linux is just broken. The broken semantics of POSIX locking show that the designers of this API apparently never tried to actually use it in real software. It smells a lot like an interface that kernel people thought makes sense but in reality doesn't when you try to use it from userspace.

Here's a list of places where you shouldn't use file locking due to the problems shown above: If you want to lock a file in $HOME, forget about it as $HOME might be NFS and locks generally are not reliable there. The same applies to every other file system that might be shared across the network. If the file you want to lock is accessible to more than your own user (i.e. an access mode > 0700), forget about locking, it would allow others to block your application indefinitely. If your program is non-trivial or threaded or uses a framework such as Gtk+ or Qt or any of the module-based APIs such as NSS, PAM, ... forget about POSIX locking. If you care about portability, don't use file locking.

Or to turn this around, the only case where it is kind of safe to use file locking is in trivial applications where portability is not key and by using BSD locking on a file system where you can rely on it being local and on files inaccessible to others. Of course, that doesn't leave much, except for private files in /tmp for trivial user applications.
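For that one remaining case, a minimal sketch in Python, assuming Linux and a local temporary directory: a private file created with tempfile.mkstemp() (which creates it with mode 0600), locked with fcntl.flock(). The function name flock_survives_unrelated_close is made up for illustration.

```python
import fcntl
import os
import tempfile


def flock_survives_unrelated_close():
    """Return True if a BSD lock is still held after an unrelated close()."""
    fd, path = tempfile.mkstemp()  # private file, inaccessible to other users
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)
        # Unrelated code opening and closing the same file leaves the lock alone.
        os.close(os.open(path, os.O_RDONLY))
        # A second, independent descriptor stands in for a competing locker.
        other = os.open(path, os.O_RDWR)
        try:
            fcntl.flock(other, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return False  # the competitor got the lock, so ours was lost
        except OSError:
            return True  # still held through `fd`
        finally:
            os.close(other)
    finally:
        os.close(fd)  # closing the locking descriptor releases the lock
        os.unlink(path)
```

Unlike the POSIX variant, the lock here belongs to the descriptor it was taken with, so other parts of the process touching the file do not break it. The sketch says nothing about NFS or other operating systems, where the caveats above still apply.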

Or in one sentence: in its current state Linux file locking is unusable.

And that is a shame.

Update: Check out the follow-up story on this topic.

© Lennart Poettering. Built using Pelican. Theme by Giulio Fidente on github.