
process listing consistency

POSIX specifies that there is a ps utility to list processes, although it doesn’t describe how the command is implemented. In fact, it’s not possible to implement ps using only POSIX interfaces. However it’s implemented, it’s unlikely to use double buffering, which means on a sufficiently busy system, the results may be inconsistent. If lots of processes are being created and exited while ps runs, some of the output may be “before” and some “after”. Much like a game without vsync.

In order to test for inconsistency, we need to create lots of processes, but in a predictable way. Then we run ps over and over, looking for discrepancies. Enter the chicken and the egg.

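#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <signal.h>

int
main(int argc, char **argv)
{
        int fd[2];
        char ch;

        signal(SIGPIPE, SIG_IGN);

        if (pipe(fd) == -1) {
                printf("pipe failed!\n");
                exit(1);
        }
        switch (fork()) {
        case -1:
                printf("fork failed!\n");
                exit(1);
        case 0:
                /* child carries on below */
                break;
        default:
                /* parent exits at once, closing its copy of the write end */
                _exit(0);
        }
        /* drop our own write end, or the read would never see EOF */
        close(fd[1]);
        /* blocks until the parent is gone */
        read(fd[0], &ch, 1);
        /* done with the pipe; don't leak a descriptor into each generation */
        close(fd[0]);
        switch (argv[0][0]) {
        case 'e':
                execl("./chicken", "chicken", (char *)NULL);
                /* only reached if exec failed; fall through and try egg */
        case 'c':
        default:
                execl("./egg", "egg", (char *)NULL);
        }
        return 1;
}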

We compile this code and copy the binary to two files, chicken and egg, in the current directory. Then we run either one. (A word of caution: neither the chicken nor the egg is a particularly well-behaved guest. Don’t try this at work.)

Each process will create a pipe (pair of file descriptors) and fork a child. The parent immediately exits. The child closes the write side of the pipe. Then we try reading from the pipe. This will block until all the writers are closed, ensuring the parent has exited. Watching for a closed pipe is a good way for a group of processes to monitor each other when SIGCHLD won’t work, such as when the child wants notification of the parent’s exit.
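
The pipe trick is easier to see in isolation. Here is a minimal sketch, not taken from the program above: the helper names wait_for_writers_to_close and child_notices_parent_exit are made up for illustration, and the second pipe exists only so the caller can learn whether the child really saw EOF.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical helper: block until every write end of the pipe is closed.
 * Returns 0 on EOF (all writers gone), -1 on a read error.
 */
static int
wait_for_writers_to_close(int readfd)
{
        char ch;
        ssize_t n;

        assert(readfd >= 0);
        while ((n = read(readfd, &ch, 1)) > 0)
                continue;       /* discard stray bytes; only EOF matters */
        return n == 0 ? 0 : -1;
}

/*
 * Hypothetical demo: a short-lived "parent" forks a child and exits.
 * The child watches a pipe for EOF and reports back to the caller.
 * Returns 1 if the child noticed the exit, 0 if not, -1 on setup failure.
 */
int
child_notices_parent_exit(void)
{
        int report[2], watch[2];
        char ok = 0;
        ssize_t n;
        pid_t pid;

        if (pipe(report) == -1)
                return -1;
        if ((pid = fork()) == -1)
                return -1;
        if (pid == 0) {
                /* this process plays the parent whose exit is watched */
                close(report[0]);
                if (pipe(watch) == -1)
                        _exit(1);
                switch (fork()) {
                case -1:
                        _exit(1);
                case 0:
                        /* the child must drop its own copy of the write end */
                        close(watch[1]);
                        if (wait_for_writers_to_close(watch[0]) == 0 &&
                            write(report[1], "x", 1) != 1)
                                _exit(1);
                        _exit(0);
                default:
                        /* exiting closes the last remaining write end */
                        _exit(0);
                }
        }
        close(report[1]);
        waitpid(pid, NULL, 0);
        n = read(report[0], &ok, 1);
        close(report[0]);
        return n == 1 && ok == 'x';
}
```

Calling child_notices_parent_exit() should return 1. If the child forgets to close its own write end, the read never returns, which is why that close comes before the read.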

Satisfied that the parent has exited, the child grows up. Eggs turn into chickens, and chickens lay eggs. We might try running ps a few times to catch a glimpse of what’s happening. Of course, by the time we see the listing, the chicken has already forked off thousands of generations.

This seems like a lot of work to accomplish nothing, although it gives an opportunity to watch a changing system. At any given time, there may be an egg or a chicken, but not both. Immediately after the fork, there may be two eggs or two chickens, but at no time will there ever be a chicken and an egg. Also, for completeness, there will always be a chicken or an egg, never none.

So we run ps | egrep "chicken|egg" in a loop, watching for inconsistencies. If we ever see a process listing with both a chicken and an egg, or one with neither, that means ps is stitching together information. In the first case, ps read some info, saw a chicken, then read some more info, but in the meantime the chicken forked off an egg, which gets noticed in the second read.
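
Eyeballing the output gets old quickly, so the check can be automated. What follows is a rough sketch, not the tooling actually used here: the names classify_listing and check_once are invented, and it assumes that ps -A -o comm= prints one bare command name per line. Matching whole names also keeps the egrep process, whose arguments contain both words, from counting as either bird.

```c
#define _POSIX_C_SOURCE 200809L /* for popen() under strict compilers */
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum verdict { CONSISTENT, BOTH_SEEN, NEITHER_SEEN };

/*
 * Classify one listing given as newline-separated command names.
 * Only exact matches on "chicken" or "egg" count.
 */
enum verdict
classify_listing(const char *listing)
{
        int chickens = 0, eggs = 0;
        const char *line = listing;

        assert(listing != NULL);
        while (*line != '\0') {
                size_t len = strcspn(line, "\n");

                if (len == 7 && strncmp(line, "chicken", 7) == 0)
                        chickens++;
                else if (len == 3 && strncmp(line, "egg", 3) == 0)
                        eggs++;
                line += len;
                if (*line == '\n')
                        line++;
        }
        if (chickens > 0 && eggs > 0)
                return BOTH_SEEN;
        if (chickens == 0 && eggs == 0)
                return NEITHER_SEEN;
        return CONSISTENT;
}

/*
 * Take one snapshot with ps and classify it.
 * Returns a verdict, or -1 if ps could not be started.
 * Listings longer than the buffer are truncated (a sketch limitation).
 */
int
check_once(void)
{
        static char buf[65536];
        size_t n;
        FILE *fp;

        fp = popen("ps -A -o comm=", "r");
        if (fp == NULL)
                return -1;
        n = fread(buf, 1, sizeof(buf) - 1, fp);
        buf[n] = '\0';
        pclose(fp);
        return classify_listing(buf);
}
```

Calling check_once() in a loop and stopping on the first BOTH_SEEN or NEITHER_SEEN gives the same test as the shell pipeline, minus the squinting.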

Long ago, OpenBSD ps used the kvm interface to read kernel memory to create the process listing. Of course, there’s no guarantee that while ps is running, nothing changes. The system doesn’t stop working while ps is poking about. The kvm interface was replaced by sysctl, offering a few advantages. Userland is no longer coupled to kernel data structures, and the operation can be performed atomically. ps makes a single call to sysctl with KERN_PROC_ALL and the kernel returns info about every process on the system.

Internally, sysctl has a few tricks to keep it fast. The output memory region that will be written to is wired down to prevent page faults. Thus sysctl can iterate over the entire process list, copying out information to ps, without blocking. If we prevent processes from forking or exiting during this time, we get a consistent snapshot. The snapshot may be stale, but it will never show us a viewpoint that never happened.

Is there a way to trick ps on OpenBSD? Not everything is consistent. There’s a separate sysctl, KERN_PROC_ARGV, that reads the command line arguments for a process, but it only works on one process at a time. Processes can modify their own argv at any time. We can cook up a similar test case for this.

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <signal.h>

char *myname;
pid_t parent, child;
/* written from the signal handler, so use a signal-safe type */
volatile sig_atomic_t state;

void
catch(int signum)
{
        switch (state) {
        case 0:
                state = 1;
                memcpy(myname, "state 1", 8);
                break;
        case 1:
                state = 2;
                memcpy(myname, "state 2", 8);
                break;
        case 2:
                state = 0;
                memcpy(myname, "state 0", 8);
                break;
        }
        if (child)
                kill(child, SIGUSR1);
        else
                kill(parent, SIGUSR1);
}

int
main(int argc, char **argv)
{
        myname = argv[0];

        if (strlen(myname) < 7) {
                printf("need a bigger name\n");
                exit(1);
        }
        signal(SIGUSR1, catch);
        parent = getpid();
        switch ((child = fork())) {
        case -1:
                printf("fork failed!\n");
                exit(1);
        case 0:
                state = 1;
                break;
        }
        if (child) {
                printf("parent %d\nchild %d\n", (int)parent, (int)child);
                kill(child, SIGUSR1);
        }
        while (1)
                sleep(10);
        return 1;
}

Just two processes, a parent and child, this time. All we want to do is sleep, but instead we’re going to be spending a lot of time playing pass the hot signal. Every time a signal is received, we advance a little state machine. State 0, state 1, state 2, back to zero. Update our name. Then we signal the other process.

What are the invariants? The parent starts in state 0, the child in state 1. The child receives the first signal, advancing to state 2. Then it signals the parent, advancing it to state 1, before signalling back and advancing the child to state 0. At no point will the two processes ever be in the same state. The child will always be a step or two ahead of the parent.
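
The invariant is simple enough to check by replaying the handoffs without any processes or signals. This is only a model of the program above, with made-up names (advance, count_equal_states), and it assumes signals are delivered strictly in turn, child first.

```c
#include <assert.h>

/* Same transition as the signal handler: 0 -> 1 -> 2 -> 0. */
static int
advance(int state)
{
        return (state + 1) % 3;
}

/*
 * Replay the signal ping-pong for the given number of deliveries and
 * count how often both processes hold the same state.
 */
int
count_equal_states(int deliveries)
{
        int parent = 0, child = 1;      /* starting states from main() */
        int equal = 0, i;

        assert(deliveries >= 0);
        for (i = 0; i < deliveries; i++) {
                if (i % 2 == 0)
                        child = advance(child);   /* child gets the first signal */
                else
                        parent = advance(parent); /* then the parent, and so on */
                if (parent == child)
                        equal++;
        }
        return equal;
}
```

The pairs (parent, child) cycle through (0,2), (1,2), (1,0), (2,0), (2,1), (0,1), so count_equal_states returns 0 for any number of deliveries.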

When running ps, sometimes we’ll see both processes in the same state, because ps read argv from one process then the other, but signals were flying all the while.
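
A toy model shows why the split read matters. It assumes, purely for illustration, that the observer reads the parent's name first and the child's name some number of signal deliveries later; the function names and the gap parameter are inventions of this sketch, not anything ps exposes.

```c
#include <assert.h>

/* State of each process after k signal deliveries in total. */
static int
parent_state_after(int k)
{
        return (k / 2) % 3;             /* parent handles every second signal */
}

static int
child_state_after(int k)
{
        return (1 + (k + 1) / 2) % 3;   /* child starts at 1 and goes first */
}

/*
 * Take `snapshots` observations, one starting at each delivery count k.
 * The parent is read at k, the child `gap` deliveries later.
 * Returns how many observations show both in the same state.
 */
int
count_matching_reads(int snapshots, int gap)
{
        int k, matches = 0;

        assert(snapshots >= 0 && gap >= 0);
        for (k = 0; k < snapshots; k++) {
                int parent = parent_state_after(k);
                int child = child_state_after(k + gap);

                if (parent == child)
                        matches++;
        }
        return matches;
}
```

With a gap of 0 the read is atomic and count_matching_reads(100, 0) is 0. Let two deliveries slip in between the reads and count_matching_reads(100, 2) comes back as 50; with three, every one of the 100 observations shows matching states.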

Consistency is overrated anyway.

Posted 06 Oct 2016 12:26 by tedu Updated: 06 Oct 2016 12:26
Tagged: c openbsd programming