
the stack overflow that wasn't

There was a recent bug in OpenBSD install kernels. At random times during the install, messages like the following would appear:

/upgrade: //install.sub[168]: sleep: Cannot allocate memory
/upgrade: //install.sub[168]: cat: Cannot allocate memory

This is pretty unusual. sleep and cat are not usually memory intensive. Clearly, something had changed. There were a few initial suspects but they had been pretty well tested. What was different?

The bsd.rd kernel has a few differences. It’s single processor; it’s compiled with the special small kernel option; it contains a ramdisk filesystem; all the binaries in the ramdisk are really hardlinked copies of one binary.

Backing out recent changes didn’t make the bug go away. Unfortunately, the ramdisk kernel isn’t tested quite as frequently as regular kernels. Most kernel changes are only tested by running a kernel with the relevant changes. So this meant going back quite a bit farther in time. To make things more interesting, the errors were intermittent. The installer actually runs a loop of sleep .1 in the background, which is where most of the crashes were coming from. Experienced users, however, could easily run through the installer fast enough not to observe the problem. The test procedure therefore consisted of booting bsd.rd and leaving it sitting idly until sufficient time had passed to either observe a crash or not.

The stackgap increase quickly became a suspect. Why would it only affect the ramdisk but not normal systems? First, an experiment. The stackgap was not increased on i386 systems. Check. Nope, no bug. Next test. Increase the stackgap to 16MB. Boom. init fails to exec every time.

The problem is that the gap still counts against a process’s stack usage. It’s not really a gap, but rather an amount of stack that comes pre-used. On ramdisks, the stack limit is also 2MB. With some random chance, even the tiniest of processes like sleep and cat would fail to run because they ran out of stack. Technically, the errors above come from ksh, when execve fails because there’s no room left in the new process to copyout argv. Sometimes other errors like segfaults would occur as a program overflowed its stack through no fault of its own. The bug was never observed on multiuser systems because the default stacksize in login.conf is 4MB.
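To make the arithmetic concrete, here is a rough sketch of the failure mode, not the kernel's actual code. It assumes the gap is a page-aligned value drawn uniformly from 0 up to 2MB (which is what "also 2MB" suggests), a 4096-byte page, and a made-up 64KB of real stack need for a tiny process. The function names are invented for illustration.

```python
PAGE = 4096          # assumed page size
MB = 1024 * 1024


def exec_fits(gap, needed, limit):
    """True if the pre-used gap plus the real stack need fits under the limit."""
    return gap + needed <= limit


def failure_fraction(gap_max, needed, limit, page=PAGE):
    """Fraction of page-aligned gaps in [0, gap_max) for which exec would fail."""
    gaps = range(0, gap_max, page)
    failures = sum(1 for gap in gaps if not exec_fits(gap, needed, limit))
    return failures / len(gaps)


# Hypothetical stack need of a small process such as sleep or cat.
needed = 64 * 1024

# Ramdisk: 2MB stack limit. Multiuser: 4MB default from login.conf.
ramdisk = failure_fraction(2 * MB, needed, 2 * MB)
multiuser = failure_fraction(2 * MB, needed, 4 * MB)

print(f"ramdisk (2MB limit): {ramdisk:.1%} of execs fail")
print(f"multiuser (4MB limit): {multiuser:.1%} of execs fail")
```

Under these assumptions only the unluckiest few percent of gaps push a tiny process over a 2MB limit, which fits the intermittent failures from the background sleep loop. With a 4MB limit no gap up to 2MB can cause a failure, which fits the bug never showing up on multiuser systems. The exact percentage depends entirely on the assumed stack need.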

The short term fix was to revert the change until we can properly exclude the gap from the process’s usage. Way back when, random stackgap was a cheap hack to avoid totally fixed stack pointers, but it’s obviously not enough for truly random stacks. There are only so many bits to spare on 32-bit architectures, but 64-bit architectures can certainly afford to relocate the stack much farther.

Work on the long term fix didn’t take long at all.

Posted 08 Feb 2015 23:54 by tedu Updated: 09 Feb 2015 12:16
Tagged: openbsd