A few thoughts I had after reading Exploiting the Linux kernel via packet sockets. Not really about the exploit itself, but about what it reveals about the state of systems security.
“It should be noted that if a kernel has unprivileged user namespaces enabled, then an unprivileged user is able to create packet sockets.”
Two types of privilege restriction are currently in vogue. There’s the seccomp/pledge model of restricting access to system calls, often referred to as sandboxes. Then there’s the jail/container approach. Hey, sure, give away root access because it’s not really root. Pseudo virtualization. In some sense, these two approaches are similar. Take some code, let it do some stuff, hope it can’t do too much.
Now, as regards attack surface, they couldn’t be more different. A sandbox specifically reduces the attack surface of the kernel. All the fun features that an attacker would like to use are locked up. Even if there’s an exploitable bug in, say, sysctl, if it can’t be called by a sandboxed process, that’s one less thing to worry about. But containers and namespaces do the opposite. They expose all this new attack surface, previously only accessible to root, and let even regular users poke and prod it. And thus what might have been a root to root exploit (kinda boring) becomes a privilege escalation.
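To make the namespace point concrete, here is a rough sketch of the sequence the quote describes: enter a fresh user and network namespace, then ask for a packet socket. The function name is made up for illustration, and the result depends entirely on how the running kernel is configured (many systems restrict unprivileged user namespaces), so treat it as a probe and not a guarantee. The work happens in a forked child so the caller's own namespaces are left alone.

```c
#define _GNU_SOURCE
#include <arpa/inet.h>      /* htons */
#include <linux/if_ether.h> /* ETH_P_ALL */
#include <linux/sched.h>    /* CLONE_NEWUSER, CLONE_NEWNET */
#include <sys/socket.h>
#include <sys/syscall.h>    /* SYS_unshare */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if a child process could create an
 * AF_PACKET socket after entering new user + network namespaces,
 * 0 if either step was refused, -1 if fork/waitpid failed.
 */
int packet_socket_via_userns(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Raw syscall, so the sketch does not depend on feature-test macros. */
        if (syscall(SYS_unshare, CLONE_NEWUSER | CLONE_NEWNET) != 0)
            _exit(2);   /* user namespaces unavailable or restricted */

        /* Normally needs CAP_NET_RAW; inside the new namespaces we have it. */
        int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
        if (fd < 0)
            _exit(1);
        close(fd);
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Run as a regular user, a return value of 1 means the packet socket code is reachable without any real privilege, which is exactly the extra attack surface in question.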
“The state of SMEP & SMAP on the current CPU core is controlled by the 20th and 21st bits of the CR4 register. To disable them we should zero out these two bits. For this we can use the func(data) primitive to call native_write_cr4(X)”
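The arithmetic behind that quote is tiny, which is sort of the point. A sketch, counting bits from zero as the x86 manuals do (SMEP is bit 20, SMAP is bit 21); the function name and the sample register value are mine, not taken from the writeup:

```c
#include <stdint.h>

/* Architectural bit positions in CR4, counting from bit 0. */
#define CR4_SMEP (UINT64_C(1) << 20)
#define CR4_SMAP (UINT64_C(1) << 21)

/*
 * Hypothetical helper: given a CR4 value, return the value an attacker
 * would want to write back, with SMEP and SMAP cleared and every other
 * bit left untouched.
 */
uint64_t cr4_without_smep_smap(uint64_t cr4)
{
    return cr4 & ~(CR4_SMEP | CR4_SMAP);
}
```

For an assumed starting value of 0x3407f0, this gives 0x407f0. One constant, one call, and both protections are gone on that core.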
Some further mitigation work may help here. Does the function to disable these protections need to be so trivially callable? CFI may or may not help here, since the prototypes apparently match. It probably depends on the compiler. But maybe such a powerful and obvious target for exploitation should have a unique prototype. Or a calling convention that makes it difficult to fake? Just some raw thoughts here, but I’m curious. There are still other exploit techniques, but when it’s this trivial to disable SMEP and jump to a userland payload, why would you bother with them?
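Here’s a userland mock of why matching prototypes matter. Everything in it is a stand-in I invented (the struct, the callback, the fake register), and nothing touches real hardware. It only shows that once an attacker controls both the function pointer and its argument, any function of the same type is a valid target, and a check that only compares prototypes would presumably have no reason to object.

```c
#include <assert.h>

/* Hypothetical object holding a callback and its argument. */
struct callback {
    void (*func)(unsigned long);
    unsigned long data;
};

static unsigned long fake_cr4 = 0x3407f0UL; /* simulated register, not real CR4 */
static unsigned long ticks;

/* The callback the object is supposed to invoke. */
void on_timeout(unsigned long data) { ticks += data; }

/* Stand-in for a sensitive function that happens to share the prototype. */
void mock_write_cr4(unsigned long val) { fake_cr4 = val; }

/* The func(data) primitive: an indirect call with one argument. */
void fire(const struct callback *cb)
{
    assert(cb && cb->func);
    cb->func(cb->data);
}

/* Simulate a memory corruption that overwrites both fields, then fire. */
unsigned long simulate_overwrite(unsigned long value)
{
    struct callback cb = { on_timeout, 1 };
    fire(&cb);                  /* intended use */

    cb.func = mock_write_cr4;   /* attacker-controlled overwrite */
    cb.data = value;
    fire(&cb);                  /* same type, so the call looks legitimate */

    return fake_cr4;
}
```

Give the sensitive function a different signature, say an extra argument, and this particular swap no longer type-checks, which is the kind of speed bump I have in mind.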
On KASLR and randomization. Linux has KASLR to protect text, but it’s trivially defeated by running dmesg and looking for a printout. Oops. And there have been some iterations on fixing that, but infoleaks are everywhere. Even a very limited infoleak will reveal the address of the kernel code, from which the addresses of other functions are easily calculated. On the other hand, from the writeup, not a lot of effort was spent shaping the heap. Allocate a bunch of objects and you get about what you expect. More randomization here would up the difficulty and reduce the reliability of the exploit. With a nicely randomized heap, it takes more than the leak of a single pointer to reveal what’s happening. You need a dynamic infoleak that allows looking around.
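On the KASLR half of that, the "easily calculated" part looks roughly like this. The offsets are invented for illustration; in practice they would come from the symbol table of a matching kernel image. The assumption is that the kernel text is slid as one block, so a single leaked code pointer fixes every other function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical offsets from the start of kernel text (made-up values). */
#define OFF_LEAKED_SYMBOL UINT64_C(0x0a3b40)
#define OFF_TARGET_SYMBOL UINT64_C(0x064210)

/* Recover the randomized base from one leaked pointer to a known symbol. */
uint64_t kernel_base_from_leak(uint64_t leaked_addr)
{
    assert(leaked_addr >= OFF_LEAKED_SYMBOL);
    return leaked_addr - OFF_LEAKED_SYMBOL;
}

/* Every other symbol sits at a fixed distance from that base. */
uint64_t target_from_leak(uint64_t leaked_addr)
{
    return kernel_base_from_leak(leaked_addr) + OFF_TARGET_SYMBOL;
}
```

With these made-up numbers, a leak of 0xffffffff9a0a3b40 gives a base of 0xffffffff9a000000 and a target at 0xffffffff9a064210. A randomized heap has no fixed offset table like this, which is why one pointer stops being enough.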