guest - flak

medium rare

Is it crazy that a Medium post about javascript bloat would have itself have megabytes of javascript and stylesheets? I wouldn’t know, since I didn’t see it. I have a little proxy like service running that rewrites its HTML. This particular service was an experiment to replace some python code with go, to evaluate suitability for future hacks.

I’ve been using the python lxml library for HTML parsing for ages. Seems to work pretty well. There’s actually a bunch of little one off scripts that share a similar skeleton, which is modified as needed. After all, the best code isn’t reusable, it’s reeditable. A little while ago that turned into a script to download Medium posts after I read them and save the important parts, so that sometime later when I want to read about the Riemann Hypothesis, it’s all still there in a place I can find it.

Continue reading medium rare...

Posted 2017-02-13 14:13:00 by tedu Updated: 2017-02-18 20:10:56
Tagged: go programming web

to errno or to error

Unlike other languages which have one preferred means of signalling an error, C is a multi error paradigm language. Error handling styles in C can be organized into one of several distinct styles, such as popular or correct. Some examples of each.

in band sentinel

One very popular option is the classic unix style. -1 is returned to indicate an error.

Continue reading to errno or to error...

Posted 2017-01-24 20:52:42 by tedu Updated: 2017-01-24 20:52:42
Tagged: c programming

go garbage collector and liveness

Depending on language, compiler, and runtime, sometimes the garbage collector needs a few hints from the programmer. You know you’re done with an object, but to the GC, if a variable appears live, it can’t be collected. Sometimes the problem really is programmer error, as objects continue to collect in a container that’s never inspected. Other times the variable will be overwritten soon enough, but does it help to overwrite it sooner?

A trivial example.

Continue reading go garbage collector and liveness...

Posted 2017-01-02 15:18:31 by tedu Updated: 2017-01-02 21:14:07
Tagged: go programming

exfiltration via receive timing

Another similar way to create a backchannel but without transmitting anything is to introduce delays in the receiver and measure throughput as observed by the sender. All we need is a protocol with transmission control. Hmmm.

Actually, it’s easier (and more reliable) to code this up using a plain pipe, but the same principle applies to networked transmissions.

First the reader code. We’ll assume an input string of decimal digits, 1-9.

Continue reading exfiltration via receive timing...

Posted 2016-12-22 15:20:19 by tedu Updated: 2016-12-22 15:20:39
Tagged: c network programming security

exfiltration via request timing

There are any number of ways to exfiltrate data via covert channels. For example, a popular technique is to make DNS lookups for a series of hostnames like “attack.example.com”, “atdawn.example.com”, etc. which will be passed through most firewalls. For a long time DNS requests weren’t monitored, but savvy network operators have grown wise. So if we wanted to beam some data off a device surreptitiously, what else can we do?

There are some even lower level techniques, like varying IP packet size or options, but this too may trigger alarms. Instead, let’s move up the stack and try to make our tunnel look as normal as possible. Consider the scenario where we’re Apple or Google and we want to extract Signal private keys off a device. It’s a small amount of data, and we already have an established channel: update checks. The trick is to piggyback our channel onto update requests. (This is not an entirely original idea, I just wanted to explore it.)

Continue reading exfiltration via request timing...

Posted 2016-12-19 17:30:45 by tedu Updated: 2016-12-19 17:30:45
Tagged: c network programming security

sorted arrays, theory and practice

The average time to check if a random array is sorted is e - 1. This was not the result I was expecting, but it’s also easy to check.

#include <stdio.h> #include <stdlib.h> typedef unsigned int unt; const unt length = 100; unt sortlen(unt *arr) { unt i; for (i = 0; i < length - 1; i++) if (arr[i] > arr[i + 1]) break; return i + 1; } void shuffle(unt *arr) { unt i; for (i = 0; i < length - 1; i++) { unt j = i + arc4random_uniform(length - i); unt t = arr[j]; arr[j] = arr[i]; arr[i] = t; } } int main(int argc, char **argv) { const unt trials = 1000000; unt arr[length]; unt i; unt count = 0; for (i = 0; i < length; i++) arr[i] = i; for (i = 0; i < trials; i++) { shuffle(arr); count += sortlen(arr); } printf("avg sort len %f\n", (double)count / trials); }

And the results:

carbolite:/tmp> ./sort avg sort len 1.718539 carbolite:/tmp> ./sort avg sort len 1.717483 carbolite:/tmp> ./sort avg sort len 1.718032

I’m not sure if it’s reasonable to expect the inputs to such a function to be uniformly random arrays. Any program which checks if an array is sorted probably deals with sorted arrays more frequently. But at least the math checks out.

Posted 2016-10-30 23:34:37 by tedu Updated: 2016-10-30 23:34:37
Tagged: math programming

process listing consistency

POSIX specifies that there is a ps utility to list processes, although it doesn’t describe how the command is implemented. In fact, it’s not possible to implement ps using only POSIX interfaces. However it’s implemented, it’s unlikely to use double buffering, which means on a sufficiently busy system, the results may be inconsistent. If lots of processes are being created and exited while ps runs, some of the output may be “before” and some “after”. Much like a game without vsync.

In order to test for inconsistency, we need to create lots of processes, but in a predictable way. Then we run ps over and over, looking for discrepancies. Enter the chicken and the egg.

Continue reading process listing consistency...

Posted 2016-10-06 12:26:37 by tedu Updated: 2016-10-06 12:26:37
Tagged: c openbsd programming

tilted abstractions

Abstractions are nice when they help us gloss over seemingly unimportant details, but they also shape our perception of the underlying reality. Learning to work without abstractions can make it easier to switch between abstractions.

Consider the analogy of a photographer who takes pictures and an observer who views the results. Normally, they would both be standing upright. Everything lines up. Either the photographer may tilt their camera askew or the observer may tilt their head. If the picture is tilted, the upright viewer can probably make sense of it. Even tilt their own head to compensate. Or if the picture is straight, an observer with a neck cramp can still view it. But what happens if the camera is tilted left and the observer is tilted right? It doesn’t make any sense. Now the trees are growing upside down!

One can imagine writing code on a tilted monitor. It’s a little annoying at first, but eventually one learns to tilt their head to compensate and now their own code looks perfectly normal. Using a friend’s properly oriented monitor is a little cumbersome, but adequate. Using a monitor tilted in the opposite direction? Terrible! Horrendous! How can you call this code? All the loops are backwards and the control structures are inside out!

Having spent a few days learning a low level API that’s conveniently wrapped by a library, I wondered, what’s the point? I’m probably going to switch to a different library, er framework, in the future. Will this knowledge transfer? Yes, I think so. These are the API functions that all libraries use as well. They just apply their own slant. It can’t hurt to know what’s happening underneath. Is the knowledge of one framework directly transferable to another? That depends greatly on the compatibility of their tilt. If all I know is a left tilted world, those expectations may actually be a hindrance in a right tilted world.

Posted 2016-10-05 14:39:40 by tedu Updated: 2016-10-05 14:39:40
Tagged: programming

choose boring bugs

When there’s more than one way to do things, it can be rewarding to relentlessly polish code, but this sometimes causes trouble later on. Boring code tends to have boring bugs, and since bugs are inevitable, this suggests we should prefer boring code.

Continue reading choose boring bugs...

Posted 2016-10-04 16:26:50 by tedu Updated: 2016-10-05 17:48:39
Tagged: programming

backlight battery indicator

The last few models of Thinkpads are sadly devoid of indicators. How do you tell if caps lock is on? Type something and see if it matches expectations. If it happens to be the lock screen, loltastic. More importantly, how do you know if AC power has accidentally been disconnected and the battery is running low? The red dot on the opposite side of the lid isn’t much use.

It’s possible to use some sort of desktop environment status bar, but I prefer a low thrills environment. I don’t need a big honking battery icon distracting me. Accordingly, I have only a small (text) battery display in the corner. It’s there when I need it, but unobtrusive. The only problem is if I think I’m plugged into the wall, but I’m not, I won’t be checking battery and may not notice even as the situation grows dire.

Continue reading backlight battery indicator...

Posted 2016-08-28 02:43:19 by tedu Updated: 2016-09-09 21:02:52
Tagged: c computers openbsd programming

connect doesn’t restart

There was an interesting bug where pkg_add failed when resizing the terminal. The bug was actually in ftp, specifically the way it calls connect. When the terminal is resized, SIGWINCH is sent, which interrupts the connect system call. Sometimes syscalls restart, but connect is not among those that do. This may be a little surprising, because the previous bug involved the server side counterpart to connect, accept. On the server, accept restarts, but on the client, connect does not.

Behind the scenes, what’s happening? As the man page says, connect “initiates a connection on a socket”. It doesn’t say much about finishing the connection, though, which may be a bit surprising. Depending on whether the socket is blocking or nonblocking, there are two ways that may happen. This all assumes TCP, which involves some interplay of SYNs and ACKs that does not take place instantaneously. (Which explains why accept behaves differently. It is never in a half connected state.)

Continue reading connect doesn’t restart...

Posted 2016-08-15 21:00:54 by tedu Updated: 2016-08-15 21:00:54
Tagged: c openbsd programming

it’s hard work printing nothing

It all starts with a bug report to LibreSSL that the openssl tool crashes when it tries to print NULL. This bug doesn’t manifest on OpenBSD because libc will convert NULL strings to ”(null)” when printing. However, this behavior is not required, and as observed, it’s not universal. When snprintf silently accepts NULL, that simply leads to propagating the error.

There’s an argument to be made that silly error messages are better than crashing browsers, but stacking layers of sand seems like a poor means of building robust software in the long term.

As soon as development for the next release of OpenBSD restarted, some developers began testing a patch that would remove this crutch from printf.

Continue reading it’s hard work printing nothing...

Posted 2016-08-08 17:00:03 by tedu Updated: 2016-10-10 19:46:11
Tagged: c openbsd programming

broken features aren’t used

One of the difficulties in removing a feature is identifying all the potential users. A feature here could be a program bundled with an operating system, or a command line option, or maybe just a function in a library. If we remove a feature, users that depend on it will be sad. Unfortunately, absence of evidence is not evidence of absence. I’ve never heard of anybody running ls -p but it’s not impossible that somebody does.

The reasons why we want to remove an existing feature can vary. Sometimes it’s old code that interferes with maintenance. Sometimes a nearly complete rewrite can improve performance. In other cases, the feature in question is really more of a misfeature. It may have security implications, where the existence of the feature can be used to facilitate the exploitation of other vulnerabilities, and removing the feature will help mitigate the exploit.

There’s no general test that can be used, but there is one test that works in many cases. Test that the feature works. If the feature doesn’t work, that’s compelling evidence that nobody is using it, because nobody can be using it. You don’t need to fix it. You can just remove it.

(If you’ll pardon the heresy, this may be an argument against exhaustive unit tests. Many times a feature will start life in a functional state, but over time falls out of use and then gets broken by subsequent changes. Nobody notices and life goes on. If you have a perfect test suite, you’ll never have broken features, making it harder to identify the unused ones.)

Posted 2016-07-29 21:32:53 by tedu Updated: 2016-07-30 01:27:46
Tagged: programming software

true string indices

The other day cperciva answered why strchr returns a pointer. Many other languages do return an offset, but of course many of those lanuages don’t have pointers. Poor things. I happened to be writing a bunch of code using strchr recently, and needed both pointers and offsets.

Let’s imagine we have two similar functions, strchr and index.

Continue reading true string indices...

Posted 2016-06-24 13:42:27 by tedu Updated: 2016-06-25 17:03:49
Tagged: c programming

select works poorly

At the bottom of the OpenBSD man page for select is a little note. “Internally to the kernel, select() and pselect() work poorly if multiple processes wait on the same file descriptor.” There’s a similar warning in the poll man page. Where does this warning come from and what does it mean?

The code to implement these system calls lives in src/sys/kern/sys_generic.c. Despite differences in interface, the internal implementation is mostly shared, which is why they both have the same affliction. select and poll both scan a set of file descriptors for readiness, then if none are ready we sleep and wait.

The primary function for sleeping is tsleep, which requires a wait channel. Conceptually similar to a condition variable. At some later point, when something changes, another process or interrupt will call wakeup on the same wait channel and we’ll resume running. For example, if we’re trying to read from a pipe, but there’s no data, we’ll sleep using the address of the pipe data structure. When data is written to the pipe, it will call wakeup with the same address. We only wake up the reader(s) of this pipe, and don’t disturb the slumber of all the readers blocked waiting on other pipes.

Continue reading select works poorly...

Posted 2016-06-07 13:59:14 by tedu Updated: 2016-06-07 13:59:14
Tagged: c openbsd programming

accidentally nonblocking

A continuation, perhaps culmination, of a series that includes rough idling, firefox vs rthreads, and browser ktrace browsing.

I really like using ktrace to inspect programs. It’s somewhat primitive, to be sure, and unlike source review, it can be difficult to understand the programmer’s intentions. However, my CPU doesn’t execute intentions; it executes instructions.

poll

Browser source code can be nearly impenetrable due to size, but some other programs share similar behaviors. Perhaps they will prove a tractable target for further investigation. xterm contains some code that looks about like this. (Despite the “Abandon All Hope” warning at the entrance to this ifdef maze, I still found it quite navigable.)

Continue reading accidentally nonblocking...

Posted 2016-06-06 05:41:09 by tedu Updated: 2016-06-06 12:57:18
Tagged: c network openbsd programming

a prog by any other name

What is a name, really?

Sometimes two similar programs are really the same program with two names. For example, grep and egrep are two commands that perform very similar functions and are therefore implemented as a single program. Running ls -i and observing the inode number of each file will reveal that there is only one file. Calling the program egrep is a shorthand for -E and does the same thing.

names

In fact, every program has three names: its name in the filesystem, the name it has been invoked with, and whatever it believes its own name to be. Under normal circumstances the first two will be the same, but it is possible to call execve with a path and argv[0] not in alignment. Sometimes by accident, as in mv.

Continue reading a prog by any other name...

Posted 2016-04-28 12:26:04 by tedu Updated: 2016-04-29 02:22:50
Tagged: c openbsd programming

more input validation unnecessary

There’s a widespread belief that validating user input prevents security vulnerabilities. This is true as far as it goes, but doesn’t tell the whole story. Consider the following example, distilled from any number of real world examples.

if (!valid_input(buffer)) { free(buffer); error = BADSTUFF; goto ungood; } error = process_input(buffer); ungood: free(buffer); return error;

A not uncommon mistake. A vulnerability report may, quite accurately, say something like “Invalid inputs may result in remote code execution.” However, further input validation won’t fix this bug, nor will tweeting “This is why you always validate your inputs!” prevent future occurrences.

Lots of problems may share similar or even identical descriptions without sharing fixes. It’s a small point, really, but no less important. And of course, hardly limited to the field of security.

Posted 2016-04-25 18:14:33 by tedu Updated: 2016-04-25 18:14:33
Tagged: c programming security

outrageous roaming fees

Unexpected roaming fees are the worst. You’re just cruising along, having a jolly old time, and then boom. $20 per megabyte??? Should have read the fine print. Of course, if you had known to read the fine print, you probably would have already known about the roaming fees, and therefore not needed to read the fine print. And so it goes, in life and in ssh.

What, ssh has roaming??? Should have read the fine print. The Qualys Security Advisory is more than thorough. Now that we’ve read the fine print, what can we do differently?

The main bug (ignoring the second overflow for now) is that some sensitive memory was recycled and leaked. The possibility of this happening has been known for some time, and there’s some countermeasures in place, but they’re not foolproof.

Continue reading outrageous roaming fees...

Posted 2016-01-15 14:55:50 by tedu Updated: 2016-01-19 04:17:28
Tagged: c openbsd programming security thoughts

SIGPIPE can happen to you

Some recent flak outages were mysterious. One day things would be working, but the next they wouldn’t. All the flak.lua processes had disappeared. No error messages were reported in any observable location. No unusual looking requests were observed in any recorded location. Sometimes a process would survive days of heavy traffic. Other times it would die after only a few hours of light traffic. It was as if the process involved simply lost the will to live.

Finally, hooking up ktrace to the process, the culprit was, after a few days, revealed: SIGPIPE after a write. It’s unclear what I changed to cause this to be a problem now, after several years of successfully ignoring the problem, but that’s life.

Continue reading SIGPIPE can happen to you...

Posted 2015-12-02 16:06:12 by tedu Updated: 2015-12-02 16:06:12
Tagged: network programming