Lecture on Overflows

I built this lecture to be interactive and therefore littered these slides with questions.

I expect you to participate in the discussion, but if you do not wish to do so, please place an arbitrary solid-state object onto your head. A hat is a good candidate for that and readily folded from surplus paper.

The following sample represents a strand of the family tree of programming languages.

FORTRAN (1957)
ALGOL (1958 and 1960)
CPL (1963)
BCPL (1966)
B (1969)

What is next?

C (1972)

Forged by the great patriarch Dennis Ritchie and supported by the iron hand of Bell Labs, the programming language to guarantee the eternal employment of security professionals came into existence. Decades passed, but to this day people gather to investigate its wonders. In honor of that tradition, we begin our journey.

A Different Outlook

In comparison to Java or, say, Haskell, C is a bit different. It was designed to be a portable way to produce machine code for various ancient architectures and that shows.

Some of its defining characteristics are

no garbage collection,
no first-class functions,
no exceptions,
no classes,
no hygienic macros,
no type variables and
no reference implementation.

What other practical languages fit these restrictions?

Example

Consider the following Java program.

static String join(final String x,
    final String y) {
  if (x == null || y == null)
    throw new NullPointerException();

  return x + '\n' + y;
}

Haskell is fine too.

join x y = x ++ '\n' : y

The same thing in C is not as trivial.

#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

char* join(char const* const x,
    char const* const y) {
  size_t n_x;
  size_t n_y;
  char* s;

  if (x == NULL || y == NULL) {
    errno = EINVAL;
    return NULL;
  }

  n_x = strlen(x);
  n_y = strlen(y);
  if (n_y > SIZE_MAX - 1 - n_x - 1) {
    errno = ERANGE;
    return NULL;
  }

  s = malloc(n_x + 1 + n_y + 1);
  if (s == NULL)
    return NULL;

  memcpy(s, x, n_x);
  s[n_x] = '\n';
  memcpy(&s[n_x + 1], y, n_y + 1);

  return s;
}

You also have to check the return value, eventually call free on it to avoid leaking memory and store errno if needed.

Heed the Specification

The lack of a reference implementation has an important consequence: there is no way to guarantee that a program that works once will work again. Correctness must be proven manually by consulting the specification. It is not a trivial task considering the specification is hundreds of pages long, written by a committee, proprietary and expensive.

The original specification was called ANSI X3.159-1989. As its name hints, the language evolved freely from its conception in 1972 to the day it was standardized in 1989. Amendments to the specifications are still released periodically and sometimes even supported by compiler writers.

The key concepts of writing correct C programs are

undefined behavior,
unspecified behavior and
implementation-defined behavior.

They are subtly different, but as a rule of thumb refer to bad things.

Try This at Home

Here is a simple program that prints a character as if it was a string.

#include <stdio.h>
#include <stdlib.h>

int main(void) {
  char const c = 'C';

  return printf("%s\n", &c) == EOF
    ? EXIT_FAILURE
    : EXIT_SUCCESS;
}

The compiler does not care and the resulting program works.

$ cc -o name name.c
$ ./name
C

Should we rejoice?

Probably not. The program only works due to automatic storage being implemented on the stack, which just so happens to contain a zero byte at &c + 1. Things go south if something, such as compiler optimizations, changes the stack layout.

$ cc -O3 -o name name.c
$ ./name
C?5&??

Signs of Good Design

To be frank, the design of C has some issues. It does not make sense to cover them all in detail, for that would take weeks, but some of the essential ones will be covered. Others might pop up along the way, as they tend to.

Syntax is the first one of the bunch.

This Shinichiro Hamaji’s C program won a contest in 2011.

#include<stdint.h>//       4
#include<stdio.h>/*      76.                               1
                         ..2321     1       57             3
                         21....     .       ..             .11         1   2
                    1  88..1321  2  33    5512      1      277        14   1
                    099..12....  .  ..    ....    1 4 11111...111 122 5.1  .
                  11...221.821112411123455676489  51.176543232666 902 .27  1
                  10111...1.....................11.417...........1...21..11.
                  ..2239921176566156225563322299887...6533233233182469196894
                  22...............................111......................
*///              3269548556987776665556662131223412347543332334543322223456
/*
          13.3.37 */char C[120]  [60   ];int R[120][60],W,H,J,K,B[61][61],/*
          12.2.39 */r,i,j,c,*q  =&   H,t=7200,x,y,k;int64_t*U,T[28800],*S=/*
          11.2.40 */T,O[120],  Z[   120],v,z;void D(){for(k=-1;7200>++k;S[/*
            10.39 */k]=v)r=!         ~(v=U[k])&&*R[k/60]?2:S[k]-v?1:r;;;;}/*
             9.36 */void L(             ){for(r=1;r==1;){r=3;;for(i=0;120>/*
             9.31 */i;i++){                  for(k=z=1,j=0;v=R[i][j];j++)O/*
           8.3.32 */[i]|=(  1LL             <<v)-1<<k,k+=v,0,Z[i]=z|=1LL<</*
           7.3.30 */k++;;  v=~                (3LL<<k-2);for(j=-61;++j<60;/*
           5.3.29 */v=(   v|~                  z)&(j<0?v>>1:v<<1|1))v=S[60/*
       3.1.3.1.28 */*         i    +(j        < 0?~j:j)]|=j?v:~3;}for(z=0;/*
         1.2.4.31 ;         */    7200       >z;z++)i=z/60,j=z%60,(B[i<60?/*
           7.5.31     */i:j][   i<60?        j:i-60]=~S[z]&O[i]?~S[z]&Z[i]/*
         1.6.5.30 ;  */?r=0    :(U=O          ,1):(U=Z,2))?k=i<60?j+60:j,S/*
       5.10.12.16 */[i%    60+60*k]|=         ~U[k]:0;U=S;  S-=t*=-1;D();z/*
    5.11.5.4.12.4 */*9;  }}int main(          ){for  (;K=  scanf("%d",R [*/*
    15.2.4.4.11.5 */q+c*60]+j)<1?          q= &W,j  --,2   >++c:'\n';j =-K/*
   16.3.4.2.2.5.5 */+getchar()?j+1        :++ *q*0   );     L(   );;if (!r/*
         14.8.7.3 */)for(K=0;K<W          *60;K++)              if(K%60  </*
     12.1.4.1.6.2 */W&!B[K/60]          [ K%60 ]                  ){for(  /*
       11.3.4.6.1 */c=64;c--;         )if (!(1                     &S[K/*  ;
     10.1.8.8.5.1 */]>>c))U=        S ,000,S+=         J=14400,     D()/*  ;
         9.8.11.5 */,S[K]=~         (1LL<<c)         ,L(),S-=J,S    [K]/*
    9.5.6.1.2.4.1 */|=r==2?          1LL<<          c:0;L(  ) ;}     q=/*  ;
        9.3.5.1.5 */&K;;for           (J=          K=i=0       ;    120/*
        3.2.6.5.1 */>    i;                       Z[i]=k            --,/*  ;
        2.6.3.5.1 */                             i>59?q    =&J     :0,/*   ;
    2.1.2.6.3.6.1 */ *     q<                    k?*q=k    :0,    0,C[/*   ;
2.1.2.2.1.5.2.5.1 */ i  ++  ][                k  ]=' '     ){     j=k/*    ;
    2.1.2.5.5.6.3 */ =  0;  for(;         x=R[i  ][j++]            ;/*
      4.1.9.7.5.1 */0)  k  +=sprintf   ((00,C[    i]+/*                    ;
       5.19.6.1.2 */k), "%d.",x);}i=~J;;r&1       ||puts        (         /*
         24.6.3.3 */r?"invalid":"failed");         for(;i      <H;       i/*
     24.1.2.9.7.4 */++,puts(""))for(j=~K;j  <  W;  )putchar( i<0?j<0    ||/*
        25.1.2.25 */(k=i+Z[j+60])<0?' ':C[j +  60  ][k]:j<00?(k=j+Z[i])<0?/*
          28.1.26 */' ':C[i][k]:"?X "[B[i][j]]  ) ,j++;return 0;} /* 2012 */

What does it do?

It obviously solves nonograms.

From a security perspective the lack of clarity is the least of your problems. While it hinders understanding, it does not cause any direct harm. Direct harm is commonly caused by overflows of various kinds.

Integer Overflows

Mathematicians (and computer scientists) have a magnificent concept called induction. It allows reasoning about recursive structures of arbitrary size through simple case analysis.

Unfortunately it all breaks down for modular arithmetic, which is the only kind of precise arithmetic C provides.

To make matters worse, only unsigned integers are guaranteed to wrap around in a predictable fashion. When a compiler encounters a signed integer overflow, it is permitted to produce anything. The set of anything contains

producing a negative value,
raising a signal,
emitting loud sounds and
destroying the operating system,

among other things. Luckily the latter are less common.

Real World Example

Assume we are maintaining an important web service. We need to dispatch lots of connections due to high traffic, so we decide to use C, because C is allegedly really fast.

Inside the web server we need to tackle the simple problem of finding the length of a line in a string.

int length(char const* const s) {
  int n = 0;

  while (s[n] != '\0' && s[n] != '\n')
    ++n;

  return n;
}

The code compiles and runs, but a few days later the server gets stuck. What is wrong?

The size of *s is sizeof (char), which is guaranteed to be 1, so the length of the string is equal to its size minus one. Further the size of the string is at most SIZE_MAX, which is guaranteed to be at least 65535. Finally the range of n is from INT_MIN to INT_MAX, which are guaranteed to be at most -32768 and at least 32767 respectively.

The server needs to be fixed. What can we do?

Fix it.

#include <stddef.h>

size_t length(char const* const s) {
  size_t n = 0;

  while (s[n] != '\0' && s[n] != '\n')
    ++n;

  return n;
}

Another Real Example

Assume we are still working on important things.

We are doing some sorting operations and need to find a pivot point between two array indices. This time we use the correct data type and feel confident in choosing the arithmetic mean, rounded down.

#include <stddef.h>

size_t pivot(size_t const x,
    size_t const y) {
  return (x + y) / 2;
}

The calculation overflows and our data is mangled. Is there an easy solution?

Yes.

#include <stddef.h>

size_t pivot(size_t const x,
    size_t const y) {
  return x < y
    ? x + (y - x) / 2
    : y + (x - y) / 2;
}

Two.

#include <stddef.h>

size_t pivot(size_t const x,
    size_t const y) {
  return x / 2 + y / 2 + (x % 2 + y % 2) / 2;
}

That is not always the case. Here is generic arithmetic for comparison.

#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

bool add(int* const p,
    int const x) {
  int y;

  y = *p;

  if ((x > 0 && y > INT_MAX - x) || (x < 0 && y < INT_MIN - x)) {
    errno = ERANGE;
    return false;
  }

  *p += x;

  return true;
}

bool mul(int* const p,
    int const x) {
  int y;

  y = *p;

  if (y > 0) {
    if (x > 0) {
      if (y > INT_MAX / x) {
        errno = ERANGE;
        return false;
      }
    } else {
      if (x < INT_MIN / y) {
        errno = ERANGE;
        return false;
      }
    }
  } else {
    if (x > 0) {
      if (y < INT_MIN / x) {
        errno = ERANGE;
        return false;
      }
    } else {
      if (y != 0 && x < INT_MAX / y) {
        errno = ERANGE;
        return false;
      }
    }
  }

  *p *= x;

  return true;
}

What have we learned?

~~That C sucks.~~
That we need to be careful.

An optimist would conclude that matters could be worse. At least integer overflows are easy to observe.

What if that was not the case?

Buffer Overflows

Take a good look at the following authentication system.

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static bool authenticate(bool* const result,
    FILE* const stream) {
  char const password[] = "secret";
  char buf[BUFSIZ];
  size_t size;

  size = fread(buf, 1, sizeof buf, stream);
  if (ferror(stream) != 0)
    return false;

  buf[size] = '\0';

  *result = strcmp(buf, password) == 0;

  return true;
}

We can compile a program that uses it and see that it works fine.

$ cc -o authenticator authenticator.c
$ printf %s secret | ./authenticator
Access granted!
$ printf %s wrong | ./authenticator
Access denied!

Is it actually fine?

Of course not.

$ printf '\0%08191d' 0 | ./authenticator
Access granted!

Why not?

Let us fire up a debugger to see what is happening.

$ gdb -q ./authenticator
Reading symbols from ./authenticator...done.
(gdb) break authenticate
Breakpoint 1 at 0x400666: file authenticator.c, line 8.
(gdb) run
Starting program: ./authenticator

Breakpoint 1, authenticate (result=0x7fffffffe33f, stream=0x7ffff7dd4640 <_IO_2_1_stdin_>)
    at authenticator.c:8
8               char const password[] = "secret";
(gdb) info locals
password = "\000\000\000\000\000\000"
buf = '\000' <repeats 5384 times>...
size = 0
(gdb) print (void* )password
$1 = (void *) 0x7fffffffe310
(gdb) print (void* )buf
$2 = (void *) 0x7fffffffc310
(gdb) print (void* )&size
$3 = (void *) 0x7fffffffe318
(gdb) print sizeof buf
$4 = 8192
(gdb) print password - buf
$5 = 8192
(gdb) quit
A debugging session is active.

        Inferior 1 [process 1764] will be killed.

Quit anyway? (y or n) y

A better view of the stack is in order.

Address	Variable
…	…
`0x7fffffffc300`	`&stream`
`0x7fffffffc308`	`&result`
`0x7fffffffc310`	`buf`
`0x7fffffffc311`	`&buf[1]`
…	…
`0x7fffffffe30f`	`&buf[8191]`
`0x7fffffffe310`	`password`
`0x7fffffffe311`	`&password[1]`
…	…
`0x7fffffffe317`	`&password[7]`
`0x7fffffffe318`	`&size`
…	…

The situation is definitely bad.

How can we fix it?

There are two simple options.

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static bool authenticate(bool* const result,
    FILE* const stream) {
  char const password[] = "secret";
  char buf[BUFSIZ];
  size_t size;

  /* #1 */

  /* size = fread(buf, 1, sizeof buf, stream); */
  size = fread(buf, 1, sizeof buf - 1, stream);
  if (ferror(stream) != 0)
    return false;

  /* #2 */

  /* buf[size] = '\0'; */
  buf[size] = '\0';

  /* *result = strcmp(buf, password) == 0; */
  *result = strncmp(buf, password, sizeof buf) == 0;

  return true;
}

Keep in mind that the overflow was restricted to one byte and already had potential for massive damage. More bytes would be more trouble.

What have we learned?

~~That C is for cookie, not for computers.~~
That we need to be extremely careful.

Being extremely careful takes a lot of time and effort. That is not a good fit for lazy programmers, which is all of them.

What can we do to not have to think?

Remedies

We could make computers think for us. Tools exist for that purpose.

Debuggers and Tracers

Debuggers and tracers are useful for analyzing the behavior of programs. However they are not of much use for finding new defects, unless it involves reverse engineering binary blobs.

One common use case is to fix crashes. Another less common one is to inspect how libraries or system calls are used and link or inject code into them.

As an example, let us adjust a binary blob clock to academic time (15 minutes late).

$ date --rfc-3339=seconds
2015-04-01 16:20:00+03:00

First we find out where the program gets its time.

$ alias quietly="3>&1 1> /dev/null 2>&3"
$ quietly ltrace date --rfc-3339=seconds | grep time
clock_gettime(0, 0x7fffc777a150, 0, 0)           = 0
localtime(0x7fffc777a0d0)                        = 0x7f0bcbc83de0
$ quietly strace date --rfc-3339=seconds | grep time
open("/etc/localtime", O_RDONLY|O_CLOEXEC) = 4

Then we write a timestamp adjustment program.

#include <dlfcn.h>
#include <time.h>
#include <stddef.h>
#include <stdio.h>

static struct tm* (* f)(time_t const*) = NULL;

static void init(void) {
  f = dlsym(RTLD_NEXT, "localtime");
  if (f == NULL)
    (void )fprintf(stderr, "dlsym: %s\n", dlerror());
}

struct tm* localtime(time_t const* const timep) {
  if (f == NULL)
    init();

  time_t t = *timep;
  t -= 60 * 15;

  return f(&t);
}

Finally we link the code dynamically, at runtime.

$ LD_PRELOAD=./party.so date --rfc-3339=seconds
2015-04-01 16:05:30+03:00

Do you feel like a power user yet?

Unit Tests

Consider the following test suite for a mathematical library.

#include <cheat.h>
#include <cheats.h>

#include "array.h"
#include "mean.h"

CHEAT_TEST(mean,
  double elements[] = {1, 1, 2, 3, 5, 8, 13};
  struct array array = {
    .count = sizeof elements / sizeof *elements,
    .elements = elements
  };
  cheat_assert_double(mean(&array), 4.7, 0.1);
)

It passes with flying colors.

$ cc -c -o mean.o mean.c
$ cc -I . -c -o tests.o -std=c99 tests.c
$ cc -o tests mean.o tests.o -lm
$ ./tests
.
----
1 successful of 1 run
SUCCESS

How about empty arrays?

Unit tests only help find expected problems. Coming by new problems requires profiling, static analysis or fuzzing.

Profiling with Valgrind

Here is a command line program that calculates the arithmetic mean of its arguments. It has a memory leak.

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

#include "array.h"
#include "mean.h"

int main(int const count,
    char** const arguments) {
  struct array array;
  size_t index;

  array.count = 0;
  array.elements = malloc((count - 1) * sizeof *array.elements);
  if (array.elements == NULL) {
    perror("malloc");

    return EXIT_FAILURE;
  }

  for (index = 1;
      index < count;
      ++index) {
    double x;
    char* nptr;
    char* endptr;

    errno = 0;
    x = strtod(arguments[index], &endptr);
    if (endptr == arguments[index])
      errno = EINVAL;
    if (errno != 0) {
      perror("strtod");

      return EXIT_FAILURE;
    }

    array.elements[array.count++] = x;
  }

  printf("The arithmetic mean is %f!\n", mean(&array));

  free(array.elements);

  return EXIT_SUCCESS;
}

Where is it?

Valgrind’s memory checker (memcheck) can tell us that.

$ valgrind --leak-check=full --tool=memcheck ./leaky 0.1 0.2 nope 0.8
==1764== Memcheck, a memory error detector
==1764== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==1764== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==1764== Command: ./leaky 0.1 0.2 nope 0.8
==1764==
strtoul: Invalid argument
==1764==
==1764== HEAP SUMMARY:
==1764==     in use at exit: 32 bytes in 1 blocks
==1764==   total heap usage: 2 allocs, 1 frees, 600 bytes allocated
==1764==
==1764== 32 bytes in 1 blocks are definitely lost in loss record 1 of 1
==1764==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1764==    by 0x4006D7: main (leaky.c:14)
==1764==
==1764== LEAK SUMMARY:
==1764==    definitely lost: 32 bytes in 1 blocks
==1764==    indirectly lost: 0 bytes in 0 blocks
==1764==      possibly lost: 0 bytes in 0 blocks
==1764==    still reachable: 0 bytes in 0 blocks
==1764==         suppressed: 0 bytes in 0 blocks
==1764==
==1764== For counts of detected and suppressed errors, rerun with: -v
==1764== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

One caveat exists: the branch that causes the leak must be executed for the leak to be detected.

In addition to memory checking, Valgrind comes with

cache profilers (cachegrind and callgrind),
a heap profiler (massif) and
thread profilers (helgrind and drd).

$ valgrind --tool=cachegrind ./leaky 0.1 0.2 0.4 0.8
==1764== Cachegrind, a cache and branch-prediction profiler
==1764== Copyright (C) 2002-2013, and GNU GPL'd, by Nicholas Nethercote et al.
==1764== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==1764== Command: ./leaky 0.1 0.2 0.4 0.8
==1764==
The arithmetic mean is 0.375000!
==1764==
==1764== I   refs:      171,385
==1764== I1  misses:      1,023
==1764== LLi misses:      1,018
==1764== I1  miss rate:    0.59%
==1764== LLi miss rate:    0.59%
==1764==
==1764== D   refs:       57,386  (42,783 rd   + 14,603 wr)
==1764== D1  misses:      3,044  ( 2,477 rd   +    567 wr)
==1764== LLd misses:      2,446  ( 1,951 rd   +    495 wr)
==1764== D1  miss rate:     5.3% (   5.7%     +    3.8%  )
==1764== LLd miss rate:     4.2% (   4.5%     +    3.3%  )
==1764==
==1764== LL refs:         4,067  ( 3,500 rd   +    567 wr)
==1764== LL misses:       3,464  ( 2,969 rd   +    495 wr)
==1764== LL miss rate:      1.5% (   1.3%     +    3.3%  )

They can be used for reverse engineering binary blobs too. Is that pretty cool or what?

Static Analysis

The most accessible static analyzer is the compiler itself.

Compilers of course produce error messages, but they can also produce useful warnings, if so instructed. The distinction is that errors are fatal to the compiler while warnings are just fatal to the user.

Even the lowest warning level of GCC finds a mistake in the previously profiled program.

$ gcc -Wall -c -o leaky.o leaky.c
leaky.c: In function 'main':
leaky.c:25:9: warning: unused variable 'nptr' [-Wunused-variable]
   char* nptr;
         ^

Why not enable all warnings?

That is a trick question. You should obviously enable all warnings and only disable them if they turn out to be bogus or useless.

There is still the issue that compilers primarily focus on producing code instead of useful feedback. That brings up another question. How can we get more warnings?

Luckily there are programs that specialize on producing complaints instead of executables.

Splint, Lint and Friends

Splint is short for Secure Programming Lint, which refers to an ancient tool called Lint. Originally Lint came from the same place as C did, Bell Labs, and only a few years later than C itself. One can only wonder why.

Throwing our leaky and sloppily written program at Splint creates a wall of text.

$ splint leaky.c
Splint 3.1.2 --- 03 May 2009

leaky.c: (in function main)
leaky.c:22:4: Operands of < have incompatible types (size_t, int):
                 index < count
  To allow arbitrary integral types to match any integral type, use
  +matchanyintegral.
leaky.c:31:12: Unrecognized identifier: EINVAL
  Identifier used in code has not been declared. (Use -unrecog to inhibit
  warning)
leaky.c:35:24: Only storage array.elements (type double *) derived from
                  variable declared in this scope is not released (memory leak)
  A storage leak due to incomplete deallocation of a structure or deep pointer
  is suspected. Unshared storage that is reachable from a reference that is
  being deallocated has not yet been deallocated. Splint assumes when an object
  is passed as an out only void pointer that the outer object will be
  deallocated, but the inner objects will not. (Use -compdestroy to inhibit
  warning)
leaky.c:25:9: Variable nptr declared but not used
  A variable is declared but never used. Use /*@unused@*/ in front of
  declaration to suppress message. (Use -varuse to inhibit warning)
leaky.c:41:46: Passed storage array contains 1 undefined field: elements
  Storage derivable from a parameter, return value or global is not defined.
  Use /*@out@*/ to denote passed or returned storage which need not be defined.
  (Use -compdef to inhibit warning)

Finished checking --- 5 code warnings

Are the complaints appropriate?

Those are the known remedies. Is there anything else we can do?

Forget about C

Using another language may help in not creating any of these ridiculously convoluted problems.

Java might be unpleasant and suffer from integer overflows, but it is still a great improvement in terms of security.

Sanely written Haskell does not have any of these problems and allows some mathematical reasoning due to its pure nature.

There are even languages like Coq that never get stuck in infinite loops and provide the means to prove automatically verified theorems about programs.

How about C++?

C++ is way worse. It might not be a superset of C, but it has all of its problems and then some.

Fast Track into Depression

Our brief journey ends here, with many dark corners left uncharted.

Hopefully this was not too depressing, because the situation is significantly worse when

floating-point numbers,
concurrency or
interfacing with foreign code

is needed. Luckily that only happens in science, engineering, statistics, finance, medicine…

In conclusion, the choice of language matters and has a huge impact on security!