Compiling for old glibc - the story of fscanf

An exploit in C

A while ago I tried to compile a local privesc exploit targeting an old Linux machine. It built fine on my Kali VM, but when I ran it on the target the dynamic linker complained that the binary needed at least glibc 2.7. The installed glibc was a few versions behind.

I had a look at the binary's dynamic symbol table and saw that it was importing the symbol __isoc99_fscanf, versioned GLIBC_2.7.

# objdump -T exploit | grep GLIBC_2.7
00000000      DF *UND*	00000000  GLIBC_2.7   __isoc99_fscanf

The exploit source code uses plain old fscanf, which should be available in all versions of libc. I did some Googling and found a Red Hat bug report that claims this can be fixed by defining the _GNU_SOURCE macro. Indeed it can:

# gcc 33321.c -o exploit_fix -D_GNU_SOURCE
# objdump -T exploit_fix  | grep fscanf
00000000      DF *UND*	00000000  GLIBC_2.0   fscanf

Problem solved. (Well, mostly: I didn’t get the exploit to work.) But I was curious and made a note to find out later why this works.

Conformance woes

The problem stems from a conflict with the C99 standard. Back in the day, glibc introduced a non-standard extension to the scanf functions. If you were extracting string content and didn’t want to prepare your own buffer space, you could use the %a modifier. scanf would then malloc a buffer of sufficient size automatically and store the pointer in your argument. You would write conversions in your format string like %as or %a[a-zA-Z0-9]. Life was good provided you were on a GNU system.

Then ISO C99 came along. It introduced a different meaning for %a: the input should be parsed as a floating-point number. Under this interpretation %as would be “a floating-point number followed by the literal letter s”, whereas glibc would put the entire string into a newly allocated buffer and return the pointer. This inconsistency spawned the fairly epic Debian bug #155835, which began all the way back in 2002.

Ultimately the situation was resolved in glibc 2.7. According to the changelog, a redirect to the ISO C99 version was added on 2007-09-17. Unless __USE_GNU is set (the internal macro that features.h defines when you define _GNU_SOURCE), a call to fscanf compiled in C99 or POSIX mode is redirected to the ISO implementation, __isoc99_fscanf:

#if defined __USE_ISOC99 && !defined __USE_GNU \
	&& (!defined __LDBL_COMPAT || !defined __REDIRECT) \
	&& (defined __STRICT_ANSI__ || defined __USE_XOPEN2K)
# ifdef __REDIRECT
/* For strict ISO C99 or POSIX compliance disallow %as, %aS and %a[
GNU extension which conflicts with valid %a followed by letter
s, S or [.  */
extern int __REDIRECT (fscanf, (FILE *__restrict __stream,
				const char *__restrict __format, ...),
				__isoc99_fscanf) __wur;
...

Separately, POSIX.1-2008 introduced the %m modifier to do the same job as the old %a. It’s slightly more flexible and was deliberately chosen not to conflict with the ISO %a. According to the fscanf man page, glibc has supported it since version 2.7 as well. This appears to be the state of the art if you want auto-allocating buffers in C code.

Now the original problem is clear. When the exploit was written, using fscanf gave you the GNU implementation. A few years later the same source compiled against the ISO implementation instead, even though it makes no difference for this particular code. The __isoc99_fscanf symbol does not exist before glibc 2.7, so the binary ended up requiring at least that version. Using the -D_GNU_SOURCE flag explicitly opts us in to the GNU extensions. The GNU implementation uses the symbol fscanf and is backwards compatible all the way to glibc 2.0. This works fine on my old Linux target.