Node:Bounds Checking

Bounds Checking

The C part of the GNU compiler now supports full fine-grained pointer checking at runtime. This work was originally done by Richard W.M. Jones `rjones@orchestream.com', and has been extended by the work of many other kind contributors (see footnote 1).

The runtime checking library, test kit and various tools can be found in the bounds/ subdirectory.

The brief manual here is a distillation of the original paper, which appeared at the same time as the original patches to GCC. The paper describes the inner workings of bounds checking GCC in more detail and can be found in PostScript format in bounds/report/bcrep2.ps.gz.


Node:Compiling with checks, Next:, Up:Bounds Checking

Compiling with checks

To compile all or part of your program with bounds checking, simply add the -fbounds-checking flag when compiling and linking. In the simplest instance, you might do:

gcc -fbounds-checking program.c -o program

Or, linking several checked files together:

gcc -fbounds-checking -c file1.c -o file1.o
gcc -fbounds-checking -c file2.c -o file2.o
gcc -fbounds-checking -c file3.c -o file3.o
gcc -fbounds-checking file1.o file2.o file3.o -o program

If your program uses a Makefile, you will probably only need to add the -fbounds-checking flag to CFLAGS, and remake the program from scratch.


Node:Incompatibilities with checking, Next:, Previous:Compiling with checks, Up:Bounds Checking

Incompatibilities with checking


Node:Unchecked code and libraries, Next:, Previous:Incompatibilities with checking, Up:Bounds Checking

Unchecked code and libraries

You can normally freely mix unchecked and checked code. This is why you don't need to make any changes to your C or X11 libraries when you install GCC with bounds checking. The checking library will detect code compiled with and without checking automagically, and let the two run together. You can mix unchecked object files with checked ones for the same reason. Always pass the -fbounds-checking flag to the link stage.

gcc -fbounds-checking -c file1.c -o file1.o
gcc -c unchecked.c -o unchecked.o
gcc -fbounds-checking file1.o unchecked.o -o program

The checking library will usually only know about variables that are declared in checked code, and about memory allocated with malloc. So if a variable is declared in unchecked.c above, then references to it will not be checked, even when these references occur in checked code.

Say that file unchecked.c contains the following code:

int a[10];

int *get_ptr_to_a () { return a; }

and file file1.c contains:

extern int *get_ptr_to_a ();

main ()
{
  int *ptr_to_a = get_ptr_to_a ();
  int i;

  for (i = 0; i < 20; ++i) ptr_to_a[i] = 0;
}

The references to ptr_to_a will not be checked. You can resolve this by adding the array a to the tree of objects known to the checking library, either by hand or semi-automatically. See Unchecked objects.

If you place extern int a[10]; anywhere in file1.c, bounds checking GCC will also be able to find and check the array references properly.

If you include bounds/run-includes/unchecked.h, you get facilities to turn bounds checking on and off over short stretches of code and within single expressions and statements. Even when bounds checking is switched off, you may still use these features. The macros are silently ignored if bounds checking is off, or if the compiler is not GCC.


Node:Debugging with GDB, Next:, Previous:Unchecked code and libraries, Up:Bounds Checking

Debugging with GDB

If you have GDB (or another debugger) on your system, you will be able to debug bounds checked programs easily and efficiently. To help you catch bounds errors before the program aborts (which sometimes causes the program's stack to disappear), place a breakpoint at __bounds_breakpoint. The checking library always calls this function before aborting. If the -never-fatal flag has been supplied (see Environment at runtime), you will need to place this breakpoint, since the program does not abort when it hits a bounds error.


Node:Environment at runtime, Next:, Previous:Debugging with GDB, Up:Bounds Checking

Environment at runtime

You can customize the way a bounds-checked program runs by passing options to it in the environment variable `GCC_BOUNDS_OPTS'. For instance, suppose you don't want the banner message that appears when bounds checked programs start up. With sh or ksh, you might type:

% GCC_BOUNDS_OPTS='-no-message' program

With csh:

% setenv GCC_BOUNDS_OPTS '-no-message'; program

You can put any combination of the following flags in GCC_BOUNDS_OPTS. Place spaces or tabs between each flag.

-no-message
Don't print the introductory message.
-no-statistics
Don't print library call statistics when the program quits.
-?, -help
Print this list of options, then quit the program before it starts.
-reuse-heap (*)
-reuse-age=<age>
-no-reuse-heap
See the discussion of heap memory in Managing the heap.
-warn-unchecked-statics
-no-warn-unchecked-statics (*)
-warn-unchecked-stack
-no-warn-unchecked-stack (*)
See the discussion of unchecked objects in Unchecked objects.
-warn-free-null (*)
-no-warn-free-null
Give a warning if free (0) is called. Note that free (0) is legal in ANSI C, and some libraries, notably X11, do it quite often.
-warn-misc-strings (*)
-no-warn-misc-strings
Miscellaneous warnings with str* and mem* functions, such as trying to call memcpy with size = 0.
-warn-illegal
-no-warn-illegal (*)
Warn when ILLEGAL pointers are generated. These patches, provided by Don Lewis <gdonl@gv.ssi1.com>, help to track down ILLEGAL pointer errors when they happen.
-warn-unaligned (*)
-no-warn-unaligned
Warn when a pointer is used in an unaligned manner, for instance reading integer data as chars. This warning is turned on by default, but may be disabled, since some programs do this quite harmlessly. These patches were suggested by Stuart Kemp and Eberhard Mattes.
-warn-all
Turn on all of the warnings above.
-array-index-check (*)
-no-array-index-check
Check 2D array indices correctly. This is turned on by default. Only arrays with size > 1 are checked this way, to avoid problems with structure hack definitions. See Checking 2D array indices.
-never-fatal
Normally the library will call abort() after it detects the first bounds error. If this flag is given, the library attempts to proceed. The first error may generate more errors itself afterwards, so only the first error is guaranteed to be correct.
-print-calls
-no-print-calls (*)
Print calls to the checking library. This option is only useful if you want to debug bounds checking GCC itself.
-print-heap
-print-heap-long
-no-print-heap (*)
Print the contents of the heap at program end. This is useful to find memory leaks in your program. -print-heap shows the total leaked memory for each malloc call, while -print-heap-long shows each leaked allocation in detail.

Items marked with a `(*)' are the default.
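
Since the flags are simply tokens separated by spaces or tabs, the splitting rule can be illustrated with a minimal sketch in C. The names count_bounds_flags and has_bounds_flag are hypothetical, invented for this illustration; this is not the checking library's own option parser, which may well behave differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the checking library: returns
   nonzero if C separates flags (spaces or tabs, as described above).  */
static int
is_flag_separator (char c)
{
  return c == ' ' || c == '\t';
}

/* Count the flags in an option string such as the value of
   GCC_BOUNDS_OPTS.  */
int
count_bounds_flags (const char *opts)
{
  int count = 0;

  while (*opts != '\0')
    {
      while (is_flag_separator (*opts))
        ++opts;
      if (*opts == '\0')
        break;
      ++count;
      while (*opts != '\0' && !is_flag_separator (*opts))
        ++opts;
    }
  return count;
}

/* Return 1 if FLAG appears as a complete token in OPTS, else 0.  */
int
has_bounds_flag (const char *opts, const char *flag)
{
  size_t flag_len = strlen (flag);

  while (*opts != '\0')
    {
      const char *start;

      while (is_flag_separator (*opts))
        ++opts;
      start = opts;
      while (*opts != '\0' && !is_flag_separator (*opts))
        ++opts;
      if (flag_len > 0
          && (size_t) (opts - start) == flag_len
          && strncmp (start, flag, flag_len) == 0)
        return 1;
    }
  return 0;
}
```

Matching whole tokens matters here because some flags are prefixes of others: -print-heap must not match inside -print-heap-long. A caller reading the real environment would have to check the result of getenv for NULL before passing it in.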


Node:Managing the heap, Next:, Previous:Environment at runtime, Up:Bounds Checking

Managing the heap

The bounds checking library includes a customized version of the GNU malloc library. Calls to malloc, free, realloc, calloc, cfree, valloc and memalign are checked. You will get a bounds error if you try to:

There are several strategies for tracking stale memory pointers. Ideally, we would like to never reuse virtual memory after the programmer has freed it, so that we will always be able to detect a stale pointer, no matter how long the program runs before using it. If you want this behaviour, pass the -no-reuse-heap option in `GCC_BOUNDS_OPTS' (see Environment at runtime and footnote 2).

In practice, we found this technique to be wasteful, so the default is to reuse heap memory immediately. However, in order to provide some protection against stale pointers, you may pass the -reuse-age=<age> option to the library. This will add freed blocks to a queue of pending blocks. You must call free <age> times before the block is actually reused.
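
The pending-block queue described above can be pictured with a small sketch in C. Everything here is an assumption made for illustration: the names pending_queue, pending_init, pending_push and delayed_free, the fixed capacity MAX_REUSE_AGE, and the ring-buffer layout are hypothetical and are not taken from the checking library, whose real implementation lives inside its customized malloc.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_REUSE_AGE 64   /* arbitrary capacity chosen for this sketch */

/* Hypothetical FIFO of freed blocks that may not be reused yet.  */
struct pending_queue
{
  void *blocks[MAX_REUSE_AGE];
  size_t head;    /* index of the oldest pending block */
  size_t count;   /* number of pending blocks */
  size_t age;     /* frees a block must wait; at most MAX_REUSE_AGE */
};

void
pending_init (struct pending_queue *q, size_t age)
{
  q->head = 0;
  q->count = 0;
  q->age = age > MAX_REUSE_AGE ? MAX_REUSE_AGE : age;
}

/* Record that BLOCK was freed.  Returns the block that may now be
   reused, or NULL if every queued block is still waiting.  */
void *
pending_push (struct pending_queue *q, void *block)
{
  void *released = NULL;

  if (q->age == 0)
    return block;               /* age 0: reusable immediately */

  if (q->count == q->age)
    {
      /* Queue full: the oldest block has now waited AGE frees.  */
      released = q->blocks[q->head];
      q->head = (q->head + 1) % q->age;
      --q->count;
    }
  q->blocks[(q->head + q->count) % q->age] = block;
  ++q->count;
  return released;
}

/* Wrapper that delays the real free () by AGE calls.  */
void
delayed_free (struct pending_queue *q, void *block)
{
  void *released = pending_push (q, block);

  if (released != NULL)
    free (released);
}
```

With an age of 2, the first block passed to delayed_free is only handed to free on the third call, so a stale pointer to it stays detectable for two further frees. The cost is that up to <age> blocks are held back from the allocator at any time.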

Notice that the most common error is:

free_list (list *p)
{
  for (; p != NULL; p = p->next)
    free (p);
}

The default flags, -reuse-heap -reuse-age=0, will catch this error.
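
The bug in the loop above is that p->next is read after p has been freed. A conventional repair, sketched here in C, saves the link before releasing the node. The manual does not show the definition of list, so the struct layout below is an assumption, and make_list and the returned count are hypothetical additions that exist only to make the sketch easy to exercise.

```c
#include <stdlib.h>

/* Assumed layout: the manual does not show the definition of `list'.  */
typedef struct list
{
  struct list *next;
  int value;
} list;

/* Free every node, reading p->next before the node is released.
   Returns the number of nodes freed.  */
int
free_list (list *p)
{
  int freed = 0;

  while (p != NULL)
    {
      list *next = p->next;   /* save the link while P is still valid */
      free (p);
      p = next;
      ++freed;
    }
  return freed;
}

/* Hypothetical helper: build a list of N nodes.  Returns NULL if N is
   0 or if an allocation fails.  */
list *
make_list (int n)
{
  list *head = NULL;
  int i;

  for (i = 0; i < n; ++i)
    {
      list *node = malloc (sizeof *node);

      if (node == NULL)
        {
          free_list (head);
          return NULL;
        }
      node->value = i;
      node->next = head;
      head = node;
    }
  return head;
}
```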


Node:Unchecked objects, Next:, Previous:Managing the heap, Up:Bounds Checking

Unchecked objects

Variables declared in files that are not compiled with -fbounds-checking are not normally known to the checking library. Pointers to these variables are not checked, even where the operations on these pointers happen within checked code. To be sure that your program is running without any errors, you should turn on warnings about unchecked operations by giving the -warn-unchecked-statics and/or -warn-unchecked-stack flags at runtime. See Environment at runtime.

To avoid these warnings, and check all operations, you should take steps to add these objects to the tree used by the checking library. There are three approaches:


Node:Miscellaneous features, Next:, Previous:Unchecked objects, Up:Bounds Checking

Miscellaneous features


Node:Checking 2D array indices, Next:, Previous:Miscellaneous features, Up:Bounds Checking

Checking 2D array indices

2D arrays (and, indeed, n-D arrays with n >= 2) are checked as you might expect. We consider such arrays to be flattened before checking. For instance a mathematical 3x3-matrix A might be defined as:

double A[3][3];

When -no-array-index-check is active, we consider such an array as flattened: bounds checking treats it as a flat array with 9 elements. So it is perfectly sound to write A[1][4], since 1*3+4 == 7, and 0 <= 7 < 9. Similarly, A[0][8] and A[2][-1] will not generate bounds errors. (Interestingly, though, errors in the first index will be caught -- this is to do with a subtlety in the way bounds checking works.)
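
The arithmetic behind the flattened model can be written out as a short sketch in C. The function names are hypothetical and this is not the checking library's code; in particular, it does not model the subtlety by which errors in the first index are still caught. For contrast, in_each_dimension shows the stricter test in which every index is compared with its own extent.

```c
/* Row-major offset of A[i][j] in an array with COLS columns.  */
int
flat_index (int i, int j, int cols)
{
  return i * cols + j;
}

/* Flattened model: only the combined offset is compared with the
   total number of elements.  */
int
in_flat_bounds (int i, int j, int rows, int cols)
{
  int k = flat_index (i, j, cols);

  return k >= 0 && k < rows * cols;
}

/* Per-dimension model: each index must lie within its own extent.  */
int
in_each_dimension (int i, int j, int rows, int cols)
{
  return i >= 0 && i < rows && j >= 0 && j < cols;
}
```

For the 3x3 matrix A, in_flat_bounds accepts (1, 4), (0, 8) and (2, -1), which map to offsets 7, 8 and 5, while in_each_dimension rejects all three.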

When -array-index-check is present, every dimension of the array is checked separately. This is the default. The flag only applies to arrays with size > 1; for arrays with size 1 or 0 the flattened model is used. This allows the following code to work correctly.

struct _string_t
{
  int len;
  char str[1];
};
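
The manual does not show how such a structure is used. Assuming the traditional "structure hack", where the object is over-allocated with malloc and str is indexed beyond its declared size of 1, a minimal sketch in C might look like the following. The constructor name string_new is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

struct _string_t
{
  int len;
  char str[1];
};

/* Hypothetical constructor: allocate the header plus room for TEXT.
   The declared str[1] already provides the byte for the terminating
   NUL, so only strlen (TEXT) extra bytes are requested.  */
struct _string_t *
string_new (const char *text)
{
  size_t n = strlen (text);
  struct _string_t *s = malloc (sizeof *s + n);

  if (s == NULL)
    return NULL;
  s->len = (int) n;
  memcpy (s->str, text, n + 1);   /* writes past str[0] on purpose */
  return s;
}
```

If str were checked per dimension against its declared size of 1, every access to s->str[i] with i > 0 would be reported, even though the access stays inside the block returned by malloc.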


Node:What errors are caught, Next:, Previous:Checking 2D array indices, Up:Bounds Checking

What errors are caught

A lot of people tell me that they have Purify, and bounds checking GCC seems unnecessary, since it seems to duplicate Purify but more slowly. Well, there are important reasons why bounds checking GCC is better than Purify, and if you rely on Purify alone, you will certainly miss bugs in your program.

This is what bounds checking GCC will find, which Purify won't:

This is what Purify will find, which bounds checking GCC won't:

There is a freeware program which emulates Purify available from Tristan Gingold <gingold@amoco.saclay.cea.fr>. It only runs under Linux. Purify only works on Sun SPARCstations and HP-PA machines, and, of course, costs lots of cash.


Node:Performance, Next:, Previous:What errors are caught, Up:Bounds Checking

Performance

This page is under construction.


Node:Stubborn bugs, Next:, Previous:Performance, Up:Bounds Checking

Stubborn bugs

The very latest list of bugs can be found in bounds/BUGS. This is a list of some of the most stubborn bugs, some of which have been around since the first version. Please send bug reports and (even better) bug fixes to `rjones@orchestream.com' or `Haj.Ten.Brugge@net.HCC.nl'.


Node:Using G77 with bounds checking, Previous:Stubborn bugs, Up:Bounds Checking

Using G77 with bounds checking

Bounds checking patches break the current G77 patches. You can get round this very easily. Copy cp/bounds.c into the f/ subdirectory. Alter f/Makefile.in so that it compiles bounds.c along with the other G77 object files.

Notice that this doesn't add bounds checking to FORTRAN (:-<); it just lets you compile it.


Footnotes

  1. See the file bounds/CONTRIBUTORS for a full list of the people who gave their effort for free to make all this possible.

  2. In a future version of bounds checking GCC, we will be able to unmap this memory. Thus the operating system will be able to reuse physical memory, whilst virtual memory addresses remain unused.