Noeud:What Gperf is, Noeud « Next »:, Noeud « Previous »:Looking for Keywords, Noeud « Up »:Scanning with Gperf



What Gperf is

Gperf is a generator of small and fast recognizers for compile-time fixed sets of keywords. Unless specified differently through command line options, the result is C code consisting of a static hash table and a hash function optimized for a given set of keywords. There is no restriction on the use of its output, v.g., no license is imposed on its output.

This recognizer is typically used to discriminate reserved words from identifiers in compilers. The GNU Compiler Collection to GCC (FIXME: ref to GCC.), GCC, uses it at least for Ada, C, C++, Chill, Java, Modula 2, Modula 3, Objective C, and Pascal. GNU Indent also bases its keyword recognition on Gperf, but its uses are not limited to programming languages. For instance makeinfo, the Texinfo to Info translator, uses it to recognize its directives. Finally, it proves itself useful for handling sets of options, especially because it provides a very convenient interface --performance is less likely to matter. For instance GNU a2ps uses Gperf to read its configuration files, GNU Libiconv, the GNU character set conversion library, uses Gperf to resolve more than three hundred aliases such as 11 different names for ASCII.

A hash function is a fast function which maps any member of a k element user-specified keyword set K onto an integer range 0..n - 1. The result integer, called hash code, is then used as an index into an n element table containing the keywords, sorted according to their hash code.

A hash function is perfect if no two keywords have the same hash code, in other words if the result hash table is collision-free. This means that at runtime time cost is reduced to the minimum: examining a single string at runtime suffices. For perfect hash table, we have n >= k. A hash function is minimal when its space cost is reduced to the minimum: n = k.

Usually time optimality is more important than space optimality, and fortunately it is easier to generate perfect hash functions than minimal perfect hash functions. Moreover, non-minimal perfect hash functions frequently execute faster than minimal ones in practice. This phenomenon occurs since searching a sparse keyword table increases the probability of locating a "null" entry, thereby reducing string comparisons. gperf's default behavior generates near-minimal perfect hash functions for keyword sets. However, gperf provides many options that permit user control over the degree of minimality and perfection.