DragonFly BSD
DragonFly bugs List (threaded) for 2005-01
[Date Prev][Date Next]  [Thread Prev][Thread Next]  [Date Index][Thread Index]

Pretty please: no more lower-case macros !!!


From: Adrian Bocaniciu <a.bocaniciu@xxxxxxxxxxxx>
Date: Thu, 06 Jan 2005 09:00:50 +0000
Cc: Matthew Dillon <dillon@xxxxxxxxxxxxxxxxxxxx>

As I have already mentioned in a follow-up to my recent bug report, the essential ports net/quagga and net/zebra have been totally broken by the use of lower-case macros in the DragonFly system headers.

There is an old C tradition that the identifiers used in macro definitions must be written entirely with capital letters and that no other identifiers should be written like that. This tradition has not been caused by esthetic reasons but by the need to separate the name spaces for macros and for other identifiers.

The way in which preprocessing is implemented in C dictates that this name space separation is mandatory, otherwise it is not possible to guarantee the correctness of any C program. C is neither Lisp nor Scheme nor a language that uses a special marker, e.g. $, for macro expansions, where the name space overlap does not cause any troubles.

When you have more than 10,000 ports, written by more than 100,000 programmers, one can be certain that almost every possible alphanumeric string, which is not too long, has already been used by someone, somewhere, as the name of a local variable. Therefore, it is truly naive to believe that if a lower-case macro definition is inserted in a system header used by all these ports, then no name collision will happen.

The following pathological example, which is suitable for a C textbook, is not made up, it is extracted from the current DragonFly sources :-)

Let's say that programmer Unlucky, which contributes code to the port #13013, writes the following code fragment:

. ..
#include <some_system_header.h>
. ..
{
. ..
struct s1 var1;
struct s2 var2;
struct s3 var3;
. ..
memset(& var2, MAGIC_VALUE, sizeof var2);
. ..
}
. ..

After testing his program, Unlucky declares proudly that his code is foolproof. A thousand auditors read his source code and they also conclude that the code is foolproof.

Then comes DragonFly and the following macro definition is inserted in some_system_header.h:

#define var2 foo[0]

When the port is recompiled, the compiler will allocate for var2 an array of null length, whose address will be thus the same with that of one of the adjacent variables. However, in the sizeof context, foo[0] will be an array component of type struct s2, therefore the compiler will translate the memset invocation as it would have been:

memset(& var1, MAGIC_VALUE, sizeof(struct s2));

Thus, unexpectedly, var1 will be destroyed, resulting in more or less spectacular crashes. Moreover, if struct s2 is large enough, the memset invocation will also destroy other parts of the stack, including the return address, leading to even more spectacular crashes. Even more perverted examples can be easily constructed.

One solution to avoid the name collisions would be to precede each program block with "#undef" directives for all its local identifiers. It is obvious that such a stupid solution is not acceptable. The traditional convention about the use of capital letters to enforce the name space separation is much more convenient.

In the case of net/route.h, where I first encountered such macros, they were used to provide alternate names for some structure members. That can be done in a perfectly safe way without macros, by using anonymous unions. Even if, for some very stupid reason (as the named unions were really a mistake in the original C design) the anonymous unions have not made their way yet into the official C standard, gcc and most other C compilers support anonymous unions also in C, not only in C++.

Best regards !



[Date Prev][Date Next]  [Thread Prev][Thread Next]  [Date Index][Thread Index]