The big design hole of software build systems

In the previous step we saw this hole. The make utility does not cope well with frequent code redesigns. Every time a module, or even a reference to a module, is added to or deleted from the sources, the developer must tinker with the makefile to keep the build system informed about the current dependencies between the program files. And the worst thing is that if he forgets to do so, nothing will tell him. Instead he gets strange behavior which I called "ghosts of bugs": manifestations of bugs that were fixed long ago, which reappear because the misinformed build system fails to recompile the affected module.
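
To make the ghost effect concrete, here is a minimal sketch in C of the timestamp rule that a make-style tool applies. It is a simplification under stated assumptions, not the real implementation of make: the function name needs_rebuild and its parameters are made up for illustration, and details such as a target that does not exist yet are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper, for illustration only.
 * Timestamp rule in the style of make: a target is stale when any
 * dependency that the build script LISTS is newer than the target.
 * Dependencies missing from the list are never examined.
 */
int needs_rebuild(time_t target_mtime, const time_t *listed_deps, size_t count)
{
    assert(listed_deps != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (listed_deps[i] > target_mtime)
            return 1;   /* a listed dependency changed after the last build */
    }
    return 0;           /* nothing listed is newer: the target is left alone */
}
```

Suppose hello.o was built at time 200 and the makefile lists only hello.c, last modified at time 100. The function returns 0 even if banner.h was changed at time 300, because banner.h was never listed, so the stale hello.o survives. Once the 300 is added to the array, the result becomes 1.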

Our newbie Jim got tired at this point, soon enough to avoid seeing the other aspects of this big hole: broken library systems, problems determining what to install and where, the lack of a library dependency tracking system, and other problems. It seems that everybody out there who is making useful code is painstaking enough to maintain huge scripts for the ancient build system still in use, and to cope with the nasty bugs that come from inconsistencies between these build scripts and the source code.

But I'm not painstaking, I'm lazy. Very lazy. So lazy that, rather than tinkering with huge scripts and constantly removing bugs from them, I began to dream about a better build system for OSHS. And this build system of my dreams also comes with its own software integration language.

About this big hole

You might be wondering now whether I'm the only one who sees this hole, and why nobody has done anything about it. After all, everything that is needed to fix this big hole is to have the compiler write the list of the referenced .h files somewhere aside, so that the build system can automatically decide what to build next and finally what to link together to produce a working binary.

The short answer to these questions is "broken standard". This broken standard prevents anyone from implementing the simple solution I suggested in the previous paragraph. To see how, we need to look at the design of the C compiler that is required by the standard.

As said before, C source code is compiled in two steps. First the file is fed into the C preprocessor, which is responsible for interpreting directives like #include in the file. The output of the C preprocessor is what actually enters the compilation phase.

To see an example we will use the files hello.c, banner.h and banner.c from the modularization example. First produce the banner.o file so we don't have to bother with it anymore:

$ gcc -c banner.c
$ _

Now we can use the -E flag to see what exactly enters the compiler. The -E flag tells gcc to send the preprocessed source to stdout instead of passing it on to the compiler:

$ gcc -E hello.c
# 1 "hello.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "hello.c"
# 1 "banner.h" 1
void NewLine(void);
void GenerateBanner(char *Str);
# 2 "hello.c" 2

int main(void)
{
  NewLine();
  GenerateBanner("Hello, world !");
  NewLine();
  return(0);
}
$ _
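
The lines starting with # in the output above are markers that record which file the following text came from. As a rough sketch of how a tool could pick the file names out of them, here is a small C function. The name linemarker_file is my own invention, and the function assumes the simple marker shape seen above; it does not handle escaped quote characters inside file names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 * Look at one line of preprocessed output. If it is a marker of the form
 *     # <line-number> "<file-name>" [flags]
 * copy the file name into out (cap bytes available, NUL included)
 * and return 1. Return 0 for any other line.
 */
int linemarker_file(const char *line, char *out, size_t cap)
{
    assert(line != NULL && out != NULL && cap > 0);

    if (line[0] != '#' || line[1] != ' ')
        return 0;

    char *end;
    (void)strtol(line + 2, &end, 10);        /* skip the line number */
    if (end == line + 2 || end[0] != ' ' || end[1] != '"')
        return 0;                            /* no number or no quoted name */

    const char *name = end + 2;
    size_t len = 0;
    while (name[len] != '\0' && name[len] != '"')
        len++;
    if (name[len] != '"' || len >= cap)
        return 0;                            /* unterminated or too long */

    memcpy(out, name, len);
    out[len] = '\0';
    return 1;
}
```

Fed line by line with the output above, it reports hello.c, <built-in>, <command-line>, hello.c, banner.h and hello.c again, and it returns 0 for the ordinary source lines. A caller interested only in real files would skip the names that start with <.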

As you can see, the content of banner.h appears in the resulting file along with an indication of where it came from (the # 1 "banner.h" 1 line above). However, this origin information is not used to determine that the banner module is needed for the build; it is only used by the compiler to report the exact location of any errors it finds in the file. The reason is that the standard requires it. More specifically, the standard says that the following example must also work (assuming we have the banner.o file already compiled):

$ cat hello1.c
void NewLine(void);
void GenerateBanner(char *Str);

int main(void)
{
  NewLine();
  GenerateBanner("Hello, world !");
  NewLine();
  return(0);
}
$ gcc -c hello1.c
$ gcc -o hello hello1.o banner.o
$ _

This new file hello1.c does not #include the header file banner.h but embeds its content (the first two lines of the file) directly. The C standard says that this must work exactly like the previous file with the #include directive.