Without a working compiler we cannot create binaries. But a working compiler is itself a binary. So how do we cope with this "chicken and egg" problem?
In fact there is no way around it: we must have a minimal set of binaries that allows us to create other binaries. But how minimal should this set be? If you have tried Gentoo Linux, you probably noticed that it comes with a 10 MB tarball containing the binaries of a minimal operating system, plus many gigabytes of (mostly) source code packages. The binary image provides the basic build environment for the rest of the distribution. After unpacking, it expands to 50 MB of binaries.
This solution is nice, but I consider 50 MB too much for a build environment. An optimizing compiler that supports a plethora of language tricks (as found in the Gentoo basic build environment) is not needed to bootstrap an operating system build, because we already have the sources. So I chose a different solution.
The P programming language is the base programming language that every ESPM compiler must understand. It is derived from C/C++ and, though fairly simple, it is powerful enough to express a complete optimizing compiler. The language is so simple that a "stupid but complete" bootstrapping compiler can be only a few hundred kilobytes in size. This bootstrapping compiler is the one thing needed to bootstrap the whole system; compare this to the 50 MB of the Gentoo basic build system.
The optimizing compiler is written in P and is first compiled by the "stupid" bootstrapping compiler. The resulting binary generates optimized code, but it is itself slow, because the bootstrapping compiler that produced it does not optimize. So the next step is to compile the same optimizing compiler again, this time with the optimizing compiler binary just built. This produces a second binary of the optimizing compiler, an optimized one this time. Once this optimized compiler exists, it can be used to compile the rest of the system.
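The two-stage build can be illustrated with a small Python model. This is only a sketch of the idea, not part of ESPM: the class names, the stage labels and the file names (`optimizing-compiler.p`, `shell.p`) are hypothetical, and "compiling" is reduced to recording which compiler produced a binary and whether that binary's code is optimized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binary:
    program: str     # source the binary was built from (hypothetical names)
    optimized: bool  # True if the binary's own machine code is optimized
    built_by: str    # label of the compiler that produced it


@dataclass(frozen=True)
class Compiler:
    label: str
    binary: Binary          # the compiler's own executable
    emits_optimized: bool   # True if the code it generates is optimized

    def compile(self, source: str) -> Binary:
        # The quality of the output depends on the compiler's logic,
        # not on how fast the compiler itself runs.
        return Binary(source, self.emits_optimized, self.label)


def bootstrap():
    # Stage 0: the small prebuilt bootstrapping compiler, the only
    # binary we start with. It does not optimize.
    stage0 = Compiler(
        "stage0",
        Binary("bootstrap-compiler", optimized=False, built_by="prebuilt"),
        emits_optimized=False,
    )

    # Stage 1: the optimizing compiler's source built by stage 0.
    # It emits optimized code, but its own binary is unoptimized (slow).
    stage1 = Compiler(
        "stage1",
        stage0.compile("optimizing-compiler.p"),
        emits_optimized=True,
    )

    # Stage 2: the same source rebuilt by stage 1.
    # It emits optimized code and is itself optimized (fast).
    stage2 = Compiler(
        "stage2",
        stage1.compile("optimizing-compiler.p"),
        emits_optimized=True,
    )
    return stage0, stage1, stage2


if __name__ == "__main__":
    for stage in bootstrap():
        print(stage.label, "own binary optimized:", stage.binary.optimized)
```

In this model, the stage 2 compiler is the one that would go on to build the rest of the system, for example `stage2.compile("shell.p")`.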
Other tools such as the shell, the linker, etc. must also be written in P so that they can be built by the bootstrapping compiler.