Plan 9 and Inferno at the Google Summer of Code

On Ken

The Plan 9 compilers present an interesting challenge in porting. Because Plan 9 was, and is, a research operating system, its architects were free to throw away convention and implement things in the best manner they were able. For the compilers, this has manifested in a number of ways. The most visible to users is that ANSI C has been both extended and restricted, in ways which Plan 9 programmers generally find agreeable. But for the compiler architect and porter, there are larger changes which present obstacles to porting.

To begin with, the system does not use the standard stack semantics of any given architecture. This used to read that, on Plan 9, the stack always grows away from 0, as I was misled to believe. Actually, on all architectures on Plan 9, the stack grows towards the lower address range, i.e. towards 0. Arguments and local variables are accessed at known offsets from the stack pointer, and the base register remains free for the compiler to use. While there is much to appreciate about this convention, a compiler that is to interoperate with code generated by the native Linux and BSD compilers must follow their stack and calling conventions instead. This may seem like a simple problem to solve, but there are exacerbating factors. For one thing, the compiler loses a register from its normal complement.
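To make the stack-pointer-relative convention concrete, here is a minimal sketch in C of a toy machine whose stack grows towards 0 and whose arguments and locals are all reached at fixed offsets from the stack pointer, with no base register involved. Everything in it (Machine, push, enter, leave, slot, call_add) is a hypothetical name invented for illustration. It is not code from 8c, it does not model return addresses, and the real frame layout may well differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { STACK_WORDS = 64 };

/* Toy machine (hypothetical): a word-addressed stack that grows towards index 0. */
typedef struct {
    int32_t mem[STACK_WORDS];
    size_t  sp;              /* index of the lowest word in use */
} Machine;

/* An empty stack has sp one past the highest word. */
void machine_init(Machine *m) { m->sp = STACK_WORDS; }

/* Pushing moves sp towards 0, then stores the value there. */
void push(Machine *m, int32_t v) {
    assert(m->sp > 0);
    m->mem[--m->sp] = v;
}

/* Reserve and release space for locals by adjusting sp only. */
void enter(Machine *m, size_t nlocals) {
    assert(m->sp >= nlocals);
    m->sp -= nlocals;
}

void leave(Machine *m, size_t nlocals) { m->sp += nlocals; }

/* Every argument and local is named by a fixed offset from sp. */
int32_t *slot(Machine *m, size_t off) {
    assert(m->sp + off < STACK_WORDS);
    return &m->mem[m->sp + off];
}

/* Simulated call of add(a, b) with one local for the result. */
int32_t call_add(Machine *m, int32_t a, int32_t b) {
    size_t sp_before = m->sp;

    push(m, b);              /* caller pushes arguments, last one first */
    push(m, a);
    enter(m, 1);             /* callee reserves one local */

    /* Layout seen from sp: [0] = local, [1] = a, [2] = b. */
    *slot(m, 0) = *slot(m, 1) + *slot(m, 2);
    int32_t result = *slot(m, 0);

    leave(m, 1);             /* drop the local */
    m->sp += 2;              /* drop the two arguments */
    assert(m->sp == sp_before);
    return result;
}
```

After machine_init, a call such as call_add(&m, 2, 3) adds the two simulated arguments and leaves the stack pointer where it started. Because the offsets are known when the code is generated, nothing here needs a frame pointer, and that is the register a native x86 convention would take back.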

So, how do we deal with this? Well, to begin with, it helps to have a working compiler. The idea is to initially link together only code generated by 8c, and load it into memory with a special-purpose loader, which preloads a function that maps calls made to it using the Plan 9 convention into system calls using the BSD convention. Once this works, we begin transforming the calling convention, while trying to maintain a working compiler. Initially, this means freeing EBP (the aforementioned base pointer register) for use in stack frame generation. From there, work progresses on achieving compatible calling conventions, while maintaining a working compiler all the way.

But that’s not the only area where the compilers diverge from convention. The C compiler itself handles, in one program, the preprocessing, lexing, machine code generation, and initial optimization, though it is clearly divided into machine-independent and machine-dependent parts. The objects that it outputs are basically compressed, binary representations of assembly language. The loader processes these objects into machine code, subjecting code from the compilers and assemblers alike to global optimization in the process. This means that the compilers and loaders are intrinsically linked: the two must be used together. The main problem with this is that the design of the loader is not sympathetic to the cause of loading standard object files.

There are several ways to deal with this problem, and it’s arguable which way is best. The first option would be to modify 8l to actually link ELF object files. This is not a particularly pleasant prospect, and is likely to yield few benefits. The second is to have 8l output a relocatable ELF object to be linked by the system linker. This would slow things down, and complicate them a bit, but would not be particularly difficult. The third, which is likely to be the first implemented, is to only link ELF libraries dynamically. The loaders and compilers have already been modified to be useful in dynamic linking, but only when used with Plan 9 a.out objects. Modifying them to generate dynamic ELF executables and shared objects is likely to be fairly painless and achievable.
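The options above differ in which kind of ELF file is involved: a relocatable object for the second, an executable or shared object for the third. As a rough illustration, the sketch below reads the type field from an ELF header held in memory. The ELF facts it relies on are the standard ones: the magic bytes 0x7f 'E' 'L' 'F', the byte-order flag at offset 5 of the identification bytes, and the two-byte e_type field at offset 16, where 1 means relocatable, 2 executable and 3 shared object. The function name elf_type and the enum names are made up for this example, and nothing here is taken from 8l.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard e_type values; the enum names are local to this sketch. */
enum { ELF_TYPE_REL = 1, ELF_TYPE_EXEC = 2, ELF_TYPE_DYN = 3 };

/*
 * Return the e_type field of an ELF header held in buf,
 * or -1 if buf is too short or does not start with the ELF magic.
 */
int elf_type(const unsigned char *buf, size_t len) {
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    /* 16 identification bytes plus the 2-byte e_type field. */
    if (len < 18 || memcmp(buf, magic, sizeof magic) != 0)
        return -1;

    /* Byte 5 of the identification gives byte order: 1 = little, 2 = big. */
    if (buf[5] == 2)
        return (buf[16] << 8) | buf[17];
    return buf[16] | (buf[17] << 8);
}
```

Under the second option, the output of the loader would be expected to report ELF_TYPE_REL. Under the third, it would report ELF_TYPE_EXEC or ELF_TYPE_DYN, and the libraries it links against would report ELF_TYPE_DYN.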

These are certainly not the only compatibility issues, but they present the general flavor of the problem. It should be fun…

Here’s to a sane compiler on Lunix, Kris Maglione