Understanding compilation stages – Preprocessor, Compiler, Assembler, Linker, Loader

When we compile a program on Linux with “gcc”, for example “gcc -o helloworld helloworld.c”, a single command produces an executable named “helloworld”. Behind the scenes, however, gcc goes through the first four of the stages listed below. The fifth stage, the loader, only comes into play when the program is run.

  1. Preprocessor
  2. Compiler
  3. Assembler
  4. Linker
  5. Loader

1) Preprocessor

The C preprocessor is the macro preprocessor for the C language. It handles the inclusion of header files, macro expansion, conditional compilation, and line control. For example, when we write code like this,

#define TEST 5
printf("%d \n", TEST);

After the preprocessor has run, the same code becomes,

printf("%d \n", 5);

In other words, the preprocessor processes every directive such as #define and #include, and substitutes the corresponding macro definitions and header file contents directly into the source code before the compiler sees it.

2) Compiler – GCC : GNU project C and C++ compiler

Help – “man gcc”

When you invoke GCC, it normally does preprocessing, compilation, assembly and linking. The “overall options” allow you to stop this process at an intermediate stage.
For example, the -c option says not to run the linker. Then the output consists of object files output by the assembler.

3) Assembler (as)

GNU as is really a family of assemblers.
“as” is primarily intended to assemble the output of the GNU C compiler “gcc” for use by the linker “ld”.

If you are invoking as via the GNU C compiler, you can use the -Wa option to pass arguments through to the assembler.
The assembler arguments must be separated from each other (and the -Wa) by commas. For example:

 $ gcc -c -g -O -Wa,-alh,-L file.c

This passes two options to the assembler: -alh (emit a listing to standard output with high-level and assembly source)
and -L (retain local symbols in the symbol table).

4) Linker – ld – The GNU linker

ld combines a number of object and archive files, relocates their data and ties up symbol references.
Usually the last step in compiling a program is to run ld.

The loader, described below, is not a compilation step. It is one of the first stages of executing a program: at start time, the loader loads the shared libraries the application needs along with the application itself.

5) Loader – ld.so/ld-linux.so – dynamic linker/loader

ld.so loads the shared libraries needed by a program, prepares the program to run, and then runs it.
Unless a program was linked statically (the -static option to ld at link time), a Linux program is incomplete and requires further linking at run time.

In the next two posts we will see how these steps actually work when we compile the “helloworld.c” program.
1. understanding gcc compilation steps : linux compilation steps
2. from source code to executable : how executable is created during compilation on linux
