Step-by-Step Guide to GCC Compilation in Linux: Understanding GCC Compilation Process

When you’re working with C or C++ on a Linux system, understanding the compilation process is crucial. The GNU Compiler Collection (GCC) is the standard compiler used in most Linux distributions. This blog post will guide you through the GCC compilation steps, helping you understand how your code transforms from source files into an executable program.

The Basics of GCC Compilation

The GCC compilation process involves several stages: preprocessing, compilation, assembly, and linking. Each stage has its purpose, and understanding these steps will make it easier to troubleshoot and optimize your code.

  1. Preprocessing: The first stage is preprocessing. The preprocessor handles directives like #include, #define, and #ifdef. It expands macros, includes header files, and removes comments. The output is a pure C/C++ code file with all the macros and includes resolved. For example, consider the following code:
   #include <stdio.h>
   #define PI 3.14

   int main() {
       printf("Value of PI: %f\n", PI);
       return 0;
   }

After preprocessing, #include <stdio.h> is replaced with the actual content of the stdio.h header file, and PI is replaced with 3.14. To run the preprocessing step, you would use the command:

   gcc -E program.c -o program.i

This generates program.i, the preprocessed output.

  2. Compilation: The second stage is compilation, where the preprocessed code is converted into assembly language. The compiler translates the high-level code into low-level assembly code that is specific to the machine architecture. To generate the assembly code, you can use the following command:

   gcc -S program.i -o program.s

This generates program.s, the assembly code.

  3. Assembly: In the assembly stage, the assembly code is translated into machine code, producing an object file. This object file is a binary representation of the code, but it isn’t yet an executable. You can generate the object file using:

   gcc -c program.s -o program.o

This generates program.o, the object file.

  4. Linking: The final stage is linking. The linker combines the object files into a single executable. It resolves references to external libraries, combines code from different object files, and handles the memory layout of the program. To generate the final executable, you can use:

   gcc program.o -o program

This generates program, the final executable.

Putting It All Together

Let’s say you have a simple C program called hello.c. To compile it using GCC, you could run:

gcc hello.c -o hello

This single command runs through all the stages: preprocessing, compilation, assembly, and linking, producing the executable hello. However, understanding each stage separately can help you optimize the process, troubleshoot errors, and control the compilation flow.

Advanced Compilation Techniques

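Optimizations: Use the -O flags to choose an optimization level. -O0 (the default) performs almost no optimization, -O2 enables most optimizations that do not trade size for speed, -O3 adds more aggressive ones, and -Os optimizes for size.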

Debugging: Include -g to add debugging information to your executable, making it easier to debug with tools like GDB.

Multi-file Compilation: Use gcc -c to compile multiple C files into object files separately and then link them together.

Example – Compiling the C program helloworld.c into the helloworld executable

GCC decides which stages to run from the suffix of each input file:

file.c => C source code that must be preprocessed.

file.i => C source code that should not be preprocessed.

file.h => C header file to be turned into a precompiled header (the default).

file.s => Assembler code.

file.S => Assembler code that must be preprocessed.

$ vim helloworld.c
#include <stdio.h>

#define MY_NAME "DevBee"

int main(int argc, char **argv) {
        printf("validating preprocessor string: %s\n", MY_NAME);
        printf("Hello world for understanding compilation\n");
        return 0;
}

Check the GCC version:

$ gcc --version
gcc (Ubuntu 12.2.0-3ubuntu1) 12.2.0

Use -save-temps to keep the intermediate files that are created during compilation (-v prints the commands GCC runs for each stage):

$ gcc -o helloworld -v -save-temps helloworld.c
$ tree
.
├── helloworld
├── helloworld.c
├── helloworld.i
├── helloworld.o
└── helloworld.s

0 directories, 5 files

Checking the type of each file with the file command gives:

$ file helloworld*

helloworld:   ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=32deb5f47e5b1f4bac853292e55fc3eddc28ff6f, for GNU/Linux 3.2.0, not stripped

helloworld.c: C source, ASCII text

helloworld.i: C source, ASCII text

helloworld.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped

helloworld.s: assembler source, ASCII text

Here, helloworld.c is the C source code we wrote, in human-readable form.

helloworld.i – the preprocessed source code. Every #include directive has been replaced with the contents of the header file, and every #define macro with its value.

helloworld.s – the assembly file, which is the output of the compilation stage and the input to the assembler.

helloworld.o – the object file, which is the output of the assembly stage. It contains machine code but has not been linked yet.

helloworld – the final executable, produced by the linker.

GCC options can also be used to stop after an individual stage:

-E – Preprocess only; do not compile, assemble, or link. GCC stops after the preprocessing stage and does not run the compiler proper. The output is preprocessed source code, which is sent to standard output.

$ gcc -E helloworld.c

-S – Compile only; do not assemble or link. GCC stops after the compilation stage proper. The output is an assembler code file for each non-assembler input file specified. By default, the assembler file name is made by replacing the source file's suffix (.c, .i, etc.) with .s.

$ gcc -S helloworld.c

-c – Compile and assemble, but do not link. The linking stage is simply not done. The output is an object file for each source file. By default, the object file name is made by replacing the source file's suffix (.c, .i, .s, etc.) with .o.

$ gcc -c helloworld.c

-o file – Place the output in file. This applies to whatever sort of output is being produced, whether it is an executable file, an object file, an assembler file, or preprocessed C code.

$ gcc -o helloworld helloworld.c
