C Program Compilation Process
The compilation of a C program is a sophisticated process that transforms high-level C code into machine-executable instructions. This multi-stage pipeline ensures that human-readable code is converted into efficient binary code that computers can execute directly.
The preprocessing stage prepares the source code for compilation by performing several text manipulations:
#include: Inserts the contents of header files into the source code
#define: Replaces macros and constants with their defined values
#ifdef, #ifndef, #endif: Handles conditional compilation
#pragma: Provides implementation-specific instructions to the compiler
For example, when encountering #include <stdio.h>, the preprocessor locates the stdio.h file and inserts its entire contents at that position. Similarly, after #define MAX_SIZE 100, every later occurrence of the MAX_SIZE token in the code is replaced with 100.
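The output of this stage is an expanded source code file (typically with a .i extension) in which all preprocessor directives have been processed. GCC's output still contains line markers beginning with #, which let later stages report the original file names and line numbers.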
The compilation stage converts the preprocessed code into assembly language.
During this stage, the compiler reports syntax errors such as missing semicolons, as well as semantic errors such as undeclared variables and type mismatches. The output is typically an assembly file (.s extension) containing human-readable assembly instructions.
The assembly stage converts assembly code into machine code.
The resulting object file (typically with a .o extension) contains machine instructions for the target processor's instruction set, but it cannot be executed yet: references to functions and variables defined in other files or libraries are still unresolved.
The linking stage combines one or more object files and libraries to create the final executable.
The linker reports errors such as undefined references, multiple definitions of the same symbol, or a missing main() function. The output is a complete executable program ready to be loaded and run by the operating system.
If the program calls functions stored in libraries, the linker connects those libraries to our object files so that the program knows where to find each function. There are two types of libraries:
- Static libraries (.lib, .a): the library code that the program uses is copied into the binary file, which increases its size;
- Dynamic libraries (.dll, .so): only the name of the library is recorded in the binary file, and the library itself is loaded at run time.
By default, gcc links against the dynamic version of a library when both versions are available.
Static Linking
Static libraries have the extension .a.
Example:
- Suppose we have two source files: isalpha.c, which checks whether a character is alphabetic, and iseven.c, which checks whether an integer is even.
main.h
#ifndef MAIN_H
#define MAIN_H
// prototypes
int _iseven(int n);
int _isalpha(int c);
#endif /* MAIN_H*/
isalpha.c
// file isalpha.c
#include "main.h"
// Returns 1 if c is an ASCII letter, 0 otherwise.
int _isalpha(int c) {
    if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) {
        return 1;
    }
    return 0;
}
The file iseven.c follows the same pattern.
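Since the source leaves iseven.c unspecified, the following is one possible sketch rather than the author's file. It assumes _iseven returns 1 for even integers and 0 otherwise. The prototype is repeated inline instead of including main.h so that the snippet compiles by itself.

```c
/* iseven.c: hypothetical counterpart to isalpha.c */

/* In the library this prototype would come from #include "main.h";
   it is repeated here so the file compiles in isolation. */
int _iseven(int n);

/* Returns 1 if n is even, 0 otherwise. The remainder of a negative
   even number is also 0 in C, so negative input is handled. */
int _iseven(int n)
{
    return n % 2 == 0;
}
```

Like isalpha.c, this file has no main function, since it is meant to be compiled to an object file and archived rather than run directly.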
Then compile each source file to an object file, stopping after the assembly stage:
terminal
# compile all .c files into object files (no linking)
gcc -c *.c
Create the static library using:
terminal
ar cr libname.a *.o
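Explanation of the command: ar is the archiver; the c option creates the archive if it does not already exist, and r inserts the object files into it, replacing any members that have the same name.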
Dynamic Linking
Dynamic libraries use the .so extension. Libraries let us call functions without copying their code into every source file, so the same functions can be reused by many different programs.
How to create a dynamic library:
terminal
gcc -c -fPIC *.c
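Explanation of the command: -c compiles each source file to an object file without linking, and -fPIC generates position-independent code, which shared libraries generally require so that they can be loaded at any address in memory.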
terminal
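# create the library
gcc -shared -o libname.so *.o
Explanation of the command: -shared tells gcc to produce a shared object instead of an executable, and -o names the output file.
To verify that the library was built correctly and exports the expected functions as dynamic symbols, we can use nm -D libname.so.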
Consider a simple C program in a file named "example.c":
example.c
#include <stdio.h>
#define GREETING "Hello, World!"
int main() {
// Print greeting message
printf("%s\n", GREETING);
return 0;
}
The compilation process would proceed as follows:
1. Preprocessing: Comments are removed, GREETING is replaced with "Hello, World!", and the contents of stdio.h are inserted.
2. Compilation: The preprocessed code is analyzed and converted to assembly language specific to the target architecture.
3. Assembly: The assembly code is converted to machine code, creating the object file example.o.
4. Linking: The object file is linked with the C standard library (which contains the implementation of printf) to create the final executable.
To compile a C program using GCC (GNU Compiler Collection), you can use:
terminal
gcc -o example example.c
This command performs all four stages and produces an executable named "example". To perform each stage separately:
terminal
# Preprocessing only
gcc -E example.c -o example.i
# Compilation to assembly
gcc -S example.i -o example.s
# Assembly to object code
gcc -c example.s -o example.o
# Linking to create executable
gcc example.o -o example
# All intermediate files (.i, .s, .o) can be kept in a single step with
gcc file.c -save-temps