This project implements a compiler for a subset of the C language. It processes C source files through several compilation stages:
- Parsing the source code
- Building an Abstract Syntax Tree (AST)
- Creating an Intermediate Representation (IR)
- Generating x86-64 assembly code
The compiler uses ANTLR4 for lexical analysis and parsing. We implement a symbol table for variable management and construct a Control Flow Graph (CFG) to represent the program as basic blocks containing IR instructions.
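To make the terms above concrete, here is a minimal sketch of how a basic block holding IR instructions, and a symbol table mapping variable names to stack offsets, could be laid out. It is only an illustration written in C: every name in it (IrOp, IrInstr, BasicBlock, bb_add_instr, Symbol, symtab_lookup) and the fixed capacity are assumptions, not the project's actual data structures.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical IR opcodes; the project's real instruction set may differ. */
typedef enum { IR_LDCONST, IR_COPY, IR_ADD, IR_CMP_EQ, IR_RET } IrOp;

/* One three-address IR instruction: dest = src1 op src2. */
typedef struct {
    IrOp op;
    const char *dest;
    const char *src1;
    const char *src2;
} IrInstr;

#define MAX_INSTRS 16  /* assumed fixed capacity, to keep the sketch short */

/* A basic block: straight-line instructions plus up to two successors. */
typedef struct BasicBlock {
    const char *label;
    IrInstr instrs[MAX_INSTRS];
    size_t count;
    struct BasicBlock *exit_true;   /* taken branch, or the only successor */
    struct BasicBlock *exit_false;  /* other branch, NULL if unconditional */
} BasicBlock;

/* Append an instruction; returns 0 on success, -1 if the block is full. */
int bb_add_instr(BasicBlock *bb, IrOp op, const char *dest,
                 const char *src1, const char *src2)
{
    if (bb->count >= MAX_INSTRS) {
        return -1;
    }
    bb->instrs[bb->count++] = (IrInstr){ op, dest, src1, src2 };
    return 0;
}

/* Symbol table entry: variable name and its offset from the frame pointer. */
typedef struct {
    const char *name;
    int offset;
} Symbol;

/* Linear lookup; returns the stack offset, or 0 when the name is unknown. */
int symtab_lookup(const Symbol *table, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].name, name) == 0) {
            return table[i].offset;
        }
    }
    return 0;
}
```

In this sketch the CFG is simply the set of blocks reachable through exit_true and exit_false; a block ending in a return leaves both pointers NULL.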
Our compiler supports:
- Variable declarations and assignments
- Constant values
- Fixed-size array declarations
- Array initialization with value sequences
- Element access by index
- Addition (+)
- Subtraction (-)
- Multiplication (*)
- Division (/)
- Modulo (%)
- Logical AND (&&)
- Logical OR (||)
- Bitwise XOR (^)
- Comparison operators (<, >, >=, <=, ==, !=)
- Bit shifting (<<, >>)
- if, if-else, else-if statements
- while loops
- for loops
- Function declarations
- Function calls
- Return statements (allowed anywhere)
- Full support for variable scoping rules
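As an illustration, the following sketch combines several of the listed features (array initialization, indexed access, a for loop, modulo, logical AND, bit shifting, and block scoping) in two small functions. The function names are made up for this example, and it assumes that int is the only type needed; the return values in the comments follow standard C semantics.

```c
/* Sum the even elements of a fixed-size array, doubling each with a shift. */
int sum_even_doubled() {
    int values[5] = {3, 8, 5, 12, 7};
    int total = 0;
    int i;
    for (i = 0; i < 5; i = i + 1) {
        if (values[i] % 2 == 0 && values[i] > 0) {
            int doubled = values[i] << 1;  /* visible only inside this block */
            total = total + doubled;
        }
    }
    return total;  /* 8*2 + 12*2 = 40 */
}

/* An inner declaration shadows the outer variable without modifying it. */
int shadowing() {
    int x = 1;
    if (x == 1) {
        int x = 10;  /* new variable, distinct from the outer x */
        x = x + 1;
    }
    return x;  /* outer x is still 1 */
}
```

Adding a main function that returns sum_even_doubled() or shadowing() turns this into a complete program whose exit status can be compared between the two compilers.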
Our compiler implements a specific approach to expression evaluation that differs from GCC in certain cases. It's important to understand this distinction:
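GCC follows the C grammar: assignment is right-associative and has lower precedence than the logical operators, so the operand to the left of each assignment operator must be an lvalue. Modifying the same variable more than once within a single expression, without sequencing between the modifications, is undefined behavior.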
For example, in an expression like:
(a=1) || (b=2)
Our compiler evaluates this as two separate assignments combined with the logical OR operator, which is the intuitive reading.
Meanwhile, GCC interprets it differently, essentially as:
(a=1 || b) = 2
This difference can lead to unexpected results when porting code between compilers.
In our implementation:
- Each variable on the left side of an assignment is treated as an lvalue
- We follow traditional C semantics for most operations
- Be aware of this difference when writing expressions that combine assignments with other operations
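One defensive way to stay independent of how either compiler groups such an expression is to keep assignments in their own statements. The sketch below is only a suggestion, with a hypothetical function name; it spells out "assign a, and assign b only when a is zero" step by step, then combines the two values with the logical OR operator.

```c
/* Packs a, b and the OR result into one int so all three can be checked. */
int assign_then_or(int first) {
    int a = 0;
    int b = 0;
    int r;
    a = first;            /* first assignment, as its own statement */
    if (a == 0) {
        b = 2;            /* second assignment, only when a is zero */
    }
    r = a != 0 || b != 0; /* no assignment inside the logical expression */
    return a * 100 + b * 10 + r;
}
```

Under standard C semantics, assign_then_or(1) returns 101 (a is 1, b stays 0, r is 1) and assign_then_or(0) returns 21 (a is 0, b is 2, r is 1).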
Before building the compiler, install ANTLR4 using either:
- Your distribution's package manager
- The provided shell script:
install-antlr.sh
Use the provided Makefile to build the project. The Makefile includes configuration settings for ANTLR4 in a separate .mk file. You may need to modify these variables to match your ANTLR4 installation:
- ANTLR: Path to the ANTLR tool
- ANTLRJAR: Path to the ANTLR JAR file
- ANTLRINC: Path to the ANTLR include directory
- ANTLRLIB: Path to the ANTLR library directory
The default make target builds the compiler executable. The make tests target builds the executable and then runs all of our tests. Run both commands from inside the compiler directory.
After building the compiler, you can use it to compile C source files:
./ifcc input_file.c [-O0] [> output_file.s]
The optional -O0 argument disables optimization.
The generated assembly code can then be assembled and linked using standard tools.
We provide a comprehensive testing framework to verify our compiler's behavior and performance. Test cases are organized by context in the testfiles folder. We also tested our compiler using the test files of hexanome 23 in a separate folder.
After compiling the program, you can run tests using the following commands:
cd compiler
make tests
cd compiler
make tests_optimized
python3 ifcc-test.py testfiles/test_if_else
python3 ifcc-test.py testfiles/test_if_else/5_if_double_comp_ou.c
The testing tool generates multiple output files in the ifcc-test-output directory:
- Assembly code generated by both GCC and our IFCC compiler
- gcc-compile.txt and ifcc-compile.txt: Compilation messages
- gcc-execute.txt and ifcc-execute.txt: Execution outputs and return values
This allows for easy comparison between our compiler and the GCC reference implementation.
(Currently, 10 of our 423 tests fail.)
The compiler can handle programs ranging from simple to complex within the supported feature set:
Simple program:
int main() {
    return 42;
}
More complex program:
int factorial(int n) {
    if (n == 1 || n < 1) {
        return 1;
    }
    return n * factorial(n-1);
}
int main() {
    int result = factorial(5);
    return result;
}
- BOUZIANE Abderrahmane
- WIRANE Hamza
- GRIGUER Mehdi
- BEN BOUZID Selim
- SANCHEZ Lucas
- VIALLETON Rémi
https://github.com/Luyansi3/Compiler-project
