ccc - full native C compiler
This is a 2-pass C compiler written in C, currently under reconstruction from a paper printout. Pass 1 (cc1) is complete; pass 2 (cc2) is actively being developed and generates Z80 assembly code.
Pass 1 (cc1) - Complete ✓ Tagged as cc1_complete and self-parse
- Full C preprocessor, type system, expression/statement parsing, AST emission
- 142 tests passing, 18/18 source files self-host
- ~7,500 lines of C code
- See CLAUDE.md for detailed architecture and features
Debugging Tools
- AST Interpreter (interp.lisp): Execute AST without code generation
- AST Pretty Printer (astpp.lisp): Format AST for human inspection
- See INTERP.md and ASTPP.md for details
Pass 2 (cc2) - Active Development ✓ Generating Z80 Assembly
- Tree-based AST parser with complete function representation (~3,400 lines)
- Three-phase code generation: parse → codegen → emit
- Register allocation and stack frame management
- Generates working Z80 assembly for simple functions
- See CC2_ARCHITECTURE.md for implementation details
This is a 2-pass compiler:
Pass 1 (cc1): Recursive descent parser with embedded C preprocessor
- Parses and validates C source code
- Outputs AST in S-expression format (single-char operators)
- Uses Unix syscalls (write) instead of stdio for output
- ~7,500 lines of C code
Pass 2 (cc2): Tree-based code generator targeting Z80
- Reads AST from pass 1 (S-expression format)
- Three-phase architecture: parse → codegen → emit
- Builds complete function trees in memory before code generation
- Register allocation and stack frame management
- Uses Unix syscalls (read/write) instead of stdio
- parseast.c: ~3,400 lines (parser, code generation, emission)
- Handles memory width annotations (:b :s :l :p :f :d)
- Generates Z80 assembly code
Pass 1 (cc1) files:
- cc1.c - Main entry point, orchestration
- lex.c - Lexical analyzer (tokenizer)
- parse.c - Statement and declaration parsing
- expr.c - Expression parsing with precedence
- type.c - Type system management
- declare.c - Declaration processing
- outast.c - AST emission in S-expression format
- macro.c - CPP macro definition and expansion
- io.c - Character I/O and file stack management
- error.c - Error reporting
- util.c - Utilities (fdprintf, bitdef, etc.)
- kw.c - Keyword lookup tables
Pass 2 (cc2) files:
- cc2.c - Main entry point, command-line processing
- parseast.c - Table-driven AST parser
- util.c - Shared utilities (fdprintf)
Auto-generated files:
- tokenlist.c, enumlist.h - Token definitions
- error.h - Error code definitions
- debug.h, debugtags.c - Debug/verbose infrastructure
- op_pri.h - Operator priority table
Stub system headers (include/):
- stdio.h, stdlib.h, string.h, stdarg.h - C standard library stubs
- fcntl.h, unistd.h, signal.h - POSIX system call stubs
- libgen.h - Path manipulation stubs
- sys/stat.h, sys/wait.h - System header stubs
- Minimal declarations to avoid GNU libc advanced preprocessor features
Using the ccc driver (recommended):
# Full compilation (when cc2 is complete)
./ccc -o program source.c
# Execute with interpreter (debugging/testing)
./ccc -x source.c
# Keep intermediate AST file
./ccc -k -o program source.cPass 1 - Parse and output AST:
./cc1 -E source.c > output.astPass 2 - Generate Z80 assembly from AST:
./cc2 output.ast # Generates output.s assembly file
./cc2 output.ast -o custom.s # Specify output fileFull pipeline (when complete):
./cc1 -E source.c | ./cc2 -o executableThe -x option executes the generated AST with a Common Lisp interpreter,
providing a way to validate that the parser is producing correct AST without
needing a working code generator.
Quick validation:
./ccc -x tests/arith_widths.cThis compiles the source to AST, then executes it with the interpreter. If the program runs and produces the expected result, the parser is working correctly.
Debugging workflow:
- Write a test program with known expected behavior
- Compile and execute with
-x:./ccc -x mytest.c
- Check the exit code and output match expectations
- If incorrect, inspect the AST file (automatically saved as
mytest.ast) - Compare AST structure against expected operations
Example - verify arithmetic:
// test.c
int main() {
int a = 10;
int b = 20;
int c = a + b;
return c; // Should return 30
}$ ./ccc -x test.c
=== Pass 1: Parsing test.c ===
=== Executing AST with interpreter ===
Program exited with code: 30
AST saved to: test.astThe exit code of 30 confirms the parser correctly:
- Parsed declarations
- Generated assignment operations
- Performed arithmetic
For visual inspection of AST structure, use the standalone pretty printer:
# Generate AST
./cc1 -E test.c > test.ast
# Pretty print with human-readable formatting
./astpp.lisp test.astOutput:
FUNCTION main() -> _short_
{
BLOCK {
DECL a : _short_
DECL b : _short_
DECL c : _short_
EXPR:
(ASSIGN:short $a (NARROW:short 10))
EXPR:
(ASSIGN:short $b (NARROW:short 20))
EXPR:
(ASSIGN:short $c (ADD (DEREF:short $a) (DEREF:short $b)))
RETURN (DEREF:short $c)
}
}
The pretty printer translates single-char operators to readable names (M→DEREF, =→ASSIGN, +→ADD, etc.) and shows type width annotations, making it easy to verify the AST structure at a glance.
Use cases:
- Debug parser output by visual inspection
- Understand AST structure for complex constructs
- Compare AST between different versions
- Learn the AST format
See ASTPP.md for complete documentation.
Benefits of interpreter-based debugging:
- Test parser without implementing code generator
- Validate type conversions and promotions
- Verify control flow (loops, conditionals, function calls)
- Confirm expression evaluation and constant folding
- Quick iteration on parser changes
Interpreter limitations:
- Simplified memory model (doesn't simulate real memory addresses)
- No pointer arithmetic validation
- Type conversions are pass-through (no actual narrowing/widening)
- Some operations simplified for interpretation
See INTERP.md for complete interpreter documentation.