A simple REPL-Compiler-Interpreter example using Roman Numerals following the Project Euler rules.
This repo is supplemental material for the pypethon project, with the goal of providing a simple compiler and interpreter example. Romnomnom compiles Roman Numerals into Python bytecode and includes a tiny REPL for evaluating the compiled code interactively:
$ python3 romnomnom
> I
1
> II
2
> III
3
> IV
4
> V
5
The Romnomnom project includes a self-guided tutorial. The tutorial provides the starting point for a Romnomnom implementation in a single file and encourages the reader to implement the full scope of Roman Numeral rules that Romnomnom supports.
Romnomnom weighs in at around 250 lines of code, so it should be an easy read, and easier still after scrolling through this README. The diagrams below explain what's happening in the code and provide context that will be helpful before working through the pypethon tutorial.
A compiler is a computer program (or set of programs) that transforms source code written in a programming language (the source language) into another computer language (the target language). - wikipedia.org/Compiler
That's it. In Romnomnom, our source language is the language of Roman Numerals and our target language is executable Python bytecode.
The three common patterns found in compilers that we'll explore in Romnomnom are lexing, parsing, and code generation. These three patterns make up the three "phases" of the compiler and can be seen composed together plainly in src/romnomnom/compiler.py:
def compile(source):
    return generate(parse(lex(source)))
Looking a little bit closer at each of the phases, here's how a simple compiler and interpreter might work together:
This is actually exactly how Romnomnom works (and, basically, how Python works).
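To make the comparison with Python concrete, here is a small sketch using only Python's built-in `compile` and `eval` and the standard `dis` module. It is not Romnomnom code; the expression `40 + 2` and the filename `<example>` are placeholders chosen for illustration. The source text is first compiled into a code object, and only then does the interpreter execute it.

```python
import dis

# Compile source text into a code object (Python bytecode plus metadata).
# Nothing is executed at this point.
code = compile("40 + 2", "<example>", "eval")

# Hand the code object to the interpreter to actually run it.
result = eval(code)
print(result)  # 42

# Optionally inspect the bytecode; the exact instructions vary by Python version.
dis.dis(code)
```

The split between the two calls mirrors the compiler and interpreter roles described here: one step translates, the other executes.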
There are two general approaches to programming language implementation:
- Interpretation: An interpreter takes as input a program in some language, and performs the actions written in that language on some machine.
- Compilation: A compiler takes as input a program in some language, and translates that program into some other language, which may serve as input to another interpreter or another compiler.
Notice that a compiler does not directly execute the program. Ultimately, in order to execute a program via compilation, it must be translated into a form that can serve as input to an interpreter. - wikipedia.org/Programming_language_implementation
Perfect! Let's work through the diagram above in detail using the following Romnomnom REPL session as an example:
$ python3 romnomnom
> XLII
42
>
Code: src/romnomnom/lexer.py
Related terminology: Lexing, Lexical Analysis, Scanning, Tokenization
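As a rough illustration of this phase (a sketch, not the contents of src/romnomnom/lexer.py), a lexer for Roman Numerals might turn each character of the source into a token carrying its value. The `Token` type and the `SYMBOL_VALUES` table are hypothetical names introduced for this example.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    """Hypothetical token type: one Roman Numeral symbol and its value."""
    symbol: str
    value: int


# The seven Roman Numeral symbols and their values.
SYMBOL_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def lex(source: str) -> List[Token]:
    """Turn source text into a flat list of tokens, rejecting unknown characters."""
    tokens = []
    for position, char in enumerate(source):
        if char not in SYMBOL_VALUES:
            raise SyntaxError(f"unexpected character {char!r} at position {position}")
        tokens.append(Token(char, SYMBOL_VALUES[char]))
    return tokens
```

With this sketch, `lex("XLII")` produces four tokens whose values are 10, 50, 1 and 1. The lexer only recognizes symbols; deciding what a sequence of symbols means is left to the next phase.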
Code: src/romnomnom/parser.py
Related terminology: Parsing, Syntactic Analysis, Semantic Analysis
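The parsing phase can be sketched in a similar way. The version below is an assumption-laden simplification, not the code in src/romnomnom/parser.py: it takes a list of symbol values (as a lexer might produce) and groups them into a small tree of nested tuples, where a smaller value followed by a larger one becomes a subtractive pair. The node names `"sum"`, `"sub"` and `"lit"` are hypothetical, and the sketch does not enforce the Project Euler rules about which pairs or orderings are valid.

```python
def parse(values):
    """Group symbol values into a tree: ("sum", [terms...]).

    Each term is either ("lit", value) or ("sub", larger, smaller).
    """
    terms = []
    i = 0
    while i < len(values):
        current = values[i]
        following = values[i + 1] if i + 1 < len(values) else None
        if following is not None and current < following:
            # A smaller symbol before a larger one is subtractive, e.g. XL.
            terms.append(("sub", following, current))
            i += 2
        else:
            terms.append(("lit", current))
            i += 1
    return ("sum", terms)


tree = parse([10, 50, 1, 1])  # the values of X, L, I, I
print(tree)  # ('sum', [('sub', 50, 10), ('lit', 1), ('lit', 1)])
```

Keeping the tree explicit, rather than summing values on the spot, leaves the arithmetic to the code generator.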
Code: src/romnomnom/generator.py
Related terminology: Code Generation, Intermediate Representation
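One way to sketch code generation without hand-assembling bytecode is to build a Python AST and let the built-in `compile` turn it into a code object. This is a swapped-in technique for illustration and may differ from what src/romnomnom/generator.py does. The input format, a `("sum", [...])` tuple whose terms are `("lit", value)` or `("sub", larger, smaller)`, is a hypothetical tree shape assumed for this example.

```python
import ast


def term_to_node(term):
    """Translate one hypothetical tree term into a Python AST expression node."""
    if term[0] == "lit":
        return ast.Constant(value=term[1])
    _, larger, smaller = term
    return ast.BinOp(
        left=ast.Constant(value=larger),
        op=ast.Sub(),
        right=ast.Constant(value=smaller),
    )


def generate(tree):
    """Build an addition chain from the tree and compile it to a code object."""
    _, terms = tree
    if not terms:
        raise ValueError("cannot generate code for an empty numeral")
    nodes = [term_to_node(term) for term in terms]
    body = nodes[0]
    for node in nodes[1:]:
        body = ast.BinOp(left=body, op=ast.Add(), right=node)
    # compile() requires line and column information on every node.
    expression = ast.fix_missing_locations(ast.Expression(body=body))
    return compile(expression, "<romnomnom>", "eval")


code = generate(("sum", [("sub", 50, 10), ("lit", 1), ("lit", 1)]))
print(eval(code))  # 42
```

The result is an ordinary Python code object, so it can be evaluated with `eval` or inspected with `dis.dis`.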
Code: ./romnomnom
Related terminology: Read Eval Print Loop, Interpreter
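The read-eval-print loop itself can be sketched in a few lines. This is a minimal illustration, not the contents of the `./romnomnom` script: the `repl` function and its `compile_source`, `read` and `write` parameters are hypothetical, and `compile_source` is assumed to be any function that returns a Python code object for a line of input.

```python
def repl(compile_source, read=input, write=print):
    """Read a line, compile it, evaluate the code object, print the result, repeat."""
    while True:
        try:
            line = read("> ")
        except EOFError:
            # End of input (for example Ctrl-D) ends the session.
            break
        line = line.strip()
        if not line:
            continue
        try:
            write(eval(compile_source(line)))
        except Exception as error:
            # Report the problem and keep the session alive.
            write(f"error: {error}")
```

Passing `read` and `write` as parameters is a design choice made here so the loop can be exercised without a terminal; an interactive session would simply use the defaults.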