\chapter{Literature review}
\section{Compilation process review}
A compiler translates a source program written in a specific programming language into executable code\cite{citation-2-name-here}. The generated code depends on the target machine architecture. A simple compiler (without a code optimizer) consists of five phases: lexical analysis, syntax analysis, semantic analysis, intermediate code generation and code generation (machine architecture specific)\cite{citation-1-name-here}. The first two phases are the initial focus of the project and are discussed below.
\section{Lexical analysis using LEX}
Lexical analysis is the process of breaking up a source program into tokens. A lexical token is a sequence of characters that can be treated as a unit in the grammar of a programming language\cite{citation-1-name-here}. A lexical analyzer scans a given input and produces a stream of tokens as output.
LEX is a tool that translates a set of regular expression specifications into a C implementation of a corresponding finite state machine\cite{citation-1-name-here}. This C program, when compiled, yields an executable lexical analyzer. Conceptually, LEX constructs a finite state machine that recognizes all the regular expression patterns specified in the LEX program file. The generated \texttt{lex.yy.c} program stores the finite state machine in the form of a decision table (transition table). LEX makes its decision table visible if we process the LEX program with the \texttt{-T} flag. The finite state machine used by LEX is a deterministic finite state automaton (DFA), and the code in \texttt{lex.yy.c} simulates this DFA.
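The idea of a table-driven DFA can be illustrated with a minimal sketch. The sketch below is written in Python for brevity rather than in the C that LEX emits, and the states, character classes and the function \texttt{run\_dfa} are hypothetical names chosen for this illustration; the table layout is not the one LEX writes to \texttt{lex.yy.c}.

```python
# Illustrative only: a hand-written DFA that accepts unsigned integers and
# identifiers. All names and the table layout are assumptions made for this
# sketch; they do not mirror the tables that LEX generates.

START, IN_INT, IN_ID = 0, 1, 2


def char_class(ch):
    """Map a character to the column of the transition table it selects."""
    if ch.isdigit():
        return "digit"
    if ch.isalpha() or ch == "_":
        return "letter"
    return "other"


# TRANSITIONS[state][character class] -> next state.
# A missing entry means the DFA has no move and the input is rejected.
TRANSITIONS = {
    START: {"digit": IN_INT, "letter": IN_ID},
    IN_INT: {"digit": IN_INT},
    IN_ID: {"digit": IN_ID, "letter": IN_ID},
}

# Accepting states and the token name each one reports.
ACCEPTING = {IN_INT: "INT", IN_ID: "ID"}


def run_dfa(text):
    """Return the token name if the DFA accepts the whole string, else None."""
    state = START
    for ch in text:
        state = TRANSITIONS[state].get(char_class(ch))
        if state is None:
            return None
    return ACCEPTING.get(state)
```

The simulation loop never inspects the patterns themselves; it only looks up the next state in the table, which is why the same driver code can serve any set of regular expressions once the table has been built.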
LEX also allows a single or compound C statement to be executed whenever a pattern match is found in the input stream. Given its ability to scan the input for a given pattern and to execute a corresponding action, LEX can be used to generate a lexical analyzer.
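The pairing of patterns with actions can be sketched as follows. This is a minimal Python illustration, not a LEX program: the rule set \texttt{RULES} and the function \texttt{tokenize} are hypothetical, and the matching is done with Python's \texttt{re} module instead of a generated DFA. The sketch does follow the usual LEX convention of choosing the longest match and, on a tie, the rule listed first.

```python
import re

# Hypothetical rule set: each rule pairs a regular expression with an action.
# An action receives the matched text and returns a token, or None to skip it.
RULES = [
    (re.compile(r"[0-9]+"), lambda lexeme: ("NUM", int(lexeme))),
    (re.compile(r"[A-Za-z_][A-Za-z0-9_]*"), lambda lexeme: ("ID", lexeme)),
    (re.compile(r"[-+*/()=]"), lambda lexeme: ("OP", lexeme)),
    (re.compile(r"[ \t\n]+"), lambda lexeme: None),  # whitespace is discarded
]


def tokenize(source):
    """Scan the source left to right and run the action of the best rule."""
    tokens = []
    pos = 0
    while pos < len(source):
        best_len, best_action = 0, None
        for pattern, action in RULES:
            match = pattern.match(source, pos)
            # Strict '>' keeps the earliest rule when two matches tie in length.
            if match and len(match.group()) > best_len:
                best_len, best_action = len(match.group()), action
        if best_action is None:
            raise ValueError(f"no rule matches at position {pos}")
        result = best_action(source[pos:pos + best_len])
        if result is not None:
            tokens.append(result)
        pos += best_len
    return tokens
```

For instance, under these assumed rules the input \texttt{x = 42 + y1} yields an identifier, an operator, a number, an operator and another identifier, with the whitespace consumed silently by its rule.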
\section{Syntax analysis using YACC}
Syntax analysis follows lexical analysis in the compilation process. The syntax of a programming language can be expressed using a Context Free Grammar (CFG). Any sentence of the grammar, that is, any string of terminals that can be derived from the start symbol, is considered a syntactically correct program. The process of checking whether a program can be derived from the programming language's grammar is referred to as \textit{parsing}.
YACC (Yet Another Compiler Compiler) was developed in the early 1970s by Stephen C. Johnson at AT\&T. YACC is a tool that translates the CFG specification given in a YACC program into a corresponding pushdown automaton (PDA) implementation in C. The generated C program, when compiled, yields an executable parser. The source program is fed to the parser to check whether it is syntactically correct.
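To see why a stack is needed once the language is context free, consider the grammar $S \rightarrow (S)S \mid [S]S \mid \epsilon$ of balanced brackets, which no finite state machine can recognize. The following minimal Python sketch is a hand-written pushdown recognizer for this grammar. It is only an illustration of the stack discipline: the function \texttt{is\_balanced} is a hypothetical name, and the code is far simpler than the table-driven LALR(1) parser that YACC generates.

```python
# Illustrative only: a hand-written pushdown recognizer for balanced brackets.
# The explicit list plays the role of the automaton's stack.

PAIRS = {")": "(", "]": "["}  # closing bracket -> the opener it must match


def is_balanced(text):
    """Return True if text is a sentence of S -> (S)S | [S]S | empty."""
    stack = []
    for ch in text:
        if ch in "([":
            stack.append(ch)  # remember the opener until its partner arrives
        elif ch in PAIRS:
            if not stack or stack.pop() != PAIRS[ch]:
                return False  # closer with no opener, or the wrong kind
        else:
            return False  # symbol outside the grammar's alphabet
    return not stack  # every opener must have been closed
```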
\section{Semantic analysis using YACC}
In addition to syntax analysis, YACC also provides features to support semantic analysis of the source program. Semantic analysis is achieved using C code, which can be embedded extensively into a YACC program. YACC allows an action, written in C, to be attached to every grammar rule and executed when that rule is applied. To support the C code in the actions, YACC provides an auxiliary functions section (also written in C).
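The role of an action attached to a grammar rule can be sketched with a small expression evaluator. Note the assumptions: the sketch is in Python, and it uses a hand-written recursive descent parser instead of the bottom-up parser that YACC generates, so it shows only the idea that each rule computes a value from the values of its parts (what a YACC action expresses with \texttt{\$\$}, \texttt{\$1}, \texttt{\$3} and so on). The grammar and the function \texttt{evaluate} are hypothetical.

```python
import re


def evaluate(text):
    """Parse an arithmetic expression and compute its value rule by rule."""
    tokens = re.findall(r"\d+|\S", text)  # numbers, or single symbols
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def expr():
        # expr : term (('+' | '-') term)*
        value = term()
        while peek() in ("+", "-"):
            op = peek()
            advance()
            right = term()
            # Action: combine the values of the sub-expressions.
            value = value + right if op == "+" else value - right
        return value

    def term():
        # term : factor ('*' factor)*
        value = factor()
        while peek() == "*":
            advance()
            value = value * factor()  # action for the multiplication rule
        return value

    def factor():
        # factor : NUM | '(' expr ')'
        tok = peek()
        if tok is not None and tok.isdecimal():
            advance()
            return int(tok)  # action: the value of a number is the number
        if tok == "(":
            advance()
            value = expr()
            if peek() != ")":
                raise ValueError("expected ')'")
            advance()
            return value
        raise ValueError(f"unexpected token: {tok!r}")

    result = expr()
    if peek() is not None:
        raise ValueError(f"unexpected trailing token: {peek()!r}")
    return result
```

In a compiler the actions would typically build a syntax tree or perform type checks rather than compute a number, but the mechanism of passing values up from the parts of a rule to the rule itself is the same.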
\chapter{Design}
This chapter presents the design that has been followed up to the current state of the project and that will be followed (or improved upon) in any later development of the project.
\section{Documentation}
This project will include a substantial amount of documentation on the usage of LEX and YACC to generate a compiler. Initially, the documentation will concentrate on mastery of the tools; it will then gradually introduce compiler design concepts and show how they can be implemented using these tools. The first four stages of the documentation have been designed and compiled successfully; more details can be found in the ``Current status'' section.
The documentation explains each topic simply, using plenty of examples and input/output samples. It is extensively embedded with URLs that link to external or internal resources and references. The idea is to reduce the time a learner spends in the beginning stages so that this time can instead be invested in the later stages of compiler construction, which are comparatively more complex. This way, the learner will be able to spend more time implementing the challenging parts of the compiler instead of spending it in the learning phase.
\section{Testing}
The documentation contains plenty of code in its examples and exercises. Before being used in the documentation, every code snippet has been implemented and tested in the laboratory.
\section{Source Language}
This project involves developing a compiler for a source language. Even though the project is still in its early phases of compiler development, it proposes a source language with which the later stages of the project can be developed. SIL\cite{citation-3-name-here} is the chosen source language. As a part of the project, a few extensions have been made to the existing language specification of SIL. These extensions could play a vital role in the learning process.
\section{Roadmap}
The roadmap is the key to achieving the project's objective. Along with the documentation and the given code snippets, a learner will be asked to follow the roadmap. This project provides the first four stages of the roadmap.
\section{Version Control}
Since this project contains various components and evolves through many stages, it needs to be maintained using a version control system. The advantage of using one is the ease of rolling back to any earlier version at any point during the development of the project. We have chosen Git for this purpose.
\section{Online platform}
To make the project more accessible to students, it will be hosted publicly online at \textbf{silcnitc.github.io}. The website is being developed with HTML5, CSS3 and JavaScript. GitHub is a hosting service for Git repositories. The project will be released on GitHub under an open source license.
\section{Assembling the framework}
Students will use the roadmap to build a compiler for SIL. In the code generation phase, they will be instructed to generate code for the SIM architecture\cite{citation-4-name-here}. Once all the individual components have been completely developed, they will be tested and/or proofread several times before being integrated with each other appropriately on the website.