The Structure of a Compiler

A compiler can be viewed as a single box that translates a source program into an equivalent target program. Looking inside this box, we find two primary functions that carry out the mapping: analysis and synthesis. The analysis part is often called the front end of the compiler, and the synthesis part the back end.

Analysis

The analysis part breaks the source program into its constituent pieces and imposes a grammatical structure on them. Using these pieces and this structure, the compiler creates an intermediate representation of the source program.

If the analysis part detects that the source program is either syntactically ill-formed or semantically unsound, the compiler reports an error or warning message so that the user can take corrective action.

The analysis part also collects information about the source program and stores it in a symbol table, which is passed along with the intermediate representation to the synthesis part.

Synthesis

The synthesis part constructs the desired target program from the intermediate representation and the symbol table that it receives from the analysis part.

 

The compilation process operates as a sequence of phases. Each phase transforms one representation of the source program into another representation suited to the next phase. In practice, several phases may be grouped together, and the intermediate representations between the grouped phases need not be constructed explicitly.

Some compilers have a machine-independent optimization phase. The purpose of this optimization phase is to transform the intermediate representation so that the compiler produces a better (or optimized) target program than it would have produced without the optimization phase.

Phases of the compilation process

Lexical Analysis – lexical analysis is the first phase of the compilation process. The lexical analyzer reads the characters of the source program and groups them into meaningful sequences called lexemes. For each lexeme it produces a token, and it passes these tokens to the syntax analysis phase.
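
To make the grouping of characters into tokens concrete, here is a minimal sketch of a lexical analyzer. The token names (NUMBER, ID, OP) and the tiny expression language are assumptions chosen for illustration; a real language would define many more token kinds.

```python
import re
from typing import NamedTuple

class Token(NamedTuple):
    kind: str   # token name, e.g. "ID" or "NUMBER"
    text: str   # the lexeme matched in the source

# Hypothetical token set for a tiny assignment language (an assumption).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=()]"),
    ("SKIP",   r"\s+"),          # whitespace is matched but not emitted
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Group the characters of `source` into a list of tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append(Token(match.lastgroup, match.group()))
        pos = match.end()
    return tokens

for token in tokenize("position = initial + rate * 60"):
    print(token.kind, token.text)
```

Each printed pair is one token: the kind is what the parser cares about, and the text is the lexeme that was matched.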

Syntax Analysis – the parser in the syntax analysis phase uses the tokens to create a tree-like intermediate representation (a syntax tree) that depicts the grammatical structure of the token stream.
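
As an illustration of how a parser turns tokens into a syntax tree, the sketch below uses recursive descent on a small arithmetic grammar. The grammar, the use of plain strings as tokens, and the nested-tuple tree shape are all assumptions made for this example; identifiers are limited to alphanumeric strings to keep it short.

```python
def parse(tokens):
    """Build a syntax tree (nested tuples) from a list of token strings.

    Assumed grammar:
        expr   -> term (('+' | '-') term)*
        term   -> factor (('*' | '/') factor)*
        factor -> NAME | NUMBER | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def factor():
        tok = advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok.isalnum():
            return tok                      # leaf: name or number
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())     # interior node: (operator, left, right)
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse(["initial", "+", "rate", "*", "60"]))
# ('+', 'initial', ('*', 'rate', '60'))
```

The multiplication ends up deeper in the tree than the addition, which is how the tree records that it must be evaluated first.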

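Semantic Analysis – the semantic analyzer uses the syntax tree and the information in the symbol table to check the source program for semantic consistency with the language definition. It also gathers type information, for example through type checking, and saves it for later use during intermediate code generation.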
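
A small sketch of one semantic check, type checking, may help here. It assumes a syntax tree made of nested tuples, a symbol table that is a plain dictionary from names to types, and a language with only two types, int and float. The inttofloat node is a hypothetical marker for an implicit conversion.

```python
def check(node, symbols):
    """Return (annotated_tree, type) for an expression tree.

    Raises NameError for undeclared identifiers and TypeError for
    operand types outside the assumed int/float language.
    """
    if isinstance(node, int):
        return node, "int"
    if isinstance(node, float):
        return node, "float"
    if isinstance(node, str):                      # identifier leaf
        if node not in symbols:
            raise NameError(f"undeclared identifier {node!r}")
        return node, symbols[node]

    op, left, right = node
    left, ltype = check(left, symbols)
    right, rtype = check(right, symbols)
    if {ltype, rtype} - {"int", "float"}:
        raise TypeError(f"operator {op!r} needs numeric operands")
    if ltype != rtype:
        # Mixed arithmetic: wrap the int operand in a conversion node.
        if ltype == "int":
            left = ("inttofloat", left)
        else:
            right = ("inttofloat", right)
        ltype = "float"
    return (op, left, right), ltype

symbols = {"initial": "float", "rate": "float"}   # assumed declarations
tree, result_type = check(("+", "initial", ("*", "rate", 60)), symbols)
print(tree)
print(result_type)
```

Here the integer constant 60 is multiplied by a float variable, so the checker rewrites the tree to make the conversion explicit before any code is generated.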

Intermediate Code Generation – after syntax and semantic analysis, the compiler produces a low-level, machine-like intermediate representation. This representation should be easy to produce and easy to translate into the target machine code.
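
One common machine-like intermediate form is three-address code, in which each instruction has at most one operator on its right-hand side. The sketch below flattens an expression tree into such instructions. The nested-tuple tree shape and the temporary names t1, t2, ... are assumptions for illustration.

```python
import itertools

def gen_three_address(tree):
    """Flatten a nested-tuple expression tree into three-address instructions.

    Returns (instructions, name_holding_the_result).
    """
    code = []
    counter = itertools.count(1)

    def walk(node):
        if not isinstance(node, tuple):
            return str(node)                 # leaf: identifier or constant
        op, left, right = node
        left_name = walk(left)               # operands are evaluated first
        right_name = walk(right)
        temp = f"t{next(counter)}"           # fresh temporary for this result
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = walk(tree)
    return code, result

code, result = gen_three_address(("+", "initial", ("*", "rate", 60)))
for line in code:
    print(line)
# t1 = rate * 60
# t2 = initial + t1
```

Because every instruction names its operands and its result explicitly, later phases can reorder, remove, or translate instructions one at a time.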

Code Optimization – this phase improves the intermediate code so that better target code results, for example code that runs faster or is shorter.
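
Constant folding is one simple example of such an improvement: any operation whose operands are all known at compile time is computed by the compiler instead of at run time. The sketch below applies it to a tree-shaped intermediate representation made of nested tuples, which is an assumption for this example. Division is left out on purpose so the sketch does not have to deal with division by zero.

```python
import operator

# Operators this sketch is willing to evaluate at compile time.
FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(node):
    """Replace constant subexpressions with their computed value."""
    if not isinstance(node, tuple):
        return node                          # leaf: identifier or constant
    op, left, right = node
    left, right = fold(left), fold(right)    # fold the children first
    if op in FOLDABLE and isinstance(left, int) and isinstance(right, int):
        return FOLDABLE[op](left, right)     # both operands known: compute now
    return (op, left, right)

print(fold(("+", "x", ("*", 2, 30))))
# ('+', 'x', 60)
```

The multiplication disappears from the program, so the generated target code has one instruction less to execute.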

Code Generation – the code generation phase takes the optimized intermediate representation as input and maps it to the target language.
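
The mapping to a target language can be sketched as follows. The instruction set (LD, ST, ADD, SUB, MUL, DIV), the registers R1 and R2, and the tuple form of the intermediate instructions are all hypothetical and do not correspond to any real machine. The translation is deliberately naive: every instruction loads both operands and stores its result.

```python
# Hypothetical mnemonics for an imaginary register machine.
MNEMONIC = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

def operand(value):
    """Format a constant as an immediate (#60) and a name as a memory location."""
    return f"#{value}" if isinstance(value, (int, float)) else value

def generate(code):
    """Translate (dest, left, op, right) instructions into assembly-like text."""
    asm = []
    for dest, left, op, right in code:
        asm.append(f"LD R1, {operand(left)}")       # load first operand
        asm.append(f"LD R2, {operand(right)}")      # load second operand
        asm.append(f"{MNEMONIC[op]} R1, R1, R2")    # R1 = R1 op R2
        asm.append(f"ST {dest}, R1")                # store the result
    return asm

intermediate = [
    ("t1", "rate", "*", 60),
    ("position", "initial", "+", "t1"),
]
for line in generate(intermediate):
    print(line)
```

A real code generator would also decide which values stay in registers, so that a result such as t1 need not be stored and immediately reloaded.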

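Symbol table – records the variable names used in the source program and collects information about the attributes of each name, such as its type and scope. It is a data structure used by the phases of the compiler rather than a phase itself.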
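
A symbol table can be sketched as a dictionary per scope, with each scope linked to the one that encloses it. The class name, method names, and attribute names below are assumptions for illustration, not a fixed design.

```python
class SymbolTable:
    """Maps names to their attributes, with optional nested scopes."""

    def __init__(self, parent=None):
        self.entries = {}       # name -> dict of attributes
        self.parent = parent    # enclosing scope, or None for the outermost

    def declare(self, name, **attributes):
        """Record a new name in this scope together with its attributes."""
        if name in self.entries:
            raise NameError(f"{name!r} is already declared in this scope")
        self.entries[name] = attributes

    def lookup(self, name):
        """Find a name in this scope or the nearest enclosing scope."""
        scope = self
        while scope is not None:
            if name in scope.entries:
                return scope.entries[name]
            scope = scope.parent
        raise NameError(f"{name!r} is not declared")

global_scope = SymbolTable()
global_scope.declare("rate", type="float", line=1)

block_scope = SymbolTable(parent=global_scope)
block_scope.declare("i", type="int", line=4)

print(block_scope.lookup("rate"))   # found in the enclosing scope
print(block_scope.lookup("i"))      # found in the current scope
```

Looking outward from the innermost scope means a name declared in an inner block hides a declaration of the same name in an outer one.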
