Phases of a Compiler with example

A phase is a logically cohesive operation that takes as input one representation of the source program and produces as output another representation. A compiler takes as input a source program and produces as output an equivalent sequence of machine instructions.

  1. Lexical Analyzer
  2. Syntax Analyzer
  3. Intermediate Code Generator
  4. Code Optimization
  5. Code Generation

1. Lexical Analyzer

  • This is the first phase of the compiler.
  • Also called a scanner.
  • Separates characters of the source language into groups that logically belong together.
  • Groups are called tokens (DO or IF, identifiers, operator symbols like <= or +, punctuation symbols like parentheses or commas).
  • The output of the lexical analyzer is a stream of tokens.
  • These tokens are passed to the next phase.
  • The tokens are represented by codes (e.g. DO might be 1, + might be 2, identifier might be 3, etc.).
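
As an illustration of the points above, here is a minimal sketch of a scanner in Python. The token names, the regular expressions, and the numeric codes in TOKEN_CODES are assumptions made for this example (they extend the "DO might be 1" idea); a real language would define its own.

```python
import re

# Hypothetical token codes: the numbers are arbitrary, chosen for this sketch.
TOKEN_CODES = {"DO": 1, "PLUS": 2, "IDENT": 3, "IF": 4, "LE": 5,
               "LPAREN": 6, "RPAREN": 7, "COMMA": 8, "NUMBER": 9}

# Each pair is (token name, regular expression that recognizes it).
TOKEN_SPEC = [
    ("NUMBER",   r"\d+"),
    ("IDENT",    r"[A-Za-z_][A-Za-z0-9_]*"),
    ("LE",       r"<="),
    ("PLUS",     r"\+"),
    ("LPAREN",   r"\("),
    ("RPAREN",   r"\)"),
    ("COMMA",    r","),
    ("SKIP",     r"\s+"),   # whitespace separates tokens but is not a token
    ("MISMATCH", r"."),     # any other character is an error
]
KEYWORDS = {"DO", "IF"}
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Group the characters of `source` into a list of (code, text) tokens."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {text!r} at position {match.start()}")
        if kind == "IDENT" and text in KEYWORDS:
            kind = text  # keywords get their own code instead of the identifier code
        tokens.append((TOKEN_CODES[kind], text))
    return tokens
```

For example, tokenize("IF (A <= B + 1)") yields the token stream [(4, 'IF'), (6, '('), (3, 'A'), (5, '<='), (3, 'B'), (2, '+'), (9, '1'), (7, ')')], which is what would be handed to the next phase.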

2. Syntax Analyzer

  • This is the second phase of the compiler.
  • Also called a parser.
  • Groups tokens together into a syntactic structure called an expression.
  • Expressions might be combined to form statements.
  • The syntactic structure can be regarded as a tree whose leaves are tokens.
  • The interior nodes of the tree represent strings of tokens that logically belong together.
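
The tree described above can be sketched with a small recursive-descent parser. This is only one possible approach, and the grammar is an assumption for the example: expressions built from names, + and *, and parentheses, with * binding tighter than +. Tokens are plain strings here, and interior nodes are tuples of the form (operator, left, right).

```python
def parse_expression(tokens):
    """Parse a list of token strings into a nested-tuple syntax tree.

    Leaves are the operand tokens themselves; interior nodes are
    (operator, left_subtree, right_subtree) tuples.
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok in ("+", "*", ")"):
            raise SyntaxError(f"unexpected token {tok!r}")
        return advance()  # an operand: becomes a leaf of the tree

    def term():
        node = factor()
        while peek() == "*":
            op = advance()
            node = (op, node, factor())
        return node

    def expr():
        node = term()
        while peek() == "+":
            op = advance()
            node = (op, node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

With this sketch, parse_expression(["a", "+", "b", "*", "c"]) returns ('+', 'a', ('*', 'b', 'c')): the leaves are the tokens a, b, and c, and each interior node covers the string of tokens that belongs together.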

3. Intermediate Code Generator

  • This is the third phase of the compiler.
  • Uses the structure produced by the syntax analyzer to create a stream of simple instructions.
  • There may be many styles of intermediate code.
  • The most common style is an instruction with one operator and a small number of operands.
  • Instructions are like macros.
  • Intermediate code need not specify the registers to be used for each operation.
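
One common style that fits this description is three-address code, where each instruction has one operator, two operands, and a result. The sketch below assumes the syntax tree is given as nested tuples of the form (operator, left, right) with strings as leaves; the temporary names t1, t2, ... are invented for the example. Note that no registers appear anywhere.

```python
def generate_three_address(tree):
    """Flatten a nested-tuple syntax tree into three-address instructions.

    Returns (code, result), where code is a list of (op, arg1, arg2, dest)
    tuples and result is the name that holds the value of the whole tree.
    """
    code = []
    counter = 0

    def walk(node):
        nonlocal counter
        if isinstance(node, str):
            return node  # a leaf is already a usable operand
        op, left, right = node
        left_name = walk(left)
        right_name = walk(right)
        counter += 1
        temp = f"t{counter}"  # fresh temporary for this intermediate result
        code.append((op, left_name, right_name, temp))
        return temp

    result = walk(tree)
    return code, result
```

For the tree ('+', 'a', ('*', 'b', 'c')) this produces the instructions ('*', 'b', 'c', 't1') and ('+', 'a', 't1', 't2'), with the final value in t2.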

4. Code Optimization

  • This is the fourth and optional phase of the compiler.
  • Designed to improve the intermediate code.
  • The goal is an object program that runs faster, takes less space, or both.
  • Its output is another intermediate code program that does the same job as the original.
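
Constant folding is one simple example of such an improvement: operations whose operands are already known at compile time are computed by the compiler instead of at run time. The sketch below is an illustration only. It assumes intermediate code in the form (op, arg1, arg2, dest), with integer literals written as Python ints, and it assumes each destination is assigned exactly once.

```python
def fold_constants(code):
    """Evaluate instructions whose operands are all compile-time constants.

    Returns (new_code, known), where new_code is the shorter instruction
    list and known maps each folded destination to its constant value.
    """
    ops = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}
    known = {}  # destination name -> constant value computed at compile time
    new_code = []

    def value(arg):
        # An int literal is constant; a name is constant only if already folded.
        if isinstance(arg, int):
            return arg
        return known.get(arg)

    for op, a, b, dest in code:
        va, vb = value(a), value(b)
        if va is not None and vb is not None:
            known[dest] = ops[op](va, vb)  # fold: no instruction is emitted
        else:
            # Keep the instruction, substituting any operand that is now known.
            new_code.append((op, va if va is not None else a,
                             vb if vb is not None else b, dest))
    return new_code, known
```

Given [('*', 2, 3, 't1'), ('+', 'x', 't1', 't2')], the result is the single instruction ('+', 'x', 6, 't2'): the program does the same job with one instruction fewer.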

5. Code Generation

  • This is the last phase of the compiler.
  • Produces object code by deciding:
    • The memory locations for data.
    • The code used to access each datum.
    • The registers in which each computation is to be done.
  • One of the difficult parts of the compiler.
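
To make these decisions concrete, here is a deliberately naive sketch. The target machine is hypothetical: it has two registers R0 and R1, 4-byte words, and the invented instructions LOAD, STORE, ADD, and MUL. The input is assumed to be intermediate code in the form (op, arg1, arg2, dest) with names as operands. A real code generator would choose registers far more carefully, which is a large part of why this phase is difficult.

```python
def generate_code(three_address):
    """Translate (op, arg1, arg2, dest) instructions for a hypothetical machine.

    Returns (asm, addresses): the list of assembly lines and the memory
    address chosen for each name.
    """
    addresses = {}

    def addr(name):
        # Decide a memory location the first time a name is seen
        # (assumption: every datum occupies one 4-byte word).
        if name not in addresses:
            addresses[name] = len(addresses) * 4
        return addresses[name]

    opcodes = {"+": "ADD", "*": "MUL"}
    asm = []
    for op, a, b, dest in three_address:
        asm.append(f"LOAD R0, [{addr(a)}]")      # code to access the first datum
        asm.append(f"LOAD R1, [{addr(b)}]")      # code to access the second datum
        asm.append(f"{opcodes[op]} R0, R1")      # computation done in R0
        asm.append(f"STORE R0, [{addr(dest)}]")  # result written back to memory
    return asm, addresses
```

For the instruction ('+', 'a', 'b', 't1') this emits LOAD R0, [0], LOAD R1, [4], ADD R0, R1, and STORE R0, [8].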

Apart from these phases, the following routines interact with all phases of the compiler:

Table Management

  • Also called bookkeeping.
  • The compiler keeps track of the names used by the program.
  • Records essential information about each name, such as its type (integer, real, etc.).
  • The data structure used to record this information is called a symbol table.
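
A symbol table can be as simple as a dictionary keyed by name. The class below is a minimal sketch; the method names declare and lookup are chosen for this example, and only the type is recorded, whereas a real compiler would typically store more (scope, memory location, and so on).

```python
class SymbolTable:
    """Records the names used by a program and essential information about each."""

    def __init__(self):
        self._entries = {}  # name -> dictionary of recorded information

    def declare(self, name, type_name):
        """Record a new name with its type; redeclaring a name is an error."""
        if name in self._entries:
            raise ValueError(f"{name!r} is already declared")
        self._entries[name] = {"type": type_name}

    def lookup(self, name):
        """Return the recorded information for a name, or None if unknown."""
        return self._entries.get(name)
```

Usage: after table = SymbolTable() and table.declare("count", "integer"), the call table.lookup("count") returns {'type': 'integer'}, and table.lookup("total") returns None because that name was never declared.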

Error Handler

  • Invoked when a flaw in the source program is detected.
  • Must warn the programmer by issuing diagnostic information.
  • Compilation continues on flawed programs, at least through the syntax analysis phase, so that as many errors as possible can be detected in one compilation.
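
The key idea is to record a diagnostic and keep going instead of stopping at the first flaw. The sketch below illustrates this with a toy checker; the two rules it enforces (balanced parentheses, and the characters @ and $ being illegal) are assumptions invented for the example.

```python
def check_source(source):
    """Scan the whole source and return a list of diagnostic messages.

    Errors are collected rather than raised, so one run reports every
    flaw the checker can find.
    """
    diagnostics = []
    open_parens = []  # positions of '(' that have not been closed yet
    for lineno, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == "(":
                open_parens.append((lineno, col))
            elif ch == ")":
                if open_parens:
                    open_parens.pop()
                else:
                    diagnostics.append(f"line {lineno}, col {col}: unmatched ')'")
            elif ch in "@$":  # characters this toy language does not allow
                diagnostics.append(f"line {lineno}, col {col}: illegal character {ch!r}")
    for lineno, col in open_parens:
        diagnostics.append(f"line {lineno}, col {col}: unmatched '('")
    return diagnostics
```

Calling check_source("a = (b + @c\nd = e))") reports both the illegal character on line 1 and the unmatched ')' on line 2 in a single run.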

Passes

In a compiler, portions of one or more phases are combined into a module called a pass.

  • A pass reads the source program or output of the previous pass.
  • Makes the transformations specified by its phases.
  • Writes output into an intermediate file, which may be read by a subsequent pass.
  • A multi-pass compiler is slower than a single-pass compiler because each pass reads and writes an intermediate file.
  • A compiler running on a computer with a small memory would use several passes.
  • On a computer with a large RAM, fewer passes are possible.
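
The following sketch shows the pass structure in miniature. Everything about it is an assumption for illustration: the source is a postfix expression such as "a b c * +", the intermediate file is JSON, and the output is code for a hypothetical stack machine with PUSH, ADD, and MUL. The first pass writes its output to an intermediate file, and the second pass reads that file, which is the extra I/O that makes multi-pass compilers slower.

```python
import json
import os
import tempfile


def lexical_pass(source, out_path):
    """Pass 1: split the source into tokens and write them to an intermediate file."""
    tokens = source.split()
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(tokens, f)


def codegen_pass(in_path):
    """Pass 2: read the intermediate file and emit stack-machine instructions."""
    with open(in_path, encoding="utf-8") as f:
        tokens = json.load(f)
    opcodes = {"+": "ADD", "*": "MUL"}
    return [opcodes[tok] if tok in opcodes else f"PUSH {tok}" for tok in tokens]


def compile_two_pass(source):
    """Run both passes, connected only through a temporary intermediate file."""
    fd, path = tempfile.mkstemp(suffix=".json")
    os.close(fd)
    try:
        lexical_pass(source, path)
        return codegen_pass(path)
    finally:
        os.remove(path)  # the intermediate file is no longer needed
```

For example, compile_two_pass("a b c * +") returns ['PUSH a', 'PUSH b', 'PUSH c', 'MUL', 'ADD']. Only one pass needs to be in memory at a time, which is why this arrangement suits machines with little memory.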
