What does the parser do?
Group tokens into grammatical phrases
Discovers underlying structure of program
finds syntax errors
What is output of parser for legal program?
Abstract syntax tree (+ symbol table)
intermediate code
object code
terminal symbol
Tokens returned by scanner
productions
rules
Production LHS
Single non-terminal
Production RHS
Either epsilon or a sequence of one or more terminals and/or non-terminals
CFG 4-tuple (N, E, P, S)
N - set of nonterminals
E - set of terminals
P - set of productions
S - start non-terminals
Leftmost derivation
The leftmost nonterminal is always chosen to be replaced
Rightmost derivation
The rightmost nonterminal is always chosen to be replaced
How to construct a parse tree
Start with start non-terminal
Repeat: 1) choose a leaf nonterm X 2) choose a production X->alpha 3) The symbols in alpha become the children of X in the tree
The derived string is formed by reading leaf nodes from _______ to _______
The derived string is formed by reading the leaf nodes from left to right
Ambiguous grammer
> 1 leftmost or > 1 right-most derivation or > 1 parse tree
How to write a grammer to express precedence
Use a different nonterminal for each precendence leve
Start by writing a rule for the operator with the lowest precedence
exp->
What causes ambiguity?
Ill defined precedence and/or associativity
For left associativity, use
left recursion
For right associativity, use
right recursion
How to force left associativity?
exp->exp MINUS exp | term
exp->exp MINUS term | term
Syntax directed translation
defined by augmenting the CFS: a translation rule is defined for each production
A translation rule defines the translation of the LHS non-terminal as a function of:
constants
the RHS nonterminal’s translations
the RHS tokens’ values
To translate an input string using syntax directed translation:
2. Use the translation rules to computer the translation of each non-terminal in the tree, working bottom up
Why work bottom up for syntax directed translation?
A non-terminal’s value may depend on the value of the symbols on the RHS so need to work bottom up so those values are available
AST vs. Parse Tree
AST: operators appear at internal nodes instead of leaves, chains of single productions are collapsed, listed a flattened, syntactic details are omitted
Context Free Grammar
A set of recursive rewriting rules to generate patterns of strings
CFG in compiler
Start with a string and end with a parse tree for w if w exits in L(G)