3 Syntax Analysis
3 Syntax Analysis
College of science
Department of Computer Science
Compiler
Second Phase of the Compiler
Syntax Analyzer
2023-2024
Mzhda Hiwa Hama
Syntax Analyzer
• A parser scans the input string from left to right and it makes
use of production rules for choosing appropriate derivation.
x P z
Example1…cont’d
S xPz
2- Select rule P yw P yw | y
• Not correct S
x P z
y w
3- Select rule P y S
• Correct
x P z
y
Example2
• Try T var* T
• Token matches.
• This will match but + after T will be unmatched
• Has exhausted the choices for E T + E E
T E
+
• Backtrack to choice for E var * T
Example2…cont’d
• Try E T
• Follow same steps as before for T
• Try a rule for T var
E
Token matches var
• But there is no other token after var T
var
• Try T var* T E
• Then try T var
• Succeed with the following parse tree T
var T
*
var
Notes
S
•Easy to implement by hand.
S a
•But does not always work …
S a
•Consider a production S S a S a
•S will get into an infinite loop.
S a
•This case is called left-recursion. .
.
•Recursive descent does not work in .
such cases.
Left Recursion
• A production of grammar is said to have left recursion if the
leftmost variable of its RHS is same as variable of its LHS.
• A grammar containing a production having left recursion is
called as Left Recursive Grammar.
E → E + T|T
Eliminate immediate left recursion from the Grammar
Solution
Comparing E → E + T|T with A → A α |β
A = E, α = +T, β = T
A → A α |β is changed to A → βA′ and A′ → α A′|ε
A → βA′ means E → TE′
A′ → α A′|ε means E′ → +TE′|ε
Result
E → TE′
E′ → +TE′|ε
2- Top-Down Parsing by Predictive parser
• Consider Grammar
E → T E’
E’ → + T E’ | ε
T → F T’
T’ → * F T’ | ε
F → ( E ) | id
• String id + id * id
2- Top-Down Parsing by Predictive parser
• Predicts which production to use
• By looking at the next few tokens, using “lookahead”
variable
• No backtracking
• Predictive parsers accept LL(k) grammars
• L means “Left-to-Right” scan of input
• L means “Leftmost derivation”
• k means “predict based on k tokens of lookahead”
• In practice, LL(1) is used
• LL(k) grammar must be unambiguous
• LL(k) grammar must not include any left-recursion
LL(1) Parser
• input buffer : The string to be parsed
• Output: A production rule representing a step of the
derivation sequence (left-most derivation) of the string in the
input buffer.
• Stack: keeps the grammar symbols, initially contains $.
• The symbols in RHS of rule are pushed into the stack in
reverse order i.e. from right to left
• parsing table:
• a two-dimensional array
• each row is a non-terminal symbol
• each column is a terminal symbol or the special symbol $
• each entry holds a production rule.
20
Input token a + b
LL(1) Parser
Output
top
$
Stack
a + b $ Parsing table
A
B
C
Building Predictive Parser
Three steps
1. Compute FIRST and FOLLOW
2. Construct the predictive parsing table
3. Parse the input string
• S Aa
A bdZ |eZ
Z cZ |adZ | ε
Computing FOLLOW
• FOLLOW(E) = {) , $}
FOLLOW(E’) = {) , $}
FOLLOW(T) = {+, ), $}
FOLLOW(T’) = {+, ), $}
FOLLOW(F) = {*, +, ), $}
Example 2
E → T E'
E' → + T E‘ | - T E‘ | ε
T → F T'
T' → * F T‘ | / F T' | ε
F → num | id
First Follow
E {num , id} {$}
E’ {+, -, ε} {$}
T {num , id} {+, -, $}
T’ {*, /, ε} {+, -, $}
F {num , id} {*, /, +, -, $}
29
Notes
E → T E’
E’→ + T E’ | ε
T → F T’
T’→ * F T’ | ε
F → ( E ) | id
First Follow
E {( , id} {), $}
E’ {+, ε} {), $}
T {( , id} {+, ), $}
T’ {*, ε} {+, ), $}
F {( , id} {*, +, ), $}
32
Parse table Example 1
+ * ( ) id $
E
E’
T
T’
F
33
Parse table Example 1
• For production E → T E’
• Add E → T E’ to the E row for each symbol in FIRST(E)
+ * ( ) id $
E E → T E’ E → T E’
E’
T
T’
F
Parse table Example 1
+ * ( ) id $
E E → T E’ E → T E’
E’ E’→ + T E’ E’ → ε E’ → ε
T
T’
F
35
Parse table Example 1
• For production T → F T’
• Add T → F T’ to the T row for each symbol in FIRST(T)
+ * ( ) id $
E E → T E’ E → T E’
E’ E’→ + T E’ E’ → ε E’ → ε
T T → F T’ T → F T’
T’
F
36
Parse table Example 1
• For production T’→ * F T’ | ε
• Add T’→ * F T’ to the T’ row for each symbol in FIRST(T’)
• Add T’ → ε for each symbol in FOLLOW(T’)
+ * ( ) id $
E E → T E’ E → T E’
E’ E’→ + T E’ E’ → ε E’ → ε
T T → F T’ T → F T’
T’ T’ → ε T’→ * F T’ T’ → ε T’ → ε
F
37
Parse table Example 1
• For production F → ( E ) | id
• Add F → ( E ) to the F row and symbol (
• Add F → id to the F row and symbol id
+ * ( ) id $
E E → T E’ E → T E’
E’ E’→ + T E’ E’ → ε E’ → ε
T T → F T’ T → F T’
T’ T’ → ε T’→ * F T’ T’ → ε T’ → ε
F F→(E) F → id
38
Parser Actions
• The symbol at the top of the stack (say X) and the current symbol in
the input string (say a) determine the parser action.
3. If X is a non-terminal
parser looks at the parsing table entry M[X,a]. If M[X,a] holds a
production rule XY1Y2...Yk, it pops X from the stack and pushes
Yk,Yk-1,...,Y1 into the stack. The parser also outputs the production rule
XY1Y2...Yk to represent a step of the derivation.
S aBa
B bB |
Input is id+id
stack input output
$E id+id$ E TE’
$E’T id+id$ T FT ’
$E’ T ’F id+id$ F id
$ E’ T ’id id+id$
$ E’ T ’ +id$ T’
$ E’ +id$ E’ +FT ’
$ E’ T+ +id$
$ E’ T id$ T FT ’
$ E’ T ’ F id$ F id
$ E’ T ’id id$
$ E’ T ’ $ T’
$ E’ $ E’
$ $ accept