Unit 3 - Sessions 6 - 7 (A)
Unit 3 - Sessions 6 - 7 (A)
COMPILER DESIGN
UNIT 3
SESSION 6 & 7(a)
Topics that will be covered in this Session
• LR Parsers
• Why LR Parsers?
• Items and LR(0) Automaton
• Closure of Item Sets
• LR Parsing Algorithm
LR Parsers
LR Parsers
• LR parsers use an efficient, bottom-up syntax analysis technique that can be used to parse a
large class of context-free grammars
• Generally, the technique is called LR(k) parsing
• The “L” is for left-to-right scanning of the input
• The “R” is for constructing a rightmost derivation in reverse
• The “k” for the number of input symbols of lookahead that are used in making parsing
decisions
• When (k) is omitted, k is assumed to be 1
Why LR Parsers?
Why LR Parsing is Attractive?
• LR parsers can be constructed to recognize all programming language constructs for which
context-free grammars can be written
• It can be efficiently implemented
• The class of grammars that can be parsed using LR methods is a proper superset of the class of
grammars that can be parsed with predictive parsers
• An LR parser can detect a syntactic error as early as possible
Drawbacks of LR Parsing
E’ → E E’ → • E Note:
E→E+T|T E→•E+T All the sets of items of interest is divided into
E→•T two classes:
T→T*F|F
Kernel Items: The initial item, S’ → •S,
F → ( E ) | id T→•T*F and all items whose dots are not at the left
T→•F end
Nonkernel Items: All items with their dots
F→•(E)
at the left end, except for S’ → •S
F → • id
The GOTO Function
• The second useful function is GOTO(I,X) where I is a set of items and X is a grammar symbol
• GOTO(I,X) is defined to be the closure of the set of all items [A → X • β] such that [ A → •X β]
is in I
• The GOTO function is used to define the transitions in the LR(0) automaton for a grammar
• The states of the automaton correspond to sets of items, and GOTO*I,X) specifies the transition
from the state for I under input X
Example:
If I is the set of two items { [ E’ → E •], [E → E • + T ] }, then GOTO(I,+) contains the items
E→E+•T
Note: The grammar considered is
T→•T*F
Examine I for items with + immediately to E→E+T|T
T→•F
the right of the dot. We have E → E•+ T T→T*F|F
F→•(E)
Move the dot over the + to get E → E + • T F → ( E ) | id
F → • id
Take the closure of this singleton set
LR(0) Automaton
LR(0) Automaton LR(0) Automaton for the expression grammar
Xi – a grammar symbol
si – state symbol
• Instead of grammar symbols, the stack holds states from which grammar symbols can be
recovered
• Xi is the grammar symbol represented by state si
• s0 is the start state of the parser and it does not represent any grammar symbol
• s0 serves as the bottom marker of the stack and it pays an important role in parsing
The LR Parsing
Algorithm – cont..
Example: LR Parsing
Consider the expression grammar Moves of an LR Parser for the string id*id+id
(1) E → E + T (2) E → T Stack Input Action
(3) T → T * F (4) T → F 0 id * id + id $ Shift
(5) F → ( E ) (6) F → id 0 id 5 * id + id $ Reduce by F → id
0F3 * id + id $ Reduce by T → F
The LR Parsing Table
0T2 * id + id $ Shift
0T2*7 id + id $ Shift
0 T 2 * 7 id 5 + id $ Reduce by F → id
0 T 2 * 7 F 10 + id $ Reduce by T → T * F
0T2 + id $ Reduce by E → T
0E1 + id $ Shift
0E1+6 id $ Shift
0 E 1 + 6 id 5 $ Reduce by F → id
0E1+6F3 $ Reduce by T → F
0E1+6T9 $ Reduce by E → E + T
0E1 $ Accept
Techniques for constructing LR parsing table
1. Simple LR (SLR)
• Easiest to implement
• The least powerful of the three
• It may fail to produce a parsing table for certain grammars on which the other methods
succeed
2. Canonical LR (CLR)
• The most powerful
• The most expensive
3. Lookahead LR (LALR)
• Intermediate in power and cost between the other two
LR Grammar