0% found this document useful (0 votes)
81 views14 pages

Non-Recursive Predictive Parsing

This document discusses non-recursive predictive parsing and describes how to build a table-driven predictive parser without using recursion. It explains that a parsing table and set of parsing rules can be automatically generated from a grammar. The key aspects covered are: defining grammars, constructing FIRST and FOLLOW sets, identifying LL(1) grammars, building parsing tables, and performing error recovery by synchronizing on certain terminals.

Uploaded by

lahsivlahsiv
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
81 views14 pages

Non-Recursive Predictive Parsing

This document discusses non-recursive predictive parsing and describes how to build a table-driven predictive parser without using recursion. It explains that a parsing table and set of parsing rules can be automatically generated from a grammar. The key aspects covered are: defining grammars, constructing FIRST and FOLLOW sets, identifying LL(1) grammars, building parsing tables, and performing error recovery by synchronizing on certain terminals.

Uploaded by

lahsivlahsiv
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 14

Non-recursive predictive parsing

Observation:

Our recursive descent parser encodes state information in its


run-time stack, or call stack.

Using recursive procedure calls to implement a stack abstraction may not


be particularly efficient.

This suggests other implementation methods:

• explicit stack, hand-coded parser


• stack-based, table-driven parser

1
Non-recursive predictive parsing

Now, a predictive parser looks like:

stack

source tokens table-driven


scanner IR
code parser

parsing
tables

Rather than writing code, we build tables.

Building tables can be automated!


2
Table-driven parsers

A parser generator system often looks like:

stack

source tokens table-driven


scanner IR
code parser

parser parsing
grammar
generator tables

This is true for both top-down (LL) and bottom-up (LR) parsers

3
Non-recursive predictive parsing

Input: a string w and a parsing table M for G

tos ← 0
Stack[tos] ← EOF
Stack[++tos] ← Start Symbol
token ← next token()
repeat
X ← Stack[tos]
if X is a terminal or EOF then
if X = token then
pop X
token ← next token()
else error()
else /* X is a non-terminal */
if M[X,token] = X → Y1Y2 · · ·Yk then
pop X
push Yk ,Yk−1, · · · ,Y1
else error()
until X = EOF
4
Non-recursive predictive parsing

What we need now is a parsing table M.

Our expression grammar: Its parse table:


1 hgoali ::= hexpri
id num + − ∗ / $†
2 hexpri ::= htermihexpr′i
3 hexpr′i ::= +hexpri hgoali 1 1 – – – – –
4 | −hexpri hexpri 2 2 – – – – –
5 | ε hexpr′i – – 3 4 – – 5
6 htermi ::= hfactorihterm′i htermi 6 6 – – – – –
7 hterm′ i ::= ∗htermi hterm′i – – 9 9 7 8 9
8 | /htermi hfactori 11 10 – – – – –
9 | ε
10 hfactori ::= num
11 | id


we use $ to represent EOF
5
FIRST
For a string of grammar symbols α, define FIRST(α) as:

• the set of terminal symbols that begin strings derived from α:


{a ∈ Vt | α ⇒∗ aβ}
• If α ⇒∗ ε then ε ∈ FIRST (α)
FIRST (α) contains the set of tokens valid in the initial position in α
To build FIRST (X):

1. If X ∈ Vt then FIRST (X) is {X}


2. If X → ε then add ε to FIRST (X)
3. If X → Y1Y2 · · ·Yk :
(a) Put FIRST (Y1) − {ε} in FIRST (X)
(b) ∀i : 1 < i ≤ k, if ε ∈ FIRST (Y1) ∩ · · · ∩ FIRST(Yi−1)
(i.e., Y1 · · ·Yi−1 ⇒∗ ε)
then put FIRST(Yi) − {ε} in FIRST(X)
(c) If ε ∈ FIRST (Y1) ∩ · · · ∩ FIRST (Yk ) then put ε in FIRST (X)
Repeat until no more additions can be made.
6
FOLLOW

For a non-terminal A, define FOLLOW(A) as

the set of terminals that can appear immediately to the right of A


in some sentential form

Thus, a non-terminal’s FOLLOW set specifies the tokens that can legally
appear after it.
A terminal symbol has no FOLLOW set.
To build FOLLOW(A):

1. Put $ in FOLLOW(hgoali)
2. If A → αBβ:
(a) Put FIRST (β) − {ε} in FOLLOW(B)
(b) If β = ε (i.e., A → αB) or ε ∈ FIRST (β) (i.e., β ⇒∗ ε) then put
FOLLOW(A) in FOLLOW(B)
Repeat until no more additions can be made

7
LL(1) grammars

Previous definition

A grammar G is LL(1) iff. for all non-terminals A, each distinct pair


of productions A → β and A → γ satisfy the condition
FIRST (β) FIRST (γ) = φ.
T

What if A ⇒∗ ε?
Revised definition

A grammar G is LL(1) iff. for each set of productions


A → α1 | α2 | · · · | αn :
1. FIRST (α1), FIRST (α2), . . . , FIRST (αn) are all pairwise disjoint
2. If αi ⇒∗ ε then FIRST(α j ) FOLLOW (A) = φ, ∀1 ≤ j ≤ n, i 6= j.
T

If G is ε-free, condition 1 is sufficient.

8
LL(1) grammars

Provable facts about LL(1) grammars:

1. No left-recursive grammar is LL(1)


2. No ambiguous grammar is LL(1)
3. Some languages have no LL(1) grammar
4. A ε–free grammar where each alternative expansion for A begins with
a distinct terminal is a simple LL(1) grammar.

Example

• S → aS | a is not LL(1) because FIRST(aS) = FIRST(a) = {a}


• S → aS′
S′ → aS′ | ε
accepts the same language and is LL(1)

9
LL(1) parse table construction

Input: Grammar G
Output: Parsing table M
Method:

1. ∀ productions A → α:
(a) ∀a ∈ FIRST (α), add A → α to M[A, a]
(b) If ε ∈ FIRST (α):
i. ∀b ∈ FOLLOW(A), add A → α to M[A, b]
ii. If $ ∈ FOLLOW(A) then add A → α to M[A, $]
2. Set each undefined entry of M to error

If ∃M[A, a] with multiple entries then grammar is not LL(1).

Note: recall a, b ∈ Vt , so a, b 6= ε
10
Example
Our long-suffering expression grammar:
S→E T → FT ′
E → T E′ T ′ → ∗T | /T | ε
E ′ → +E | −E | ε F → id | num

FIRST FOLLOW
S {num, id} {$}
E {num, id} {$}
E ′ {ε, +, −} {$} id num + − ∗ / $
T {num, id} {+, −, $} S S→E S→E − − − − −
T ′ {ε, ∗, /} {+, −, $} E ′
E → TE E → TE ′
− − − − −
F {num, id} {+, −, ∗, /, $} E′ − − ′ ′
E → +E E → −E − − E →ε

′ ′
id {id} − T T → FT T → FT − − − − −
num {num} − T′ − − T → ε T → ε T → ∗T T → /T T → ε
′ ′ ′ ′ ′

∗ {∗} − F F → id F → num − − − − −
/ {/} −
+ {+} −
− {−} −

11
Building the tree

Again, we insert code at the right points:


tos ← 0
Stack[tos] ← EOF
Stack[++tos] ← root node
Stack[++tos] ← Start Symbol
token ← next token()
repeat
X ← Stack[tos]
if X is a terminal or EOF then
if X = token then
pop X
token ← next token()
pop and fill in node
else error()
else /* X is a non-terminal */
if M[X,token] = X → Y1Y2 · · ·Yk then
pop X
pop node for X
build node for each child and
make it a child of node for X
push nk ,Yk , nk−1,Yk−1, · · · , n1,Y1
else error()
until X = EOF

12
A grammar that is not LL(1)
hstmti ::= if hexpri then hstmti
| if hexpri then hstmti else hstmti
| ...

Left-factored:
hstmti ::= if hexpri then hstmti hstmt′i | . . .
hstmt′i ::= else hstmti | ε

Now, FIRST(hstmt′i) = {ε, else}


Also, FOLLOW(hstmt′i) = {else, $}
But, FIRST(hstmt i) FOLLOW(hstmt′i) = {else} 6= φ
′ T

On seeing else, conflict between choosing

hstmt′i ::= else hstmti and hstmt′i ::= ε

⇒ grammar is not LL(1)!


The fix:

Put priority on hstmt′i ::= else hstmti to associate else with


closest previous then.
13
Error recovery
Key notion:

• For each non-terminal, construct a set of terminals on which the


parser can synchronize
• When an error occurs looking for A, scan until an element of
SYNCH (A) is found

Building SYNCH:

1. a ∈ FOLLOW(A) ⇒ a ∈ SYNCH (A)


2. place keywords that start statements in SYNCH (A)
3. add symbols in FIRST (A) to SYNCH (A)
If we can’t match a terminal on top of stack:

1. pop the terminal


2. print a message saying the terminal was inserted
3. continue the parse
(i.e., SYNCH (a) = Vt − {a})
14

You might also like