0% found this document useful (0 votes)
3 views

CD_Unit3

The document covers the concepts of bottom-up parsing in compiler design, focusing on techniques such as Shift Reduce Parsing, Operator Precedence Parsing, and various types of LR Parsers. It explains the processes of reductions, handle pruning, and the construction of parsing tables, along with examples and potential conflicts that may arise during parsing. Additionally, it discusses the advantages and disadvantages of different parsing methods and provides algorithms for implementing these techniques.

Uploaded by

bm8968
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
3 views

CD_Unit3

The document covers the concepts of bottom-up parsing in compiler design, focusing on techniques such as Shift Reduce Parsing, Operator Precedence Parsing, and various types of LR Parsers. It explains the processes of reductions, handle pruning, and the construction of parsing tables, along with examples and potential conflicts that may arise during parsing. Additionally, it discusses the advantages and disadvantages of different parsing methods and provides algorithms for implementing these techniques.

Uploaded by

bm8968
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 103

21CSC304J-COMPILER DESIGN

UNIT 3 Bottom-Up Parsing

Bottom Up Parsing-Reductions-Handle Pruning-Shift Reduce


Parser-Problems related to Shift Reduce Parsing-Operator
Precedence Parser, LEADING, TRAILING -LR Parser- LR Parsers-
Need of LR Parsers-LR (0)Item-Closure of Item Sets- Construction
of SLR Parsing Table -Problems related to SLR-Construction of
Canonical LR(1)- Problems related to CLR - LALR Parser —
Problems related to LALR-YACC.
BOTTOM UP PARSING

• Bottom up parsing works in the opposite direction from top down.


• A top down parser begins with the start symbol at the top of the parse tree and works
downward, driving productions in forward order until it gets to the terminal leaves.
• A bottom up parse starts with the string of terminals itself and builds from the leaves
upward, working backwards to the start symbol by applying the productions in reverse.
• A bottom-up parse corresponds to the construction of a parse tree for an input string
beginning at the leaves (the bottom) and working up towards the root (the top).
• Bottom-up syntax analysis is also termed as Shift Reduce parsing or LR parsing.
• At each and every step of reduction, the right side of a production which matches with
the substring is replaced by the left side symbol of the production.
• If the sub string is chosen correctly at each step, a rightmost derivation is traced out in
reverse.
BOTTOM UP PARSERS - TYPES

SLR (1) Simple LR


LALR Look ahead LR
CLR Canonical LR
EXAMPLE OF BOTTOM UP PARSE TREE
Consider the given grammar and to parse the input string abbcde,
REDUCTIONS

• It is the process of "reducing" a Input string w to the start symbol of the grammar.
• At each reduction step, a specific substring matching the body of a production is replaced by the non
terminal at the head of that production
• A reduction is the reverse of a step in a derivation, therefore bottom up parser is rightmost
derivation in reverse.

Example:
• Consider the grammar: S → aABe
A → Abc | b
B→d
• The sentence/Input string(w) to be recognized is abbcde.
HANDLE

• A "handle" is a substring that matches the right side of a production, and whose reduction to the non-
terminal on the left side of the production represents one step along the reverse of a rightmost derivation.
EXAMPLE OF HANDLE
HANDLE PRUNING

• The implementation of handle pruning involves the following data-structures:-


– a stack - to hold the grammar symbols;
– an input buffer that contains the remaining input and a table to decide handles.
EXAMPLE OF HANDLE PRUNING

Consider the given grammar,


SHIFT REDUCE PARSERS (SR PARSERS)

• Shift reduce parsing is a bottom-up parsing that reduces a string w to the start symbol of the grammar.
• It scans and parses the input text in one forward pass without backtracking.
Stack implementation of shift reduce parsing
• Handle pruning must solve the following two problems to perform parsing
• Locating the substring to be reduced in a right sentential form.
• Determining what production to choose in case there is more than one productions with that substring on the
right side.
• The type of data structure to use in a shift reduce parser.
SHIFT REDUCE PARSERS (SR PARSERS)

Implementation of Shift reduce parser


SR parsers can be implemented by using the following components
• Stack is used to hold grammar symbols.
• An input buffer is used to hold the string w to be parsed.
• $ is used to mark the bottom of the stack and also the right end of the input.
• Initially the stack is empty and the string w is on the input, Stack ($) Input(w$)
• The parser processes by shifting 0 or more input symbols onto the stack until a handle β is on top of the stack.
• The parser then reduces β to the left side of the appropriate production.
• The parser repeats this cycle until it has detected an error or until the stack contains the start symbol and the input
is empty. Stack ($S) Input($)
• When the input buffer reduces the end marker symbol $ and the stack contains the start symbol, the parser halts and
announces successful completion of parsing.
SHIFT REDUCE PARSING ALGORITHM
SHIFT REDUCE PARSERS (SR PARSERS)

Actions in Shift Reduce parsers


1. Shift: This action shifts the next input symbol present on the input buffer onto the top of the stack.
2. Reduce:
• This action is performed if the top of the stack has an input symbol that denotes a right end of a substring.
• And within the stack there exist the left end of the substring.
• The reduce action replaces the entire substring with the appropriate non-terminal.
• The production body of this non-terminal matches the replaced substring.
3. Accept:
• When at the end of parsing the input buffer becomes empty.
• And the stack has left with the start symbol of the grammar.
• The parser announces the successful completion of parsing.
4. Error: This action identifies the error and performs an error recovery routine.
EXAMPLE OF SHIFT REDUCE PARSER
Consider that we have a string id * id + id and the grammar for the input string is:
E ->E + T | T
T -> T * F | F • In the shift-reduce parsing a
F -> ( E ) | id handle always appears on the
top of the stack.
• The handle is a substring that
matches the body of
production in the grammar.
• The handle must never appear
inside the stack.
EXAMPLE OF SHIFT REDUCE PARSER
CONFLICTS/ PROBLEMS DURING SHIFT REDUCE PARSING
CONFLICT TYPES

There are two types of conflicts during Shift Reduce parsing


1. Shift-Reduce Conflict
2. Reduce-Reduce Conflict
EXAMPLE OF CONFLICTS DURING SR PARSING
EXAMPLE OF CONFLICTS DURING SR PARSING

Example of shift-reduce conflict

1. cannot decide whether to shift or to reduce (a shift/reduce conflict)


2. cannot decide which of several reductions to make (a reduce/reduce conflict).
Reduce-reduce conflicts are rare and usually indicate a problem in the grammar definition.
OPERATOR PRECEDENCE PARSER

• Operator grammar
– small, but an important class of grammars (parse only small class of grammars)
– we may have an efficient operator precedence parser (a shift-reduce parser) for an
operator grammar.
• In an operator grammar, no production rule can have:
– ε at the right side
– two adjacent non-terminals at the right side.

• Example :
E→AB E→EOE E→E+E |
A→a B→b E→id O→+|*|/ E*E | E/E | id
not operator grammar Not operator grammar An operator grammar
PRECEDENCE RELATIONS

• Operator precedence relation describes relation between pair of terminals which guides the selection of handles.
• The determination of correct precedence relations between terminals are based on the traditional notions of associativity
and precedence of operators. (Unary minus causes a problem).
IDEA BEHIND OPERATOR PRECEDENCE PARSING
OPERATOR PRECEDENCE ALGORITHM
OPERATOR PRECEDENCE RELATIONS

+ - * / ^ id ( ) $
+ .> .> <. <. <. <. <. .> .>

- .> .> <. <. <. <. <. .> .>

* .> .> .> .> <. <. <. .> .>

/ .> .> .> .> <. <. <. .> .>

^ .> .> .> .> <. <. <. .> .>

id .> .> .> .> .> .> .>

( <. <. <. <. <. <. <. =·


) .> .> .> .> .> .> .>

$ <. <. <. <. <. <. <.


EXAMPLE OPERATOR PRECEDENCE PARSER
EXAMPLE OF OPERATOR PRECEDENCE PARSING ALGORITHM
PRECEDENCE FUNCTIONS
PRECEDENCE FUNCTIONS EXAMPLE
OPERATOR PRECEDENCE PARSING

• Disadvantages:
– It cannot handle the unary minus (the lexical analyzer should handle the unary
minus).
– Small class of grammars.
– Difficult to decide which language is recognized by the grammar.

• Advantages:
– simple
– powerful enough for expressions in programming languages
ERROR RECOVERY IN OPERATOR PRECEDENCE PARSING

• Error Cases:
1. No relation holds between the terminal on the top of stack and the next input
symbol.
2. A handle is found (reduction step), but there is no production with this handle
as a right side

• Error Recovery:
3. Each empty entry is filled with a pointer to an error routine.
4. Decides the popped handle “looks like” which right hand side. And tries to
recover from that situation.
COMPUTATION OF LEADING AND TRAILING

• Two Methods to determine a precedence relation between a pair of


terminals
– Based on associativity and precedence relations of operators
– Using Operator Precedence Grammar
• Implementation of Operator-Precedence Parser:
– An operator-precedence parser is a simple shift-reduce parser that is capable of parsing a
subset of LR(1) grammars.
– The operator-precedence parser can parse all LR(1) grammars where two consecutive non-
terminals and epsilon never appear in the right-hand side of any rule.
STEPS FOR LEADING AND TRAILING

• Steps involved in Parsing:


1.Ensure the grammar satisfies the pre-requisite.
2. Computation of the function LEADING()
3. Computation of the function TRAILING()
4. Using the computed leading and trailing ,construct the operator Precedence Table
5. Parse the given input string based on the algorithm
6. Compute Precedence Function and graph.
COMPUTATION OF LEADING AND TRAILING

Leading is defined for every non-terminal. Terminals that can be the first terminal in a string derived from that non-terminal.

Compute LEADING (A)


• LEADING (A) = {a| A → γaδ, where γ is ε or a single non-terminal.}
• Rule 1: a is in LEADING (A) if there is a production of the form A → γaδ, Where γ is ε or a single non-terminal
• Rule 2: a is in LEADING (B) and if there is a production of the form A → Bα, then a is in LEADING (A)

Trailing is defined for every non-terminal. • Terminals that can be the last terminal in a string derived from that non-terminal.

Compute TRAILING (A)


• TRAILING (A) = {a| A → γaδ, where δ is ε or a single non-terminal.}
• Rule 1: a is in TRAILING (A) if there is a production of the form A → γaδ, Where δ is ε or a single non-terminal
• Rule 2: a is in TRAILING (B) and if there is a production of the form A → αB, then a is in TRAILING (A)
EXAMPLE OF LEADING AND TRAILING
ALGORITHM FOR OPERATOR PRECEDENCE TABLE

• Step 2: After computing LEADING and TRAILING, the table is constructed between all the terminals in the grammar including the ‘$’ symbol.
PRECEDENCE RELATION TABLE

• Step 3 : construct the precedence relation table


PARSING ALGORITHM

Algorithm:
set p to point to the first symbol of w$ ;
repeat forever
if $ is on top of the stack and p points to $ then return else
begin
let a be the topmost terminal symbol on the stack and let b be the symbol pointed to by p;
if ( a <. b or a =· b ) then
push b onto the stack;
advance p to the next input symbol;
end
else if a .> b then repeat pop
stack
until the top of stack terminal is related by <. to the terminal most recently popped
else error();
end
PARSING THE INPUT
Step 4 : parsing the input string (id+id)*id$
PRECEDENCE FUNCTIONS

• Compilers using operator-precedence parsers need not store the table of precedence relations.
• In most cases, the table can be encoded by two precedence functions f and g that map terminal symbols to integers.
We attempt to select f and g so that, for symbols a and b.
ALGORITHM FOR CONSTRUCTING PRECEDENCE RELATIONS
LR PARSERS
• LR parsing is one type of bottom up parsing. It is used to parse the large class of grammars.
• In the LR parsing, "L" stands for left-to-right scanning of the input.
• "R" stands for constructing a right most derivation in reverse.
• "K" is the number of input symbols of the look ahead used to make number of parsing decision.
• LR parsing is divided into four parts: LR (0) parsing, SLR parsing, CLR parsing and LALR parsing.
• LR parsers can usually recognize all programming language construct that can be specified by context-free grammars.
• LR parsers detect errors fast.
• Drawback: it is too much work to construct an LR parser by hand.
• Fortunately, we can use an LR parser generator such as YACC.

LR parsing is attractive because:


– LR parsing is most general non-backtracking shift-reduce parsing, yet it is still efficient.
– The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can
be parsed with predictive parsers.
• LL(1)-Grammars ⊂ LR(1)-Grammars
– An LR-parser can detect a syntactic error as soon as it is possible to do so a left-to-right scan of the input.
TYPES OF LR PARSERS
LR PARSERS
• LR-Parsers
– covers wide range of grammars.
– SLR – simple LR parser
– LR – most general LR parser
– LALR – intermediate LR parser (look-head LR parser)
– SLR, LR and LALR work same (they used the same algorithm), only their parsing tables are different.
CONFIGURATION OF LR PARSING ALGORITHM
LR PARSING ALGORITHM
SLR PARSER / LR (0) PARSER

• An LR(0) parser is a Shift Reduce parser that uses zero tokens of look ahead to determine what action to take.
• This means that in any configuration of the parser, the parser must have an unambiguous action to choose- either it
shifts a specific symbol or applies a specific reduction.
• If there are ever two or more choices to make, the parser fails and the grammar is not LR(0).
• An LR parser makes shift-reduce decisions by maintaining states to keep track of parsing. States represent a set of
items.
• To find
1. Closure of items ets
2. Construct canonical LR(0) collection
CONSTRUCTION OF THE C ANONIC AL LR(0) COLLECTION

• To create the SLR parsing tables for a grammar G, we will create the canonical LR(0) collection of the grammar G’.
Algorithm

.
C is { closure({S’→ S}) }

repeat the followings until no more set of LR(0) items can be added to C.
for each I in C and each grammar symbol X
if goto(I,X) is not empty and not in C An LR(0) item of a grammar G is a production of G with a • at
some position of the right-hand side
add goto(I,X) to C Thus, a production
A  XY Z
• goto function is a DFA on the sets in C. has four items:
[A  • X Y Z]
[A  X • Y Z]
[A  X Y • Z]
[A  X Y Z •]
Note that production A   has one item [A  •]
GOTO FUNCTION
CLOSURE FUNCTION

LR(0) PARSING TABLE


• If a state is going to some other state on a terminal then it correspond to a Shift move.
• If a state is going to some other state on a variable then it correspond to Go to move.
• If a state contain the final item in the particular row then write the Reduce move completely.
LR PARSERS

Steps

1. Number the productions


2. Construct augmented grammar by introducing a new start symbol and a
new production S’ → S.
3. Find GOTO and closure of item sets (DFA Transition diagram)
4. Construction of parsing table
5. Parse table on input
EXAMPLE OF SLR PARSER

Construct LR parsing table for the given context-free grammar


• S–>AA
A–>aA|b
• Solution:
• STEP1: Find augmented grammar
The augmented grammar of the given grammar is:-
• S’–>.S [0th production]
S–>.AA [1st production]
A–>.aA [2nd production]
A–>.b [3rd production]
• STEP2: Find LR(0) collection of items
CONTD..

• Defining 2 functions: goto[list of non-terminals] and action[list of terminals] in the parsing table. Below is the SLR
parsing table.

no shift/reduce or
no reduce/reduce conflict


so, it is a SLR grammar.
EXAMPLE OF SLR PARSER

Implement SLR Parser for the given grammar:


1. E→E + T
2. E→T
Step 1: Create augmented grammar using the provided grammar.
3. T→T * F Augmented grammar:
4. T→F E'→E
E→E + T
5. F→(E) E→T
6. F→ id T→T * F
T→F
F→(E)
F→id
Step 2: Find LR(0) canonical Items
GOTO(I6, T)
I9: E->E+T.
Step 2: (Second method) Find LR(0) canonical Items T->T.*F
GOTO (I0, E) GOTO (I0, id) GOTO(I4, T) GOTO(I7, id)
GOTO(I6, F)
l1: E’->E. I5: F->id. I2: E->T. I5: F->id.
I3: T->F.
E->E.+T T->T.*F
GOTO (I1, +) GOTO(I8, ) )
GOTO(I6, ( )
GOTO (I0, T) I6: E->E+.T GOTO(I4, F) I11: F->(E).
I4: F->(.E)
l2: E->T. T->.T*F I3: T->F.
T->T.*F T->.F GOTO(I8, +)
GOTO(I6, id)
F->.(E) GOTO(I4, ( ) I6: E->E+.T
I5: F->id.
GOTO (I0, F) F->.id I4: F->(.E) T->.T*F
l3: T->F. E->.E+T T->.F
GOTO(I7, F)
GOTO(I2, *) E->.T F->.(E)
I10: T->T*F.
GOTO (I0, ( ) I7: T->T*.F T->.T*F F->.id
I4: F->(.E) F->.(E) T->.F
GOTO(I7, ( )
E->.E+T F->.id F->.(E) GOTO(I9, *)
I4: F->(.E)
E->.T F->.id I7: T->T*.F
E->.E+T
T->.T*F GOTO(I4, E) F->.(E)
E->.T
T->.F I8: F->(E.) GOTO(I4, id) F->.id
T->.T*F
F->.(E) E->E.+T I5: F->id.
T->.F
F->.id
F->.(E)
F->.id
Step 3: Construct the Parsing table
To fill the reduction action in the ACTION section of the table, computation of FOLLOW is necessary.
Follow(E): {$, ), +}
Follow(T): {$, +, ), *}
Follow(F): {*, +, ), $}
Step 4: Parse the given input. Parse the string id*id+id using stack implementation.

no shift/reduce or
no reduce/reduce conflict

Using the SLR parser, the given input string id*id+id has been fully parsed. ⇓
so, it is a SLR grammar.
SLR PARSER
SLR PARSER EXAMPLE
Construct SLR parsing
table for the given
grammar
S → AA
A → aA | b

1 Find augmented
production
S` → •S
S → •AA
A → •aA
A → •b

2 Constructing DFA
SLR PARSER EXAMPLE

3 LR (0) parsing table


• If a state is going to some other state on a terminal then it correspond to a shift move.
• If a state is going to some other state on a variable then it correspond to go to move.
• If a state contain the final item in the particular row then write the reduce node completely.

no shift/reduce or
no reduce/reduce conflict


so, it is a SLR grammar.
SLR PARSER EXAMPLE
ISSUES OF SLR / LR(0) PARSER
EXAMPLE OF SLR CONFLICT (SHIFT –REDUCE)

The above example shows the shift-reduce conflict


Check the conflicts through canonical collection of LR(0) items or through SLR parsing table.
EXAMPLE OF SLR CONFLICT (REDUCE –REDUCE)

The above example shows the reduce-reduce conflict


Check the conflicts through canonical collection of LR(0) items or through SLR parsing table
LR (1) / CLR PARSER
CLR PARSER / LR (1) PARSER
• In SLR method, the state i makes a reduction by A→α when the current token is a:
– if the A→α. in the Ii and a is FOLLOW(A)

• In some situations, βA cannot be followed by the terminal a in


a right-sentential form when βα and the state i are on the top stack. This means that making reduction in
this case is not correct.

S → AaAb S➺AaAb➺Aab➺ab S➺BbBa➺Bba➺ba


S → BbBa
A→ε Aab ➺ ε ab Bba ➺ ε ba
B→ε AaAb ➺ Aa ε b BbBa ➺ Bb ε a

• To avoid some of invalid reductions, the states need to carry more information.
• Extra information is put into a state by including a terminal symbol as a second component in an item.

• A LR(1) item is: .


A → α β,a where a is the look-head of the LR(1) item (a is a terminal or end-marker.)
C ANONIC AL COLLECTION OF LR (1) ITEM SETS
CLOSURE OF ITEM SETS IN LR (1)
EXAMPLE OF LR (1) PARSER
EXAMPLE OF LR (1) PARSER
Construct a CLR parsing table for the given context-free grammar
S-->AA
A-->aA|b

STEP 1 – Find augmented grammar

The augmented grammar of the given grammar is:-

S'-->.S ,$ [0th production]


S-->.AA ,$ [1st production]
A-->.aA ,a|b [2nd production]
A-->.b ,a|b [3rd production]

STEP 2 – Find LR(1) collection of items


Below is the figure showing the LR(1) collection of items.
STEP 3 - transition diagram of LR(1)
STEP 4 defining 2 functions: goto[list of terminals] and action[list of non-terminals] in the parsing
table. Below is the CLR parsing table

STEP 5 : parse the input aabb using LR(1) parser.

The above example shows the LR (1) parser


.
no shift/reduce or no reduce/reduce conflict so, it is a CLR parser
EXAMPLE - SECOND METHOD OF FINDING LR(1) ITEMSETS
ISSUES IN CLR PARSING
• The primary issue with a CLR (Canonical LR) parser is that while it is more powerful than simpler LR parsers like
SLR.
• It can still generate very large parsing tables due to the extensive look-ahead information it needs to handle, leading
to increased memory usage and potentially slower parsing times.
• Because CLR parsers consider all possible look-ahead symbols when building their parsing table, the number of
states can significantly increase, resulting in large table sizes compared to other LR parsers.
• In CLR(1), the only chance of having SR conflict is when you have multiple productions in a state.

Shift / Reduce Conflict


• We say that we cannot introduce a shift/reduce conflict during the shrink process for the creation of the

. .
states of a LALR parser.

. .
• Assume that we can introduce a shift/reduce conflict. In this case, a state A → α ,a and B → β aγ,b
• This means that a state of the canonical LR(1) parser must have: A → α ,a and B → β aγ,c.
But, this state has also a shift/reduce conflict. i.e. The original canonical LR(1) parser has a conflict. (Reason
for this, the shift operation does not depend on lookaheads)
Reduce / Reduce Conflict
• But, we may introduce a reduce/reduce conflict during the shrink process for the creation of the states of a LALR
parser.
EXAMPLE
LALR PARSER
LALR PARSER EXAMPLE
Unambiguous LR(1) grammar:
SL=R
|R
L*R
| id
RL
Augment with S’  S
LALR items (next slide)
LALR PARSER PARSING TABLE

no shift/reduce or no reduce/reduce conflict so, it is a LALR parser


LALR PARSER EXAMPLE
Construct LALR parsing table for the grammar 3. Construct parsing table
S → AA
A → aA
A→b

1. Find the augmented production,


S` → •S, $
S → •AA, $
A → •aA, a/b
A → •b, a/b

2. Draw DFA
EXAMPLE
EXAMPLE
ERROR CORRECTION AND RECOVERY IN LR PARSING

• Canonical LR parser uses full LR(1) parse tables and will never make a single reduction before recognizing the error when a
syntax error occurs on the input
• SLR and LALR may still reduce when a syntax error occurs on the input, but will never shift the erroneous input symbol.
ERROR RECOVERY IN LR PARSING
• Panic mode
• Pop until state with a goto on a nonterminal A is found, (where A represents a major
programming construct), push A
• Discard input symbols until one is found in the FOLLOW set of A
• Phrase-level recovery
• Implement error routines for every error entry in table
• Error productions
• Pop until state has error production, then shift on stack
• Discard input until symbol is encountered that allows parsing to continue
LL, SLR, CLR, LALR SUMMARY

• LL parse tables computed using FIRST/FOLLOW


• Nonterminals  terminals  productions
• Computed using FIRST/FOLLOW
• LR parsing tables computed using closure/goto
• LR states  terminals  shift/reduce actions
• LR states  nonterminals  goto state transitions
• A grammar is
• LL(1) if its LL(1) parse table has no conflicts
• SLR if its SLR parse table has no conflicts
• LALR if its LALR parse table has no conflicts
• LR(1) if its LR(1) parse table has no conflicts
YACC- YET ANOTHER COMPILER COMPILER

1. Tool which will produce a parser for a given grammar.


• YACC (Yet Another Compiler Compiler) is a program designed to compile a LALR(1) grammar and to produce the source
code of the syntactic analyzer of the language produced by this grammar.
YACC

%{C declarations%}
yacc declarations
%%
Grammar rules
%%
Additional C code
• Comments enclosed in /* ...*/
LEX AND YACC

üLex generates C code for a lexical analyzer, or scanner


üLex uses patterns that match strings in the input and converts the strings to tokens

üYacc generates C code for syntax analyzer, or parser.


üYacc uses grammar rules that allow it to analyze tokens from Lex and create a
syntax tree.

ü For example, you use lex to identify keywords, identifiers, strings, comments, whitespace, and so
on. yacc is for parsing your grammar.
WORK WITH LEX AND YACC

[0-9]+
call yylex()

next token is NUM

NUM ‘+’ NUM

90
DEFINITIONS SECTION

%{
#include <stdio.h>
#include <stdlib.h>
%}
It is a terminal
%token ID NUM
%start expr

EXPR TO PARSE

91
START SYMBOL

 This section defines grammar


 Example

expr : expr '+' term | term;


term : term '*' factor | factor;
factor : '(' expr ')' | ID | NUM;
Rules Section
• The first non-terminal specified in the grammar specification section.
• To overwrite it with %start declaration.
%start non-terminal

92
RULES SECTION

• Normally written like this


• Example:
expr : expr '+' term
| term
;
term : term '*' factor
| factor
;
factor : '(' expr ')'
| ID
| NUM
;

9
3
THE POSITION OF RULES

expr : expr '+' term { $$ = $1 + $3; }


| term { $$ = $1; }
;
term : term '*' factor { $$ = $1 * $3; }
| factor { $$ = $1; }
;
factor : '(' expr ')' { $$ = $2; }
| ID
| NUM
;

94
THE POSITION OF RULES

expr : expr ' ' term { $$ = $1 + $3; }


expr : '+' term { $$ = $1 + $3; }
| term { $$ = $1; }
| { $$ = $1; }
;
;
term : term ' ' factor { $$ = $1 * $3; }
term : '*' factor { $$ = $1 * $3; }
| factor { $$ = $1; }
| { $$ = $1; }
;
;
factor : '(' ')' { $$ = $2; }
factor : ' ' expr ')' { $$ = $2; }
| ID
| ID
| NUM
| NUM
;
;

95
POSITION OF RULES

expr : expr '+' { $$ = $1 + $3; }


| term { $$ = $1; }
;
term : term '*' { $$ = $1 * $3; }
| factor { $$ = $1; }
;
factor : '(' expr ' ' { $$ = $2; }
| ID
| NUM
;

96
STEPS FOR YACC

1. Create a LEX file with ‘.l’ externsion and fill the definition and rule section. (lex_file_name.l)
2. The user sub-routine section is not necessary.
3. Create a YACC file with ‘.y’ extension and save it. (yacc_file_name.y)
4. Compile the LEX file. (lex lex_file_name.l)
5. Compile the YAAC file. (yacc – d yacc_file_name.y)
6. Compile both using C compiler. (cc lex.yy.c y.tab.c)
7. Execute the object file. ./a.out
TO RECOGNIZE A VALID IDENTIFIER
// fn.l
%option noyywrap
digit[0-9]
letter[a-zA-Z]
%{
#include "y.tab.h"
%}
%%
{letter}({letter}|{digit})* {return VALID;}
. {return INVALID;}
%%
// FN.Y

%{ int main (void)


#include <stdio.h> {
#include <stdlib.h> printf ("\n Enter String: ");
int yylex(); yyparse();
void yyerror (const char *s); return 0;
%} }
%token VALID INVALID void yyerror (const char *s)
%% {
term: VALID {printf ("The given string is Valid."); exit(0);} fprintf (stderr, "%s", s);
| INVALID {yyerror ("The given string is Invalid."); exit(0);} }
%%
TO RECOGNIZE A VALID ARITHMETIC EXPRESSION

// arith.l
%option noyywrap
%{
#include "y.tab.h"
%}
%%
[a-zA-Z_][a-zA-Z_0-9]* return IDENTIFIER;
[0-9]+(\.[0-9]*)? return NUMBER;
[+/*] return OPERATOR;
[ \t] ;
. return yytext[0];
\n return 0;
%%
%{ F : OPERATOR E
#include<stdio.h> | '-' E
|
int yyerror();
;
int yylex(); %%
int valid = 1;
int yyerror()
%} {
%token NUMBER IDENTIFIER OPERATOR valid = 0;
%% printf("\n The given Arithmetic Expression is Invalid.");
return 0;
START : E }
E : IDENTIFIER F int main()
| NUMBER F {
printf("\nEnter the expression:\n");
| '-' NUMBER F
yyparse();
| '-' IDENTIFIER F if (valid == 1)
| '(' E ')' F printf ("\n The given Arithmetic Expression is Valid.");
}
IMPLEMENTATION OF C ALCULATOR USING LEX AND YACC

// lex part // yacc part


%option noyywrap %{
%{ #include <stdio.h>
#include <stdio.h> #include <stdlib.h>
#include "y.tab.h"
int yylex();
void yyerror (const char *s);
extern int yylval;
%}
%} %token NUMBER
%% %left '+' '-'
[0-9]+ { yylval = atoi(yytext); return NUMBER; } %left '*' '/' '%'
[ \t] ; %left '(' ')'
[\n] return 0;
. return yytext[0];
%%
%%
EXPR: ID { printf ("\n The result is: %d\n", $$); return 0;}
;
ID: ID '+' ID { $$ = $1 + $3; }
| ID '-' ID { $$ = $1 - $3; }
| ID '*' ID { $$ = $1 * $3; }
| ID '/' ID { if ($3 != 0)
int main (void)
{
$$ = $1 / $3;
printf ("\n Enter the Arithmetic Expression: ");
else yyparse();
yyerror("Divide by Zero!"); }
void yyerror (const char *s)
{
printf ("\n The given Arithmetic Expression is invalid.");
exit(0);
}

You might also like