0% found this document useful (0 votes)
110 views

Compiler Design Material

The document discusses various topics related to compilers including: 1. The functions of a linkage editor which combines object files into an executable, including symbol resolution, address resolution, code integration, and error handling. 2. The differences between single-pass and two-pass assemblers, with single-pass being faster but unable to handle forward references like two-pass assemblers. 3. Top-down parsers having disadvantages like left-recursion handling, backtracking, and ambiguity resolution compared to other parsing techniques.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
110 views

Compiler Design Material

The document discusses various topics related to compilers including: 1. The functions of a linkage editor which combines object files into an executable, including symbol resolution, address resolution, code integration, and error handling. 2. The differences between single-pass and two-pass assemblers, with single-pass being faster but unable to handle forward references like two-pass assemblers. 3. Top-down parsers having disadvantages like left-recursion handling, backtracking, and ambiguity resolution compared to other parsing techniques.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 19

End semester 2023

1. Enumerate the functions of linkage editor


A linkage editor, often referred to as a linker and Its primary function is to combine
multiple object files generated by the compiler into a single executable program or
shared library
key functions of a linkage editor in compiler design
 Symbol Resolution
 Address Resolution
 Code and Data Integration
 Dead Code Elimination
 Error Handling
 Debug Information Handling
2. Compare and contrast single pass and two pass assembler.
 single-pass assembler processes the source code in one pass, making it faster and
more memory-efficient but less capable of handling forward references.
 A two-pass assembler, on the other hand, takes two passes through the source
code, allowing it to handle forward references and report errors more accurately but
at the cost of increased memory usage and a potentially slower assembly process

3. Define "interpreter."
 An interpreter is a computer program that reads and executes high-level source
code directly, typically line by line, without the need for a separate compilation step.
4. Enumerate the disadvantages of the top-down parser
The disadvantages of top-down parsing:
 Left-Recursion Handling
 Backtracking
 Limited Grammar Coverage
 Ambiguity Resolution
 Efficiency
5. Recall the regular definition for floating-point numbers.

In many programming languages and formal specifications, a regular definition for


floating-point numbers typically takes the following form:

float_number -> integer_part fraction? exponent?

integer_part -> digit+

fraction -> '.' digit+

1
exponent -> ('E' | 'e') ('+' | '-')? digit+

digit -> [0-9]

6. When is a "syntax directed definition" said to be "circular"?


 A "syntax-directed definition" is said to be "circular" when it creates a circular
dependency
 Circular syntax-directed definitions can lead to ambiguity in parsing or difficulty
in establishing a clear order of evaluation of semantic actions, making it
challenging to determine the meaning of a program.
7. Define handle.
 In compiler design, a "handle" refers to a substring of the input string (usually a
sequence of terminals and non-terminals) that matches the right-hand side of a
production rule in a context-free grammar.
 a handle in compiler design is a substring of the input string that matches a
production in the grammar and is used in the reduction step of bottom-up
parsing to build the parse tree
8. Show the dependency graph for the declaration statements
In compiler design, dependency graphs help represent relationships between different
elements in a program or codebase, such as variable dependencies, function call
dependencies, and data flow dependencies.

int main() {
int a, b;
float c;
a = 5;
b = a * 2;
c = a / 2.0;
return 0;
}

9. Define activation life time of a procedure.

2
 The activation lifetime of a procedure refers to the period when a specific
function or subroutine is actively running and using memory.
 It begins when the function is called and ends when it finishes its task and
returns control to the calling code. During this time, the function has its own
dedicated space in memory to store data like variables and parameters.
 When the function completes, its memory space is released, and the activation
lifetime ends. This concept helps manage how functions work together and use
memory.
10. What is meant by 'NEXT-USE' of a variable?
 The "NEXT-USE" of a variable refers to the point in a program's execution when that
variable is next going to be read or used.
 It helps in optimizing memory usage and can be crucial for efficient register
allocation in compilers and runtime systems.
11. Differentiate loading and linking.

3
12. Compare and contrast an interpreter and a compiler.

13. Define lexeme.


 A lexeme is the smallest unit of a programming language that a compiler or
interpreter can recognize as a meaningful element.
 Lexemes are the building blocks of a program and include identifiers, keywords,
constants, operators, and other tokens.
14. Enumerate viable prefix property.

4
 The viable prefix property states that at any point during parsing, the partial
sequence of symbols at the top of the parsing stack is a viable prefix of some right-
hand side of a production in the grammar.
 the viable prefix property is a fundamental concept in LR parsing that ensures
deterministic parsing decisions and early error detection.
 It is an important property to consider when designing and implementing parsing
algorithms and tools.
15. State the conflicts of shift reduce parser.
 In shift-reduce parsing, conflicts refer to situations where the parser encounters
ambiguity to perform
 A "shift" operation (i.e., adding a symbol to the parsing stack)
or
 A "reduce" operation (i.e., reducing a group of symbols to a non-terminal using a
production rule).
Shift-Reduce Conflict:
 This conflict occurs when the parser has both a shift and a reduce action available in
a particular parsing state, and it's unsure which action to take.
 Shift-reduce conflicts can arise due to ambiguities in the grammar.
 For example, if the grammar is not clearly defined to handle operator precedence or
associativity, a shift-reduce conflict can occur when parsing expressions.
Reduce-Reduce Conflict:
 This conflict arises when the parser has multiple reduce actions available in a parsing
state, and it cannot determine which reduction to choose.
 Reduce-reduce conflicts often indicate that the grammar is ambiguous or lacks
specificity in certain rules.
16. Define handle pruning.
 Sequence of symbols (a "handle") in the parsing stack to replace it with a non-
terminal symbol.
 This process is crucial for constructing the parse tree or abstract syntax tree during
bottom-up parsing
17. Give the quadruple for the expression t2 = b * t1.
 A quadruple is a set of four elements that represent an operation or an instruction in
a form that can be processed by a compiler or interpreter.
 In the expression "t2 = b * t1," a quadruple.
 (=, b, t1, t2)
18. Draw the parse tree for the production E-> E1+T.
A parse tree for the production E -> E1 + T can be represented visually as follows:
E

5
/\
E1 +
|
T
19. Define activation record.
 An activation record, also known as a stack frame, is a data structure used by a
computer program to manage the execution of functions, procedures, or
subroutines.
 It plays a critical role in tracking the state and context of function calls during
program execution.
 The activation record stores information related to the function call, such as local
variables, parameters, return addresses, and control flow information.
20. Give an example of algebraic identity.
An algebraic identity is a mathematical expression or equation that holds true for all
values of its variables.
One of the well-known algebraic identities is the "distributive property."
a * (b + c) = (a * b) + (a * c)
In this identity, no matter what values you assign to a, b, and c, the equation will always
be true.
2 * (3 + 4) = (2 * 3) + (2 * 4)
2*7=6+8
14 = 14
21. Differentiate system software and application software.

System Software:

 Manages and maintains the computer's fundamental operations.


 Provides a platform for running application software.
 Examples: Operating systems (e.g., Windows, macOS), device drivers, and
utilities (e.g., antivirus, disk management tools).
 Essential for the computer to function properly.

Application Software:

 Designed for end-users to perform specific tasks or functions.


 Provides solutions to users' needs and preferences.
 Examples: Word processors (e.g., Microsoft Word), web browsers (e.g., Google
Chrome), video games, and productivity software (e.g., Excel).
 Users interact with application software to accomplish their goals.
22. Justify the need for 2 pass assembler in solving forward reference problem.
6
 a two-pass assembler is necessary for solving the forward reference problem
because it systematically processes and resolves symbols in two steps, ensuring
accurate and efficient translation of source code into machine code.
 It is a common and effective approach used in assembly language programming
to handle forward references and other related tasks.
23. Write the regular definition to recognize all identifier
In most programming languages, identifiers are typically defined as sequences of letters,
digits, and underscores, with the requirement that they must start with a letter or an
underscore.
identifier -> letter (letter | digit | _)*
letter -> [a-zA-Z]
digit -> [0-9]

24. Convert (a|b)*ab into NFA.

25. Apply left factor to the following grammar.


A->aB| aC| bc

26. Enumerate the possible conflicts in LR parsing.


Shift-Reduce Conflict:
 Happens when the parser is uncertain about whether to shift (add a symbol to
the parsing stack) or reduce (apply a production rule).
 It can occur because the parser has both shift and reduce actions available in a
certain parsing state, and it's not clear which action to take.
 Often caused by ambiguous grammars or when a sequence of symbols can be
interpreted in multiple ways.
Reduce-Reduce Conflict:
 Occurs when the parser has multiple reduce actions available in a parsing state
and cannot determine which reduction to choose.

7
 Typically indicates that the grammar is ambiguous or lacks specificity in certain
rules.
 It can be challenging for the parser to decide which production rule to apply in
such cases.
27. Design syntax directed translation for declaration statements.
Designing a syntax-directed translation scheme for declaration statements typically
involves associating attributes with the non-terminals in the grammar and defining
translation rules that generate code or perform actions as the parser processes the
input.
declaration → type identifier_list ;
type → int | float | char
identifier_list → identifier | identifier , identifier_list
28. What is a flow graph?
 A flow graph, in simple terms, is a visual representation of a program's control
flow.
 It illustrates how the program's instructions or statements are organized and the
possible paths of execution.
In a flow graph:
 Each instruction or statement is represented as a node.
 Arrows or edges between nodes show the order in which the instructions are
executed.
 Decision points (e.g., if statements or loops) result in branches or multiple
possible paths.
 Flow graphs are often used for program analysis, optimization, and
understanding the logic of a program
29. Define a basic block.
 It is defined as a sequence of consecutive, non-branching instructions in a
program.
 The basic block begins with a single entry point, which is typically the first
instruction in the block.
 The basic block ends with a single exit point, which is usually the last instruction
in the block.
 There are no branches or jump instructions within the block that would disrupt
the sequential flow of instructions.
30. Enumerate the issues in code generation.
Code generation, the final phase of compilation, involves translating the high-level
language code into machine code or assembly language.

8
3 marks
1. Schematically represent the language processing system.
Write Phases of compiler
2. Write a LEX program to identify the even binary strings.

9
To write a Lex program that identifies even-length binary strings, you can use regular
expressions to specify the pattern you want to match.
%%
[01]*[01] printf("Even Binary String\n");
. printf("Not an Even Binary String\n");
%%
int main() {
yylex();
return 0;
}
 [01]*[01] is a regular expression pattern that matches even-length binary
strings. It starts with zero or more '0' or '1' characters, followed by exactly
one '0' or '1'. The pattern ensures that the total length of the string is even.

 The first rule [01]*[01] matches even-length binary strings and prints "Even
Binary String" when a match is found.

 The second rule . matches any character (including '0' or '1') that does not fit
the previous pattern. It prints "Not an Even Binary String" for these cases.

 In the main function, yylex() is called to start the Lex scanner.


3. Write a YACC program to solve the given productions.
S-> aA | a
A->aA | b
To create a YACC (Yet Another Compiler Compiler) program to recognize and parse
the given grammar productions
%{
#include <stdio.h>
%}
%token a b
%%
S : a A { printf("Valid\n"); }
| a { printf("Valid\n"); }
;
A : a A { /* Do nothing, just recognize 'aA' */ }
| b { /* Do nothing, just recognize 'b' */ }
;
%%

10
int main() {
yyparse(); // Start the parsing process.
return 0;
}

int yyerror(const char* msg) {


printf("Invalid\n");
return 0;
}

 The %{ ... %} section is used to include header files, in this case, stdio.h.

 %token a b declares the tokens recognized by the lexer. In this case, the lexer recognizes
'a' and 'b' as tokens.

 The rules for the productions S and A are defined within the %% section. These rules
specify how the grammar is parsed. When a valid string is recognized, it prints "Valid." If
the input does not conform to the grammar, it prints "Invalid."

 In the main function, yyparse() is called to start the parsing process.

 The yyerror function is used to handle parsing errors and, in this case, print "Invalid"
when a parsing error is encountered.

4. When does a grammar identified as non L. attributed?


 Non-L-attributed grammars have attribute dependencies that cannot be handled
within the L-attributed framework

key characteristics that identify a grammar as non-L-attributed

 Attribute Dependencies Across Productions: In a non-L-attributed grammar, the


values of attributes for a particular non-terminal are influenced by attributes of
non-terminals in other productions.
 Lookahead Tokens: Non-L-attributed grammars may require lookahead tokens
(information about the next input token) to determine attribute values, which is
beyond the scope of L-attributed grammars.
 Use of Global Information: Non-L-attributed grammars may require the use of
global information or a more complex context to determine attribute values,

11
which is not consistent with the local and predictable attribute evaluation of L-
attributed grammars.

5. Illustrate the different types of attributes with necessary examples.


 We have two types of attributes: synthesized attributes and inherited
attributes.
 These attributes are used to describe and compute properties of nodes in a
parse tree or abstract syntax tree.

Synthesized Attributes:

 Synthesized attributes are attributes associated with a non-terminal in the


grammar, and their values are computed at the node corresponding to that
non-terminal during a bottom-up traversal of the parse tree.
 These attributes represent information that flows from child nodes to parent
nodes.
 Consider a simple arithmetic expression grammar:
E→E+T|T
T→T*F|F
F → (E) | num
 You can define synthesized attributes for the non-terminals like this:
E.value represents the value of the expression rooted at E.
T.value represents the value of the expression rooted at T.
F.value represents the value of the expression rooted at F.
 To compute these synthesized attributes, you would have semantic rules that
compute the values based on the values of child nodes.
 For example:
E.value = E1.value + T.value // For the production E → E + T
T.value = T1.value * F.value // For the production T → T * F
F.value = E.value // For the production F → (E)
F.value = num // For the production F → num

2. Inherited Attributes:

 Inherited attributes are attributes associated with a non-terminal, and their


values are computed at the parent node and passed down to the child nodes
during a top-down traversal of the parse tree.
 These attributes represent information that flows from parent nodes to child
nodes.

12
 Suppose you have a grammar for a simple programming language with type
checking:
Stmt → if (Exp) Stmt
| ...
Exp → Exp + Exp
| id
| ...
 You can define inherited attributes for type information:

Stmt.type represents the type of the statement.


Exp.type represents the type of the expression.
 The values of these attributes can be computed top-down and may be
passed down to child nodes:
Stmt.type = Exp.type // For Stmt that contains an Exp
Exp.type = checkType(Exp1.type, Exp2.type) // For the production Exp → Exp
+ Exp
Exp.type = lookupType(id) // For the production Exp → id
6. What is dangling reference? When does it occur?
 A "dangling reference" or "dangling pointer" refers to a situation where a
reference or pointer to a memory location or object becomes invalid because
the memory it points to has been deallocated or released.
 Deallocation of Memory: Dangling references often occur when a program
deallocates memory that was previously being pointed to by a reference or a
pointer.
int *ptr = (int *)malloc(sizeof(int));
free(ptr);
int value = *ptr; // Dangling reference
 Scope or Lifetime Issues: Dangling references can also occur when a
reference or pointer points to a local variable or an object with limited scope,
and the reference or pointer is accessed outside that scope or after the
object's lifetime has ended.
int *getLocalValue() {
int localVar = 42;
return &localVar;
}
int *ptr = getLocalValue();
int value = *ptr; // Dangling reference
7. Illustrate the use of a symbol table.

13
 A symbol table is a data structure used to manage and store information
about symbols (identifiers) used in a program, such as variables, functions,
constants, and types.
 It helps keep track of these symbols and their associated attributes to
perform various tasks like parsing, semantic analysis, and code generation.
 During the parsing phase, the compiler checks the syntax and semantics of
the code. It consults the symbol table to verify that identifiers are declared
before use, have compatible types, and are used correctly.
 If any errors are detected during compilation, the symbol table can help
provide detailed information about the problematic symbols and their
contexts to aid in error reporting and debugging.

8. Write a LEX program to identify the strings represented by the pattern (0/1)*011.
%{
#include <stdio.h>
%}

%%
(0|1)*011 { printf("Matched: %s\n", yytext); }
.|\n { /* Ignore other characters */ }

%%
int main() {
yylex();
return 0;
}
9. Define panic mode error recovery.
 Panic mode error recovery is a technique used in parsing and error handling
within compilers and parsers to recover from syntax errors and continue parsing
the input.

How panic mode error recovery typically works:

 When a syntax error is detected, the parser enters the panic mode.
 The parser looks for a set of "synchronization tokens" that can be used to
recover from the error.

14
 These tokens are typically chosen in a way that minimizes the chances of
encountering another error while searching for them.
 The parser skips over input tokens until it finds one of the synchronization
tokens.
 Once a synchronization token is found, the parser exits the panic mode and
continues parsing from that point.

10. Construct an annotated parse tree for the input expression: 3*5+4a
Discussed in the class
11. Give the syntax-directed definition for if-else statement to perform type checking.
Syntax Directed Definition (SDD)
It is a kind of abstract specification. It is generalization of context free
grammar in which each grammar production X –> a is associated with it a set
of production rules of the form s = f(b1, b2, ……bk) where s is the attribute
obtained from function f.
Statement → if (Expression) Statement else Statement

The SDD below associates semantic actions with this grammar:

Define attributes:

 Expression.type represents the type of the condition in the if statement.


 Statement1.type represents the type of the statements in the if branch.
 Statement2.type represents the type of the statements in the else
branch.

Statement.synthesize:

Statement.type = if Expression.type == BOOLEAN then

if Statement1.type == Statement2.type then

Statement1.type

else

// Type mismatch error between if and else branches

Error("Type mismatch between if and else branches")

else

15
// Type mismatch error for if condition

Error("If condition must be boolean")

12. Construct the DAG for the basic block:


a=b+c
b-a-d
c=b+c
d-a-d
13. Differentiate Linking Loader and Linkage Editor
14. Write a LEX program to count the even binary strings from the given input.
15. Consider the grammar
S ->(L)|a
L -> L, SS
a. What are the Terminals. Non terminals, and Start Symbol? b. Draw the parse tree
for the sentence (a. (a, a))
16. Construct an annotated parse tree for the input expression: 4+6-8 according to the
following syntax directed definition:
17. What are the different types of errors at various stages of the compiler?
18. Mention the important classes of transformation that can be applied to the basic
blocks.

12 marks
1. What is forward referencing? How is it handled while translating an assembly
language program?
2. Explain the working principle of a macro processor. List the types of macro
substitution.
3. Explain the various phases of a compiler in detail. Also, show the step-by-step
translation of the statement "peri = 2* ( length +breadth )" through the phases of
the compiler to attain target language.
4. Construct a predictive parsing table and pa se the string -(0 || 1) for the given
grammar.
be ->be || bt | bt
bt->bt && bf | bf
bf->-bi (be) 01
5. Summarize with relevant examples

16
i) S-attributed definition for evaluating an expression.
ii) L-attributed definition for declaration of variables.
6. Explain in detail the concepts involved in different storage allocation techniques.

7. Transform the expression "a = a + b*c-b*c/d into the following


i) Syntax Tree
ii) Directed Acyclic Graph
iii) Quadruple
iv) Triple
v) Indirect triple
vi) Postfix Expression

8. Construct a CLR parsing table for the following grammar and parse the string "abb"
S-> CC
C-> aC | b

9. Construct a minimum state DFA directly for the regular expression (01)*011 by
constructing the syntax tree and finding firstpos(). lastpos() and followpos().
10. Enumerate the challenges during code generation.
11. Illustrate code optimization with necessary examples.
12. Describe the working principle of a simple bootstrap loader.
13. Explain the basic assembler functions with a suitable SIC assembler language
program.
14. Explain the various phases of a compiler in detail. Also, show the step-by-step
translation of the statement "e=a+f* 5" through the phases of a compiler to the
target language.
15. Construct a predictive parsing table for the grammar. Verify whether the input string
id+id id is accepted or not.
E->TE’
E->+TE/epsilon
T->FT’
T-> *FT/epsilon
F-(E)/id
16. Explain in detail about the concepts involved in the syntax-directed definition.
17. Translate the expression "a+a (bc)+(be) "d" into the following representations.
Syntax Tree
Directed Acyclic Graph
Quadruple

17
Triple
Indirect triple
Postfix Expression
18. Construct SLR Parsing table for the following grammar:
S-> L=R
S-> R
L ->*R
L->id
R->L
19. Construct the DFA r=ba(a/b)* ab. Draw the parse tree and calculate the first and last
position of each state.
20. Explain the issues in the design of a code generator with necessary examples.
21. Describe the function of the macro processor with suitable example
22. Elucidate the various phases of a compiler in detail. Also show the step by step
translation of the statement "Percentage = 0.01 * NoPassed / TotalStrength"
through the phases of the compiler to the target language.
23. Construct the predictive parsing table for the following grammar and parse the
string "~ (a && b)",
be -> be || bt | bt
bt->bt&& bf | bf
bf->~ bf | (be) | a | b
24. What are the functions of a type system? Design type system for arithmetic
expression.
25. Differentiate S-attributed and L-attributed definitions with suitable examples.
26. Convert a = a* (b *-c) + (b*-c)/ d into three address code, quadruples, triples,
indirect triples, syntax tree and DAG.
27. Construct the SLR table for the following grammar and parse the string "qbbab".
28. 'e' denotes empty string.
S->qABC
A -> abbD
B->a|e
C->b|e
D->c |e
29. Convert the regular expression (01)*(0+1)1 to DFA.
30. Find the item lo for the following grammar using CLR parsing method.
G: S-> AS
S-> b
S-> SA

18
A->Aa

31. Explain with an example the optimization of basic blocks.


32. What do you mean by peephole optimization? Describe the characteristics of
peephole optimization in detail.

19

You might also like