ProgPrac

The document is an introduction to programming practices, primarily focused on the C programming language. It covers fundamental constructs, file processing, and working with modules, providing a comprehensive guide for beginners. The book includes exercises and summaries for each section to reinforce learning.

Uploaded by

Raghav Kumar

Introduction to Programming Practices

DAVID SCUSE

2022
Cover art adapted from https://latexdraw.com/stylish-latex-cover-page/

Preamble

The original version of this book was written by David Scuse in MS Word, circa 2000. Rasit Eskicioglu
converted it to LaTeX format, made substantial formatting revisions, and added some content. Michael
Zapp is the original author of Appendix A and Appendix B (2008). John Braico reformatted these
Appendices and added some content to them (2010).

Contents

1 Introduction

2 Basic C Constructs
2.1 Hello World
2.2 Comments
2.3 The int Data Type
2.4 Loops
2.5 Conditional Execution
2.6 Arrays
2.7 Functions
2.8 More Array Manipulation
2.9 Variable-Length Arrays
2.10 Recursive Functions
2.11 Other C Data Types
2.12 Casting Between Data Types
2.13 Structures
2.14 Passing Structures to Functions
2.15 Arrays of Structures
2.16 Strings
2.17 The typedef Statements
2.18 Enumerated Types
2.19 Unions
2.20 Define Statement
2.21 Bit Manipulation
2.22 Multi-Dimensional Arrays
2.23 ArrayList Functions
2.24 Summary
2.25 Exercises

3 File Processing
3.1 Introduction
3.2 File Input
3.3 File Output
3.4 Formatted File Input
3.5 Formatted File Output
3.6 Console Input
3.7 Standard Error
3.8 Summary
3.9 Exercises

4 Working with Modules
4.1 Modules
4.2 Defining a “Constructor”
4.3 Scope of Variables
4.3.1 Global Variables
4.3.2 Private Global Variables
4.3.3 Local Variables
4.3.4 Static Local Variables
4.4 An ArrayList Module
4.5 Summary
4.6 Exercises

5 Pointers and Memory Management
5.1 Memory Organization
5.2 Memory Allocation for Basic Data Types
5.3 Java References (Pointers)
5.4 C Pointers
5.5 Pointer Types
5.6 Casting a Pointer Variable
5.7 Passing Values to Functions by Reference
5.8 Arrays and Pointers
5.9 Pointer Arithmetic
5.10 Casting and Pointer Arithmetic
5.11 Processing Multi-Dimensional Arrays
5.12 Dynamic Memory Allocation
5.13 NULL Pointers
5.14 Memory Management
5.15 The Stack and the Heap
5.16 void Pointers
5.17 Resizing a Memory Allocation
5.18 Pointers to Pointers
5.19 Processing Run-Time (Command-Line) Parameters
5.20 Pointers to Structures
5.21 Encapsulation
5.22 ArrayLists
5.23 Linked Lists
5.24 Working with Strings
5.25 Function Pointers
5.26 Big/Little Endian Memory Organization
5.27 Memory Dump Function
5.28 Pointer Pitfalls
5.29 Summary
5.30 Exercises

6 Design by Contract
6.1 Design by Contract
6.2 Basic Array List Module
6.3 Error Checking
6.4 Summary

7 Unit Tests
7.1 A Simple Unit Test
7.2 CUnit
7.3 Creating a Unit Test
7.4 Summary

A Best Programming Practices
A.1 Variables and Naming
A.2 White Space
A.3 Exit Points
A.3.1 Break and Continue
A.3.2 Switch Statements
A.3.3 Return Statements
A.3.4 Brace Placement
A.3.5 Functional Decomposition
A.3.6 Commenting

B Programming Standards
B.1 Commenting Files and Functions
B.2 Writing Readable Code
B.3 Writing Maintainable and Extendable Code
B.4 Input and Output

Chapter 1
Introduction

When programming in Java, the Java virtual machine attempts to ensure that you do not perform any
invalid operations. For example, Java generates an error condition if you attempt to access an element
that does not exist in an array. While many of the C programming language features are very similar to
the corresponding features in Java, unfortunately, error-checking is not one of them. In C, you can do
almost anything that you want to, whether it makes sense or not. As a result, you must be much more
careful when writing C programs.
These notes provide a very basic introduction to the C language and also to some of the tools that are
useful when developing C programs. The notes are not a complete introduction to C—you can find that
in an introductory C textbook. Instead, these notes cover the basic constructs in C and compare these
constructs with the equivalent Java constructs.
Although the first four chapters of these notes focus on the C language, these chapters are just a warm-up
to the programming practices discussed in the later chapters. Following good programming practices
is necessary in a language such as C but it is also necessary to follow good practices in higher-level
languages such as Java even though the compiler provides more support than the C compiler provides.
The programming conventions used in these notes are similar to the standard Java conventions. The
particular set of conventions used does not really matter as long as you use the conventions consistently.
When you are developing systems as a member of a team or in a programming course, you follow the
conventions prescribed by the team or by the course instructor.

Chapter 2
Basic C Constructs

In this chapter, we introduce the basic constructs used to build a C program. Many of these constructs
are identical to the corresponding Java constructs.

2.1 Hello World


The “Hello World” program is the program that is most frequently used to illustrate a new programming
language. Hello World is an important program not only because it provides a starting point when
learning a new language but also because it ensures that the necessary infrastructure (compiler, paths to
libraries, etc.) is set up correctly. The following Java program prints Hello World!

import java.io.*;   // Java

public class HelloWorld   // class name chosen for illustration; Java requires an enclosing class
{
    public static void main(String[] parms)
    {
        System.out.println("Hello World!");
    }
}

The astute programmer will notice that the import statement is not required in this program since the
only I/O is directed to System.out. The statement is included only because a similar statement is required
in the corresponding C program shown below.

#include <stdio.h>

void main(int numParms, char *parms[])
{
    printf("Hello C World!\n");
}

As can be seen, the C program is very similar to the Java program. The header file stdio.h declares
the standard I/O functions that are used in C. The parameters that are passed to the main function
(int numParms, char *parms[]) are equivalent to the parameter passed to Java’s main method
(String[] parms); we will examine the meaning of these parameters in Chapter 5. The function printf
(which is similar to Java’s println method and closely resembles Java’s printf method) sends its output
to the standard output stream, stdout, which is normally the system console. printf does not begin a
new line after printing the characters and so a newline character (\n) is included in the output string.
Note the slight difference in terminology between the two languages: in Java, the term “method” is
used while in C, the term “function” is used.
If you are using the gcc compiler, the program can be compiled using the following command. The
parameter -o ch1 specifies the name of the output file (the executable file). See Appendix A for more
information about C compilers.

sh-3.2$ gcc -ggdb main.c -o ch1
main.c: In function ‘main’:
main.c:4: warning: return type of ‘main’ is not ‘int’
main.c:6:2: warning: no newline at end of file
sh-3.2$

Even though there were two warnings (which will soon be fixed), the program can still be executed by
typing the name of the executable file that was created by the compiler.

sh-3.2$ ch1
Hello C World!
sh-3.2$

As was mentioned, the program generates two warning messages. The second warning message is “no
newline at end of file”. The C compiler expects the last line of each C source file to end with a newline
character. This is simple to fix: just add an empty line after the final “}”.

#include <stdio.h>

void main(int numParms, char *parms[])
{
    printf("Hello C World!\n");
}

In these notes, the empty line is not always shown at the end of each source file but you should include
it at the end of your programs. The first warning message is “return type of ‘main’ is not ‘int’”. By
convention, the main function should return a value that indicates whether or not the program executed
successfully. The following program performs the same processing as the program above but it
also indicates that the main function returns an int value (instead of being a void function, that is, not
returning a value).

#include <stdio.h>

int main(int numParms, char *parms[])
{
    printf("Hello C World!\n");
}

The program no longer generates a warning but the program also does not explicitly return a return
value/code to the operating system. We can explicitly return a return code as shown below.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    printf("Hello C World!\n");
    return 0;
}

So now the program compiles correctly, without any warning messages. With most of the sample pro-
grams in these notes, the output generated by the program is shown immediately following the pro-
gram.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    printf("Hello C World!\n");
    return 0;
}

For example, the program shown above generates Hello C World! as its output.

Hello C World!

So now that we have a working C program that we can compile and execute, we can move on to bigger
and better programs.

2.2 Comments
The C language supports the standard /* ... */ style of comments. These comments may extend over
multiple lines. However, many C compilers also support the single-line comment that begins with
// (this is part of the ANSI/ISO C standard C99). The // style of comment is used in this book for
convenience.

2.3 The int Data Type


The C language provides many of the same basic data types as Java. The following C program declares a
variable sum to be an int (integer) and then computes the sum of the first 5 positive integer values. As in
Java, every variable that is used in a program must first be declared to be of the appropriate type.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    int sum;

    sum = 1 + 2 + 3 + 4 + 5;
    return 0;
}

It would be nice to know the value that is stored in the variable sum. In Java, we can display the value
of sum quite easily using the following statement.

System.out.println("The sum is " +sum); // Java

Unfortunately, C output is slightly more complex. The following C statement generates the same result
as the Java statement shown above.

printf("The sum is %d\n", sum);

The printf statement takes its first parameter, a string, and uses the contents of the string to format the
subsequent parameter(s). Each parameter that is to be printed is associated with a format code that
begins with a “%”. The format code for printing the value of an int is %d. Format codes are described in
additional detail later in this chapter.

2.4 Loops
The program in the previous section that summed the integers from 1 to 5 hard-coded the values to be
summed. With a small number of values, this might be acceptable but for a large number of values,
hard-coding them is not acceptable. A simple for loop can be used to generalize the processing. Note
that the loop is identical to the corresponding loop in Java.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    int sum;
    int count;

    sum = 0;
    for (count=1; count<=5; count++)
    {
        sum = sum + count;
    }
    printf("The sum is %d\n", sum);
    return 0;
}

The sum is 15

C also includes the += operator which makes the addition of a value to a variable even easier.

sum += count;

C also supports a while loop that is also identical to Java’s while loop.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    int sum;
    int count;

    sum = 0;
    count = 1;
    while (count<=5)
    {
        sum += count;
        count++;
    }
    printf("The sum is %d\n", sum);
    return 0;
}

2.5 Conditional Execution


C includes a condition statement (if statement) that is identical to Java’s condition statement. The fol-
lowing program determines the sum of the even numbers from 1 to 5, inclusive. Note that the remainder
operator in C (%) is the same as Java’s remainder operator. The condition statement is not actually re-
quired (the loop parameters could be modified instead) but the point of the example is to illustrate the
condition statement. The logical comparison/equals operator (==) is the same as Java’s.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    int sum;
    int count;

    sum = 0;
    for (count=1; count<=5; count++)
    {
        if ((count%2)==0)
        {
            sum += count;
        }
    }
    printf("The sum of the even values is %d\n", sum);
    return 0;
}

The sum of the even values is 6

C’s if statement may also include an else clause, as shown below:


#include <stdio.h>

int main(int numParms, char *parms[])
{
    int sumEven;
    int sumOdd;
    int count;

    sumEven = 0;
    sumOdd = 0;
    for (count=1; count<=5; count++)
    {
        if ((count%2)==0)
        {
            sumEven += count;
        }
        else
        {
            sumOdd += count;
        }
    }
    printf("The sum of the even values is %d \n", sumEven);
    printf("The sum of the odd values is %d \n", sumOdd);

    return 0;
}

The sum of the even values is 6
The sum of the odd values is 9

Although you may not have encountered it in Java, both Java and C support the conditional operator
(?:), which acts as a compact form of the if statement and yields a value.

int result;
int value1;
int value2;

result = ( (value1 < value2) ? value1 : value2);

The effect of the statement above is that the logical expression (value1 < value2) is evaluated: if the result
of the expression is true, the expression that follows the “?” is returned as the result of the conditional
operator; if the result of the expression is false, the expression that follows the “:” is returned as the
result. The conditional operator above is identical to the if statement shown below (which determines
the smaller of two integer values).

if (value1 < value2)
{
    result = value1;
}
else
{
    result = value2;
}

2.6 Arrays
C also supports arrays that are similar to Java arrays, as shown below.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    int sum;
    int count;
    int values[5];

    for (count=0; count<5; count++)
    {
        values[count] = count + 1;
    }

    sum = 0;
    for (count=0; count<5; count++)
    {
        sum += values[count];
    }
    printf("The sum of the values is %d \n", sum);

    return 0;
}

Note that the size of the array (5 elements) is hard-coded throughout the program. We will examine
techniques later in this chapter that remove the need for hard-coding constant values repeatedly in a
program. Unlike Java, one array can not be assigned to another array in C. For example, the following
assignment statement is not valid:

int values[5];
int moreValues[5];

moreValues = values; // Wrong!!


The contents of one array can be copied to another array by copying the individual elements one at a
time using a loop. Unfortunately, C does not provide a direct equivalent of Java’s System.arraycopy
method (the closest standard library function, memcpy, works with byte counts rather than element
indexes); however, it is simple enough for the programmer to write such a function (and we will do that
in the next section). As in Java, the contents of an array cannot be printed without using a loop. (There
is one exception to this statement; it is examined later in this chapter when we examine character
strings.)

2.7 Functions
The C language supports user-defined functions in essentially the same way that Java supports user-
defined methods. For example, the following program computes 1+2+3+...+N where N is a parameter
that is passed to the function.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    int result;

    result = sum(5);
    printf("The sum is %d\n", result);
    return 0;
}

int sum(int n)
{
    int count;
    int result;

    result = 0;
    for (count=1; count<=n; count++)
    {
        result += count;
    }
    return result;
}

The sum is 15

Although the program compiles and executes correctly, the compiler generates the following warning
message:

sh-3.2$ /main.c: In function ‘main’:
sh-3.2$ /main.c:7: warning: implicit declaration of function ‘sum’

Unlike Java, the C compiler gets upset when a function (in this case sum) is used (or referred to) before
it is defined. We could move the definition of sum before the definition of main but that would make the
program more difficult to read. Instead, in C we define the prototype of the function at the beginning
of the source file (the prototype is essentially the same as the signature of a Java method). Now the
function sum can be used in main without having warning messages generated because the C compiler
already knows what the sum function looks like. We must do this for each function that is referenced
before it is defined.

#include <stdio.h>

int sum(int);   // function prototype

int main(int numParms, char *parms[])
{
    int result;

    result = sum(5);
    printf("The sum is %d\n", result);
    return 0;
}

The sum is 15

As with Java, if a function does not return a value, its type is void. A void function is sometimes referred
to as a procedure.
In the previous section, we noted that one array cannot be assigned directly to another array; nor can
an array be printed in one statement. The following program defines three functions that are useful when
manipulating arrays: arrayCopy copies elements from one array to another array, printArray prints
the elements in an array, and zeroArray sets every element in an array to zero. Note that C supports
array initialization in the declaration statement in the same manner as does Java.

#include <stdio.h>

void arrayCopy(int[], int[], int, int, int);
void printArray(int, int[]);
void zeroArray(int, int[]);

int main(int numParms, char *parms[])
{
    int array1[] = {10, 20, 30, 40, 50};   // array initialization
    int array2[5];

    printArray(5, array1);
    arrayCopy(array1, array2, 0, 0, 5);
    printArray(5, array2);

    zeroArray(5, array2);
    arrayCopy(array1, array2, 0, 1, 4);
    printArray(5, array2);

    zeroArray(5, array2);
    arrayCopy(array1, array2, 1, 0, 4);
    printArray(5, array2);

    zeroArray(5, array2);
    arrayCopy(array1, array2, 1, 2, 2);
    printArray(5, array2);

    return 0;
}

void arrayCopy(int fromArray[], int toArray[], int fromStart, int toStart, int length)
{
    int count;
    for (count=0; count<length; count++)
    {
        toArray[toStart+count] = fromArray[fromStart+count];
    }
}

void printArray(int length, int array[])
{
    int count;
    for (count=0; count<length; count++)
    {
        printf("%2d ", array[count]);
    }
    printf("\n");
}

void zeroArray(int length, int array[])
{
    int count;
    for (count=0; count<length; count++)
    {
        array[count] = 0;
    }
}

10 20 30 40 50
10 20 30 40 50
 0 10 20 30 40
20 30 40 50  0
 0  0 20 30  0

In C, when a basic data type is passed to a function, it is passed by value (also referred to as “call by
value”). This is the same as in Java. For example, in C if an int is passed to a function, a copy is made of
the current contents of the int variable and this copy is passed to the function. If the function modifies
the value of a parameter, the new value remains in effect for the life of the function but the original value
in the calling function is not modified.

In C, when an array is passed to a function, a pointer to the array is passed (this is similar to the way
Java passes a reference to an array). In the function, the value of the pointer to the array can be modified
but this modification is visible only within the function, not in the calling function. However, as with
Java, the contents of the array that is pointed to can be modified and these modifications do affect the
contents of the array in the calling function. This type of parameter passing is sometimes referred to as
“call by reference” even though it is really call by value (the pointer itself is passed by value). This topic
will be examined in more detail in Chapter 5.

2.8 More Array Manipulation


In this section, we examine some additional features of array manipulation. In the following program,
the variable MAX_VALUES is a literal constant that is the same as a Java final variable. The use of a literal
constant removes the need to hard-code constant values throughout a program.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    const int MAX_VALUES = 5;   // Literal Constant
    int sum;
    int count;
    int values[MAX_VALUES];

    for (count=0; count<MAX_VALUES; count++)
    {
        values[count] = count + 1;
    }

    sum = 0;
    for (count=0; count<MAX_VALUES; count++)
    {
        sum += values[count];
    }
    printf("The sum of the values is %d \n", sum);

    return 0;
}

The sum of the values is 15

If an array is passed to a function, the number of elements in the array must also be passed as a pa-
rameter since C does not provide a mechanism for determining the number of elements in an array at
run-time.

#include <stdio.h>

int sum(int, int[]);

int main(int numParms, char *parms[])
{
    const int MAX_VALUES = 5;
    int result;
    int count;
    int values[MAX_VALUES];

    for (count=0; count<MAX_VALUES; count++)
    {
        values[count] = count + 1;
    }

    result = sum(MAX_VALUES, values);
    printf("The sum of the values is %d \n", result);
    return 0;
}

int sum(int numEntries, int entries[])
{
    int result;
    int count;

    result = 0;
    for (count=0; count<numEntries; count++)
    {
        result += entries[count];
    }
    return result;
}

The sum of the values is 15

Actually, the preceding statement is not entirely true. It is possible to determine the number of elements
in an array in the function in which the array is declared by using the sizeof operator.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    const int MAX_VALUES = 5;
    int result;
    int count;
    int values[MAX_VALUES];

    for (count=0; count<MAX_VALUES; count++)
    {
        values[count] = count + 1;
    }

    // sizeof yields a size_t value, which is printed with %zu (C99)
    printf("Size of array is %zu \n", sizeof(values));
    return 0;
}

Size of array is 20

The value 20 is obviously not the number of elements in the array. What C is telling us is the number of
bytes of memory that the array occupies. Since each element of the array occupies more than one byte,
we need to divide the total amount of memory allocated for the array by the amount of storage that is
allocated for one int value. The following simple modification causes the correct result to be generated.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    const int MAX_VALUES = 5;
    int result;
    int count;
    int values[MAX_VALUES];

    for (count=0; count<MAX_VALUES; count++)
    {
        values[count] = count + 1;
    }

    // total bytes in the array divided by the bytes in one int (%zu prints a size_t)
    printf("Size of array is %zu \n", sizeof(values)/sizeof(int));
    return 0;
}

Size of array is 5

Unfortunately, as was mentioned, sizeof works correctly only in the function in which the array is
declared. If an array is passed as a parameter, no information is passed along with the array that permits
the function to determine correctly the number of elements that are in the array. While we can apply
sizeof to an array parameter in a called function, what the function receives is really the address of the
array, so sizeof reports the size of that address rather than the size of the array. On the machine used
here the computation yields 1, regardless of the actual size of the array (on many 64-bit machines it
yields 2). It is for this reason that the actual number of elements in an array is passed to any function
that processes the array. (An alternative to passing the number of elements in an array is to place a
sentinel value after the last used element in the array.)

int sum(int numEntries, int entries[])
{
    int result;
    int count;

    // entries is an array parameter, so this does not give the element count
    printf("Size of array is %zu \n", sizeof(entries)/sizeof(int));

    result = 0;
    for (count=0; count<numEntries; count++)
    {
        result += entries[count];
    }
    return result;
}

Size of array is 1

2.9 Variable-Length Arrays


In the updated C standard (C99), the size of an array can be declared at run time (instead of at compile
time) as shown in the following program. This type of array is referred to as a “variable-length array”.

#include <stdio.h>

int sum(int);

int main(int numParms, char *parms[])
{
    int numValues = 5;
    int result;

    result = sum(numValues);
    printf("%d\n", result);

    return 0;
}

int sum(int nentries)
{
    int numEntries = nentries * 2;
    int entries[numEntries];
    int result;
    int count;

    result = 0;
    for (count=0; count<numEntries; count++)
    {
        entries[count] = count+1;
        result += entries[count];
    }
    return result;
}

55

We will examine an alternative mechanism for creating arrays dynamically in Chapter 4.

2.10 Recursive Functions


The C language supports recursive functions in the same way that Java does. For example, the following
program uses recursion instead of iteration to compute 1+2+3+...+N for N>=1.

#include <stdio.h>

int sum(int);

int main(int numParms, char *parms[])
{
    int result;

    result = sum(5);
    printf("The sum is %d\n", result);
    return 0;
}

int sum(int n)
{
    int result;

    if (n>1)
    {
        result = n + sum(n-1);
    }
    else
    {
        result = 1;
    }
    return result;
}

The sum is 15

Note that if the condition N>=1 is not true, the function returns the value 1.

2.11 Other C Data Types


The C language contains the same basic data types as does Java (with one exception). These data types
are:
Data Type    Format Code    Meaning
int          %d             an integer
float        %f             a floating-point value
double       %f             a double-precision value
char         %c             a single character
Some of the data types may be modified by adding an appropriate modifier (short, long, signed, unsigned).
For example, an int may be modified to be an unsigned int. An unsigned int can store a larger value than
a signed int but an unsigned int can only store non-negative values.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    char charValue = 'a';
    int intValue = 33;
    float floatValue = 4.1;
    double doubleValue = 5.1;

    printf("%c, %d, %f, %f", charValue, intValue, floatValue, doubleValue);
    return 0;
}

a, 33, 4.100000, 5.100000

The format codes shown above are just the basic codes. The printf statement supports many elaborations
on these format codes. For example, the following program illustrates the use of a width specification
in the format codes.
#include <stdio.h>

int main(int numParms, char *parms[])
{
    char charValue = 'a';
    int intValue = 33;
    float floatValue = 4.1;
    double doubleValue = -5.1;

    printf("%2c, %3d, %4.1f, %4.1f", charValue, intValue, floatValue, doubleValue);
    return 0;
}

 a,  33,  4.1, -5.1

The Java boolean data type (which can be assigned either the value true or the value false) is not included
in the list of C basic data types. Instead, C uses integer values to represent true and false. The following
program illustrates how the results of logical expressions are manipulated.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    int value1;
    int value2;

    value1 = 3 < 5;
    value2 = 4 != 4;
    if (value1 && value2)
    {
        printf("The expression %d and %d is true\n", value1, value2);
    }
    else
    {
        printf("The expression %d and %d is false\n", value1, value2);
    }

    if (value1 || value2)
    {
        printf("The expression %d or %d is true\n", value1, value2);
    }
    else
    {
        printf("The expression %d or %d is false\n", value1, value2);
    }
    return 0;
}

The expression 1 and 0 is false
The expression 1 or 0 is true

When C saves the result of evaluating a logical expression, it represents true and false by the values 1
and 0, respectively. However, when C evaluates the result of a logical expression that contains integer
values, 0 is interpreted as false and any non-zero value is interpreted as true. The following program
illustrates this principle.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    int value1;
    int value2;

    value1 = 3;
    value2 = 0;
    if (value1 && value2)
    {
        printf("The expression %d and %d is true\n", value1, value2);
    }
    else
    {
        printf("The expression %d and %d is false\n", value1, value2);
    }

    if (value1 || value2)
    {
        printf("The expression %d or %d is true\n", value1, value2);
    }
    else
    {
        printf("The expression %d or %d is false\n", value1, value2);
    }
    return 0;
}

The expression 3 and 0 is false
The expression 3 or 0 is true

Even though any non-zero value is interpreted as true, it is a good programming practice to use only
the value 1 to represent true.

2.12 Casting Between Data Types


In Java, it is occasionally necessary to cast one data type (or object) to another data type (object). The
same is true in C.
When converting from one basic data type to another, C does not insist on a cast but a cast may be
included. For example,

char char1;
int int1;
long int longInt1;
float float1;
double double1;

char1 = 'a';
longInt1 = 23;
float1 = 1.0;
double1 = 2.5;

int1 = char1;               // implicit conversion
int1 = (int) char1;         // the same conversion with an explicit cast
char1 = int1;

int1 = longInt1;
longInt1 = int1;

float1 = (float) double1;
double1 = float1;

We shall see in later chapters why explicit casts are sometimes necessary.

2.13 Structures
In Java, there is no simple mechanism for defining a group of related data items together so that they
can be manipulated as a whole. Instead, you can create an object that contains an instance variable for
each data item. The following Java program illustrates the use of an object to group related data items
together.

public class TestPerson // Java program
{
    public static void main(String[] parms)
    {
        Person person1;

        person1 = new Person();

        person1.number = 25;
        person1.age = 30;
        System.out.println("Person1: " +person1.number +" " +person1.age);
    }
}

public class Person // Java program
{
    public int number;
    public int age;
}

Person1: 25 30

In C, there is a convenient language feature called a “struct” that provides a similar facility. The
following is the C language equivalent of the Java program shown above.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    struct
    {
        int number;
        int age;
    } person1;

    person1.number = 25;
    person1.age = 30;

    printf("Person1: %d %d \n", person1.number, person1.age);
    return 0;
}

Person1: 25 30

The structure is defined using the struct keyword. Note that there must be a semicolon at the end of
the structure definition. The variable, person1, at the end of the structure definition is an instance of
the specified structure. The variable person1 can be thought of as similar to an object that contains the
public instance variables number and age. The variables in person1 are referenced in exactly the same way
that a public variable in a Java object is referenced—by specifying the identifier person1, a period (“.”),
and the name of one of the variables defined in the structure.
An alternative method of declaring a structure is shown below. In this program, the structure is given
a name which can be used in subsequent declaration statements. The advantage of this method is that
once the structure has been defined, it may be used in the declaration of any number of variables.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    struct person
    {
        int number;
        int age;
    };

    struct person person1;

    person1.number = 25;
    person1.age = 30;

    printf("Person1: %d %d \n", person1.number, person1.age);
    return 0;
}

Person1: 25 30

Just as the object referred to by a variable in Java may be assigned to another variable, the contents
of a struct variable may be assigned to another struct variable. Unlike Java (which simply copies the
reference of – or pointer to – the object), C copies the contents of the structure. The following example
illustrates this process.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    struct person
    {
        int number;
        int age;
    };

    struct person person1;
    struct person person2;

    person1.number = 25;
    person1.age = 30;

    person2 = person1;

    person2.number += 10;

    printf("Person1: %d %d \n", person1.number, person1.age);
    printf("Person2: %d %d \n", person2.number, person2.age);
    return 0;
}

Person1: 25 30
Person2: 35 30

Note that modifying person2 does not have any impact on the contents of person1. Structures can not be
compared for equality directly—instead, you must compare corresponding elements in order to deter-
mine if the contents of two structures are equal. We will examine how this can be accomplished in the
next section.
A structure may include one or more arrays inside the structure in addition to any necessary basic data
types. For example, the following is a valid structure:

struct student
{
int number;
int age;
int grades[5];
};

The array grades is accessed in the same manner as a basic data item.

int count;
int sum;
struct student student1;

sum = 0;
for (count=0; count<5; count++)
{
sum += student1.grades[count];
}

2.14 Passing Structures to Functions


A structure is passed as a parameter to a function in exactly the same way that a basic data type is
passed (call by value). The following program illustrates passing a structure to a function. Note that
the definition of the structure has been moved to the beginning of the program before the main function
so that the structure definition is available (i.e. is global) to all functions in this file (more on this in
Chapter 3).

#include <stdio.h>

struct person
{
    int number;
    int age;
};

void printPerson(struct person);

int main(int numParms, char *parms[])
{
    struct person person1;

    person1.number = 25;
    person1.age = 30;

    printPerson(person1);
    return 0;
}

void printPerson(struct person person1)
{
    printf("Person: %d %d \n", person1.number, person1.age);
}

Person: 25 30

When a structure is passed to a function, it is passed by value. This means that if the contents of the
structure are modified by the function, the changes are not reflected in the calling function.

The following program illustrates the use of a function that compares 2 instances of the person struct.
Again, the definition of the structure has been moved outside of the main function to make the structure
global to the entire program. We will examine global variables in more detail in Chapter 3.

#include <stdio.h>

struct person
{
    int number;
    int age;
};

void printPerson(struct person);
int equalPersons(struct person, struct person);

int main(int numParms, char *parms[])
{
    struct person person1;
    struct person person2;

    person1.number = 25;
    person1.age = 30;

    person2.number = 50;
    person2.age = 50;

    printPerson(person1);
    printPerson(person2);

    if (equalPersons(person1, person2))
    {
        printf("The two structures are equal.\n");
    }
    else
    {
        printf("The two structures are not equal.\n");
    }

    return 0;
}

void printPerson(struct person person1)
{
    printf("Person: %d %d \n", person1.number, person1.age);
}

int equalPersons(struct person person1, struct person person2)
{
    int result;

    result = ((person1.number==person2.number) && (person1.age==person2.age));
    return result;
}

Person: 25 30
Person: 50 50
The two structures are not equal.

Note that the equalPersons function is essentially the same as the Java equals method that would be
written to compare two Java objects of type Person.
Passing a structure by value to a function includes any enclosed arrays. For example, the array grades[5]
defined in the previous section is passed by value. This is contrary to the normal rule for passing arrays
to functions.

2.15 Arrays of Structures


Parallel arrays are used to maintain a collection of groups of related data items. In Java, parallel arrays
were used primarily to make life difficult for students—as you should be aware by now, instead of
using parallel arrays in Java, you would instead use an array of objects where each object contains the
necessary instance variables.
In C, we do exactly the same thing. Although parallel arrays are certainly valid, it is normally more
convenient to define an array of structures.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    struct person
    {
        int number;
        int age;
    };

    const int MAX_PERSONS = 5;
    struct person persons[MAX_PERSONS];
    int count;

    for (count=0; count<MAX_PERSONS; count++)
    {
        persons[count].number = 1000 + count;
        persons[count].age = 30 + count;
    }

    for (count=0; count<MAX_PERSONS; count++)
    {
        printf("Person %d: %d %d \n",
               count, persons[count].number, persons[count].age);
    }

    return 0;
}

Person 0: 1000 30
Person 1: 1001 31
Person 2: 1002 32
Person 3: 1003 33
Person 4: 1004 34

We can also pass an array of structures to a function. The structure definition must be global so that all
functions in the source file can access the structure.

#include <stdio.h>

struct person
{
    int number;
    int age;
};

void printPersons(int, struct person[]);

int main(int numParms, char *parms[])
{
    const int NUM_PERSONS = 5;

    struct person persons[NUM_PERSONS];
    int count;

    for (count=0; count<NUM_PERSONS; count++)
    {
        persons[count].number = 1000 + count;
        persons[count].age = 30 + count;
    }

    printPersons(NUM_PERSONS, persons);

    return 0;
}

void printPersons(int numPersons, struct person persons[])
{
    int count;

    for (count=0; count<numPersons; count++)
    {
        printf("Person %d: %d %d \n",
               count, persons[count].number, persons[count].age);
    }
}

Person 0: 1000 30
Person 1: 1001 31
Person 2: 1002 32
Person 3: 1003 33
Person 4: 1004 34

2.16 Strings

The char data type exists in C as well as in Java. The following program fills a char array with the 26
lower-case letters in the alphabet.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    char lowerCase[26];
    int count;

    lowerCase[0] = 'a';
    for (count=1; count<26; count++)
    {
        lowerCase[count] = lowerCase[count-1] + 1;
    }

    for (count=0; count<26; count++)
    {
        printf("%c", lowerCase[count]);
    }
    printf("\n");

    return 0;
}

abcdefghijklmnopqrstuvwxyz

Unfortunately, since the size of each array must be declared at compile time, we can not resize an array
(as we could in Java) once we know how many characters are to be processed as a collection. As a result,
in C there is a convention that a special sentinel character ('\0', the character whose value is zero) is
placed after the last valid character in a collection of characters. A char array that is terminated with
the sentinel character is referred to as a “string” in the C language. A string is not a special data type;
it is just an array of characters that is properly terminated.
The following program creates a properly terminated string. The entire string can then be printed using
the %s format code. This is the exception to the rule mentioned earlier that in order to print the contents
of an array, a loop must be used to print the array elements one at a time.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    char lowerCase[27];
    int count;

    lowerCase[0] = 'a';
    for (count=1; count<26; count++)
    {
        lowerCase[count] = lowerCase[count-1] + 1;
    }

    lowerCase[26] = '\0';

    printf("%s\n", lowerCase);

    return 0;
}

abcdefghijklmnopqrstuvwxyz

As long as a string is terminated correctly, the size of the array in which the string is stored may be any
value, as long as there is sufficient room to hold all of the characters in the string plus the sentinel value.
When printed using the %s format code, only the characters up to the sentinel value are displayed.
Recall that Java contains a class named String and this class contains a variety of methods that manipu-
late the contents of a String. Since C does not support objects, an array of char’s is the best that we can
do, but C does provide the string library of functions that can be used to manipulate C’s equivalent (a
char array that is correctly terminated with the ’\0’ character) of a Java String.

The following are some additional functions in the C string library. (Any good C reference manual
should contain a list of the details of each function in the library.) Each function assumes that the source
string is properly terminated; if the string is not properly terminated, the results will be unpredictable
(and possibly disastrous to the health of your program).

strcpy(char destination[], char source[]);          // copy the contents of source
                                                    // to destination

strncpy(char destination[], char source[], int n);  // copy up to the first n characters
                                                    // from source to destination

strncat(char destination[], char source[], int n);  // append up to n characters from
                                                    // source to destination

strlen(char source[]);                              // return the number of characters in source
                                                    // (not including the string termination character)

strcmp(char string1[], char string2[]);             // compare the contents of the two strings

The following program illustrates the use of the strcpy(to, from) function. Note that C permits the use
of a string constant in some situations (such as initializing an array).

#include <stdio.h>
#include <string.h>

int main(int numParms, char *parms[])
{
    char string1[] = "This is a string.";
    char string2[10];
    char string3[20];

    strcpy(string2, string1);
    strcpy(string3, string1);

    printf("%s\n", string1);
    printf("%s\n", string2);
    printf("%s\n", string3);

    return 0;
}

.
This is a string.
.

However, the output is clearly incorrect (the exact output depends on the compiler and on how it
arranges the variables in memory). If you examine the program closely, you should notice that
string2 is not long enough to contain all of the characters in string1 plus the sentinel character. As a
result, when strcpy copies the characters from string1, it overwrites whatever comes after string2 in
memory.
The problem can be fixed by increasing the size of string2.

#include <stdio.h>
#include <string.h>

int main(int numParms, char *parms[])
{
    char string1[] = "This is a string.";
    char string2[20];
    char string3[20];

    strcpy(string2, string1);
    strcpy(string3, string1);

    printf("%s\n", string1);
    printf("%s\n", string2);
    printf("%s\n", string3);

    return 0;
}

This is a string.
This is a string.
This is a string.

This was a simple program but the complications caused by having a destination string too small to
receive all of the characters in the source string are significant. One of the difficulties with C
programming is that such an error may not be immediately obvious and debugging the program could take a
significant amount of time.
Any time that you use the string functions to copy characters from one array to another array, the
potential exists for over-writing the memory after an array if the array is not sufficiently large to contain
all of the characters or if the source array is not properly terminated.

2.17 The typedef Statement


An alternative mechanism for defining a structure (and other data types as well) is the typedef state-
ment. The typedef statement defines a new data type (that is really just an alias for a basic data type or
structure).
The following program illustrates the use of the typedef statement to define the person structure used in
the previous sections. Note that the keyword struct is only used in the typedef statement – subsequently,
just the name of the structure (person in this case) is used. Also note that the name of the new data type
is specified at the end of the declaration. In this program, the typedef statement is defined globally (prior
to the main function) because the structure is used in more than one function. The typedef statement
may also be defined within a function if it will be used only within that one function.

#include <stdio.h>

typedef struct
{
    int number;
    int age;
} person;

void printPerson(person);

int main(int numParms, char *parms[])
{
    person person1;

    person1.number = 25;
    person1.age = 30;

    printPerson(person1);
    return 0;
}

void printPerson(person person1)
{
    printf("Person1: %d %d \n", person1.number, person1.age);
}

Person1: 25 30

2.18 Enumerated Types


An enumerated type is a collection of symbolic constants that can be used instead of explicit values.

enum {FALSE, TRUE} flag;

flag = TRUE;

if (flag == FALSE)
{
    ...
}

The values used in an enumerated type are integers that begin with 0 and increase in steps of 1. The
only exception to the assignment of values to the enumerated constants occurs when the constants are
given explicit values in the declaration.

enum {VALUE1=10, VALUE2=20} values;

Any constant that is not given a value is assigned the value of the previous constant plus 1.

As with the definition of structs, there are multiple ways of declaring the same enumerated type. The
following definition is the same as the one used in the earlier example, with the exception that the
definition below can be used in multiple declaration statements.

enum BOOLEAN {FALSE, TRUE};

enum BOOLEAN flag;

flag = TRUE;

if (flag == FALSE)
{
    ...
}

An enumerated type may be used in a typedef statement. This makes the definition of variables some-
what easier because the enum does not have to be included in the definition of each variable. The
following enumerated type implements the equivalent of the boolean data type in Java.

typedef enum {FALSE, TRUE} BOOLEAN;

BOOLEAN flag;

2.19 Unions
We saw earlier how several variables can be organized as a collection using the struct data type. With
a struct, we can refer to individual values and we can also pass the entire collection of values as one
parameter, the name of the struct. A related data type is the union data type. In a sense, a union is the
opposite of a struct: any number of variables may be defined in a union but only one of the variables can
be used at a time and the compiler reserves only enough room for the largest data type in the union.
For example, in the following union, 2 variables are defined, an int and a double.

union
{
int intValue;
double doubleValue;
} union1;

On many machines, the C compiler reserves 8 bytes for this union (sizeof(double) is 8 bytes and
sizeof(int) is 4 bytes). Memory organization will be examined in more detail in Chapter 4.

The variable union1 is declared to be of type union using the same notation as declaring a variable to be
a struct.
A union may also be defined separately from its use in a declaration statement (the same as with a struct).

union MY_UNION
{
int intValue;
double doubleValue;
};

union MY_UNION union1;

Similarly, a union may be defined using a typedef statement.

typedef union
{
    int intValue;
    double doubleValue;
} MY_UNION;

MY_UNION union1;

Regardless of which technique is used to declare the variable union1, we can then initialize and reference
its variables using the same notation as for a struct.

union1.intValue = 15;
printf("%d\n", union1.intValue);

15

Alternatively, we can initialize and refer to doubleValue.

union1.doubleValue = 123456789.0;
printf("%f\n", union1.doubleValue);

123456789.000000

However, only one value can be stored and referenced at a time—if after assigning a value to the double,
as shown above, we then reference the intValue in the union, the value that is printed is garbage (more
or less).
printf("%d\n", union1.intValue);

1409286144

When programmers use a union, they often include a flag or tag field that identifies the value that is
currently stored in the union. What we would like to do is to add a tag field to the beginning of the union.
Unfortunately, the instructions shown below do not do what we want. In this union, we can store a value
in tag, or in intValue, or in doubleValue but we can not store values in both tag and intValue at the same
time.
union MY_UNION
{
enum {INT_TYPE, DOUBLE_TYPE} tag; // Wrong!!
int intValue;
double doubleValue;
};

However, if we move the enumerated type out of the union and into a struct that also contains the union,
the program will work as desired.

struct MY_STRUCT
{
enum {INT_TYPE, DOUBLE_TYPE} tag;
union
{
int intValue;
double doubleValue;
} union1;
};

We can now declare a struct variable that contains the tag field plus the union of either an int or a double.
Before manipulating either the int or the double, we check the tag field to determine the type of value
that is currently stored in the union.

struct MY_STRUCT struct1;

struct1.tag = INT_TYPE;
struct1.union1.intValue = 15;

if (struct1.tag == INT_TYPE)
{
    printf("%d\n", struct1.union1.intValue);
}
else if (struct1.tag == DOUBLE_TYPE)
{
    printf("%f\n", struct1.union1.doubleValue);
}

Note that in order to reference either of the union variables in the structure, it is necessary to specify both
the name of the structure and the name of the union.
As with structs and enumerated types, a union may be defined in a typedef statement. As shown in the
example below, once the typedef statement has been defined, variables are declared to be of the type
defined in the typedef statement.

typedef struct
{
    enum {intType, doubleType} tag;
    union
    {
        int intValue;
        double doubleValue;
    } union1;
} MY_STRUCT;

MY_STRUCT struct1;

Unions provide a mechanism for saving space in memory and also disk space if the contents of memory
are written to disk. These days most machines have more than enough memory and disk space so the
advantage of using unions often doesn’t make up for the effort required to use unions properly.
However, if you are developing embedded systems where memory and disk space may be at a premium,
saving some memory by using unions is often justified.

2.20 Define Statement


The #define statement is used to define a symbolic constant; the value of a symbolic constant is
substituted for each occurrence of its name. Note that there is no equals sign in the definition of a
symbolic constant.

#define MAX_ELEMENTS 100 // Symbolic Constant

int data[MAX_ELEMENTS];
int count;

for (count=0; count<MAX_ELEMENTS; count++)
{
    ...
}

The use of a symbolic constant instead of a literal constant makes a program easier to read and also easier
to modify (since you don’t have to search through your program looking for all occurrences of the
constant and hoping that you don’t miss any occurrences).

2.21 Bit Manipulation


Computer memory consists of collections of “bytes” and each byte consists of a fixed number of “bits”.
It depends on the architecture of a specific computer but in most computers these days, there are 8
bits in each byte. Similarly, each data type in C consists of a machine-dependent number of bytes. For
example, on most Intel computers, 4 bytes are allocated for each int and 1 byte is allocated for each char.
However, rather than making an assumption about the number of bytes used to store each data type,
use the sizeof operator to determine the actual number of bytes used to represent a particular data
type. Using good programming practices such as this makes a program more machine independent.
When performing low-level processing, it is often necessary to manipulate the individual bits in a data
item instead of manipulating all of the bits as a unit. C provides the following bit-manipulation operators:

&     bitwise and
|     bitwise or (inclusive or)
^     bitwise xor (exclusive or)
<<    bitwise left shift
>>    bitwise right shift
~     bitwise one’s complement

These operators work as you would expect, performing the operation on corresponding bits in the
operand or operands provided.
When performing bit-wise operations, it is often easier to verify the results if they are displayed
in hexadecimal (or in octal, if you prefer that number base). The following program illustrates a few
simple operations and displays the results in hexadecimal (except for the output on the last line which
is displayed in octal).

#include <stdio.h>

int main(int numParms, char *parms[])
{
    int value1;
    int value2;
    int value3;

    value1 = 1;
    value2 = -1;

    value3 = value1 & value2;
    printf("Line1: %08X %08X %08X \n", value1, value2, value3);

    value3 = value2 & 0XFFFFFF00;
    printf("Line2: %08X %08X %08X \n", value1, value2, value3);

    value3 = 8 >> 1;
    printf("Line3: %08X %08X \n", 8, value3);

    value3 = value2 >> 2;
    printf("Line4: %08X \n", value3);

    value3 = ((unsigned) value2) >> 4;
    printf("Line5: %08X \n", value3);

    // octal constant with every bit set except the lowest one (32 bits)
    value3 = value2 & 037777777776;
    printf("Line6: %011o %011o %011o \n", value1, value2, value3);

    return 0;
}

Line1: 00000001 FFFFFFFF 00000001
Line2: 00000001 FFFFFFFF FFFFFF00
Line3: 00000008 00000004
Line4: FFFFFFFF
Line5: 0FFFFFFF
Line6: 00000000001 37777777777 37777777776

2.22 Multi-Dimensional Arrays


In earlier sections, we examined the use of one-dimensional arrays. C also supports multi-dimensional
arrays that are similar to Java’s multi-dimensional arrays. In this section, we will take a quick look at
2-dimensional arrays. Two-dimensional arrays are declared in the same manner as are two-dimensional
arrays in Java. The use of constants to specify the number of rows and the number of columns is a good
programming practice. Some C texts do not use the term “multi-dimensional array”; instead, they
prefer the term “array of an array”. In these notes, we will use the term multi-dimensional array.

The following program illustrates the creation, initialization, and manipulation of a two-dimensional
array. The only unusual point to note is that while the size of the first dimension of the array does not
have to be defined in the called function (printArray), the size of the second dimension must be explicitly
declared. (We will see how this can be avoided in Chapter 4.)

#include <stdio.h>

#define NUM_ROWS 3
#define NUM_COLS 2

// the size of the second dimension must be given in the prototype as well
void printArray(int, int, int[][NUM_COLS]);

int main(int numParms, char *parms[])
{
    int myArray[NUM_ROWS][NUM_COLS];
    int row, col;

    for (row=0; row<NUM_ROWS; row++)
    {
        for (col=0; col<NUM_COLS; col++)
        {
            myArray[row][col] = row*10 + col;
        }
    }

    printArray(NUM_ROWS, NUM_COLS, myArray);

    return 0;
}

void printArray(int rows, int cols, int myArray[][NUM_COLS])
{
    int row;
    int col;

    for (row=0; row<rows; row++)
    {
        for (col=0; col<cols; col++)
        {
            printf("%2d ", myArray[row][col]);
        }
        printf("\n");
    }
}

As you would expect, multi-dimensional arrays can be declared to be of any valid C data type.

2.23 ArrayList Functions


In this section, we use the techniques that have been developed in this chapter to define a collection
of functions that perform much of the same processing as Java’s ArrayList class. This example will be
continued in some of the following chapters.

1 #include <stdio.h>
2
3 const int NUM_ROWS = 3;
4 const int NUM_COLS = 2;
5
6 void printArray(int, int, int[][NUM_COLS]);
7
8 int main(int numParms, char *parms[])
9 {
10 int myArray[NUM_ROWS][NUM_COLS];
11 int row, col;
12
13 for (row=0; row<NUM_ROWS; row++)
14 {
15 for (col=0; col<NUM_COLS; col++)
16 {
17 myArray[row][col] = row*10 + col;
18 }
19 }
20
21 printArray(NUM_ROWS, NUM_COLS, myArray);
22
23 return 0;
24 }
25
26 void printArray(int rows, int cols, int myArray[][NUM_COLS])
27 {
28 int row;
29 int col;
30
31 for (row=0; row<rows; row++)
32 {
33 for (col=0; col<cols; col++)
34 {
35 printf("%2d ", myArray[row][col]);
36 }
37 printf("\n");
38 }
39 }

It should be obvious that the arraylist functions defined above are just a beginning. For example, the
array used to contain the arraylist elements is an array of int’s; also, the array is not automatically
resized when the array becomes full.

2.24 Summary
In this chapter, we examined the fundamental C constructs. These constructs are almost identical to the
corresponding constructs in Java (since the core of Java was based in large part on the C language).

Unlike Java programs, C programs are compiled directly into executable files (which consist of machine instructions
specific to the machine architecture on which the program was compiled). With the recent increases in
speed of the Java Virtual Machine, the difference in performance between executing a C program and a
Java program is becoming smaller and smaller.

The C language is a very elegant and compact language. While C does not support objects, we will see
in Chapters 3 and 4 that we can create programs that are similar to simple Java objects (not including
inheritance).

The C language is not as “safe” a language as Java and the programmer must be careful when writing
C programs.

For example, C does not automatically initialize local variables; it is the responsibility of the programmer to
initialize them (global and static variables are set to zero by default, but it is a good programming
practice to initialize all variables explicitly).

C does not perform subscript checking when traversing an array.

C does not have a string data type; a “string” in C is an array of characters that is correctly terminated.

The order of evaluation of expressions in C is not always obvious so the programmer should specify the
order of evaluation explicitly by including appropriate parentheses.

As you learn more features of the C language, it will be tempting to write very convoluted programs that
take advantage of the intricacies of the language—try to avoid this habit. Keep your code as readable as
possible and above all, ensure that you follow good programming practices so that your programs do
not contain subtle bugs.
2.25 Exercises
1. What are the differences between C and Java (as discussed in this chapter)?
2. What is the difference between call by value and call by reference? Which data types are passed
by value and which are passed by reference?
3. What happens when a char[] that does not contain the termination character is processed using
the string manipulation functions?
4. How can you create the equivalent of a very simple Java object in the C language?
Chapter 3

File Processing

3.1 Introduction
In Chapter 2, each program wrote its output to the system console, stdout. In this chapter, we examine
basic file processing in C. Using C’s file processing functions, we can read information from a file and
write information to a file. File processing in C is very similar to file processing in Java.

3.2 File Input


The following program reads the contents of the file in.txt one character at a time and sends each
character to the system console, stdout. The definition of the file variable: FILE *infile; contains an
unusual character, an asterisk (“*”). The asterisk has a special meaning in C which we will examine in
Chapter 4. For now though, just ignore the asterisk.

1 #include <stdio.h>
2
3 int main(int numParms, char *parms[])
4 {
5 FILE *infile;
6 int c; // fgetc returns an int so that EOF differs from every valid character
7
8 infile = fopen("in.txt", "r");
9
10 c = fgetc(infile);
11 while (c != EOF)
12 {
13 printf("%c", c);
14 c = fgetc(infile);
15 }
16 fclose(infile);
17
18 printf("All done\n");
19 return 0;
20 }

You should notice that this program is almost identical to the corresponding Java program that reads
characters from a file, one character at a time. The value EOF that is used in the while statement indicates
that the end of the file has been reached. This value is defined in stdio.h.
1 #include <stdio.h>
2
3 int main(int numParms, char *parms[])
4 {
5 FILE *infile;
6 int c; // fgetc returns an int so that EOF differs from every valid character
7
8 infile = fopen("in.txt", "r");
9
10 while ((c=fgetc(infile)) != EOF)


11 {
12 printf("%c", c);
13 }
14 fclose(infile);
15
16 printf("All done\n");
17 return 0;
18 }

As with Java, the function that reads from a file may be included in the while statement, as is shown
in the second program above.
However, programmers often forget the inner set of parentheses and instead write the statement as:

while (c=fgetc(infile) != EOF) // Wrong!!

Because the != operator has a higher priority than the = operator, the statement above compares the
value returned by fgetc with EOF and then assigns the result of the comparison (1 or 0) to c. The
statement is therefore not processed correctly at run-time (and, depending on the compiler and its
warning settings, it may compile without any errors or warnings). As a result, in the following
examples, we will use the first style for file processing: read from the file before the loop and then
read the next piece of information from the file at the end of the loop. It is typically a matter of
personal preference which style is used but if you use the second style (with the file input function
inside the while statement), ensure that you include the appropriate parentheses.
The program below illustrates the point made earlier that the C run-time system removes any
carriage-return characters that are immediately followed by a newline character (the carriage return
and newline combination is used on Windows systems). The program was run with the source program
itself being used as the input file. Even though the source file contains both carriage return and newline
characters, only the newline characters are presented to the program. This is the expected behaviour for
text input files (text is the default when reading a file).

1 #include <stdio.h>
2
3 int main(int numParms, char *parms[])
4 {
5 FILE *infile;
6 int c; // fgetc returns an int so that EOF differs from every valid character
7
8 infile = fopen("main.c", "r");
9
10 c = fgetc(infile);
11 while (c != EOF)
12 {
13 if (c == '\n')
14 {
15 printf("N\n");
16 }
17 else if (c == '\r')
18 {
19 printf("R");
20 }
21 else
22 {
23 printf("%c", c);
24 }
25 c = fgetc(infile);
26 }
27 fclose(infile);
28
29 printf("All done\n");
30
31 return 0;
32 }

Partial program output will look like:

#include <stdio.h>N
N
int main(int numParms, char *parms[])N
{N
FILE *infile;N
char c;N
int count1;N
int count2;N
N
infile = fopen("main.c", "r");N
N
...

If it is necessary to process all characters in a file, the file should be processed as a binary input file by
opening the file with the string “rb”. As can be seen from the output below, the carriage-return character
is now presented to the processing program.

#include <stdio.h>RN
RN
int main(int numParms, char *parms[])RN
{RN
FILE *infile;RN
char c;RN
int count1;RN
int count2;RN
RN
infile = fopen("checkchars.c", "rb");RN
RN
...

The following program is a slight elaboration of the programs above. This program reads characters one
at a time, saves the characters in a char array, and then when the end-of-line character is encountered,
the entire line is printed using the string format code %s.

1 #include <stdio.h>
2
3 int main(int numParms, char *parms[])
4 {
5 FILE *infile;
6 char line[100];
7 int c; // fgetc returns an int so that EOF differs from every valid character
8 int count;
9
10 infile = fopen("in.txt", "r");
11
12 count = 0;
13 c = fgetc(infile);
14 while (c != EOF)
15 {
16 line[count] = c;
17 if (line[count] == '\n')
18 {
19 line[count+1] = '\0';
20 printf("%s", line);
21 count = 0;
22 }
23 else
24 {
25 count++;
26 }
27 c = fgetc(infile);
28 }
29 if (count > 0)
30 {
31 line[count] = '\n';
32 line[count+1] = '\0';
33 printf("%s", line);
34 }
35 fclose(infile);
36 return 0;
37 }

The if statement after the loop handles the case where the last line in the input file contains 1 or more
characters but does not end with a newline character. The program above (and the program below)
contains a subtle bug. Can you determine what it is?
To avoid having to process a line one character at a time, we could move the processing into its own
function, as shown below. The getLine function reads the contents of one line or reads until the array is
full. The getLine function always terminates the array with the sentinel character so you must ensure
that the array is large enough to contain the number of characters specified plus one additional character
for the sentinel. For example, in the program below, we specify that up to and including 100 characters
can be read from a line but we declare the array to be of size 101 to ensure that there is always room for
the sentinel character.
1 #include <stdio.h>
2
3 char getLine(int, char[], FILE*);
4
5 int main(int numParms, char *parms[])
6 {
7 const int MAX_CHARS = 100;
8 FILE *infile;
9 char line[MAX_CHARS+1];
10 char result;
11
12 infile = fopen("in.txt", "r");
13 result = getLine(MAX_CHARS, line, infile);
14 while (result != EOF)
15 {
16 printf("%s", line);
17 result = getLine(MAX_CHARS, line, infile);
18 }
19 if (line[0]!='\n') // Check for missing newline at end of file
20 {
21 printf("%s\n", line);
22 }
23 fclose(infile);
24
25 return 0;
26 }
27
28 char getLine(int numChars, char line[], FILE *infile)
29 {
30 int count;
31 char c;
32
33 c = fgetc(infile);
34 for (count = 0; (c != EOF) && (c != '\n'); count++)
35 {
36 line[count] = c;
37 c = fgetc(infile);
38 }
39 line[count] = '\n';
40 count++;
41 line[count] = '\0';
42 return c;
43 }

In the program above, if the input file is not terminated by a newline character, we would fail to print
the contents of the last line in the file if the processing is confined to the loop. It is for this reason that
we must check the contents of the line returned when the EOF value is returned.
As we have seen, when the fgetc function detects end of file, it returns the special value EOF (EOF is an
int value, not a character). One of the features of fgetc is that it can be called again after the end of file
is detected; fgetc continues to return EOF if it is called more than once after end of file. We can take
advantage of this feature to improve the program above so that it is no longer the responsibility of the
calling function to check for a non-empty line when the end of the file is encountered.
1 #include <stdio.h>
2
3 char getLine(int, char[], FILE*);
4
5 int main(int numParms, char *parms[])
6 {
7 const int MAX_CHARS = 100;
8 FILE *infile;
9 char line[MAX_CHARS+1];
10 char result;
11
12 infile = fopen("in.txt", "r");
13 result = getLine(MAX_CHARS, line, infile);
14 while (result != EOF)
15 {
16 printf("%s", line);
17 result = getLine(MAX_CHARS, line, infile);
18 }
19 fclose(infile);
20
21 printf("\n\nAll done\n");
22 return 0;
23 }
24
25 char getLine(int numChars, char line[], FILE *infile)
26 {
27 int count;
28 char result;
29 static char c = ' ';
30
31 // Take advantage of the fact that we can call fgetc after reaching EOF;
32 // fgetc keeps returning EOF once the end of the file has been reached.
33 c = fgetc(infile);
34 for (count = 0; (c != EOF) && (c != '\n'); count++)
35 {
36 line[count] = c;
37 c = fgetc(infile);
38 }
39 line[count] = '\n';
40 count++;
41 line[count] = '\0';
42
43 // if we have encountered EOF but there are also characters to return,
44 // (i.e. the last line in the file was not terminated by '\n')
45 // don't return EOF until the next call to getLine.
46 if ((c == EOF) && (count > 1))
47 {
48 result = ' ';
49 }
50 else
51 {
52 result = c;
53 }
54 return result;
55 }

Finally, C supports line-oriented input using the fgets function. (The previous examples were just a
warm-up for this function.) The only part that is different is that the variable result is defined using the
statement char *result; Once again, we will examine the purpose of the asterisk in Chapter 4 and the
asterisk can be ignored for now.

1 #include <stdio.h>
2
3 int main(int numParms, char *parms[])
4 {
5 const int MAX_CHARS = 100;
6 FILE *infile;
7 char line[MAX_CHARS];
8 char *result;
9
10 infile = fopen("in.txt", "r");


11 result = fgets(line, MAX_CHARS, infile);
12 while (result != NULL)
13 {
14 printf("%s", line);
15 result = fgets(line, MAX_CHARS, infile);
16 }
17 fclose(infile);
18
19 printf("\n\nAll done\n");
20 return 0;
21 }

The fgets function reads characters and stores them in the variable line (as shown above). Characters
are read until a newline character is encountered (the newline character is included with the characters
stored in line) or until MAX_CHARS-1 characters have been read. fgets terminates the string correctly by
adding the ’\0’ character after the last character added to the string. When end of file is encountered or
if a file error is detected, the value NULL is returned.

3.3 File Output


File output in C is very similar to file output in Java. The example below illustrates how the contents of
one file may be copied to another file, one character at a time.

1 #include <stdio.h>
2
3 int main(int numParms, char *parms[])
4 {
5 FILE *infile;
6 FILE *outfile;
7 int c; // fgetc returns an int so that EOF differs from every valid character
8
9 infile = fopen("in.txt", "r");
10 outfile = fopen("out.txt", "w");
11
12 c = fgetc(infile);
13 while (c != EOF)
14 {
15 fputc(c, outfile);
16 c = fgetc(infile);
17 }
18 fclose(infile);
19 fclose(outfile);
20
21 printf("All done\n");
22
23 return 0;
24 }

The following program uses string input and string output to copy the contents of one file to another
file.
1 #include <stdio.h>
2
3 int main(int numParms, char *parms[])
4 {
5 const int MAX_CHARS = 100;
6 FILE *infile;
7 FILE *outfile;
8 char line[MAX_CHARS+1];
9 char *result;
10
11 infile = fopen("in.txt", "r");
12 outfile = fopen("out.txt", "w");
13 result = fgets(line, MAX_CHARS, infile);
14 while (result != NULL)
15 {
16 fputs(line, outfile);
17 result = fgets(line, MAX_CHARS, infile);
18 }
19 fclose(infile);
20 fclose(outfile);
21
22 printf("All done\n");
23 return 0;
24 }

When the program above opens the output file, any information that was in the file is lost and is replaced
by the new information written by the program. If it is necessary to append to the end of an existing file
instead of replacing the file, the file is opened with the parameter “a” instead of “w”.

3.4 Formatted File Input


Until now, the information read from a file has consisted of character strings. In the same way that we
can format output that is sent to stdout or to a file by including a format code, we can specify the format
in which input is specified using a format code and one of the scan functions.
The following program reads one integer at a time from an input file and determines the sum of all of
the integers in the file. Each line in the file may consist of any number of integer values. The values are
separated by one or more white space characters (blank, carriage return, newline, formfeed, tab, vertical
tab). There is one aspect of C that has not yet been encountered, the ampersand (“&”). The ampersand
is placed before the variable in the fscanf function. The ampersand operator is related to the asterisk
operator and both are discussed in Chapter 4.

1 #include <stdio.h>
2
3 int main(int numParms, char *parms[])
4 {
5 FILE *infile;
6 int myInt;
7 int sum;
8 int result;
9
10 infile = fopen("ints.txt", "r");
11
12 sum = 0;
13 result = fscanf(infile, "%d ", &myInt);
14 while (result != EOF)
15 {
16 printf("%d\n", myInt);
17 sum += myInt;
18 result = fscanf(infile, "%d ", &myInt);
19 }
20
21 fclose(infile);
22 printf("\nThe sum is %d\n", sum);
23
24 printf("All done\n");
25 return 0;
26 }

The fscanf function returns an int, the number of values successfully read. If the number of values read,
n, is less than the number of variables specified after the format string, then the first n values were
read correctly but the next value was not read correctly. When end of file is encountered before any
value is read, the value EOF is returned. fscanf ignores the newline character (newline is considered to
be white space); if a scan ends in the middle of a line, the next scan will resume where the previous
scan left off. Similarly, if there are insufficient characters on a line to fill all of the variables in the scan
statement, the scan is continued on the next line.
The following program is a slight variation on the program above. The program reads one line at a time
into a char array (line). The program then scans line for the first integer value. In this program, reading
a line is separated from scanning the line for values. Once an integer value is located, any remaining
integer values on the same line are ignored.

1 #include <stdio.h>
2
3 int main(int numParms, char *parms[])
4 {
5 const int MAX_CHARS = 100;
6
7 char line[MAX_CHARS+1];
8
9 FILE *infile;
10 int myInt;
11 int sum;
12 char *result;
13
14 infile = fopen("ints.txt", "r");
15 sum = 0;
16 result = fgets(line, MAX_CHARS, infile);
17 while (result != NULL)
18 {
19 printf("%s\n", line);
20 sscanf(line, "%d", &myInt);
21 sum += myInt;
22 result = fgets(line, MAX_CHARS, infile);
23 }
24 fclose(infile);
25
26 printf("\nThe sum is %d\n", sum);
27 printf("All done\n");
28
29 return 0;
30 }

The sscanf function is used to scan a string instead of the contents of a file. The function works in the
same manner as the scanf function. If the end of the string is encountered without having extracted a
value for any of the parameters, sscanf returns EOF. If the first n parameters were given values but the
remainder of the parameters were not given values, sscanf returns the value n.
Unfortunately, sscanf can not be used to resume a scan where the previous scan left off (sscanf always
begins its scan at the beginning of the string). One (somewhat ugly) way around this is to write the
contents of the string to a temporary file and then use fscanf to scan and then resume the scan of the
file. There is a more elegant technique that can be used with sscanf to perform the equivalent processing
(scanning and then resuming the scan where the previous scan terminated). The proof is left to the
reader.
The format code in a scan function may also include the maximum width of each value. For example, the
format string “%3d” specifies that an integer value of at most 3 characters is to be read.

3.5 Formatted File Output


Formatted file output in C works in the same manner as formatted output to the system console. The
only difference is that the fprintf function is used and this function requires the file variable (the
value returned by fopen) as the first parameter. The variables in the parameter list are formatted
according to the format codes in the format string. The following program illustrates a simple use of
the fprintf function.

1 #include <stdio.h>
2
3 int main(int numParms, char *parms[])
4 {
5 const int MAX_CHARS = 100;
6 FILE *infile;
7 FILE *outfile;
8 char line[MAX_CHARS+1];
9 char *result;
10
11 infile = fopen("in.txt", "r");


12 outfile = fopen("out.txt", "w");
13 result = fgets(line, MAX_CHARS, infile);
14 while (result != NULL)
15 {
16 fprintf(outfile, "%s", line);
17 result = fgets(line, MAX_CHARS, infile);
18 }
19 fclose(infile);
20 fclose(outfile);
21
22 printf("\n\nAll done\n");
23 return 0;
24 }

The processing involved in formatted file output can also be used to write formatted output to a string
instead of to a file. The sprintf function sends its output to a string instead of to a file. The sprintf
function terminates the output correctly with the ’\0’ character. The function returns the number of
characters written to the string, not including the termination character.

1 #include <stdio.h>
2
3 int main(int numParms, char *parms[])
4 {
5 char line[100];
6 int value1;
7 int value2;
8
9 value1 = 25;
10 value2 = -9999;
11 sprintf(line, "%d %d", value1, value2);
12
13 printf("\nvalue1 %d value2 %d line %s\n", value1, value2, line);
14
15 printf("All done\n");
16
17 return 0;
18 }

3.6 Console Input

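The system console can also be used as an input device. The file input functions described in this chapter
can be applied to the system console by using the file variable stdin or, in most cases, by omitting the
character “f” at the beginning of the input function name. For example, the following are equivalent:

scanf(...)      fscanf(stdin, ...)
getchar()       fgetc(stdin)

The gets function is the console counterpart of fgets, but the two are not equivalent: gets discards the
newline character and cannot limit the number of characters read, so it can overflow the array. It was
removed from the language in the C11 standard; use fgets(..., stdin) instead.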

The following program writes a prompt to the system console and then reads an integer from the console
using the scanf function.

1 #include <stdio.h>
2
3 int main(int numParms, char *parms[])
4 {
5 int myInt;
6
7 printf("Please enter an integer: ");
8 scanf("%d", &myInt);
9 printf("\nThe integer entered was %d\n", myInt);
10 return 0;
11 }
3.7 Standard Error


Until now, you have probably written error messages to stdout using the printf statement. For
production programs, error information should be written to stderr instead of stdout. By default, any
output sent to stderr is directed to the system console. The following program illustrates writing to
stderr and also the use of the exit function to terminate a program prematurely when an unrecoverable
error occurs.

1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(int numParms, char *parms[])
5 {
6 int count;
7 int sum;
8
9 sum = 0;
10 for (count=0; count<100000000; count++)
11 {
12 sum += count;
13 if (sum < 0)
14 {
15 fprintf(stderr, "Result overflowed into sign bit\n");
16 exit(EXIT_FAILURE);
17 }
18 }
19
20 printf("All done\n");
21
22 return 0;
23 }

3.8 Summary
In this chapter, we examined simple file processing. Files may be read and written using character
processing or using formatted input and/or output. The descriptions of the various file manipulation
functions are very brief to avoid overwhelming the reader with details. For more detailed information,
consult a more detailed C language manual or the internet. It is important to note that error
checking was not included in the examples in this chapter.

3.9 Exercises
1. What does the following portion of code do? (This example is taken from K&R, Second Edition,
p. 96; the getint function returns one integer value.) This code should convince you to write more
readable code (although this code is perfectly readable to an experienced C programmer).

int n, array[SIZE], getint(int *);

for (n=0; n<SIZE && getint(&array[n]) != EOF; n++)

2. Pick one of the example programs in this chapter and add error checking that ensures that the
program will run correctly in all circumstances or will generate an appropriate error message.
Chapter 4

Working with Modules

In Chapters 2 and 3, each program was defined in one source file. For small programs, this is acceptable,
but as programs become larger, good programming practices dictate that a program should be subdi-
vided into separate modules, where each module is stored in a separate source file. In this chapter, we
examine how a C program can be split into separate modules/files and what additional work has to be
done to permit this.

4.1 Modules
In Chapter 2, we created a simple program (shown below) that created a struct and then passed the
struct to a print function.

1 #include <stdio.h>
2
3 typedef struct
4 {
5 int number;
6 int age;
7 } person;
8
9 void printPerson(person);
10
11 int main(int numParms, char *parms[])
12 {
13 person person1;
14
15 person1.number = 25;
16 person1.age = 30;
17
18 printPerson(person1);
19 return 0;
20 }
21
22 void printPerson(person person1)
23 {
24 printf("Person1: %d %d \n", person1.number, person1.age);
25 }

We will begin our work with modules by breaking this program up into two modules, one module will
contain the main function and the other module will contain the printPerson function.

main.c

1 #include <stdio.h>
2
3 typedef struct
4 {
5 int number;
6 int age;
7 } person;


8
9 int main(int numParms, char *parms[])
10 {
11 person person1;
12
13 person1.number = 25;
14 person1.age = 30;
15
16 printPerson(person1);
17 return 0;
18 }

module1.c

1 #include <stdio.h>
2
3 typedef struct
4 {
5 int number;
6 int age;
7 } person;
8
9 void printPerson(person person1)
10 {
11 printf("Person1: %d %d \n", person1.number, person1.age);
12 }

The two modules can be compiled using the following statement:

1 sh-3.2$ gcc -ggdb main.c module1.c -o ch3


Unfortunately, when the two modules are compiled, multiple error messages are generated because the
main module does not know about the module1 module. We need the equivalent of a function prototype
that is used to declare a function before it is used in the same module. Fortunately, C supports the
definition of “header files” that contain the prototypes of functions that are defined in another module.
Such a header file is incorporated into the following main.c module using the include statement, as
shown below.

main.c

1 #include <stdio.h>
2 #include "module1.h"
3
4 typedef struct
5 {
6 int number;
7 int age;
8 } person;
9
10 int main(int numParms, char *parms[])
11 {
12 person person1;
13
14 person1.number = 25;
15 person1.age = 30;
16
17 printPerson(person1);
18 return 0;
19 }

module1.h

1 void printPerson(person); // Header file

module1.c

1 #include <stdio.h>
2
3 typedef struct
4 {
5 int number;
6 int age;
7 } person;
8
9 void printPerson(person person1)
10 {
11 printf("Person1: %d %d \n", person1.number, person1.age);
12 }

Note that the program now consists of 3 separate files: the main.c file, the module1.c file, and the module1.h
file. However, the compile statement is the same as before: the header file is not specified in the compile
statement because the compiler reads it when it processes the include statement.

1 sh-3.2$ gcc -ggdb main.c module1.c -o ch3



The program now compiles and executes correctly because the main module knows that the printPerson
function is defined in the module1.c module. (Actually, the compiler complains that in main.c, the header
file references a structure before the structure has been defined but this problem only results in a warn-
ing and could easily be fixed by moving the include “module1.h” statement after the definition of the
structure.)
However, even though the program executes correctly, there is one thing that should bother you—the
definition of struct person is repeated in both the main module and in the module1 module. It is not a
good programming practice to duplicate code in this manner because subsequent modifications to the
program may not be made to all copies of the structure. So, we would like to move the structure defini-
tion to one common location. The header file module1.h is the perfect place for the structure definition.
The following revised program illustrates the use of the header file to store a structure definition that is
required by several modules.

main.c

1 #include <stdio.h>
2 #include "module1.h"
3
4 int main(int numParms, char *parms[])
5 {
6 person person1;
7
8 person1.number = 25;
9 person1.age = 30;
10
11 printPerson(person1);
12 return 0;
13 }

module1.h

1 typedef struct
2 {
3 int number;
4 int age;
5 } person;
6
7 void printPerson(person);

module1.c

1 #include <stdio.h>
2 #include "module1.h"
3
4 void printPerson(person person1)
5 {
6 printf("Person1: %d %d \n", person1.number, person1.age);
7 }

Note that now the module1.c module must also include the module1.h header file because the header file
contains the definition of the structure.

So a program may be subdivided into any number of modules. Each module must have its own header
file that declares the prototypes of each function in the module (plus any structures and/or typedefs
that are used in more than one module). When subdividing a program into modules, it is a good idea
to think of each module as being similar to a Java class so that related functions are kept in the same
module.

4.2 Defining a “Constructor”


In the previous section, we took a program that manipulated a person structure and moved the function
that printed a person into its own module. If you examine the main function, you will notice that this
function initializes the values of the two “instance” variables in the structure. This is processing that
would be better performed in the module1 module.

The program below illustrates how this processing can be performed. Notice that the structure is de-
clared in the main function and is passed as a parameter to each function that performs processing on
the structure.
main.c

1 #include <stdio.h>
2 #include "module1.h"
3
4 int main(int numParms, char *parms[])
5 {
6 person person1;
7
8 person1 = newPerson(person1, 25, 30);
9
10 printPerson(person1);
11 return 0;
12 }

module1.h

1 typedef struct
2 {
3 int number;
4 int age;
5 } person;
6
7 person newPerson(person, int, int);
8
9 void printPerson(person);

module1.c

1 #include <stdio.h>
2 #include "module1.h"
3
4 person newPerson(person person1, int newNumber, int newAge)
5 {
6 person1.number = newNumber;
7 person1.age = newAge;
8 return person1;
9 }
10
11 void printPerson(person person1)
12 {
13 printf("Person1: %d %d \n", person1.number, person1.age);
14 }

As can be seen from this program, we have created a module that is very similar to a Java class: the only
significant difference is that the instance variables (the person struct) are stored in the main module
instead of being encapsulated in the class. We will examine how to encapsulate variables in a C module
in Chapter 5.

4.3 Scope of Variables


Until now, we have assumed that the scope of variables that are defined in C programs is the same as the
scope of variables defined in a Java program (and, for the most part, this is accurate). However, there
are some differences between C and Java that should be examined.

4.3.1 Global Variables


As was mentioned in Chapter 2, variables that are declared before the main function are global to the
entire module. In fact, global variables can be accessed in other modules as well if the variable
declarations in the other modules are prefixed with the keyword extern. The following program illustrates
this feature.

main.c

#include <stdio.h>
#include "module1.h"

int array1[10];
int numElements = 10;

int main(int numParms, char *parms[])
{
    int count;

    for (count=0; count<numElements; count++)
    {
        array1[count] = count+1;
    }

    myFunction();

    return 0;
}

module1.c

#include <stdio.h>
#include "module1.h"

void myFunction()
{
    extern int array1[];
    extern int numElements;

    int count;

    for (count=0; count<numElements; count++)
    {
        printf("%d ", array1[count]);
    }
    printf("\n");
}

1 2 3 4 5 6 7 8 9 10

As can be seen, myFunction, which is in a different module, is able to access the array and the number of
elements in the array by declaring the two variables to be extern variables.
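
The same mechanism can be seen in miniature inside a single file. The sketch below is only an illustration: the names sharedValues, sharedCount and sumShared are made up for this example and do not appear in the programs above. The function declares the two globals as extern before using them, and the definitions appear further down, much as they would appear in a different module.

```c
// Hypothetical example: sumShared() uses globals that are defined elsewhere.
int sumShared(void)
{
    extern int sharedValues[];   // declaration only: no storage is reserved here
    extern int sharedCount;

    int count;
    int total = 0;

    for (count = 0; count < sharedCount; count++)
    {
        total = total + sharedValues[count];
    }
    return total;
}

// The definitions; in a multi-module program these would live in another .c file.
int sharedValues[3] = {10, 20, 30};
int sharedCount = 3;
```

With these definitions, a call to sumShared() returns 60. Note that the extern declaration of the array does not need to repeat its size.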

4.3.2 Private Global Variables


There are times when defining variables to be global to a module is necessary but you do not want those
variables to be accessible in other modules. This is the equivalent of private variables in a Java class. In
C, you can obtain the same effect if you declare global variables to be static. Note that this definition of
static is not the same as Java’s use of static.

main.c

#include <stdio.h>
#include "module1.h"

static int array1[10];
static int numElements = 10;

int main(int numParms, char *parms[])
{
    int count;

    for (count=0; count<numElements; count++)
    {
        array1[count] = count+1;
    }

    myFunction();

    return 0;
}

module1.h

void myFunction();

module1.c

#include <stdio.h>
#include "module1.h"

void myFunction()
{
    extern int array1[];
    extern int numElements;

    int count;

    for (count=0; count<numElements; count++)
    {
        printf("%d ", array1[count]);
    }
    printf("\n");
}

In this program, even though module1.c contains extern declarations, the program cannot be built: each
module compiles on its own, but the linker is not able to find array1 and numElements because the two
variables have been hidden from other modules by the keyword static. (The following error message is
generated.)

module1.o(.text+0x18): In function ‘myFunction’:
C:/Ch3/module1.c:11: undefined reference to ‘numElements’
module1.o(.text+0x29):C:/Ch3/module1.c:13: undefined reference to ‘array1’

Tool completed with exit code 2
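
Hiding a variable with static is usually done on purpose, in the same way that a Java class keeps a field private and offers public methods to work with it. The following sketch shows the idea; the names numCalls, recordCall and getNumCalls are hypothetical and are not part of the programs in this chapter.

```c
// Hypothetical module: the counter is visible only inside this file.
static int numCalls = 0;          // private to this module

void recordCall(void)             // public: other modules may call this
{
    numCalls++;
}

int getNumCalls(void)             // public: read-only access to the private value
{
    return numCalls;
}
```

Other modules would see only the prototypes of recordCall and getNumCalls (placed in the module's header file); an extern declaration of numCalls in another module would fail at link time, as shown above.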

4.3.3 Local Variables


As with Java, local variables are variables that are defined within a C function. Local variables can be
accessed anywhere in the function in which they are declared but cannot be accessed from outside of
the function.

Local variables are re-allocated each time that the function is called, so the values of local variables are
lost at the end of each function invocation and are not available the next time that the function is
called, as shown in the program below.

#include <stdio.h>

void printElement(void);

int array1[10];
int numElements = 10;

int main(int numParms, char *parms[])
{
    int count;

    for (count=0; count<numElements; count++)
    {
        array1[count] = count+1;
    }

    for (count=0; count<numElements; count++)
    {
        printElement();
    }

    return 0;
}

void printElement()
{
    int count = 0;

    if (count < numElements)
    {
        printf("%d ", array1[count]);
        count++;
    }
}

1 1 1 1 1 1 1 1 1 1

4.3.4 Static Local Variables


Occasionally, it is useful if the value of a local variable is retained from one function invocation to the
next invocation of the same function. This can be accomplished by defining the local variables to be
static variables. Note that this use of static is not the same as the use of the keyword static that is applied
to global variables. (Bad choice of keywords.)

#include <stdio.h>

void printElement(void);

int array1[10];
int numElements = 10;

int main(int numParms, char *parms[])
{
    int count;

    for (count=0; count<numElements; count++)
    {
        array1[count] = count+1;
    }

    for (count=0; count<numElements; count++)
    {
        printElement();
    }

    return 0;
}

void printElement()
{
    static int count = 0; // static local variable

    if (count < numElements)
    {
        printf("%d ", array1[count]);
        count++;
    }
}

1 2 3 4 5 6 7 8 9 10

You should note that the initialization of the static variable count to zero happens only once, not each
time that the function is called.
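
The difference between the two kinds of local variable can be reduced to a pair of very small functions. This is only a sketch; nextAutomatic and nextStatic are hypothetical names chosen for this illustration.

```c
// Hypothetical functions contrasting the two kinds of local variable.
int nextAutomatic(void)
{
    int calls = 0;          // re-created and reset on every call
    calls++;
    return calls;
}

int nextStatic(void)
{
    static int calls = 0;   // initialized once; value survives between calls
    calls++;
    return calls;
}
```

Calling nextStatic() three times in a row returns 1, 2 and 3, whereas nextAutomatic() returns 1 every time.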

4.4 An ArrayList Module


In this section, we use the techniques developed in this chapter to define a module that performs much
of the same processing as Java’s ArrayList class.

main.c

#include <stdio.h>
#include "module1.h"

int main(int numParms, char *parms[])
{
    list myList;

    myList = newList(myList);

    printList(myList);

    myList = addList(myList, 100);
    myList = addList(myList, 200);
    myList = addList(myList, 300);
    myList = addList(myList, 400);

    printList(myList);

    myList = removeList(myList, 3);

    printList(myList);

    myList = removeList(myList, 0);

    printList(myList);

    myList = addList(myList, 500);

    printList(myList);

    myList = removeList(myList, 0);
    myList = removeList(myList, 0);
    myList = removeList(myList, 0);

    printList(myList);

    printf("\nAll done\n");
    return 0;
}

module1.h

typedef struct
{
    int size;
    int values[100];
} list;

list newList(list);

list addList(list, int);

int getList(list, int);

int sizeList(list);

list removeList(list, int);

void printList(list);

module1.c

#include <stdio.h>
#include "module1.h"

list newList(list myList)
{
    myList.size = 0;
    return myList;
}

list addList(list myList, int value)
{
    myList.values[myList.size] = value;
    myList.size++;
    return myList;
}

int getList(list myList, int position)
{
    int entry;

    entry = myList.values[position];
    return entry;
}

int sizeList(list myList)
{
    return myList.size;
}

list removeList(list myList, int position)
{
    int count;

    for (count=position; count<(myList.size-1); count++)
    {
        myList.values[count] = myList.values[count+1];
    }
    myList.size--;
    return myList;
}

void printList(list myList)
{
    int count;

    printf("Current list contents:\n");
    if (myList.size > 0)
    {
        for (count=0; count<myList.size; count++)
        {
            printf("Element %d is %d\n", count, getList(myList, count));
        }
        printf("\n");
    }
    else
    {
        printf("The list is empty.\n\n");
    }
}

Current list contents:
The list is empty.

Current list contents:
Element 0 is 100
Element 1 is 200
Element 2 is 300
Element 3 is 400

Current list contents:
Element 0 is 100
Element 1 is 200
Element 2 is 300

Current list contents:
Element 0 is 200
Element 1 is 300

Current list contents:
Element 0 is 200
Element 1 is 300
Element 2 is 500

Current list contents:
The list is empty.

Note that there is still a significant limitation in this module: the size of the array can never be
increased. We will see in the next chapter how this problem can be resolved.
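
Until the array can grow, the module can at least refuse to write past the end of it. The following is a minimal sketch of that idea and is not part of the module above: the function name addListChecked and the constant MAX_VALUES are assumptions made for this illustration, and the list structure is repeated so that the fragment stands on its own.

```c
#define MAX_VALUES 100      // matches the array size used in module1.h

typedef struct
{
    int size;
    int values[MAX_VALUES];
} list;

// Hypothetical variant of addList: the list is returned unchanged when it is full.
list addListChecked(list myList, int value)
{
    if (myList.size < MAX_VALUES)
    {
        myList.values[myList.size] = value;
        myList.size++;
    }
    return myList;
}
```

A caller can detect that a value was rejected by comparing the size of the list before and after the call.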

4.5 Summary
In this chapter we have examined how a large program can be subdivided into a collection of smaller
modules. (Although the examples showed only two modules, any number of modules may be defined
in a system.) The scope and lifetime of variables in a C program have been expanded by the addition of
the keywords extern and static.
While C does not support object orientation, when programs are modularized, it is possible to observe
many of the practices that make object-oriented programming more powerful than procedural program-
ming.
It should also be obvious that the programs developed so far have not included any error-checking.
Error-checking is one of the fundamental good programming practices and we will take a closer look at
it in subsequent chapters.

4.6 Exercises
1. What are the differences in the scope of variables in Java and C?
2. What are the differences in the visibility of variables in Java and C?
3. What is the purpose of a header file?
4. What are the two meanings of the keyword “static”?
5. What does the “lifetime” of a variable mean?
6. There is another type of storage class, the auto (automatic) class. What does this storage class do?
Chapter 5

Pointers and Memory Management

In Chapter 2, we examined the basic C constructs. In Chapter 4, we saw how a program could be
decomposed into multiple modules which are similar to Java classes. In this chapter, we examine how
data structures can be created dynamically, how pointers (references in Java) can be used to point to
dynamically created data structures, and what the * and & operators mean.

5.1 Memory Organization


We have already seen that a computer’s memory consists of a collection of bytes, each of which consists
of a collection of bits. The size of a byte (that is, the number of bits in the byte) may vary from machine
to machine. The number of bytes used to represent each data type may also vary from machine to
machine so it is important that programs be written in as machine-independent a manner as possible.
For example, use the sizeof operator to determine the number of bytes used to represent a particular
data type.
When a program is loaded into memory, different portions of the program are stored in
different areas of memory. A generic view of the memory segments used by a C program is shown
below. (Understanding the organization of memory in a C program is not really necessary to become a
proficient C programmer but it doesn’t hurt either.)

Stack

Heap

BSS

Data

Text

Figure 1: Memory Layout of a C Program

The text segment contains the program’s executable instructions; the data segment contains initialized
global variables; the BSS segment (“block started by symbol”, an assembly-language instruction from the
IBM 7094) contains uninitialized global variables (the C standard guarantees that these variables start
out as zero, but initializing them explicitly makes the intent clearer); the stack contains local variables
used within a function; and the heap contains storage explicitly allocated by the programmer. Depending
on the operating system, different combinations of access permissions (read, write, execute) may be
assigned to each segment.
The size command can be used to determine the size of each of the first three segments.

sh-3.2$ size ch4.exe
   text    data     bss     dec     hex filename
   2560    1536     112    4208    1070 ch4.exe
sh-3.2$

The amount of memory used by the stack and by the heap grows and shrinks as a program performs its
processing so the size command can not display the amount of memory that will be used by the stack
or heap.
When a function is called, the variables declared in the function are allocated in an area of memory
referred to as “local memory” which is allocated on the “stack”. When the function terminates, the
memory used by the function is released (made available for reuse).
When memory is allocated dynamically (using the malloc function, described later in this chapter), that
memory is taken from a different area of memory referred to as “dynamic memory” which is allocated
on the “heap”. Memory allocated in the heap remains dedicated to the program until the programmer
explicitly releases the memory (using the free function). It is a common programming error to forget
to release dynamic memory—this error causes what is referred to as a “memory leak”. Memory leaks
can be a major problem in programs that execute for a long period of time since the amount of memory
that is allocated but no longer required can continue to increase to the point at which there is no more
memory available for dynamic allocation.
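
The pairing of allocation and release can be sketched as follows. This is only an illustration under stated assumptions: the function names newIntArray and sumAndRelease are hypothetical, and the details of malloc, free and pointer variables are covered later in this chapter.

```c
#include <stdlib.h>

// Hypothetical helper: returns a heap-allocated array of n ints, all zero,
// or NULL if the allocation fails. The caller must eventually call free().
int *newIntArray(int n)
{
    int *values;
    int count;

    values = malloc(n * sizeof(int));
    if (values != NULL)
    {
        for (count = 0; count < n; count++)
        {
            values[count] = 0;
        }
    }
    return values;
}

// Uses the array and then releases it, so that no memory is leaked.
// Returns -1 if the memory could not be allocated.
int sumAndRelease(int n)
{
    int *values = newIntArray(n);
    int count;
    int total = 0;

    if (values == NULL)
    {
        return -1;
    }
    for (count = 0; count < n; count++)
    {
        values[count] = count + 1;
        total = total + values[count];
    }
    free(values);       // without this call, the block would be leaked
    return total;
}
```

If the call to free were omitted, every call to sumAndRelease would leave another block of memory allocated but unreachable, which is exactly the kind of leak described above.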
In Java, memory leaks may occur but are quite rare since the Java virtual machine includes a component
called the “garbage collector” that is able to identify and free memory allocations that are no longer
being used. Thus, with Java, it is not the programmer’s responsibility to free memory (such as objects)
when the memory is no longer needed. (There are, however, some programming practices in Java that
make things easier for the garbage collector.)
In C, there are also memory management practices that reduce the chance of memory leaks. We will
examine some of them in this chapter and also in a later chapter.

5.2 Memory Allocation for Basic Data Types


In Java, when a basic data type is declared, memory is reserved for the data item and an initial value of
zero is stored in the data item.
int myInt; // Java

When a value is assigned to the variable, the value is placed in the corresponding memory location.

myInt = 3; // Java

In C, when a basic data type is declared, memory is reserved for the data item but the data item is not
initialized. (This statement is not entirely true, because global and static variables start out as zero,
but it is better to assume that memory is not initialized.)

int myInt;

Then, when a value is assigned to the variable, the value is placed in the corresponding memory location.

int myInt = 3;

This process is quite simple in both languages: each variable is given an appropriate amount of memory
and when a value is assigned to the variable, the value is placed in the memory allocated to that variable.
The only difference is that C does not assign an initial value to the memory location.
In C, you can determine the amount of memory allocated for a specific data type using the sizeof
operator. For example, on most current computers, sizeof(int) yields the value 4. However, sizeof(int)
may be different on a computer with a different architecture so do not assume that the size of an integer
value is always a specific value.
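
As a small sketch of writing size calculations in a machine-independent way, the functions below never mention the number 4. The names bytesForInts and elementsInTable are hypothetical and are used only for this illustration.

```c
#include <stddef.h>

// Hypothetical helper: number of bytes needed to hold count int values
// on whatever machine the program is compiled for.
size_t bytesForInts(size_t count)
{
    return count * sizeof(int);
}

// sizeof applied to an array gives the size of the whole array, so dividing
// by the size of one element gives the number of elements.
int elementsInTable(void)
{
    int table[25];

    return (int) (sizeof(table) / sizeof(table[0]));
}
```

Here elementsInTable() returns 25 regardless of how many bytes an int occupies, while the value returned by bytesForInts depends on the machine.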

5.3 Java References (Pointers)


Recall that in Java, objects are not stored in a variable—instead, the variable contains a reference (or
pointer) to the actual object.

Flight myFlight; // Java

In the example above, myFlight is a variable. When the main method is executed, myFlight is allocated
memory and the memory is initialized to null.

[Diagram: myFlight contains NULL.]

When a new Flight object is created, Java allocates sufficient memory for the object and then creates the
object in that memory location.

new Flight(100, "Winnipeg", "Toronto") // Java

[Diagram: myFlight, and a separate Flight object containing 100, Winnipeg, Toronto.]

If the object is assigned to the variable myFlight, Java stores a reference/pointer to the object in the
variable myFlight.

myFlight = new Flight(100, "Winnipeg", "Toronto"); // Java

Whenever myFlight is accessed, Java automatically follows (dereferences) the pointer to the object in
such a way that the programmer is never aware of the fact that the variable does not actually contain
the object itself.

5.4 C Pointers
In the C language, pointers are an explicit part of the language (instead of being implicit as they are in
Java). In C, any data item may be pointed to via a pointer variable. The following statement declares
the variable intPtr to be a pointer that points to an int memory location. The asterisk in the declaration
indicates that the variable is a pointer variable. The declaration of the variable does not cause a value to
be assigned to the pointer variable.

int *intPtr;

If the int variable myInt is subsequently declared, memory is allocated for myInt.

int myInt;

[Diagram: intPtr and myInt, each holding an unknown value (?).]

Then, if the pointer variable intPtr is assigned the address of the variable myInt, intPtr points to the
same memory location as myInt. The & operator is referred to as the “address of” operator.

intPtr = &myInt;

Remember that C does not assign an initial value to variables when they are declared so at this point,
myInt does not contain a useful value.

Finally, if myInt is assigned the value 3, the memory location allocated for myInt is given the value 3.

myInt = 3;

The pointer intPtr continues to point to the same memory location and so now points to the value 3.
Note that intPtr does not contain the value 3, intPtr contains a pointer to the memory location allocated
for myInt, and myInt contains the value 3. So intPtr points to a memory location that contains the value
3.
To access the value that intPtr points to, we use the asterisk * operator to follow (or dereference) the
pointer variable to the memory location that it points to. The following statement would print the value
3.

printf("%d\n", *intPtr);

3

To repeat, each variable is assigned a location in memory. In the following instructions, the variable
intPtr is declared and assigned a memory location (at address 0x003D24B0) and the variable myInt is
declared and assigned a memory location (at address 0x003D24F8).

int myInt;
int *intPtr;

[Diagram: intPtr at address 003D24B0 and myInt at address 003D24F8.]

intPtr is then assigned the address of the variable myInt.


intPtr = &myInt;

[Diagram: intPtr now contains 003D24F8, the address of myInt.]

The memory location that intPtr points at is then assigned the value 3.

*intPtr = 3;

[Diagram: intPtr contains 003D24F8; myInt, at address 003D24F8, now contains 3.]

Once the pointer variable points to a valid memory location, we can assign a value to that memory
location either directly or indirectly. The following two statements each have exactly the same effect—
each statement assigns the value 5 to the memory location that used to contain the value 3.

myInt = 5;       // assign a value directly to a memory location

*intPtr = 5;     // assign a value indirectly to a memory location
                 // using a pointer variable

So a variable that is declared with an * is a pointer variable. If you recall, the FILE variables used in
Chapter 3 included an * in the declaration, so the FILE variable is actually a pointer variable to the FILE
description.
We now have two new operators that permit us to deal with pointers: the & operator returns the address
of the memory location assigned to a variable and the * operator dereferences a pointer variable to the
memory location that the variable points to.
The following statements illustrate the use of pointer variables:

int intValue1, intValue2;   // declare two int variables
int *intPtr1, *intPtr2;     // declare two (int) pointer variables

intValue1 = intValue2;      // normal assignment; both variables contain the same
                            // value but each variable has its own memory location

intPtr1 = &intValue1;       // assign the address of intValue1 to intPtr1
intPtr2 = &intValue2;       // assign the address of intValue2 to intPtr2

*intPtr1 = *intPtr2;        // assign the value pointed to by intPtr2 to the memory
                            // location pointed to by intPtr1

*intPtr1 = intValue2;       // assign the value in intValue2 to the memory location
                            // pointed to by intPtr1

intPtr1 = intPtr2;          // assign the memory address in intPtr2 to intPtr1 so
                            // that both pointers point to the same memory location
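
The difference between copying a value through a pointer and copying the pointer itself can be sketched with two small functions. The names copyValue and sameLocation are hypothetical and were chosen only for this illustration.

```c
// Hypothetical helper: copies the value that src points to into the
// location that dst points to. The pointers themselves are unchanged.
void copyValue(int *dst, int *src)
{
    *dst = *src;
}

// Returns 1 if both pointers hold the same address, 0 otherwise.
int sameLocation(int *ptr1, int *ptr2)
{
    return ptr1 == ptr2;
}
```

After copyValue(p, q), the two locations contain equal values but sameLocation(p, q) is still 0. After the assignment p = q, sameLocation(p, q) is 1, and a later assignment through *p changes the value that q points to as well.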

The following segment of code will probably compile and execute, but since we have not pointed the
pointer variable at anything yet, we don’t know where the pointer is pointing. Executing the assignment
statement is likely either to crash the program or to cause some innocent area of memory to be stomped
on by storing the value 5 in it. (If the area of memory that was modified contains some system
information, it is likely that the system will start doing weird things at some, possibly later, point.)
This is one of the difficulties when programming in C: you are allowed to do stupid things; the Java
language and run-time system would not permit such an assignment.

int *intPtr;

*intPtr = 5;     // Wrong!

Declaring a pointer variable allocates space for the variable itself but does not give the pointer variable
a value so the pointer variable does not point at a meaningful memory location.
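
One common defensive habit, offered here as a suggestion rather than something taken from the discussion above, is to give a pointer the special value NULL until it points at something real, and to test for NULL before dereferencing. The function name storeIfValid is hypothetical.

```c
#include <stddef.h>

// Hypothetical helper: stores value through ptr only if ptr is not NULL.
// Returns 1 if the value was stored and 0 otherwise.
int storeIfValid(int *ptr, int value)
{
    if (ptr == NULL)
    {
        return 0;
    }
    *ptr = value;
    return 1;
}
```

This check only catches pointers that were deliberately set to NULL; a pointer that was never given any value at all still cannot be detected, which is why initializing pointer variables when they are declared is a good practice.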
The following program illustrates some pointer manipulations.

#include <stdio.h>

int main(int numParms, char *parms[])
{
    int myInt;
    int *intPtr;

    intPtr = &myInt;
    myInt = 3;

    printf("myInt contains %d \n\n", myInt);
    printf("intPtr points to a memory location that contains the value %d \n\n", *intPtr);
    // %p expects a void pointer, so the addresses are cast to (void *)
    printf("The memory location assigned to myInt is %p \n\n", (void *) &myInt);
    printf("intPtr points to memory location %p \n\n", (void *) intPtr);

    return 0;
}

myInt contains 3

intPtr points to a memory location that contains the value 3

The memory location assigned to myInt is 0022FF8C

intPtr points to memory location 0022FF8C

There are two points to mention about this program. First, note that the value of a pointer is printed
using the %p format code (strictly speaking, %p expects a pointer of type void *). Secondly, the exact
form in which an address is printed depends on the system; base 16 (hexadecimal) is used in this case,
as it is on most systems. So the memory location assigned to the variable myInt is 0022FF8C (in base 16)
on this particular computer. If the program were run again on the same computer, myInt might be
assigned a different memory location so you must never assume that a specific memory location will be
used for a variable.

5.5 Pointer Types


Each pointer variable has a type, the type that it was declared with. For example, the declaration

int *intPtr;

declares intPtr to be a pointer to an int. A pointer variable may be declared to be of any valid C data
type. For example, the following statement declares charPtr to be a pointer variable that points to a
memory location that contains one char value.

char *charPtr;

The importance of declaring the type of a pointer will become more obvious when we examine pointer
arithmetic later in this chapter.

5.6 Casting a Pointer Variable


As was mentioned in the previous section, each pointer variable has a type. It is valid to cast the value
of a pointer variable to a pointer variable of a different type. The following program segment illustrates
this technique. We will cast pointer variables later in this chapter.

int myInt;
int *intPtr;
char *charPtr;

intPtr = &myInt;
charPtr = (char*) intPtr;
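
One use of such a cast is to look at the individual bytes that make up a larger value. The sketch below uses an unsigned char pointer so that the byte is read as a number between 0 and 255; the function name firstByte is hypothetical.

```c
// Hypothetical helper: returns the byte stored at the lowest address of an int
// by viewing the int through an unsigned char pointer.
unsigned char firstByte(int *intPtr)
{
    unsigned char *bytePtr = (unsigned char *) intPtr;

    return *bytePtr;
}
```

Which part of the int is stored at the lowest address depends on the machine, so the result for most values is not portable. On a machine that uses 8-bit bytes and two's complement integers (virtually every current machine), firstByte returns 0 for the value 0 and 255 for the value -1.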

5.7 Passing Values to Functions by Reference


In this section we examine how pointer variables can be passed to functions and what the impact of this
technique is.

First, recall that the basic data types (and structures) are passed to a function by value (call by value), so
any modifications that are made to a variable in a function are not reflected back in the calling function.
For example, in the following program we attempt to modify the value of the parameter in the function.
The modification does take effect during the lifetime of the function but once control is returned to the
calling program, myInt still has its original value.

#include <stdio.h>

void modifyValue(int);

int main(int numParms, char *parms[])
{
    int myInt;
    myInt = 3;

    printf("myInt contains %d\n\n", myInt);
    modifyValue(myInt);
    printf("myInt contains %d\n\n", myInt);

    return 0;
}

void modifyValue(int intValue)
{
    intValue++;
    printf("intValue contains %d\n\n", intValue);
}

myInt contains 3

intValue contains 4

myInt contains 3

However, if we modify the program so that a pointer to the variable is passed to the function, the
modification to the variable is reflected in the calling function (this is essentially call by reference).

1 #include <stdio.h>
2
3 void modifyValue(int*);
4
5 int main(int numParms, char *parms[])
6 {
7 int myInt;
8 myInt = 3;
9
10 printf("myInt contains %d\n\n", myInt);
11 modifyValue(&myInt);
12 printf("myInt contains %d\n\n", myInt);
13
14 return 0;
15 }
16
17 void modifyValue(int *intValue)
18 {
19 *intValue = *intValue + 1;
20 printf("intValue contains %d\n\n", *intValue);
21 }

1 myInt contains 3
2
3 intValue contains 4
4
5 myInt contains 4

If you examine the program, you will notice that instead of passing the value of myInt, we now pass the address of myInt (&myInt). The function then dereferences the pointer (*intValue) to reach the variable in the calling function.

So we now have a mechanism whereby parameters may be passed by reference instead of by value.

Although this program works correctly, it is generally a good programming practice to return modified
values via the return statement instead of using call by reference.
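
The two approaches can be compared in the following sketch. The function names incrementValue and swapValues are hypothetical. A single result is easily returned with a return statement, whereas a function that must change two of the caller's variables is a case where pointers are normally needed.

```c
/* Returns the new value instead of changing the caller's variable. */
int incrementValue(int intValue)
{
    return intValue + 1;
}

/* Exchanges the contents of two variables in the calling function.
   Two values change, so a single return value is not enough here. */
void swapValues(int *first, int *second)
{
    int temp;

    temp = *first;
    *first = *second;
    *second = temp;
}
```

With these definitions, the statement myInt = incrementValue(myInt); makes the modification visible at the point of the call, and swapValues(&a, &b); exchanges the values of a and b.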

5.8 Arrays and Pointers


Recall from Chapter 1 that arrays are passed by reference, not by value. (Again, saying that arrays are
passed by reference is somewhat debatable; but the fact that a pointer to the array is passed to a function
is not debatable.)

If a function’s prototype is

void myFunction(int, int[]);

and if myArray is declared as:


1 int myArray[100];

then the following two calls to myFunction are equivalent.

myFunction(100, myArray);

myFunction(100, (int*) &myArray[0]);

Note that &myArray[0] already has the type int*, so the cast in the second call is not required; it is included only to make the type explicit.

So when the name of an array is passed to a function, what the function receives is the address of the first element in the array. Recall that in Java, the reference to an array is stored in a separate array (reference) variable. Unlike Java, C does not store the address of the first element in a separate variable; the array name refers to the storage itself and the compiler converts the name to a pointer to the first element when the array is passed. Passing the name of the array is therefore the same as passing a pointer to the first element in the array.

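[Figure: in Java, array1 is a separate variable that points to the array; in C, array1 names the storage of the array directly.]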

The following program illustrates how an array may be passed by name to a function or it may be passed
by address to a function. Both techniques are equivalent.

1 #include <stdio.h>
2
3 void sumArray(int, int[]);
4
5 int main(int numParms, char *parms[])
6 {
7 int array1[] = {1, 2, 3, 4, 5};
8
9 sumArray(5, array1);
10 sumArray(5, (int*) &array1[0]);
11
12 return 0;
13 }
14
15 void sumArray(int numEntries, int entries[])
16 {
17 int count;
18 int result;
19
20 result = 0;
21 for (count=0; count<numEntries; count++)
22 {
23 result += entries[count];
24 }
25
26 printf("The sum is %d\n", result);
27 }

1 The sum is 15
2 The sum is 15

5.9 Pointer Arithmetic

The program in the previous section illustrated how either the name of an array or a pointer to the first element in an array may be passed to a function that processes the array. The array can then be manipulated using conventional array manipulation techniques (array[subscript]).

The array can also be manipulated using what is referred to as “pointer arithmetic”. When pointer
arithmetic is used, we replace the standard array[subscript] notation with a pointer variable and an
offset. For example,

&myArray[0] is equivalent to myArray+0
&myArray[1] is equivalent to myArray+1
&myArray[2] is equivalent to myArray+2
...

myArray[0] is equivalent to *(myArray+0)
myArray[1] is equivalent to *(myArray+1)
myArray[2] is equivalent to *(myArray+2)
...
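
The equivalences listed above can be checked with a short function. This is only a sketch; the name notationsAgree is hypothetical.

```c
/* Returns 1 if, for every index, the subscript form and the pointer form
   refer to the same element of the array; returns 0 otherwise. */
int notationsAgree(int myArray[], int numEntries)
{
    int i;

    for (i = 0; i < numEntries; i++)
    {
        if (&myArray[i] != myArray + i)      /* same address? */
        {
            return 0;
        }
        if (myArray[i] != *(myArray + i))    /* same contents? */
        {
            return 0;
        }
    }
    return 1;
}
```

For any int array, the function is expected to return 1, since the two notations are defined to mean the same thing.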

Adding 1 to a pointer variable looks somewhat strange when each int (in this example) element of the
array requires more than one byte of memory. (int values are typically stored in 4 bytes of memory but
this can vary from one computer to another.) The diagram below shows how the following array would
be represented internally.

1 int array1[] = {1, 2, 3};



00 00 00 01 00 00 00 02 00 00 00 03

first element second element third element

Each element of the array is stored in 4 bytes; each byte is represented by 2 hexadecimal digits. (The diagram shows the most significant byte of each element first; many machines store the bytes of an int in the opposite order.)

So, if each array element requires 4 bytes of memory, how does adding 1 to a pointer variable cause the pointer variable to point to the next element and not to the next byte in the current element? The answer is that when the value of a pointer variable is modified by adding 1 to it or subtracting 1 from it, C adjusts the address by the size of the type that the pointer points to. So, for an int* pointer variable, C adds/subtracts the value of sizeof(int) to/from the address stored in the pointer variable. As a result of the way that pointer arithmetic works, the following statement has the effect of modifying the pointer variable to point to the next element in an array.

1 ptr++
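
The scaling can be observed by converting both addresses to char pointers, which always count in single bytes. The following sketch uses a hypothetical function name, bytesPerIntStep.

```c
#include <stddef.h>

/* Returns the number of bytes between ptr and ptr+1 for an int pointer. */
size_t bytesPerIntStep(int *ptr)
{
    char *before;
    char *after;

    before = (char*) ptr;
    after = (char*) (ptr + 1);    /* one int further along */
    return (size_t) (after - before);
}
```

On any machine, the value returned should be the same as sizeof(int), whatever that size happens to be.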

You must be careful when using the increment (or decrement) operator with pointer variables and dereferencing. For example, the following statement is fairly obvious: add 1 to the value stored in the memory location pointed to by the pointer variable.

1 (*ptr)++

Similarly, the following statement increments the pointer variable by 1 element and then dereferences
the new contents of the pointer variable.

1 *(++ptr)

However, other variations such as the ones below are not as obvious and should be used with caution.
(Try figuring out what they do.)

1 *(ptr++)
2 *(++ptr)
3 *ptr++
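
One way to check your answers is to record what each form produces. The sketch below (the function name incrementVariants is hypothetical) stores the value of each expression along with its side effect.

```c
/* Fills results[] with the value produced by each expression and with the
   effect that the expression had on the pointer or on the data. */
void incrementVariants(int results[6])
{
    int data[] = {10, 20, 30};
    int *ptr;

    ptr = data;
    results[0] = *ptr++;              /* same as *(ptr++): reads data[0], then ptr moves on */
    results[1] = (int) (ptr - data);  /* ptr now points at data[1] */

    ptr = data;
    results[2] = *(++ptr);            /* ptr moves to data[1] first, then data[1] is read */
    results[3] = (int) (ptr - data);  /* ptr points at data[1] */

    ptr = data;
    results[4] = (*ptr)++;            /* reads data[0], then adds 1 to data[0]; ptr does not move */
    results[5] = data[0];             /* data[0] has been changed */
}
```

After the call, results contains 10, 1, 20, 1, 10 and 11. Note that *ptr++ and *(ptr++) behave identically because the increment operator binds more tightly than the dereference operator.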

Two pointer variables may be compared (pointer comparison) using the arithmetic comparison operators (<, >, ==, etc.) if both pointer variables point into the same data item (or array) in memory. For example, if the following 2 arrays have been declared, two pointer variables that both point into q or that both point into r may be compared. However, comparing one pointer variable that points into q with another pointer variable that points into r does not make any sense since we do not know what the relationship between the memory allocated for q and the memory allocated for r is.

q 2 3 4 r 5 6 7 8

In the following program segments, the pointer variable comparisons all make sense.

1 int q[3], r[4];


2 int *qPtr, *rPtr;
3
4 qPtr = &q[0];
5 if (qPtr < q+1) ...
6
7 rPtr = r+1;
8 if (rPtr == &r[1]) ...

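However, the following comparison statement does not make sense because the programmer has no control over where the arrays q and r are placed in memory. (In C, the result of comparing pointers into two different arrays with <, <=, > or >= is undefined. A test with == is permitted, but it only reports whether the two pointers hold the same address.)

qPtr = &q[0];
rPtr = &r[0];
if (qPtr < rPtr) ... // Wrong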

The following program rewrites the program from the previous section so that pointer arithmetic is
used instead of the array manipulations used in the sumArray function. Note that the for statement uses
pointer variable comparisons to determine whether or not the end of the array has been reached.

1 #include <stdio.h>
2
3 void sumArray(int, int*);
4
5 int main(int numParms, char *parms[])
6 {
7 int array1[] = {1, 2, 3, 4, 5};
8
9 sumArray(5, array1);
10 sumArray(5, (int*) &array1[0]);
11
12 return 0;
13 }
14
15 void sumArray(int numEntries, int *entries)
16 {
17 int *ptr;
18 int result;
19
20 result = 0;
21 for (ptr=entries; ptr<entries+numEntries; ptr++)
22 {
23 result += *ptr;
24 }
25
26 printf("The sum is %d\n", result);
27 }
28

1 The sum is 15
2 The sum is 15

As a general rule, if you have a choice between using the array notation and using the pointer notation to access an element of an array, use the array notation; it is usually easier to read.

5.10 Casting and Pointer Arithmetic


We saw earlier how a pointer variable could be cast to a different type. The following example illus-
trates how pointer casting may produce incorrect or unexpected results. The first two loops are defined
correctly but the third loop uses an int pointer to traverse a char array. As a result, only every fourth
character in the array is printed since the int pointer is incremented by the sizeof an int (4) instead of
the sizeof a char (1).
1 #include <stdio.h>
2 #include <string.h>
3
4 int main(int numParms, char *parms[])
5 {
6 int intArray[] = {1, 2, 3};
7 char charArray[] = "This is a string";
8
9 int *intPtr;
10 char *charPtr;
11
12 for (intPtr=&intArray[0]; intPtr<&intArray[0]+3; intPtr++) // Correct
13 {
14 printf("%d ", *intPtr);
15 }
16 printf("\n");
17
18 for (charPtr=&charArray[0]; *charPtr!='\0'; charPtr++) // Correct
19 {
20 printf("%c", *charPtr);
21 }
22 printf("\n");
23
24 intPtr = (int*) &charArray[0];
25 for (; intPtr<(int*)(&charArray[0]+strlen(charArray)); intPtr++) // Wrong
26 {
27 printf("%c", *intPtr);
28 }
29 printf("\n");
30
31 return 0;
32 }

1 1 2 3
2 This is a string
3 T ar

5.11 Processing Multi-Dimensional Arrays

It was mentioned in Chapter 1 that when multi-dimensional arrays are passed as arguments, the func-
tion must explicitly declare the value of the second dimension (and all subsequent dimensions as well).
The program below shows how the two-dimensional array was passed to the function printArray in
Chapter 1 and then illustrates how the array could be passed as a pointer variable and manipulated
appropriately in the function printArray2. Both methods generate identical output.

1 #include <stdio.h>
2
3 const int NUM_ROWS = 3;
4 const int NUM_COLS = 2;
5
6 void printArray(int, int, int[][NUM_COLS]);
7 void printArray2(int, int, int*);
8
9 int main(int numParms, char *parms[])
10 {
11 int myArray[NUM_ROWS][NUM_COLS];
12 int row, col;
13 int* ptr;
14
15 for (row=0; row<NUM_ROWS; row++)
16 {
17 for (col=0; col<NUM_COLS; col++)
18 {
19 myArray[row][col] = row*10 + col;
20 }
21 }
22
23 printArray(NUM_ROWS, NUM_COLS, myArray);
24
25 printf("\n");
26 ptr = (int*) &myArray[0];
27 printArray2(NUM_ROWS, NUM_COLS, ptr);
28
29 return 0;
30 }
31
32 void printArray(int rows, int cols, int myArray[][NUM_COLS])
33 {
34 int row;
35 int col;
36
37 for (row=0; row<rows; row++)
38 {
39 for (col=0; col<cols; col++)
40 {
41 printf("%2d ", myArray[row][col]);
42 }
43 printf("\n");
44 }
45 }
46
47 void printArray2(int rows, int cols, int* ptr)
48 {
49 int row;
50 int col;
51 int* current;
52
53 current = ptr;
54 for (row=0; row<rows; row++)
55 {
56 for (col=0; col<cols; col++)
57 {
58 printf("%2d ", *current);
59 current++;
60 }
61 printf("\n");
62 }
63 }
64

1 0 1
2 10 11
3 20 21
4
5 0 1
6 10 11
7 20 21
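
The function printArray2 works because C stores a two-dimensional array row by row, so the element at [row][col] is located row*cols + col elements after the first one. The sketch below makes that calculation explicit; the function name elementAt is hypothetical.

```c
/* Returns the element at [row][col] of a two-dimensional int array that is
   being viewed through a pointer to its first int. cols is the number of
   columns in each row. */
int elementAt(int *ptr, int cols, int row, int col)
{
    return *(ptr + row*cols + col);
}
```

For the array used in the program above, elementAt(ptr, NUM_COLS, 2, 1) would return 21.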

5.12 Dynamic Memory Allocation


In the examples used so far in this chapter, the pointers have always pointed at variables that were
declared in the corresponding functions. For example, in the program segment below, when array1 is
declared in the main function, the memory required to store the 5 values is allocated when the main
function begins execution.

1 int main(int numParms, char *parms[])


2 {
3 int array1[] = {1, 2, 3, 4, 5};
4 ...
5 }

Frequently, it is not known in advance how big an array must be or how many elements will be used in a data structure (such as a linked list). C permits the dynamic allocation of memory using the malloc function (which is declared in the header file stdlib.h). The function malloc obtains the requested number of bytes of memory and then returns a pointer to the memory that is allocated. If there is not sufficient memory available, malloc returns a value of NULL, so the returned pointer should be checked before it is used.

1 int *ptr;
2 ptr = malloc(bytes);

The number of bytes requested must be an integer value. Since the size of certain data types may
vary from one machine to another (for example, an int may be stored in either 2 bytes or 4 bytes), you
should never specify the number of bytes required as an absolute value. Instead, specify the number
of elements of a specific type. So, to allocate the space for an int array of size 5, you would use the
following statement:

1 int *ptr;
2 ptr = malloc(5*sizeof(int));

By using the sizeof operator instead of an absolute value, your program is portable to other machines instead of being limited to the type of machine on which it was developed. Once you have finished using some dynamically allocated memory, you should release that memory so that it can be reused by another part of the program. The free function is used to release memory that was dynamically allocated.

1 int *ptr;
2 ptr = malloc(5*sizeof(int));
3 ...
4 free(ptr);
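
Because malloc can return NULL, it is worth checking the result before the memory is used. The following is a minimal sketch of that check; the function name newIntArray is hypothetical and the function assumes that numEntries is greater than zero.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates an int array of numEntries elements and
   sets every element to zero. Returns NULL if the allocation fails.
   Assumes numEntries is greater than zero. */
int *newIntArray(int numEntries)
{
    int *ptr;
    int count;

    ptr = malloc(numEntries * sizeof(int));
    if (ptr == NULL)
    {
        return NULL;    /* the caller must check for this */
    }
    for (count = 0; count < numEntries; count++)
    {
        ptr[count] = 0;
    }
    return ptr;
}
```

The caller remains responsible for passing the returned pointer to free once the array is no longer needed.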

Memory that is allocated using malloc remains allocated until it is explicitly freed. (This is unlike mem-
ory that is allocated for local variables on the stack—this memory is automatically freed when the func-
tion that declared the local variables terminates.) Thus, a pointer variable that points to dynamically
allocated memory may be passed to other functions and may also be returned to the calling function.
The following program illustrates the use of the malloc statement to allocate an int array of 5 elements,
pointer arithmetic to assign values to the array elements, and finally the free statement to return the
memory to the operating system.

p 1 2 3 4 5

1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(int numParms, char *parms[])
5 {
6 int count;
7 int *p;
8 int *s;
9
10 p = (int*) malloc(5*sizeof(int));
11
12 for (count=0, s=p; s<p+5; s++, count++)
13 {
14 *s = count+1;
15 }
16
17 ...
18
19 free(p);
20 ...
21
22 return 0;
23 }

5.13 NULL Pointers


When the memory pointed to by a pointer variable has been freed, it is a good programming practice to assign the NULL value to that pointer variable. NULL is used in C in a manner that is similar to the way that Java uses the value null for an object reference that does not refer to an object.

1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(int numParms, char *parms[])
5 {
6 int count;
7 int *p;
8 int *s;
9
10 p = (int*) malloc(5*sizeof(int));
11
12 for (count=0, s=p; s<p+5; s++, count++)
13 {
14 *s = count+1;
15 }
16
17 . . .
18
19 free(p);
20 p = NULL;
21 ...
22
23 return 0;
24 }

The reason for assigning the value NULL to a pointer variable is that if the programmer mistakenly tries
to refer to the memory previously pointed to by the pointer variable after the memory has been freed,
a NULL value will normally cause an error immediately whereas leaving the pointer variable pointing at
the memory that had been dynamically allocated will not necessarily cause an error or it may cause an
error at a later time.

5.14 Memory Management


When memory is allocated using the malloc command, the allocated memory is contiguous. For exam-
ple, if memory for 5 int values is allocated, the values are stored one after the other in one chunk of
memory.

1 2 3 4 5

However, if multiple calls are made to malloc, you must not assume that there is any relationship be-
tween the chunks of memory that are allocated. For example, the following program allocates a block
of two ints, a block of 3 ints, and a block of 4 ints.

p 0 1 q 2 3 4 r 5 6 7 8

The program then uses pointer arithmetic to assign the values shown above to the 9 memory locations.
Note that 3 separate loops are required to perform this initialization.

1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(int numParms, char *parms[])
5 {
6 int count;
7 int *p;
8 int *q;
9 int *r;
10 int *s;
11
12 p = (int*) malloc(2*sizeof(int));
13 q = (int*) malloc(3*sizeof(int));
14 r = (int*) malloc(4*sizeof(int));
15
16 for (count=0, s=p; s<p+2; s++, count++)
17 {
18 *s = count;
19 }
20
21 for (s=q; s<q+3; s++, count++)
22 {
23 *s = count;
24 }
25
26 for (s=r; s<r+4; s++, count++)


27 {
28 *s = count;
29 }
30
31 free(p);
32 free(q);
33 free(r);
34 p = NULL;
35 q = NULL;
36 r = NULL;
37 ...
38
39 return 0;
40 }

The memory locations within each chunk are contiguous but there is no relationship among the 3
chunks. In the diagram shown above, the 3 chunks appear to have been allocated one after the other
(with some space in between each pair of chunks), but that organization is not necessarily correct. It is
just as likely that the memory will be allocated as shown below.

r 5 6 7 8 q 2 3 4 p 0 1

You should not assume that the 3 memory allocations are contiguous; most implementations of malloc store some bookkeeping information alongside each chunk of allocated memory, so there is normally a gap between chunks.

If you require that some memory locations be contiguous, you must allocate the memory in one malloc call. Otherwise, if separate malloc calls are used, the requested chunks of memory will be allocated but you cannot predict in advance where those memory locations will be.

5.15 The Stack and the Heap


As was mentioned earlier, when a function is called, the memory required for the local variables used
in the function is automatically allocated on the stack by the C run-time system. The programmer can
manipulate those variables within the function in any valid manner. When the function terminates, the
memory allocated for those variables is automatically freed by the C run-time system and the program-
mer must not refer to those memory locations in any of the higher-level (calling) functions. Alternatively,
any memory allocated explicitly by the programmer using the malloc statement is allocated on the heap.
This memory remains available until the programmer explicitly frees the memory.
The following program illustrates memory allocation on the stack and on the heap.

1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int sum1(int, int[]);
5 int* sum2(int, int[]);
6 int* sum3(int, int[]);
7
8 int main(int numParms, char *parms[])
9 {
10 int myInts[] = {1, 2, 3, 4, 5};
11 int result1;
12 int* result2;
13 int* result3;
14
15 result1 = sum1(5, myInts);
16 result2 = sum2(5, myInts);
17 result3 = sum3(5, myInts);
18 printf("Sum1: %d\n", result1);
19 printf("Sum2: %d\n", *result2);
20 printf("Sum3: %d\n", *result3);
21
22 free(result3);
23
24 return 0;
25 }
26
27 int sum1(int size, int ints[])
28 {
29 int count;
30 int result;
31
32 result = 0;
33 for (count=0; count<size; count++)
34 {
35 result += ints[count];
36 }
37 return result;
38 }
39
40
41 int* sum2(int size, int ints[])
42 {
43 int count;
44 int result;
45 int* ptr;
46
47 result = 0;
48 ptr = &result;
49 for (count=0; count<size; count++)
50 {
51 *ptr += ints[count];
52 }
53 return ptr; // Wrong!!
54 }
55
56 int* sum3(int size, int ints[])
57 {
58 int count;
59 int* result;
60
61 result = (int*) malloc(sizeof(int));
62 *result = 0;
63 for (count=0; count<size; count++)
64 {
65 *result += ints[count];
66 }
67 return result;
68 }
69

1 Sum1: 15
2 Sum2: 2009291924
3 Sum3: 15

As can be seen from the output, the first function correctly returns its result as an int variable. The
second function returns a pointer to the variable result that is local to the sum2 function. The memory
allocated for this local variable is released at the end of the processing of sum2 but the pointer still points
to the memory location that had been used by the local variable. This is a typical mistake when manip-
ulating pointer variables. In the third function, memory is allocated on the heap for the result variable.
A pointer to this memory can be returned to the calling program but the calling program must free the
memory when it has completed its processing (or a memory leak will have been created).

5.16 void Pointers


The pointer returned by the malloc function is of type void* (a void pointer). A void pointer is a pointer that does not have a specific type (such as int, char, etc.) associated with it. A void pointer may be stored in a void pointer variable, as is shown below.

1 void *ptr;
2
3 ptr = malloc(10*sizeof(int));

Although you may store the pointer in a void pointer variable, you can not use the variable directly since void pointers can not be dereferenced without a corresponding cast. The following statements illustrate how the pointer must be cast before it can be dereferenced.

1 void *ptr;
2
3 ptr = malloc(10*sizeof(int));
4
5 *((int *) ptr) = 5;

C also does not maintain information about the type (if there is a type specified) of each malloc call, so it is not possible to determine the type of information that the void pointer points to.

As a result, instead of storing the result of a malloc call in a void pointer, it is normally preferable to store the result in a pointer of the type for which the memory will be used. The following statements illustrate the intent to use the memory that is allocated to store int values. Although an explicit cast to int* is performed, this cast is not required in C because a void pointer is converted automatically when it is assigned to a pointer of another type.

int *ptr;

ptr = (int*) malloc(10*sizeof(int));

*ptr = 5;

A void pointer can be thought of as being similar to Java’s Object class—any pointer may be stored in a
void pointer; however, unlike Java, the void pointer can not be used without a corresponding downcast
to a specific data type.

5.17 Resizing a Memory Allocation


If it is necessary to resize an array (or any other dynamically allocated chunk of memory) in C, there are several functions that assist with this process. The first technique involves allocating a larger chunk of memory using malloc, then using the C function memcpy (declared in string.h) to copy the contents of the original chunk of memory into the newly allocated chunk of memory, and finally using free to release the originally allocated chunk of memory. Be aware that the new chunk of memory will not normally have the same memory address as the original chunk so you must ensure that any pointers are updated appropriately. However, since resizing a chunk of memory is a common practice in C, another function, realloc, is available to perform all tasks (malloc, memcpy, and free) together. If realloc can not obtain the requested memory, it returns NULL and leaves the original chunk unchanged.
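
The following sketch shows one cautious way of calling realloc. The function name resizeIntArray is hypothetical, and newSize is assumed to be greater than zero. The result of realloc is stored in a separate variable first so that the original pointer is not lost if the request fails.

```c
#include <stdlib.h>

/* Hypothetical helper: requests a block that can hold newSize ints and
   that preserves the existing contents of array.
   Returns the (possibly moved) block, or NULL if the request fails, in
   which case the original block is still valid and still allocated. */
int *resizeIntArray(int *array, int newSize)
{
    int *resized;

    resized = realloc(array, newSize * sizeof(int));
    if (resized == NULL)
    {
        return NULL;    /* array has not been freed or changed */
    }
    return resized;
}
```

A caller would typically write temp = resizeIntArray(p, 10); and assign temp to p only if temp is not NULL.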

5.18 Pointers to Pointers


A pointer contains the address of a data item. It is perfectly valid in C for a pointer to point to another pointer. A variable that points to a pointer variable is declared as type**. The following program illustrates the use of a pointer to another pointer.

1 #include <stdio.h>
2
3 int main(int numParms, char *parms[])
4 {
5 int value = 25;
6 int* ptr1;
7 int** ptr2;
8
9 ptr1 = &value;
10 ptr2 = &ptr1;
11 printf("%d %d %d \n", value, *ptr1, **ptr2);
12
13 return 0;
14 }

1 25 25 25
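
A common use of a pointer to a pointer is to let a function change a pointer variable that belongs to the calling function. The sketch below combines this idea with the earlier advice about assigning NULL after freeing memory; the function name freeAndClear is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: frees the memory that *ptr points to and then sets
   the caller's pointer variable to NULL. */
void freeAndClear(int **ptr)
{
    free(*ptr);     /* release the memory that the caller's pointer refers to */
    *ptr = NULL;    /* modify the caller's pointer variable itself */
}
```

If p is an int* that points to dynamically allocated memory, the call freeAndClear(&p); releases the memory and leaves p containing NULL.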

5.19 Processing Run-Time (Command-Line) Parameters


The beginning of each Java main method contains the parameter:

1 public static void main(String[] parms) ... // Java

The beginning of each C main function contains the parameters

1 int main(int numParms, char *parms[])

This parameter list is frequently written as:

1 int main(int argc, char *argv[])

or
1 int main(int argc, char **argv)

The names of the variables used to define the run-time or command-line parameters are more or less irrelevant; just keep the names meaningful. What is being passed is a count of the parameters and a pointer to an array of pointer variables, each of which points to a character string that represents one of the parameters the program was called with. For example, in addition to invoking a program with only its name,
1 myprogram

a program may be provided with parameters that the program is expected to process:

1 myprogram parm1 parm2 parm3

These parameters are supplied to the main function as an array of pointers to individual arrays of chars. The following program extracts and prints the parameters but does not perform any parsing or processing of the parameters.

1 #include <stdio.h>
2
3 int main(int numParms, char *parms[])
4 {
5 int count;
6
7 for (count=0; count<numParms; count++)
8 {
9 printf("%s\n", parms[count]);
10 }
11 printf("All done.\n");
12
13 return 0;
14 }

myprogram
parm1
parm2
parm3
All done.

Note that the first parameter is the name of the program that was executed.
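
As a sketch of how the parameters might be processed rather than just printed, the function below treats each parameter after the program name as an integer and adds the values together. The function name sumParms is hypothetical; strtol is the standard library function that converts text to a long value.

```c
#include <stdlib.h>

/* Hypothetical helper: treats every parameter after the program name as a
   base-10 integer and returns the sum. A parameter that does not begin
   with a number contributes 0. */
long sumParms(int numParms, char *parms[])
{
    long total;
    int count;

    total = 0;
    for (count = 1; count < numParms; count++)    /* parms[0] is the program name */
    {
        total += strtol(parms[count], NULL, 10);
    }
    return total;
}
```

The function would normally be called from main as sumParms(numParms, parms).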

5.20 Pointers to Structures


In addition to being able to allocate memory for the basic data types, you may also use malloc to allocate
memory for a struct. For example, the following code segment illustrates the dynamic allocation of
space for a structure.

1 struct person
2 {
3 int number;
4 int age;
5 };
6
7 struct person *myPerson;
8
9 myPerson = (struct person*) malloc(sizeof(struct person));

Now the variable myPerson is a pointer variable to an instance of the structure person. Unfortunately, the
notation used earlier to refer to a variable in a structure (a.b) can not be used with a pointer variable.
However, if we dereference the pointer variable, we can then refer to the structure variable as shown
below:
1 (*myPerson).number

The expression above is a bit clumsy and so C provides an alternative syntax that is slightly nicer:

1 myPerson->number

Either of the two expressions is acceptable and both expressions may be used on either side of an as-
signment operator.
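
The two notations can be seen side by side in the following sketch, which allocates a person and assigns a value to each variable in the structure. The function name makePerson is hypothetical.

```c
#include <stdlib.h>

struct person
{
    int number;
    int age;
};

/* Hypothetical helper: allocates a person on the heap and fills it in using
   both member-access notations. Returns NULL if the allocation fails. */
struct person *makePerson(int number, int age)
{
    struct person *myPerson;

    myPerson = malloc(sizeof(struct person));
    if (myPerson == NULL)
    {
        return NULL;
    }
    (*myPerson).number = number;    /* dereference, then select the variable */
    myPerson->age = age;            /* arrow notation, same effect */
    return myPerson;
}
```

The parentheses in (*myPerson).number are required because the dot operator binds more tightly than the dereference operator.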

5.21 Encapsulation
In Chapter 3, we created the following program that began a transition towards a more object-oriented
style of programming in C. However, if you examine the program closely, you should notice that person1
is assigned memory in the main function, not in the newPerson function. The newPerson function simply
assigns the parameters to the existing structure and returns the modified structure. Do you remember
why a struct that is modified must be returned (using a return statement) by the function that modifies
it?

main.c

1 #include <stdio.h>
2 #include "module1.h"
3
4 int main(int numParms, char *parms[])
5 {
6 person person1;
7
8 person1 = newPerson(person1, 25, 30);
9
10 printPerson(person1);
11 return 0;
12 }

module1.h

1
2 typedef struct
3 {
4 int number;
5 int age;
6 } person;
7
8 person newPerson(person, int, int);
9
10 void printPerson(person);

module1.c

1 #include <stdio.h>
2 #include "module1.h"
3
4 person newPerson(person person1, int newNumber, int newAge)
5 {
6 person1.number = newNumber;
7 person1.age = newAge;
8 return person1;
9 }
10
11 void printPerson(person person1)
12 {
13 printf("Person1: %d %d \n", person1.number, person1.age);
14 }

Now, however, we have the ability to allocate memory for an object dynamically in the newPerson func-
tion.

main.c

1 #include <stdio.h>
2 #include "module1.h"
3
4 int main(int numParms, char *parms[])
5 {
6 Person person1;
7 Person person2;
8
9 person1 = newPerson(25, 30);
10 printPerson(person1);
11
12 person2 = newPerson(55, 55);
13 printPerson(person2);
14
15 deletePerson(person1);
16 deletePerson(person2);
17
18 printf("All done.\n");
19 return 0;
20 }

module1.h

1 typedef struct person *Person;


2
3 Person newPerson(int, int);
4
5 void printPerson(Person);
6
7 void deletePerson(Person myPerson);

module1.c

1 #include <stdio.h>
2 #include <stdlib.h>
3 #include "module1.h"
4 struct person
5 {
6 int number;
7 int age;
8 };
9
10 Person newPerson(int num, int newAge)
11 {
12 Person myPerson = (Person) malloc(sizeof(struct person));


13 myPerson->number = num;
14 myPerson->age = newAge;
15 return myPerson;
16 }
17
18 void printPerson(Person myPerson)
19 {
20 printf("Number: %d, age: %d\n", myPerson->number, myPerson->age);
21 }
22
23 void deletePerson(Person myPerson)
24 {
25 free(myPerson);
26 }

The statement
1 typedef struct person *Person;

in the header file is interesting because it defines a pointer to the struct instead of defining the contents
of the struct. (The contents of the struct person are defined only in the module that requires access to
them.) Thus, the statement
1 Person person1;

declares an “object” person1 that is a pointer variable to the struct that contains the instance variables for person1. (This construct is commonly referred to as an opaque pointer, because the code that uses the pointer can not see the contents of the struct.) As a result, the instance variables are hidden from the main function (encapsulated).

5.22 ArrayLists
Using the material developed in the preceding section, the following program is a slightly nicer version of the ArrayList program that was developed in Chapter 3. In this version, instead of making the structure that defines the ArrayList object a global structure, we simply define a pointer to the structure.
Note that because variables of type ArrayList now contain pointers, the pointer variable is passed to
the processing functions and the processing functions can modify the structure itself (call by reference).
Also, as a result of using a pointer variable, the calling program can no longer access the contents of the
arrayList structure since this structure is hidden in the associated module. (This is not entirely true but
it is close enough for now.)
The #define statement is used to define a symbolic constant; the value of a symbolic constant is substi-
tuted for each occurrence of its name. Note that there is no equals sign in the definition of a symbolic
constant.

main.c

1 #include <stdio.h>
2 #include "module1.h"
3
4 int main(int numParms, char *parms[])
5 {
6 ArrayList myList;
7
8 myList = newList();
9
10 addList(myList, 1);
11 addList(myList, 2);
12 addList(myList, 3);
13
14 printList(myList);
15
16 removeList(myList, 1);
17 printList(myList);
18
19 addList(myList, 5);
20 printList(myList);
21
22 removeList(myList, 0);
23 removeList(myList, 0);
24 removeList(myList, 0);
25 printList(myList);
26
27 printf("\nAll done\n");
28 return 0;
29 }

module1.h

1 typedef struct arrayList *ArrayList;


2
3 ArrayList newList(void);
4
5 void addList(ArrayList, int);
6
7 int getList(ArrayList, int);
8
9 int sizeList(ArrayList);
10
11 void removeList(ArrayList, int);
12
13 void printList(ArrayList);

module1.c

1 #include <stdio.h>
2 #include <stdlib.h>
3 #include "module1.h"
4 #define NumEntries 5
5
6 struct arrayList
7 {
8 int size;
9 int data[NumEntries];
10 };
11
12 ArrayList newList()
13 {
14 ArrayList list;
15 int count;
16
17 list = (ArrayList) malloc(sizeof(struct arrayList));
18 for (count=0; count<NumEntries; count++)
19 {
20 list -> data[count] = 0;
21 }
22 list -> size = 0;
23 return list;
24 }
25 void addList(ArrayList list, int value)
26 {
27 list -> data[list -> size] = value;
28 list -> size++;
29 }
30
31 int getList(ArrayList list, int position)
32 {
33 int entry;
34
35 entry = list -> data[position];
36 return entry;
37 }
38
39 int sizeList(ArrayList list)


40 {
41 return list -> size;
42 }
43
44 void removeList(ArrayList list, int position)
45 {
46 int count;
47
48 for (count=position; count<(list -> size-1); count++)
49 {
50 list -> data[count] = list -> data[count+1];
51 }
52 list -> size--;
53 list -> data[list -> size] = 0;
54 }
55
56 void printList(ArrayList list)
57 {
58 int count;
59
60 for (count=0; count<list -> size; count++)
61 {
62 printf("Element %d is %d\n", count, getList(list, count));
63 }
64 printf("\n");
65 }

1 Element 0 is 1
2 Element 1 is 2
3 Element 2 is 3
4
5 Element 0 is 1
6 Element 1 is 3
7
8 Element 0 is 1
9 Element 1 is 3
10 Element 2 is 5
11
12
13 All done

The program above encapsulates the data in such a way that the calling functions can not (easily) access
the arrayList structure. But the program still has the disadvantage that the number of elements that can
be stored in the ArrayList is fixed and can not be modified. The program below begins an improvement
that will eventually permit the ArrayList to be expanded in size.

1 #include <stdio.h>
2 #include <stdlib.h>
3 #include "module1.h"
4 #define NumEntries 5
5
6 struct arrayList
7 {
8 int maxSize;
9 int size;
10 int *data;
11 };
12
13 ArrayList newList()
14 {
15 ArrayList list;
16 int count;
17
18 list = (ArrayList) malloc(sizeof(struct arrayList));
19 list -> data = (int*) malloc(NumEntries*sizeof(int));
20 for (count=0; count<NumEntries; count++)
21 {
22 list -> data[count] = 0;
23 }
24 list -> size = 0;


25 list -> maxSize = NumEntries;
26 return list;
27 }
28 void addList(ArrayList list, int value)
29 {
30 list -> data[list -> size] = value;
31 list -> size++;
32 }
33
34 int getList(ArrayList list, int position)
35 {
36 int entry;
37
38 entry = list -> data[position];
39 return entry;
40 }
41
42 int sizeList(ArrayList list)
43 {
44 return list -> size;
45 }
46
47 void removeList(ArrayList list, int position)
48 {
49 int count;
50
51 for (count=position; count<(list -> size-1); count++)
52 {
53 list -> data[count] = list -> data[count+1];
54 }
55 list -> size--;
56 list -> data[list -> size] = 0;
57 }
58
59 void printList(ArrayList list)
60 {
61 int count;
62
63 printf("\nContents of list:\n");
64 for (count=0; count<list -> size; count++)
65 {
66 printf("Element %d is %d\n", count, getList(list, count));
67 }
68 printf("\n");
69 }

Now that the array used to store the elements in the arrayList is pointed to by a variable in the structure
instead of being hard-coded in the structure, the array can be resized as necessary. The
following function performs this processing.

1 void resizeList(ArrayList list, int newSize)


2 {
3 int count;
4
5 list -> data = (int*) realloc(list->data, newSize*sizeof(int));
6 list -> maxSize = newSize;
7 for (count=list->size; count<list->maxSize; count++)
8 {
9 list -> data[count] = 0;
10 }
11 }

In addition, the following code must be added to the beginning of the addList function. (The value 10
used to increment the size of the array was chosen arbitrarily.)

1 if (list->size >= list->maxSize)


2 {
3 resizeList(list, list->maxSize+10);
4 }

Although the array list module is a good start, there are some memory management problems that must
be fixed before the module could be used.
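
One of those problems is visible in resizeList: if realloc fails it returns NULL while the original block stays allocated, so assigning the result directly to list->data loses the only pointer to that block. The following is a minimal sketch of a safer approach, not the module's actual code. The names IntList, resizeListSafe and clearList, and the 1/0 return value, are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct arrayList in the module above. */
struct IntList
{
    int maxSize;   /* number of slots allocated */
    int size;      /* number of slots in use */
    int *data;
};

/* Returns 1 on success and 0 if the array could not be resized.
   On failure the list is left exactly as it was. */
int resizeListSafe(struct IntList *list, int newSize)
{
    int *newData;
    int count;

    assert(list != NULL);
    assert(newSize > 0);
    assert(newSize >= list->size);

    /* keep the old pointer until we know that realloc succeeded */
    newData = realloc(list->data, (size_t) newSize * sizeof(int));
    if (newData == NULL)
    {
        return 0;   /* list->data is still valid and still owned by the list */
    }

    for (count = list->size; count < newSize; count++)
    {
        newData[count] = 0;
    }
    list->data = newData;
    list->maxSize = newSize;
    return 1;
}

/* Releases the array and resets the bookkeeping fields. */
void clearList(struct IntList *list)
{
    assert(list != NULL);

    free(list->data);
    list->data = NULL;
    list->size = 0;
    list->maxSize = 0;
}
```

A caller would test the value returned by resizeListSafe before storing a new element, and would call clearList once the list is no longer needed.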

5.23 Linked Lists


Now that we can allocate memory dynamically, we can write a linked-list program that allocates mem-
ory as necessary.

main.c

1 #include <stdio.h>
2 #include "module1.h"
3
4 int main(int numParms, char *parms[])
5 {
6 LinkedList myList;
7
8 myList = newList();
9
10 myList = addList(myList, 3);
11 printList(myList);
12
13 myList = addList(myList, 2);
14 printList(myList);
15
16 myList = addList(myList, 1);
17 printList(myList);
18
19 myList = removeList(myList, 1);
20 printList(myList);
21
22 myList = addList(myList, 5);
23 printList(myList);
24
25 myList = removeList(myList, 1);
26 myList = removeList(myList, 1);
27 myList = removeList(myList, 0);
28 printList(myList);
29
30 printf("\nAll done\n");
31 return 0;
32 }

module1.h

1 typedef struct linkedList *LinkedList;


2
3 LinkedList newList(void);
4
5 LinkedList addList(LinkedList, int);
6
7 int getList(LinkedList, int);
8
9 int sizeList(LinkedList);
10
11 LinkedList removeList(LinkedList, int);
12
13 void printList(LinkedList);

module1.c

1 #include <stdio.h>
2 #include "module1.h"
3 #include <stdlib.h>
4 struct Node
5 {
6 int data;
7 struct Node *next;


8 };
9
10 struct linkedList
11 {
12 int size;
13 struct Node *first;
14 };
15
16 LinkedList newList()
17 {
18 LinkedList list;
19 int count;
20
21 list = (LinkedList) malloc(sizeof(struct linkedList));
22 list -> first = NULL;
23 list -> size = 0;
24 return list;
25 }
26 LinkedList addList(LinkedList list, int value)
27 {
28 struct Node *newNode;
29
30 newNode = (struct Node *) malloc(sizeof(struct Node));
31 newNode -> data = value;
32 newNode -> next = list -> first;
33 list -> first = newNode;
34 list -> size++;
35 return list;
36 }
37
38 int getList(LinkedList list, int position)
39 {
40 int entry;
41 int count;
42 struct Node *current;
43
44 current = list -> first;
45 for (count=0; count<position; count++)
46 {
47 current = current -> next;
48 }
49 entry = current -> data;
50 return entry;
51 }
52
53 int sizeList(LinkedList list)
54 {
55 return list -> size;
56 }
57
58 LinkedList removeList(LinkedList list, int position)
59 {
60 int entry;
61 int count;
62 struct Node *current;
63 struct Node *previous;
64
65 previous = NULL;
66 current = list -> first;
67 for (count=0; count<position; count++)
68 {
69 previous = current;
70 current = current -> next;
71 }
72
73 if (previous == NULL)
74 { // delete the first element in the list
75 list -> first = current -> next;
76 }
77 else
78 { // delete an element that has another element before it
79 previous -> next = current -> next;


80 }
81
82 list -> size--;
83 return list;
84 }
85
86 void printList(LinkedList list)
87 {
88 int count;
89
90 if (list->size > 0)
91 {
92 printf("\nContents of list:\n");
93 for (count=0; count<list->size; count++)
94 {
95 printf("Element %d is %d\n", count, getList(list, count));
96 }
97 printf("\n");
98 }
99 else
100 {
101 printf("\nThe list is empty.\n");
102 }
103 }

1 Contents of list:
2 Element 0 is 3
3
4 Contents of list:
5 Element 0 is 2
6 Element 1 is 3
7
8 Contents of list:
9 Element 0 is 1
10 Element 1 is 2
11 Element 2 is 3
12
13 Contents of list:
14 Element 0 is 1
15 Element 1 is 3
16
17 Contents of list:
18 Element 0 is 5
19 Element 1 is 1
20 Element 2 is 3
21
22 The list is empty.

Note that the linked-list module is not yet complete. New elements are added to the beginning of the list
(new elements are normally added to the end of the list). Plus, there are several memory management
problems that need to be fixed.
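
Both points can be sketched as follows: a new node is appended after the last node, and a removed node is returned to the heap with free. This is only an illustration under stated assumptions. It works directly on a pointer to the first node rather than on the LinkedList type, and the names appendNode, removeNode and freeNodes are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct Node
{
    int data;
    struct Node *next;
};

/* Adds value after the last node. Returns 1 on success, 0 if out of memory. */
int appendNode(struct Node **first, int value)
{
    struct Node *newNode;
    struct Node *current;

    assert(first != NULL);

    newNode = malloc(sizeof(struct Node));
    if (newNode == NULL)
    {
        return 0;
    }
    newNode->data = value;
    newNode->next = NULL;

    if (*first == NULL)
    {   /* empty list: the new node becomes the first node */
        *first = newNode;
    }
    else
    {   /* walk to the last node and link the new node after it */
        current = *first;
        while (current->next != NULL)
        {
            current = current->next;
        }
        current->next = newNode;
    }
    return 1;
}

/* Unlinks and frees the node at position. Returns 1 if a node was removed. */
int removeNode(struct Node **first, int position)
{
    struct Node *previous = NULL;
    struct Node *current;
    int count;

    assert(first != NULL);
    if (position < 0)
    {
        return 0;
    }

    current = *first;
    for (count = 0; current != NULL && count < position; count++)
    {
        previous = current;
        current = current->next;
    }
    if (current == NULL)
    {
        return 0;   /* position is past the end of the list */
    }

    if (previous == NULL)
    {
        *first = current->next;
    }
    else
    {
        previous->next = current->next;
    }
    free(current);   /* without this call the node would be leaked */
    return 1;
}

/* Frees every node and leaves the list empty. */
void freeNodes(struct Node **first)
{
    struct Node *next;

    assert(first != NULL);
    while (*first != NULL)
    {
        next = (*first)->next;
        free(*first);
        *first = next;
    }
}
```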

5.24 Working with Strings


Working with strings and dynamic memory allocation requires some extra care due to the way that
strings are constructed.
The following examples illustrate four different ways of storing a collection of strings. In the first example,
the strings are stored in one large two-dimensional array. Since the array must be of a predefined size,
there is wasted space at the end of each string (each row in the array) and at the end of the array
(unused rows). Note that this example does not use dynamic memory allocation. The string termination
character is not shown in the diagrams.

[Figure: a two-dimensional array with one string per row: abc, def, ghi, jkl]

1 #include <stdio.h>
2 #include <string.h>
3 #include <stdlib.h>
4
5 int main(int numParms, char *parms[])
6 {
7 const int LINE_SIZE = 1000;
8 const int STRING_SIZE = 100;
9 const int NUM_STRINGS = 100;
10
11 FILE *infile;
12 char strings[NUM_STRINGS][STRING_SIZE];
13 char line[LINE_SIZE];
14 char* result;
15 int size;
16 int count;
17
18 size = 0;
19 infile = fopen("in.txt", "r");
20 result = fgets(line, LINE_SIZE, infile);
21 while (result != NULL)
22 {
23 if (line[strlen(line)-1]=='\n')
24 {
25 line[strlen(line)-1]='\0';
26 }
27 if ((size>=NUM_STRINGS) || (strlen(line)>=STRING_SIZE))
28 {
29 fprintf(stderr, "\nRan out of memory.\n");
30 exit(EXIT_FAILURE);
31 }
32 strcpy(strings[size], line);
33 printf("<%s>\n", line);
34 result = fgets(line, LINE_SIZE, infile);
35 size++;
36 }
37 fclose(infile);
38
39 printf("\n");
40 for (count=0; count<size; count++)
41 {
42 printf("<%s>\n", (char*) strings[count]);
43 }
44
45 printf("\nAll done\n");
46 return 0;
47 }

In the second example, memory is allocated dynamically for each string. A pointer to the memory is
stored in a char* array. Since the memory is allocated on a string by string basis, the amount of memory
allocated for each string is just enough to store the string and its termination character. However, the
array that contains the pointers to the strings is of fixed size and so there is extra space at the end of the
array.

[Figure: a char* array whose entries point to separately allocated strings: abc, def, ghi, jkl]

1 #include <stdio.h>
2 #include <string.h>
3 #include <stdlib.h>
4
5 int main(int numParms, char *parms[])
6 {
7 const int LINE_SIZE = 1000;
8 const int NUM_STRINGS = 100;
9
10 FILE *infile;
11 char* strings[NUM_STRINGS];
12 char line[LINE_SIZE];
13 char* result;
14 int size;
15 int count;
16
17 size = 0;
18 infile = fopen("in.txt", "r");
19 result = fgets(line, LINE_SIZE, infile);
20 while (result != NULL)
21 {
22 if (line[strlen(line)-1]=='\n')
23 {
24 line[strlen(line)-1]='\0';
25 }
26 if (size>=NUM_STRINGS)
27 {
28 fprintf(stderr, "\nRan out of memory.\n");
29 exit(EXIT_FAILURE);
30 }
31 strings[size] = malloc((strlen(line)+1)*sizeof(char));
32 if (strings[size] == NULL)
33 {
34 fprintf(stderr, "\nRan out of memory.\n");
35 exit(EXIT_FAILURE);
36 }
37 strcpy(strings[size], line);
38 printf("<%s>\n", line);
39 result = fgets(line, LINE_SIZE, infile);
40 size++;
41 }
42 fclose(infile);
43
44 printf("\n");
45 for (count=0; count<size; count++)
46 {
47 printf("<%s>\n", (char*) strings[count]);
48 }
49
50 for (count=0; count<size; count++)
51 {
52 free(strings[count]);
53 }
54
55 printf("\nAll done\n");
56 return 0;
57 }

In the third example, one block of memory is allocated at the beginning of the processing and the pro-
gram then allocates a portion of the big block of memory for each string. As in the previous example,
the pointer to each string is stored in an array of pointers.

[Figure: a char* array whose entries point into a single memory block that holds abc, def, ghi, jkl one after another]

1 #include <stdio.h>
2 #include <string.h>
3 #include <stdlib.h>
4
5 int main(int numParms, char *parms[])
6 {
7 const int LINE_SIZE = 1000;
8 const int NUM_STRINGS = 100;
9 const int MEM_SIZE = 2000;
10
11 FILE *infile;
12 char* strings[NUM_STRINGS];
13 char line[LINE_SIZE];
14 char* memoryPool;
15 char* nextAvailable;
16 char* lastAvailable;
17 char* result;
18 int size;
19 int count;
20
21 size = 0;
22 memoryPool = malloc(MEM_SIZE*sizeof(char));
23 nextAvailable = memoryPool;
24 lastAvailable = memoryPool+MEM_SIZE-1;
25 infile = fopen("in.txt", "r");
26 result = fgets(line, LINE_SIZE, infile);
27 while (result != NULL)
28 {
29 if (line[strlen(line)-1]=='\n')
30 {
31 line[strlen(line)-1]='\0';
32 }
33 if (size>=NUM_STRINGS)
34 {
35 fprintf(stderr, "\nRan out of memory.\n");


36 free(memoryPool);
37 exit(EXIT_FAILURE);
38 }
39 if (nextAvailable+strlen(line) > lastAvailable)
40 {
41 fprintf(stderr, "\nRan out of memory.\n");
42 free(memoryPool);
43 exit(EXIT_FAILURE);
44 }
45 strings[size] = nextAvailable;
46 nextAvailable += (strlen(line)+1)*sizeof(char);
47 strcpy(strings[size], line);
48 printf("<%s>\n", line);
49 result = fgets(line, LINE_SIZE, infile);
50 size++;
51 }
52 fclose(infile);
53
54 printf("\n");
55 for (count=0; count<size; count++)
56 {
57 printf("<%s>\n", (char*) strings[count]);
58 }
59
60 free(memoryPool);
61
62 printf("\nAll done\n");
63 return 0;
64 }

In the final example, one block of memory is allocated at the beginning of the processing and the pro-
gram then allocates a portion of the block of memory for each string and a portion of the block for each
pointer. The strings are stored at the beginning of the block and the pointers are stored at the end of the
block. Using this technique, as strings and pointers are added to the block, the amount of free space in
the block becomes smaller and smaller until there is no longer room for the next string and its associated
pointer.

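[Figure: a single memory block with the strings abc, def, ghi, jkl stored from the beginning and their pointers stored from the end]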

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(int numParms, char *parms[])
{
    const int LINE_SIZE = 50;
    const int MEM_SIZE = 2000;

    FILE *infile;
    char line[LINE_SIZE];
    char* memoryPool;
    char* nextString;
    char** firstPtr;
    char** nextPtr;
    char** current;
    char* result;

    memoryPool = malloc(MEM_SIZE*sizeof(char));
    if (memoryPool == NULL)
    {
        fprintf(stderr, "\nRan out of memory\n");
        exit(EXIT_FAILURE);
    }
    // strings are stored from the beginning of the block
    nextString = memoryPool;
    // pointers are stored from the last pointer-sized slot in the block
    firstPtr = ((char**) memoryPool) + (MEM_SIZE/sizeof(char*) - 1);
    nextPtr = firstPtr;

    infile = fopen("in.txt", "r");
    if (infile == NULL)
    {
        fprintf(stderr, "\nUnable to open in.txt\n");
        free(memoryPool);
        exit(EXIT_FAILURE);
    }
    result = fgets(line, LINE_SIZE, infile);
    while (result != NULL)
    {
        if ((strlen(line)>0) && (line[strlen(line)-1]=='\n'))
        {
            line[strlen(line)-1]='\0';
        }
        // the string and its terminator must end before the next pointer slot
        if ((nextString+strlen(line)+1) > ((char*)nextPtr))
        {
            fprintf(stderr, "\nRan out of memory\n");
            free(memoryPool);
            exit(EXIT_FAILURE);
        }
        strcpy(nextString, line);
        *nextPtr = nextString;
        printf("<%s>\n", line);
        nextString += strlen(line)+1;
        nextPtr -= 1;
        result = fgets(line, LINE_SIZE, infile);
    }
    fclose(infile);

    printf("\n\n");
    current = firstPtr;
    while (current > nextPtr)
    {
        printf("<%s>\n", *current);
        current--;
    }

    free(memoryPool);

    printf("\nAll done\n");
    return 0;
}

In the last two examples in this section that sub-allocate memory from a large block of memory, the
pointers are absolute pointers, that is, they point directly at the memory address of the associated string.
These pointers could also be made relative pointers (relative to the beginning of the block of memory).
The pointers would then be an offset into the block of memory. To determine the actual address of a
string, the address of the beginning of the block of memory would be added to the value of the offset
into the block. Using this type of addressing would permit the block to be resized more easily if it
became full.
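
The relative addressing described above can be sketched as follows. This is a minimal illustration, not part of the examples in this section: the StringPool structure and the functions poolAdd and poolGet are hypothetical names. Each string is identified by its offset from the start of the block, so the block can be moved by realloc without invalidating any stored offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical pool of strings addressed by offset. */
struct StringPool
{
    char *block;       /* beginning of the block of memory */
    size_t capacity;   /* total number of bytes in the block */
    size_t used;       /* number of bytes already occupied */
};

/* Copies text into the pool and stores its offset in *offset.
   Returns 1 on success, 0 if the block could not be enlarged. */
int poolAdd(struct StringPool *pool, const char *text, size_t *offset)
{
    size_t needed;
    size_t newCapacity;
    char *newBlock;

    assert(pool != NULL && text != NULL && offset != NULL);

    needed = strlen(text) + 1;   /* include the termination character */
    if (pool->used + needed > pool->capacity)
    {
        newCapacity = (pool->used + needed) * 2;
        newBlock = realloc(pool->block, newCapacity);
        if (newBlock == NULL)
        {
            return 0;
        }
        /* the block may have moved, but offsets into it are unchanged */
        pool->block = newBlock;
        pool->capacity = newCapacity;
    }

    memcpy(pool->block + pool->used, text, needed);
    *offset = pool->used;
    pool->used += needed;
    return 1;
}

/* Converts an offset into the actual address of the string. */
const char *poolGet(const struct StringPool *pool, size_t offset)
{
    assert(pool != NULL);
    assert(offset < pool->used);

    return pool->block + offset;
}
```

A pool starts out as {NULL, 0, 0}, and the caller frees pool.block when the collection is no longer needed. The address returned by poolGet should be used immediately rather than saved, since a later call to poolAdd may move the block.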

Also, all four of the examples handle only insertions into the collection of strings. Other operations such
as deleting a string and resizing a string are not handled but the processing for these operations could
be added with some additional effort.

The examples shown in this section provide the foundation for implementing a module that maintains a
collection of variable-length “objects” such as the array list module that has been developed throughout
these notes.
5.25 Function Pointers


In the previous sections of this chapter, we have seen how pointers can be used to point to data values.
In this section, we examine how pointers can also be used to point to functions. Suppose that we have
to write a function that performs various types of processing on an array. We create a different version
of the evaluation function for each type of processing that can be performed.

1 #include <stdio.h>
2
3 int evalF1(int[], int);
4 int evalF2(int[], int);
5 int function1(int);
6 int function2(int);
7
8 int main(int numParms, char *parms[])
9 {
10 int array1[] = {1, 2, 3, 4, 5};
11 int result;
12
13 result = evalF1(array1, 5);
14 printf("%d\n", result);
15
16 result = evalF2(array1, 5);
17 printf("%d\n", result);
18
19 return 0;
20 }
21
22 int evalF1(int array[], int size)
23 {
24 int result;
25 int count;
26
27 result = 0;
28 for (count=0; count<size; count++)
29 {
30 result += function1(array[count]);
31 }
32 return result;
33 }
34
35 int evalF2(int array[], int size)
36 {
37 int result;
38 int count;
39
40 result = 0;
41 for (count=0; count<size; count++)
42 {
43 result += function2(array[count]);
44 }
45 return result;
46 }
47
48 int function1(int value)
49 {
50 return value;
51 }
52
53 int function2(int value)
54 {
55 return value*value;
56 }

1 15
2 55

In the simple example shown above, the processing can be defined quite easily. If the number of possible
functions were a bit larger, we could simplify the processing somewhat by passing a switch to the
evaluation function. The switch would be used by the evaluation function to determine which lower-level
function to call.
1 #include <stdio.h>
2
3 int evalF(int[], int, int);
4 int function1(int);
5 int function2(int);
6
7 int main(int numParms, char *parms[])
8 {
9 int array1[] = {1, 2, 3, 4, 5};
10 int result;
11
12 result = evalF(array1, 5, 1);
13 printf("%d\n", result);
14
15 result = evalF(array1, 5, 2);
16 printf("%d\n", result);
17
18 return 0;
19 }
20
21 int evalF(int array[], int size, int whichFunction)
22 {
23 int result;
24 int count;
25
26 result = 0;
27 for (count=0; count<size; count++)
28 {
29 if (whichFunction == 1)
30 {
31 result += function1(array[count]);
32 }
33 else if (whichFunction == 2)
34 {
35 result += function2(array[count]);
36 }
37 }
38 return result;
39 }
40
41 int function1(int value)
42 {
43 return value;
44 }
45
46 int function2(int value)
47 {
48 return value*value;
49 }

1 15
2 55

Instead of hard-coding the values of the switch, using an enumerated type would be more appropriate.
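
A minimal sketch of the enumerated-type version is shown below. The enum name Operation and its constants IDENTITY and SQUARE are hypothetical, as are the helper names identity and square, which play the roles of function1 and function2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the two kinds of processing. */
enum Operation
{
    IDENTITY,
    SQUARE
};

static int identity(int value)
{
    return value;
}

static int square(int value)
{
    return value * value;
}

int evalF(const int array[], int size, enum Operation whichFunction)
{
    int result;
    int count;

    assert(array != NULL);

    result = 0;
    for (count = 0; count < size; count++)
    {
        switch (whichFunction)
        {
            case IDENTITY:
                result += identity(array[count]);
                break;
            case SQUARE:
                result += square(array[count]);
                break;
        }
    }
    return result;
}
```

A call such as evalF(array1, 5, SQUARE) then documents itself, whereas evalF(array1, 5, 2) does not.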

Suppose that we have a large number of functions that could be applied. We would not want to create
a large number of conditional statements in the evaluation function, one for each type of processing that
can be applied. The following example illustrates how we can pass a function as a parameter (actually a
pointer to the function) and then invoke the function within the function that is called.

1 #include <stdio.h>
2
3 int eval(int[], int, int(*fp)(int));
4 int function1(int);
5 int function2(int);
6
7 int main(int numParms, char *parms[])
8 {
9 int array1[] = {1, 2, 3, 4, 5};
10 int (*f) (int);
11 int result;
12
13 f = function1;
14 result = eval(array1, 5, *f);
15 printf("%d\n", result);
16
17 f = function2;
18 result = eval(array1, 5, *f);
19 printf("%d\n", result);
20
21 return 0;
22 }
23
24 int eval(int array[], int size, int (*f) (int))
25 {
26 int result;
27 int count;
28
29 result = 0;
30 for (count=0; count<size; count++)
31 {
32 result += f(array[count]);
33 }
34 return result;
35 }
36
37
38 int function1(int value)
39 {
40 return value;
41 }
42
43 int function2(int value)
44 {
45 return value*value;
46 }

1 15
2 55

The program above is significantly easier to write than the previous two programs. However, we can
still improve the program slightly by passing the address of the appropriate function in the argument
list for eval.
1 #include <stdio.h>
2
3 int eval(int[], int, int(*fp)(int));
4 int function1(int);
5 int function2(int);
6
7 int main(int numParms, char *parms[])
8 {
9 int array1[] = {1, 2, 3, 4, 5};
10 int result;
11
12 result = eval(array1, 5, &function1);
13 printf("%d\n", result);
14
15 result = eval(array1, 5, &function2);
16 printf("%d\n", result);
17
18 return 0;
19 }
20
21 int eval(int array[], int size, int (*f) (int))
22 {
23 int result;
24 int count;
25
26 result = 0;
27 for (count=0; count<size; count++)
28 {
29 result += f(array[count]);
30 }
31 return result;
32 }
33
34
35 int function1(int value)
36 {
37 return value;
38 }
39
40 int function2(int value)
41 {
42 return value*value;
43 }

1 15
2 55

Now that we can define function pointers, we have the ability to define “complete” objects using a
structure. The structure can contain not only the data (instance variables) of the object but also pointers
to the functions used by the objects. These objects would have to be hand crafted but it is possible (with
some additional effort) to invoke the function pointers dynamically using a table of function pointers
and the associated functions. (Take a look on the web for “vtable” or virtual function table for more
information.)
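
As a rough illustration of this idea, the sketch below stores a pointer to a table of function pointers inside each structure, so that the function that is invoked depends on the object. All of the names (Shape, ShapeOps, makeRectangle, makeTriangle, shapeArea) are hypothetical, and this is only one of several ways to hand-craft such objects.

```c
#include <assert.h>
#include <stddef.h>

struct Shape;

/* The table of function pointers shared by all objects of one kind. */
struct ShapeOps
{
    int (*area)(const struct Shape *shape);
};

/* The "object": its data plus a pointer to its function table. */
struct Shape
{
    const struct ShapeOps *ops;
    int width;
    int height;
};

static int rectangleArea(const struct Shape *shape)
{
    return shape->width * shape->height;
}

static int triangleArea(const struct Shape *shape)
{
    return (shape->width * shape->height) / 2;
}

static const struct ShapeOps rectangleOps = { rectangleArea };
static const struct ShapeOps triangleOps = { triangleArea };

struct Shape makeRectangle(int width, int height)
{
    struct Shape shape;

    shape.ops = &rectangleOps;
    shape.width = width;
    shape.height = height;
    return shape;
}

struct Shape makeTriangle(int width, int height)
{
    struct Shape shape;

    shape.ops = &triangleOps;
    shape.width = width;
    shape.height = height;
    return shape;
}

/* Calls whichever area function the object's table points to. */
int shapeArea(const struct Shape *shape)
{
    assert(shape != NULL);
    assert(shape->ops != NULL && shape->ops->area != NULL);

    return shape->ops->area(shape);
}
```

The caller writes shapeArea(&someShape) without knowing which kind of shape it holds. Supporting a new kind of shape means writing one more function table, with no change to shapeArea.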
5.26 Big/Little Endian Memory Organization


The order in which the bytes used to represent ints, doubles, floats, pointers, etc. are stored is referred
to as big-endian or little-endian memory organization. The diagram below illustrates the order in
which the bytes used to store the integer value 1 are stored at the memory location 0x003D24F0 in the
big-endian and little-endian representations. Note that with the little-endian representation, the bytes
can be viewed in two equivalent ways (the only difference is the order of the memory locations).
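Big-endian (addresses increasing from left to right):

    Address:  003D24F0  003D24F1  003D24F2  003D24F3
    Byte:        00        00        00        01

Little-endian (addresses increasing from left to right):

    Address:  003D24F0  003D24F1  003D24F2  003D24F3
    Byte:        01        00        00        00

or, equivalently, little-endian with addresses decreasing from left to right:

    Address:  003D24F3  003D24F2  003D24F1  003D24F0
    Byte:        00        00        00        01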

The following function can be used to determine which memory organization is used by a particular
machine.
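
A minimal sketch of such a function is given here. It stores the value 1 in an unsigned int and examines the byte at the lowest address. The names isLittleEndian and printEndian are hypothetical, and the messages are chosen to match the output shown below.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 on a little-endian machine and 0 on a big-endian machine. */
int isLittleEndian(void)
{
    unsigned int value = 1;
    unsigned char *firstByte = (unsigned char *) &value;

    /* the test is meaningful only if an int occupies more than one byte */
    assert(sizeof(unsigned int) > 1);

    /* little-endian: the least significant byte is stored first */
    return *firstByte == 1;
}

void printEndian(void)
{
    if (isLittleEndian())
    {
        printf("Little-endian machine\n");
    }
    else
    {
        printf("Big-endian machine\n");
    }
}
```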
1 Little-endian machine // Windows XP 32-bit machine
2
3 Little-endian machine // x86_64 Linux system (member of CS Aviary flock)
4
5 Big-endian machine // Sun Blade 1000 system, running Solaris 8

On Unix/Linux, use the command uname -a to obtain information about the machine type. The reason
for re-introducing the little/big-endian memory organization is so that the output generated by the
memory dump function described in the next section will make a bit more sense.

5.27 Memory Dump Function


When working with pointers, it can be helpful if you examine the contents of memory (and not just
the contents of a few variables). The following function prints the contents of memory beginning at the
memory location defined by the first parameter. The second parameter, s, specifies the number of bytes
to be printed.

1 #include <stdio.h>
2 #include <stdlib.h>
3 #include <ctype.h>
4
5 void memoryDump(void *p, int s)
6 {
7 int *ptr;
8 int *intPtr;
9 int size;
10 unsigned char *charPtr;
11 unsigned char *charPtr2;
12
13 printf("\n");
14 printf("%10c %8c %8c %8c 0...4...8...C...\n", '0', '4', '8', 'C');
15 intPtr = (int*) ((((long unsigned int) p) >> 4) << 4);
16 size = (s+15)/16*4;
17 for (ptr=intPtr; ptr<intPtr+size; ptr=ptr+4)
18 {
19 printf("%08X ", (unsigned int) ptr);
20
21 for (charPtr2=(char*)ptr; ((char*)charPtr2)<(char*)((long int)(ptr)+16); charPtr2+=4)


22 {
23 printf("%02X%02X%02X%02X ",*charPtr2,*(charPtr2+1),*(charPtr2+2),*(charPtr2+3));
24 }
25 printf(" ");
26
27 charPtr = (char*) ptr;
28 for (charPtr2=charPtr; charPtr2<charPtr+16; charPtr2++)
29 {
30 printf("%c", isprint(*charPtr2) ? *charPtr2 : '.');
31 }
32 printf("\n");
33 }
34 }

A conditional operator is used to determine (using the isprint function which is included in the ctype
library) whether or not a character is printable and then print either the character or a period.
The following program declares several arrays, prints the address of the first element in each array, and
then uses memoryDump to print the contents of some of the memory used by the program.

1 #include <stdio.h>
2
3 void memoryDump(void*, int);
4
5 int main(int numParms, char *parms[])
6 {
7 int array1[] = {1, 2, 3, 4, 5};
8 int array2[] = {6, 7};
9 int array3[] = {8, 9, 10};
10 char msg[] = "Hello";
11
12 printf("array1 is at %p \n", (int*) &array1[0]);
13 printf("array2 is at %p \n", (int*) &array2[0]);
14 printf("array3 is at %p \n", (int*) &array3[0]);
15 printf("msg is at %p \n", (char*) &msg[0]);
16
17 memoryDump(&msg[0], 80);
18
19 return 0;
20 }

The output from this program is shown below. Notice that the arrays are not allocated in order. The
program was run on a Windows (little-endian) machine, so each int value is stored with its least
significant byte first and its most significant byte last.

1 array1 is at 0022FF68
2 array2 is at 0022FF60
3 array3 is at 0022FF48
4 msg is at 0022FF38
5
6 0 4 8 C 0...4...8...C...
7 0022FF30 14B8C377 00000000 48656C6C 6F000000 ...w....Hello...
8 0022FF40 ADAEC377 864BC167 08000000 09000000 ...w.K.g........
9 0022FF50 0A000000 02000000 60FF2200 05EFC177 ........`."....w
10 0022FF60 06000000 07000000 01000000 02000000 ................
11 0022FF70 03000000 04000000 05000000 00304000 .............0@.

When the same program is run on a Sun Blade system (big-endian machine), the following results are
generated. Note that the order of the bytes used to represent each int is reversed.

2 array1 is at ffbefb08
3 array2 is at ffbefb00
4 array3 is at ffbefaf0
5 msg is at ffbefae8
6
7 0 4 8 C 0...4...8...C...
8 FFBEFAE0 FF29BC20 00000000 48656C6C 6F000000 .). ....Hello...


9 FFBEFAF0 00000008 00000009 0000000A 00000000 ................
10 FFBEFB00 00000006 00000007 00000001 00000002 ................
11 FFBEFB10 00000003 00000004 00000005 FFBEFB9C ................
12 FFBEFB20 00000005 FFBEFC00 00000000 00000000 ................

If the program is run on a 64-bit little-endian machine, the pointers are 8 bytes long and the information
generated by memoryDump is truncated but this should not cause any confusion since the full address of the
pointer is displayed in the calling program. (Compiling the program generates the message: warning:
cast from pointer to integer of different size) The memoryDump program could be modified to generate 16
hexadecimal values instead of 8 if it is necessary to view all values.

1 array1 is at 0x7fff1df1d440
2 array2 is at 0x7fff1df1d430
3 array3 is at 0x7fff1df1d420
4 msg is at 0x7fff1df1d410
5
6 0 4 8 C 0...4...8...C...
7 1DF1D410 48656C6C 6F000000 1B044000 00000000 Hello.....@.....
8 1DF1D420 08000000 09000000 0A000000 00000000 ................
9 1DF1D430 06000000 07000000 C0AB21B9 3F000000 ..........!.?...
10 1DF1D440 01000000 02000000 03000000 04000000 ................
11 1DF1D450 05000000 FF7F0000 00000000 00000000 ................

5.28 Pointer Pitfalls


Working with pointers correctly is arguably the most difficult part of learning to program in the C
language. The following are common mistakes that are made (both by beginning programmers and by
experienced programmers) when manipulating memory:
• forget to assign an address to a pointer variable;
• forget to assign a value to the memory location that the pointer variable points to;
• forget to free memory that has been allocated but is no longer required (a memory leak);
• forget to assign a value of NULL to a pointer variable once the memory that it used to point to has
been freed;
• set a pointer variable to point to a variable that is declared within a function (on the stack) and
then use the pointer variable once the function has returned control to the calling function.
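
Several of these mistakes can be avoided with a few habits, sketched below under stated assumptions: the functions makeCounter and releaseCounter are hypothetical names used only for illustration. The value is allocated on the heap rather than on the stack, so it remains valid after the function returns, and the pointer is set to NULL as soon as the memory has been freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns heap memory instead of the address of a local variable.
   The caller owns the memory and must release it. */
int *makeCounter(int start)
{
    int *counter;

    counter = malloc(sizeof(int));   /* assign an address to the pointer */
    if (counter != NULL)
    {
        *counter = start;            /* assign a value to the memory it points to */
    }
    return counter;
}

/* Frees the memory and resets the caller's pointer to NULL. */
void releaseCounter(int **counter)
{
    assert(counter != NULL);

    free(*counter);    /* free(NULL) is allowed and does nothing */
    *counter = NULL;   /* the caller can no longer use a stale address */
}
```

Because the caller's pointer is NULL after releaseCounter(&c), an accidental second call is harmless, and a later test such as c != NULL correctly reports that the memory is gone.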

5.29 Summary
In this chapter, we examined the basics of pointer manipulation and memory management. The cover-
age was deliberately brief. As you delve deeper into the C language, you will find that there are many
additional issues involving pointers and the use of memory.

5.30 Exercises
1. How would you implement the shallow copying of an array?
2. How would you implement the deep copying of an array?
3. What happens if you pass &array1[4] to a function (where array1 is declared to have at least 5
elements)?
4. When realloc is used to resize a chunk of memory, it is possible that the new memory location
could be the same as the old memory location. What are the circumstances in which this could be
true?
5. What is the difference between a symbolic constant and a literal constant?
6. The heap contains memory that is available for dynamic allocation. Describe an algorithm that
could be used by the C run-time system to manage the heap (keep track of which chunks of mem-
ory are being used and which chunks of memory are available for use.)
Chapter 6
Design by Contract

Beginning with this chapter, we finally start to add the code that will improve the quality of the pro-
grams that we write. In this chapter, we examine how error checking can be added to a module to ensure
that the module functions correctly even in the presence of erroneous input values.

6.1 Design by Contract


Programs that ensure that certain conditions are met before the programs perform any processing are
said to have been “designed by contract”. The following is an excerpt from the Wikipedia article on
design by contract.
“If a routine from a class in object-oriented programming provides a certain functionality, it may:
• Impose a certain obligation to be guaranteed on entry by any client module that calls it: the rou-
tine’s precondition—an obligation for the client, and a benefit for the supplier (the routine itself),
as it frees it from having to handle cases outside of the precondition.
• Guarantee a certain property on exit: the routine’s postcondition—an obligation for the supplier,
and obviously a benefit (the main benefit of calling the routine) for the client.
• Maintain a certain property, assumed on entry and guaranteed on exit: the class invariant.”
“The contract is the formalization of these obligations and benefits. One could summarize design by
contract by the “three questions” that the designer must repeatedly ask:
• What does it expect?
• What does it guarantee?
• What does it maintain?”
Source: http://en.wikipedia.org/wiki/Design_by_contract

The statement above is a slightly long-winded way of saying that in any module that you write, you
must ensure that any parameters passed to the module are valid (the preconditions), that the state of
your processing and data structures is always valid (the invariants), and that any values returned by
your module are also valid (the postconditions).
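
The three kinds of checks can be illustrated with a small sketch that uses the standard assert macro. The IntStack structure, the STACK_CAPACITY constant and the functions validStack and push are hypothetical and are not part of the array list module.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_CAPACITY 4   /* hypothetical fixed capacity */

struct IntStack
{
    int size;
    int data[STACK_CAPACITY];
};

/* Invariant: the stack exists and its size is within bounds. */
static int validStack(const struct IntStack *stack)
{
    return (stack != NULL) &&
           (0 <= stack->size) && (stack->size <= STACK_CAPACITY);
}

void push(struct IntStack *stack, int value)
{
    int oldSize;

    assert(validStack(stack));               /* invariant holds on entry */
    assert(stack->size < STACK_CAPACITY);    /* precondition: there is room */

    oldSize = stack->size;
    stack->data[stack->size] = value;
    stack->size++;

    assert(stack->size == oldSize + 1);              /* postcondition */
    assert(stack->data[stack->size - 1] == value);   /* postcondition */
    assert(validStack(stack));               /* invariant holds on exit */
}
```

The precondition is the caller's obligation, the postconditions are the function's obligation, and the invariant is checked both on entry and on exit.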

6.2 Basic Array List Module


The arrayList module that was developed in Chapter 4 is shown below (although this version has fixed
the memory leak that existed by including a destroyList function). As can be seen, this module contains
no error-checking and would fall apart if the user of the module made any mistakes.

1 #include <stdio.h>
2 #include "module1.h"
3 #include <stdlib.h>

4 #define NumEntries 5
5
6 struct arrayList
7 {
8 int maxSize;
9 int size;
10 int *data;
11 };
12
13 ArrayList newList()
14 {
15 ArrayList list;
16 int count;
17
18 list = (ArrayList) malloc(sizeof(struct arrayList));
19 list -> data = (int*) malloc(NumEntries*sizeof(int));
20 for (count=0; count<NumEntries; count++)
21 {
22 list -> data[count] = 0;
23 }
24 list -> size = 0;
25 list -> maxSize = NumEntries;
26 return list;
27 }
28 void addList(ArrayList list, int value)
29 {
30 if (list->size >= list->maxSize)
31 {
32 resizeList(list, list->maxSize+1);
33 }
34 list -> data[list -> size] = value;
35 list -> size++;
36 }
37
38 void resizeList(ArrayList list, int newSize)
39 {
40 int count;
41
42 list -> data = (int*) realloc(list->data, newSize*sizeof(int));
43 list -> maxSize = newSize;
44 for (count=list->size; count<list->maxSize; count++)
45 {
46 list -> data[count] = -1; // initialize new locations (just being careful)
47 }
48 }
49
50 int getList(ArrayList list, int position)
51 {
52 int entry;
53
54 entry = list -> data[position];
55 return entry;
56 }
57
58 int sizeList(ArrayList list)
59 {
60 return list -> size;
61 }
62
63 void removeList(ArrayList list, int position)
64 {
65 int count;
66
67 for (count=position; count<(list -> size-1); count++)
68 {
69 list -> data[count] = list -> data[count+1];
70 }
71 list -> size--;
72 list -> data[list -> size] = 0;
73 }
74
75 ArrayList destroyList(ArrayList list)
76 {
77 free(list->data);
78 free(list);
79
80 return NULL;
81 }
82
83 void printList(ArrayList list)
84 {
85 int count;
86
87 printf("\nContents of list:\n");
88 for (count=0; count<list -> size; count++)
89 {
90 printf("Element %d is %d\n", count, getList(list, count));
91 }
92 printf("\n");
93 }

6.3 Error Checking


We would like to make the array list module as fool-proof as possible. There are many opportunities
for the user of this module to make mistakes and it is the responsibility of the person who develops the
module to perform as much error-checking as possible.
A simple form of error-checking uses the basic if statement. With this statement, we compare a param-
eter with its expected values and generate an error message if the value of the parameter is not correct.
However, since this is such a common sequence of actions, C provides a library (assert.h) that includes
macros that assist us in writing error-checking code. A simple form of an assert statement (macro) is:

1 assert(list != NULL);

The logical expression inside the parentheses is evaluated: if the result of the expression is true, process-
ing continues; if the expression is false, processing terminates with an appropriate error message.
The following program shows the array list module with assert statements that attempt to prevent the
module from performing any incorrect actions.

1 #include <stdio.h>
2 #include <assert.h>
3 #include "module1.h"
4 #include <stdlib.h>
5 #define NumEntries 5
6
7 struct arrayList
8 {
9 int maxSize;
10 int size;
11 int *data;
12 };
13
14 ArrayList newList()
15 {
16 ArrayList list;
17 int count;
18
19 list = (ArrayList) malloc(sizeof(struct arrayList));
20 assert(list != NULL);
21
22 assert(NumEntries > 0);
23 list -> data = (int*) malloc(NumEntries*sizeof(int));
24 assert(list->data != NULL);
25
26 for (count=0; count<NumEntries; count++)
27 {
28 list -> data[count] = 0;
29 }
30 list -> size = 0;


31 list -> maxSize = NumEntries;
32 return list;
33 }
34 void addList(ArrayList list, int value)
35 {
36 assert(list != NULL);
37 assert((0 <= list->size) && (list->size <= list->maxSize));
38
39 if (list->size >= list->maxSize)
40 {
41 resizeList(list, list->maxSize+1);
42 }
43 list -> data[list->size] = value;
44 list -> size++;
45 }
46
47 void resizeList(ArrayList list, int newSize)
48 {
49 int count;
50
51 assert(list != NULL);
52
53 list -> data = (int*) realloc(list->data, newSize*sizeof(int));
54 assert(list->data != NULL);
55
56 list -> maxSize = newSize;
57 for (count=list->size; count<list->maxSize; count++)
58 {
59 list -> data[count] = -1; // initialize new locations (just being careful)
60 }
61 }
62
63 int getList(ArrayList list, int position)
64 {
65 int entry;
66
67 assert(list != NULL);
68 assert((0 <= list->size) && (list->size <= list->maxSize));
69 assert((0 <= position) && (position < list->size));
70
71 entry = list -> data[position];
72 return entry;
73 }
74
75 int sizeList(ArrayList list)
76 {
77 assert(list != NULL);
78 assert((0 <= list->size) && (list->size <= list->maxSize));
79 return list -> size;
80 }
81
82 void removeList(ArrayList list, int position)
83 {
84 int count;
85 assert(list != NULL);
86 assert((0 <= list->size) && (list->size <= list->maxSize));
87 assert((0 <= position) && (position < list->size));
88
89 for (count=position; count<(list -> size-1); count++)
90 {
91 list -> data[count] = list -> data[count+1];
92 }
93 list -> size--;
94 list -> data[list -> size] = 0;
95 }
96
97 ArrayList destroyList(ArrayList list)
98 {
99 if (list != NULL)
100 {
101 if (list->data != NULL)
CHAPTER 6. DESIGN BY CONTRACT 105

102 {
103 free(list->data);
104 }
105 free(list);
106 }
107
108 return NULL;
109 }
110
111 void printList(ArrayList list)
112 {
113 int count;
114
115 assert(list != NULL);
116 assert((0 <= list->size) && (list->size <= list->maxSize));
117
118 printf("\nContents of list:\n");
119 for (count=0; count<list -> size; count++)
120 {
121 printf("Element %d is %d\n", count, getList(list, count));
122 }
123 printf("\n");
124 }

In this module, when checking that we have a valid list data structure, the best we can do is ensure
that the pointer to list is not NULL. If we want to add some additional protection for this data structure,
we could add an additional variable (such as an int) to the beginning of the data structure and set this
variable to a specific and unusual value. Then, in addition to ensuring that the pointer is not NULL, we
could also ensure that the additional variable contains the correct value.
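The extra-variable idea can be sketched as follows. This is only an illustration under stated assumptions: the names taggedList, LIST_MAGIC, validList, newTaggedList and destroyTaggedList are hypothetical and do not appear in module1.h, and the tag value itself is arbitrary.

```c
#include <assert.h>
#include <stdlib.h>

#define LIST_MAGIC 0x4C495354   /* hypothetical "unusual" tag value (ASCII "LIST") */

struct taggedList
{
    int magic;      /* must equal LIST_MAGIC while the list is alive */
    int maxSize;
    int size;
    int *data;
};

typedef struct taggedList *TaggedList;

/* Returns 1 when list looks like a live, consistent list; 0 otherwise. */
int validList(TaggedList list)
{
    return list != NULL &&
           list->magic == LIST_MAGIC &&
           list->data != NULL &&
           0 <= list->size && list->size <= list->maxSize;
}

TaggedList newTaggedList(int maxSize)
{
    TaggedList list;

    assert(maxSize > 0);

    list = malloc(sizeof(struct taggedList));
    assert(list != NULL);
    list->data = malloc(maxSize * sizeof(int));
    assert(list->data != NULL);

    list->maxSize = maxSize;
    list->size = 0;
    list->magic = LIST_MAGIC;   /* tag the structure as valid */

    return list;
}

TaggedList destroyTaggedList(TaggedList list)
{
    if (list != NULL) {
        list->magic = 0;        /* clear the tag before the memory is released */
        free(list->data);
        free(list);
    }

    return NULL;
}
```

Each public routine could then begin with assert(validList(list)) instead of repeating the individual checks. The tag cannot catch every bad pointer, but it makes it much less likely that an arbitrary block of memory is mistaken for a list.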

6.4 Summary
Making programs as error-proof as possible is not particularly hard work but it is work that is often
avoided by the programmer who is rushing to meet a deadline. A solution to this problem is to write
the error-checking code at the same time as the basic code is being written, that is, do not leave error-
checking until the end of development of the program. Once you become familiar with writing ap-
propriate error-checking code along with your basic code, you are on the way to becoming a good
developer.
Chapter 7
Unit Tests

In the previous chapter, we examined how comprehensive error checking could be added within mod-
ules to ensure that each module is able to detect incorrect parameters or invalid situations. Performing
such internal testing is one step towards providing quality software. A second step involves ensuring
that a module (or small group of modules) generates the correct output for a collection of test prob-
lems. This process is referred to as unit testing. In the golden days of computing, the programmer (or a
program tester) performed unit tests manually—typing values for parameters and then visually deter-
mining whether or not the function returned the correct value. Now, however, unit tests are specified
in an executable form. So a unit test is defined in another module and it calls a particular function with
specific parameter values and then compares the output returned by the function with the correct out-
put to ensure that the module is functioning correctly. Unit tests are an executable specification of what
the system is supposed to do. Unit tests can (and should) be re-run every time that the system is rebuilt
to ensure that no changes have been made that result in incorrect processing. By defining unit tests and
running them frequently, we add additional quality to the system.

7.1 A Simple Unit Test


We will begin by writing a simple set of unit tests for the array list module that was developed in
Chapter 4.
The unit test is fairly simple. To begin, an initialization function (initSuite) is called to initialize any
variables required by the test. Then, each unit test (addTest, removeTest) is called in turn. Finally, a
cleanup function (cleanSuite) is called to perform any final processing that is necessary.

#include <stdio.h>
#include <assert.h>
#include "module1.h"

int initSuite(void);
int cleanSuite(void);
void addTest(void);
void removeTest(void);

ArrayList myList;

int main(int numParms, char *parms[])
{
    int returnCode;

    if (initSuite()==0)
    {
        addTest();
        removeTest();
        cleanSuite();
        printf("Tests completed successfully.\n");
        returnCode = 0;
    }
    else
    {
        printf("System could not be initialized.\n");
        returnCode = 1;
    }

    return returnCode;
}

// The suite initialization function.

int initSuite(void)
{
    int result;

    if ((myList=newList()) != NULL)
    {
        result = 0;
    }
    else
    {
        result = 1;
    }
    return result;
}

// The suite cleanup function.

int cleanSuite(void)
{
    myList = destroyList(myList);
    return 0;
}

// Test adding elements to the list

void addTest(void)
{
    assert(myList != NULL);

    addList(myList, 1);
    assert(sizeList(myList)==1);
    assert(getList(myList,0)==1);

    addList(myList, 2);
    assert(sizeList(myList)==2);
    assert(getList(myList,0)==1);
    assert(getList(myList,1)==2);

    addList(myList, 3);
    assert(sizeList(myList)==3);
    assert(getList(myList,0)==1);
    assert(getList(myList,1)==2);
    assert(getList(myList,2)==3);
}

// Test removing elements from the list

void removeTest(void)
{
    assert(myList != NULL);

    assert(sizeList(myList)==3);

    removeList(myList, 0);
    assert(sizeList(myList)==2);
    assert(getList(myList,0)==2);
    assert(getList(myList,1)==3);

    removeList(myList, 0);
    assert(sizeList(myList)==1);
    assert(getList(myList,0)==3);

    removeList(myList, 0);
    assert(sizeList(myList)==0);
}

This program can be compiled using the statement:

sh-3.2$ gcc -ggdb unittest.c module1.c -o unittest

When the program is run, the following output is generated:

Tests completed successfully.

Unfortunately, there are two problems with this approach. First, as soon as a test fails, assert aborts
the program, so none of the remaining tests are run. Second, no information is generated about the
number of tests that were performed and the number of tests that were successful. This information
could be generated by collecting the various statistics ourselves, but we will see in the next section
how a unit-testing framework can make life much easier for us.
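For comparison, collecting the statistics by hand might look something like the following sketch. The names checkTrue, CHECK, testsRun, testsFailed and printSummary are hypothetical helpers invented for this illustration; they are not part of assert.h or of any framework.

```c
#include <assert.h>
#include <stdio.h>

static int testsRun = 0;      /* number of checks performed so far */
static int testsFailed = 0;   /* number of checks that failed */

/* Record one check; report a failure but keep going. Returns 1 on a pass. */
int checkTrue(int condition, const char *description, const char *file, int line)
{
    testsRun++;
    if (!condition) {
        testsFailed++;
        printf("FAILED: %s (%s, line %d)\n", description, file, line);
    }

    return condition != 0;
}

/* Hypothetical convenience macro: passes the expression text and its location. */
#define CHECK(expr) checkTrue((expr), #expr, __FILE__, __LINE__)

/* Print the totals once all of the tests have been run. */
void printSummary(void)
{
    assert(testsFailed <= testsRun);
    printf("%d checks run, %d passed, %d failed\n",
           testsRun, testsRun - testsFailed, testsFailed);
}
```

With these helpers, each assert(...) in addTest and removeTest would become CHECK(...), and main would call printSummary() after cleanSuite(). A failed check is then reported and counted instead of halting the run.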

7.2 CUnit
In the past decade (or so), various frameworks for developing unit tests have been created. One of the
most popular frameworks is JUnit, a framework for writing unit tests for Java programs. JUnit has been
ported to C with the result being the CUnit framework. In this section, we describe how to set up the
CUnit framework. (It should be noted that the JUnit framework is more user friendly than CUnit and that
JUnit has been integrated into the Eclipse system while CUnit has not.)

CUnit consists of several components that must be installed correctly before CUnit will run. To avoid
path problems, the following installation instructions describe how the CUnit components can be
integrated with your existing C compiler. While this is not an ideal solution, it does remove the many
problems that are encountered when attempting to make the CUnit libraries available to the C compiler
without merging the CUnit libraries into the C compiler directories.

7.3 Creating a Unit Test


The following is a unit test that performs the same processing as the simple unit test defined earlier.

#include <stdio.h>
#include <string.h>
#include "module1.h"
#include "CUnit/Basic.h"
#include "CUnit/CUnit.h"
#include "CUnit/TestDB.h"
#include "CUnit/TestRun.h"
#include "CUnit/Automated.h"

int initSuite1(void);
int cleanSuite1(void);
void addTest(void);
void removeTest(void);

ArrayList myList;

/* The main() function for setting up and running the tests.
 * Returns a CUE_SUCCESS on successful running, another
 * CUnit error code on failure.
 */

int main(int numParms, char *parms[])
{
    CU_pSuite pSuite = NULL;

    /* initialize the CUnit test registry */
    if (CUE_SUCCESS != CU_initialize_registry())
    {
        return CU_get_error();
    }

    /* add a suite to the registry */
    pSuite = CU_add_suite("Suite_1", initSuite1, cleanSuite1);
    if (NULL == pSuite)
    {
        CU_cleanup_registry();
        return CU_get_error();
    }

    /* add the tests to the suite */
    /* NOTE - ORDER IS IMPORTANT - MUST TEST addTest() BEFORE removeTest() */

    if ((NULL == CU_add_test(pSuite, "test of additions", addTest)) ||
        (NULL == CU_add_test(pSuite, "test of removals", removeTest)))
    {
        CU_cleanup_registry();
        return CU_get_error();
    }

    /* Run all tests using the CUnit Basic interface */
    CU_basic_set_mode(CU_BRM_VERBOSE);
    CU_basic_run_tests();
    CU_cleanup_registry();
    return CU_get_error();
}

// The suite initialization function.

int initSuite1(void)
{
    if ((myList=newList()) == NULL)
    {
        return -1;
    }
    else
    {
        return 0;
    }
}

// The suite cleanup function.

int cleanSuite1(void)
{
    myList = destroyList(myList);
    return 0;
}

// Test adding elements to list

void addTest(void)
{
    CU_ASSERT(myList != NULL);

    addList(myList, 1);
    CU_ASSERT(sizeList(myList)==1);
    CU_ASSERT(getList(myList,0)==1);

    addList(myList, 2);
    CU_ASSERT(sizeList(myList)==2);
    CU_ASSERT(getList(myList,0)==1);
    CU_ASSERT(getList(myList,1)==2);

    addList(myList, 3);
    CU_ASSERT(sizeList(myList)==3);
    CU_ASSERT(getList(myList,0)==1);
    CU_ASSERT(getList(myList,1)==2);
    CU_ASSERT(getList(myList,2)==3);
}

// Test removing elements from list

void removeTest(void)
{
    CU_ASSERT(myList != NULL);

    CU_ASSERT(sizeList(myList)==3);

    removeList(myList, 0);
    CU_ASSERT(sizeList(myList)==2);
    CU_ASSERT(getList(myList,0)==2);
    CU_ASSERT(getList(myList,1)==3);

    removeList(myList, 0);
    CU_ASSERT(sizeList(myList)==1);
    CU_ASSERT(getList(myList,0)==3);

    removeList(myList, 0);
    CU_ASSERT(sizeList(myList)==0);
}

Although the main function is somewhat large and confusing, most of the code is “boilerplate”: code
that can be copied exactly as it is into any other unit test. The main function just registers the setup
function, each unit test, and the cleanup function with CUnit; the framework then calls them in that
order when the tests are run.
To compile the unit test with the associated module, the following statement is used (note the addition
of a library parameter):

sh-3.2$ gcc -ggdb unittest.c module1.c -lcunit_dll -o unittest

When the program is run, the following output is generated:

sh-3.2$ unittest


     CUnit - A Unit testing framework for C - Version 2.1-0
     http://cunit.sourceforge.net/


Suite: Suite_1
  Test: test of additions ... passed
  Test: test of removals ... passed

--Run Summary: Type      Total     Ran  Passed  Failed
               suites        1       1     n/a       0
               tests         2       2       2       0
               asserts      15      15      15       0

As can be seen, CUnit generates more information than we did in the simple unit test, plus CUnit does
not stop as soon as an error is encountered.

7.4 Summary
Writing unit tests is absolutely critical to ensuring that high-quality code is generated. Using a frame-
work such as CUnit makes writing a large number of unit tests (and suites of unit tests) relatively straight-
forward. Do not skimp on unit tests! In recent years, unit tests have taken on a higher profile due to
test-driven development. With TDD, the unit tests are actually written before the basic code instead of
afterwards. While this may seem like a backwards way of writing code, it turns out that writing the unit
tests first provides many benefits that are not immediately obvious.
Appendix A
Best Programming Practices

The most important part of being a good programmer isn’t getting the code to work; that just takes hard
work and practice. There is, however, an art to making code readable by you, your peers, and your
instructor. As your programs grow in complexity and size, this becomes more and more important. The
following are practices that we’ve picked up over time. They don’t make your programs work but they
do make them easier to read, which helps make it easier for you to make your programs work.
The goal of this document is to instil best practices as early as possible. Bad habits are easy to pick
up and the hardest to overcome. Taking these practices to heart now will help you become a better
programmer and reduce the stress and strain of having to change when you enter the workforce.

A.1 Variables and Naming


People tend to think that giving a variable a shorter name makes it easier to code; there’s less to type so
you can code faster. Here are two examples of implementing the multiplication of two fractions:

Good:

    class Fraction {
        int numerator;
        int denominator;
    }

    ...

    Fraction fraction1;
    Fraction fraction2;
    Fraction result;

    ...

    result.numerator   = fraction1.numerator *
                         fraction2.numerator;
    result.denominator = fraction1.denominator *
                         fraction2.denominator;

Bad:

    class F {
        int a, b;
    }

    ...

    F f1, f2, r;

    ...

    r.a = f1.a * f2.a;
    r.b = f1.b * f2.b;

If this were a long program with many functions manipulating the fraction data it would become very
easy to get confused as to what a and b represent. The rules of good naming also apply to other iden-
tifiers in your program (like the names of routines and structures). Different languages have different
naming conventions for their identifiers; you should follow the conventions of the language you are
using.
Note that it is better to declare one variable per line. This makes it easier to tell what variables you have
declared without having to read across a single line. It also makes it easier and more logical to initialize
variables as part of the declaration.

Finally, always declare all of the variables used within a function at the beginning of the function. It is
bad practice to declare variables midstream for a number of reasons. First, when looking at the code
later it’s hard to determine all variables and their types without scanning the entire function. Second,
you have the potential for scoping problems since a variable may mask another variable with the same
name (declared earlier) resulting in potentially erroneous accesses to variables. The lone exception to
this rule is small-scope counters, like those used in for loops (if your language supports it!).
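The masking problem can be shown in a few lines. This is a deliberately contrived sketch; shadowDemo is a hypothetical routine written only to illustrate the point.

```c
#include <assert.h>

/* Returns the outer total after an inner block declares its own total. */
int shadowDemo(void)
{
    int total = 10;

    {
        int total = 99;   /* declared midstream: masks the outer total in this block */

        total++;          /* changes the inner variable only */
        assert(total == 100);
    }

    return total;         /* the outer total is still 10 */
}
```

A reader skimming the inner block could easily believe that the function's total is being updated. Declaring everything once at the top of the routine makes this kind of accidental masking impossible. (GCC and Clang can also report it if you compile with -Wshadow.)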

A.2 White Space


Whitespace is free so don’t be afraid to use it. When memory was a scarce commodity it made sense to
make your programs smaller by not using spaces. Now a few extra bytes don’t matter and make your
code a lot easier to read.
Spacing portions of a statement makes the code easier to read if done wisely. Logical blocks of code
within a function should be separated with a blank line to indicate that a group of instructions work
together to get something done. This also holds true for the variable dictionary; place a blank line
between your variable declarations and the rest of your code. These practices are illustrated in the first
example below.
If there is only one statement following a control statement (such as an if or case) do not put both
statements on a single line. As well as making the code hard to follow (when scanning it looks like you
forgot the statement), doing so makes it hard to add statements at a later date. This practice is illustrated
in the second example below.
The indenting of your code is fundamental to making it readable. Any code within a set of braces or
to be executed as part of a control statement must be indented. This is done to show that the indented
statements belong to a specific block and are executed as a group (e.g. based on the results of the control
statement’s execution). The level of indentation should be between 2 and 4 spaces, with the amount
chosen and used consistently throughout a project. Fewer spaces make it hard to tell that there is any
indentation, and more make the code too offset from the containing block. Note that you should use
spaces (not tabs) to ensure there are no alignment issues when the code is printed (or viewed in different
editors). This practice is illustrated in the third example below.
As an initial example, consider the fraction code above. Note the use of spaces around the assignment
and multiplication operators and how the operands are aligned after the assignment operator. While not
always necessary, it is often useful to break a statement across multiple lines so that it is easily viewable.
This is often done for complex if statements where each logical expression is on an individual line and
aligned to indicate that they belong to the same statement (with the logical operator at the end of each
line).

Here are some more examples:

Good:

    fraction sum;
    fraction right;
    fraction left;
    long divisor;

    // give them common denominators
    left.numerator *= right.denominator;
    right.numerator *= left.denominator;
    sum.denominator = right.denominator *
                      left.denominator;

    // add them
    sum.numerator = left.numerator +
                    right.numerator;

    // reduce the result
    divisor = gcd( sum.numerator, sum.denominator );
    sum.numerator /= divisor;
    sum.denominator /= divisor;

Bad:

    fraction sum;
    fraction right;
    fraction left;
    long divisor;
    left.numerator*=right.denominator;
    right.numerator*=left.denominator;
    sum.denominator=right.denominator*left.denominator;
    sum.numerator=left.numerator+right.numerator;
    divisor=gcd(sum.numerator,sum.denominator);
    sum.numerator/=divisor;
    sum.denominator/=divisor;

---------------------------------------------------

Good:

    if ( value == 0 ) {
        result = true;
    }

    if ( denominator == 0 ) {
        denominator = 1;
    }

Bad:

    if ( value == 0 ) result = true;
    if ( denominator == 0 ) denominator = 1;

---------------------------------------------------

Good:

    done = false;
    x = getNextValue();
    while ( !done ) {
        processX( x );

        x = getNextValue();
        if ( x >= sentinel ) {
            done = true;
        }
    }

Bad:

    done = false;
    x = getNextValue();
    while ( !done ) {
    processX( x );
    x = getNextValue();
    if ( x >= sentinel ) {
    done = true;
    }
    }

Also note the use of braces in if structures, even when they only contain a single statement. Braces
keep the code organized, save you from dangling-else problems, and make it easier to add
additional statements.

A.3 Exit Points


It’s imperative that any block of code only have a single exit point. For something like a for or while
loop this is done through the conditional test at the top of the loop. For a routine (which is just another
block of code) this is done through the return statement. There are a number of methods that can be
used to short circuit this standard flow (each of which is equivalent to the use of a goto). The following
two sections discuss these issues.

A.3.1 Break and Continue


If you’ve never seen these statements I’ll simply say – don’t use them! To skip some amount of code
and go to the next iteration of a loop or to jump out of the loop is simply adding "advanced" goto
statements. Constructs such as for and while were defined so we wouldn’t have to use gotos. I don’t
normally say things like this, but if you use either of these in your assignments (obviously excluding the
switch statement), labs, or the exam you will lose marks.

As an example, consider the following code with a break and a goto. Is there any difference?
Bad:

    for ( i=0 ; i < LIMIT ; i++ ) {
        ...
        if ( found )
            break;
        ...
    }

Also Bad:

    for ( i=0 ; i < LIMIT ; i++ ) {
        ...
        if ( found )
            goto exit_for;
        ...
    }
    exit_for:

How should this code be written? Start by using a while loop that terminates on found being set to true,
and any other necessary conditions.

A.3.2 Switch Statements


Switch statements are the one exception to the rule about using break statements. If several cases have
identical behaviours, then fall-through is permitted, but relying on fall-through so that cases with
similar behaviours can partially share code is not. Always terminate each group of statements with a
break, even the last one.

For example, the following are two similar (though not identical) situations:

Good:

    switch (command) {
        case 'a':
        case 'i':
            value++;
            break;

        case 's':
        case 'd':
            value--;
            break;
    }

Bad:

    switch (command) {
        case 'a':
            printf("adding 1\n");
        case 'i':
            value++;
            break;

        case 's':
            printf("subtracting 1\n");
        case 'd':
            value--;
    }

The “bad” example should be broken up into four separate cases, each terminating with a break. If there
is a substantial amount of code shared between cases, either move it to a routine, or rearrange the logic.
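As a sketch, the “bad” example broken up into four separate cases might look like this. The wrapper routine applyCommand and the default case are additions made so that the fragment can stand alone; they are not part of the original example.

```c
#include <assert.h>
#include <stdio.h>

/* Applies one command to value and returns the new value. */
int applyCommand(char command, int value)
{
    switch (command) {
        case 'a':
            printf("adding 1\n");
            value++;
            break;

        case 'i':
            value++;
            break;

        case 's':
            printf("subtracting 1\n");
            value--;
            break;

        case 'd':
            value--;
            break;

        default:
            /* unknown command: leave value unchanged */
            break;
    }

    return value;
}
```

The duplicated value++ and value-- lines are a small price here; if the shared part were longer, it would be moved into a routine called from both cases.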

A.3.3 Return Statements


A common error is to use return statements anywhere in a function; if you’ve done what needs to be
done then exit the function. The problem with this is that you interrupt the logical flow of the function
and jump to the end of the function, which is equivalent to a goto statement.

The reading of complex functions becomes quite difficult if they can exit at any point in their code (i.e.
they no longer have a logical flow). Instead, reorganize your code such that there is only one return
statement and it is the last statement of the function. Instead of exiting when a condition is reached, set
a variable that can be checked and used as the return value. Here is an example:
Good:

    int processPositive( int value )
    {
        int isPositive = TRUE;

        if ( value < 0 ) {
            isPositive = FALSE;
        } else {
            process( value );
        }

        return isPositive;
    }

Bad:

    int processPositive( int value )
    {
        if ( value < 0 ) {
            return FALSE;
        } else {
            process( value );
            return TRUE;
        }
    }

Obviously, this is a simple example but in functions that are tens or hundreds of lines long it becomes a
real problem.
Note that recursion is a special case. While it is possible to rearrange a recursive function to only have
one exit point it tends to make the code unwieldy.

A.3.4 Brace Placement


There’s nothing that can make a programmer’s blood boil like a discussion on where to put your opening
braces. There are a couple of popular choices, either of which is acceptable for use in this course. In the
“K&R” style used in the optional textbook for this course, opening braces are placed on the same line
as the start of a control structure, and on a line by themselves at the start of a routine. Closing braces
appear on a line by themselves (except before an else clause), and are aligned on the same indent as the
beginning of the control structure. In the “ANSI” style used in the C standard documentation, all braces
appear on a line by themselves, aligned on the same left indent as the control structure.

K&R:

    int factorial( int n )
    {
        int result = 1;
        int i;

        if ( n >= 0 && n <= 12 ) {
            for (i = 1; i <= n; i++) {
                result *= i;
            }
        } else {
            result = -1;
        }

        return result;
    }

ANSI:

    int factorial( int n )
    {
        int result = 1;
        int i;

        if ( n >= 0 && n <= 12 )
        {
            for (i = 1; i <= n; i++)
            {
                result *= i;
            }
        }
        else
        {
            result = -1;
        }

        return result;
    }

In the K&R style, note the spacing before opening braces; do not jam them against other characters.
There are arguments for and against each style. Whichever of the two styles you choose, use it consis-
tently throughout a project.
(Please note that the style used in the Programming Pearls textbook, where the first line of code in a
routine is on the same line as the opening brace, can be awkward for auto-indenting text editors and
should be avoided.)

A.3.5 Functional Decomposition


This is simple and straightforward, when done properly. Every routine you write should have a specific
purpose and shouldn’t deviate from that purpose. For example, if you have a method that computes
the circumference of a circle then that’s all it should do. The method has no “right” to do anything else
(e.g. changing the diameter before computing the circumference).
This concept extends to the development of complex applications. If you are asked to implement a
program that must perform a number of tasks, start by making the tasks routines. This is the first step
toward good software design. It prepares you for the design and implementation of complex modules
(and Abstract Data Types) used by other programs.
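The circle example above might be sketched as follows. The names circumference, circleArea and PI are assumptions made for this illustration.

```c
#include <assert.h>

#define PI 3.14159265358979

/* Computes the circumference and nothing else; the diameter is not modified. */
double circumference( double diameter )
{
    assert( diameter >= 0.0 );

    return PI * diameter;
}

/* A different job gets its own routine rather than being folded into the one above. */
double circleArea( double diameter )
{
    double radius;

    assert( diameter >= 0.0 );
    radius = diameter / 2.0;

    return PI * radius * radius;
}
```

Because each routine does exactly one thing, either one can be tested, reused, or replaced without having to worry about hidden effects on the other.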

A.3.6 Commenting
Well-written code should tell the story of what it’s doing. Choosing appropriate control structures, using
the simplest possible logic, and following the previously discussed practices can help. But when code
isn’t completely self-explanatory, comments can fill in the rest of the story.

A well-documented routine

/**
 * PURPOSE:
 *   Calculate the factorial of an integer n; that is, n!
 * INPUT PARAMETERS:
 *   n: valid range is 0 to 12 (inclusive)
 * OUTPUT PARAMETERS:
 *   returns an integer containing n!
 *   if n is invalid, it will return -1
 */
int factorial( int n )
{
    int result = 1;
    int i;

    // 12! is the maximum that will fit in a 32-bit int
    if ( n >= 0 && n <= 12 ) {
        for (i = 1; i <= n; i++) {
            result *= i;
        }
    } else {
        result = -1;
    }

    return result;
}

One principle to remember for writing good comments is that your comments should primarily an-
swer the question why, rather than how. The answer to “how?” should already be answered by your
code; for example, rather than writing /* this is a quicksort */, move the sort code to a function called
quicksort(). Use comments to justify non-obvious decisions, explain your intent, or in rare cases, to
overtly describe your method (if it is a novel or highly-optimized solution to a problem). For example,
what was the purpose of the number 12 in the factorial routine above? The comment inside the routine
explains it: 12! is the largest factorial that will fit in a 32-bit int.
Some variable declarations require comments. One example of a vital fact about a variable that may
not be mentioned in its name is its unit (e.g. is the value in a variable timer measuring milliseconds
or hours?). You can also comment about ranges of values or encoded meanings; for example: int
comparison; //negative is less than, zero is equal, positive is greater than.
Also, externally-visible functions generally should have a descriptive prologue comment at the top.
The public interface of a module needs to be documented so that it can be used without reading the
underlying code. Work on the assumption that the code may be someday released as a (binary) library,
and the only documentation available will be headers and your comments. Some of the semantics of
each function should already be represented in the function and parameter names. To keep a consistent
style, this information may have to be duplicated in the comments. But the really important information
is what’s not obvious: ranges or encoded meanings of input or output values, side effects, and any other
conditions not inherent in the purpose of the function. For example, in the commenting of factorial
above, knowing the upper bound of the parameter is 12 is not obvious and requires a comment, but the
lower bound being 0 is part of the definition of n! and did not need to be explicitly commented (though
it was just as easy to type it as to leave it out). For the purposes of your assignments in this course, write
these kinds of prologue comments for all your routines, because the marker will need them.
Appendix B
Programming Standards

This document lists the programming standards that you must follow for the programming questions
of your assignments. Failure to follow these standards will result in the loss of marks.

B.1 Commenting Files and Functions


1. Each of your program files must begin with a comment block like the following:

    /**
     * Name of class or program (matches filename)
     *
     * COMP 2160 SECTION Axx
     * INSTRUCTOR Name of your instructor
     * ASSIGNMENT Assignment #, question #
     * AUTHOR your name, your student number
     * DATE date of completion
     *
     * PURPOSE: what is the purpose of your program?
     */

2. If the purpose of a routine is not self-explanatory, write a prologue comment at the beginning,
similar to the following:

    /**
     * PURPOSE: Describe what it does, including side effects.
     * INPUT PARAMETERS:
     *   Describe the parameters that accept data values.
     * OUTPUT PARAMETERS:
     *   Describe the parameters that return data values
     *   (some parameters may be listed under both headings).
     *   Also describe the method return value (if any).
     */

You may omit any part of the prologue comment that does not apply (for example, if the method
has no parameters).

B.2 Writing Readable Code


3. Use blank lines to separate blocks of code and declarations to improve readability. In particular,
use blank lines between declarations and other code, and between routines.
4. Comment blocks of code. Describe why you wrote the code this way, not what each line does.
5. Use meaningful but reasonable variable names.
• a — Very bad. Too short, not meaningful.
• average — Good if there is only one possible average.
• average_mark — Good.
• average_of_all_the_marks_in_the_list — Bad. Too long and wordy.


If a concise variable name does not completely describe the data it stores, add a comment to the
declaration with additional information.
6. Use consistent indentation to clarify control structures (e.g. loops and if constructs). Levels of
indentation should clearly indicate the depth of nesting.
7. Align else with the corresponding if for readability. Common styles include:

    if (...) {
        statement;
        ...
    } else {
        statement;
        ...
    }

or:

    if (...)
    {
        statement;
        ...
    }
    else
    {
        statement;
        ...
    }

Avoid long lines; where needed, continuations of a statement on a new line should be indented too. Any
readable and consistent style is acceptable. The essential features are that all statements that are nested
within another statement must be indented, and that the braces must be in predictable and consistent
positions.

B.3 Writing Maintainable and Extendable Code


8. Avoid the use of literal constants ("magic numbers") in your program. Generally use constant
identifiers rather than literal constants in your program. Acceptable exceptions are strings that
appear only once, and small fundamental values. For example:
• sum = 0; — Literal constant 0 is OK
• count = count + 1; — Literal constant 1 is OK
• price = total * 1.07; — Magic values are not proper. Create a named constant like
PST_RATE = 1.07.
• lastDigit = accountNumber % 10; — Literal constant 10 is OK.
• while (command != 'Q') { ... if (command == 'Q') ... } — Duplication is not proper. Create a
named constant like QUIT_COMMAND = 'Q'.

9. Declare all variables at the beginning of a routine and never within a sub-block.
10. Never change the value of a for loop variable inside the loop.
11. Use the best possible construct. For loops are only used when the number of iterations is known
in advance; use a while or do/while for non-deterministic loops.
12. Avoid duplication of code. Every job, task, or formula should be implemented in one place.
13. Use appropriate language-specific naming standards for C/C++.
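
As an illustration of standard 8 above, the named constants suggested there could be declared with #define, as in the following sketch. The routine names priceWithTax and isQuit are hypothetical; PST_RATE and QUIT_COMMAND use the values given in the list.

```c
#include <assert.h>

#define PST_RATE     1.07   /* multiplier that adds the sales tax to a total */
#define QUIT_COMMAND 'Q'    /* the command that ends the session */

/* Returns the total with tax included. */
double priceWithTax( double total )
{
    return total * PST_RATE;
}

/* Returns 1 if command is the quit command, 0 otherwise. */
int isQuit( char command )
{
    return command == QUIT_COMMAND;
}
```

If the tax rate or the quit key ever changes, only the #define lines need to be edited, and every use of the value stays consistent.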
B.4 Input and Output


14. All output produced on the console must have appropriate titles and headings. Output that is sent
to a file typically does not require titles or headings.
15. Print console output neatly. Use tabular output where appropriate.
16. Print a message on the console at the end of the program that indicates whether or not the pro-
gram completed successfully; e.g. End of processing. This message should be printed by the last
statement in your main program.