Learning C With GDB: Log in
Learning C With GDB: Log in
com/blog/5-learning-c-with-gdb
Log in
Coming from a background in higher-level languages like Ruby, Scheme, or Haskell, learning C can be
challenging. In addition to having to wrestle with C’s lower-level features like manual memory
management and pointers, you have to make do without a REPL. Once you get used to exploratory
programming in a REPL, having to deal with the write-compile-run loop is a bit of a bummer.
It occurred to me recently that I could use gdb as a pseudo-REPL for C. I’ve been experimenting with
using gdb as a tool for learning C, rather than merely debugging C, and it’s a lot of fun.
My goal in this post is to show you that gdb is a great tool for learning C. I’ll introduce you to a few of
my favorite gdb commands, and then I’ll demonstrate how you can use gdb to understand a
notoriously tricky part of C: the difference between arrays and pointers.
An introduction to gdb
Start by creating the following little C program, minimal.c:
int main()
{
int i = 1337;
return 0;
}
Note that the program does nothing and has not a single printf statement.1 Behold the brave new world
of learning C with gdb!
Compile it with the -g flag so that gdb has debug information to work with, and then feed it to gdb:
You should now find yourself at a rather stark gdb prompt. I promised you a REPL, so here goes:
1 of 7 7/15/20, 11:35 PM
Learning C with gdb - Blog - Recurse Center https://www.recurse.com/blog/5-learning-c-with-gdb
(gdb) print 1 + 2
$1 = 3
Amazing! print is a built-in gdb command that prints the evaluation of a C expression. If you’re unsure
of what a gdb command does, try running help name-of-the-command at the gdb prompt.
I’m going to ignore why 2147483648 == -2147483648; the point is that even arithmetic can be tricky in C,
and gdb understands C arithmetic.
Let’s now set a breakpoint in the main function and start the program:
The program is now paused on line 3, just before i gets initialized.Interestingly, even though i hasn’t
been initialized yet, we can still lookat its value using the print commnd:
(gdb) print i
$3 = 32767
In C, the value of an uninitialized local variable is undefined, so gdb might print something different
for you!
(gdb) next
(gdb) print i
$4 = 1337
2. The size of the chunk, measured in bytes. The size of a variable’s chunk is determined by the
variable’s type.
One of the distinctive features of C is that you have direct access to a variable’s chunk of memory. The
& operator computes a variable’s address, and the sizeof operator computes a variable’s size in memory.
2 of 7 7/15/20, 11:35 PM
Learning C with gdb - Blog - Recurse Center https://www.recurse.com/blog/5-learning-c-with-gdb
In words, this says that i‘s chunk of memory starts at address 0x7fff5fbff5b4 and takes up four bytes of
memory.
I mentioned above that a variable’s size in memory is determined by its type, and indeed, the sizeof
operator can operate directly on types:
This means that, on my machine at least, int variables take up fourbytes of space and double variables
take up eight.
Gdb comes with a powerful tool for directly examing memory: the x command. The x command
examines memory, starting at a particular address. It comes with a number of formatting commands
that provide precise control over how many bytes you’d like to examine and how you’d like to print
them; when in doubt, try running help x at the gdb prompt.
The & operator computes a variable’s address, so that means we can feed &i to x and thereby take a look
at the raw bytes underlying i’s value:
The flags indicate that I want to examine 4 values, formatted as hex numerals, one byte at a time. I’ve
chosen to examine four bytes because i’s size in memory is four bytes; the printout shows i’s raw byte-
by-byte representation in memory.
One subtlety to bear in mind with raw byte-by-byte examinations is that on Intel machines, bytes are
stored in “little-endian” order: unlike human notation, the least significant bytes of a number come
first in memory.
One way to clarify the issue would be to give i a more interesting value and then re-examine its chunk
of memory:
(gdb) ptype i
type = int
(gdb) ptype &i
type = int *
(gdb) ptype main
type = int (void)
Types in C can get complex but ptype allows you to explore them interactively.
3 of 7 7/15/20, 11:35 PM
Learning C with gdb - Blog - Recurse Center https://www.recurse.com/blog/5-learning-c-with-gdb
int main()
{
int a[] = {1,2,3};
return 0;
}
Compile it with the -g flag, run it in gdb, then next over the initialization line:
At this point you should be able to print the contents of a and examine its type:
(gdb) print a
$1 = {1, 2, 3}
(gdb) ptype a
type = int [3]
Now that our program is set up correctly in gdb, the first thing we should do is use x to see what a
looks like under the hood:
This means that a’s chunk of memory starts at address 0x7fff5fbff5dc. The first four bytes store a[0], the
next four store a[1], and the final four store a[2]. Indeed, you can check that sizeof knows that a’s size
in memory is twelve bytes:
At this point, arrays seem to be quite array-like. They have their own array-like types and store their
members in a contiguous chunk of memory. However, in certain situations, arrays act a lot like
pointers! For instance, we can do pointer arithmetic on a:
= preserve do
:escaped
(gdb) print a + 1
$3 = (int *) 0x7fff5fbff570
In words, this says that a + 1 is a pointer to an int and holds the address 0x7fff5fbff570. At this point
you should be reflexively passing pointers to the x command, so let’s see what happens:
= preserve do
4 of 7 7/15/20, 11:35 PM
Learning C with gdb - Blog - Recurse Center https://www.recurse.com/blog/5-learning-c-with-gdb
:escaped
(gdb) x/4xb a + 1
0x7fff5fbff570: 0x02 0x00 0x00 0x00
Note that 0x7fff5fbff570 is four more than 0x7fff5fbff56c, the address of a’s first byte in memory. Given
that int values take up four bytes, this means that a + 1 points to a[1].
In fact, array indexing in C is syntactic sugar for pointer arithmetic: a[i] is equivalent to *(a + i). You
can try this in gdb:
= preserve do
:escaped
(gdb) print a[0]
$4 = 1
(gdb) print *(a + 0)
$5 = 1
(gdb) print a[1]
$6 = 2
(gdb) print *(a + 1)
$7 = 2
(gdb) print a[2]
$8 = 3
(gdb) print *(a + 2)
$9 = 3
We’ve seen that in some situations a acts like an array and in others it acts like a pointer to its first
element. What’s going on?
The answer is that when an array name is used in a C expression, it “decays” to a pointer to the array’s
first element. There are only two exceptions to this rule: when the array name is passed to sizeof and
when the array name is passed to the & operator.2
The fact that a doesn’t decay to a pointer when passed to the & operator brings up an interesting
question: is there a difference between the pointer that a decays to and &a?
= preserve do
:escaped
(gdb) x/4xb a
0x7fff5fbff56c: 0x01 0x00 0x00 0x00
(gdb) x/4xb &a
0x7fff5fbff56c: 0x01 0x00 0x00 0x00
However, their types are different. We’ve already seen that the decayed value of a is a pointer to a’s
first element; this must have type int *. As for the type of &a, we can ask gdb directly:
= preserve do
:escaped
(gdb) ptype &a
type = int (*)[3]
In words, &a is a pointer to an array of three integers. This makes sense: a doesn’t decay when passed to
&, and a has type int [3].
You can observe the distinction between a’s decayed value and &a by checking how they behave with
5 of 7 7/15/20, 11:35 PM
Learning C with gdb - Blog - Recurse Center https://www.recurse.com/blog/5-learning-c-with-gdb
= preserve do
:escaped
(gdb) print a + 1
$10 = (int *) 0x7fff5fbff570
(gdb) print &a + 1
$11 = (int (*)[3]) 0x7fff5fbff578
Note that adding 1 to a adds four to a’s address, whereas adding 1 to &a adds twelve!
= preserve do
:escaped
(gdb) print &a[0]
$11 = (int *) 0x7fff5fbff56c
Conclusion
Hopefully I’ve convinced you that gdb a neat exploratory environment for learning C. You can print
the evaluation of expressions, examine raw bytes in memory, and tinker with the type system using
ptype.
If you’d like to experiment further with using gdb to learn C, I have a few suggestions:
2. Investigate how structs are stored in memory. How do they compare to arrays?
3. Use gdb’s disassemble command to learn assembly programming! A particularly fun exercise is to
investigate how the function call stack works.
4. Check out gdb’s “tui” mode, which provides a grahical ncurses layer on top of regular gdb. On
OS X, you’ll likely need to install gdb from source.
Alan is a facilitator at Hacker School. He’d like to thank David Albert, Tom Ballinger, Nicholas
Bergson-Shilcock, and Amy Dyer for helpful feedback.
Curious about Hacker School? Read about us and apply to our fall batch.
1. Depending on how aggressive your C compiler is about optimizing useless code, you may have
to make it do something :) I tried running these examples on my Raspberry Pi and everything
got compiled away.↩
2. Formally, an array name is a “non-modifiable lvalue”. When used in an expression where an
rvalue is required, an array name decays to a pointer to its first element. As for the exceptions,
the & operator requires an lvalue and sizeof is just weird. ↩
Recent posts
6 of 7 7/15/20, 11:35 PM
Learning C with gdb - Blog - Recurse Center https://www.recurse.com/blog/5-learning-c-with-gdb
7 of 7 7/15/20, 11:35 PM