
Carnegie Mellon

C Boot Camp

Feb 26, 2017

Ray
Axel
Jerry

Agenda

■ C Basics
■ Debugging Tools / Demo
■ Appendix
C Standard Library
getopt
stdio.h
stdlib.h
string.h

C Basics Handout

ssh <andrewid>@shark.ics.cs.cmu.edu
cd ~/private
wget http://cs.cmu.edu/~213/activities/cbootcamp.tar.gz
tar xvpf cbootcamp.tar.gz
cd cbootcamp
make

■ Contains useful, self-contained C examples


■ Slides relating to these examples will have the file
names in the top-right corner!

C Basics
■ The minimum you must know to do well in this class
■ You have seen these concepts before
■ Make sure you remember them.

■ Summary:
■ Pointers/Arrays/Structs/Casting
■ Memory Management
■ Function pointers/Generic Types
■ Strings
■ GrabBag (Macros, typedefs, header guards/files, etc)

Pointers
■ Stores address of a value in memory
■ e.g. int*, char*, int**, etc
■ Access the value by dereferencing (e.g. *a).
Can be used to read or write the value at the given address
■ Dereferencing NULL causes undefined behavior
(usually a segfault)
■ Pointer to type A references a block of sizeof(A) bytes
■ Get the address of a value in memory with the ‘&’
operator
■ Pointers can alias each other, i.e. point to the same address
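
The bullets above can be sketched in a few lines. This is only an illustration: set_through and alias_demo are hypothetical names, not files from the handout.

```c
#include <stddef.h>

/* Hypothetical helper: writes value through p unless p is NULL. */
int set_through(int *p, int value)
{
    if (p == NULL) {
        return -1;      /* never dereference NULL */
    }
    *p = value;         /* dereference to write */
    return 0;
}

/* Hypothetical helper: two pointers aliasing the same int. */
int alias_demo(void)
{
    int v = 1;
    int *a = &v;        /* & takes the address of v */
    int *b = a;         /* b now aliases a */
    *b = 7;             /* a write through b ... */
    return *a;          /* ... is visible through a, so this returns 7 */
}
```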

Call by Value vs Call by Reference ./passing_args


■ Call-by-value: Changes made to arguments passed to a function
aren’t reflected in the calling function
■ Call-by-reference: Changes made to arguments passed to a
function are reflected in the calling function
■ C is a call-by-value language
■ To cause changes to values outside the function, use pointers
■ Do not assign the pointer to a different value (that won’t be reflected!)
■ Instead, dereference the pointer and assign a value to that address

void swap(int* a, int* b) {
    int temp = *a;
    *a = *b;
    *b = temp;
}

int x = 42;
int y = 54;
swap(&x, &y);
printf("%d\n", x); // 54
printf("%d\n", y); // 42
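
A small sketch of the last two bullets, contrasting reassigning the pointer parameter with writing through it. The function names reassign and write_through are made up for this illustration.

```c
/* Reassigns only the local copy of the pointer; the caller sees no change. */
void reassign(int *p, int *other)
{
    p = other;
    (void)p;            /* p is a local copy and is discarded on return */
}

/* Writes to the address the caller passed in; the caller sees the change. */
void write_through(int *p, int value)
{
    *p = value;
}
```

Calling reassign(&x, &y) leaves x untouched, while write_through(&x, 9) sets x to 9.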

Pointer Arithmetic ./pointer_arith


■ Can add/subtract from an address to get a new address
■ Only perform when absolutely necessary (e.g., malloclab)
■ Result depends on the pointer type

■ A+i, where A is a pointer = 0x100, i is an int (sizes below assume 4-byte int and 8-byte pointers, as on x86-64)


■ int* A: A+i = 0x100 + sizeof(int) * i = 0x100 + 4 * i
■ char* A: A+i = 0x100 + sizeof(char) * i = 0x100 + 1 * i
■ int** A: A+i = 0x100 + sizeof(int*) * i = 0x100 + 8 * i

■ Rule of thumb: explicitly cast pointer to avoid confusion


■ Prefer ((char*)(A) + i) to (A + i), even if A has type char*
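
The scaling rule can be checked with a short sketch that measures how many bytes A + i moves for different pointer types. The helper names are hypothetical, and i is assumed to keep the result inside the same array.

```c
#include <stddef.h>

/* Byte distance between A and A + i when A is an int*. */
size_t int_step_bytes(int *A, int i)
{
    return (size_t)((char *)(A + i) - (char *)A);    /* sizeof(int) * i */
}

/* Byte distance after an explicit char* cast: always exactly i bytes. */
size_t char_step_bytes(void *A, int i)
{
    return (size_t)(((char *)A + i) - (char *)A);    /* 1 * i */
}
```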

Structs ./structs
■ Collection of values placed under one name in a single
block of memory
■ Can put structs, arrays in other structs
■ Given a struct instance, access the fields using the ‘.’
operator
■ Given a struct pointer, access the fields using the ‘->’
operator
struct inner_s {
    int i;
    char c;
};

struct outer_s {
    char ar[10];
    struct inner_s in;
};

struct outer_s out_inst;                 // no typedef, so "struct" is required
out_inst.ar[0] = 'a';
out_inst.in.i = 42;
struct outer_s* out_ptr = &out_inst;
out_ptr->in.c = 'b';

Arrays/Strings
■ Arrays: fixed-size collection of elements of the same type
■ Can allocate on the stack or on the heap
■ int A[10]; // A is an array of 10 ints on the stack
■ int* A = calloc(10, sizeof(int)); // A points to an array of 10 ints on the heap

■ Strings: Null-character (‘\0’) terminated character arrays


■ Null-character tells us where the string ends
■ All standard C library functions on strings assume null-termination.
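
As a rough illustration of both bullets, the sketch below fills a stack array and makes a heap copy of a string, reserving one extra byte for the '\0'. sum_stack_array and dup_string are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Fills a 10-element stack array with 0..9 and returns the sum (45). */
int sum_stack_array(void)
{
    int A[10];
    int sum = 0;
    for (int i = 0; i < 10; i++) {
        A[i] = i;
        sum += A[i];
    }
    return sum;
}

/* Returns a heap copy of s, or NULL if allocation fails. Caller must free it. */
char *dup_string(const char *s)
{
    size_t len = strlen(s);          /* characters before the '\0' */
    char *copy = malloc(len + 1);    /* +1 for the terminator */
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, s, len + 1);        /* copies the '\0' as well */
    return copy;
}
```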

Casting
■ Can convert a variable to a different type
■ Integer Casting:
■ Signed <-> Unsigned: Keep Bits - Re-Interpret
■ Small -> Large: Sign-extend for signed types, zero-extend for unsigned types
■ Cautions:
■ Cast Explicitly: int x = (int) y instead of int x = y
■ Casting Down: Truncates data
■ Cast Up: Casting a pointer to a larger type and dereferencing it accesses
memory beyond the original object (undefined behavior)

■ Rules for Casting Between Integer Types
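
The integer rules above can be seen with fixed-width types from stdint.h. This is a sketch with made-up function names; the fixed-width types are used only so the sizes are unambiguous.

```c
#include <stdint.h>

/* Signed -> unsigned of the same width: bits kept, value reinterpreted. */
uint32_t reinterpret_as_unsigned(int32_t v) { return (uint32_t)v; }

/* Small -> large, signed source: the sign bit is extended. */
int64_t widen_signed(int32_t v) { return (int64_t)v; }

/* Small -> large, unsigned source: upper bits are filled with zeros. */
uint64_t widen_unsigned(uint32_t v) { return (uint64_t)v; }

/* Large -> small, unsigned: only the low 8 bits survive. */
uint8_t truncate_to_byte(uint32_t v) { return (uint8_t)v; }
```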



Malloc, Free, Calloc


■ Handle dynamic memory allocation on HEAP
■ void* malloc (size_t size):
■ allocate block of memory of size bytes
■ does not initialize memory
■ void* calloc (size_t num, size_t size):
■ allocate block of memory for array of num elements, each size bytes long
■ initializes memory to zero
■ void free(void* ptr):
■ frees the memory block pointed to by ptr, previously allocated by malloc,
calloc, or realloc
■ use exactly once for each pointer you allocate
■ size argument:
■ should be computed using the sizeof operator
■ sizeof: takes a type (or an expression) and gives you its size in bytes
■ e.g., sizeof(int), sizeof(int*)
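
A minimal sketch of the allocation calls above, with the size computed by sizeof and the result checked for NULL. make_zeroed and make_filled are hypothetical helpers; the caller is expected to free each returned block exactly once.

```c
#include <stdlib.h>

/* calloc: n ints, all initialized to zero. Returns NULL on failure. */
int *make_zeroed(size_t n)
{
    return calloc(n, sizeof(int));
}

/* malloc: n ints, uninitialized until we fill them. Returns NULL on failure. */
int *make_filled(size_t n, int value)
{
    int *a = malloc(n * sizeof *a);
    if (a == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        a[i] = value;
    }
    return a;
}
```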

Memory Management Rules mem_mgmt.c ./mem_valgrind.sh
■ malloc what you free, free what you malloc
■ client should free memory allocated by client code
■ library should free memory allocated by library code
■ Number mallocs = Number frees
■ Number mallocs > Number Frees: definitely a memory leak
■ Number mallocs < Number Frees: definitely a double free
■ Free a malloc’ed block exactly once
■ Should not dereference a freed memory block
■ Only malloc when necessary
■ Persistent, variable sized data structures
■ Concurrent accesses (we’ll get there later in the semester)

Stack vs Heap vs Data


■ Local variables and function arguments are placed on the
stack
■ deallocated after the variable leaves scope
■ do not return a pointer to a stack-allocated variable!
■ do not reference the address of a variable outside its scope!
■ Memory blocks allocated by calls to malloc/calloc are
placed on the heap
■ Globals, constants are placed in data section
■ Example:
■ // a is a pointer on the stack to a memory block on the heap
■ int* a = malloc(sizeof(int));
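
To make the three regions concrete, here is a sketch with one global, one local pointer, and one heap block. The names global_count and make_int are assumptions for illustration only.

```c
#include <stdlib.h>

int global_count = 0;       /* global: lives in the data section */

/*
 * Wrong (left as a comment on purpose): returns the address of a local
 * that is gone once the function returns.
 *
 *     int *bad(void) { int local = 5; return &local; }
 */

/* Returns a pointer to a heap int holding value, or NULL on failure. */
int *make_int(int value)
{
    int *p = malloc(sizeof *p);   /* p is on the stack, *p is on the heap */
    if (p != NULL) {
        *p = value;
        global_count++;           /* the global outlives every call */
    }
    return p;                     /* safe: the heap block outlives the call */
}
```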

Typedefs ./typedefs
■ Creates an alias type name for a different type
■ Useful to simplify names of complex data types
■ Be careful when typedef-ing away pointers!
struct list_node {
int x;
};

typedef int pixel;


typedef struct list_node* node;
typedef int (*cmp)(int e1, int e2); // you won’t use this in 213

pixel x; // int type


node foo; // struct list_node* type
cmp int_cmp; // int (*)(int e1, int e2) type
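
A short sketch of how a function-pointer typedef like cmp might be used. The comparator and pick_first functions are invented for this example.

```c
typedef int (*cmp)(int e1, int e2);

/* Returns negative, zero, or positive, like strcmp. */
int ascending(int e1, int e2)
{
    return (e1 > e2) - (e1 < e2);
}

int descending(int e1, int e2)
{
    return ascending(e2, e1);
}

/* Returns whichever of a and b comes first under the given ordering. */
int pick_first(int a, int b, cmp order)
{
    return order(a, b) <= 0 ? a : b;
}
```

The typedef keeps the parameter list of pick_first readable; without it the third parameter would be written int (*order)(int, int).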

Macros ./macros
■ A way to replace a name with its macro definition
■ No function call overhead, type neutral
■ Think “find and replace” like in a text editor
■ Uses:
■ defining constants (INT_MAX, ARRAY_SIZE)
■ defining simple operations (MAX(a, b))
■ 122-style contracts (REQUIRES, ENSURES)
■ Warnings:
■ Use parentheses around arguments/expressions, to avoid problems after
substitution
■ Do not pass expressions with side effects as arguments to macros

#define INT_MAX 0x7FFFFFFF


#define MAX(A, B) ((A) > (B) ? (A) : (B))
#define REQUIRES(COND) assert(COND)
#define WORD_SIZE 4
#define NEXT_WORD(a) ((char*)(a) + WORD_SIZE)
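
The two warnings can be demonstrated with a sketch. BAD_DOUBLE, DOUBLE, and the wrapper functions are hypothetical; MAX is the macro from the slide.

```c
#define MAX(A, B) ((A) > (B) ? (A) : (B))
#define BAD_DOUBLE(x) x * 2          /* missing parentheses */
#define DOUBLE(x) ((x) * 2)

/* Expands to a + b * 2, which is not (a + b) * 2. */
int bad_double_sum(int a, int b) { return BAD_DOUBLE(a + b); }

/* Expands to ((a + b) * 2), as intended. */
int good_double_sum(int a, int b) { return DOUBLE(a + b); }

/* The argument (*i)++ appears twice in the expansion, so when *i is
   larger than limit it is incremented twice. */
int max_with_side_effect(int *i, int limit) { return MAX((*i)++, limit); }
```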

Generic Types
■ void* type is C’s provision for generic types
■ Raw pointer to some memory location (unknown type)
■ Can’t dereference a void* (what is type void?)
■ Must cast void* to another type in order to dereference it
■ Can cast back and forth between void* and other pointer
types
// stack implementation:
typedef void* elem;
stack stack_new();
void push(stack S, elem e);
elem pop(stack S);

// stack usage:
int x = 42; int y = 54;
stack S = stack_new();
push(S, &x);
push(S, &y);
int a = *(int*)pop(S);   // 54 (last pushed, first popped)
int b = *(int*)pop(S);   // 42
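
One possible way to implement the stack interface above is a linked list of void* elements. This is a sketch, not the handout's implementation: the struct names are made up, and push returns an int here (unlike the slide's void) so that allocation failure can be reported.

```c
#include <stdlib.h>

typedef void *elem;

struct stack_node {
    elem data;
    struct stack_node *next;
};

struct stack_header {
    struct stack_node *top;
};
typedef struct stack_header *stack;

/* Returns an empty stack, or NULL if allocation fails. */
stack stack_new(void)
{
    stack S = malloc(sizeof *S);
    if (S != NULL) {
        S->top = NULL;
    }
    return S;
}

/* Returns 0 on success, -1 if allocation fails. */
int push(stack S, elem e)
{
    struct stack_node *n = malloc(sizeof *n);
    if (n == NULL) {
        return -1;
    }
    n->data = e;
    n->next = S->top;
    S->top = n;
    return 0;
}

/* Returns the most recently pushed element, or NULL if the stack is empty. */
elem pop(stack S)
{
    struct stack_node *n = S->top;
    if (n == NULL) {
        return NULL;
    }
    elem e = n->data;
    S->top = n->next;
    free(n);
    return e;
}
```

The stack stores only pointers, so the caller must keep the pointed-to values alive and cast each popped elem back to the right type.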

Header Files
■ Includes C declarations and macro definitions to be shared
across multiple files
■ Only include function prototypes/macros; implementation code goes in .c file!
■ Usage: #include <header.h>
■ #include <lib> for standard libraries (eg #include <string.h>)
■ #include "file" for your source files (eg #include "header.h")
■ Never include .c files (bad practice)

// list.h
struct list_node {
    int data;
    struct list_node* next;
};
typedef struct list_node* node;

node new_list();
void add_node(int e, node l);

// list.c
#include "list.h"

node new_list() {
    // implementation
}

void add_node(int e, node l) {
    // implementation
}

// stacks.h
#include "list.h"
struct stack_head {
    node top;
    node bottom;
};
typedef struct stack_head* stack;

stack new_stack();
void push(int e, stack S);

Header Guards
■ Double-inclusion problem: include same header file twice
//grandfather.h

//father.h
#include "grandfather.h"

//child.h
#include "father.h"
#include "grandfather.h"

Error: child.h includes grandfather.h twice

■ Solution: header guard ensures single inclusion

//grandfather.h
#ifndef GRANDFATHER_H
#define GRANDFATHER_H

#endif

//father.h
#ifndef FATHER_H
#define FATHER_H
#include "grandfather.h"
#endif

//child.h
#include "father.h"
#include "grandfather.h"

Okay: child.h only includes grandfather.h once



Debugging
GDB, Valgrind

GDB
■ No longer stepping through assembly!
Some GDB commands are different:
■ si / ni → step / next
■ break file.c:line_num
■ disas → list
■ print <any_var_name> (in current frame)

■ Use TUI mode (layout src)


■ Nice display for viewing source/executing
commands
■ Buggy, so only use TUI mode to step
through lines (no continue / finish)

Valgrind
■ Find memory errors, detect memory leaks
■ Common errors:
■ Illegal read/write errors
■ Use of uninitialized values
■ Illegal frees
■ Overlapping source/destination addresses
■ Typical solutions
■ Did you allocate enough memory?
■ Did you accidentally free stack
variables/something twice?
■ Did you initialize all your variables?
■ Did you use something that you just free’d?
■ --leak-check=full
■ Memcheck gives details for each
definitely/possibly lost memory block (where it
was allocated)

Appendix

C Program Memory Layout



Variable Declarations & Qualifiers


■ Global Variables:
■ Defined outside functions, seen by all files
■ Use “extern” keyword to use a global variable defined in another file
■ Const Variables:
■ For variables that won’t change
■ Global const data is typically stored in the read-only data section
■ Static Variables:
■ For locals, keeps value between invocations
■ USE SPARINGLY
■ Note: static has a different meaning when referring to functions
■ Volatile Variables:
■ Compiler will not make assumptions about current value, useful for
asynchronous reads/writes, e.g. interrupts
■ “volatile” == “subject to change at any time”
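
The static-local bullet is easy to see in a sketch: the variable is initialized once and keeps its value across calls. next_id is a hypothetical function name.

```c
/* Returns 1 on the first call, 2 on the second, and so on. */
int next_id(void)
{
    static int counter = 0;   /* initialized once, persists between calls */
    counter++;
    return counter;
}
```

Because counter is shared by every caller, functions like this are not safe to call from several threads at once, which is one reason to use static locals sparingly.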

C Libraries

string.h: Common String/Array Methods


■ One of the most useful libraries available to
you
■ Used heavily in shell/proxy labs
■ Important usage details regarding
arguments:
■ prefixes: str -> strings, mem -> arbitrary
memory blocks.
■ ensure that all strings are ‘\0’ terminated!
■ ensure that dest is large enough to store src!
■ ensure that src actually contains n bytes!
■ ensure that src/dest don’t overlap!

string.h: Common String/Array Methods


■ Copying:
■ void *memcpy (void *dest, const void *src, size_t n): copy n bytes of
src into dest, return dest. src and dest must not overlap.
■ char *strcpy(char *dest, const char *src): copy src string into dest,
return dest. Make sure dest is large enough to contain src.
■ Concatenation:
■ char *strncat (char *dest, const char *src, size_t n): append at most
n characters of src to the end of dest, then a '\0'; return dest
■ char *strcat (char *dest, const char *src) works for arbitrary length
strings, but has the safety issues you’ve seen in attacklab
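
One way to avoid the overflow problem is to check the remaining capacity before appending. The sketch below assumes a caller-supplied capacity cap; append is a hypothetical helper, not a standard function.

```c
#include <string.h>

/* Appends src to dest, where dest has room for cap bytes in total.
   Returns 0 on success, or -1 (leaving dest unchanged) if it would not fit. */
int append(char *dest, size_t cap, const char *src)
{
    size_t used = strlen(dest);
    size_t need = strlen(src);
    if (used + need + 1 > cap) {         /* +1 for the '\0' */
        return -1;
    }
    strncat(dest, src, cap - used - 1);  /* bound is the free space left */
    return 0;
}
```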

string.h: Common String/Array Methods (Continued)


■ Comparison:
■ int strncmp (const char *str1, const char *str2, size_t n): compare at
most n bytes of str1, str2 character by character (by character value,
stopping at the first difference or '\0'), return comparison result
str1 < str2: negative value,
str1 == str2: 0,
str1 > str2: positive value
■ int strcmp(const char *str1, const char *str2): compare str1 to str2. Make sure
both strings are null-terminated so the comparison stops safely.
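
Since the C standard only promises the sign of the result, code should test for < 0, == 0, or > 0 rather than for -1 or 1. A sketch with hypothetical helper names:

```c
#include <string.h>

/* Normalizes strcmp's result to exactly -1, 0, or 1. */
int compare_sign(const char *a, const char *b)
{
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);
}

/* Returns 1 if s starts with prefix, else 0. */
int has_prefix(const char *s, const char *prefix)
{
    return strncmp(s, prefix, strlen(prefix)) == 0;
}
```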

string.h: Common String/Array Methods (Continued)


■ Searching:
■ char *strstr (const char *str1, const char *str2): return pointer to
first occurrence of str2 in str1, else NULL
■ char *strtok (char *str, const char *delimiters): tokenize
str according to delimiter characters provided in delimiters.
Returns one token per call: pass str on the first call and NULL on
later calls; returns NULL when no tokens remain. Modifies str.
■ Other:
■ size_t strlen (const char *str): returns length of the
string (up to, but not including the ‘\0’ character)
■ void *memset (void *ptr, int val, size_t n): set first n
bytes of memory block addressed by ptr to val
For setting bytes only. Don’t use it to set int arrays to a nonzero
value, for example (zeroing them is fine).
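
The strtok calling pattern and a safe use of memset can be sketched as follows. count_tokens and zero_ints are hypothetical names; note that strtok writes '\0' characters into the buffer it is given, so it must not be passed a string literal.

```c
#include <string.h>

/* Counts the tokens in line, splitting on any character in delims.
   Modifies line in place. */
size_t count_tokens(char *line, const char *delims)
{
    size_t count = 0;
    char *tok = strtok(line, delims);    /* first call: pass the string */
    while (tok != NULL) {
        count++;
        tok = strtok(NULL, delims);      /* later calls: pass NULL */
    }
    return count;
}

/* Zeroing is the one value memset sets correctly for multi-byte ints. */
void zero_ints(int *a, size_t n)
{
    memset(a, 0, n * sizeof *a);
}
```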

stdlib.h: General Purpose Functions


■ Dynamic memory allocation:
■ malloc, calloc, free
■ String conversion:
■ int atoi(const char *str) : parse string into integral value (returns 0 if not parsed, so a failed parse looks the same as the input 0)
■ System Calls:
■ void exit(int status) : terminate calling process, return status to parent process
■ void abort() : aborts process abnormally
■ Searching/Sorting:
■ provide array, array size, element size, comparator (function pointer)
■ bsearch: returns pointer to matching element in the array
■ qsort: sorts the array destructively
■ Integer arithmetic:
■ int abs(int n) : returns absolute value of n
■ Types:
■ size_t: unsigned integral type (store size of any object)
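
A minimal sketch of the searching/sorting calls, using a comparator passed as a function pointer. cmp_int, sort_ints, and find_int are illustrative names; bsearch requires the array to already be sorted with the same comparator.

```c
#include <stdlib.h>

/* Comparator: negative, zero, or positive, as qsort and bsearch expect. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);    /* avoids the overflow of x - y */
}

/* Sorts a in place (destructively), in ascending order. */
void sort_ints(int *a, size_t n)
{
    qsort(a, n, sizeof *a, cmp_int);
}

/* Returns a pointer to key inside the sorted array a, or NULL if absent. */
int *find_int(int *a, size_t n, int key)
{
    return bsearch(&key, a, n, sizeof *a, cmp_int);
}
```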

stdio.h
■ Another really useful
library.
■ Used heavily in
cache/shell/proxy labs
■ Used for:
■ argument parsing
■ file handling
■ input/output
■ printf, a fan favorite, comes
from this library!

stdio.h: Common I/O Methods


■ FILE *fopen (const char *filename, const char *mode): open the file with
specified filename in specified mode (read, write, append, etc), associate
it with stream identified by returned file pointer (NULL on failure)
■ int fscanf (FILE *stream, const char *format, ...): read data
from the stream, store it according to the parameter format at the
memory locations pointed at by additional arguments.
■ int fclose (FILE *stream): close the file associated with stream
■ int fprintf (FILE *stream, const char *format, ... ): write the
C string pointed at by format to the stream, using any additional
arguments to fill in format specifiers.
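
The stream functions can be exercised together in a short round trip. This sketch uses the standard tmpfile() in place of fopen so that no file name has to be assumed; round_trip is a hypothetical helper.

```c
#include <stdio.h>

/* Writes value to a temporary file and reads it back into *out.
   Returns 0 on success, -1 on any I/O failure. */
int round_trip(int value, int *out)
{
    FILE *fp = tmpfile();            /* temporary stream, removed on close */
    if (fp == NULL) {
        return -1;
    }
    int ok = fprintf(fp, "%d\n", value) > 0;
    rewind(fp);                      /* go back to the start before reading */
    ok = ok && fscanf(fp, "%d", out) == 1;
    if (fclose(fp) != 0) {
        ok = 0;
    }
    return ok ? 0 : -1;
}
```

With a named file the shape is the same: fopen returns NULL on failure, and fscanf's return value is the number of items it managed to read.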

Getopt
■ Need to include unistd.h to use
■ Used to parse command-line arguments.
■ Typically called in a loop to retrieve arguments
■ Switch statement used to handle options
■ colon indicates required argument
■ optarg is set to value of option argument
■ Returns -1 when no more arguments present
■ See recitation 6 slides for more examples

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int opt, x = 0;
    /* looping over arguments; getopt returns -1 when none are left */
    while ((opt = getopt(argc, argv, "x:")) != -1) {
        switch (opt) {
        case 'x':
            x = atoi(optarg);   /* optarg points at the argument of -x */
            break;
        default:
            printf("wrong argument\n");
            break;
        }
    }
    printf("x = %d\n", x);
    return 0;
}

Note about Library Functions


■ These functions can return error codes
■ malloc could fail
■ int *x;
if ((x = malloc(sizeof(int))) == NULL)
printf("Malloc failed!!!\n");
■ a file couldn’t be opened
■ a string may be incorrectly parsed
■ Remember to check for the error cases and handle the
errors accordingly
■ may have to terminate the program (eg malloc fails)
■ may be able to recover (user entered bad input)
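
For the "string may be incorrectly parsed" case, one recoverable approach is to use strtol instead of atoi, since strtol reports where parsing stopped. parse_int is a hypothetical wrapper, sketched under the assumption that the whole string must be a decimal integer.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parses s as a decimal int into *out.
   Returns 0 on success, -1 on empty input, trailing junk, or overflow. */
int parse_int(const char *s, int *out)
{
    char *end;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (end == s || *end != '\0') {
        return -1;                   /* no digits, or junk after them */
    }
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX) {
        return -1;                   /* value does not fit in an int */
    }
    *out = (int)v;
    return 0;
}
```

The caller can then decide whether to ask for new input or to terminate, instead of silently continuing with 0.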
