Lecture1-51-101
} Overflow = put more into the buffer than it can hold
52
What is a buffer overflow?
} A buffer overflow is a bug that affects low-level code, typically in
C and C++, with significant security implications
} Normally, a program with this bug will simply crash
} But an attacker can alter the situations that cause the program
to do much worse
} Steal private information (e.g., Heartbleed)
} Corrupt valuable information
} Run code of the attacker’s choice
53
Why study them?
} Buffer overflows are still relevant today
} C and C++ are still popular
} Buffer overflows still occur with regularity
54
C and C++ still very popular
55
Critical systems in C/C++
} Most OS kernels and utilities
} fingerd, X windows server, shell
56
History of buffer overflows
} Morris worm
} Propagated across machines (too aggressively, thanks to a bug)
} One way it propagated was a buffer overflow attack against a
vulnerable version of fingerd on VAXes
} Sent a special string to the finger daemon, which caused it to execute code that
created a new worm copy
} Didn’t check OS: caused Suns running BSD to crash
} End result: $10-100M in damages, probation, community service
} Morris is now a professor at MIT
57
History of buffer overflows (cont.)
} CodeRed
} Exploited an overflow in the MS-IIS server
} 300,000 machines infected in 14 hours
58
History of buffer overflows (cont.)
} SQL Slammer
} Exploited an overflow in the MS-SQL server
} 75,000 machines infected in 10 minutes
59
What we’ll do
} Understand how these attacks work, and how to defend against
them
62
Note about terminology
} I use the term buffer overflow to mean any access of a buffer
outside of its allotted bounds
} Could be an over-read, or an over-write
} Could be during iteration (“running off the end”) or by direct access (e.g.,
by pointer arithmetic)
} Out-of-bounds access could be to addresses that precede or follow the
buffer
} Others sometimes use different terms
} They might reserve buffer overflow to refer only to actions that write
beyond the bounds of a buffer
} Contrast with the terms buffer underflow (write prior to the start), buffer over-read
(read past the end), out-of-bounds access, etc.
63
Benign outcome
#include <string.h>

void func(char *arg1)
{
    char buffer[4];
    strcpy(buffer, arg1);   /* copies 8 bytes ("AuthMe!" plus '\0') into 4 */
    ...
}

int main()
{
    char *mystr = "AuthMe!";
    func(mystr);
    ...
}

buffer
SEGFAULT (0x00216551) (during subsequent access)
64
Security-relevant outcome
void func(char *arg1)
{
    int authenticated = 0;
    char buffer[4];
    strcpy(buffer, arg1);   /* "Me!" and the '\0' spill into authenticated */
    if(authenticated) { ... }
}

int main()
{
    char *mystr = "AuthMe!";
    func(mystr);
    ...
}
A  u  t  h | 4d 65 21 00   | %ebp | %eip | &arg1
buffer       authenticated
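
The overwrite in the diagram can be reproduced without undefined behavior by modeling the frame as a struct. This is only a sketch: `struct frame` and `simulate_overflow` are hypothetical names, and the layout (the int directly after the 4-byte buffer) is an assumption that holds on common ABIs, not something the compiler guarantees for real locals.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for func's stack frame: the int sits directly
   after the 4-byte buffer, as in the slide's diagram. */
struct frame {
    char buffer[4];
    int  authenticated;
};

/* Copies input over the start of the frame the way an unchecked strcpy
   would, but never past the end of the struct, so the demo itself stays
   well defined. Returns the resulting value of authenticated. */
int simulate_overflow(const char *input)
{
    struct frame f;
    size_t n = strlen(input) + 1;            /* include the '\0' */

    /* Assumption of this sketch: no padding between the two fields. */
    assert(offsetof(struct frame, authenticated) == sizeof f.buffer);

    memset(&f, 0, sizeof f);                 /* authenticated = 0 */
    if (n > sizeof f)
        n = sizeof f;                        /* keep the demo in bounds */
    memcpy(&f, input, n);                    /* bytes 4..7 land in authenticated */
    return f.authenticated;
}
```

With the slide's input, `simulate_overflow("AuthMe!")` returns a nonzero value (0x0021654d on a little-endian machine), while a string that fits, such as `"Hi"`, leaves it at 0.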
65
Could it be worse?
void func(char *arg1)
{
    char buffer[4];
    ...
    strcpy(buffer, arg1);
}

All ours!
00 00 00 00 | %ebp | %eip | &mystr
buffer

strcpy will let you write as much as you want (til a '\0')
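
strcpy has no way of knowing how large the destination is. One common alternative is to pass the capacity explicitly and detect truncation. The helper below is a minimal sketch of that idea; `copy_bounded` is a hypothetical name, not a standard function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy src into dst (capacity dst_size bytes),
   always NUL-terminating. Returns 1 if all of src fit, 0 if it was
   truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    int needed;

    assert(dst != NULL && src != NULL && dst_size > 0);
    /* snprintf writes at most dst_size bytes and reports the length
       the full output would have needed. */
    needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

Called with a 4-byte buffer and "AuthMe!", the helper stores "Aut" plus the terminator and returns 0, so the caller can reject the input instead of corrupting the neighboring stack slots.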
66
Aside: User-supplied strings
} These examples provide their own strings
67
Code Injection
Code Injection: Main idea
void func(char *arg1)
{
    char buffer[4];
    sprintf(buffer, arg1);
    ...
}
%eip
69
Challenge 1: Loading code into memory
} It must be the machine code instructions (i.e., already
compiled and ready to run)
71
What code to run?
72
Shellcode
#include <stdio.h>
#include <unistd.h>   /* declares execve */

int main( ) {
    char *name[2];
    name[0] = "/bin/sh";
    name[1] = NULL;
    execve(name[0], name, NULL);
}
Assembly (part of): pushl %eax
Machine code (part of): "\x50"
73
Challenge 2: Getting injected code to run
%eip
74
Recall: memory layout summary
} Calling function:
} 1. Push arguments onto the stack (in reverse)
} 2. Push the return address, i.e., the address of the instruction you
want run after control returns to you
} 3. Jump to the function's address
} Called function:
} 4. Push the old frame pointer onto the stack (%ebp)
} 5. Set frame pointer (%ebp) to where the end of the stack is right
now (%esp)
} 6. Push local variables onto the stack
} Returning function:
} 7. Reset the previous stack frame: %esp = %ebp, %ebp = (%ebp)
} 8. Jump back to return address: %eip = 4(%esp)
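
The eight steps above can be traced on a toy model. This is a sketch under stated assumptions, not real x86: `struct machine`, `toy_call`, and `toy_return` are hypothetical, the stack is an array of 32-bit words indexed by slot rather than by byte address, and the final stack-pointer adjustment is a simplification of what `leave`, `ret`, and the caller's cleanup do.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64

/* Toy 32-bit machine: the stack grows toward lower indices. */
struct machine {
    uint32_t stack[STACK_WORDS];
    int esp;        /* index of the top-of-stack word */
    int ebp;        /* index of the current frame's saved %ebp slot */
    uint32_t eip;   /* pretend instruction pointer */
};

static void push(struct machine *m, uint32_t v)
{
    assert(m->esp > 0);
    m->stack[--m->esp] = v;
}

/* Steps 1-6: the caller pushes one argument and the return address,
   then the callee sets up its frame. */
void toy_call(struct machine *m, uint32_t arg, uint32_t ret_addr,
              uint32_t callee, int nlocals)
{
    int i;
    push(m, arg);                     /* 1. arguments */
    push(m, ret_addr);                /* 2. return address */
    m->eip = callee;                  /* 3. jump to the function */
    push(m, (uint32_t)m->ebp);        /* 4. old frame pointer */
    m->ebp = m->esp;                  /* 5. %ebp = %esp */
    for (i = 0; i < nlocals; i++)
        push(m, 0);                   /* 6. local variables */
}

/* Steps 7-8: tear down the frame and jump to whatever the return-address
   slot holds at that moment. */
void toy_return(struct machine *m)
{
    m->esp = m->ebp;                  /* 7. %esp = %ebp */
    m->ebp = (int)m->stack[m->esp];   /*    %ebp = (%ebp) */
    m->eip = m->stack[m->esp + 1];    /* 8. %eip = 4(%esp) */
    m->esp += 3;                      /* model only: drop saved %ebp,
                                         return address, and argument */
}
```

Because `toy_return` trusts the slot one word above the saved frame pointer, changing that slot between `toy_call` and `toy_return` changes where the model "returns" to. That is the slot a stack overflow aims for.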
75
Hijacking the saved %eip
%eip %ebp
76
Challenge 3: Finding the return address
} If we don’t have access to the code, we don’t know how far the
buffer is from the saved %ebp
78
Improving our chances: nop sleds
nop is a single-byte instruction
(just moves to the next instruction)
Jumping anywhere here will work
%eip %ebp
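
Why a sled widens the target can be seen in a toy model. Everything here is hypothetical and nothing is executed as machine code: memory is a plain byte array, `TOY_NOP` reuses the value of the x86 nop opcode purely as a label, and `TOY_PAYLOAD` is a marker standing in for the injected code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_NOP     0x90u   /* value of the x86 nop opcode, used as a label */
#define TOY_PAYLOAD 0xCCu   /* hypothetical marker for the injected code */

/* Toy model: "run" mem starting at index start. A nop moves on to the
   next byte, the payload marker counts as success, and any other byte
   counts as a crash. Returns 1 on success, 0 otherwise. */
int reaches_payload(const unsigned char *mem, size_t len, size_t start)
{
    size_t pc;

    assert(mem != NULL);
    for (pc = start; pc < len; pc++) {
        if (mem[pc] == TOY_PAYLOAD)
            return 1;
        if (mem[pc] != TOY_NOP)
            return 0;
    }
    return 0;
}

/* Lays out `sled` nops followed by the payload marker; the rest of mem
   is zero-filled. */
void build_sled(unsigned char *mem, size_t len, size_t sled)
{
    assert(mem != NULL && sled < len);
    memset(mem, 0, len);
    memset(mem, (int)TOY_NOP, sled);
    mem[sled] = (unsigned char)TOY_PAYLOAD;
}
```

With a 16-byte sled, any starting index from 0 through 16 reaches the marker. With no sled, only one exact index does, which is why an imprecise guess of the address becomes good enough.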
79
Putting it all together
padding | good guess | %eip
80
Other memory exploits
Other attacks
} The code injection attack we have just considered is called stack
smashing
} The term was coined by Aleph One in 1996
82
Heap overflow
} Stack smashing overflows a stack allocated buffer
83
Heap overflow
typedef struct _vulnerable_struct {
    char buff[MAX_LEN];
    int (*cmp)(char*,char*);
} vulnerable;

int foo(vulnerable* s, char* one, char* two)
{
    strcpy( s->buff, one );   /* copy one into buff */
    strcat( s->buff, two );   /* append two to buff */
    return s->cmp( s->buff, "file://foobar" );
}
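
For contrast, here is a sketch of foo with a length check, so that long inputs are rejected before they can reach cmp. The value of `MAX_LEN`, the callback `same_string`, and the name `foo_checked` are assumptions made for illustration; the slide specifies none of them.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_LEN 16   /* assumed value; the slide does not give one */

typedef struct _vulnerable_struct {
    char buff[MAX_LEN];
    int (*cmp)(char*, char*);
} vulnerable;

/* Hypothetical comparison callback for the demo: 1 if equal, else 0. */
int same_string(char *a, char *b)
{
    return strcmp(a, b) == 0;
}

/* Like foo, but refuses inputs that do not fit instead of writing past
   buff. Returns -1 on rejection, otherwise the result of s->cmp. */
int foo_checked(vulnerable *s, char *one, char *two)
{
    int needed;

    assert(s != NULL && s->cmp != NULL);
    /* snprintf never writes more than sizeof s->buff bytes and reports
       how long the concatenation would have been. */
    needed = snprintf(s->buff, sizeof s->buff, "%s%s", one, two);
    if (needed < 0 || (size_t)needed >= sizeof s->buff)
        return -1;                        /* would have overflowed buff */
    return s->cmp(s->buff, "file://foobar");
}
```

The function pointer is still adjacent to the buffer, so this only helps if every write to buff is bounded in the same way.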
84
Heap overflow variants
} Overflow into the C++ object vtable
} C++ objects (that contain virtual functions) are represented using a
vtable, which contains pointers to the object’s methods
} This table is analogous to s->cmp in our previous example, and a similar
sort of attack will work
} Overflow into adjacent objects
} Where buff is not collocated with a function pointer, but is allocated
near one on the heap
} Overflow heap metadata
} Hidden header just before the pointer returned by malloc
} Overflow into that header to corrupt the heap itself
} Get the malloc implementation to do your dirty work for you!
85
Integer overflow
void vulnerable()
{
    char **response;
    int i;
    int nresp = packet_get_int();                    /* attacker sends a HUGE value */
    if (nresp > 0) {
        response = malloc(nresp*sizeof(char*));      /* product wraps around */
        for (i = 0; i < nresp; i++)
            response[i] = packet_get_string(NULL);   /* overflow */
    }
}
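
A usual fix is to check the multiplication before allocating. The sketch below uses hypothetical helpers: `wrapped_size32` mimics what a machine with 32-bit sizes and 4-byte pointers would compute (an assumption about the target), and `alloc_responses` rejects counts whose byte size would not fit in size_t.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* What nresp * sizeof(char*) evaluates to when sizes are 32 bits wide
   and pointers are 4 bytes (assumed target): the product is reduced
   modulo 2^32. */
uint32_t wrapped_size32(uint32_t nresp)
{
    return (uint32_t)(nresp * 4u);
}

/* Hypothetical helper: allocate zeroed room for count pointers, or
   return NULL if count is 0 or count * sizeof(char *) would wrap. */
char **alloc_responses(size_t count)
{
    if (count == 0 || count > SIZE_MAX / sizeof(char *))
        return NULL;
    return calloc(count, sizeof(char *));
}
```

For example, `wrapped_size32(0x40000001u)` is 4: the allocation would hold a single pointer while the loop goes on to store about a billion of them.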
86
Corrupting data
} The attacks we have shown so far affect code
} Return addresses and function pointers
87
Read overflow
} Rather than permitting writing past the end of a buffer, a bug
could permit reading past the end
88
Read overflow
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

int main() {
    char buf[100], *p;
    int i, len;
    while (1) {
        p = fgets(buf,sizeof(buf),stdin);            /* read integer */
        if (p == NULL) return 0;
        len = atoi(p);
        p = fgets(buf,sizeof(buf),stdin);            /* read message */
        if (p == NULL) return 0;
        for (i=0; i<len; i++)                        /* echo back (partial) message */
            if (!iscntrl(buf[i])) putchar(buf[i]);   /* len may exceed actual message length! */
            else putchar('.');
        printf("\n");
    }
}
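
The missing check is a comparison between the requested length and the length of the message actually received. The following is a sketch of that check in a hypothetical helper, `echo_clamped`, which writes into a caller-supplied buffer instead of stdout so that its result is easy to inspect.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy at most `requested` characters of msg into
   out, never reading past msg's terminating '\0'. Control characters
   become '.', as in the slide. Returns the number of characters written,
   not counting the terminator. */
size_t echo_clamped(const char *msg, long requested, char *out, size_t out_size)
{
    size_t actual = strlen(msg);
    size_t len, i;

    assert(out != NULL && out_size > 0);
    len = requested < 0 ? 0 : (size_t)requested;
    if (len > actual)
        len = actual;                 /* the check the echo server lacks */
    if (len > out_size - 1)
        len = out_size - 1;           /* leave room for the terminator */
    for (i = 0; i < len; i++)
        out[i] = iscntrl((unsigned char)msg[i]) ? '.' : msg[i];
    out[len] = '\0';
    return len;
}
```

Asked for 25 characters of "hello", the helper returns only the 5 that were sent, so stale buffer contents never leave the program.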
89
Sample transcript
% ./echo-server
24
every good boy does fine
ECHO: |every good boy does fine|
10
hello there
ECHO: |hello ther|                  OK: input length < buffer size
25
hello
ECHO: |hello..here..y does fine.|   BAD: length > size! (leaked data)
90
Heartbleed
} The Heartbleed bug was a read overflow
in exactly this style
} Format specifiers
} Position in string indicates stack argument to print
} Kind of specifier indicates type of the argument
} %s = string
} %d = integer
} etc.
93
What’s the difference?
void safe()
{
    char buf[80];
    if(fgets(buf, sizeof(buf), stdin)==NULL)
        return;
    printf("%s",buf);
}

void vulnerable()
{
    char buf[80];
    if(fgets(buf, sizeof(buf), stdin)==NULL)
        return;
    printf(buf);   /* Attacker controls the format string */
}
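
The difference is whether the input is treated as the format or as data. A small sketch makes this observable: `render_untrusted` is a hypothetical helper that formats into a buffer through a fixed "%s", so any specifiers in the input come out unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: render untrusted text through a fixed "%s"
   format, so every '%' in it is copied literally instead of being
   interpreted. Returns snprintf's result: the length the full output
   needs. */
int render_untrusted(char *out, size_t out_size, const char *untrusted)
{
    assert(out != NULL && out_size > 0 && untrusted != NULL);
    return snprintf(out, out_size, "%s", untrusted);
}
```

Passing the 11-character string "%d %x %s %n" yields exactly those 11 characters in out, with no stack entries read and nothing written through %n.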
94
printf implementation
int i = 10;
printf("%d %p\n", i, &i);
0x00000000 0xffffffff
%ebp %eip &fmt 10 &i
95
Back to our example
void vulnerable()
{
    char buf[80];
    if(fgets(buf, sizeof(buf), stdin)==NULL)
        return;
    printf(buf);
}
"%d %x"
0x00000000 0xffffffff
%ebp %eip &fmt
caller’s
stack frame
96
Format string vulnerabilities
} printf("100% dave");
} Prints stack entry 4 bytes above saved %eip
} printf("%s");
} Prints bytes pointed to by that stack entry
} printf("%d %d %d %d …");
} Prints a series of stack entries as integers
} printf("%08x %08x %08x %08x …");
} Same, but nicely formatted hex
} printf("100% no way!");
} WRITES the number 3 to address pointed to by stack entry
97
Why is this a buffer overflow?
} We should think of this as a buffer overflow in the sense that
} The stack itself can be viewed as a kind of buffer
} The size of that buffer is determined by the number and size of the
arguments passed to a function
98
Vulnerability prevalence
http://web.nvd.nist.gov/view/vuln/statistics
99
Time to switch hats
100
Software Security
Questions
101