
Lecture1-51-101

Buffer overflows are vulnerabilities primarily found in low-level programming languages like C and C++, where more data is written to a buffer than it can hold, potentially leading to security breaches. These vulnerabilities can allow attackers to crash programs, steal sensitive information, or execute malicious code. The document outlines the history, implications, and various types of buffer overflow attacks, including code injection and heap overflows, emphasizing the importance of understanding these issues for effective defense.

Uploaded by

Sabah Anzi
Copyright
© All Rights Reserved

Buffer overflows

Buffer overflows from 10,000 ft


} Buffer =
} Contiguous memory associated with a variable or field
} Common in C
} All strings are (NUL-terminated) arrays of chars

} Overflow =
} Put more into the buffer than it can hold

} Where does the overflowing data go?


} Well, now that you are an expert in memory layouts…

What is a buffer overflow?
} A buffer overflow is a bug that affects low-level code, typically in
C and C++, with significant security implications

} Normally, a program with this bug will simply crash

} But an attacker can craft the conditions that trigger the bug so
that the program does something much worse
} Steal private information (e.g., Heartbleed)
} Corrupt valuable information
} Run code of the attacker’s choice

Why study them?
} Buffer overflows are still relevant today
} C and C++ are still popular
} Buffer overflows still occur with regularity

} They have a long history


} Many different approaches have been developed to defend against them,
and against bugs like them

} They share common features with other bugs that we will study
} In how the attack works
} In how to defend against it

C and C++ still very popular

The 2018 Top Programming Languages - IEEE Spectrum

Critical systems in C/C++
} Most OS kernels and utilities
} fingerd, X windows server, shell

} Many high-performance servers


} Microsoft IIS, Apache httpd, nginx
} Microsoft SQL server, MySQL, redis, Memcached

} Many embedded systems


} Mars rover, industrial control systems, automobiles

A successful attack on these systems is particularly dangerous!

History of buffer overflows

} Morris worm
} Propagated across machines (too aggressively, thanks to a bug)
} One way it propagated was a buffer overflow attack against a
vulnerable version of fingerd on VAXes
} Sent a special string to the finger daemon, which caused it to execute code that
created a new worm copy
} Didn’t check OS: caused Suns running BSD to crash
} End result: $10-100M in damages; Morris was sentenced to probation and community service
} Morris is now a professor at MIT

History of buffer overflows (cont.)

} CodeRed
} Exploited an overflow in the MS-IIS server
} 300,000 machines infected in 14 hours

History of buffer overflows (cont.)

} SQL Slammer
} Exploited an overflow in the MS-SQL server
} 75,000 machines infected in 10 minutes

What we’ll do
} Understand how these attacks work, and how to defend against
them

} These require knowledge about:


} The compiler
} The OS
} The architecture

Analyzing security requires a whole-systems view

Note about terminology
} I use the term buffer overflow to mean any access of a buffer
outside of its allotted bounds
} Could be an over-read, or an over-write
} Could be during iteration (“running off the end”) or by direct access (e.g.,
by pointer arithmetic)
} Out-of-bounds access could be to addresses that precede or follow the
buffer
} Others sometimes use different terms
} They might reserve buffer overflow to refer only to actions that write
beyond the bounds of a buffer
} Contrast with terms buffer underflow (write prior to the start), buffer overread
(read past the end), out-of-bounds access, etc.

Benign outcome
void func(char *arg1)
{
    char buffer[4];
    strcpy(buffer, arg1);   /* no bounds check: copies 8 bytes into 4 */
    ...
}
int main()
{
    char *mystr = "AuthMe!";
    func(mystr);
    ...
}

Stack after strcpy (addresses increase to the right):

    buffer          %ebp           %eip    &arg1
    A  u  t  h      M  e  !  \0
                    4d 65 21 00

Upon return, sets %ebp to 0x0021654d

SEGFAULT (0x00216551) (during subsequent access)
Security-relevant outcome
void func(char *arg1)
{
    int authenticated = 0;
    char buffer[4];
    strcpy(buffer, arg1);   /* overflow spills into authenticated */
    if(authenticated) { ... }
}
int main()
{
    char *mystr = "AuthMe!";
    func(mystr);
    ...
}

Code still runs; user now ‘authenticated’

Stack after strcpy (addresses increase to the right):

    buffer          authenticated    %ebp    %eip    &arg1
    A  u  t  h      M  e  !  \0
                    4d 65 21 00

Could it be worse?
void func(char *arg1)
{
    char buffer[4];
    ...
    strcpy(buffer, arg1);
}

    buffer          %ebp    %eip    &mystr
    00 00 00 00     <------- All ours! ------->

strcpy will let you write as much as you want (until a ‘\0’)

What could you write to memory to wreak havoc? CODE!

Aside: User-supplied strings
} These examples provide their own strings

} In reality strings come from users in myriad ways


} Text input
} Packets
} Environment variables
} File input…

} Validating assumptions about user input is extremely important

Code Injection
Code Injection: Main idea
void func(char *arg1)
{
    char buffer[4];
    sprintf(buffer, arg1);
    ...
}

    Text … | 00 00 00 00 | %ebp | %eip | &arg1 | … | Haxx0r c0d3
             buffer

    (the %eip register currently points into Text)

(1) Load my own code into memory

(2) Somehow get %eip to point to it

Challenge 1: Loading code into memory
} It must be the machine code instructions (i.e., already
compiled and ready to run)

} We have to be careful in how we construct it:
} It can’t contain any all-zero bytes
} Otherwise, string functions such as sprintf or strcpy will stop copying
(gets and scanf stop at a newline or whitespace instead, so those bytes must be avoided too)
} How could you write assembly to never contain a full zero byte?
} It can’t use the loader (we’re injecting)

What code to run?

} Goal: general-purpose shell


} Command-line prompt that gives attacker general access to the
system

} The code to launch a shell is called shellcode

Shellcode
#include <stdio.h>
#include <unistd.h>   /* declares execve */
int main( ) {
    char *name[2];
    name[0] = "/bin/sh";
    name[1] = NULL;
    execve(name[0], name, NULL);
}

Assembly                  Machine code ((part of) your input)

xorl %eax, %eax           "\x31\xc0"
pushl %eax                "\x50"
pushl $0x68732f2f         "\x68""//sh"
pushl $0x6e69622f         "\x68""/bin"
movl %esp,%ebx            "\x89\xe3"
pushl %eax                "\x50"
...                       ...

Challenge 2: Getting injected code to run

} We can’t insert a “jump into my code” instruction


} We don’t know precisely where our code is

%eip

Text … 00 00 00 00 %ebp %eip &arg1 … \x0f \x3c \x2f ...


buffer

Recall: memory layout summary
} Calling function:
} 1. Push arguments onto the stack (in reverse)
} 2. Push the return address, i.e., the address of the instruction you
want run after control returns to you
} 3. Jump to the function’s address
} Called function:
} 4. Push the old frame pointer onto the stack (%ebp)
} 5. Set frame pointer (%ebp) to where the end of the stack is right
now (%esp)
} 6. Push local variables onto the stack
} Returning function:
} 7. Reset the previous stack frame: %esp = %ebp, %ebp = (%ebp)
} 8. Jump back to return address: %eip = 4(%esp)

Hijacking the saved %eip

    Text … | 00 00 00 00 | %ebp | %eip  | &arg1 | … | \x0f \x3c \x2f ...
             buffer               0xbff               (located at 0xbff)

The saved %eip is overwritten with 0xbff, the address of the injected code.

But how do we know the address?

Hijacking the saved %eip

What if we are wrong?

    Text … | 00 00 00 00 | %ebp | %eip  | &arg1 | … | \x0f \x3c \x2f ...
             buffer               0xbdf               (located at 0xbff)

The saved %eip holds the wrong guess 0xbdf instead of 0xbff.

This is most likely data, so the CPU will panic
(Invalid Instruction)

Challenge 3: Finding the return address
} If we don’t have access to the code, we don’t know how far the
buffer is from the saved %ebp

} One approach: just try a lot of different values!


} Worst case scenario: it’s a 32 (or 64) bit memory space, which means
2^32 (2^64) possible answers

} Without address randomization:


} The stack always starts from the same fixed address
} The stack will grow, but usually it doesn’t grow very deeply (unless
the code is heavily recursive)

Improving our chances: nop sleds
nop is a single-byte instruction
(just moves to the next instruction)

    Text … | 00 00 00 00 | %ebp | %eip  | nop nop … | \x0f \x3c \x2f ...
             buffer               0xbdf   (from &arg1 on)  (located at 0xbff)

Jumping anywhere in the nop sled will work

Now we improve our chances of guessing by a factor of #nops

Putting it all together

    Text … | padding            | good guess | nop sled  | malicious code
             (buffer and %ebp)    (%eip)       nop nop …   \x0f \x3c \x2f ...
                                  0xbdf

Padding: it has to be something; we have to start writing wherever
the input to gets/etc. begins.

Other memory exploits
Other attacks
} The code injection attack we have just considered is called stack
smashing
} The term was coined by Aleph One in 1996

} Constitutes an integrity violation, and arguably a violation of availability

} Other attacks exploit bugs with buffers, too

Heap overflow
} Stack smashing overflows a stack-allocated buffer

} You can also overflow a buffer allocated by malloc, which resides on the heap

Heap overflow
typedef struct _vulnerable_struct {
    char buff[MAX_LEN];
    int (*cmp)(char*,char*);
} vulnerable;

int foo(vulnerable* s, char* one, char* two)
{
    strcpy( s->buff, one );   /* copy one into buff */
    strcat( s->buff, two );   /* append two to buff */
    return s->cmp( s->buff, "file://foobar" );
}

must have strlen(one)+strlen(two) < MAX_LEN
or we overwrite s->cmp

Heap overflow variants
} Overflow into the C++ object vtable
} C++ objects (that contain virtual functions) are represented using a
vtable, which contains pointers to the object’s methods
} This table is analogous to s->cmp in our previous example, and a similar
sort of attack will work
} Overflow into adjacent objects
} Where buff is not collocated with a function pointer, but is allocated
near one on the heap
} Overflow heap metadata
} Hidden header just before the pointer returned by malloc
} Flow into that header to corrupt the heap itself
} Get the malloc implementation to do your dirty work for you!

Integer overflow
void vulnerable()
{
    char **response;
    int i;
    int nresp = packet_get_int();                  /* attacker can make this HUGE */
    if (nresp > 0) {
        response = malloc(nresp*sizeof(char*));    /* multiplication wraps around */
        for (i = 0; i < nresp; i++)
            response[i] = packet_get_string(NULL); /* overflow */
    }
}

If we set nresp to 1073741824 and sizeof(char*) is 4,
then nresp*sizeof(char*) overflows to become 0;
subsequent writes to the allocated response overflow it

Corrupting data
} The attacks we have shown so far affect code
} Return addresses and function pointers

} But attackers can overflow data as well, to


} Modify a secret key to be one known to the attacker, to be able to
decrypt future intercepted messages
} Modify state variables to bypass authorization checks (earlier
example with authenticated flag)
} Modify interpreted strings used as part of commands
} E.g., to facilitate SQL injection

Read overflow
} Rather than permitting writing past the end of a buffer, a bug
could permit reading past the end

} Might leak secret information

Read overflow

int main() {
    char buf[100], *p;
    int i, len;
    while (1) {
        p = fgets(buf,sizeof(buf),stdin);   /* read integer */
        if (p == NULL) return 0;
        len = atoi(p);
        p = fgets(buf,sizeof(buf),stdin);   /* read message */
        if (p == NULL) return 0;
        for (i=0; i<len; i++)               /* echo back (partial) message */
            if (!iscntrl(buf[i])) putchar(buf[i]);
            else putchar('.');
        printf("\n");
    }
}

len may exceed the actual message length!

Sample transcript

% ./echo-server
24
every good boy does fine
ECHO: |every good boy does fine|
10
hello there                          <- OK: input length < buffer size
ECHO: |hello ther|
25
hello                                <- BAD: length > size!
ECHO: |hello..here..y does fine.|    <- leaked data

Heartbleed
} The Heartbleed bug was a read overflow in exactly this style

} The SSL server should accept a “heartbeat” message that it echoes back

} The heartbeat message specifies the length of its echo-back portion,
but the buggy SSL software did not check that the length was accurate

} Thus, an attacker could request a longer length, and read past
the contents of the buffer
} Leaking passwords, crypto keys, …
Format string vulnerabilities
Formatted I/O
} C’s printf family supports formatted I/O

void print_record(int age, char *name)
{
    printf("Name: %s\tAge: %d\n",name,age);
}

} Format specifiers
} Position in string indicates stack argument to print
} Kind of specifier indicates type of the argument
} %s = string
} %d = integer
} etc.

What’s the difference?
void safe()
{
    char buf[80];
    if(fgets(buf, sizeof(buf), stdin)==NULL)
        return;
    printf("%s",buf);
}

void vulnerable()
{
    char buf[80];
    if(fgets(buf, sizeof(buf), stdin)==NULL)
        return;
    printf(buf);   /* attacker controls the format string */
}

printf implementation

int i = 10;
printf("%d %p\n", i, &i);

    0x00000000                                      0xffffffff
    … | %ebp | %eip | &fmt | 10 | &i | …
      printf’s stack frame       caller’s stack frame

• printf takes variable number of arguments
• printf pays no mind to where the stack frame “ends”
• It presumes that you called it with (at least) as many arguments
as specified in the format string

Back to our example

void vulnerable()
{
char buf[80];
if(fgets(buf, sizeof(buf), stdin)==NULL)
return;
printf(buf);
}

Input: "%d %x"

    0x00000000                                      0xffffffff
    … | %ebp | %eip | &fmt | caller’s stack frame …

Format string vulnerabilities
} printf("100% dave");
} Prints stack entry 4 bytes above saved %eip ("% d" is parsed as %d with the space flag)
} printf("%s");
} Prints bytes pointed to by that stack entry
} printf("%d %d %d %d …");
} Prints a series of stack entries as integers
} printf("%08x %08x %08x %08x …");
} Same, but nicely formatted hex
} printf("100% no way!");
} WRITES the number 3 to address pointed to by stack entry ("% n" is parsed as %n; 3 characters have been printed so far)

Why is this a buffer overflow?
} We should think of this as a buffer overflow in the sense that
} The stack itself can be viewed as a kind of buffer
} The size of that buffer is determined by the number and size of the
arguments passed to a function

} Providing a bogus format string thus induces the program to overflow that “buffer”

Vulnerability prevalence

http://web.nvd.nist.gov/view/vuln/statistics

Time to switch hats

Software Security

Questions
