Preface: Most of what you find in this thread won't work on modern software or computers, but it is essential that you understand the fundamentals of exploitation. Why? Modern exploitation is more difficult and complex than it used to be, but the principles remain the same; it simply takes more steps to get code execution.
To start off, if you're new to exploitation you might not understand what the above means. Why is exploitation harder today? Developers realized that bugs will always be present: before, now, and forever. So instead of only trying to fix individual bugs, they shifted to blocking the techniques that exploits rely on. Also, exploits are different from vulnerabilities, and I'd like to clear this up now. Vulnerabilities are bugs in programs that can potentially be used for exploitation. Exploits are the actual inputs crafted to take advantage of a vulnerability, usually to gain code execution. The mitigations developers use to block exploits include DEP (Data Execution Prevention) and ASLR (Address Space Layout Randomization). We'll talk about these later, and how DEP can be bypassed.
If you want to work through the code examples, I recommend using a wargame VM or Damn Vulnerable Linux, as it should have Data Execution Prevention and Address Space Layout Randomization turned off.
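If you're not sure whether ASLR is actually off in your practice VM, here is a minimal sketch for checking it. It assumes a Linux guest that exposes the standard /proc/sys/kernel/randomize_va_space file; the function names (describe_aslr, current_aslr) are just mine for illustration.

```python
from pathlib import Path

# Standard Linux sysctl file holding the ASLR setting (assumes a Linux VM).
ASLR_FILE = Path("/proc/sys/kernel/randomize_va_space")

MEANINGS = {
    0: "ASLR disabled",
    1: "partial ASLR (stack, mmap regions, vdso)",
    2: "full ASLR (also randomizes the brk heap)",
}

def describe_aslr(raw: str) -> str:
    """Translate the raw file contents ("0", "1" or "2") into words."""
    value = int(raw.strip())
    return MEANINGS.get(value, f"unrecognized setting {value}")

def current_aslr() -> str:
    """Read the setting from /proc, or say so if it isn't available."""
    try:
        return describe_aslr(ASLR_FILE.read_text())
    except OSError:
        return "setting not readable (not Linux, or /proc not mounted)"

if __name__ == "__main__":
    print(current_aslr())
```

For the examples in this thread you'd want this to report that ASLR is disabled.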
Shell and shellcode: Generally, through exploitation you don't just want to print some pretty text, you want to invoke a shell. Why? An extreme example would be something like the PS3. If you can exploit a program on it, you can gain shell access. Even better, the shell you spawn inherits the privileges of the program you exploited, so if the program is running as root (uid 0) and you manage to exploit it, you have just gained root access to the system.
Generally speaking, injected shellcode is largely obsolete due to Data Execution Prevention (DEP), but I'll still cover it because it still works where DEP is disabled, and it shows up in wargames and competitions as well. These shellcode payloads are written in Intel x86 assembly, which is exactly why I put the notice saying you want some sort of background with assembly. When it comes to exploitation, you want your payload to be as small as possible. Even if the buffer you need to fill is 64 bytes and gives you plenty of room, a smaller payload gives you more leeway, because it leaves more space for your NOP sled (we'll talk about this shortly as well).
There are tons of shellcode payloads all over the internet, but I recommend you study the code below to actually understand how it works. Because we want our payload to be small, we can't just compile a C program that calls execve() on "/bin/sh" and insert that; the payload would be huge and wouldn't work. You may compile a program and think "wow, 141kb, that's small!". Well, we're going to write code that invokes a shell in about 30 bytes. You could get this even lower, down to 23 bytes if you want, but I'll just be providing my payload. It's not perfect but it works.
xor eax, eax
xor ebx, ebx
xor ecx, ecx
xor edx, edx
mov al, 11
mov ebx, esp
mov ecx, esp
Well, we can't just insert these instructions onto the stack and expect it to work... or can we? All we need to do is assemble this with nasm using "nasm -f elf [filename].asm". Then we run "objdump -d [filename].o" on the resulting object file. This will give us the machine code bytes of our shellcode. We're now ready to build our payload; below are the bytes of the above shellcode:
68 2f 2f 73 68
68 2f 62 69 6e
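If those hex lines look like noise, here is a small sketch that decodes just the two lines shown above. On x86, opcode 0x68 is "push imm32", and the four bytes after it are the value pushed. The helper names (decode_push_imm32, stack_string) are mine, and the sketch only understands this one instruction form.

```python
def decode_push_imm32(hex_line: str) -> str:
    """Return the 4 ASCII characters pushed by one '68 xx xx xx xx' line."""
    raw = bytes.fromhex(hex_line)  # whitespace between byte pairs is ignored
    if len(raw) != 5 or raw[0] != 0x68:
        raise ValueError("not a 5-byte push imm32 encoding")
    return raw[1:].decode("ascii")

def stack_string(hex_lines: list[str]) -> str:
    """The stack grows downward, so the LAST push ends up first in memory."""
    pushed = [decode_push_imm32(line) for line in hex_lines]
    return "".join(reversed(pushed))

if __name__ == "__main__":
    lines = ["68 2f 2f 73 68", "68 2f 62 69 6e"]
    for line in lines:
        print(line, "->", repr(decode_push_imm32(line)))
    print("string left at esp:", stack_string(lines))
```

So the two pushes leave the path "/bin//sh" on the stack. The doubled slash pads the path to exactly 8 bytes, which means it fits in two 4-byte pushes without any null bytes.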
Buffer Overflows: Ok, now let's say we were exploiting the following C program:
int main(int argc, char **argv)
So we have a 64-byte buffer that we need to overflow, but we also need to overwrite the saved return address so that EIP (the instruction pointer) ends up pointing to our payload, or it won't execute. Generally speaking (it can vary, so your best bet is using GDB to find your exact addresses), after the buffer you'll need to write 4 bytes of garbage over the saved EBP, and the next 4 bytes will overwrite the saved return address that gets loaded into EIP. If you get "Illegal instruction" or "Segmentation fault", you're pointing to a bad memory address and need to change your payload. To actually create and load this payload, we'll use perl (yay).
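To see why writing past the buffer changes where the function returns, here is a toy simulation of the frame described above. It's only a model: the frame is a Python bytearray, the 64 + 4 + 4 layout is the "generally speaking" case from the text (your real offsets may differ), and the copy function stands in for any copy that never checks the buffer size.

```python
BUF_SIZE = 64   # the buffer from the text
EBP_SIZE = 4    # saved EBP sits right after the buffer
RET_SIZE = 4    # saved return address comes next
RET_OFFSET = BUF_SIZE + EBP_SIZE

def make_frame(return_address: int) -> bytearray:
    """Build a zeroed frame with the given saved return address."""
    frame = bytearray(BUF_SIZE + EBP_SIZE + RET_SIZE)
    frame[RET_OFFSET:] = return_address.to_bytes(RET_SIZE, "little")
    return frame

def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copy data to the start of the buffer without checking BUF_SIZE."""
    n = min(len(data), len(frame))  # only so the toy model itself stays in bounds
    frame[:n] = data[:n]

def saved_return(frame: bytearray) -> int:
    """Read back the saved return address stored in the frame."""
    return int.from_bytes(frame[RET_OFFSET:RET_OFFSET + RET_SIZE], "little")

if __name__ == "__main__":
    frame = make_frame(0x08048400)            # made-up legitimate return address
    unchecked_copy(frame, b"A" * 64)          # fits: return address untouched
    print(hex(saved_return(frame)))
    unchecked_copy(frame, b"A" * 68 + b"BBBB")  # 8 bytes too many
    print(hex(saved_return(frame)))           # now 0x42424242 ("BBBB")
```

Input that fits leaves the return address alone; 72 bytes of input replace it with whatever the last 4 bytes were.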
First, we know our payload is 29 bytes. We need to fill a 64-byte buffer to overflow it. So, if you wanted to be a perfectionist, you could insert garbage bytes for the first 64 - 29 = 35 bytes. If you want your payload to have a better chance of working, you're going to add a NOP sled instead. A NOP instruction is encoded as exchanging (xchg) eax with eax, so it does nothing, as the name implies. The good thing about this instruction is that if EIP lands on any of these NOPs, even though it's not our shellcode's first instruction, we're golden: execution slides down the sled into the shellcode. If we pad with garbage instead, we have to hit the first instruction on the dot. A NOP is the single byte 0x90. We will put 35 NOPs, our shellcode, 4 bytes of trash, and a return address into our payload.
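The arithmetic above is easy to get wrong by one, so here is a small sketch that just computes the layout. It doesn't build a payload; it only works out offsets and lengths. The function names are mine, and the two 4-byte fields follow the "generally speaking" layout from earlier, so treat them as assumptions.

```python
def payload_layout(buf_size: int, shellcode_len: int) -> list[tuple[str, int, int]]:
    """Return (part, offset, length) for each piece of the payload."""
    sled_len = buf_size - shellcode_len
    if sled_len < 0:
        raise ValueError("shellcode does not fit in the buffer")
    parts = [
        ("nop sled", sled_len),
        ("shellcode", shellcode_len),
        ("saved ebp filler", 4),
        ("return address", 4),
    ]
    layout, offset = [], 0
    for name, length in parts:
        layout.append((name, offset, length))
        offset += length
    return layout

def valid_landing_offsets(buf_size: int, shellcode_len: int) -> range:
    """Offsets EIP may land on: any NOP, or the shellcode's first byte."""
    return range(0, buf_size - shellcode_len + 1)

if __name__ == "__main__":
    for name, offset, length in payload_layout(64, 29):
        print(f"{offset:3d}  {length:3d}  {name}")
    print(len(valid_landing_offsets(64, 29)), "good landing spots with the sled")
```

With the sled there are 36 addresses that work (35 NOPs plus the shellcode's first byte) instead of exactly one, which is the whole point of using it.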
Now, for the sake of the tutorial, I'm going to say I went into gdb and found one of my NOPs at 0x080482AA (basic debugging can be found at the end of the thread). We are working with little endian, meaning that when we store this address we have to write its bytes in reverse order. In the end, our command to construct our payload will be (\x is the escape sequence for writing a byte in hex):
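To illustrate just the byte-order step, here is a minimal sketch of how the example address 0x080482AA turns into the four bytes that go at the end of the payload. It uses Python's struct module; the helper names are mine, and the address is only the example value from above.

```python
import struct

def pack_le32(address: int) -> bytes:
    """Pack a 32-bit address in little-endian order (low byte first)."""
    return struct.pack("<I", address)

def as_escapes(data: bytes) -> str:
    r"""Render bytes as \x.. escapes, the form you'd type into a one-liner."""
    return "".join(f"\\x{byte:02x}" for byte in data)

if __name__ == "__main__":
    packed = pack_le32(0x080482AA)
    print(as_escapes(packed))  # \xaa\x82\x04\x08
```

Note that whole bytes are reversed, not individual hex digits: 08 04 82 AA becomes AA 82 04 08.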
Congratulations, if everything worked out, you now have a shell running off the program you exploited, with the same permissions as the program had.
Heap Overflows: Heap overflows are more complex than stack overflows, because the stack is laid out and manipulated by the program in a simple, direct way. With the heap it's not such an easy process. Heap bugs are also harder for developers to spot, so in the real world you'll have a much better bet finding heap overflows than stack overflows; developers are becoming wise to the latter. Unlike the stack section, I'm not going to go into a complete tutorial on how to exploit it, but I will say this: the heap is used for dynamically allocated memory, that is, whenever the programmer calls an allocation function such as malloc().
malloc() hands out chunks of memory to the program. Asking the operating system for more memory through a syscall (brk) is very slow, so good (most) allocators do tons of optimization. One of these is reuse: if the program requests 256 bytes and there is a free 256-byte chunk (or two adjacent free 128-byte chunks), the allocator will reuse it instead of requesting more memory from the operating system.
A heap overflow works by overflowing the current chunk with a payload (if DEP is enabled, you'd have to use ret2libc rather than shellcode) and precisely overwriting the metadata of the next chunk. Each chunk carries metadata saying 1. whether it's available, 2. its size, 3. a pointer to the previous free chunk, 4. a pointer to the next free chunk (think of this as a linked list). We need to overwrite "availability" and "size" with garbage to get to the pointers. Once we get to the pointers, you can go about this in multiple ways. You can target an entry in the GOT (Global Offset Table) or a return address: the location you want to overwrite is what you write to the "previous free" pointer, and the address you want called (or returned to) is what you write to the "next free" pointer. When the allocator later unlinks the chunk from its free list, it performs that write for you. Complicated; as I said, I won't be covering heap overflows in depth here, maybe in another tutorial if it's requested.
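To make the pointer trick a bit less abstract, here is a toy model of a free-list unlink, the step where an allocator removes a chunk from its doubly linked list. Everything here is an assumption for illustration: memory is a Python dict, the four metadata fields are 4 bytes each in the order listed above, and all addresses are made up. Real allocators differ in layout, and modern ones sanity-check these pointers before unlinking.

```python
# Assumed field offsets inside a chunk header (4 bytes each, order from the text).
AVAIL_OFF, SIZE_OFF, PREV_OFF, NEXT_OFF = 0, 4, 8, 12

def unlink(mem: dict[int, int], chunk: int) -> None:
    """Remove a chunk from a doubly linked free list, with no sanity checks."""
    prev_free = mem[chunk + PREV_OFF]
    next_free = mem[chunk + NEXT_OFF]
    mem[prev_free + NEXT_OFF] = next_free   # prev->next = next
    mem[next_free + PREV_OFF] = prev_free   # next->prev = prev

def corrupt_and_unlink(target: int, value: int) -> dict[int, int]:
    """Model an overflow that rewrote a free chunk's two list pointers."""
    mem = {target: 0xDEADBEEF}              # original contents of the target slot
    chunk = 0x2000                          # made-up address of the corrupted chunk
    mem[chunk + AVAIL_OFF] = 0x41414141     # "garbage" over availability
    mem[chunk + SIZE_OFF] = 0x41414141      # "garbage" over size
    # The "previous free" pointer is the target, adjusted so that the
    # prev->next write lands exactly on it.
    mem[chunk + PREV_OFF] = target - NEXT_OFF
    mem[chunk + NEXT_OFF] = value
    unlink(mem, chunk)
    return mem

if __name__ == "__main__":
    mem = corrupt_and_unlink(target=0x1000, value=0x3000)
    print(hex(mem[0x1000]))  # the slot now holds the chosen value
```

The allocator thinks it's doing list bookkeeping, but with both pointers under your control it ends up writing a value of your choice to an address of your choice. It also writes the other way around (next->prev), which is a side effect you'd have to account for in a real target.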
Protection Mechanisms: Developers started to realize that patching individual bugs was a losing battle, so they decided to make exploitation itself harder instead. These mechanisms make everything above either much more difficult or impossible. We'll talk about two of these mechanisms:
DEP, or Data Execution Prevention (also known as NX or W^X). This nifty little feature makes the system distinguish between data and executable code where it otherwise would not have, meaning that our above method of injecting shellcode into memory and jumping to it won't work. You'll most likely get a segfault. Modern Linux distros and Windows enable this system-wide, so unless you turn it off, it doesn't matter what program you're attempting to exploit: DEP will block shellcode execution. It is bypassed with ret2libc: instead of injecting code, you return into code that is already there, because (at least on Linux) libc is loaded into practically every dynamically linked C program. The GOT (Global Offset Table) can help you find libc addresses. If you can get the address of "execve", you can call it and pass it the address of a "/bin/sh" string.
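You can see DEP's effect from the outside by looking at the permissions of a process's memory regions. This sketch assumes Linux and the /proc/<pid>/maps format; the sample line in the demo is a made-up illustration, and the function names are mine.

```python
def parse_maps_line(line: str) -> tuple[str, str]:
    """Return (region name, permissions) from one /proc/<pid>/maps line."""
    fields = line.split(None, 5)   # address perms offset dev inode [pathname]
    perms = fields[1]
    name = fields[5].strip() if len(fields) > 5 else ""
    return name, perms

def writable_and_executable(perms: str) -> bool:
    """With DEP in effect, a data region should never be both 'w' and 'x'."""
    return "w" in perms and "x" in perms

def report_own_mappings() -> None:
    """Print the stack and heap permissions of the current process."""
    try:
        with open("/proc/self/maps") as maps:
            for line in maps:
                name, perms = parse_maps_line(line)
                if name in ("[stack]", "[heap]"):
                    print(name, perms, "W+X!" if writable_and_executable(perms) else "ok")
    except OSError:
        print("/proc/self/maps not available on this system")

if __name__ == "__main__":
    sample = "7ffd1c5e0000-7ffd1c601000 rw-p 00000000 00:00 0    [stack]"
    print(parse_maps_line(sample))
    report_own_mappings()
```

A stack mapped "rw-p" is what DEP looks like: you can write your shellcode there, but the CPU refuses to execute it. On an old-school practice VM you'd expect to see "rwxp" instead.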
ASLR, or Address Space Layout Randomization. This is a true nightmare for those who wish to exploit a program. Basically, ASLR or DEP on its own is weak and fairly easy to get around; DEP and ASLR combined are not fun for exploitation. This feature randomizes where the stack, heap, and libraries are loaded, making guessing addresses impractical and our little libc/GOT method unreliable, since you no longer know where "execve" lives. I'm not covering how to bypass ASLR in this tutorial; note, however, that it's only really effective on 64-bit systems, as 32-bit systems don't have enough entropy to make guessing much of an issue.
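To put rough numbers on the entropy point, here is a back-of-the-envelope sketch. It assumes every guess is independent (the layout is re-randomized on each attempt) and that exactly one of the 2^n possible layouts is the right one. The bit counts in the demo (16 and 28) are illustrative assumptions, not measurements of any particular system.

```python
def expected_attempts(entropy_bits: int) -> int:
    """Average number of guesses until one hits, with independent guesses."""
    return 2 ** entropy_bits

def success_probability(entropy_bits: int, attempts: int) -> float:
    """Chance that at least one of `attempts` independent guesses is right."""
    miss = 1 - 2 ** -entropy_bits
    return 1 - miss ** attempts

if __name__ == "__main__":
    for bits in (16, 28):  # illustrative values only
        tries = expected_attempts(bits)
        print(f"{bits} bits: about {tries} guesses on average, "
              f"{success_probability(bits, 100_000):.1%} chance within 100000 tries")
```

Every extra bit of entropy doubles the work, which is why the jump from a 32-bit to a 64-bit address space matters so much.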