Buffer overflow: how does it work?

A buffer overflow condition exists when a program attempts to put more data in a buffer than it can hold or when a program attempts to put data in a memory area past a buffer. In this case, a buffer is a sequential section of memory allocated to contain anything from a character string to an array of integers. Writing outside the bounds of a block of allocated memory can corrupt data, crash the program, or cause the execution of malicious code. Buffer overflow is probably the best known form of software security vulnerability.
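The effect of writing past a buffer can be illustrated with a toy model. The sketch below is not real process memory: it assumes an 8-byte buffer laid out directly in front of a 4-byte integer inside one Python bytearray, and all names (BUF_SIZE, unchecked_copy, checked_copy, neighbour) are made up for this illustration.

```python
BUF_SIZE = 8  # size of the toy buffer (an arbitrary choice for this sketch)


def new_memory() -> bytearray:
    """An 8-byte buffer followed directly by a 4-byte integer holding 1234."""
    return bytearray(BUF_SIZE) + (1234).to_bytes(4, "little")


def neighbour(memory: bytearray) -> int:
    """Read the 4-byte integer stored right after the buffer."""
    return int.from_bytes(memory[BUF_SIZE:BUF_SIZE + 4], "little")


def unchecked_copy(memory: bytearray, data: bytes) -> None:
    """Write data from offset 0 with no length check, similar in spirit to strcpy."""
    memory[0:len(data)] = data


def checked_copy(memory: bytearray, data: bytes) -> None:
    """Refuse any write that would not fit inside the BUF_SIZE-byte buffer."""
    if len(data) > BUF_SIZE:
        raise ValueError("input is larger than the buffer")
    memory[0:len(data)] = data


memory = new_memory()
unchecked_copy(memory, b"A" * 12)  # 12 bytes into an 8-byte buffer
corrupted = neighbour(memory)      # the last four A's landed on the integer
print(hex(corrupted))              # 0x41414141
```

The unchecked copy silently replaces the neighbouring value with four A's (0x41), which is the data corruption described above; the checked variant rejects the oversized input instead.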

Most software developers know what a buffer overflow vulnerability is, but buffer overflow attacks against both legacy and newly developed applications are still quite common. Part of the problem is the wide variety of ways buffer overflows can occur, and part is the error-prone techniques often used to prevent them.

Buffer overflows are not easy to discover, and even when one is discovered, it is generally extremely difficult to exploit. Nevertheless, attackers have managed to identify buffer overflows in a staggering array of products and components. In a classic buffer overflow exploit, the attacker sends data to a program, which stores it in an undersized stack buffer. Although this type of stack buffer overflow is still common on some platforms and in some development communities, there are a variety of other types of buffer overflow, including heap buffer overflows and off-by-one errors.

Another very similar class of flaws is the format string attack. Even bounded functions such as strncpy can cause vulnerabilities when used incorrectly. The combination of memory manipulation and mistaken assumptions about the size or makeup of a piece of data is the root cause of most buffer overflows.
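One well-known way to misuse strncpy is to pass the full destination size as the length: when the source is at least that long, C's strncpy writes no terminating NUL byte. The sketch below models that behaviour in Python rather than calling the real function; strncpy_model and the 8-byte destination are assumptions made for illustration.

```python
def strncpy_model(dest: bytearray, src: bytes, n: int) -> None:
    """Model C's strncpy: copy at most n bytes and pad with NULs if src is shorter.

    As in C, no terminating NUL is written when len(src) >= n.
    """
    if n > len(dest):
        # In C this call would itself write past the destination buffer.
        raise ValueError("n is larger than the destination")
    chunk = src[:n]
    dest[:n] = chunk + b"\x00" * (n - len(chunk))


src = b"ABCDEFGHIJ"  # 10 bytes, longer than the 8-byte destination

# Misuse: n equals the full buffer size, so the result has no terminator.
unterminated = bytearray(8)
strncpy_model(unterminated, src, len(unterminated))

# Common fix: copy one byte less and terminate explicitly.
terminated = bytearray(8)
strncpy_model(terminated, src, len(terminated) - 1)
terminated[-1] = 0

print(bytes(unterminated))  # b'ABCDEFGH'
print(bytes(terminated))    # b'ABCDEFG\x00'
```

In C, code that later treats the unterminated buffer as a string keeps reading past its end, which is how a "bounded" copy can still lead to an out-of-bounds access.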

Attackers use buffer overflows to corrupt the execution stack of a web application. By sending carefully crafted input to a web application, an attacker can cause the web application to execute arbitrary code, effectively taking over the machine.

Buffer overflow flaws can be present both in the web server or application server products that serve the static and dynamic aspects of the site and in the web application itself.

To see this in practice, consider a test string of A's, B's, and C's passed to a vulnerable program. We expect the first A's to fill the buffer, the B's to overwrite the saved EBP, and the C's to overwrite the return address.

To the A's, the four B's (0x42) and the four C's (0x43) are added, which lengthens the string by eight bytes. From within gdb, the program can be executed using the run command, and when this string is passed to the program, it produces the following result:
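A test string of this shape can be generated with a few lines of Python. This is a minimal sketch: FILL_COUNT is an assumed value, since the number of A's needed to reach the saved EBP depends on the target program's buffer size and stack layout.

```python
FILL_COUNT = 100  # assumption: number of A's needed to reach the saved EBP

pattern = (
    b"A" * FILL_COUNT  # fills the buffer (0x41)
    + b"B" * 4         # expected to land on the saved EBP (0x42)
    + b"C" * 4         # expected to land on the return address (0x43)
)

print(len(pattern))   # FILL_COUNT + 8
print(pattern[-8:])   # b'BBBBCCCC'
```

Using a distinct letter for each region makes it easy to recognise in the debugger which part of the input ended up in which register.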

Segmentation fault. This is an error raised when the CPU detects that something tries to access a part of memory it should not be accessing. It didn't happen because a piece of memory was overwritten; it happened because the return address was overwritten with C's (0x43434343). There is most likely nothing at address 0x43434343, and if there is, it does not belong to the program, so the program is not allowed to read it.

This produces the segmentation fault. If we check the registers by entering the command info registers in gdb, we can confirm that EBP and the return address were overwritten. The EIP (Extended Instruction Pointer) contains the address of the next instruction to be executed, which now points to the faulty address.
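The link between the letters in the input and the values seen in the registers can be checked with Python's struct module. The sketch below assumes a 32-bit little-endian target (as on x86); the helper names are made up for this example.

```python
import struct


def bytes_to_register(raw: bytes) -> int:
    """Interpret four bytes from memory as a 32-bit little-endian register value."""
    return struct.unpack("<I", raw)[0]


def register_to_bytes(value: int) -> bytes:
    """Return the four bytes that load as the given 32-bit register value."""
    return struct.pack("<I", value)


ebp = bytes_to_register(b"BBBB")
eip = bytes_to_register(b"CCCC")
print(hex(ebp))  # 0x42424242
print(hex(eip))  # 0x43434343
print(register_to_bytes(0x43434343))  # b'CCCC'
```

A register full of 0x42 or 0x43 bytes is therefore a direct sign that the B's or C's from the input reached it.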

Now, 0x43434343 is a faulty address. However, if this address pointed to malicious code, we could have a problem. We've seen how to change the return address. To exploit the problem with the buffer, we aim to change the return address to point somewhere we have placed code that, when executed, does something beneficial to us as an attacker, like launching a shell. A shellcode is a small piece of code used as the payload in the exploitation of a software vulnerability.

It is called "shellcode" because it typically starts a command shell from which the attacker can control the compromised machine, but any piece of code that performs a similar task can be called shellcode.

The art of crafting shellcode deserves a blog post of its own. Shellcodes depend on the operating system and CPU and are commonly written in assembly. Many samples of shellcode can be found on the Internet (for example, on exploit-db), and the piece we're using here is one that spawns a command shell.

Save the shellcode's assembly source to a file and assemble it using nasm. This produces the assembled shellcode. Our goal is to get the faulty program buf to execute the shellcode. To do this, we pass the shellcode as the command-line parameter so that it eventually ends up in the buffer. We then overwrite the return address (the C's in the previous example) so that it points back to a memory address somewhere in the buffer.

This will make the program jump to the shellcode and execute that code instead of the regular program. Memory may move around a bit during execution of the program, so we do not know exactly at which address the shellcode will start in the buffer.

The NOP-sled is a way to deal with this. It is a run of NOP (no-operation) instructions, each of which does nothing but pass execution on to the next instruction. Anywhere the return address lands in the NOP-sled, execution slides along the buffer until it hits the start of the shellcode. With a NOP-sled, the return address does not need to hit the exact start of the shellcode. What we do know is that the shellcode will be somewhere in the buffer and that its size will be 25 bytes. With a shellcode of 25 bytes and a payload of 108 bytes, we have 83 bytes left to fill, which we will divide over both sides of the shellcode as follows.
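The sliding behaviour can be simulated without executing anything. In the sketch below, memory is just a bytes object, 0x90 is the x86 NOP opcode, and the 25 bytes of 0xCC are a placeholder standing in for the shellcode (they are not real shellcode). The slide function and the sled length of 63 are assumptions chosen for this illustration.

```python
NOP = 0x90  # x86 opcode for the one-byte NOP instruction


def slide(memory: bytes, entry: int) -> int:
    """Return the offset where execution stops sliding when it starts at entry."""
    ip = entry
    while ip < len(memory) and memory[ip] == NOP:
        ip += 1  # a NOP does nothing except advance to the next instruction
    return ip


SLED_LEN = 63
PLACEHOLDER_SHELLCODE = b"\xcc" * 25  # stand-in bytes, not real shellcode

memory = bytes([NOP]) * SLED_LEN + PLACEHOLDER_SHELLCODE

# Whatever offset inside the sled we land on, we end up at the same place.
landings = [slide(memory, entry) for entry in (0, 10, 40, 62)]
print(landings)  # [63, 63, 63, 63]
```

Every entry point inside the sled ends at offset 63, the first byte after the sled, which is why the return address only has to point somewhere into the sled and not at an exact byte.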

The NOP-sled will be placed at the start of the payload, followed by the shellcode. After the shellcode we will place a filler, for now consisting of a bunch of 'E' characters (0x45 in hexadecimal).

This filler will later be replaced by the memory address pointing to somewhere in the NOP-sled inside the buffer.

[Image: expected memory layout after execution with the given payload]
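Replacing the filler with an address means writing that address in the byte order the CPU expects. The sketch below assumes a 32-bit little-endian target and uses a made-up address, HYPOTHETICAL_TARGET, purely as an example; a real address has to be read from the debugger.

```python
import struct

HYPOTHETICAL_TARGET = 0xBFFFF6C0  # made-up address, for illustration only
REPEATS = 5                       # same count as the five 4-byte groups of E's

# Pack the address as 4 little-endian bytes and repeat it to fill the block.
address_block = struct.pack("<I", HYPOTHETICAL_TARGET) * REPEATS

# A payload passed on the command line cannot contain NUL bytes.
has_nul = b"\x00" in address_block

print(len(address_block))  # 20
print(address_block[:4])   # b'\xc0\xf6\xff\xbf'
print(has_nul)             # False
```

Repeating the address keeps the block the same size as the filler it replaces, so the rest of the payload does not shift.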

For this example I chose a block of E's with a size of 20 bytes (5 times 4 bytes). This means we have 63 bytes left for the NOP-sled in a 108-byte payload. The size of the block of E's does not really matter, but we want it to be at least 4 bytes since it must contain a full memory address.
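The size arithmetic can be written out as a short script. This sketch uses the sizes given in the text (a 25-byte shellcode, 20 bytes of E's, 108 bytes in total); the 0xCC bytes are again only a placeholder for the shellcode, and the variable names are assumptions.

```python
PAYLOAD_LEN = 108    # total payload size
SHELLCODE_LEN = 25   # size of the shellcode
FILLER_LEN = 5 * 4   # five 4-byte groups of E's

# Whatever is not shellcode or filler is available for the NOP-sled.
sled_len = PAYLOAD_LEN - SHELLCODE_LEN - FILLER_LEN

placeholder_shellcode = b"\xcc" * SHELLCODE_LEN  # stand-in bytes, not real shellcode

payload = (
    b"\x90" * sled_len       # NOP-sled
    + placeholder_shellcode  # shellcode goes here
    + b"E" * FILLER_LEN      # filler, later replaced by the return address
)

print(sled_len)      # 63
print(len(payload))  # 108
```

Deriving the sled length from the other two sizes keeps the total fixed when the shellcode or filler size changes.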

Also, a bigger NOP-sled has the advantage that we have a higher chance of hitting it with the memory jump that we're trying to achieve, even if memory gets reallocated a bit at runtime.

When we run the program with the payload built in the previous section, it crashes with a segmentation fault. Again, I've used Python code to generate a NOP-sled of 63 bytes, concatenated with the shellcode and 5 times 4 bytes of E's. The address mentioned in the segmentation fault is 0x45454545, which shows that the E's correctly ended up at the return address. Inspecting the memory of the program when it crashed, we can confirm the payload was placed as expected.

The left-most column in this image contains the memory address of the left-most byte in the row. Using this we now know the runtime memory addresses that contain the payload.
