Codex vs. Claude: Which One Handles Reverse Engineering Skills Better?
In my previous blog, I demonstrated the use of skills via OpenAI’s Codex to perform static analysis on multiple pieces of malware. This included two different skills: IoC extraction and unpacking packed malware. The demonstration showed that AI with agents is capable of doing simple, straightforward analysis of samples, saving analyst time and enabling more samples to be analyzed quickly.
I mentioned that I wanted to create and use Claude skills as well. This gives me parity across both popular AI providers (OpenAI and Anthropic), and it provides an opportunity to do a comparison to see what advantages and disadvantages each provides.
Creating the Skills
To start, I created Claude versions of my last two skills, re-ioc-extractor and re-unpacker, and added them to the working code repository from last time: https://github.com/hackersifu/reverse-engineering-skills
The README.md contains the updated instructions for Claude as well.
README details
Defining the Test
The initial skills created for Codex were tested with multiple malware samples at the same time. Part of this was done to see how much AI could handle at the same time.
This test uses a single sample, so that the output of both skills can be compared directly between Claude and Codex.
Environment Details:
Windows (FLARE-VM Image)
Claude – Claude Code v2.1.74 (CLI), Sonnet 4.6
Codex – Codex CLI v0.111.0 (CLI), gpt-5.3-codex
We’ll start with IoC extraction, using Claude.
Claude IoC Extraction Test:
Prompt Used: There is a malicious file within this directory. Perform IoC extraction following the re-ioc-extraction skill details.
What I noticed is that Claude needed a bit more help with running the commands. This is due to some guardrails I’ve put in place to ensure that the AI doesn’t go hog wild and start running executables, but it seemed more noticeable with Claude.
However, the full IoC report was very impressive.
Claude IoC Extraction Report:
● I have enough evidence now. Let me produce the complete IOC report.
IOC Extraction Report — Cerber5.exe
Sample: Cerber5.exe
File type: PE32 executable for MS Windows 5.00 (GUI), Intel i386, 4 sections
Analysis method: Static only — no execution. Hashes, strings -a (ASCII), strings -u (UTF-16/Unicode via Sysinternals Strings v2.54)
Note: The binary exhibits heavy packing/obfuscation. ASCII strings output is predominantly noise. The richest readable content comes from the PE resource section (version info, certificate chain) and the import table. IOC extraction is correspondingly limited — sandbox/detonation logs would yield significantly more network and persistence IOCs.
A) IOC Table
Type: sha256
Indicator: b3e1e9d97d74c416c2a30dd11858789af5554cf2de62f577c13944a19623777d
Confidence: confirmed
Context: Sample hash
Evidence: b3e1e9d97d74c416c2a30dd11858789af5554cf2de62f577c13944a19623777d *Cerber5.exe
────────────────────────────────────────
Type: sha1
Indicator: c07dfdea8da2da5bad036e7c2f5d37582e1cf684
Confidence: confirmed
Context: Sample hash
Evidence: c07dfdea8da2da5bad036e7c2f5d37582e1cf684 *Cerber5.exe
────────────────────────────────────────
Type: md5
Indicator: fe1bc60a95b2c2d77cd5d232296a7fa4
Confidence: confirmed
Context: Sample hash
Evidence: fe1bc60a95b2c2d77cd5d232296a7fa4 *Cerber5.exe
────────────────────────────────────────
Type: file_name
Indicator: PDFWriter.EXE
Confidence: contextual
Context: Fake OriginalFilename in PE version info resource — masquerade
Evidence: OriginalFilename / PDFWriter.EXE (UTF-16 strings, VS_VERSION_INFO block)
────────────────────────────────────────
Type: url
Indicator: http://www.usertrust.com
Confidence: contextual
Context: Embedded Authenticode cert chain (OCSP/CRL infra — legitimate CA endpoint)
Evidence: http://www.usertrust.com1 (strings line 6727)
────────────────────────────────────────
Type: url
Indicator: http://crl.usertrust.com/UTN-USERFirst-Object.crl
Confidence: contextual
Context: Embedded cert CRL endpoint
Evidence: 1http://crl.usertrust.com/UTN-USERFirst-Object.crl05 (strings line 6747)
────────────────────────────────────────
Type: url
Indicator: http://ocsp.usertrust.com
Confidence: contextual
Context: Embedded cert OCSP endpoint
Evidence: http://ocsp.usertrust.com0 (strings line 6749)
────────────────────────────────────────
Type: url
Indicator: https://secure.comodo.net/CPS
Confidence: contextual
Context: Embedded cert policy URL
Evidence: https://secure.comodo.net/CPS0C (strings line 6761 context)
────────────────────────────────────────
Type: url
Indicator: http://crl.comodoca.com/COMODORSACodeSigningCA.crl
Confidence: contextual
Context: Embedded cert CRL endpoint
Evidence: 2http://crl.comodoca.com/COMODORSACodeSigningCA.crl0t (strings line 6788)
────────────────────────────────────────
Type: url
Indicator: http://crt.comodoca.com/COMODORSACodeSigningCA.crt
Confidence: contextual
Context: Embedded cert download URL
Evidence: 2http://crt.comodoca.com/COMODORSACodeSigningCA.crt0$ (strings line 6790)
────────────────────────────────────────
Type: url
Indicator: http://ocsp.comodoca.com
Confidence: contextual
Context: Embedded cert OCSP endpoint (appears twice)
Evidence: http://ocsp.comodoca.com0 (strings lines 6791, 6827)
────────────────────────────────────────
Type: url
Indicator: http://crl.comodoca.com/COMODORSACertificationAuthority.crl
Confidence: contextual
Context: Embedded cert CRL endpoint
Evidence: ;http://crl.comodoca.com/COMODORSACertificationAuthority.crl0q (strings line 6824)
────────────────────────────────────────
Type: url
Indicator: http://crt.comodoca.com/COMODORSAAddTrustCA.crt
Confidence: contextual
Context: Embedded cert download URL
Evidence: /http://crt.comodoca.com/COMODORSAAddTrustCA.crt0$ (strings line 6826)
────────────────────────────────────────
Type: candidate
Indicator: igythbnc76p2250F557-11B5-11CF-BB82-00AA00BDCE0Bf
Confidence: candidate
Context: Unstructured identifier string in UTF-16 resource section; possibly obfuscated CLSID or sample-specific token
Evidence: igythbnc76p2250F557-11B5-11CF-BB82-00AA00BDCE0Bf (UTF-16 strings, first readable line)
B) Structured IOC List (YAML)
hashes:
- value: "b3e1e9d97d74c416c2a30dd11858789af5554cf2de62f577c13944a19623777d"
algorithm: "sha256"
confidence: "confirmed"
source: "hash command output"
evidence_snippet: "b3e1e9d97d74c416c2a30dd11858789af5554cf2de62f577c13944a19623777d *Cerber5.exe"
- value: "c07dfdea8da2da5bad036e7c2f5d37582e1cf684"
algorithm: "sha1"
confidence: "confirmed"
source: "hash command output"
evidence_snippet: "c07dfdea8da2da5bad036e7c2f5d37582e1cf684 *Cerber5.exe"
- value: "fe1bc60a95b2c2d77cd5d232296a7fa4"
algorithm: "md5"
confidence: "confirmed"
source: "hash command output"
evidence_snippet: "fe1bc60a95b2c2d77cd5d232296a7fa4 *Cerber5.exe"
network:
- kind: "url"
value: "http://www.usertrust.com"
confidence: "contextual"
source: "strings output"
evidence_snippet: "http://www.usertrust.com1"
notes: "Embedded Authenticode cert chain — legitimate CA infrastructure"
- kind: "url"
value: "http://crl.usertrust.com/UTN-USERFirst-Object.crl"
confidence: "contextual"
source: "strings output"
evidence_snippet: "1http://crl.usertrust.com/UTN-USERFirst-Object.crl05"
notes: "Embedded cert CRL endpoint"
- kind: "url"
value: "http://ocsp.usertrust.com"
confidence: "contextual"
source: "strings output"
evidence_snippet: "http://ocsp.usertrust.com0"
notes: "Embedded cert OCSP endpoint"
- kind: "url"
value: "https://secure.comodo.net/CPS"
confidence: "contextual"
source: "strings output"
evidence_snippet: "https://secure.comodo.net/CPS0C"
notes: "Embedded cert policy URL"
- kind: "url"
value: "http://crl.comodoca.com/COMODORSACodeSigningCA.crl"
confidence: "contextual"
source: "strings output"
evidence_snippet: "2http://crl.comodoca.com/COMODORSACodeSigningCA.crl0t"
notes: "Embedded cert CRL endpoint"
- kind: "url"
value: "http://crt.comodoca.com/COMODORSACodeSigningCA.crt"
confidence: "contextual"
source: "strings output"
evidence_snippet: "2http://crt.comodoca.com/COMODORSACodeSigningCA.crt0$"
notes: "Embedded cert download URL"
- kind: "url"
value: "http://ocsp.comodoca.com"
confidence: "contextual"
source: "strings output"
evidence_snippet: "http://ocsp.comodoca.com0"
notes: "Embedded cert OCSP endpoint — appears twice in chain"
- kind: "url"
value: "http://crl.comodoca.com/COMODORSACertificationAuthority.crl"
confidence: "contextual"
source: "strings output"
evidence_snippet: ";http://crl.comodoca.com/COMODORSACertificationAuthority.crl0q"
notes: "Embedded cert CRL endpoint"
- kind: "url"
value: "http://crt.comodoca.com/COMODORSAAddTrustCA.crt"
confidence: "contextual"
source: "strings output"
evidence_snippet: "/http://crt.comodoca.com/COMODORSAAddTrustCA.crt0$"
notes: "Embedded cert download URL"
file_names:
- value: "PDFWriter.EXE"
confidence: "contextual"
source: "UTF-16 strings output"
evidence_snippet: "OriginalFilename\nPDFWriter.EXE"
notes: "Fake PE version info OriginalFilename — masquerades as Acro Software PDF Writer"
notes:
- "Candidate identifier string present in UTF-16 resource section: 'igythbnc76p2250F557-11B5-11CF-BB82-00AA00BDCE0Bf' — possibly obfuscated CLSID or Cerber-specific token; labeled candidate/incomplete due to ambiguity"
- "PE version info CompanyName: 'Acro Software Inc.' / FileDescription: 'PDF Writer Application' / FileVersion: '2, 7, 7, 1' — all appear to be forged masquerade metadata"
- "Import table exposes: CreateMutexA, MoveFileExW, CreateProcessW, DeleteFileA, LoadLibraryA/W — present in strings output but not directly actionable as network/host IOCs without sandbox confirmation"
- "All network URLs extracted are from an embedded Authenticode code-signing certificate chain (COMODO/UserTrust infrastructure). They are contextual only — not C2 indicators."
static_risk_notes:
- "File type is PE32 (Windows GUI, i386) targeting Windows 5.00 — very broad OS compatibility, consistent with mass-deployment malware."
- "ASCII strings output is predominantly short noise fragments with no coherent readable content — strong indicator of runtime packing or encryption. This severely limits static IOC extraction."
- "PE has only 4 sections (.text, .rdata, .data, .rsrc), which is sparse and consistent with a packed/compressed loader."
- "PE version info resource embeds a convincing but fake 'Acro Software Inc. PDF Writer' identity (OriginalFilename: PDFWriter.EXE) — a known Cerber ransomware masquerade technique."
- "Import table includes MoveFileExW (used by Cerber for MOVEFILE_DELAY_UNTIL_REBOOT self-deletion), CreateMutexA (anti-reinfection mutex), and CreateProcessW (process spawning) — a high-risk import combination consistent with ransomware behavior."
- "Recommended next step: controlled sandbox detonation with network logging, filesystem monitoring, and registry auditing to recover runtime C2 addresses, dropped file paths, registry persistence keys, and mutex names that packing conceals from static analysis."
Gaps & Recommended Next Steps
Gap: No C2 domains/IPs
Reason: Binary is packed; network config is decrypted at runtime
Recommended Action: Sandbox detonation with DNS/proxy logging
────────────────────────────────────────
Gap: No registry persistence keys
Reason: Same — runtime only
Recommended Action: Monitor HKCU\Software\Microsoft\Windows\CurrentVersion\Run and HKLM equivalents during detonation
────────────────────────────────────────
Gap: No mutex name
Reason: CreateMutexA import present but argument not in static strings
Recommended Action: Capture via API monitor (e.g., ProcMon, API Monitor)
────────────────────────────────────────
Gap: No dropped file paths
Reason: Runtime-only behavior
Recommended Action: Filesystem diff before/after detonation
────────────────────────────────────────
Gap: No ransom note paths/extensions
Reason: Encrypted in packed body
Recommended Action: Detonation required
✻ Cooked for 2m 42s
Codex IoC Extraction Test
Prompt Used: There is a malicious file within this directory. Perform IoC extraction following the re-ioc-extraction skill details.
Codex runs more autonomously at the beginning, which I believe is due to how the skill is written. This means that I can pay less attention to it while it runs. However, it takes a bit longer in terms of total run time. It also uses `rg` quite a bit for some reason, whereas Claude didn’t try to run it as much.
Codex IoC Extraction Report:
Using re-ioc-extraction on Cerber5.exe with static evidence only (Get-FileHash + strings.exe).
| Type | Indicator | Confidence | Context | Evidence |
|---|---|---|---|---|
| sha256 | b3e1e9d97d74c416c2a30dd11858789af5554cf2de62f577c13944a19623777d | confirmed | Sample hash | Hash : B3E1E9D97D74C416C2A30DD11858789AF5554CF2DE62F577C13944A19623777D (hash command output) |
| sha1 | c07dfdea8da2da5bad036e7c2f5d37582e1cf684 | confirmed | Sample hash | Hash : C07DFDEA8DA2DA5BAD036E7C2F5D37582E1CF684 (hash command output) |
| md5 | fe1bc60a95b2c2d77cd5d232296a7fa4 | confirmed | Sample hash | Hash : FE1BC60A95B2C2D77CD5D232296A7FA4 (hash command output) |
| url | http://www.usertrust.com1 | incomplete | URL-like string in embedded cert/signing metadata region | 742:http://www.usertrust.com1 (Cerber5.strings.n6.txt) |
| url | 1http://crl.usertrust.com/UTN-USERFirst-Object.crl05 | incomplete | URL-like CRL reference with leading/trailing contamination | 750:1http://crl.usertrust.com/UTN-USERFirst-Object.crl05 (Cerber5.strings.n6.txt) |
| url | http://ocsp.usertrust.com0 | incomplete | URL-like OCSP reference with trailing contamination | 751:http://ocsp.usertrust.com0 (Cerber5.strings.n6.txt) |
| url | https://secure.comodo.net/CPS0C | incomplete | URL-like CPS reference with trailing contamination | 764:https://secure.comodo.net/CPS0C (Cerber5.strings.n6.txt) |
| url | 2http://crl.comodoca.com/COMODORSACodeSigningCA.crl0t | incomplete | URL-like CRL reference with leading/trailing contamination | 765:2http://crl.comodoca.com/COMODORSACodeSigningCA.crl0t (Cerber5.strings.n6.txt) |
| url | 2http://crt.comodoca.com/COMODORSACodeSigningCA.crt0$ | incomplete | URL-like CRT reference with leading/trailing contamination | 766:2http://crt.comodoca.com/COMODORSACodeSigningCA.crt0$ (Cerber5.strings.n6.txt) |
| url | http://ocsp.comodoca.com0 | incomplete | URL-like OCSP reference with trailing contamination | 767:http://ocsp.comodoca.com0 (Cerber5.strings.n6.txt) |
| email | demuus@yandex.ru0 | incomplete | Email-like string with trailing contamination | 768:demuus@yandex.ru0 (Cerber5.strings.n6.txt) |
| file_name | PDFWriter.EXE | contextual | Embedded PE metadata file name | 732:PDFWriter.EXE (Cerber5.strings.n6.txt) |
hashes:
- value: "b3e1e9d97d74c416c2a30dd11858789af5554cf2de62f577c13944a19623777d"
algorithm: "sha256"
confidence: "confirmed"
source: "hash command output"
evidence_snippet: "Hash : B3E1E9D97D74C416C2A30DD11858789AF5554CF2DE62F577C13944A19623777D"
- value: "c07dfdea8da2da5bad036e7c2f5d37582e1cf684"
algorithm: "sha1"
confidence: "confirmed"
source: "hash command output"
evidence_snippet: "Hash : C07DFDEA8DA2DA5BAD036E7C2F5D37582E1CF684"
- value: "fe1bc60a95b2c2d77cd5d232296a7fa4"
algorithm: "md5"
confidence: "confirmed"
source: "hash command output"
evidence_snippet: "Hash : FE1BC60A95B2C2D77CD5D232296A7FA4"
network:
- kind: "url"
value: "http://www.usertrust.com1"
confidence: "incomplete"
source: "strings output"
evidence_snippet: "742:http://www.usertrust.com1"
- kind: "url"
value: "1http://crl.usertrust.com/UTN-USERFirst-Object.crl05"
confidence: "incomplete"
source: "strings output"
evidence_snippet: "750:1http://crl.usertrust.com/UTN-USERFirst-Object.crl05"
- kind: "url"
value: "http://ocsp.usertrust.com0"
confidence: "incomplete"
source: "strings output"
evidence_snippet: "751:http://ocsp.usertrust.com0"
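- kind: "url"
value: "https://secure.comodo.net/CPS0C"
confidence: "incomplete"
source: "strings output"
evidence_snippet: "764:https://secure.comodo.net/CPS0C"
- kind: "url"
value: "2http://crl.comodoca.com/COMODORSACodeSigningCA.crl0t"
confidence: "incomplete"
source: "strings output"
evidence_snippet: "765:2http://crl.comodoca.com/COMODORSACodeSigningCA.crl0t"
- kind: "url"
value: "2http://crt.comodoca.com/COMODORSACodeSigningCA.crt0$"
confidence: "incomplete"
source: "strings output"
evidence_snippet: "766:2http://crt.comodoca.com/COMODORSACodeSigningCA.crt0$"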
- kind: "url"
value: "http://ocsp.comodoca.com0"
confidence: "incomplete"
source: "strings output"
evidence_snippet: "767:http://ocsp.comodoca.com0"
file_names:
- value: "PDFWriter.EXE"
confidence: "contextual"
source: "strings output"
evidence_snippet: "732:PDFWriter.EXE"
emails:
- value: "demuus@yandex.ru0"
confidence: "incomplete"
source: "strings output"
evidence_snippet: "768:demuus@yandex.ru0"
notes:
- "No confirmed registry keys, mutexes, explicit process command lines, or clean C2 URLs/domains were directly observed in extracted strings."
- "Several network/email strings appear inside certificate/signing metadata and include adjacent-byte contamination; retained exactly as seen per traceability/no-guessing rules."
If you want, I can run a second pass that outputs a separate candidate_normalized list (mechanically trimmed URL/email forms) while keeping this raw-evidence set unchanged.
IoC Extraction Report from Claude
IoC Extraction - Next Steps and Recommendations from Claude
Now let’s move on to the unpacking tests.
Claude Unpacker Test
Prompt Used: None (I invoked the skill in the same context window as re-ioc-extractor)
During the run, Claude took the opportunity to open Detect-It-Easy on its own, which I had not seen when working with Codex previously.
Claude Unpacking Report:
I have all the data needed. Let me produce the full unpacking report.
---
Unpacking Report — Cerber5.exe
--- Packing Assessment
Verdict: Packed — custom packer (not UPX, no public signature match)
Confidence: high
Evidence (verbatim excerpts):
- Not UPX (confirmed):
upx: Cerber5.exe: NotPackedException: not packed by UPX
- No packer signature (DIE returned empty output):
die.exe Cerber5.exe → (no output) — signature database has no match for this sample's
packing stub.
- Entry point is near the END of .text, a classic stub-at-tail packing pattern:
- EP RVA: 0x497a0
- .text section: RVA 0x1000, VirtSize 0x493A5 → spans 0x1000–0x4A3A5
- EP at 0x497a0 is only ~0x5A5 bytes from the section end → the bulk of .text (≈ 299
KB) precedes the entry point and is likely the encrypted/compressed payload blob.
- .text entropy elevated at 6.29 with a disproportionately small analyzable function
count:
Section .text VirtSize=300005 RawSize=300032 Entropy=6.2904
FLOSS found only 14 functions to emulate in a 300 KB code section. Normal compiled code
at this size would yield hundreds of functions.
- FLOSS decoded zero meaningful strings; all 34 "decoded" values are garbage fragments:
0aALo PbAP pcAl1 0gAL piAl 0mAL ...
- These are outputs of FLOSS emulating the obfuscation/encoding routine in the stub
itself, not real payload strings.
- Import table contains probable decoy imports alongside legitimate stub imports
(GetProcAddress + LoadLibraryA/W = runtime dynamic linking confirmed by capa). Unusual
GDI internals (GdiQueryTable, BRUSHOBJ_pvAllocRbrush, GdiCleanCacheDC) and rarely-called
APIs are present — padding to confuse import-based classification.
- capa sees only 3 capabilities across 14 functions:
log keystrokes via polling collection/keylog
read clipboard data host-interaction/clipboard
link function at runtime on Windows linking/runtime-linking
- A Cerber5 ransomware binary should trigger dozens of capa capabilities (encryption,
file traversal, persistence, C2). The near-empty capa result means the analyzer cannot
see the payload code at all — it is obfuscated from static analysis.
---
Unpacking Plan (prioritized)
Step 1 — Static: Inspect the entry point stub in a disassembler (no execution)
The decryption stub is at VA 0x4497a0 (ImageBase 0x400000 + EP 0x497a0). A disassembler
can be used to trace the stub logic without running the sample:
Tools: Ghidra (free), Binary Ninja, or IDA Free — load Cerber5.exe, navigate to 0x4497a0,
and trace the first 50–100 instructions to identify:
- A decryption loop (XOR/ADD/ROL over a large memory range)
- A VirtualAlloc + memcpy/RtlMoveMemory pattern
- A jump/call eax or jmp [reg] transferring control to the newly decrypted region
This is safe (read-only analysis). Record the decryption key and algorithm if visible.
Validation: If the stub is a simple XOR/ADD cipher over the .text blob, it may be
reproducible with a short Python script — producing an unpacked PE without execution.
---
Step 2 — Static: Attempt byte-pattern identification
Run a binary search for known packer stubs (PEiD database, Yara):
yara -r <your-rules-dir> Cerber5.exe
If you have a custom Yara ruleset for common Cerber loaders or commercial protectors
(Themida, WinLicense, Obsidium), run those rules first. This may narrow down the stub
type without execution.
---
Step 3 — Dynamic (sandbox required — execution gate)
PAUSE: The following steps require executing Cerber5.exe in a controlled environment.
Engineer must approve and confirm:
- Isolated VM/sandbox with a clean, snapshotted state (revertible)
- Network posture: host-only or null-routed — Cerber5 is ransomware with active C2
behavior; do NOT run with internet access
- Monitoring tools to be active before launch: process monitor (ProcMon), API logger (API
Monitor or x32dbg + logging plugin), memory scanner (PE-sieve, Hollows Hunter)
- File system: monitor for dropped files, ransom notes, encrypted files — run against a
scratch folder with no real data
Paste sandbox outputs (process tree, API log, memory events) here and I will extract IOCs
and identify the unpack transition point from the evidence.
What to look for during controlled detonation:
| Signal | What it means |
|---|---|
| VirtualAlloc(RWX) or VirtualProtect(..., PAGE_EXECUTE_READWRITE) | Memory allocated for unpacked payload |
| Write loop to the new allocation | Payload being decrypted/decompressed into memory |
| jmp eax / call [reg] after the write | Transfer of execution to unpacked payload (OEP) |
| New PE header (MZ/PE) appearing in memory | In-memory PE — target for dumping |
| CreateProcess/WriteProcessMemory/SetThreadContext | Process hollowing — payload injected into another process |
Recommended dynamic tools (if available in your sandbox):
- x32dbg: Set breakpoint on VirtualAlloc. When the allocation with RWX happens, dump the
allocated region and search for MZ header.
- PE-sieve / Hollows Hunter (by hasherezade): Run after detonation to automatically
detect and dump injected/unpacked PE regions from memory.
- Any.run / Cuckoo sandbox: Submit and retrieve memory dumps + behavioral logs if an
automated sandbox is available.
---
Artifacts
Original sample:
path: Cerber5.exe
sha256: b3e1e9d97d74c416c2a30dd11858789af5554cf2de62f577c13944a19623777d
sha1: c07dfdea8da2da5bad036e7c2f5d37582e1cf684
md5: fe1bc60a95b2c2d77cd5d232296a7fa4
file: PE32 executable for MS Windows 5.00 (GUI), Intel i386, 4 sections
notes: Packed with unknown/custom packer. EP at end of .text (0x4497a0).
Rich decoy import table. 14 visible functions in 300 KB .text section.
Unpacked/dumped artifacts:
None produced — dynamic execution not yet approved.
Pending sandbox detonation and memory dump (Step 3 above).
---
Notable Import Table Findings (Static — no execution)
Even though the full payload is packed, the stub's import table reveals high-risk
capabilities that will be active at runtime. These are confirmed present in the binary's
IAT:
| Import | DLL | Risk Signal |
|---|---|---|
| GetProcAddress + LoadLibraryA/W | KERNEL32 | Dynamic API resolution — hides payload imports |
| IsDebuggerPresent | KERNEL32 | Anti-debug check |
| MoveFileExW | KERNEL32 | Self-deletion via MOVEFILE_DELAY_UNTIL_REBOOT |
| CreateMutexA | KERNEL32 | Anti-reinfection mutex |
| CreateProcessW | KERNEL32 | Process spawning / hollowing |
| DeleteFileA | KERNEL32 | File deletion |
| CreateServiceW + OpenSCManagerW | ADVAPI32 | Service-based persistence |
| RegSetValueExW + RegCreateKeyW | ADVAPI32 | Registry persistence |
| GetUserNameW + GetComputerNameW | KERNEL32/ADVAPI32 | Victim fingerprinting |
| SHGetSpecialFolderPathA/W | SHELL32 | AppData/Desktop path resolution (file traversal) |
| SHEmptyRecycleBinW | SHELL32 | Recycle Bin wiping |
| GetAsyncKeyState + GetKeyboardLayout | USER32 | Keylogging (confirmed by capa) |
| GetClipboardData + EnumClipboardFormats | USER32 | Clipboard harvesting (confirmed by capa) |
---
Next Steps (after unpacking)
1. Re-run /re-ioc-extraction on the unpacked artifact's strings and any sandbox logs —
this is expected to recover C2 domains/IPs, registry keys, mutex names, dropped file
paths, and ransom note paths that are currently hidden inside the packed payload.
2. Re-run capa on the unpacked artifact — expect dozens of capabilities (encryption, file
enumeration, network C2, persistence) vs. the current 3. The delta will validate
successful unpacking.
3. Extract Cerber config blob — Cerber5 embeds a JSON configuration containing C2
addresses, encryption parameters, and ransom note templates. After unpacking, search the
unpacked .data/.rdata for { and JSON-like structures.
4. Generate Yara rule from unpacked artifact strings/byte patterns for detection at the
payload layer.
5. Submit hashes to threat intel (VirusTotal, MalwareBazaar) — the SHA256
b3e1e9d97d74c416c2a30dd11858789af5554cf2de62f577c13944a19623777d may already have sandbox
reports with full IOC sets.
✻ Cooked for 5m 58s
Codex Unpacker Test
Prompt Used: None (I invoked the skill in the same context window as re-ioc-extractor)
Unpacking with Codex took about the same time as Claude, so no real performance boost or penalty. However, for this skill the results were vastly different.
Unpacker Report from Codex:
• Using re-unpacker on Cerber5.exe (static-only).
Packing Assessment
- Verdict: likely not packed
- Confidence: high
- Evidence:
- upx: .\Cerber5.exe: NotPackedException: not packed by UPX
- NumberOfSections: 4 with conventional PE sections: .text, .rdata, .data, .rsrc
- Entropy is not packer-typical across all sections:
SECTION .text ... ENT=6.290, .rdata ENT=5.684, .data ENT=2.705, .rsrc ENT=3.628
- Import table is rich (not minimal-stub-like):
ImportDLLCount: 9 and ImportFunctionCount: 323
- file.exe identifies a normal Windows PE:
PE32 executable (GUI) Intel 80386, for MS Windows
- Contextual note:
- Strings are present (Lines: 789 with strings -n 6), which also argues against heavy
runtime-only unpacking.
- FLOSS/capa were attempted but blocked in this sandbox by PyInstaller DLL extraction
permission errors.
Unpacking Plan (Prioritized)
1. Preserve baseline and provenance
- Keep original hashes: SHA256/SHA1/MD5 already collected.
2. Static confirmation pass (no execution)
- Re-run PE section/import analysis (already done with pefile) and archive output as
baseline.
- If available in your normal IR VM, run diec for a second packer-signature opinion.
3. Only attempt static unpack if evidence appears
- Current evidence does not justify upx -d or other unpacker runs.
4. Dynamic path only if new evidence indicates runtime unpacking
- Trigger condition: sandbox telemetry shows RWX+execute, stub-to-new-region jump,
hollowing/injection.
- If triggered, require engineer approval and controlled VM before execution.
Unpacking Report
- Original sample:
- Path: .\Cerber5.exe
- SHA256: b3e1e9d97d74c416c2a30dd11858789af5554cf2de62f577c13944a19623777d
- SHA1: c07dfdea8da2da5bad036e7c2f5d37582e1cf684
- MD5: fe1bc60a95b2c2d77cd5d232296a7fa4
- File type: PE32 executable (GUI) Intel 80386, for MS Windows
- Compile timestamp: 2017-07-22T09:03:05Z (from PE header)
- Artifacts produced:
- Unpacked/dumped artifacts: none
- Provenance: no packer confirmed; no dynamic execution performed.
- Environment/safety notes:
- Static-only analysis performed.
- No sample execution or detonation performed.
- FLOSS/capa blocked by sandbox DLL extraction permission issue.
Next Steps
1. Continue with static RE (functions/config extraction) rather than unpacking, since
packing evidence is weak.
2. If you want runtime confirmation, provide sandbox outputs (process/memory telemetry);
I’ll map stub-to-payload transitions.
3. Re-run IOC extraction on any new artifacts/logs from controlled sandbox telemetry
(especially memory dump outputs).
Full Summary of Results
This follow-up was meant to answer a simple question: if I take the same reverse engineering skills and run them across Codex and Claude, which one actually handles them better?
I’m not trying to “benchmark AI” in a vacuum here. I’m trying to answer the practical version of the question: which tool helps me produce defensible outputs faster, with fewer mistakes, and with less rework? In reverse engineering and malware analysis, that’s what matters. Not flashy answers — repeatable artifacts.
To keep things controlled, I ran both tools against the same sample (Cerber5.exe) inside a Windows (FLARE-VM) environment, and limited the workflow to static-only with explicit execution gates.
That constraint is important: it forces both tools to operate in the same “safe triage” mode I’d actually use when I’m trying to avoid accidentally detonating something or inventing indicators.
I tested two skills:
re-ioc-extraction (hashes + strings + strict, traceable evidence output)
re-unpacker (static-first packing assessment + unpacking plan with a hard “PAUSE” before detonation)
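To make the re-ioc-extraction inputs concrete, here is a minimal Python sketch of the two static collection steps: hashing the sample and pulling printable ASCII strings along with their byte offsets, so each indicator can be traced back to evidence. This is an illustration only, not code from the skills repository. The function names `hash_bytes` and `extract_ascii_strings` are hypothetical, and the 6-character minimum is an assumption that mirrors the `strings -n 6` run Codex reported.

```python
import hashlib
import re


def hash_bytes(data: bytes) -> dict[str, str]:
    """Hash raw bytes; the data is only read, never executed."""
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "md5": hashlib.md5(data).hexdigest(),
    }


def extract_ascii_strings(data: bytes, min_len: int = 6) -> list[tuple[int, str]]:
    """Return (byte offset, string) pairs for runs of printable ASCII.

    The offset is kept so every extracted string stays traceable to
    a location in the sample (hypothetical convention for this sketch).
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(data)]
```

Reading the sample with `open(path, "rb").read()` and passing the bytes to both functions keeps the whole step on the static side of the execution gate.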
What I found is that Codex and Claude are both capable — but they’re good in different ways. And the differences show up quickly once you’re actually running commands, collecting evidence, and producing artifacts you’d be willing to hand to another analyst.
Results
IOC extraction (re-ioc-extraction)
Both tools produced usable output, but the style and “shape” of that output were different.
Claude produced more of an analyst report. It read like something you could drop into notes immediately: clearer gap analysis, clearer next steps, and better explanation of what the indicators actually represent (especially when the strings are mostly noise).
Codex produced a more artifact-first output. It stayed closer to the skill contract and treated the world like evidence: if a string is contaminated, it preserved that contamination and labeled it rather than cleaning it up and accidentally turning it into something “more true” than the source data.
The best way I can describe the difference:
Claude helps you tell the story of what you found (and what you didn’t find).
Codex helps you preserve the ground truth of what you observed.
And for IOC extraction, that distinction matters a lot. Most false confidence comes from “cleaning up” data too early.
Unpacking (re-unpacker)
This is where the comparison became the most interesting — and honestly, the most useful.
Both tools ran the same “static-first unpacking” workflow, but they landed on different high-confidence conclusions:
Claude concluded the sample is packed (high confidence) and backed it with a detailed narrative and a strong static-first plan.
Codex concluded the sample is likely not packed (high confidence) and based that primarily on “normal-looking PE structure + rich imports + strings exist,” while also noting that FLOSS/capa were blocked in the sandbox due to permission issues.
So yes — both tools ran the skill, but they did not arrive at the same conclusion.
That disagreement matters because unpacking decisions drive everything downstream:
whether you trust static strings
whether you expect meaningful imports/capa output
whether you invest time in dynamic unpacking + memory dumping
how you interpret “missing” IOCs (because packed samples are often silent statically)
This is also the point where you can’t just say “Tool A is better.” You have to ask: what evidence did each tool actually have access to?
From the data I captured, Claude’s “packed” conclusion leaned heavily on signals like FLOSS/capa results and deeper entry point reasoning. Codex explicitly hit a tooling limitation (FLOSS/capa blocked), which removes two major signals Claude relied on.
So the fair takeaway isn’t “Codex is wrong.” It’s:
This comparison is only fully fair when both tools can see the same evidence.
And in this run, Claude produced the stronger unpacking write-up because it had (and used) richer signals.
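One way to make the "same evidence" condition explicit is to record which signals each tool actually had before comparing verdicts. The sketch below is only an illustration: the signal names, the `PackingVerdict` type, and the downgrade rule are my own assumptions and are not part of either skill, and the example signal sets are a simplified reading of the two reports.

```python
from dataclasses import dataclass, field

# Hypothetical signal names, loosely based on what the two reports mention.
EXPECTED_SIGNALS = frozenset({"pe_structure", "imports", "strings", "floss", "capa"})


@dataclass
class PackingVerdict:
    tool: str
    verdict: str                 # e.g. "packed" or "likely not packed"
    confidence: str              # confidence as reported by the tool
    signals_used: set = field(default_factory=set)


def missing_signals(v: PackingVerdict) -> set:
    """Signals the tool was expected to have but did not."""
    return set(EXPECTED_SIGNALS) - v.signals_used


def comparable(a: PackingVerdict, b: PackingVerdict) -> bool:
    """Treat two verdicts as directly comparable only if both saw the same signals."""
    return a.signals_used == b.signals_used


def effective_confidence(v: PackingVerdict) -> str:
    """Assumed rule: a reported 'high' drops to 'medium' if any signal is missing."""
    if v.confidence == "high" and missing_signals(v):
        return "medium"
    return v.confidence


# Illustrative data only; simplified from the reports above.
claude = PackingVerdict("Claude", "packed", "high", set(EXPECTED_SIGNALS))
codex = PackingVerdict("Codex", "likely not packed", "high",
                       {"pe_structure", "imports", "strings"})

print(comparable(claude, codex))      # are the two verdicts on equal footing?
print(sorted(missing_signals(codex)))  # which signals Codex lacked
print(effective_confidence(codex))     # confidence after the assumed downgrade
```

The point is not the specific rule, but that a "high confidence" label should travel together with the list of signals that were unavailable.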
Advantages and disadvantages
Codex
Advantages
More autonomous execution. Codex tends to run more independently at the beginning, and I can pay less attention while it drives collection and formatting. That matters in practice — because if a tool needs constant steering, it stops being an accelerator and starts being another task to babysit.
Great at strict artifacts. For IOC extraction, Codex preserved “ugly” evidence exactly as-is (e.g., http://www.usertrust.com1, demuus@yandex.ru0) and labeled it as incomplete rather than cleaning it up and pretending it’s perfect. That’s honestly what you want when “no invention” is the rule.
Good workflow behavior with the skill contract. It produces a consistent table + YAML output and even suggests a smart follow-up pattern: generate a separate normalized list while keeping raw evidence unchanged. That’s a real-world approach that prevents accidental data laundering.
Disadvantages
Less narrative depth. Codex outputs are useful, but they’re less “report-like” than Claude’s. The analysis is there, but it doesn’t read like a thorough investigative write-up. If you’re trying to hand this to leadership or another team, you may still need to write the story yourself.
Leans hard on rg. It likes to use ripgrep quite a bit, which can create friction on Windows depending on shell behavior and how exit codes get interpreted. It’s not a deal-breaker — but it’s one of those portability issues that shows up quickly when you move between environments.
Tooling failures can skew conclusions. In the unpacking test, FLOSS/capa attempts were blocked due to sandbox permission errors. That’s not “Codex is wrong,” but it does mean Codex was missing two signals Claude relied on heavily, and that likely contributed to its different packing verdict.
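On the ripgrep point above: rg exits with 0 when it finds a match, 1 when it finds nothing, and 2 when an error occurred, so a wrapper that treats every non-zero exit as a failure will misreport a clean "no hits" result. A minimal sketch of handling that distinction follows; the helper names are hypothetical, and it assumes `rg` is on the PATH.

```python
import subprocess


def classify_rg_exit(returncode: int) -> str:
    """Map ripgrep's exit status to a label: 0 = match, 1 = no match, else error."""
    if returncode == 0:
        return "match"
    if returncode == 1:
        return "no_match"
    return "error"


def search(pattern: str, path: str) -> list:
    """Run rg without a shell (avoids quoting differences between shells)."""
    proc = subprocess.run(
        ["rg", "--line-number", "--", pattern, path],
        capture_output=True,
        text=True,
    )
    status = classify_rg_exit(proc.returncode)
    if status == "error":
        raise RuntimeError(f"rg failed: {proc.stderr.strip()}")
    # "no_match" is a valid result, not a failure: return an empty list.
    return proc.stdout.splitlines()
```

Passing the arguments as a list rather than a shell string sidesteps most of the quoting friction between PowerShell, cmd, and bash.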
Claude
Advantages
Stronger reporting output. Claude’s IOC report is more readable and comes with a clean “Gaps & Recommended Next Steps” section that makes it obvious what you can’t know from static evidence and what you should do next. This is especially valuable on packed samples, where static-only evidence will always be incomplete.
Better at interpreting noisy evidence. It separated “certificate chain URLs” from true C2 indicators and called out that many hits are contextual rather than actionable. That’s exactly the kind of nuance that prevents people from filing bad detections.
More convincing unpacking rationale. Claude’s unpacking report is deep: entry point analysis, function count reasoning, entropy signal, and a clear argument that the payload isn’t visible to static tooling — plus a well-defined execution gate. This is the strongest output I saw across both tools for this skill.
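For readers unfamiliar with the entropy signal mentioned above, here is a minimal sketch of how byte-level Shannon entropy is typically computed. It is not taken from either tool's output or from the skill itself. Values close to 8 bits per byte are commonly read as a hint of compressed or encrypted data, but any cut-off is a rule of thumb and only one signal among several.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Two extremes: a single repeated byte vs. every byte value exactly once.
low = shannon_entropy(b"\x00" * 1024)
high = shannon_entropy(bytes(range(256)))
print(low, high)
```

In practice this would be applied per PE section rather than to the whole file, since one high-entropy section can be diluted by the rest of the binary.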
Disadvantages
Needs more help with commands. Claude needed a bit more assistance to run through command execution compared to Codex. This may be influenced by how I’m enforcing guardrails, but it’s still a real “operator experience” difference.
More initiative than I expected. It opened Detect-It-Easy on its own during the run, which was new compared to my prior Codex runs. Depending on how strict you want determinism and reproducibility to be, that can be a plus (it’s proactive) or a minus (it’s doing things you didn’t ask for).
Recommendations from Results
If you’re a reader trying to decide what to use, here’s the practical guidance:
1) Use Codex when you want reliable collection + strict artifacts
If your goal is to produce machine-friendly outputs (tables/YAML) and you want the workflow to run with minimal babysitting, Codex is a strong fit. It’s especially good when you care about preserving raw evidence and not polishing data into fiction.
2) Use Claude when you want report-quality reasoning and next steps
If your goal is an investigation narrative, especially on packed samples where static results are sparse, Claude is better at turning imperfect evidence into a clear story, with explicit gaps and recommended next actions.
3) Split IOC output into raw vs normalized candidates
Keep the raw-evidence IOC set untouched, but optionally produce a second “normalized candidates” lane for defenders. Codex already hinted at this pattern, and it’s the right way to stay defensible without being unusable.
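The two-lane idea can be sketched as follows. This is a hypothetical illustration and not the skill's implementation: the `IocCandidate` type and the single "trailing digit after a TLD" rule are my assumptions, chosen because they match the contaminated strings seen in this sample.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class IocCandidate:
    raw: str         # exactly as observed in the strings output; never modified
    normalized: str  # best-effort cleaned value for defenders
    note: str        # what was changed, so the edit stays traceable


# Assumed rule: one stray digit directly after something that looks like a TLD.
_TRAILING_DIGIT = re.compile(r"^(?P<body>.*\.[A-Za-z]{2,})\d$")


def normalize(raw: str) -> IocCandidate:
    """Return a candidate that keeps the raw value next to its cleaned form."""
    m = _TRAILING_DIGIT.match(raw)
    if m:
        return IocCandidate(raw, m.group("body"),
                            "trailing digit stripped; unverified candidate")
    return IocCandidate(raw, raw, "unchanged")


raw_lane = ["http://www.usertrust.com1", "demuus@yandex.ru0"]
normalized_lane = [normalize(r) for r in raw_lane]
for c in normalized_lane:
    print(f"{c.raw} -> {c.normalized} ({c.note})")
```

Because every normalized value carries its raw source and a note, a defender can use the cleaned list while an analyst can still audit each change.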
Bottom line - Who Wins?
If I had to summarize it in one sentence:
Codex is better at running the workflow and producing strict artifacts with less babysitting; Claude is better at turning imperfect evidence into a clear investigation narrative and unpacking plan. For this test, Claude produced the stronger unpacking conclusion.
Overall, Claude produced the better output in this test. However, I think Codex is not far off and has the potential to produce results equivalent to Claude’s.
And the deeper takeaway is this:
Skills are a great abstraction layer, but the tool that “handles skills better” depends on whether your primary output is an artifact you can operationalize, or a report you can hand to a human.