Published adversarial evaluations of network intrusion detection systems routinely report evasion rates that cannot be reproduced on real networks. The cause: gradient-based attacks (FGSM, PGD) optimize over feature spaces that include physically impossible values, such as negative TTLs, negative packet counts, and invalid flag combinations. On UNSW-NB15, unconstrained PGD reports 79% evasion; constrained PGD, the only result achievable on a real network, reports 7%. The 72 pp gap (67 pp on NSL-KDD) is consistent across three datasets and three classifier architectures (MLP, Random Forest, XGBoost), with three independent seeds each. I released netadv, an open-source library for domain-constrained adversarial evaluation, with 65 unit tests and reproducible Colab benchmarks.
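The difference between the unconstrained and constrained numbers comes down to one extra projection per attack step. Below is a minimal sketch of that idea in NumPy, not netadv's actual API: the feature names, the bounds, and the function names `project_to_domain` and `constrained_pgd_step` are all hypothetical, and features are assumed to be in raw (unnormalized) units. It covers range and integrality constraints only; flag-combination rules would need a separate check.

```python
import numpy as np

# Hypothetical per-feature constraints: (min, max, is_integer).
# Real datasets would need one entry per feature, in raw units.
CONSTRAINTS = {
    "ttl": (0, 255, True),
    "packet_count": (0, np.inf, True),
    "duration": (0.0, np.inf, False),
}
FEATURES = list(CONSTRAINTS)


def project_to_domain(x):
    """Clip each feature to its valid range and round integer-valued features."""
    x = np.array(x, dtype=float)  # copy so the caller's array is untouched
    for i, name in enumerate(FEATURES):
        lo, hi, is_int = CONSTRAINTS[name]
        x[..., i] = np.clip(x[..., i], lo, hi)
        if is_int:
            x[..., i] = np.rint(x[..., i])
    return x


def constrained_pgd_step(x_adv, x_orig, grad, step, eps):
    """One L-infinity PGD step followed by projection onto the valid domain.

    `grad` is the loss gradient with respect to the input, supplied by the
    caller. Rounding can move an integer feature up to 0.5 outside the
    eps-ball; a stricter version would round toward x_orig instead.
    """
    x_new = x_adv + step * np.sign(grad)                # ascend the loss
    x_new = np.clip(x_new, x_orig - eps, x_orig + eps)  # stay within the eps-ball
    return project_to_domain(x_new)                     # stay physically valid
```

Dropping the final `project_to_domain` call gives the unconstrained variant, which is free to push a TTL or packet count below zero.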
Mike Sasso
Security Researcher · Adversarial ML · Detection Engineering · Hawaii
I study where security systems fail under adversarial pressure. My primary focus is whether published adversarial evaluations of network intrusion detection systems measure achievable attack success, or only how easily an optimizer exploits feature values that cannot exist on a real network. I am preparing a paper for ACM CCS 2027.
// research
Research
Full adversarial ML lifecycle on 2.54M real network flows. Baseline XGBoost F1: 0.9640. SHAP identifies the feature regions that determine correct classification — and those same regions are the attack surface. Black-box transfer attack: adversarial examples crafted against an MLP surrogate evade XGBoost at 16–18%, a 15× increase in false negatives, with zero access to target weights. Defense: Madry PGD adversarial training holds F1 at 0.9494 at ε=0.20, vs. collapse to 0.27 for the standard model.
Three classifiers are attacked after training with GradCAM-guided FGSM and PGD perturbations. Key finding: the regions GradCAM identifies as most important for correct classification are the same regions most vulnerable to adversarial perturbation. Explainability output maps directly to the attack surface, which has direct implications for any ML-based security system that exposes saliency or attribution data to users or downstream systems.
// projects
Projects
Dual-agent forensic pipeline for autonomous Windows disk image investigation. Phase 1: deterministic triage — 25 SIFT commands, log-odds scoring across 9 MITRE techniques, no LLM hallucination surface. Phase 2: agentic investigation — Claude sequences tool calls across event logs, prefetch, registry, MFT, shellbags. Phase 3: adversarial auditor — receives findings only, mandate to refute. Red vs. Blue training loop: 1,245 evasion variants evolved, 83 detection signals learned.
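To illustrate the Phase 1 idea of deterministic log-odds scoring, here is a minimal sketch. It is not the pipeline's actual code: the indicator names, weights, prior, and the function `score_techniques` are hypothetical, and only two MITRE technique IDs are shown instead of nine. Each observed indicator adds a fixed log-likelihood-ratio weight to a prior, and the sum is mapped to a probability with a sigmoid.

```python
import math

# Hypothetical indicator weights (log-likelihood ratios) per MITRE technique.
WEIGHTS = {
    "T1059": {"powershell_encoded_cmd": 2.0, "script_in_prefetch": 1.0},
    "T1003": {"lsass_access": 2.5, "sam_hive_copy": 1.5},
}
PRIOR_LOG_ODDS = -3.0  # assumed prior: each technique is unlikely by default


def score_techniques(observed):
    """Return a probability per technique from the set of observed indicators."""
    scores = {}
    for technique, indicators in WEIGHTS.items():
        log_odds = PRIOR_LOG_ODDS + sum(
            weight for name, weight in indicators.items() if name in observed
        )
        scores[technique] = 1.0 / (1.0 + math.exp(-log_odds))  # sigmoid
    return scores


scores = score_techniques({"lsass_access", "sam_hive_copy"})
ranked = sorted(scores, key=scores.get, reverse=True)
```

Because the scoring is a fixed sum over parsed tool output, the same disk image always yields the same ranking, which is what keeps this phase free of model-generated claims.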
Autonomous incident investigation triggered by a single SIEM alert. Correlates brute-force, lateral movement, and credential access logs across Splunk data lakes, maps them to MITRE ATT&CK, and generates analyst-ready IR reports. All SPL templates are read-only, with no write-back commands in the codebase, and input fields are validated against strict regex allowlists, so a malicious model output cannot modify data.
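The allowlist pattern can be sketched as follows. This is an illustration under assumptions, not the project's code: the field names, regexes, index name, and SPL template are hypothetical. The point is that model-supplied values are only ever substituted into a fixed, read-only search template, and only after each value fully matches its pattern.

```python
import re

# Hypothetical allowlist: one strict pattern per input field.
ALLOWLIST = {
    "user": re.compile(r"[A-Za-z0-9._-]{1,64}"),
    "src_ip": re.compile(r"(?:\d{1,3}\.){3}\d{1,3}"),
}

# Hypothetical read-only template: it searches and aggregates, nothing else.
TEMPLATE = 'search index=auth user="{user}" src_ip="{src_ip}" | stats count by action'


def build_query(fields):
    """Fill the fixed template, rejecting any value that fails its allowlist."""
    for name, pattern in ALLOWLIST.items():
        value = fields.get(name)
        # fullmatch: the whole value must match, so quotes, pipes, and
        # whitespace can never reach the query string.
        if not isinstance(value, str) or not pattern.fullmatch(value):
            raise ValueError(f"field {name!r} failed allowlist validation")
    return TEMPLATE.format(**{name: fields[name] for name in ALLOWLIST})
```

A value such as `alice" | delete` is rejected before any query is built, so the model never gets to choose which SPL commands run.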
Autonomous IR agent with hybrid semantic + ES|QL search over 73,909 real Windows attack events. Session-scoped write-back memory prevents IOC contamination between investigations by design. Memory content is sanitized before every Elasticsearch write to block indirect prompt injection via poisoned retrieval. An independent forensic auditor labels every MITRE claim VERIFIED / REFUTED / UNVERIFIABLE with raw event evidence.
Reverse-engineered a legitimate RMM binary repurposed as a threat actor delivery vehicle. Decompiled the MSI payload, extracted hardcoded C2 credentials and attacker telemetry tokens, and mapped the full staging environment — including a leaked infrastructure blueprint exposed through an unsecured staging endpoint. Findings fed directly into detection logic and client incident response.
C-based file encryption and exfiltration tool built from primitives: TEA block cipher from spec, PKCS#7 padding, raw TCP socket C2, compiler-resistant key wiping. Built to understand what an attacker builds at the lowest level — the memory layout, socket patterns, wipe timing — so the detection is grounded in the primitive, not the abstraction.
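For readers who want to see the two primitives side by side, here is a compact Python sketch of TEA and PKCS#7 padding. The tool itself is written in C; this is an illustration, not a port. Big-endian word packing and ECB chaining are assumptions made for brevity (the source does not state the mode), and key wiping is omitted because Python gives no control over memory layout.

```python
import struct

DELTA = 0x9E3779B9  # TEA key-schedule constant
MASK = 0xFFFFFFFF   # emulate 32-bit unsigned arithmetic
BLOCK = 8           # TEA block size in bytes


def tea_encrypt_block(block, key):
    """Encrypt one 8-byte block with a 16-byte key (32 cycles)."""
    v0, v1 = struct.unpack(">2I", block)
    k0, k1, k2, k3 = struct.unpack(">4I", key)
    total = 0
    for _ in range(32):
        total = (total + DELTA) & MASK
        v0 = (v0 + ((((v1 << 4) & MASK) + k0) ^ (v1 + total) ^ ((v1 >> 5) + k1))) & MASK
        v1 = (v1 + ((((v0 << 4) & MASK) + k2) ^ (v0 + total) ^ ((v0 >> 5) + k3))) & MASK
    return struct.pack(">2I", v0, v1)


def tea_decrypt_block(block, key):
    """Invert tea_encrypt_block by running the cycles in reverse."""
    v0, v1 = struct.unpack(">2I", block)
    k0, k1, k2, k3 = struct.unpack(">4I", key)
    total = (DELTA * 32) & MASK
    for _ in range(32):
        v1 = (v1 - ((((v0 << 4) & MASK) + k2) ^ (v0 + total) ^ ((v0 >> 5) + k3))) & MASK
        v0 = (v0 - ((((v1 << 4) & MASK) + k0) ^ (v1 + total) ^ ((v1 >> 5) + k1))) & MASK
        total = (total - DELTA) & MASK
    return struct.pack(">2I", v0, v1)


def pkcs7_pad(data):
    """Append n bytes of value n so the length is a multiple of BLOCK."""
    n = BLOCK - len(data) % BLOCK
    return data + bytes([n]) * n


def pkcs7_unpad(data):
    """Validate and strip PKCS#7 padding."""
    n = data[-1] if data else 0
    if not 1 <= n <= BLOCK or data[-n:] != bytes([n]) * n:
        raise ValueError("invalid PKCS#7 padding")
    return data[:-n]


def encrypt(data, key):
    """Pad, then encrypt block by block (ECB, assumed here for brevity only)."""
    padded = pkcs7_pad(data)
    return b"".join(
        tea_encrypt_block(padded[i:i + BLOCK], key) for i in range(0, len(padded), BLOCK)
    )


def decrypt(data, key):
    """Decrypt block by block, then strip the padding."""
    if not data or len(data) % BLOCK:
        raise ValueError("ciphertext length must be a positive multiple of 8")
    plain = b"".join(
        tea_decrypt_block(data[i:i + BLOCK], key) for i in range(0, len(data), BLOCK)
    )
    return pkcs7_unpad(plain)
```

Note that PKCS#7 always adds at least one byte, so an input that is already block-aligned grows by a full block. That fixed size relationship is one of the low-level patterns a detection can key on.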
// writing
Writing
Technical analyses, adversarial emulation breakdowns, detection engineering notes, and infrastructure writeups.