Try It
Describe any AI system: its stack, tools, and endpoints. The five-phase pipeline (four agents, with the red-team agent running twice) maps its attack surface, probes defenses, and generates a remediation report.
Description
Why it matters: Most AI systems are deployed without red-teaming. Prompt injection, indirect injection via tools, privilege escalation through MCP endpoints, and guardrail bypasses are real attack surfaces that go untested. This pipeline automates the first pass.
Scope: This is a reasoning-based simulation — the agents analyze and reason about vulnerabilities given a target description. It does not send live traffic to real endpoints.
Pipeline Architecture
AI Red-Team Pipeline
══════════════════════════════════════════════════════
  [User: Target Description]
               │
               ▼
┌─────────────────────────────┐
│ PHASE 1: AGENT-01-OSINT     │  Surface mapping, trust boundary enumeration,
│ Target Ingestion            │  tool discovery, guardrail identification
└─────────────────────────────┘
               │ TARGET_PROFILE (YAML)
               ▼
┌─────────────────────────────┐
│ PHASE 2: AGENT-02-REDTEAM   │  5–8 baseline probe attacks: injection vectors,
│ Baseline Probing            │  fuzzing, boundary testing
└─────────────────────────────┘
               │ ROUND_01_ATTACK_TELEMETRY
               ▼
┌─────────────────────────────┐
│ PHASE 3: AGENT-03-AUDITOR   │  SRI classification of each probe,
│ Structural Analysis         │  ranked Vulnerability Matrix
└─────────────────────────────┘
               │ DEEP_VULNERABILITY_MATRIX
               ▼
┌─────────────────────────────┐
│ PHASE 4: AGENT-02-REDTEAM   │  Adaptive mutations targeting confirmed weaknesses:
│ Deep Penetration            │  Unicode encoding, XML wrapping, multi-turn sequences
└─────────────────────────────┘
               │ DEEP_BREACH_ANALYTICS
               ▼
┌─────────────────────────────┐
│ PHASE 5: AGENT-04-ZTC       │  Executive summary, confirmed exploit chains,
│ Threat Assessment           │  concrete remediation blueprint
└─────────────────────────────┘
               │
               ▼
[Definitive Security Attestation Report]
Each agent's output becomes the next agent's input. Phase prompts are loaded live from hosted MD files; swap the files to change agent behavior without redeploying.
Dev Notes
Hot-Swappable Prompts
Each phase's system prompt is a Markdown file hosted on the portfolio site. The API route fetches these files at runtime, so editing an MD file changes agent behavior with no redeploy needed.
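A minimal sketch of the runtime fetch, written in Python purely for illustration (the source does not say which language the API route uses). The base URL, the file names, and the functions `fetch_text` and `load_phase_prompt` are all hypothetical. The point it tries to show is that nothing is cached between runs, so an edited file takes effect on the next pipeline run.

```python
from typing import Callable
from urllib.request import urlopen

# Hypothetical location and file names; the source does not give them.
PROMPT_BASE_URL = "https://example.com/prompts"
PHASE_PROMPT_FILES = {
    1: "agent-01-osint.md",
    2: "agent-02-redteam-baseline.md",
    3: "agent-03-auditor.md",
    4: "agent-02-redteam-deep.md",
    5: "agent-04-ztc.md",
}


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Download a text file over HTTP and decode it as UTF-8."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def load_phase_prompt(phase: int, fetch: Callable[[str], str] = fetch_text) -> str:
    """Return the system prompt for a phase, fetched fresh on every call.

    No caching is done here on purpose: the hosted Markdown file is the
    single source of truth, so edits show up without a redeploy.
    """
    url = f"{PROMPT_BASE_URL}/{PHASE_PROMPT_FILES[phase]}"
    return fetch(url)
```

The `fetch` parameter is injectable so the loader can be exercised without network access, for example by passing a function that reads from a dictionary.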
Streaming Architecture
The API route returns an NDJSON stream. Each line is a completed phase result. The UI renders each phase card as it arrives rather than waiting for the full pipeline to finish.
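The NDJSON framing described above can be sketched as follows. This is an illustrative Python version, not the project's code: the function names and the `phase` and `output` fields used in the usage note are assumptions. The writer emits one JSON object per line, and the reader buffers incoming chunks so that a phase result split across network chunks is still parsed only once its line is complete.

```python
import json
from typing import Iterable, Iterator


def to_ndjson(results: Iterable[dict]) -> Iterator[str]:
    """Serialize each completed phase result as one newline-terminated JSON line."""
    for result in results:
        # json.dumps escapes newlines inside strings, so each record stays on one line.
        yield json.dumps(result) + "\n"


def parse_ndjson(chunks: Iterable[str]) -> Iterator[dict]:
    """Yield parsed records as soon as a full line has arrived.

    Chunks may end anywhere, including mid-record, so unfinished text is
    kept in a buffer until its terminating newline shows up.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            if line.strip():
                yield json.loads(line)
    # Tolerate a final record that lacks a trailing newline.
    if buffer.strip():
        yield json.loads(buffer)
```

A consumer would loop over `parse_ndjson(...)` and render one card per record, for example a record shaped like `{"phase": 1, "output": "..."}`.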
Serial Intelligence Chain
Agents don't run in parallel — each receives the previous agent's full output as its user message. This gives the pipeline memory and lets later agents build on earlier findings.
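The serial hand-off can be sketched as a simple loop. This is a hedged Python illustration under stated assumptions: `run_chain` and the `call_model` callback are hypothetical names, and `call_model` stands in for whatever LLM client the real pipeline uses. Each phase receives its own system prompt plus the previous phase's full output as the user message.

```python
from typing import Callable, Iterable, Iterator, Tuple

# (phase label, system prompt) pairs, in execution order.
Phase = Tuple[str, str]


def run_chain(
    target_description: str,
    phases: Iterable[Phase],
    call_model: Callable[[str, str], str],
) -> Iterator[Tuple[str, str]]:
    """Run phases one after another, feeding each output into the next phase.

    call_model(system_prompt, user_message) is a stand-in for the real
    model call and must return the phase's text output.
    """
    user_message = target_description
    for label, system_prompt in phases:
        output = call_model(system_prompt, user_message)
        # Yield immediately so a caller can stream this phase before the next starts.
        yield label, output
        # The full output, not a summary, becomes the next phase's user message.
        user_message = output
```

Writing the chain as a generator is a design choice of this sketch: it lets each finished phase be emitted to the client while later phases are still pending.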