AI Red-Team Agent

5-phase pipeline that hunts vulnerabilities in AI systems

Try It

Describe any AI system: its stack, tools, and endpoints. The 5-phase pipeline maps its attack surface, probes its defenses, and generates a remediation report.

Description

What it does: Describe any AI system and a 5-phase pipeline runs against that description. Each phase is handled by a specialized LLM agent with its own role: surface mapping, attack simulation, structural auditing, adaptive mutation, and final remediation. The red-team agent runs twice (baseline probing, then adaptive mutation), so four agents cover the five phases. The output of each phase becomes the input to the next, forming a serial intelligence chain.

Why it matters: Many AI systems are deployed without any red-teaming. Prompt injection, indirect injection via tools, privilege escalation through MCP endpoints, and guardrail bypasses are real attack vectors that often go untested. This pipeline automates the first pass.

Scope: This is a reasoning-based simulation — the agents analyze and reason about vulnerabilities given a target description. It does not send live traffic to real endpoints.

Pipeline Architecture

  AI Red-Team Pipeline
  ══════════════════════════════════════════════════════

  [User: Target Description]
          │
          ▼
  ┌─────────────────────────────┐
  │  PHASE 1 — AGENT-01-OSINT   │  Surface mapping, trust boundary enumeration,
  │  Target Ingestion           │  tool discovery, guardrail identification
  └─────────────────────────────┘
          │  TARGET_PROFILE (YAML)
          ▼
  ┌─────────────────────────────┐
  │  PHASE 2 — AGENT-02-REDTEAM │  5–8 baseline probe attacks — injection vectors,
  │  Baseline Probing           │  fuzzing, boundary testing
  └─────────────────────────────┘
          │  ROUND_01_ATTACK_TELEMETRY
          ▼
  ┌─────────────────────────────┐
  │  PHASE 3 — AGENT-03-AUDITOR │  SRI classification of each probe,
  │  Structural Analysis        │  ranked Vulnerability Matrix
  └─────────────────────────────┘
          │  DEEP_VULNERABILITY_MATRIX
          ▼
  ┌─────────────────────────────┐
  │  PHASE 4 — AGENT-02-REDTEAM │  Adaptive mutations targeting confirmed weaknesses:
  │  Deep Penetration           │  Unicode encoding, XML wrapping, multi-turn sequences
  └─────────────────────────────┘
          │  DEEP_BREACH_ANALYTICS
          ▼
  ┌─────────────────────────────┐
  │  PHASE 5 — AGENT-04-ZTC     │  Executive summary, confirmed exploit chains,
  │  Threat Assessment          │  concrete remediation blueprint
  └─────────────────────────────┘
          │
          ▼
  [Definitive Security Attestation Report]

Each phase's output becomes the next phase's input. Phase prompts are loaded live from hosted Markdown files, so swapping a file changes agent behavior without redeploying.
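
The diagram can also be written down as data. The following is a minimal sketch, assuming a Python representation that is not part of the original project: the `Phase` type and its field names are hypothetical, while the agent IDs, phase titles, and artifact names are copied from the diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    number: int
    agent: str   # agent ID as shown in the diagram
    name: str    # phase title as shown in the diagram
    output: str  # artifact handed to the next phase (or to the user)


# Hypothetical encoding of the five phases; only the labels come from the diagram.
PHASES = [
    Phase(1, "AGENT-01-OSINT", "Target Ingestion", "TARGET_PROFILE"),
    Phase(2, "AGENT-02-REDTEAM", "Baseline Probing", "ROUND_01_ATTACK_TELEMETRY"),
    Phase(3, "AGENT-03-AUDITOR", "Structural Analysis", "DEEP_VULNERABILITY_MATRIX"),
    Phase(4, "AGENT-02-REDTEAM", "Deep Penetration", "DEEP_BREACH_ANALYTICS"),
    Phase(5, "AGENT-04-ZTC", "Threat Assessment", "Definitive Security Attestation Report"),
]

# AGENT-02-REDTEAM appears in phases 2 and 4, so the set is smaller than the list.
distinct_agents = sorted({phase.agent for phase in PHASES})
```

Written this way, it is easy to see that five phases are served by four distinct agent roles, with the red-team agent returning in Phase 4 once the auditor has ranked the weaknesses.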

Dev Notes

Hot-Swappable Prompts

Each phase's system prompt is a Markdown file hosted on the portfolio site. The API route fetches them at runtime — change agent behavior by editing an MD file, no redeploy needed.
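
A minimal sketch of the runtime fetch, written in Python for illustration (the actual API route may be implemented differently). The base URL, the file names, and the `load_system_prompt` helper are all assumptions; the document does not state where the Markdown files live.

```python
from typing import Callable
from urllib.request import urlopen

# Hypothetical locations; the real URLs and file names are not given in this document.
PROMPT_BASE_URL = "https://example.com/prompts"
PROMPT_FILES = {
    1: "phase-1.md",
    2: "phase-2.md",
    3: "phase-3.md",
    4: "phase-4.md",
    5: "phase-5.md",
}


def fetch_text(url: str) -> str:
    """Download a text file over HTTP(S) using only the standard library."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def load_system_prompt(phase: int, fetch: Callable[[str], str] = fetch_text) -> str:
    """Fetch the prompt on every call so that edits to the hosted file take
    effect on the next pipeline run, with no redeploy."""
    url = f"{PROMPT_BASE_URL}/{PROMPT_FILES[phase]}"
    return fetch(url).strip()
```

The `fetch` parameter exists only so the loader can be exercised without network access. Because nothing is cached in this sketch, every run pays one HTTP round trip per phase; that is the trade-off for picking up prompt edits immediately.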

Streaming Architecture

The API route returns an NDJSON stream. Each line is a completed phase result. The UI renders each phase card as it arrives rather than waiting for the full pipeline to finish.
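
To illustrate the wire format, here is a small sketch of both ends of an NDJSON stream in Python. The record fields (`phase`, `output`) and the function names are assumptions made for this example; the document only says that each line is one completed phase result.

```python
import json
from typing import Dict, Iterable, Iterator


def encode_ndjson(results: Iterable[Dict]) -> Iterator[str]:
    """Yield one JSON document per line, one line per completed phase."""
    for result in results:
        # json.dumps escapes newlines inside strings, so each record stays on one line.
        yield json.dumps(result) + "\n"


def decode_ndjson(chunks: Iterable[str]) -> Iterator[Dict]:
    """Reassemble arbitrarily split text chunks into lines and parse each line."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # A chunk may contain zero, one, or several complete lines.
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            if line.strip():
                yield json.loads(line)
    # Handle a final record that arrived without a trailing newline.
    if buffer.strip():
        yield json.loads(buffer)
```

The decoder buffers partial lines because network chunks rarely align with record boundaries. A consumer can render a phase card for each record as soon as `decode_ndjson` yields it.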

Serial Intelligence Chain

Agents don't run in parallel — each receives the previous agent's full output as its user message. This gives the pipeline memory and lets later agents build on earlier findings.
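
The chaining described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's actual code: `run_chain` and the `LLMCall` signature are hypothetical, and any real model client would need to be wrapped to fit them.

```python
from typing import Callable, List

# (system_prompt, user_message) -> model output. Hypothetical signature.
LLMCall = Callable[[str, str], str]


def run_chain(target_description: str,
              system_prompts: List[str],
              call_llm: LLMCall) -> List[str]:
    """Run the phases one after another and return every phase's output."""
    outputs: List[str] = []
    user_message = target_description  # Phase 1 receives the user's description
    for system_prompt in system_prompts:
        output = call_llm(system_prompt, user_message)
        outputs.append(output)
        user_message = output  # the next phase receives this output in full
    return outputs


def fake_llm(system_prompt: str, user_message: str) -> str:
    """Stand-in for a model call so the data flow is visible."""
    return f"{system_prompt}({user_message})"


trace = run_chain("target", ["p1", "p2", "p3"], fake_llm)
```

With the stand-in model, the last entry of `trace` nests all earlier outputs, which makes the dependency between phases visible. In this sketch a phase sees earlier findings only through the previous phase's output, so anything a phase leaves out is not available downstream.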