A production API tuned for APT analysis, reverse engineering, vulnerability triage, and Web3 audits. Calibrated for direct technical answers on authorized security work, paired with a curated security-grade RAG corpus.
https://api.ai0day.com
· Anthropic Messages API compatible
· OpenAI Chat Completions compatible
Each mode invokes a dedicated retrieval pipeline (hybrid dense + sparse retrieval over a curated security corpus) before generation. The underlying model is a security-domain-specialized large language model running on a production-grade GPU inference cluster with long-context support.
mode: apt_detection · Threat-actor TTPs, MITRE ATT&CK mapping, kill-chain reconstruction, lateral-movement detection, LOLBin hunting, C2 traffic patterns.
mode: reverse_analysis · Disassembly walkthroughs, packing identification, anti-debug bypasses, decompilation guidance, malware family attribution, custom obfuscator analysis.
mode: vuln_triage · CVE analysis, CVSS breakdown, exploit conditions, PoC outlines, kernel and userspace memory-corruption review, mitigation chains.
mode: web3_audit · Solidity audits (reentrancy, access control, oracle manipulation, flash-loan vectors), proxy upgrade safety, signature malleability, MEV/front-running surface.
Access is invite-only. Trial and paid keys alike are issued manually after a short conversation about your use case. We do this to keep the platform fast for actual security researchers and closed to generic abuse.
Your key (sk-ai0day-…) arrives via email. Store it securely: keys are SHA-256 hashed at rest and cannot be recovered if lost.
curl https://api.ai0day.com/v1/chat \
-H "Authorization: Bearer sk-ai0day-…" \
-H "Content-Type: application/json" \
-d '{
"messages": [{"role": "user", "content": "Audit reentrancy in withdraw() function."}],
"mode": "web3_audit",
"max_tokens": 800,
"temperature": 0.0
}'
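The same request in stdlib Python, as a minimal sketch. The response schema is not documented here; reading the reply from a top-level "content" field mirrors the CI example further down and is an assumption.

```python
import json
import urllib.request

API_URL = "https://api.ai0day.com/v1/chat"

def build_chat_body(messages, mode, max_tokens=800, temperature=0.0):
    """Assemble the /v1/chat request body shown in the curl example."""
    return {
        "messages": messages,
        "mode": mode,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

def chat(api_key, messages, mode, **kwargs):
    """POST to /v1/chat with stdlib only; returns the parsed response JSON."""
    data = json.dumps(build_chat_body(messages, mode, **kwargs)).encode()
    req = urllib.request.Request(
        API_URL,
        data=data,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())
```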
GET https://api.ai0day.com/health returns HTTP 200 with a status JSON when all components are operational. Wire it into your monitoring before the integration goes live.
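A minimal monitoring probe, assuming only what the paragraph states (HTTP 200 plus a status JSON when healthy); the fields inside the status JSON are not documented, so they are passed through untouched.

```python
import json
import urllib.error
import urllib.request

def check_health(base_url="https://api.ai0day.com"):
    """Probe GET /health; returns (healthy, status_json_or_None).

    Treats anything other than a 200 with parseable JSON as unhealthy,
    including network errors, so it is safe to call from a cron job.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=10) as resp:
            return resp.status == 200, json.loads(resp.read())
    except (urllib.error.URLError, ValueError):
        return False, None
```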
One product, two simple tiers. We do not meter tokens. We meter access duration. This means you can deploy long-context RAG queries without watching a per-token counter.
For evaluation. One key per organization, single device.
For production research workflows.
Billing cycle is per key. We do not auto-charge; renewal is opt-in each month. If your key expires, requests return HTTP 401 key_expired until you renew.
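Since renewal is opt-in, clients should distinguish an expired key from other auth failures. The docs specify HTTP 401 and the key_expired token; the exact error-body shape ({"error": "key_expired"}) is an assumption in this sketch.

```python
def key_expired(status_code, error_body):
    """True when a response signals the documented 401 key_expired condition.

    Assumes the error body is a dict carrying the token under "error";
    adjust the lookup if the real payload nests it differently.
    """
    return status_code == 401 and error_body.get("error") == "key_expired"
```

A client can check this before retrying: a rate-limit 401 is worth retrying, an expired key is not until the human renews.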
Claude Code, Anthropic's CLI, can be redirected to AI0Day with two environment variables. Your existing Claude Code workflow (slash commands, MCP servers, hooks) continues to work — only the model and API endpoint change.
Once both are set, Claude Code sends every request to AI0Day instead of api.anthropic.com. The gateway selects the model automatically; you don't have to choose one.
# In ~/.zshrc / ~/.bashrc
export ANTHROPIC_BASE_URL="https://api.ai0day.com"
export ANTHROPIC_AUTH_TOKEN="sk-ai0day-…"
# Restart your shell, then verify:
claude --version
claude -p "Detect APT41 lateral movement signatures in a Linux env."
The gateway auto-detects which security mode to use (apt / reverse / vuln / web3) from the query content and routes to the specialized model. Multi-turn conversation history is supported. Tool use (function calling) is fully supported with native parsing for both Anthropic tools and OpenAI tools formats — pass them in the request body and the model emits structured tool calls. Token-by-token SSE streaming is on the roadmap.
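A request body with a tool attached, in the Anthropic tools format the gateway is documented to parse natively. The lookup_ioc tool is an illustrative name invented for this sketch, not a built-in.

```python
# Hypothetical tool definition (Anthropic `tools` format); only the field
# names follow the Anthropic schema, the tool itself is illustrative.
tools = [{
    "name": "lookup_ioc",
    "description": "Look up an indicator of compromise in a local TI store.",
    "input_schema": {
        "type": "object",
        "properties": {"ioc": {"type": "string"}},
        "required": ["ioc"],
    },
}]

request_body = {
    "messages": [{"role": "user",
                  "content": "Check this beacon domain against known C2 infrastructure."}],
    "mode": "apt_detection",
    "tools": tools,
    "max_tokens": 512,
}
```

If the model decides to call the tool, the response carries a structured tool_use block rather than plain text.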
If you prefer OpenAI client conventions (works with openai Python SDK, litellm, continue.dev, etc.):
export OPENAI_BASE_URL="https://api.ai0day.com/v1"
export OPENAI_API_KEY="sk-ai0day-…"
# Python
from openai import OpenAI
cli = OpenAI()
r = cli.chat.completions.create(
model="ai0day", # any non-empty string; gateway picks the right model
messages=[{"role": "user", "content": "Triage CVE-2024-3400."}],
)
print(r.choices[0].message.content)
Mode auto-detected from query. To force a specific mode, append a suffix: model="ai0day-vuln_triage" (or -apt_detection / -reverse_analysis / -web3_audit).
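The suffix convention is easy to get wrong in string concatenation, so a small helper that validates against the four documented modes can be useful:

```python
MODES = ("apt_detection", "reverse_analysis", "vuln_triage", "web3_audit")

def model_name(mode=None):
    """Return the model string: bare "ai0day" lets the gateway auto-detect,
    "ai0day-<mode>" forces one of the four documented pipelines."""
    if mode is None:
        return "ai0day"
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    return f"ai0day-{mode}"
```

For example, model_name("vuln_triage") yields "ai0day-vuln_triage", which drops straight into the model= parameter above.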
For shell scripting, CI pipelines, or non-SDK environments:
curl https://api.ai0day.com/v1/chat \
-H "Authorization: Bearer sk-ai0day-…" \
-H "Content-Type: application/json" \
-d '{
"messages": [
{"role": "user", "content": "Reverse engineer 0x55 0x48 0x89 0xe5 in x86_64 context."}
],
"mode": "reverse_analysis",
"max_tokens": 1024,
"temperature": 0.0
}'
The mode parameter is what triggers the security RAG retrieval pipeline; without it you get a generic LLM response. Always set mode to one of apt_detection, reverse_analysis, vuln_triage, or web3_audit.
Concrete workflows our users run in production. All examples use Claude Code with ANTHROPIC_BASE_URL pointing at AI0Day.
$ claude
> /apt @./samples/loader.bin
> Disassemble the entry point and identify the unpacking routine.
> If you find a custom XOR loop, decode the embedded config block
> and tell me the C2 generation algorithm.
$ claude -p "Given these process-tree events from a Linux endpoint, \
which APT group's TTPs match best? Map to MITRE ATT&CK and \
suggest detection rules I can deploy in Falco."
$ claude
> Audit contracts/Vault.sol and contracts/Strategy.sol.
> Flag every reentrancy path, oracle dependency, and access-control
> bypass. Output a severity-sorted finding list with line refs.
#!/bin/bash
# .github/workflows/triage.sh
# Build the body with jq -n so quotes or newlines in $CVE_DESC cannot
# break the JSON payload.
jq -n --arg desc "$CVE_DESC" \
  '{messages: [{role: "user", content: $desc}], mode: "vuln_triage", max_tokens: 1500}' \
  | curl -s https://api.ai0day.com/v1/chat \
      -H "Authorization: Bearer $AI0DAY_KEY" \
      -H "Content-Type: application/json" \
      -d @- \
  | jq -r '.content' > triage_report.md
A specialized large language model calibrated for security research workflows. We do not disclose the specific base model, training methodology, or hardware configuration; those are operational details we keep internal. What we do disclose: the model is tuned on a curated, professionally vetted security corpus, runs on dedicated production inference infrastructure, and is calibrated to provide direct technical answers for authorized penetration testing, CTF practice, vulnerability research, and reverse engineering. Refusal rate on adversarial sets (AdvBench plus custom security probes) measures 96.7%; the model balances genuine helpfulness for legitimate research with appropriate refusal of clearly malicious requests.
Requests are processed in-memory only. The gateway logs request metadata (timestamp, key id, mode, latency, status) for billing and abuse detection. Request bodies and response content are not persisted. The RAG corpus is read-only public security data.
P50 around 12s, P95 around 30s for typical RAG queries (rag_k=3, max_tokens=512). Long-context queries (rag_k=8, max_tokens=1024) are ~14–22s. The latency budget is dominated by GPU inference time, not network.
The default rate limit is 5 concurrent requests globally and 2 per key. For higher concurrency, contact us: we provision a dedicated inference cluster starting at $50K/month.
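With a 2-per-key cap, it is simpler to gate outbound calls locally than to handle server-side rejections (the over-limit response is not documented). A minimal sketch using a bounded semaphore:

```python
import threading

# Docs cap each key at 2 concurrent requests; holding a local slot for the
# duration of every call keeps the client under that limit by construction.
PER_KEY_CONCURRENCY = 2
_slots = threading.BoundedSemaphore(PER_KEY_CONCURRENCY)

def with_slot(call, *args, **kwargs):
    """Run `call` while holding one of the per-key concurrency slots."""
    with _slots:
        return call(*args, **kwargs)
```

Worker threads then wrap each API call in with_slot(...), and at most two requests are ever in flight per key.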
Yes — fully supported. Both the Anthropic Messages and OpenAI Chat Completions endpoints accept tools in the request body and the model emits structured tool calls in the corresponding response format (tool_use blocks for Anthropic, tool_calls for OpenAI). Auto tool choice is enabled by default. Multi-step tool-call workflows (model calls tool → you execute → return result → model continues) work the same as with the upstream APIs, so existing agent frameworks (Claude Code, OpenAI agents SDK, LangChain, etc.) work without code changes.
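One round of the multi-step workflow described above can be sketched as a helper in the standard OpenAI tool-call shape; nothing in it is AI0Day-specific, so it works against any Chat Completions-compatible endpoint.

```python
import json

def run_tool_round(messages, assistant_msg, handlers):
    """One step of the OpenAI-format tool loop: record the assistant turn,
    execute each requested tool via `handlers` (name -> callable), and append
    the results so the next chat.completions.create call can continue."""
    messages.append(assistant_msg)
    for call in assistant_msg.get("tool_calls", []):
        fn = call["function"]
        result = handlers[fn["name"]](**json.loads(fn["arguments"]))
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return messages
```

The caller loops: send messages, pass the returned assistant message through run_tool_round, and repeat until the model answers without tool_calls.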
Email us; we disable the compromised key within minutes and reissue. Device fingerprints are tracked per key: if more than 3 distinct devices use the same key within 24 hours, we flag it automatically.