AI0Day

Specialized AI for security research.

A production API tuned for APT analysis, reverse engineering, vulnerability triage, and Web3 audits. Calibrated for direct technical answers on authorized security work, paired with a curated security-grade RAG corpus.

Endpoint https://api.ai0day.com  ·  Anthropic Messages API compatible  ·  OpenAI Chat Completions compatible

01 · Capabilities

Each mode invokes a dedicated retrieval pipeline (hybrid dense + sparse retrieval over a curated security corpus) before generation. The underlying model is a security-domain-specialized large language model running on a production-grade GPU inference cluster with long-context support.

APT Detection mode: apt_detection

Threat-actor TTPs, MITRE ATT&CK mapping, kill-chain reconstruction, lateral-movement detection, LOLBin hunting, C2 traffic patterns.

Reverse Engineering mode: reverse_analysis

Disassembly walkthroughs, packing identification, anti-debug bypasses, decompilation guidance, malware family attribution, custom obfuscator analysis.

Vulnerability Triage mode: vuln_triage

CVE analysis, CVSS breakdown, exploit conditions, PoC outlines, kernel and userspace memory-corruption review, mitigation chains.

Web3 Audit mode: web3_audit

Solidity audits (reentrancy, access control, oracle manipulation, flash-loan vectors), proxy upgrade safety, signature malleability, MEV/front-running surface.

02 · Getting Access

Access is invite-only. Both trial and paid keys are issued manually after a short conversation about your use case. We do this to keep the platform fast for working security researchers and to keep out generic abuse.

  1. Email [email protected] with: your name, organization, intended use case (red team / blue team / research / audit firm), and an estimate of your monthly request volume.
  2. We respond within 24 hours with either a trial key or a paid-plan invoice.
  3. You receive a key of the form sk-ai0day-… via email. Store it securely — keys are SHA-256 hashed at rest and cannot be recovered if lost.
  4. Test with the curl snippet below. If anything is wrong, we revoke the key, fix it, and reissue.
curl https://api.ai0day.com/v1/chat \
  -H "Authorization: Bearer sk-ai0day-…" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Audit reentrancy in withdraw() function."}],
    "mode": "web3_audit",
    "max_tokens": 800,
    "temperature": 0.0
  }'
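Step 3 notes that keys are SHA-256 hashed at rest. A minimal sketch of what that implies (the function names here are ours, not the gateway's): the server keeps only the digest and re-hashes whatever key you present, so a lost plaintext key cannot be recovered from storage and must be reissued.

```python
import hashlib
import hmac

def hash_key(api_key: str) -> str:
    """Digest stored at rest; the plaintext key is discarded after issuance."""
    return hashlib.sha256(api_key.encode()).hexdigest()

def verify_key(presented: str, stored_digest: str) -> bool:
    """Re-hash the presented key and compare in constant time."""
    return hmac.compare_digest(hash_key(presented), stored_digest)
```

Because SHA-256 is one-way, the stored digest reveals nothing about the key itself — hence reissue is the only recovery path.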
Health check: GET https://api.ai0day.com/health returns HTTP 200 with a status JSON when all components are operational. Wire it into your monitoring before your integration goes live.
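A monitoring gate might look like the sketch below. The body key it checks (`"status": "ok"`) is our assumption about the status JSON, not a documented field — adjust to the actual payload.

```python
import json
import urllib.request

HEALTH_URL = "https://api.ai0day.com/health"

def is_ready(status: int, body: dict) -> bool:
    """Ready only on HTTP 200 plus a passing status JSON (body shape assumed)."""
    return status == 200 and body.get("status") == "ok"

def check() -> bool:
    """Hit the health endpoint and decide readiness."""
    with urllib.request.urlopen(HEALTH_URL, timeout=10) as resp:
        return is_ready(resp.status, json.load(resp))
```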

03 · Pricing

One product, two simple tiers. We do not meter tokens. We meter access duration. This means you can deploy long-context RAG queries without watching a per-token counter.

Trial

Free · 24 hours

For evaluation. One key per organization, single device.

  • 1,000 requests over 24 hours
  • All 4 modes (apt / reverse / vuln / web3)
  • Anthropic + OpenAI compatible endpoints
  • Email support during trial window
Trial keys are time-limited and cannot be extended. Use the trial to validate, then upgrade.
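Since a trial key allows 1,000 requests over 24 hours with no extension, a client-side budget guard keeps an evaluation run from hitting the cap mid-test. This is an illustrative sliding-window counter, not anything the gateway provides:

```python
import time

class TrialBudget:
    """Client-side guard for the 1,000-requests/24h trial cap (illustrative)."""

    def __init__(self, limit: int = 1000, window_s: float = 24 * 3600):
        self.limit = limit
        self.window_s = window_s
        self.timestamps = []

    def allow(self, now=None) -> bool:
        """Record one request; False means the window is exhausted."""
        now = time.time() if now is None else now
        # Drop requests that have aged out of the 24h window.
        self.timestamps = [t for t in self.timestamps if now - t < self.window_s]
        if len(self.timestamps) >= self.limit:
            return False
        self.timestamps.append(now)
        return True
```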

Billing cycle is per key. We do not auto-charge; renewal is opt-in each month. If your key expires, requests return HTTP 401 key_expired until you renew.
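Because renewal is opt-in, your client should distinguish an expired key (stop and renew) from transient failures (retry). A small helper, assuming the 401 body carries `"error": "key_expired"` as described — the exact body shape is our assumption:

```python
def should_renew(status: int, body: dict) -> bool:
    """True when the gateway signals an expired key (HTTP 401 key_expired).

    The error-body shape checked here is an assumption; adjust to the
    actual response format.
    """
    return status == 401 and body.get("error") == "key_expired"

def should_retry(status: int) -> bool:
    """Transient server-side failures are worth retrying; auth errors are not."""
    return status in (429, 500, 502, 503, 504)
```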

04 · Use with Claude Code

Claude Code, Anthropic's CLI, can be redirected to AI0Day with two environment variables. Your existing Claude Code workflow (slash commands, MCP servers, hooks) continues to work — only the model and API endpoint change.

Three integration paths: Anthropic Messages API · OpenAI-compatible · Direct REST.

Set two environment variables. Claude Code sends every request to AI0Day instead of api.anthropic.com. The model is selected automatically by the gateway — you don’t have to choose.

# In ~/.zshrc / ~/.bashrc
export ANTHROPIC_BASE_URL="https://api.ai0day.com"
export ANTHROPIC_AUTH_TOKEN="sk-ai0day-…"

# Restart your shell, then verify:
claude --version
claude -p "Detect APT41 lateral movement signatures in a Linux env."

The gateway auto-detects which security mode to use (apt / reverse / vuln / web3) from the query content and routes to the specialized model. Multi-turn conversation history is supported. Tool use (function calling) is fully supported with native parsing for both Anthropic tools and OpenAI tools formats — pass them in the request body and the model emits structured tool calls. Token-by-token SSE streaming is on the roadmap.
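The gateway's actual routing logic is not public. The toy keyword router below only illustrates the idea of content-based mode selection — the keyword lists and scoring are entirely our invention:

```python
# Illustrative only: the real gateway's routing algorithm is not disclosed.
MODE_KEYWORDS = {
    "apt_detection": ["apt", "lateral movement", "ttp", "mitre", "c2"],
    "reverse_analysis": ["disassemble", "unpack", "decompile", "obfuscat"],
    "vuln_triage": ["cve", "cvss", "exploit", "memory corruption"],
    "web3_audit": ["solidity", "reentrancy", "flash loan", "oracle"],
}

def guess_mode(query: str, default: str = "vuln_triage") -> str:
    """Toy content-based router: pick the mode with the most keyword hits."""
    q = query.lower()
    scores = {m: sum(k in q for k in kws) for m, kws in MODE_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default
```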

If you prefer OpenAI client conventions (works with openai Python SDK, litellm, continue.dev, etc.):

export OPENAI_BASE_URL="https://api.ai0day.com/v1"
export OPENAI_API_KEY="sk-ai0day-…"

# Python
from openai import OpenAI
cli = OpenAI()
r = cli.chat.completions.create(
    model="ai0day",   # any non-empty string; gateway picks the right model
    messages=[{"role": "user", "content": "Triage CVE-2024-3400."}],
)
print(r.choices[0].message.content)

Mode auto-detected from query. To force a specific mode, append a suffix: model="ai0day-vuln_triage" (or -apt_detection / -reverse_analysis / -web3_audit).
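The base-plus-suffix convention is easy to mistype; a small helper (the function name is ours) can build the model string and reject unknown modes up front:

```python
MODES = ("apt_detection", "reverse_analysis", "vuln_triage", "web3_audit")

def model_for(mode=None) -> str:
    """Bare "ai0day" for gateway auto-detect; append a suffix to force a mode."""
    if mode is None:
        return "ai0day"
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    return f"ai0day-{mode}"
```

Pass the result as `model=` in the OpenAI-style call, e.g. `model=model_for("vuln_triage")`.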

For shell scripting, CI pipelines, or non-SDK environments:

curl https://api.ai0day.com/v1/chat \
  -H "Authorization: Bearer sk-ai0day-…" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Reverse engineer 0x55 0x48 0x89 0xe5 in x86_64 context."}
    ],
    "mode": "reverse_analysis",
    "max_tokens": 1024,
    "temperature": 0.0
  }'
Tip: The mode parameter is what triggers the security RAG retrieval pipeline. Without a mode, you get a generic LLM response. Always set mode to one of apt_detection / reverse_analysis / vuln_triage / web3_audit.
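Since a forgotten mode silently degrades to a generic response, a payload builder that refuses to construct a body without a valid mode is a cheap safeguard (helper name is ours; the field names match the curl example above):

```python
import json

VALID_MODES = {"apt_detection", "reverse_analysis", "vuln_triage", "web3_audit"}

def build_request(prompt: str, mode: str, max_tokens: int = 1024,
                  temperature: float = 0.0) -> str:
    """Return the JSON body for POST /v1/chat; raise if mode is invalid."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")
    return json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "mode": mode,
        "max_tokens": max_tokens,
        "temperature": temperature,
    })
```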

05 · Use Cases

Concrete workflows our users run in production. All examples use Claude Code with ANTHROPIC_BASE_URL pointing at AI0Day.

Reverse engineering a malware sample

$ claude
> /apt @./samples/loader.bin
> Disassemble the entry point and identify the unpacking routine.
> If you find a custom XOR loop, decode the embedded config block
> and tell me the C2 generation algorithm.

APT campaign attribution from telemetry

$ claude -p "Given these process-tree events from a Linux endpoint, \
which APT group's TTPs match best? Map to MITRE ATT&CK and \
suggest detection rules I can deploy in Falco."

Smart-contract audit before deploy

$ claude
> Audit contracts/Vault.sol and contracts/Strategy.sol.
> Flag every reentrancy path, oracle dependency, and access-control
> bypass. Output a severity-sorted finding list with line refs.

CVE triage in a CI pipeline

#!/bin/bash
# .github/workflows/triage.sh
# Build the body with jq so quotes or newlines in $CVE_DESC can't break the JSON.
payload=$(jq -n --arg desc "$CVE_DESC" \
  '{messages: [{role: "user", content: $desc}], mode: "vuln_triage", max_tokens: 1500}')
curl -s https://api.ai0day.com/v1/chat \
  -H "Authorization: Bearer $AI0DAY_KEY" \
  -H "Content-Type: application/json" \
  -d "$payload" \
  | jq -r '.content' > triage_report.md

06 · FAQ

What model is behind AI0Day?

A specialized large language model calibrated for security research workflows. We do not disclose the specific base model, training methodology, or hardware configuration — those are operational details we keep internal. What we do disclose: the model is tuned on a curated, professionally vetted security corpus, runs on dedicated production inference infrastructure, and is calibrated to provide direct technical answers for authorized penetration testing, CTF practice, vulnerability research, and reverse engineering. The adversarial-set refusal rate (Anthropic AdvBench15 + custom security probes) is 96.7% — the model balances genuine helpfulness for legitimate research with appropriate refusal of clearly malicious requests.

How is my data handled?

Requests are processed in-memory only. The gateway logs request metadata (timestamp, key id, mode, latency, status) for billing and abuse detection. Request bodies and response content are not persisted. The RAG corpus is read-only public security data.

What's the latency?

P50 around 12s, P95 around 30s for typical RAG queries (rag_k=3, max_tokens=512). Long-context queries (rag_k=8, max_tokens=1024) are ~14–22s. The latency budget is dominated by GPU inference time, not network.

Can I get higher concurrency?

The default rate limit is 5 concurrent requests globally and 2 per key. For higher concurrency contact us — we provision a dedicated inference cluster starting at $50K/month.
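Staying under the 2-concurrent-per-key limit is easiest to enforce client-side with a semaphore. A sketch (class and attribute names are ours) that also records peak concurrency so you can verify the cap holds:

```python
import threading

class KeyLimiter:
    """Cap in-flight requests per key (default 2, matching the per-key limit)."""

    def __init__(self, max_in_flight: int = 2):
        self.sem = threading.Semaphore(max_in_flight)
        self.lock = threading.Lock()
        self.in_flight = 0
        self.peak = 0

    def __enter__(self):
        self.sem.acquire()
        with self.lock:
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
        return self

    def __exit__(self, *exc):
        with self.lock:
            self.in_flight -= 1
        self.sem.release()
```

Wrap each request in `with limiter: send_request(...)` and excess callers simply queue.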

Do you support function calling / tool use?

Yes — fully supported. Both the Anthropic Messages and OpenAI Chat Completions endpoints accept tools in the request body and the model emits structured tool calls in the corresponding response format (tool_use blocks for Anthropic, tool_calls for OpenAI). Auto tool choice is enabled by default. Multi-step tool-call workflows (model calls tool → you execute → return result → model continues) work the same as with the upstream APIs, so existing agent frameworks (Claude Code, OpenAI agents SDK, LangChain, etc.) work without code changes.
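The execute-and-return leg of that loop, in the OpenAI `tool_calls` shape, can be sketched as a dispatch helper. The response dicts in the example are hand-built for illustration, not real API output:

```python
import json

def dispatch_tool_calls(tool_calls, registry):
    """Execute each structured tool call and build the follow-up tool messages.

    `tool_calls` mirrors the OpenAI response shape:
    [{"id": ..., "function": {"name": ..., "arguments": "<json>"}}]
    """
    results = []
    for call in tool_calls:
        fn = registry[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(fn(**args)),
        })
    return results
```

Append the returned messages to the conversation and call the model again to continue the multi-step workflow.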

What if a key gets compromised?

Email us. We disable it within minutes and reissue. Device fingerprints are tracked per key — if more than 3 distinct devices use the same key in 24 hours, we automatically flag it.
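The more-than-3-distinct-devices-in-24-hours rule amounts to a sliding-window check over per-key usage events. A sketch of that check (our own code, not the platform's detector), over (unix_ts, fingerprint) pairs:

```python
def flag_key(events, window_s=24 * 3600):
    """events: iterable of (unix_ts, device_fingerprint) for one key.

    True if any 24h window contains more than 3 distinct devices.
    """
    events = sorted(events)
    for i, (t0, _) in enumerate(events):
        devices = {fp for t, fp in events[i:] if t - t0 < window_s}
        if len(devices) > 3:
            return True
    return False
```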