A production API tuned for APT analysis, reverse engineering, vulnerability triage, and Web3 audits. Calibrated for direct technical answers on authorized security work, paired with a curated security-grade RAG corpus.
https://api.ai0day.com
· Anthropic Messages API compatible
· OpenAI Chat Completions compatible
Each mode invokes a dedicated retrieval pipeline (hybrid dense + sparse retrieval over a curated security corpus) before generation. The underlying model is a security-domain-specialized large language model running on a production-grade GPU inference cluster with long-context support.
mode: apt_detection · Threat-actor TTPs, MITRE ATT&CK mapping, kill-chain reconstruction, lateral-movement detection, LOLBin hunting, C2 traffic patterns.
mode: reverse_analysis · Disassembly walkthroughs, packing identification, anti-debug bypasses, decompilation guidance, malware family attribution, custom obfuscator analysis.
mode: vuln_triage · CVE analysis, CVSS breakdown, exploit conditions, PoC outlines, kernel and userspace memory-corruption review, mitigation chains.
mode: web3_audit · Solidity audits (reentrancy, access control, oracle manipulation, flash-loan vectors), proxy upgrade safety, signature malleability, MEV/front-running surface.
Access is invite-only. Trial and paid keys alike are issued manually after a short conversation about your use case. We do this to keep the platform fast for actual security researchers and closed to generic abuse.
Your key (sk-ai0day-…) arrives via email. Store it securely: keys are SHA-256 hashed at rest and cannot be recovered if lost.
curl https://api.ai0day.com/v1/chat \
-H "Authorization: Bearer sk-ai0day-…" \
-H "Content-Type: application/json" \
-d '{
"messages": [{"role": "user", "content": "Audit reentrancy in withdraw() function."}],
"mode": "web3_audit",
"max_tokens": 800,
"temperature": 0.0
}'
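The same request in stdlib Python, as a minimal sketch. The response schema is not documented here; reading the reply from a top-level "content" field mirrors the CI example further down and is an assumption.

```python
import json
import urllib.request

API_URL = "https://api.ai0day.com/v1/chat"

def build_chat_body(messages, mode, max_tokens=800, temperature=0.0):
    """Assemble the /v1/chat request body shown in the curl example."""
    return {
        "messages": messages,
        "mode": mode,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

def chat(api_key, messages, mode, **kwargs):
    """POST to /v1/chat with stdlib only; returns the parsed response JSON."""
    data = json.dumps(build_chat_body(messages, mode, **kwargs)).encode()
    req = urllib.request.Request(
        API_URL,
        data=data,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())
```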
GET https://api.ai0day.com/health returns HTTP 200 with a status JSON when all components are operational. Wire it into your monitoring before the integration goes live.
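A minimal monitoring probe, assuming only what the paragraph states (HTTP 200 plus a status JSON when healthy); the fields inside the status JSON are not documented, so they are passed through untouched.

```python
import json
import urllib.error
import urllib.request

def check_health(base_url="https://api.ai0day.com"):
    """Probe GET /health; returns (healthy, status_json_or_None).

    Treats anything other than a 200 with parseable JSON as unhealthy,
    including network errors, so it is safe to call from a cron job.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=10) as resp:
            return resp.status == 200, json.loads(resp.read())
    except (urllib.error.URLError, ValueError):
        return False, None
```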
One product, two simple tiers. We do not meter tokens. We meter access duration. This means you can deploy long-context RAG queries without watching a per-token counter.
For evaluation. One key per organization, single device.
For production research workflows.
Billing cycle is per key. We do not auto-charge; renewal is opt-in each month. If your key expires, requests return HTTP 401 key_expired until you renew.
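Since renewal is opt-in, clients should distinguish an expired key from other auth failures. The docs specify HTTP 401 and the key_expired token; the exact error-body shape ({"error": "key_expired"}) is an assumption in this sketch.

```python
def key_expired(status_code, error_body):
    """True when a response signals the documented 401 key_expired condition.

    Assumes the error body is a dict carrying the token under "error";
    adjust the lookup if the real payload nests it differently.
    """
    return status_code == 401 and error_body.get("error") == "key_expired"
```

A client can check this before retrying: a rate-limit 401 is worth retrying, an expired key is not until the human renews.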
Claude Code, Anthropic's CLI, can be redirected to AI0Day with two environment variables. Your existing Claude Code workflow (slash commands, MCP servers, hooks) continues to work — only the model and API endpoint change.
Once both are set, Claude Code sends every request to AI0Day instead of api.anthropic.com. The gateway selects the model automatically; you don't have to choose one.
# In ~/.zshrc / ~/.bashrc
export ANTHROPIC_BASE_URL="https://api.ai0day.com"
export ANTHROPIC_AUTH_TOKEN="sk-ai0day-…"
# Restart your shell, then verify:
claude --version
claude -p "Detect APT41 lateral movement signatures in a Linux env."
The gateway auto-detects which security mode to use (apt / reverse / vuln / web3) from the query content and routes to the specialized model. Multi-turn conversation history is supported. Tool use (function calling) is fully supported with native parsing for both Anthropic tools and OpenAI tools formats — pass them in the request body and the model emits structured tool calls. Token-by-token SSE streaming is on the roadmap.
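A request body with a tool attached, in the Anthropic tools format the gateway is documented to parse natively. The lookup_ioc tool is an illustrative name invented for this sketch, not a built-in.

```python
# Hypothetical tool definition (Anthropic `tools` format); only the field
# names follow the Anthropic schema, the tool itself is illustrative.
tools = [{
    "name": "lookup_ioc",
    "description": "Look up an indicator of compromise in a local TI store.",
    "input_schema": {
        "type": "object",
        "properties": {"ioc": {"type": "string"}},
        "required": ["ioc"],
    },
}]

request_body = {
    "messages": [{"role": "user",
                  "content": "Check this beacon domain against known C2 infrastructure."}],
    "mode": "apt_detection",
    "tools": tools,
    "max_tokens": 512,
}
```

If the model decides to call the tool, the response carries a structured tool_use block rather than plain text.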
If you prefer OpenAI client conventions (works with openai Python SDK, litellm, continue.dev, etc.):
export OPENAI_BASE_URL="https://api.ai0day.com/v1"
export OPENAI_API_KEY="sk-ai0day-…"
# Python
from openai import OpenAI
cli = OpenAI()
r = cli.chat.completions.create(
model="ai0day", # any non-empty string; gateway picks the right model
messages=[{"role": "user", "content": "Triage CVE-2024-3400."}],
)
print(r.choices[0].message.content)
Mode auto-detected from query. To force a specific mode, append a suffix: model="ai0day-vuln_triage" (or -apt_detection / -reverse_analysis / -web3_audit).
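The suffix convention is easy to get wrong in string concatenation, so a small helper that validates against the four documented modes can be useful:

```python
MODES = ("apt_detection", "reverse_analysis", "vuln_triage", "web3_audit")

def model_name(mode=None):
    """Return the model string: bare "ai0day" lets the gateway auto-detect,
    "ai0day-<mode>" forces one of the four documented pipelines."""
    if mode is None:
        return "ai0day"
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    return f"ai0day-{mode}"
```

For example, model_name("vuln_triage") yields "ai0day-vuln_triage", which drops straight into the model= parameter above.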
For shell scripting, CI pipelines, or non-SDK environments:
curl https://api.ai0day.com/v1/chat \
-H "Authorization: Bearer sk-ai0day-…" \
-H "Content-Type: application/json" \
-d '{
"messages": [
{"role": "user", "content": "Reverse engineer 0x55 0x48 0x89 0xe5 in x86_64 context."}
],
"mode": "reverse_analysis",
"max_tokens": 1024,
"temperature": 0.0
}'
The mode parameter is what triggers the security RAG retrieval pipeline; without it you get a generic LLM response. Always set mode to one of apt_detection, reverse_analysis, vuln_triage, or web3_audit.
Concrete workflows our users run in production. All examples use Claude Code with ANTHROPIC_BASE_URL pointing at AI0Day.
$ claude
> /apt @./samples/loader.bin
> Disassemble the entry point and identify the unpacking routine.
> If you find a custom XOR loop, decode the embedded config block
> and tell me the C2 generation algorithm.
$ claude -p "Given these process-tree events from a Linux endpoint, \
which APT group's TTPs match best? Map to MITRE ATT&CK and \
suggest detection rules I can deploy in Falco."
$ claude
> Audit contracts/Vault.sol and contracts/Strategy.sol.
> Flag every reentrancy path, oracle dependency, and access-control
> bypass. Output a severity-sorted finding list with line refs.
#!/bin/bash
# .github/workflows/triage.sh
# Build the body with jq -n so quotes or newlines in $CVE_DESC cannot
# break the JSON payload.
jq -n --arg desc "$CVE_DESC" \
  '{messages: [{role: "user", content: $desc}], mode: "vuln_triage", max_tokens: 1500}' \
  | curl -s https://api.ai0day.com/v1/chat \
      -H "Authorization: Bearer $AI0DAY_KEY" \
      -H "Content-Type: application/json" \
      -d @- \
  | jq -r '.content' > triage_report.md
A specialized large language model calibrated for security research workflows. We do not disclose the specific base model, training methodology, or hardware configuration; those are operational details we keep internal. What we do disclose: the model is tuned on a curated, professionally vetted security corpus, runs on dedicated production inference infrastructure, and is calibrated to provide direct technical answers for authorized penetration testing, CTF practice, vulnerability research, and reverse engineering. Refusal rate on adversarial sets (AdvBench plus custom security probes) measures 96.7%; the model balances genuine helpfulness for legitimate research with appropriate refusal of clearly malicious requests.
Requests are processed in-memory only. The gateway logs request metadata (timestamp, key id, mode, latency, status) for billing and abuse detection. Request bodies and response content are not persisted. The RAG corpus is read-only public security data.
P50 around 12s, P95 around 30s for typical RAG queries (rag_k=3, max_tokens=512). Long-context queries (rag_k=8, max_tokens=1024) are ~14–22s. The latency budget is dominated by GPU inference time, not network.
The default rate limit is 5 concurrent requests globally and 2 per key. For higher concurrency, contact us: we provision a dedicated inference cluster starting at $50K/month.
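With a 2-per-key cap, it is simpler to gate outbound calls locally than to handle server-side rejections (the over-limit response is not documented). A minimal sketch using a bounded semaphore:

```python
import threading

# Docs cap each key at 2 concurrent requests; holding a local slot for the
# duration of every call keeps the client under that limit by construction.
PER_KEY_CONCURRENCY = 2
_slots = threading.BoundedSemaphore(PER_KEY_CONCURRENCY)

def with_slot(call, *args, **kwargs):
    """Run `call` while holding one of the per-key concurrency slots."""
    with _slots:
        return call(*args, **kwargs)
```

Worker threads then wrap each API call in with_slot(...), and at most two requests are ever in flight per key.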
Yes — fully supported. Both the Anthropic Messages and OpenAI Chat Completions endpoints accept tools in the request body and the model emits structured tool calls in the corresponding response format (tool_use blocks for Anthropic, tool_calls for OpenAI). Auto tool choice is enabled by default. Multi-step tool-call workflows (model calls tool → you execute → return result → model continues) work the same as with the upstream APIs, so existing agent frameworks (Claude Code, OpenAI agents SDK, LangChain, etc.) work without code changes.
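One round of the multi-step workflow described above can be sketched as a helper in the standard OpenAI tool-call shape; nothing in it is AI0Day-specific, so it works against any Chat Completions-compatible endpoint.

```python
import json

def run_tool_round(messages, assistant_msg, handlers):
    """One step of the OpenAI-format tool loop: record the assistant turn,
    execute each requested tool via `handlers` (name -> callable), and append
    the results so the next chat.completions.create call can continue."""
    messages.append(assistant_msg)
    for call in assistant_msg.get("tool_calls", []):
        fn = call["function"]
        result = handlers[fn["name"]](**json.loads(fn["arguments"]))
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return messages
```

The caller loops: send messages, pass the returned assistant message through run_tool_round, and repeat until the model answers without tool_calls.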
Email us; we disable the compromised key within minutes and reissue. Device fingerprints are tracked per key: if more than 3 distinct devices use the same key within 24 hours, we flag it automatically.