API Documentation
Toxly Content Safety API
Classify user-generated content with structured decisions, risk scores, policy matches, category scores, and privacy-conscious moderation logs.
https://dashboard.toxly.net
Core concepts
Project
A workspace for one product, environment, or customer. Projects own API keys, policies, logs, usage, and limits.
API key
A secret token that authenticates API requests. Keys are stored hashed and can be revoked without deleting the project.
Policy
A list of category thresholds and actions. Policies decide whether content is allowed, warned, reviewed, blocked, masked, or escalated.
Provider
The classifier behind moderation. Toxly currently uses fast rule-based checks and is prepared for Ollama, vLLM, and other model providers.
Quickstart
- Create an account in the dashboard.
- Create a project for your product or environment.
- Create an API key and copy it once.
- Send text to POST /v1/moderate/text.
- Use decision, allowed, and risk_score in your app logic.
curl -X POST https://dashboard.toxly.net/v1/moderate/text \
  -H "Content-Type: application/json" \
  -H "X-Toxly-Key: txly_xxxxx" \
  -d '{"text":"message from your app"}'
Authentication
All public API requests require the X-Toxly-Key header. Never expose this key in frontend code. Call Toxly from your backend, serverless function, worker, or trusted infrastructure.
X-Toxly-Key: txly_xxxxxxxxx
POST /v1/moderate/text
Moderates one text input and returns a structured policy decision. This is the primary endpoint for comments, chat messages, prompts, bios, support tickets, and other user-generated text.
https://dashboard.toxly.net/v1/moderate/text
Minimal request
{
  "text": "hello world"
}
Request with policy and metadata
{
  "text": "message from a user",
  "policy_id": 12,
  "metadata": {
    "source": "comment",
    "user_id": "usr_123",
    "locale": "en"
  }
}
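A request body like the one above can be assembled with a small helper. The sketch below is illustrative only: `build_moderation_payload` is a hypothetical name, and it assumes that `policy_id` and `metadata` can simply be left out when unused, as the minimal request suggests.

```python
import json
from typing import Any, Optional


def build_moderation_payload(
    text: str,
    policy_id: Optional[int] = None,
    metadata: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Build the JSON body for POST /v1/moderate/text (hypothetical helper)."""
    payload: dict[str, Any] = {"text": text}
    # Assumption: optional fields are omitted rather than sent as null.
    if policy_id is not None:
        payload["policy_id"] = policy_id
    if metadata:
        payload["metadata"] = metadata
    return payload


body = build_moderation_payload(
    "message from a user",
    policy_id=12,
    metadata={"source": "comment", "user_id": "usr_123", "locale": "en"},
)
print(json.dumps(body, indent=2))
```

Calling the helper with only `text` produces the minimal request shown earlier.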
Request fields
Response format
Every successful moderation response follows the same shape, so you can route decisions consistently inside your product.
{
  "request_id": "mod_xxx",
  "allowed": false,
  "decision": "block",
  "risk_score": 0.82,
  "categories": {
    "toxicity": 0.74,
    "hate": 0.82,
    "harassment": 0.12,
    "violence": 0.0,
    "self_harm": 0.0,
    "sexual": 0.0,
    "minor_safety": 0.0,
    "spam": 0.0,
    "scam": 0.0,
    "pii": 0.02,
    "jailbreak": 0.0,
    "prompt_injection": 0.0
  },
  "matched_rules": [
    {"category": "hate", "threshold": 0.70, "decision": "block"}
  ],
  "reason": "Policy threshold matched.",
  "latency_ms": 220
}
Decisions
Categories
Scores are floats between 0.0 and 1.0. Higher values mean higher confidence or higher risk for that category.
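Because successful responses share one shape, they can be read into a small typed structure. This is a minimal sketch: `ModerationResult` and `top_category` are hypothetical names, not part of any Toxly SDK, and only a subset of the response fields is modeled.

```python
from dataclasses import dataclass, field


@dataclass
class ModerationResult:
    request_id: str
    allowed: bool
    decision: str
    risk_score: float
    categories: dict[str, float] = field(default_factory=dict)

    @classmethod
    def from_json(cls, data: dict) -> "ModerationResult":
        """Build a result from a decoded moderation response."""
        return cls(
            request_id=data["request_id"],
            allowed=data["allowed"],
            decision=data["decision"],
            risk_score=data["risk_score"],
            categories=dict(data.get("categories", {})),
        )

    def top_category(self) -> tuple[str, float]:
        """Return the (name, score) pair with the highest score.

        Raises ValueError if the response carried no categories.
        """
        return max(self.categories.items(), key=lambda item: item[1])


sample = {
    "request_id": "mod_xxx",
    "allowed": False,
    "decision": "block",
    "risk_score": 0.82,
    "categories": {"toxicity": 0.74, "hate": 0.82, "harassment": 0.12, "pii": 0.02},
}
result = ModerationResult.from_json(sample)
print(result.decision, result.top_category())
```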
Policies
Policies convert classifier scores into product decisions. A policy rule compares one category score against a threshold and applies an action when the threshold is exceeded.
{
  "rules": [
    {"category": "hate", "threshold": 0.70, "action": "block"},
    {"category": "self_harm", "threshold": 0.65, "action": "escalate"},
    {"category": "pii", "threshold": 0.80, "action": "mask"}
  ]
}
If multiple rules match, Toxly chooses the strongest relevant action based on the policy evaluation order.
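The rule matching described above can be sketched locally. This is not Toxly's actual evaluator: the documentation does not say how the strongest action is chosen, so the ranking in `ACTION_SEVERITY` is an assumption made for illustration, as is returning `allow` when no rule matches. A score exactly equal to the threshold is treated as not matching, following the word "exceeded".

```python
# Assumed severity ranking, weakest to strongest. The real ordering is
# defined by Toxly's policy evaluation and may differ.
ACTION_SEVERITY = ["allow", "warn", "mask", "review", "escalate", "block"]


def evaluate_policy(
    categories: dict[str, float], rules: list[dict]
) -> tuple[str, list[dict]]:
    """Return (decision, matched_rules) for a set of category scores."""
    matched = [
        rule
        for rule in rules
        # A rule matches when the category score exceeds its threshold.
        if categories.get(rule["category"], 0.0) > rule["threshold"]
    ]
    if not matched:
        # Assumption: content is allowed when nothing matches.
        return "allow", []
    strongest = max(matched, key=lambda rule: ACTION_SEVERITY.index(rule["action"]))
    return strongest["action"], matched


rules = [
    {"category": "hate", "threshold": 0.70, "action": "block"},
    {"category": "self_harm", "threshold": 0.65, "action": "escalate"},
    {"category": "pii", "threshold": 0.80, "action": "mask"},
]
scores = {"hate": 0.82, "self_harm": 0.0, "pii": 0.02}
decision, matched = evaluate_policy(scores, rules)
print(decision, matched)
```

With the scores from the sample response, only the `hate` rule matches, which agrees with the `block` decision shown there.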
Default policy
New projects start with a default policy designed for a strict MVP moderation flow.
Integration examples
JavaScript fetch
const response = await fetch("https://dashboard.toxly.net/v1/moderate/text", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "X-Toxly-Key": process.env.TOXLY_API_KEY
  },
  body: JSON.stringify({ text: "message from your app" })
});
// Surface HTTP errors (for example 429 when rate limited) before parsing.
if (!response.ok) {
  throw new Error(`Toxly request failed with status ${response.status}`);
}
const result = await response.json();
if (!result.allowed) {
  // block, review, mask, or escalate in your product flow
}
Python requests
import os

import requests

response = requests.post(
    "https://dashboard.toxly.net/v1/moderate/text",
    headers={
        "Content-Type": "application/json",
        "X-Toxly-Key": os.environ["TOXLY_API_KEY"],
    },
    json={"text": "message from your app"},
    timeout=5,  # seconds
)
# Raise on HTTP errors (for example 429 when rate limited).
response.raise_for_status()
result = response.json()
print(result["decision"], result["risk_score"])
Backend routing example
switch (result.decision) {
  case "allow":
    publishContent();
    break;
  case "warn":
    showUserWarning();
    break;
  case "review":
    sendToModerationQueue();
    break;
  case "block":
    rejectContent();
    break;
  case "mask":
    redactContent();
    break;
  case "escalate":
    triggerSafetyWorkflow();
    break;
  default:
    // Unrecognized decision: handle conservatively rather than publishing.
    sendToModerationQueue();
    break;
}
Rate limits
Rate limits are enforced per API key and project plan. If the limit is exceeded, Toxly returns 429.
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 42
X-RateLimit-Reset: 1760000000
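A client can use these headers to decide how long to wait after a 429. The sketch below assumes that `X-RateLimit-Reset` is a Unix timestamp in seconds, which the example value suggests but the documentation does not state. `seconds_until_reset` is a hypothetical helper.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(
    headers: Mapping[str, str],
    now: Optional[float] = None,
    default: float = 1.0,
) -> float:
    """Return how many seconds to wait before retrying after a 429."""
    now = time.time() if now is None else now
    # Note: a plain dict lookup is case-sensitive. HTTP client libraries
    # usually expose headers through a case-insensitive mapping.
    raw = headers.get("X-RateLimit-Reset")
    if raw is None:
        return default
    try:
        reset_at = float(raw)  # assumed to be epoch seconds
    except ValueError:
        return default
    # Never return a negative wait if the reset time has already passed.
    return max(0.0, reset_at - now)


headers = {
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "1760000000",
}
print(seconds_until_reset(headers, now=1759999970.0))
```

Passing `now` explicitly keeps the helper easy to test. In production code it can be left out so the current time is used.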
Logs & privacy
Toxly stores moderation results for debugging and usage visibility, but avoids retaining the complete original text in moderation logs.
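The same principle can be applied to your own application logs. The sketch below does not describe how Toxly stores its logs. It shows one possible way to record a moderation outcome using a SHA-256 fingerprint and the length of the text instead of the text itself. `build_log_record` and its field names are hypothetical.

```python
import hashlib
import json


def build_log_record(text: str, result: dict) -> dict:
    """Build a log entry that identifies the text without containing it."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return {
        "request_id": result.get("request_id"),
        "decision": result.get("decision"),
        "risk_score": result.get("risk_score"),
        "text_sha256": digest,
        "text_length": len(text),
    }


record = build_log_record(
    "message from a user",
    {"request_id": "mod_xxx", "decision": "block", "risk_score": 0.82},
)
print(json.dumps(record))
```

An unsalted hash of a short message can still be guessed by brute force, so treat the fingerprint as a correlation aid rather than as anonymization.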
AI providers
Toxly uses a provider interface so moderation can move between fast rules, local models, and OpenAI-compatible endpoints.
Dummy provider
Fast rule-based checks for immediate local testing and predictable fallback behavior.
Ollama provider
Prepared for local model inference through Ollama. Useful for self-hosted or private deployments.
vLLM provider
Prepared for OpenAI-compatible chat completions served by vLLM or similar infrastructure.
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama-guard3:1b
VLLM_BASE_URL=http://localhost:8000/v1
VLLM_MODEL=toxly-guard
VLLM_API_KEY=local-key
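A provider interface of this kind could look like the following. This is a sketch, not Toxly's internal code: the `Provider` protocol, the `RuleProvider` class, the `provider_settings` helper, and the keyword lists are all hypothetical. Only the environment variable names come from the configuration above.

```python
import os
from typing import Mapping, Protocol


class Provider(Protocol):
    def classify(self, text: str) -> dict[str, float]:
        """Return category scores between 0.0 and 1.0."""
        ...


class RuleProvider:
    """Keyword-based stand-in, similar in spirit to a rule-based provider."""

    def __init__(self, keywords: dict[str, list[str]]) -> None:
        self.keywords = keywords

    def classify(self, text: str) -> dict[str, float]:
        lowered = text.lower()
        # Score 1.0 if any keyword for the category appears, otherwise 0.0.
        return {
            category: 1.0 if any(word in lowered for word in words) else 0.0
            for category, words in self.keywords.items()
        }


def provider_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect the Ollama and vLLM settings present in the environment."""
    names = ["OLLAMA_BASE_URL", "OLLAMA_MODEL", "VLLM_BASE_URL", "VLLM_MODEL", "VLLM_API_KEY"]
    return {name: env[name] for name in names if name in env}


provider: Provider = RuleProvider({"spam": ["free money"], "scam": ["wire transfer"]})
print(provider.classify("Claim your FREE MONEY today"))
```

Any class with a matching `classify` method satisfies the protocol, so a model-backed provider could be swapped in without changing the calling code.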
Dashboard
The dashboard is available at dashboard.toxly.net. It is designed for project management and operational moderation workflows.
Errors
Best practices
- Call Toxly from your backend, not directly from a browser.
- Use separate projects for production, staging, and experiments.
- Start strict for high-risk surfaces like public chat and profiles.
- Route review to a moderation queue instead of silently allowing it.
- Escalate high self-harm risk into a human-reviewed safety workflow.
- Do not log API keys, full user text, passwords, or secrets.
- Use timeouts and fallback behavior in your own application.
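The last point can be sketched as a wrapper that returns a conservative decision when the moderation call fails. Using `review` as the fallback is an assumption for illustration, not a Toxly recommendation. `moderate_with_fallback`, `FALLBACK_RESULT`, and the injected `call` function are hypothetical.

```python
from typing import Callable

# Assumed fallback: send content to human review when moderation is unavailable.
FALLBACK_RESULT = {
    "allowed": False,
    "decision": "review",
    "reason": "moderation unavailable",
}


def moderate_with_fallback(call: Callable[[str], dict], text: str) -> dict:
    """Run call(text) and return a copy of the fallback result if it raises."""
    try:
        return call(text)
    except Exception:
        # Timeouts, connection errors, and HTTP errors all land here.
        return dict(FALLBACK_RESULT)


def failing_call(text: str) -> dict:
    """Simulates a moderation request that times out."""
    raise TimeoutError("simulated timeout")


print(moderate_with_fallback(failing_call, "message from your app")["decision"])
```

In practice `call` would wrap the HTTP request with an explicit timeout. Whether to fail open or fail closed depends on how risky the surface is.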
Roadmap
The current API focuses on text moderation. Image moderation is planned with the same response shape so product logic can stay consistent.
POST /v1/moderate/image
multipart/form-data:
  image: file
  text: optional string