Knowledge Bases
Knowledge bases store Q&A data that your agents retrieve during calls using semantic search. They are shared org-wide and attached to specific agents via knowledge_base_ids on the agent object.
1. Simple FAQ
Create KB → add entries manually → attach to agent
2. Website crawl
Create KB with source_url → auto-crawled on create → scheduled refresh
3. Document upload
Create KB → upload PDF/DOCX/TXT → parsed into entries async
4. Hybrid
Crawl your docs site, then add manual corrections for edge cases
The knowledge base object
{
"public_id": "kb_00000001",
"name": "Product FAQ",
"description": "Common product questions",
"status": "active",
"source_url": "https://help.yourcompany.com",
"crawl_frequency": "weekly",
"auto_refresh_enabled": true,
"last_crawled_at": "2026-01-14T06:00:00Z",
"last_refresh_status": "idle",
"max_depth": 3,
"max_pages": 100,
"entry_count": 247,
"created_at": "2026-01-01T00:00:00Z",
"updated_at": "2026-01-14T06:01:00Z"
}
Field notes
status: Always "active". Reserved for future use.
last_refresh_status: "idle" | "crawling" | "done" | "failed". Reflects the most recent crawl or upload operation.
auto_refresh_enabled: If false, crawl_frequency is ignored and no scheduled re-crawls occur. Manual POST /refresh still works.
POST /v1/knowledge-bases
curl -X POST https://api.staffifyai.com/v1/knowledge-bases \
-H "Authorization: Bearer sfy_live_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "Help Center",
"source_url": "https://help.yourcompany.com",
"crawl_frequency": "weekly",
"auto_refresh_enabled": true,
"max_depth": 3,
"max_pages": 200
}'
Note: Setting source_url on create triggers an immediate crawl. Changing it to a different URL via PATCH deletes all crawled entries and re-crawls from scratch. PATCHing with the URL that is already set does nothing (no re-crawl, no entry deletion). Setting source_url to null removes the source URL and sets last_refresh_status to "idle".
| crawl_frequency | Schedule |
|---|---|
| "weekly" | Re-crawls every 7 days |
| "biweekly" | Re-crawls every 14 days |
| "monthly" | Re-crawls every 30 days |
| null | No scheduled auto-refresh — manual only via POST /refresh |
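As a rough illustration of the schedule table, the sketch below estimates when a knowledge base is next due for a scheduled re-crawl. It assumes the interval is counted from last_crawled_at, which this page does not confirm; next_scheduled_crawl and CRAWL_INTERVALS are hypothetical names, not part of the API.

```python
from datetime import datetime, timedelta
from typing import Optional

# Intervals taken from the crawl_frequency table.
CRAWL_INTERVALS = {
    "weekly": timedelta(days=7),
    "biweekly": timedelta(days=14),
    "monthly": timedelta(days=30),
}


def next_scheduled_crawl(kb: dict) -> Optional[datetime]:
    """Estimate the next scheduled re-crawl for a knowledge base object.

    Assumption (not confirmed by the docs): the interval is counted
    from last_crawled_at.
    """
    if not kb.get("auto_refresh_enabled"):
        return None  # crawl_frequency is ignored when auto refresh is off
    interval = CRAWL_INTERVALS.get(kb.get("crawl_frequency"))
    last = kb.get("last_crawled_at")
    if interval is None or not last:
        return None  # null frequency (or no crawl yet): manual refresh only
    # Older Python versions reject a trailing "Z" in fromisoformat().
    last_dt = datetime.fromisoformat(last.replace("Z", "+00:00"))
    return last_dt + interval


kb = {
    "crawl_frequency": "weekly",
    "auto_refresh_enabled": True,
    "last_crawled_at": "2026-01-14T06:00:00Z",
}
print(next_scheduled_crawl(kb).isoformat())  # 2026-01-21T06:00:00+00:00
```

A return value of None means no scheduled crawl is expected, so a manual POST /refresh is the only way to update the entries.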
POST /v1/knowledge-bases/:kbId/entries
Add Q&A entries manually. Send an array of up to 100 objects. Maximum 8,000 characters combined per entry.
curl -X POST https://api.staffifyai.com/v1/knowledge-bases/kb_00000001/entries \
-H "Authorization: Bearer sfy_live_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '[
{ "q": "What are your opening hours?", "a": "Monday-Friday 9am-6pm CET." },
{ "q": "Do you offer free shipping?", "a": "Yes, free shipping over EUR 50." },
{ "q": "How do I cancel my order?", "a": "Email [email protected] within 24h." }
]'
# Response
{ "added": 3 }
Response is { "added": N }, the count of entries inserted. Embeddings are generated automatically; entries that fail embedding get one automatic retry. Entries with no embedding can be retried later via POST /entries/:entryId/retry-embedding.
If your org has reached the plan entry limit, the request returns 403 with an error message indicating the current count and plan cap.
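Because each request accepts at most 100 entries, larger imports have to be split client-side. The following is a minimal sketch of that batching, assuming "combined" means the length of q plus the length of a, and assuming both fields must be non-empty (neither is stated on this page). batch_entries and validate_entry are hypothetical helper names.

```python
from typing import Dict, List

MAX_ENTRIES_PER_REQUEST = 100  # "array of up to 100 objects"
MAX_CHARS_PER_ENTRY = 8000     # "8,000 characters combined per entry"


def validate_entry(entry: Dict[str, str]) -> None:
    """Raise ValueError if an entry looks invalid before it is sent.

    Assumptions: both fields are required, and the size limit applies
    to len(q) + len(a).
    """
    q, a = entry.get("q", ""), entry.get("a", "")
    if not q or not a:
        raise ValueError("each entry needs a non-empty 'q' and 'a'")
    if len(q) + len(a) > MAX_CHARS_PER_ENTRY:
        raise ValueError(f"entry exceeds {MAX_CHARS_PER_ENTRY} characters: {q[:40]!r}")


def batch_entries(entries: List[Dict[str, str]]) -> List[List[Dict[str, str]]]:
    """Validate all entries, then split them into request-sized batches."""
    for entry in entries:
        validate_entry(entry)
    return [
        entries[start:start + MAX_ENTRIES_PER_REQUEST]
        for start in range(0, len(entries), MAX_ENTRIES_PER_REQUEST)
    ]


faq = [{"q": f"Question {i}?", "a": "Answer."} for i in range(250)]
print([len(batch) for batch in batch_entries(faq)])  # [100, 100, 50]
```

Each batch can then be sent as the JSON body of one POST to the entries endpoint. Validating everything up front avoids a partially completed import when a late entry is oversized.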
POST /v1/knowledge-bases/:kbId/entries/upload
Upload a document and extract Q&A entries asynchronously. Accepts .pdf, .docx, .doc, and .txt files (max 50 MB). Returns a job_id to poll.
curl -X POST https://api.staffifyai.com/v1/knowledge-bases/kb_00000001/entries/upload \
-H "Authorization: Bearer sfy_live_YOUR_KEY" \
-F "[email protected]"
# Response
{ "job_id": "42", "status": "queued" }
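Since uploads are processed asynchronously, a client typically polls the job endpoint (/v1/knowledge-bases/:kbId/jobs/:jobId) until the job finishes. Here is one possible polling loop, written under a few assumptions: the job endpoint is read with GET, "completed" and "failed" are the only terminal statuses, and the 2-second interval and 10-minute timeout are arbitrary choices. fetch_job and wait_for_job are hypothetical helpers.

```python
import json
import time
import urllib.request
from typing import Callable, Dict

API_BASE = "https://api.staffifyai.com/v1"
TERMINAL_STATUSES = {"completed", "failed"}  # assumed to be the final states


def fetch_job(kb_id: str, job_id: str, api_key: str) -> Dict:
    """Fetch the job status once (GET is an assumption for this route)."""
    request = urllib.request.Request(
        f"{API_BASE}/knowledge-bases/{kb_id}/jobs/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_job(
    get_job: Callable[[], Dict],
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call get_job() repeatedly until the job completes, fails, or times out."""
    deadline = time.monotonic() + timeout
    while True:
        job = get_job()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {job.get('status')!r} after {timeout}s")
        sleep(interval)
```

A call might look like `wait_for_job(lambda: fetch_job("kb_00000001", "42", "sfy_live_YOUR_KEY"))`. The returned object should carry result when the status is completed and error when it is failed. Passing the fetch step in as a callable keeps the loop usable for crawl and refresh jobs as well.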
Full job response fields
job_id: string
status: waiting | active | completed | failed | delayed
progress: object, present while active
result: present when completed (includes entries_added)
error: present when failed
Upload errors (synchronous, before job is queued)
| Status | Error |
|---|---|
| 400 | No file uploaded |
| 400 | Unsupported file type (must be pdf, docx, doc, or txt) |
| 400 | File exceeds 50 MB limit |
| 400 | "File content does not match its declared type". Magic bytes validation failed (e.g. a .pdf extension on a non-PDF binary) |
| 404 | Knowledge base not found |
GET /v1/knowledge-bases/:kbId/search
Semantic similarity search across entries. Use it to test what your agent will retrieve for a given query.
| Param | Default | Description |
|---|---|---|
| q | required | Search query text |
| limit | 3 | Max results (1–20) |
| min_similarity | 0.35 | Minimum similarity threshold (0–1). Raise to 0.70+ for strict matches. |
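The parameters in the table can be assembled and range-checked client-side before the request is sent. This is a small sketch under that reading; build_search_url and strict_matches are hypothetical helpers, and the client-side filter is only a convenience on top of the server's min_similarity parameter.

```python
from typing import Dict, List
from urllib.parse import urlencode

API_BASE = "https://api.staffifyai.com/v1"


def build_search_url(kb_id: str, query: str, limit: int = 3, min_similarity: float = 0.35) -> str:
    """Build the search URL, enforcing the documented parameter ranges."""
    if not query:
        raise ValueError("q is required")
    if not 1 <= limit <= 20:
        raise ValueError("limit must be between 1 and 20")
    if not 0.0 <= min_similarity <= 1.0:
        raise ValueError("min_similarity must be between 0 and 1")
    params = urlencode({"q": query, "limit": limit, "min_similarity": min_similarity})
    return f"{API_BASE}/knowledge-bases/{kb_id}/search?{params}"


def strict_matches(results: List[Dict], threshold: float = 0.70) -> List[Dict]:
    """Keep only results whose similarity meets a stricter threshold."""
    return [r for r in results if r["similarity"] >= threshold]


print(build_search_url("kb_00000001", "return policy", limit=5))
# https://api.staffifyai.com/v1/knowledge-bases/kb_00000001/search?q=return+policy&limit=5&min_similarity=0.35
```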
curl "https://api.staffifyai.com/v1/knowledge-bases/kb_00000001/search?q=return+policy&limit=5" \
-H "Authorization: Bearer sfy_live_YOUR_KEY"
# Response
{
"results": [
{ "id": 1, "question": "What is your return policy?", "answer": "30 days...", "similarity": 0.9124 },
{ "id": 7, "question": "Can I return a sale item?", "answer": "Yes, if...", "similarity": 0.7831 }
],
"query": "return policy",
"total_entries": 247
}
The min_similarity default of 0.35 suits broad retrieval. Raise it to 0.70+ for strict matching, where you only want highly relevant results.
POST /v1/knowledge-bases/:kbId/refresh
Trigger an immediate re-crawl of the configured source_url. Rate limited to once per 12 hours. Returns 409 if a crawl is already in progress. Returns 429 if within the cooldown window; the response body includes available_in_hours, the number of hours remaining before you can refresh again (e.g. {"error": "...", "available_in_hours": 4.5}).
curl -X POST https://api.staffifyai.com/v1/knowledge-bases/kb_00000001/refresh \
-H "Authorization: Bearer sfy_live_YOUR_KEY"
# Response
{ "job_id": "44", "status": "queued" }
Other endpoints
/v1/knowledge-bases
List all KBs. Query params: limit (default 20, max 100), after (cursor). Response: { knowledge_bases: [...], next_cursor }.

/v1/knowledge-bases/:kbId
Get a single KB. Response: { knowledge_base: { ... } }.

/v1/knowledge-bases/:kbId
Update KB settings (same fields as create). Only fields you send are updated. Response: { knowledge_base: { ... } }.

/v1/knowledge-bases/:kbId
Delete KB and all its entries. Also detaches the KB from any agents that reference it. Response: { deleted: true, id: N }.

/v1/knowledge-bases/:kbId/entries
List entries. Query params: limit (default 50, max 100), after (cursor). Response: { entries: [{ id, question, answer, source_type, created_at, has_embedding, usage_count }], next_cursor, has_more, total }.

/v1/knowledge-bases/:kbId/entries/:entryId
Update an entry. Body: { q?, a? }, with at least one field required. Re-generates the embedding automatically. Response: { id, question, answer, has_embedding }.

/v1/knowledge-bases/:kbId/entries/:entryId
Delete a single entry. Response: { deleted: true, id: N }.

/v1/knowledge-bases/:kbId/entries/crawl
One-off URL crawl (does not change source_url). Body: { url, max_pages: 1–500 (default 50), max_depth: 1–5 (default 2) }. Returns { job_id, status: "queued" }.

/v1/knowledge-bases/:kbId/jobs/:jobId
Poll upload or crawl job status. Response: { job_id, status, progress, result?, error? }.

/v1/knowledge-bases/:kbId/entries/:entryId/retry-embedding
Retry a failed embedding for an entry where has_embedding is false. No-op (returns { retried: false }) if an embedding already exists. Response: { retried: true | false }.
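The list-entries response is cursor-paginated, so collecting every entry means following next_cursor across pages. The sketch below shows one way to do that, assuming the next_cursor value is sent back as the after query parameter (implied, but not spelled out, by the parameter descriptions). iter_entries and entries_missing_embedding are hypothetical helpers, and fetch_page is a caller-supplied function that performs the actual HTTP request.

```python
from typing import Callable, Dict, Iterator, List, Optional

# fetch_page(after) should request the list-entries endpoint, passing `after`
# as the cursor when it is not None, and return the decoded JSON body.
FetchPage = Callable[[Optional[str]], Dict]


def iter_entries(fetch_page: FetchPage) -> Iterator[Dict]:
    """Yield every entry, following next_cursor until has_more is false."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("entries", [])
        cursor = page.get("next_cursor")
        if not page.get("has_more") or cursor is None:
            break


def entries_missing_embedding(fetch_page: FetchPage) -> List[Dict]:
    """Collect entries whose has_embedding flag is false."""
    return [entry for entry in iter_entries(fetch_page) if not entry.get("has_embedding")]
```

The entries returned by entries_missing_embedding are the candidates for the retry-embedding endpoint. Keeping the HTTP call behind fetch_page lets the same loop be exercised with canned pages in tests.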