API Reference

Knowledge Bases

Knowledge bases store Q&A data your agents retrieve from during calls using semantic search. They are shared org-wide and attached to specific agents via knowledge_base_ids on the agent object.

1. Simple FAQ

Create KB → add entries manually → attach to agent

2. Website crawl

Create KB with source_url → auto-crawled on create → scheduled refresh

3. Document upload

Create KB → upload PDF/DOCX/TXT → parsed into entries async

4. Hybrid

Crawl your docs site, then add manual corrections for edge cases

The knowledge base object

{
  "public_id":            "kb_00000001",
  "name":                 "Product FAQ",
  "description":          "Common product questions",
  "status":               "active",
  "source_url":           "https://help.yourcompany.com",
  "crawl_frequency":      "weekly",
  "auto_refresh_enabled": true,
  "last_crawled_at":      "2026-01-14T06:00:00Z",
  "last_refresh_status":  "idle",
  "max_depth":            3,
  "max_pages":            100,
  "entry_count":          247,
  "created_at":           "2026-01-01T00:00:00Z",
  "updated_at":           "2026-01-14T06:01:00Z"
}

Field notes

status	Always "active". Reserved for future use.
last_refresh_status	"idle" | "crawling" | "done" | "failed". Reflects the most recent crawl or upload operation.
auto_refresh_enabled	If false, crawl_frequency is ignored and no scheduled re-crawls occur. Manual POST /refresh still works.

POST /v1/knowledge-bases

curl -X POST https://api.staffifyai.com/v1/knowledge-bases \
  -H "Authorization: Bearer sfy_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Help Center",
    "source_url": "https://help.yourcompany.com",
    "crawl_frequency": "weekly",
    "auto_refresh_enabled": true,
    "max_depth": 3,
    "max_pages": 200
  }'

Note: setting source_url on create triggers an immediate crawl. Changing it to a different URL via PATCH deletes all crawled entries and re-crawls from scratch. A PATCH with the URL that is already set does nothing (no re-crawl, no entry deletion). Setting source_url to null removes the source URL and sets last_refresh_status to "idle".
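The PATCH rules in the note above can be summarized as a small decision function. This is a minimal sketch, not the service's implementation: the function name and the returned labels are hypothetical, URLs are compared as exact strings (the server may normalize them differently), and treating a first-time URL on a KB without one as a crawl trigger is an assumption.

```python
from typing import Optional


def source_url_patch_effect(current: Optional[str], new: Optional[str]) -> str:
    """Predict the effect of PATCHing source_url from `current` to `new`.

    Hypothetical helper; the labels are illustrative, not API values.
    """
    if new == current:
        # Same URL as already set: no re-crawl, no entry deletion.
        return "no-op"
    if new is None:
        # Source URL removed; nothing is crawled.
        return "source-removed"
    # Different URL: crawled entries are deleted and a fresh crawl starts.
    # Assumption: going from no URL to a URL also starts a crawl.
    return "recrawl"
```

Checking the expected effect before sending a PATCH helps avoid accidentally discarding crawled entries when only other settings were meant to change.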

crawl_frequency	Schedule
"weekly"	Re-crawls every 7 days
"biweekly"	Re-crawls every 14 days
"monthly"	Re-crawls every 30 days
null	No scheduled auto-refresh — manual only via POST /refresh
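To estimate when a KB is next due for a scheduled crawl, the table above can be combined with auto_refresh_enabled and last_crawled_at from the knowledge base object. The sketch below assumes the interval is counted from last_crawled_at, which the reference does not state; the function and constant names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

# Intervals taken from the crawl_frequency table.
CRAWL_INTERVAL_DAYS = {"weekly": 7, "biweekly": 14, "monthly": 30}


def next_scheduled_crawl(kb: dict) -> Optional[datetime]:
    """Return the estimated next scheduled crawl time, or None if none is scheduled."""
    if not kb.get("auto_refresh_enabled"):
        return None  # crawl_frequency is ignored when auto refresh is off
    days = CRAWL_INTERVAL_DAYS.get(kb.get("crawl_frequency"))
    last = kb.get("last_crawled_at")
    if days is None or last is None:
        return None  # null frequency means manual refresh only
    # Older Python versions do not parse a trailing "Z", so normalize it.
    last_dt = datetime.fromisoformat(last.replace("Z", "+00:00"))
    # Assumption: the schedule is measured from the last crawl.
    return last_dt + timedelta(days=days)
```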

POST /v1/knowledge-bases/:kbId/entries

Add Q&A entries manually. Send an array of up to 100 objects. The question (q) and answer (a) of each entry may total at most 8,000 characters.

curl -X POST https://api.staffifyai.com/v1/knowledge-bases/kb_00000001/entries \
  -H "Authorization: Bearer sfy_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '[
    { "q": "What are your opening hours?", "a": "Monday-Friday 9am-6pm CET." },
    { "q": "Do you offer free shipping?",  "a": "Yes, free shipping over EUR 50." },
    { "q": "How do I cancel my order?",    "a": "Email [email protected] within 24h." }
  ]'

# Response
{ "added": 3 }

The response is { "added": N }, where N is the number of entries inserted. Embeddings are generated automatically, and an entry whose embedding fails gets one automatic retry. Entries that still have no embedding can be retried later via POST /v1/knowledge-bases/:kbId/entries/:entryId/retry-embedding.

If your org has reached the plan entry limit, the request returns 403 with an error message indicating the current count and plan cap.
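When importing more than 100 entries, the request limits above suggest validating and splitting the list client-side before posting. The following is a minimal sketch under stated assumptions: the helper name is hypothetical, and characters are counted with Python's len, which may differ from how the server counts them.

```python
from typing import Dict, List

MAX_ENTRIES_PER_REQUEST = 100  # array size limit per POST
MAX_CHARS_PER_ENTRY = 8000     # combined length of q and a


def batch_entries(entries: List[Dict[str, str]]) -> List[List[Dict[str, str]]]:
    """Validate entries, then split them into request-sized batches."""
    for index, entry in enumerate(entries):
        combined = len(entry["q"]) + len(entry["a"])
        if combined > MAX_CHARS_PER_ENTRY:
            raise ValueError(
                f"entry {index} has {combined} characters "
                f"(limit {MAX_CHARS_PER_ENTRY})"
            )
    return [
        entries[start:start + MAX_ENTRIES_PER_REQUEST]
        for start in range(0, len(entries), MAX_ENTRIES_PER_REQUEST)
    ]
```

Each returned batch can be sent as the JSON body of one POST to the entries endpoint. A 403 on any batch indicates the plan entry limit was reached, so later batches should not be sent.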

POST /v1/knowledge-bases/:kbId/entries/upload

Upload a document and extract Q&A entries asynchronously. Accepts .pdf, .docx, .doc, and .txt files (max 50 MB). Returns a job_id to poll via GET /v1/knowledge-bases/:kbId/jobs/:jobId.

curl -X POST https://api.staffifyai.com/v1/knowledge-bases/kb_00000001/entries/upload \
  -H "Authorization: Bearer sfy_live_YOUR_KEY" \
  -F "[email protected]"

# Response
{ "job_id": "42", "status": "queued" }

Full job response fields

Field	Description
job_id	string
status	waiting | active | completed | failed | delayed
progress	object, present while active
result	present when completed (includes entries_added)
error	present when failed
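The job fields above lend themselves to a simple polling loop that stops once the status is completed or failed. This sketch keeps the HTTP call out of the loop: fetch_job is a hypothetical callable that performs GET /v1/knowledge-bases/:kbId/jobs/:jobId and returns the parsed JSON. The interval and timeout defaults are arbitrary choices, not documented values.

```python
import time
from typing import Callable

# Statuses after which the job will not change any more.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(
    fetch_job: Callable[[], dict],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal status, then return the job object."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_job()
        if job["status"] in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {job['status']} after {timeout_s}s")
        # Any other status (waiting, active, delayed, queued) means keep polling.
        sleep(interval_s)
```

On return, the caller can read result (for example entries_added) when the status is completed, or error when it is failed.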

Upload errors (synchronous, before job is queued)

Status	Error
400	No file uploaded
400	Unsupported file type (must be pdf, docx, doc, or txt)
400	File exceeds 50 MB limit
400	"File content does not match its declared type": magic bytes validation failed (e.g. a .pdf extension on a non-PDF binary)
404	Knowledge base not found
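Most of the synchronous upload errors above can be caught locally before sending the file. The sketch below is an approximation of such a pre-check, not the server's validation logic: the helper name is hypothetical, "50 MB" is assumed to mean 50 × 1024 × 1024 bytes, and the signatures are the well-known leading bytes of each format (plain text has none).

```python
import os
from typing import List

MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # assumption: MB interpreted as MiB

# Leading bytes commonly used to identify each format.
MAGIC_BYTES = {
    ".pdf": b"%PDF-",
    ".docx": b"PK\x03\x04",                        # DOCX is a ZIP container
    ".doc": b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1",   # OLE2 compound file
    ".txt": b"",                                   # no signature to check
}


def precheck_upload(filename: str, data: bytes) -> List[str]:
    """Return a list of problems likely to cause a 400; empty if none found."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in MAGIC_BYTES:
        problems.append("unsupported file type")
    elif not data.startswith(MAGIC_BYTES[extension]):
        problems.append("content does not match declared type")
    if len(data) > MAX_UPLOAD_BYTES:
        problems.append("file exceeds 50 MB")
    return problems
```

A passing pre-check does not guarantee the server accepts the file, since its magic bytes validation may be stricter or looser than this one.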

GET /v1/knowledge-bases/:kbId/search

Semantic similarity search across entries. Use to test what your agent will retrieve for a given query.

Param	Default	Description
q	required	Search query text
limit	3	Max results (1–20)
min_similarity	0.35	Minimum similarity threshold (0–1). Raise to 0.70+ for strict matches.

curl "https://api.staffifyai.com/v1/knowledge-bases/kb_00000001/search?q=return+policy&limit=5" \
  -H "Authorization: Bearer sfy_live_YOUR_KEY"

# Response
{
  "results": [
    { "id": 1, "question": "What is your return policy?", "answer": "30 days...", "similarity": 0.9124 },
    { "id": 7, "question": "Can I return a sale item?",   "answer": "Yes, if...", "similarity": 0.7831 }
  ],
  "query": "return policy",
  "total_entries": 247
}

The default min_similarity of 0.35 suits broad retrieval. Raise it to 0.70 or higher when you only want highly relevant results.
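For scripted retrieval tests, the search URL can be built with the documented parameter ranges enforced up front. This is a small sketch using only the standard library; the function name is hypothetical, while the base URL, parameter names, defaults, and ranges come from the table above.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.staffifyai.com/v1"


def build_search_url(kb_id: str, q: str, limit: int = 3,
                     min_similarity: float = 0.35) -> str:
    """Build a search URL, rejecting values outside the documented ranges."""
    if not q:
        raise ValueError("q is required")
    if not 1 <= limit <= 20:
        raise ValueError("limit must be between 1 and 20")
    if not 0 <= min_similarity <= 1:
        raise ValueError("min_similarity must be between 0 and 1")
    params = {"q": q, "limit": limit, "min_similarity": min_similarity}
    # urlencode escapes spaces as "+", matching the curl example.
    return f"{BASE_URL}/knowledge-bases/{kb_id}/search?{urlencode(params)}"
```

The resulting URL can be requested with any HTTP client, sending the same Authorization header as in the curl example.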

POST /v1/knowledge-bases/:kbId/refresh

Trigger an immediate re-crawl of the configured source_url. Rate limited to once per 12 hours. Returns 409 if a crawl is already in progress. Returns 429 if the request falls within the cooldown window; the response body then includes available_in_hours, the number of hours remaining before you can refresh again (e.g. {"error": "...", "available_in_hours": 4.5}).

curl -X POST https://api.staffifyai.com/v1/knowledge-bases/kb_00000001/refresh \
  -H "Authorization: Bearer sfy_live_YOUR_KEY"

# Response
{ "job_id": "44", "status": "queued" }
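A caller that triggers refreshes automatically needs to tell the three documented outcomes apart. The sketch below maps a status code and parsed body to an outcome label and a wait time in seconds. The function name and labels are hypothetical, and treating any 2xx code as success is an assumption, since the reference does not state the exact success code.

```python
from typing import Optional, Tuple


def interpret_refresh_response(status_code: int,
                               body: dict) -> Tuple[str, Optional[float]]:
    """Return (outcome, seconds_to_wait) for a POST /refresh response."""
    if 200 <= status_code < 300:
        return ("queued", 0.0)             # crawl job accepted
    if status_code == 409:
        return ("already_crawling", None)  # a crawl is in progress
    if status_code == 429:
        hours = body.get("available_in_hours")
        wait = None if hours is None else float(hours) * 3600.0
        return ("cooldown", wait)          # retry after the cooldown
    return ("error", None)
```

With the example body from above, a 429 carrying available_in_hours of 4.5 yields a wait of 16200 seconds.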

Other endpoints

GET /v1/knowledge-bases

List all KBs. Query params: limit (default 20, max 100), after (cursor). Response: { knowledge_bases: [...], next_cursor }.

GET /v1/knowledge-bases/:kbId

Get a single KB. Response: { knowledge_base: { ... } }.

PATCH /v1/knowledge-bases/:kbId

Update KB settings (same fields as create). Only fields you send are updated. Response: { knowledge_base: { ... } }.

DELETE /v1/knowledge-bases/:kbId

Delete KB and all its entries. Also detaches the KB from any agents that reference it. Response: { deleted: true, id: N }.

GET /v1/knowledge-bases/:kbId/entries

List entries. Query params: limit (default 50, max 100), after (cursor). Response: { entries: [{ id, question, answer, source_type, created_at, has_embedding, usage_count }], next_cursor, has_more, total }.

PATCH /v1/knowledge-bases/:kbId/entries/:entryId

Update an entry. Body: { q?, a? }, at least one required. Re-generates the embedding automatically. Response: { id, question, answer, has_embedding }.

DELETE /v1/knowledge-bases/:kbId/entries/:entryId

Delete a single entry. Response: { deleted: true, id: N }.

POST /v1/knowledge-bases/:kbId/entries/crawl

One-off URL crawl (does not change source_url). Body: { url, max_pages: 1–500 (default 50), max_depth: 1–5 (default 2) }. Returns { job_id, status: "queued" }.

GET /v1/knowledge-bases/:kbId/jobs/:jobId

Poll upload or crawl job status. Response: { job_id, status, progress, result?, error? }.

POST /v1/knowledge-bases/:kbId/entries/:entryId/retry-embedding

Retry a failed embedding for an entry where has_embedding is false. No-op (returns { retried: false }) if an embedding already exists. Response: { retried: true | false }.
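The entry listing and retry-embedding endpoints above can be combined to find entries that still lack an embedding. The sketch below walks the cursor-paginated listing through a hypothetical fetch_page callable, which is expected to call GET /v1/knowledge-bases/:kbId/entries with the given cursor as the after parameter and return the parsed JSON. It assumes next_cursor is passed back unchanged as after and that has_more is false on the last page.

```python
from typing import Callable, Iterator, Optional


def iter_entries(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every entry across all pages of the entries listing."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)  # None requests the first page
        yield from page["entries"]
        cursor = page.get("next_cursor")
        # Stop when the API reports no more pages or gives no cursor.
        if not page.get("has_more") or cursor is None:
            break


def entries_missing_embedding(fetch_page: Callable[[Optional[str]], dict]) -> list:
    """Collect the ids of entries whose has_embedding flag is false."""
    return [e["id"] for e in iter_entries(fetch_page) if not e["has_embedding"]]
```

Each returned id is a candidate for a POST to the retry-embedding endpoint, which reports { retried: false } if an embedding has appeared in the meantime.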
