Knowledge Bank Documentation

Build intelligent applications with the GuruCloud Knowledge Bank. Semantic search, multi-dimensional schemas, batch ingestion, and MCP server integration — all through a clean REST API and Python SDK.

Base URL: https://www.gurucloudai.com/api/v1/kb  •  All responses use a {"data": ...} envelope.

Authentication

All API requests require a KB API key passed as a Bearer token. API keys start with kb_ and can be created from the KB Dashboard.

HTTP Header
    Authorization: Bearer kb_your_api_key_here
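For callers who are not using the SDK, the header and the response envelope can be handled with the standard library alone. The sketch below builds (but does not send) an authenticated request and unwraps a made-up sample body. The helper names `build_request` and `unwrap` are illustrative and not part of any GuruCloud package.

```python
import json
import urllib.request

BASE_URL = "https://www.gurucloudai.com/api/v1/kb"


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Build a GET request carrying the KB API key as a Bearer token."""
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def unwrap(body: str):
    """Return the payload from the {"data": ...} response envelope."""
    return json.loads(body)["data"]


req = build_request("/banks", "kb_your_api_key_here")

# Made-up sample body, used only to show the envelope shape.
sample_body = '{"data": [{"id": "your-kb-uuid", "name": "My KB"}]}'
banks = unwrap(sample_body)
```

Passing `req` to `urllib.request.urlopen` would send the request; the sketch stops short of that so it runs without network access.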

Scopes

API keys can have one or more scopes that control access:

Scope | Access
read | Search, list entries, get schema, get stats, MCP definition
write | Add/update/delete entries, batch ingest
admin | Modify schema, delete KB, manage dimensions, generate PATs

MCP Authentication

MCP endpoints accept both KB API keys (kb_...) and Personal Access Tokens (PATs) as Bearer tokens. You can use your existing KB API key directly — no need to generate a separate token. If you prefer a dedicated MCP-only token, use the generate-pat endpoint.

Installation

Shell
    pip install gurucloud-kb

Requires Python 3.10+. The only dependency is httpx.

Quick Start

Python
    from gurucloud_kb import GuruCloudClient

    client = GuruCloudClient(api_key="kb_your_api_key")

    # List your Knowledge Banks
    kbs = client.list_kbs()

    # Work with a specific KB
    kb = client.get_kb("your-kb-uuid")

    # Search
    results = kb.search("how does authentication work?")

    # Add an entry
    kb.add_entry({
        "dimensions": {
            "content": "Auth uses JWT tokens with RS256 signing.",
            "useful_for": "Understanding auth architecture",
        }
    })

    # Batch ingest
    kb.ingest([
        {"dimensions": {"content": "Entry 1"}},
        {"dimensions": {"content": "Entry 2"}},
    ])

    # Get MCP server definition for agent injection
    mcp_def = kb.get_mcp_server_definition()

Python (async)
    import asyncio
    from gurucloud_kb import AsyncGuruCloudClient

    async def main():
        async with AsyncGuruCloudClient(api_key="kb_your_api_key") as client:
            # All methods are awaitable
            kb = await client.get_kb("your-kb-uuid")
            results = await kb.search("how does auth work?")
            await kb.ingest([
                {"dimensions": {"content": "Entry 1"}},
                {"dimensions": {"content": "Entry 2"}},
            ])
            mcp_def = await kb.get_mcp_server_definition()

    # "async with" is only valid inside a coroutine, so run it via asyncio
    asyncio.run(main())

Sync Client

GuruCloudClient

The main entry point. Supports context manager for automatic cleanup.

Method | Returns | Description
list_kbs() | list[KBInfo] | List all KBs owned by the user
get_kb(kb_id) | KnowledgeBank | Get a KB object with pre-bound methods
create_kb(name, ...) | KnowledgeBank | Create a new KB
update_kb(kb_id, ...) | KBInfo | Update name/description
delete_kb(kb_id) | dict | Delete a KB (admin scope)
get_mcp_server_definition(kb_id) | MCPServerDefinition | Get MCP server URL and tools
create_api_key(name, ...) | APIKeyInfo | Create a new API key
list_api_keys() | list[APIKeyInfo] | List all API keys
delete_api_key(key_id) | dict | Delete an API key

Async Client

AsyncGuruCloudClient

Identical API to GuruCloudClient, but all methods are async. Uses httpx.AsyncClient under the hood. Supports async with for automatic cleanup.

Python
    import asyncio
    from gurucloud_kb import AsyncGuruCloudClient

    async def main():
        async with AsyncGuruCloudClient(api_key="kb_...") as client:
            kb = await client.get_kb("my-kb")
            results = await kb.search("query")
            print(results)

    asyncio.run(main())

KnowledgeBank Object

Returned by client.get_kb() or client.create_kb(). All methods are pre-bound to the KB's UUID.

Properties

Property | Type | Description
id | str | KB UUID
name | str | Human-readable name
description | str | KB description
entry_count | int | Number of entries
total_queries | int | Total search queries
info | KBInfo | Full metadata dict

Methods

Method | Scope | Description
search(query, k=10, threshold=0.5) | read | Semantic search (string or multi-dim)
add_entry(entry) | write | Add a single entry
ingest(entries, deduplicate=True) | write | Batch ingest (max 100)
list_entries(limit, offset) | read | List entries with pagination
get_entry(entry_id) | read | Get single entry
update_entry(entry_id, updates) | write | Update entry dimensions
delete_entry(entry_id) | write | Delete an entry
get_schema() | read | Get dimension schema
update_schema(schema) | admin | Replace full schema
validate_schema(schema) | read | Validate without applying
add_dimension(dimension) | admin | Add a dimension
remove_dimension(name) | admin | Remove a dimension
get_mcp_server_definition() | read | Get MCP server URL and tools
generate_pat(token_name) | admin | Generate a PAT for MCP auth
get_mcp_config() | read | Get .mcp.json snippet
get_mcp_tools() | read | Get MCP tool definitions
get_stats() | read | Performance statistics
refresh() | read | Re-fetch KB info

The search() method accepts either a simple string or a full multi-dimensional search request.

Simple string search

Python
    results = kb.search("how does auth work?", k=5, threshold=0.7)
    for r in results:
        print(r["content"], r["combined_score"])

Multi-dimensional search

Python
    results = kb.search({
        "dimensions": {
            "content": {"query": "JWT tokens", "weight": 1.0},
            "useful_for": {"query": "debugging auth", "weight": 1.5},
        },
        "k": 5,
        "threshold": 0.6,
    })
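When the per-dimension queries are assembled at runtime, a small builder keeps the request shape consistent. This is a minimal sketch: `build_search_request` is a hypothetical helper, not an SDK function, and its defaults simply mirror the `k=10, threshold=0.5` shown in the methods table.

```python
def build_search_request(queries, k=10, threshold=0.5):
    """Build a multi-dimensional search request.

    `queries` maps a dimension name to a (query_text, weight) pair.
    """
    if not queries:
        raise ValueError("at least one dimension query is required")
    return {
        "dimensions": {
            name: {"query": text, "weight": float(weight)}
            for name, (text, weight) in queries.items()
        },
        "k": k,
        "threshold": threshold,
    }


request = build_search_request(
    {"content": ("JWT tokens", 1.0), "useful_for": ("debugging auth", 1.5)},
    k=5,
    threshold=0.6,
)
```

The resulting dict has the same shape as the request above, so it could be passed as `kb.search(request)`.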

Schema Management

Each KB has a dimension schema that defines what search dimensions exist and how they're combined.

Python
    # Get the current schema
    schema = kb.get_schema()
    print(schema["dimensions"])  # list of dimension configs

    # Add a new dimension
    kb.add_dimension({
        "name": "priority",
        "dimension_type": "single",
        "description": "Priority level",
        "searchable": True,
    })

    # Validate before applying
    warnings = kb.validate_schema(new_schema)
    if not warnings:
        kb.update_schema(new_schema)

Ingestion

Single entry

Python
    kb.add_entry({
        "dimensions": {
            "content": "The API uses rate limiting at 1000 req/hr per key.",
            "useful_for": "Understanding API limits",
            "relevant_systems": ["api", "rate-limiting"],
        },
        "source": "docs",
        "relevant_file_paths": ["routes/api/kb_api.py"],
    })

Batch ingest

Python
    result = kb.ingest(
        entries=[
            {"dimensions": {"content": "Entry 1", "useful_for": "..."}},
            {"dimensions": {"content": "Entry 2"}},
            # ... up to 100 entries per call
        ],
        deduplicate=True,  # default: skip near-duplicates
    )
    print(f"Ingested: {result['ingested']}, Errors: {len(result['errors'])}")

Batch ingest uses partial failure handling: successful entries are ingested even if some fail. Check result["errors"] for details.
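Because each call accepts at most 100 entries, larger imports have to be split across several calls. One way to do that is sketched below. `ingest_in_chunks` is a hypothetical helper, and it assumes `result["ingested"]` is an integer count and `result["errors"]` is a list, as the print statement above suggests. `FakeKB` is only a stand-in so the sketch runs offline.

```python
def ingest_in_chunks(kb, entries, chunk_size=100, deduplicate=True):
    """Ingest any number of entries in batches of at most 100."""
    if not 1 <= chunk_size <= 100:
        raise ValueError("chunk_size must be between 1 and 100")
    summary = {"ingested": 0, "errors": []}
    for start in range(0, len(entries), chunk_size):
        batch = entries[start:start + chunk_size]
        result = kb.ingest(batch, deduplicate=deduplicate)
        # Assumed shapes: an integer count and a list of error details.
        summary["ingested"] += result["ingested"]
        summary["errors"].extend(result["errors"])
    return summary


class FakeKB:
    """Stand-in for a KnowledgeBank object; records batch sizes only."""

    def __init__(self):
        self.batch_sizes = []

    def ingest(self, entries, deduplicate=True):
        self.batch_sizes.append(len(entries))
        return {"ingested": len(entries), "errors": []}


fake_kb = FakeKB()
entries = [{"dimensions": {"content": f"Entry {i}"}} for i in range(250)]
summary = ingest_in_chunks(fake_kb, entries)
```

With a real KnowledgeBank object in place of `FakeKB`, the combined `summary["errors"]` list gives a single place to inspect failures across all batches.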

Events & Conflicts

The Knowledge Bank tracks every deduplication decision made during ingestion. Use events to audit how entries are being deduplicated, view merge reasoning, and inspect conflicts.

Deduplication Events

Each time an entry is ingested, the system compares it against existing entries. The result is one of five actions:

Action | Meaning
new | No duplicates found; entry added as-is
redundant | Near-exact duplicate; entry skipped
update | Partial overlap; existing entry updated with merged content
conflict | Contradicting information; requires review
error | Processing error occurred

Python
    # List all dedup events (paginated)
    events = kb.list_events(limit=50, offset=0)
    print(f"Total events: {events['total']}")
    print(f"Action breakdown: {events['action_counts']}")

    # Filter by action type
    conflicts = kb.list_events(action="conflict")
    for evt in conflicts["events"]:
        print(f"  {evt['content_preview']} (score: {evt['max_similarity_score']})")

    # Get full details of a specific event
    detail = kb.get_event(conflicts["events"][0]["id"])
    print(f"Reasoning: {detail['reasoning']}")
    print(f"Merged content: {detail['merged_content']}")
    print(f"Similar entries: {detail['similar_entries']}")
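Since `list_events` is paginated, auditing every event means walking the pages. The generator below is a rough sketch of that loop. `iter_events` is a hypothetical helper, and it assumes `limit`, `offset`, and `action` can be combined in one call (the REST endpoint lists all three as query parameters). `FakeKB` is a stand-in so the example runs offline.

```python
def iter_events(kb, action=None, page_size=50):
    """Yield every dedup event, fetching one page at a time."""
    offset = 0
    while True:
        kwargs = {"limit": page_size, "offset": offset}
        if action is not None:
            kwargs["action"] = action
        page = kb.list_events(**kwargs)
        events = page["events"]
        yield from events
        offset += len(events)
        # Stop on an empty page or once the reported total is reached.
        if not events or offset >= page["total"]:
            break


class FakeKB:
    """Stand-in for a KnowledgeBank object; serves events from a list."""

    def __init__(self, events):
        self._events = events

    def list_events(self, limit=50, offset=0, action=None):
        matching = [
            e for e in self._events if action is None or e["action"] == action
        ]
        return {"events": matching[offset:offset + limit], "total": len(matching)}


fake_kb = FakeKB(
    [{"id": str(i), "action": "conflict" if i % 10 == 0 else "new"} for i in range(120)]
)
all_events = list(iter_events(fake_kb, page_size=50))
conflicts = list(iter_events(fake_kb, action="conflict"))
```

The empty-page check means the loop still terminates if `total` turns out to count all events rather than only the filtered ones.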

Entry Event Logs

For deeper debugging, entry event logs capture the full processing lifecycle of each entry: queuing, hash checks, deduplication search, LLM decisions, and action execution.

Python
    # List all event logs for a KB
    logs = kb.list_event_logs(limit=100)

    # Filter by event type
    dedup_logs = kb.list_event_logs(event_type="dedup")

    # Filter by specific entry
    entry_logs = kb.list_event_logs(entry_id="pending-entry-uuid")
    for log in entry_logs["logs"]:
        status = "OK" if log["success"] else "FAIL"
        print(f"  [{status}] {log['event_type']}/{log['event_name']} ({log['duration_ms']}ms)")

Event logs are append-only and ordered by created_at descending. Use entry_id to reconstruct the complete processing history of a single entry.
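Because logs arrive newest first, reconstructing an entry's history means reversing them. The sketch below does that and picks out the failed steps. `entry_timeline` is a hypothetical helper, and the sample `event_name` values are invented for illustration; only the field names come from the example above.

```python
def entry_timeline(logs_response):
    """Return one entry's logs oldest-first, plus the steps that failed."""
    ordered = list(reversed(logs_response["logs"]))  # response is newest-first
    failed = [
        f"{log['event_type']}/{log['event_name']}"
        for log in ordered
        if not log["success"]
    ]
    return ordered, failed


# Made-up sample shaped like a list_event_logs(entry_id=...) response.
sample = {
    "logs": [
        {"event_type": "action", "event_name": "execute", "success": False, "duration_ms": 12},
        {"event_type": "dedup", "event_name": "search", "success": True, "duration_ms": 85},
        {"event_type": "lifecycle", "event_name": "queued", "success": True, "duration_ms": 1},
    ]
}
timeline, failed_steps = entry_timeline(sample)
```

This covers a single page of logs; an entry with more logs than one page holds would need the pages concatenated first.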

MCP Integration

Get everything needed to inject your Knowledge Bank's MCP server into an AI agent. Returns the MCP URL, server name, and available tools. Use your KB API key directly as the Bearer token for MCP requests.

Python
    mcp_def = kb.get_mcp_server_definition()
    # Returns:
    # {
    #   "type": "http",
    #   "url": "https://www.gurucloudai.com/mcp/srv-uuid/mcp",
    #   "server_name": "my-kb",
    #   "available_tools": ["query_knowledge_bank", "report_learning"],
    #   "auth": {"type": "bearer", "note": "Use your KB API key..."},
    # }

    # Inject into your agent's MCP config using your API key:
    agent_config = {
        "mcpServers": {
            mcp_def["server_name"]: {
                "type": mcp_def["type"],
                "url": mcp_def["url"],
                "headers": {
                    "Authorization": f"Bearer {api_key}"
                }
            }
        }
    }

    # Or generate a dedicated PAT (requires admin scope):
    pat_info = kb.generate_pat(token_name="My Agent")
    # pat_info["token"] is a never-expiring PAT for this MCP server

Error Handling

Python
    from gurucloud_kb import (
        GuruCloudClient,
        AuthenticationError,
        NotFoundError,
        RateLimitError,
        APIError,
    )

    client = GuruCloudClient(api_key="kb_your_api_key")

    try:
        kb = client.get_kb("nonexistent")
    except AuthenticationError:
        print("Invalid API key")
    except NotFoundError:
        print("KB not found")
    except RateLimitError:
        print("Rate limit exceeded - slow down")
    except APIError as e:
        print(f"API error {e.status_code}: {e.message}")

Exception | HTTP Status | When
AuthenticationError | 401 | Invalid or missing API key
PermissionError | 403 | Insufficient scope
NotFoundError | 404 | Resource not found
RateLimitError | 429 | Rate limit exceeded
APIError | * | Any other API error
ConnectionError | n/a | Network/timeout error
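A common response to a 429 is to retry with exponential backoff. The following is a minimal sketch of that pattern, not SDK behavior: `call_with_backoff` is a hypothetical helper, and `DemoRateLimit` stands in for the SDK's RateLimitError so the example runs without the package installed.

```python
import time


def call_with_backoff(fn, retry_on, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on `retry_on` with delays of 1x, 2x, 4x base_delay."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)


class DemoRateLimit(Exception):
    """Stand-in for the SDK's RateLimitError."""


calls = {"n": 0}


def flaky():
    """Fail twice with a rate-limit error, then succeed."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise DemoRateLimit()
    return "ok"


delays = []  # record the delays instead of actually sleeping
result = call_with_backoff(flaky, DemoRateLimit, sleep=delays.append)
```

With the real SDK this might be used as `call_with_backoff(lambda: kb.search("query"), RateLimitError)`.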

REST API: Knowledge Banks

GET /banks

List all Knowledge Banks. Scope: read

GET /banks/{kb_id}

Get a specific Knowledge Bank. Scope: read

POST /banks

Create a new Knowledge Bank. Scope: write

    {
      "name": "My KB",
      "description": "Optional description",
      "dimension_schema": { ... }  // optional
    }

PATCH /banks/{kb_id}

Update KB name/description. Scope: write

DELETE /banks/{kb_id}

Delete a KB and all resources. Scope: admin

REST API: Entries

GET /banks/{kb_id}/entries?limit=50&offset=0

List entries with pagination. Scope: read

POST /banks/{kb_id}/entries

Add a single entry. Scope: write

    {
      "dimensions": {
        "content": "The main knowledge content",
        "useful_for": "What this is useful for"
      },
      "source": "optional-source-label",
      "relevant_file_paths": ["path/to/file.py"]
    }

POST /banks/{kb_id}/entries/batch

Batch ingest up to 100 entries. Scope: write

    {
      "entries": [
        {"dimensions": {"content": "..."}},
        {"dimensions": {"content": "..."}}
      ],
      "deduplicate": true
    }

GET /banks/{kb_id}/entries/{entry_id}

Get a single entry. Scope: read

PATCH /banks/{kb_id}/entries/{entry_id}

Update an entry's dimensions. Scope: write

DELETE /banks/{kb_id}/entries/{entry_id}

Delete an entry. Scope: write

POST /banks/{kb_id}/search

Multi-dimensional semantic search. Scope: read

    {
      "dimensions": {
        "content": {"query": "search text", "weight": 1.0},
        "useful_for": {"query": "context", "weight": 0.5}
      },
      "k": 10,
      "threshold": 0.5
    }

REST API: Schema

GET /banks/{kb_id}/schema

Get the dimension schema. Scope: read

PUT /banks/{kb_id}/schema

Replace the full schema. Scope: admin

POST /banks/{kb_id}/schema/validate

Validate a schema without applying. Scope: read

POST /banks/{kb_id}/schema/dimensions

Add a dimension. Scope: admin

DELETE /banks/{kb_id}/schema/dimensions/{name}

Remove a dimension. Scope: admin

REST API: MCP

GET /banks/{kb_id}/mcp-config

Get .mcp.json config snippet. Scope: read

GET /banks/{kb_id}/mcp-tools

Get MCP tool definitions. Scope: read

POST /banks/{kb_id}/mcp-server-definition

Get MCP server URL, name, and available tools. Scope: read

Use your KB API key directly as the Bearer token for MCP requests. Alternatively, generate a dedicated PAT via the endpoint below.

POST /banks/{kb_id}/generate-pat

Generate a Personal Access Token for this KB's MCP server. Scope: admin

Body (optional): {"token_name": "My Agent"}. Returns token, server_url, token_name, note. Tokens do not expire. Store securely.

GET /banks/{kb_id}/stats

Get performance statistics. Scope: read

REST API: Events

GET /banks/{kb_id}/events

List deduplication events. Scope: read

Query params: limit (default 50, max 200), offset (default 0), action (filter: new, redundant, update, conflict, error)

    // Response
    {
      "events": [
        {
          "id": "uuid",
          "source": "mcp_tools",
          "content_preview": "Auth uses JWT...",
          "max_similarity_score": 0.95,
          "llm_invoked": true,
          "action": "update",
          "created_at": "2026-03-01T00:00:00"
        }
      ],
      "total": 42,
      "action_counts": {"new": 30, "update": 10, "conflict": 2}
    }
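The query parameters for GET /banks/{kb_id}/events can be validated client-side before a request is made. This is a small sketch of that idea: `events_url` is a hypothetical helper, the lower bound of 1 on `limit` and the non-negative `offset` check are assumptions, and the upper bound of 200 and the action names come from the parameter list above.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.gurucloudai.com/api/v1/kb"
VALID_ACTIONS = {"new", "redundant", "update", "conflict", "error"}


def events_url(kb_id, limit=50, offset=0, action=None):
    """Build the events listing URL, rejecting out-of-range parameters."""
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    params = {"limit": limit, "offset": offset}
    if action is not None:
        if action not in VALID_ACTIONS:
            raise ValueError(f"unknown action: {action!r}")
        params["action"] = action
    return f"{BASE_URL}/banks/{kb_id}/events?{urlencode(params)}"


url = events_url("your-kb-uuid", limit=200, action="conflict")
```

Rejecting bad values locally avoids spending a rate-limited request on a call that might fail server-side.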
GET /banks/{kb_id}/events/{event_id}

Get full details of a specific event. Scope: read

Returns the complete event including reasoning, merged_content, similar_entries, and execution details.

GET /banks/{kb_id}/event-logs

List entry processing event logs. Scope: read

Query params: limit, offset, event_type (lifecycle, hash_check, dedup, action), entry_id (pending entry UUID)

REST API: API Keys

API key management endpoints use session authentication (login cookies), not Bearer token auth. These are for the dashboard UI.

POST /api-keys

Create a new API key. Returns the raw key once.

    {
      "name": "My Key",
      "scopes": ["read", "write"],
      "rate_limit_per_hour": 1000,
      "expires_at": "2027-01-01T00:00:00Z"  // optional
    }

GET /api-keys

List all API keys (masked).

DELETE /api-keys/{key_id}

Delete an API key.