AI Engineering

Building a Knowledge Bank: How We Achieved 57% Context Reduction Through Compound Learning

Scott Alter Β· November 13, 2025 Β· 13 min read

What if AI agents could learn from each other's work? We built a Knowledge Bank system to find out, and the results exceeded our expectations: 57% context reduction, 100% file reading elimination for documented topics, and clear compound knowledge accumulation effects.

  β€’ 57% context reduction
  β€’ 100% file reading eliminated
  β€’ 95% KB coverage achieved
  β€’ 30x ROI on gotchas

The Problem: Every Agent Starts from Scratch

When working with AI coding agents, we noticed a frustrating pattern: every agent would explore the same files, discover the same patterns, and learn the same lessons. There was no memory, no accumulation of knowledge, no compound learning effect.

For example, five different agents exploring our Docker architecture would each:

  • Read the same 4 configuration files (~700 lines)
  • Parse the same service definitions
  • Discover the same gotchas (silent container failures, database routing, etc.)
  • Use ~35,000 context tokens each

This was inefficient and expensive. We needed a system where Agent 2 could benefit from Agent 1's discoveries, and Agent 10 could benefit from all nine agents before it.

The Solution: A Semantic Knowledge Bank

We built a Knowledge Bank using PostgreSQL with the pgvector extension and OpenAI embeddings. The architecture is straightforward but powerful:

Core Components

  β€’ πŸ’Ύ Storage Layer: PostgreSQL with the pgvector extension stores knowledge entries with flexible JSONB metadata and a separate embedding vector for each search dimension.
  β€’ πŸ” Search Layer: Multi-dimensional semantic search using OpenAI embeddings with cosine similarity. Queries search across content, use-cases, systems, and tasks simultaneously.
  β€’ πŸ”Œ MCP Interface: Model Context Protocol tools let agents query the KB in natural language and automatically store new discoveries with validation.
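
To make the storage layer concrete, here's a minimal sketch of what the schema might look like. The table and column names are illustrative assumptions on our part, not the production schema; the four embedding columns mirror the four search dimensions described below.

```python
# Illustrative storage schema; all names here are assumptions, not the
# production schema. Requires: pip install "psycopg[binary]", plus a
# PostgreSQL server with the pgvector extension available.
import psycopg

CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS kb_entries (
    id         BIGSERIAL PRIMARY KEY,
    content    TEXT NOT NULL,        -- the knowledge itself
    useful_for TEXT NOT NULL,        -- when an agent would need it
    systems    TEXT,                 -- which components it touches
    tasks      TEXT,                 -- which operations it relates to
    metadata   JSONB DEFAULT '{}',   -- e.g. {"type": "gotcha"}
    -- one embedding per search dimension (1536 dims for OpenAI embeddings)
    content_embedding    vector(1536),
    useful_for_embedding vector(1536),
    systems_embedding    vector(1536),
    tasks_embedding      vector(1536)
);
"""

with psycopg.connect("postgresql://localhost/kb") as conn:  # hypothetical DSN
    conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    conn.execute(CREATE_TABLE)
```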

Three Types of Knowledge

We categorize knowledge into three types, each stored with metadata for efficient retrieval:

  • Conceptual: Architecture overviews, system explanations, workflows (metadata.type: NOT SET)
  • Examples: Validated code patterns with proof (metadata.type: "example")
  • Gotchas: Common mistakes and edge cases (metadata.type: "gotcha")

How It Works: Query Examples

The Knowledge Bank uses natural language queries across multiple dimensions. Here's what agents actually ask:

πŸ” Example Query 1: Understanding Docker Architecture
content: "Docker Compose services and configuration"
useful_for: "Setting up development environment"
systems: "Docker, containers"
βœ… Returns: 12 entries covering all 16 services, networking setup, volume configuration, and common initialization gotchas
πŸ” Example Query 2: Authentication Implementation
content: "User authentication and session management"
useful_for: "Building login functionality"
num_gotchas: 3
βœ… Returns: Custom decorator patterns, Flask-Login integration details, scope-based auth, plus 3 gotchas about common mistakes
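
A rough sketch of the agent-facing side of such a query. The embedding model choice and helper names are our assumptions; the point is that each query dimension gets its own embedding before the search runs.

```python
# Hypothetical agent-side query construction (names are assumptions).
# Requires: pip install openai; OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

def embed(text: str) -> list[float]:
    """Embed one query dimension as a 1536-dim vector."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return resp.data[0].embedding

query = {
    "content": "User authentication and session management",
    "useful_for": "Building login functionality",
}
# One embedding per dimension; these feed the weighted search shown later.
query_vectors = {dim: embed(text) for dim, text in query.items()}
```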

The Experiments: Four Agents, Two Topics

We ran controlled A/B tests to measure the compound learning effect. The results were dramatic:

Test 1: Docker Architecture (Fresh Topic)

Metric            Agent 1 (Empty KB)    Agent 2 (With KB)    Improvement
Context Tokens    35,000                15,000               -57%
Files Read        4 files               0 files              -100%
KB Coverage       ~30%                  95%                  +217%

πŸ’‘ Key Insight: Agent 2 achieved 95% knowledge coverage from KB queries alone, requiring ZERO file reads. This is the compound effect in action.

Key Findings

1. The Compound Effect is Real

With 100 agents on the same topic:

❌ Without Knowledge Bank: every agent starts from scratch
   100 agents Γ— 35K tokens = 3.5M tokens

βœ… With Knowledge Bank: agents learn from each other
   1 Γ— 35K + 99 Γ— 15K = 1.52M tokens

πŸ’° Savings: 56.6% total reduction (1.98M tokens saved)

2. KB Saturation Happens Fast

It takes just 2 agents per topic to reach 95% KB coverage:

  • Agent 1: Documents 60-70% (discovers fundamentals)
  • Agent 2: Documents 20-30% (fills gaps)
  • Agent 3+: Find KB complete, add minimal new knowledge

3. The Gotcha Feature: Learning from Mistakes

Gotchas have exceptional return on investment:

  • Cost to capture: 2-5 minutes agent time
  • Savings for next agent: 30-120 minutes debugging time
  • ROI: 10-30x return on knowledge investment

Why Multi-Dimensional Search Matters

Most vector databases use single-dimensional semantic searchβ€”one embedding captures all the content. But agents ask questions from different perspectives: what something does, when to use it, which systems it affects. Our multi-dimensional approach creates separate embeddings for four different aspects, then searches across all of them simultaneously:

Multi-Dimensional Search Architecture

  β€’ πŸ“ Content: "What is this about?" (weight 1.0)
  β€’ 🎯 Useful For: "When would I need this?" (weight 1.0)
  β€’ πŸ”§ Systems: "Which components?" (weight 0.3)
  β€’ βš™οΈ Tasks: "What operations?" (weight 0.3)

πŸ” Combined semantic similarity: queries match on meaning, not just keywords, even when phrased differently.

The Math: Vector Similarity in Action

Each knowledge entry and query is converted into embedding vectors (1536-dimensional arrays of numbers). Similarity is measured using cosine similarity, which calculates the angle between vectors:

Cosine Similarity Formula
similarity = (A Β· B) / (||A|| Γ— ||B||)
Mathematical range: -1 to +1 β€’ similarities between OpenAI embeddings typically fall between 0.4 and 1.0 β€’ we use a threshold of β‰₯ 0.5 for matches
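
In code, cosine similarity is nearly a one-liner. A minimal numpy sketch with a toy 3-dimensional example (real embeddings have 1536 dimensions):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Dot product divided by the product of the vector magnitudes."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([0.2, 0.9, 0.1])
b = np.array([0.25, 0.8, 0.3])
print(round(cosine_similarity(a, b), 3))  # 0.969: nearly parallel vectors
```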

For multi-dimensional search, we combine scores from each dimension using weighted averages:

Final Score = (1.0 Γ— content_similarity + 1.0 Γ— useful_for_similarity + 0.3 Γ— systems_similarity + 0.3 Γ— tasks_similarity) / 2.6

where 2.6 is the total weight (1.0 + 1.0 + 0.3 + 0.3).
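
Using pgvector's cosine-distance operator `<=>` (similarity = 1 - distance), this weighted combination can run as a single SQL query. Here's a sketch against the illustrative kb_entries schema from earlier; it assumes all four embedding columns are populated, so NULL handling is omitted:

```python
# Weighted multi-dimensional search in one query. The %(...)s parameters
# are the four query embeddings (e.g. passed via psycopg after calling
# pgvector.psycopg.register_vector on the connection).
SEARCH_SQL = """
SELECT id, content,
       (1.0 * (1 - (content_embedding    <=> %(q_content)s)) +
        1.0 * (1 - (useful_for_embedding <=> %(q_useful)s)) +
        0.3 * (1 - (systems_embedding    <=> %(q_systems)s)) +
        0.3 * (1 - (tasks_embedding      <=> %(q_tasks)s))) / 2.6 AS score
FROM kb_entries
ORDER BY score DESC   -- highest combined similarity first
LIMIT 10;             -- apply the 0.5 threshold to the returned scores
"""
```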

Real Example: Why Dimensions Matter

Let's see how multi-dimensional search finds the right knowledge even when queries don't match exactly:

πŸ“š Stored Knowledge
  β€’ Content: "Docker container initialization with entrypoint scripts and health checks"
  β€’ Useful For: "Debugging why containers exit silently on startup"
  β€’ Systems: "Docker, Docker Compose, containers"

πŸ” Agent Query
  β€’ Content: "How to troubleshoot application startup failures"
  β€’ Useful For: "Fixing services that crash immediately"
  β€’ Systems: "Docker"
βœ… Similarity Breakdown:

  β€’ Content similarity: 0.68 (different words, similar meaning)
  β€’ Useful For similarity: 0.85 (strong match on the use-case!)
  β€’ Systems similarity: 0.92 (both mention Docker)
  β€’ Tasks similarity: 0.00 (neither side specifies tasks, so this dimension contributes nothing)

Final Weighted Score: (1.0 Γ— 0.68 + 1.0 Γ— 0.85 + 0.3 Γ— 0.92 + 0.3 Γ— 0.00) / 2.6 β‰ˆ 0.69

69% is a STRONG MATCH βœ“, comfortably above our 0.5 threshold.

This entry ranks highly even though the query uses completely different terminology!
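
A quick sanity check of that arithmetic:

```python
# Recomputing the worked example's weighted score.
weights = {"content": 1.0, "useful_for": 1.0, "systems": 0.3, "tasks": 0.3}
sims    = {"content": 0.68, "useful_for": 0.85, "systems": 0.92, "tasks": 0.0}

score = sum(weights[d] * sims[d] for d in weights) / sum(weights.values())
print(round(score, 2))  # 0.69
```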

πŸ’‘ Why This Matters: The "useful_for" dimension scored 0.85 despite completely different wording ("debugging silent exits" vs "fixing crashes"). This is the power of semantic embeddings: they understand meaning, not just keywords. A single-dimension search over content alone would have had only the 0.68 content similarity to go on and ranked this entry far lower; the use-case dimension is what pushes it to the top.

Conclusion

Building a Knowledge Bank for AI agents taught us that compound learning is not just possibleβ€”it's incredibly effective. With the right architecture and prompt engineering, we achieved:

  • 57% context reduction for fresh topics
  • 100% file reading elimination for documented topics
  • 95%+ KB coverage after just 2 agents per topic
  • 10-30x ROI on gotcha documentation

The future of AI-assisted development isn't just smarter agentsβ€”it's agents that learn from each other's work. The compound effect is real, and it's powerful.

Thanks for Reading This Far! πŸ€”

You probably thought of some challenges with this system: What about outdated knowledge? How do you prevent noise from low-quality entries? What about redundant or conflicting information? And the big one: how do you efficiently query across multiple vector spaces simultaneously with scalable response times as the KB grows to thousands of entries?

These are exactly the challenges we solved in production. Part 2 will cover:

  β€’ πŸ”„ Knowledge Lifecycle: versioning, deprecation, and automatic staleness detection
  β€’ ✨ Quality Control: validation gates, confidence scoring, and deduplication strategies
  β€’ ⚑ Performance at Scale: index optimization, caching strategies, and query batching

Want to be notified when Part 2 drops? Reach out or join our mailing list. And if you have questions or ideas about these challenges, we'd love to hear from you!

About the Author

Scott Alter is an expert in AI development and cloud solutions at Guru Cloud & AI.
