Case Study 04

RAG Knowledge Agent

Industry: Real Estate (Brokerage Knowledge Management)
Platform: LangGraph + FastAPI
Build time: ~1 week
Service: AI Agent Implementation

Portfolio build — a complete, working system built and measured by Altus Initiatives to demonstrate this capability. Performance metrics below are measured from the build; agency-impact figures are projected for a representative brokerage.

The problem

Knowledge-intensive businesses — real estate brokerages included — face a compounding information problem. As the agency grows, so does the volume of questions. As processes evolve, so does the complexity of the answers. And the cost of a wrong answer compounds too: misquoted commission structures, incorrect policy guidance, and missed escalations all have downstream consequences — for the client, for the agent, and for the brokerage's reputation.

Generic AI assistants fail here in a predictable way. They generate confident, fluent responses that aren't grounded in the business's actual data. They hallucinate policy details, invent fee structures, and produce answers that sound authoritative but are wrong. The result is worse than no AI at all — it's misinformation delivered with confidence.

The solution isn't a smarter model. It's a system that retrieves the right information before generating any response, reasons over real business data, knows when to act versus when to escalate, and keeps a human in the loop for high-stakes decisions.

The solution

A knowledge agent that answers questions accurately — grounded in the agency's actual documentation, policies, and client data — and routes consequential actions through human approval before executing them. The system operates across three layers:

Layer 1 — Knowledge Ingestion and Retrieval

The agency's knowledge base — process documentation, policy guides, commission structures, FAQ content — is ingested, structured, and stored for retrieval. The system is designed to surface the right information consistently, even on imprecise or poorly worded queries.

  • Structured chunking — Documents are broken into overlapping segments that preserve boundary context, with each chunk tagged with source metadata for traceable retrieval
  • Hybrid retrieval — Vector similarity search combined with keyword matching, ensuring relevant documents surface even when semantic similarity alone would miss them — critical for exact policy references and specific commission tier labels
  • Reranking — Retrieved candidates are reranked by relevance before passing to the agent, improving precision on ambiguous multi-part queries
  • Query transformation — Vague or poorly formed queries are rewritten into cleaner search terms before retrieval, addressing one of the most common failure modes in knowledge retrieval systems
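As an illustration of the structured chunking item above, here is a minimal sketch of overlapping chunking with source metadata. It is not the build's actual ingestion code: the `Chunk` type, the `chunk_document` name, and the character-based window sizes are assumptions chosen for brevity (a real pipeline would more likely split on sentence or token boundaries).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    source: str  # document the chunk came from, kept for traceable retrieval
    start: int   # character offset of the chunk within that document


def chunk_document(text: str, source: str, size: int = 200, overlap: int = 50) -> list[Chunk]:
    """Split text into fixed-size windows that share `overlap` characters."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(Chunk(text[start:start + size], source, start))
        if start + size >= len(text):
            break  # this window already reaches the end of the document
    return chunks
```

Because each window starts `size - overlap` characters after the previous one, a sentence that straddles a boundary appears whole in at least one chunk as long as it is shorter than the overlap.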

Layer 2 — Agent Orchestration

The agent is built as a structured graph with three nodes and conditional routing — giving it the ability to reason, retrieve, act, and pause for human approval:

Nodes

  • Reasoning node — The agent reads the full conversation state and decides whether to respond directly, retrieve from the knowledge base, or invoke a data tool
  • Execution node — Runs the requested tool and returns the result to the agent for continued reasoning
  • Approval node — Pauses execution before consequential actions and requires explicit human confirmation before proceeding. The agent cannot route around this gate — it is structural, not advisory
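The routing between these three nodes can be sketched without any framework. The code below is a plain-Python stand-in, not the LangGraph implementation used in the build; `GATED_TOOLS`, `route_after_reasoning`, `run_agent`, and the shape of the `reason` callable are all hypothetical names introduced for illustration. The point it shows is that the approval decision depends only on the tool name, so the model cannot choose to skip it.

```python
from typing import Optional

GATED_TOOLS = {"escalation_routing"}  # assumption: which tools count as consequential


def route_after_reasoning(tool_call: Optional[dict]) -> str:
    """Pick the next node from the tool name alone; the model has no say in gating."""
    if tool_call is None:
        return "respond"
    return "approval" if tool_call["name"] in GATED_TOOLS else "execution"


def run_agent(reason, tools, approve, question, max_steps=8):
    """reason(history) returns either {"answer": str} or {"tool": {"name", "args"}}."""
    history = [("user", question)]
    for _ in range(max_steps):
        decision = reason(history)
        call = decision.get("tool")
        node = route_after_reasoning(call)
        if node == "respond":
            return decision["answer"]
        if node == "approval" and not approve(call):
            # A rejected action is reported back to the reasoning node, never executed.
            history.append(("tool", f"{call['name']} rejected by reviewer"))
            continue
        result = tools[call["name"]](**call["args"])
        history.append(("tool", result))
    raise RuntimeError("agent did not finish within max_steps")
```

In this sketch `approve` is a synchronous callable; in a deployed system it would stand for whatever pauses the run until a human responds.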

Tools available to the agent

  • Knowledge base search — Hybrid retrieval with reranking and query transformation. The primary tool for all policy, process, and documentation queries
  • Client lookup — Queries the client database by name, email, or account ID. Returns account details and interaction history
  • Commission calculator — A dedicated calculation engine that computes fees based on transaction type, price, and agreement terms — eliminating AI inference on structured financial logic
  • Client communication drafting — Generates professional, context-aware client communications grounded in retrieved knowledge base content
  • Escalation routing — Creates a referral or escalation record with automatic priority assignment. Gated behind the approval node — requires human confirmation before execution

Layer 3 — Service Layer

The agent is served via a backend API with session management — maintaining full conversation context per user across multiple exchanges, supporting concurrent team members without state interference between sessions.
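A minimal sketch of per-session state isolation, assuming an in-memory store keyed by session ID. The `SessionStore` class and its method names are hypothetical and independent of FastAPI; they only illustrate how concurrent users can share one process without seeing each other's history.

```python
import threading
from collections import defaultdict


class SessionStore:
    """In-memory, per-session message history that is safe to share across threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._histories = defaultdict(list)

    def append(self, session_id: str, role: str, content: str) -> None:
        with self._lock:
            self._histories[session_id].append({"role": role, "content": content})

    def history(self, session_id: str) -> list[dict]:
        with self._lock:
            # Return a copy of the list so callers cannot append to the stored history.
            return list(self._histories.get(session_id, []))
```

An API handler would call `history(session_id)` before invoking the agent and `append(...)` after it responds, which matches the lookup and update steps around the agent graph.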

Results

Grounded, not guessed

For a brokerage with deep process documentation and a team asking the same policy and procedure questions repeatedly, this system functions as an always-available knowledge resource. Answers are grounded in retrieved documentation, which keeps them consistent and sharply reduces the hallucination risk that makes generic AI tools a liability in client-facing contexts.

  • Policy and process questions answered accurately from the agency's own documentation — not from general AI inference. The distinction matters when commission structures, disclosure requirements, and transaction procedures vary by brokerage.
  • Commission calculations handled deterministically — financial figures produced by a structured calculation engine, not estimated by a language model. No arithmetic errors, no hallucinated fee tiers.
  • Consequential actions gated by design — escalation routing and referral creation cannot execute without human confirmation. This is architectural, not a setting that can be toggled off.
  • Multi-step queries handled in a single session — an agent can look up a client, retrieve the relevant policy, calculate the applicable fee, and draft a communication in one conversation, without losing context between steps.
  • Response quality consistent regardless of how the question is phrased — query transformation means imprecise inputs still retrieve the right information.

Architecture

Inbound Query (API)
        │
        ▼
Session Memory Lookup
        │
        ▼
┌─────────────────────────────────────────────────────────┐
│              Agent Graph                                │
│                                                         │
│   Reasoning Node                                        │
│         │                                               │
│         ▼ Route after reasoning                         │
│         │                                               │
│    ┌────┴──────────────────┐                            │
│    │                       │                            │
│    ▼                       ▼                            │
│  Execution Node       Approval Node                     │
│  (tool execution)     (human-in-the-loop gate)          │
│    │                       │                            │
│    ▼ Route after tool      ▼ Route after approval       │
│    │                       │                            │
│    └──────────┬────────────┘                            │
│               │                                         │
│               ▼                                         │
│         Reasoning Node (continued)                      │
│               │                                         │
│               ▼                                         │
│         Final Response                                  │
└─────────────────────────────────────────────────────────┘
        │
        ▼
Session Memory Update
        │
        ▼
Response delivered

Tools: knowledge base search · client lookup · commission calculator · client communication drafting · escalation routing

Full architecture documentation available upon engagement.

Tech stack

Component            Tool
Agent orchestration  LangGraph
LLM                  OpenAI / Anthropic Claude (configurable)
Knowledge base       Structured document store (FAQ, policy, process documentation)
Vector store         Chroma
Retrieval strategy   Hybrid (vector + keyword) with reranking and query transformation
Backend framework    FastAPI (Python)
Session memory       Per-session store

Key design decisions

Dedicated calculation tool over AI inference. Commission and fee calculations are handled by a structured calculation engine, not the language model. Language models are unreliable on precise arithmetic and structured financial logic: errors are rare enough to pass casual testing, but a single misquoted fee is costly. Moving financial calculations into a deterministic tool removes model arithmetic from exactly the queries where accuracy matters most.
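To make the idea of a deterministic calculation tool concrete, here is a small sketch using exact decimal arithmetic. The tier boundaries and rates in `EXAMPLE_TIERS` are invented for illustration and are not the agency's commission structure; the `commission` function name is likewise hypothetical, and a real engine would also take transaction type and agreement terms into account.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical marginal tiers: (upper bound of tier, rate). None means no upper bound.
EXAMPLE_TIERS = [
    (Decimal("500000"), Decimal("0.03")),
    (None, Decimal("0.02")),
]


def commission(price: Decimal, tiers=EXAMPLE_TIERS) -> Decimal:
    """Apply each rate only to the slice of the price that falls inside its tier."""
    if price < 0:
        raise ValueError("price must be non-negative")
    total = Decimal("0")
    lower = Decimal("0")
    for upper, rate in tiers:
        top = price if upper is None else min(price, upper)
        if top > lower:
            total += (top - lower) * rate
        if upper is None or price <= upper:
            break
        lower = upper
    # Round to cents once, at the end, so intermediate sums stay exact.
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

With these example tiers, a price of 750,000 yields 3% of the first 500,000 plus 2% of the remaining 250,000, which is 20,000.00. The agent passes structured arguments to the tool and relays the result; it never performs the arithmetic itself.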

Human approval as a graph node, not a downstream filter. The approval gate is built into the agent's graph architecture as a first-class node, not added as an afterthought. This means the agent structurally cannot execute consequential actions — escalations, referral routing — without human confirmation. It cannot be configured around, and it does not depend on the agent correctly deciding to ask. For any system that takes actions with real-world consequences, this is the correct architecture.

Query transformation before retrieval. Retrieval quality is only as good as the query going in. Vague, incomplete, or imprecise queries are rewritten into cleaner search terms before they reach the knowledge base. This addresses one of the most common failure modes in knowledge retrieval — poor results on reasonable but imprecise inputs — without requiring the user to learn how to phrase questions correctly.
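One way to wire in a rewrite step is sketched below. It assumes the language model is reachable through a simple `complete(prompt) -> str` callable, which is a placeholder rather than any specific vendor API; `REWRITE_PROMPT` and `transform_query` are also hypothetical. The sketch falls back to the user's original wording whenever the rewrite fails or looks unusable, so retrieval never depends on the rewrite succeeding.

```python
from typing import Callable

REWRITE_PROMPT = (
    "Rewrite the user's question as a short, specific search query for a "
    "brokerage knowledge base. Return only the query.\n\nQuestion: {question}"
)


def transform_query(question: str, complete: Callable[[str], str]) -> str:
    """Ask the model for a cleaner query; fall back to the original on a bad reply."""
    original = " ".join(question.split())  # collapse stray whitespace
    try:
        rewritten = complete(REWRITE_PROMPT.format(question=original)).strip()
    except Exception:
        return original  # retrieval should still run if the rewrite call fails
    # Guard against empty or runaway rewrites.
    if not rewritten or len(rewritten) > 200:
        return original
    return rewritten
```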

Hybrid retrieval over vector-only. Semantic similarity search misses exact matches — specific policy clause references, commission tier labels, procedure names. Combining vector search with keyword matching ensures both meaning-based and term-based relevance are captured on every retrieval pass. In a compliance-sensitive context like real estate, missing an exact policy reference is not an acceptable failure mode.
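The source does not state how the vector and keyword results are merged. One common option is reciprocal rank fusion, sketched here as an assumption rather than a description of the build. The function takes ranked lists of document IDs (for example, one from vector search and one from keyword search) and the conventional smoothing constant `k = 60`.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists of document ids; each hit scores 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score, highest first; break ties by id so the order is stable.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

A document that appears in both lists outranks one that appears in only one, which is how an exact clause reference found by keyword matching can surface even when its embedding similarity is middling. The fused list would then be passed to the reranker.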

Production considerations

The build is designed to be deployment-ready. A full production rollout adds the following standard upgrades, delivered as part of the implementation engagement:

  • Session persistence — In-memory session storage is replaced with a durable store for resilience across server restarts and multi-instance deployments.
  • Authentication — API authentication is added to the service layer to prevent unauthorized access.
  • Observability — Full tracing is integrated for visibility into agent reasoning, tool calls, and retrieval quality across every conversation — essential for quality monitoring and continuous improvement.
  • Rate limiting — Per-session rate limiting is implemented to control API costs and prevent abuse.
  • Approval interface — The human-in-the-loop approval gate is connected to a real approval interface — Slack interactive message, web dashboard, or email confirmation — appropriate to the agency's existing workflow.
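The per-session rate limiting item above could be implemented in several ways; a token bucket is one common choice and is sketched here under that assumption. `SessionRateLimiter` is a hypothetical name, the injectable `clock` exists only to make the behaviour testable, and this in-memory version is neither thread-safe nor shared across multiple instances.

```python
import time


class SessionRateLimiter:
    """Token bucket per session: `capacity` burst, refilled at `rate` tokens per second."""

    def __init__(self, capacity: int, rate: float, clock=time.monotonic):
        self.capacity, self.rate, self.clock = capacity, rate, clock
        self._buckets = {}  # session_id -> (tokens, time of last request)

    def allow(self, session_id: str) -> bool:
        now = self.clock()
        tokens, last = self._buckets.get(session_id, (float(self.capacity), now))
        # Refill for the time elapsed since the last request, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        allowed = tokens >= 1
        self._buckets[session_id] = (tokens - 1 if allowed else tokens, now)
        return allowed
```

A request handler would call `allow(session_id)` before invoking the agent and return an HTTP 429 response when it is False.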

Want a system like this in your agency?

This is the same architecture we build for clients. The first step is a 30-minute discovery call — no pitch, no commitment.
