Case Study 06

Real-Estate Prospecting Agent

Industry: Real Estate — Agencies & Brokerages
Platform: Google Cloud Run (Python)
Build time: ~1 week
Service: AI Agent Implementation

Portfolio build — a complete, working system built by Altus Initiatives to demonstrate this capability, and the same agent Altus runs on its own pipeline. Capability figures below are measured from the build; agency-impact figures are projected for a representative brokerage.

The problem

Before you can win a listing, you have to find the right broker to talk to — and that sourcing work is slow, manual, and easy to do inconsistently. Building a usable prospect list means searching multiple places (LinkedIn, Google Maps, Realtor.com, Zillow), opening each brokerage's site, cross-checking volume and reachability, judging whether the business is actually a fit, and confirming the person isn't already in the pipeline. Done by hand, every prospect is several windows and several judgment calls.

The deeper problem is consistency. Qualification criteria that live in someone's head get applied differently on a busy day than a quiet one. A genuinely good prospect gets skipped; a poor-fit one gets added. And the time spent sourcing is time not spent on the revenue activity — outreach and conversations.

Results

55 candidates screened for $0.67

The agent is deployed and running in production as a schedulable batch job, writing real prospects to the live CRM.

End-to-end qualification at trivial cost. A representative production sweep investigated 55 candidates and added 3 qualified prospects for about $0.67 in AI cost, or roughly 1.2 cents per candidate investigated.
Built to production standard, not demo standard. 193 automated tests cover the deterministic core, both AI steps, every integration, and the deliberate-failure matrix (timeouts, bad data, duplicates, provider outages) — the system degrades gracefully and never crashes a run.
Consistent judgment, every time. The qualification rules are applied identically to every candidate by code, not by whoever is doing the sourcing that day. The AI's role is limited to structured extraction and one piece of free-form writing, the outreach note, which is grounded strictly in extracted facts.
Time returned to revenue work. By removing manual multi-source sourcing and vetting, the owner's time shifts from research to outreach — the activity that actually closes business. (Projected.)

Architecture

A deterministic Python controller drives the whole sweep; the AI is invoked only in two narrow, structured-output steps. Every threshold, count, date check, and spend cap is plain code.

Search request (cities · sources · max leads)
        │
        ▼
  Sweep controller ──────────────► for each city × source (fixed order)
        │                                   │
        │                                   ▼
        │                          Discover candidates
        │                                   │
        ▼                                   ▼
  Per-candidate pipeline (cost-ordered, cheapest skip first):
        De-dup ─► Read page ─► [AI] Extract & assess ─► Qualifier gates
                                                              │
                              ┌──── not qualified ──► skip + log reasoning
                              │
                              └──── qualified ──► Enrich email ─► [AI] Write note
                                                       │
                                                       ▼
                                            Append to CRM (append-only)
        │
        ▼
  Run report (skips surfaced first) ──► durable storage
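
The sweep loop in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not the production controller: the names SearchRequest, run_sweep, discover, and process are hypothetical, and the per_source_limit field is assumed to correspond to the per-source discovery limit described under "Key design decisions".

```python
from dataclasses import dataclass
from itertools import islice, product
from typing import Callable, Iterable


@dataclass(frozen=True)
class SearchRequest:
    """Hypothetical request shape: cities, sources, and the two caps."""
    cities: tuple[str, ...]
    sources: tuple[str, ...]
    max_leads: int          # hard cap on prospects added per sweep
    per_source_limit: int   # max candidates taken from one city/source pair


def run_sweep(
    request: SearchRequest,
    discover: Callable[[str, str], Iterable[str]],
    process: Callable[[str], bool],
) -> tuple[list[str], list[str]]:
    """Visit each city x source pair in a fixed order; stop once the cap is hit."""
    added: list[str] = []
    skipped: list[str] = []
    # product() yields pairs in a deterministic order: cities outer, sources inner.
    for city, source in product(request.cities, request.sources):
        found = islice(discover(city, source), request.per_source_limit)
        for candidate in found:
            if len(added) >= request.max_leads:
                return added, skipped  # cap reached: no further work is started
            (added if process(candidate) else skipped).append(candidate)
    return added, skipped


# Usage with stub functions standing in for real discovery and qualification.
def fake_discover(city: str, source: str) -> list[str]:
    return [f"{city}/{source}/{i}" for i in range(5)]


request = SearchRequest(("A", "B"), ("s1", "s2"), max_leads=3, per_source_limit=2)
added, skipped = run_sweep(request, fake_discover, lambda c: c.endswith("/0"))
```

Because both the iteration order and the caps live in plain code, two runs over the same inputs visit candidates in the same sequence, which is what makes the sweep straightforward to test.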

Full architecture documentation available upon engagement.

Tech stack

Component	Tool
Agent runtime	Python on Google Cloud Run (batch Job)
AI extraction, assessment, and note-writing	Anthropic Claude Haiku
Source discovery	LinkedIn, Google Maps, Realtor.com, Zillow (via licensed data providers)
Website / profile extraction	Firecrawl
Email enrichment	Apollo + Hunter
CRM + reporting	Google Sheets + Google Drive
Secrets & config	Secret Manager (runtime injection)

Key design decisions

The AI extracts; the code decides. Thresholds, counts, dedup, dates, geography, and the spend cap are all deterministic Python. The model's only jobs are fuzzy extraction (is this the owner? is this signal present?) and writing one grounded note. This is what makes the agent testable, cheap, and predictable — and it is why qualification is consistent rather than vibe-based.
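
As a rough sketch of that split, the qualifier can be a pure function over the model's structured output. The field names, the threshold of 20, and the target cities below are illustrative assumptions, not the production ruleset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Extraction:
    """Structured output of the AI extraction step (hypothetical fields)."""
    is_owner: Optional[bool]            # None means the model could not tell
    city: Optional[str]
    annual_transactions: Optional[int]


MIN_TRANSACTIONS = 20                               # illustrative threshold
TARGET_CITIES = frozenset({"austin", "denver"})     # illustrative geography


def qualify(x: Extraction) -> tuple[bool, str]:
    """Deterministic gates: the same input always yields the same decision."""
    if x.is_owner is None or x.city is None or x.annual_transactions is None:
        return False, "required signal not assessed"
    if not x.is_owner:
        return False, "contact is not the owner"
    if x.city.strip().lower() not in TARGET_CITIES:
        return False, f"outside target geography: {x.city}"
    if x.annual_transactions < MIN_TRANSACTIONS:
        return False, f"volume {x.annual_transactions} below {MIN_TRANSACTIONS}"
    return True, "passed all gates"


decision = qualify(Extraction(is_owner=True, city="Austin", annual_transactions=35))
```

Since qualify never calls the model, each gate can be unit-tested with hand-written Extraction values, and every rejection carries a reason that can be logged.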

Cheapest skip first. The pipeline is ordered so a duplicate or an obvious reject is dropped before any paid website read, email lookup, or note generation. Spend is bounded by a hard cap on prospects added and a per-source discovery limit, regardless of how large a city is.
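
One way to picture the ordering is a list of stages that short-circuits on the first skip. The stage names, the candidate fields, and the paid-call counts here are assumptions made for illustration only.

```python
from typing import Callable, Optional

# A stage inspects the candidate and returns a skip reason, or None to continue.
Stage = Callable[[dict], Optional[str]]


def run_pipeline(
    candidate: dict, stages: list[tuple[str, int, Stage]]
) -> tuple[Optional[str], int]:
    """Run stages in their fixed order; return (skip_reason, paid_calls_made)."""
    paid_calls = 0
    for name, calls, stage in stages:
        paid_calls += calls          # a stage's paid calls are spent once it runs
        reason = stage(candidate)
        if reason is not None:
            return f"{name}: {reason}", paid_calls   # later stages never run
    return None, paid_calls


already_in_crm = {"acme-realty.example"}   # hypothetical dedup key set

# Free checks first, paid steps later (illustrative call counts).
STAGES: list[tuple[str, int, Stage]] = [
    ("dedup", 0, lambda c: "already in CRM" if c["domain"] in already_in_crm else None),
    ("read_page", 1, lambda c: None if c.get("page_text") else "page unreadable"),
    ("qualify", 1, lambda c: None if c.get("fit") else "not a fit"),
]

duplicate_result = run_pipeline({"domain": "acme-realty.example"}, STAGES)
```

In this sketch a duplicate is dropped with zero paid calls, while a candidate that survives every stage incurs the full count.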

Conservative by default, with a human re-add path. When a required signal can't be confidently assessed, the agent skips the candidate rather than adding a wrong prospect, and every skip is logged with its reasoning. The run report surfaces those skips first, and a one-command re-add lets a human override a too-cautious call. The safety mechanism is the ruleset plus the audit trail, not a person watching every decision.
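
A minimal sketch of a skips-first report and a re-add override might look like the following. Outcome, build_report, and re_add are hypothetical names, and the actual report format and re-add command are not shown in this case study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """One audit-trail entry per candidate (hypothetical shape)."""
    candidate: str
    added: bool
    reasoning: str


def build_report(outcomes: list[Outcome]) -> list[str]:
    """Render one line per candidate, with skips listed before additions."""
    # sorted() is stable and False sorts before True, so skips come first
    # while the original order is preserved within each group.
    ordered = sorted(outcomes, key=lambda o: o.added)
    return [
        f"{'ADDED' if o.added else 'SKIPPED'} {o.candidate}: {o.reasoning}"
        for o in ordered
    ]


def re_add(outcomes: list[Outcome], candidate: str) -> list[Outcome]:
    """Human override: mark a skipped candidate as added, keeping the old reason."""
    return [
        Outcome(o.candidate, True, f"manual re-add (was: {o.reasoning})")
        if o.candidate == candidate and not o.added
        else o
        for o in outcomes
    ]


outcomes = [
    Outcome("Brokerage A", True, "passed all gates"),
    Outcome("Brokerage B", False, "owner not confirmed"),
]
report = build_report(outcomes)
overridden = re_add(outcomes, "Brokerage B")
```

Keeping the original reasoning inside the override entry means the audit trail still shows why the agent was cautious, even after a human has reversed the call.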

Right-sized AI. Structured extraction and short note-writing don't need a frontier model. A fast, lean model handles the whole pipeline, which is why a full sweep costs cents — efficiency designed in, not bolted on.