How Semantic Lead Matching Works

HomeStar uses semantic search and vector embeddings to automatically route leads to the most appropriate agent. This document explains how the system works and why it’s more effective than traditional keyword-based routing.

Traditional real estate lead routing relies on exact keyword matches:

  • Client mentions “downtown condo” → Route to agent with “condo” specialty
  • Client mentions “Eagle” → Route to agent serving “Eagle”
  • Client mentions nothing specific → Route randomly or to primary contact

This approach fails when:

  1. Synonyms differ — “Luxury home” vs “high-end property” vs “executive residence”
  2. Context matters — “First home” (buyer) vs “first rental property” (investor)
  3. Implicit meaning — “Downsizing after kids left” implies senior/empty-nester
  4. Multiple factors — “Family relocating to Eagle for tech job, need good schools”

Semantic matching addresses these failures with a six-step pipeline:

  1. Agent Profile Creation — Agent describes ideal clients in natural language
  2. Embedding Generation — System converts preferences to numerical vectors
  3. Lead Submission — Client submits inquiry with message
  4. Lead Analysis — System converts inquiry to vector
  5. Similarity Calculation — System measures “distance” between lead and each agent
  6. Routing Decision — Lead goes to agent with highest match score
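Under illustrative assumptions (a stub `embed` lookup and toy 3-dimensional vectors standing in for real 768-dimensional embeddings), steps 3-6 of this pipeline can be sketched as:

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def route_lead(lead_text, agents, embed):
    """Steps 3-6: embed the inquiry, score every agent, pick the best."""
    lead_vec = embed(lead_text)                       # Lead Analysis
    scored = [(cosine_similarity(lead_vec, a["embedding"]), a)
              for a in agents]                        # Similarity Calculation
    best_score, best_agent = max(scored, key=lambda s: s[0])
    return best_agent, best_score                     # Routing Decision

# Hypothetical stand-in for the embedding service.
fake_embed = {"luxury home in Eagle": [0.9, 0.1, 0.2],
              "first house, small budget": [0.1, 0.9, 0.3]}.get

agents = [
    {"name": "Alice", "embedding": [0.88, 0.12, 0.25]},  # luxury specialist
    {"name": "Bob",   "embedding": [0.05, 0.95, 0.28]},  # entry-level specialist
]

agent, score = route_lead("luxury home in Eagle", agents, fake_embed)
print(agent["name"])  # Alice
```

The real system swaps `fake_embed` for a call to the embedding service and loads agents from the database; the scoring and selection logic is the same shape.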

A vector embedding is a numerical representation of meaning. Instead of matching exact words, the system captures semantic concepts.

Example conversion:

Text: "I'm looking for a luxury home in Eagle with mountain views"
Vector: [0.23, -0.45, 0.67, ..., 0.12] (768 numbers)

Each dimension represents abstract semantic features:
- Property prestige level
- Geographic preferences
- Natural feature interests
- Lifestyle indicators
- Budget signals

Agent preference:

Text: "I specialize in high-end properties in Eagle and Meridian,
particularly homes with mountain views"
Vector: [0.25, -0.42, 0.69, ..., 0.15] (768 numbers)

Very similar to the lead's vector (high match score)

The system measures the mathematical “distance” between these vectors. Similar meaning = close distance = high match score.
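Using just the four dimensions shown in the example above (a real comparison uses all 768), the calculation looks like this:

```python
from math import sqrt

lead  = [0.23, -0.45, 0.67, 0.12]   # the four shown dimensions of the lead vector
agent = [0.25, -0.42, 0.69, 0.15]   # the four shown dimensions of the agent vector

# Cosine similarity: dot product divided by the product of vector lengths.
dot = sum(x * y for x, y in zip(lead, agent))
similarity = dot / (sqrt(sum(x * x for x in lead)) * sqrt(sum(x * x for x in agent)))
print(round(similarity, 3))  # 0.998, nearly identical direction: a strong match
```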

Why Semantic Matching Outperforms Keywords

Lead inquiry: “Looking for my first house, budget around $300k”

Keyword matching:

  • ✗ No exact match for “first-time buyer”
  • ✗ Might route to wrong agent

Semantic matching:

  • ✓ Recognizes “first house” ≈ “first-time buyer”
  • ✓ Routes to agent specializing in entry-level buyers

Lead inquiry: “We’re empty-nesters looking to downsize from our 4-bedroom”

Keyword matching:

  • ✗ Matches “downsize” but misses demographic context
  • ✗ Might route to any downsizing specialist

Semantic matching:

  • ✓ Recognizes “empty-nester” context
  • ✓ Routes to agent specializing in seniors/downsizing
  • ✓ Considers property type transition (large → smaller)

Lead inquiry: “Military family relocating to Boise area, need to close quickly, interested in Eagle or Meridian neighborhoods with good schools”

Keyword matching:

  • ✗ Matches “Eagle” and “Meridian” but misses nuance
  • ✗ Can’t weigh multiple factors

Semantic matching:

  • ✓ Weights: Military (high), Relocation (high), Timeline urgency (medium), School quality (high), Geographic flexibility (Eagle/Meridian both acceptable)
  • ✓ Routes to agent with military + relocation experience in those areas

HomeStar uses transformer-based language models to generate embeddings. These models are trained on massive text datasets to understand:

  • Word relationships (synonyms, antonyms, hierarchies)
  • Contextual meaning (same word, different meanings based on context)
  • Implicit signals (sentiment, urgency, sophistication level)

The system uses cosine similarity to measure how closely two vectors align:

  • Score 1.0 — Identical meaning (perfect match)
  • Score 0.8-1.0 — Very similar (excellent match)
  • Score 0.6-0.8 — Somewhat similar (acceptable match)
  • Score < 0.6 — Different concepts (poor match)
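The bands above can be expressed as a small helper; the cutoffs are the illustrative thresholds from this document, not tuned values:

```python
def interpret(score: float) -> str:
    """Map a cosine-similarity score to a match-quality band."""
    if score >= 0.8:
        return "excellent match"
    if score >= 0.6:
        return "acceptable match"
    return "poor match"

print(interpret(0.92))  # excellent match
print(interpret(0.74))  # acceptable match
print(interpret(0.45))  # poor match
```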

Example scores:

| Lead Inquiry | Agent Specialty | Score | Interpretation |
|---|---|---|---|
| "Luxury condo downtown" | "High-end urban properties" | 0.92 | Excellent match |
| "Luxury condo downtown" | "Luxury homes in suburbs" | 0.74 | Partial match (luxury yes, location no) |
| "Luxury condo downtown" | "First-time buyer specialist" | 0.45 | Poor match |

How does this compare with common alternative routing strategies?

Round-robin problems:

  • Ignores agent expertise
  • Wastes leads on poor fits
  • Frustrates both agents and clients

Semantic matching benefits:

  • Expertise-based routing
  • Higher conversion rates
  • Better client experience

Rules-based limitations:

IF inquiry contains "luxury" AND price > $500k
  THEN route to luxury agent
ELSE IF inquiry contains "first time"
  THEN route to first-time buyer agent
ELSE
  route to primary contact

Problems:

  • Brittle (fails on unexpected phrasing)
  • Maintenance burden (rules grow complex)
  • Can’t handle nuance

Semantic matching:

  • Handles any phrasing
  • No rule maintenance required
  • Captures subtle distinctions

Weighted keywords approach:

luxury: +10 points
first-time: +8 points
investor: +12 points
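As a concrete illustration of why this falls short, a toy scorer built from the weights above misses any synonym it was never told about (the sample inquiries are hypothetical):

```python
WEIGHTS = {"luxury": 10, "first-time": 8, "investor": 12}

def keyword_score(inquiry: str) -> int:
    """Sum the weights of every keyword that appears in the inquiry."""
    text = inquiry.lower()
    return sum(weight for kw, weight in WEIGHTS.items() if kw in text)

print(keyword_score("Looking for a luxury condo"))   # 10
print(keyword_score("high-end property downtown"))   # 0: same intent, no keyword hit
```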

Problems:

  • Still keyword-dependent
  • Weights are arbitrary and hard to tune
  • No context understanding

Semantic matching:

  • Learns context automatically
  • Adapts to language evolution
  • Weights factors naturally

Agents receive leads that:

  • Match their actual expertise
  • Fit their preferred price points
  • Are in their service areas
  • Involve client types they excel with

Result: Higher conversion rates, less wasted time

Agents don’t need to predict every possible phrasing. The system understands:

  • “Starter home” = “First-time buyer”
  • “Move-up buyer” = “Upsizing”
  • “Executive property” = “Luxury”
  • “Investment property” = “Investor”

As agents update their “ideal leads” descriptions, the system immediately:

  • Generates new embeddings
  • Adjusts matching behavior
  • Routes future leads accordingly

No rule rewriting or admin intervention required.

Clients connect with agents who:

  • Understand their specific needs
  • Have relevant experience
  • Know their preferred areas
  • Work with their budget

Leads go to agents who are genuinely interested in that type of client, leading to:

  • Faster response times
  • More enthusiastic engagement
  • Better initial communication

Clients don’t get passed around between agents looking for the “right fit.” First contact is usually the right contact.

HomeStar uses the Ollama embedding service with open-source transformer models. Key characteristics:

  • 768-dimensional vectors — Balance between expressiveness and performance
  • Self-hosted — No third-party API dependencies
  • Fast generation — Sub-second embedding creation

Embeddings are stored in SQLite with the sqlite-vec extension:

-- Agent embeddings stored directly in agent table
CREATE TABLE agents (
  id INTEGER PRIMARY KEY,
  name TEXT,
  preferences TEXT,        -- Natural language description
  embedding BLOB,          -- 768-float vector
  lead_recipient INTEGER,  -- 1 if agent accepts routed leads
  active INTEGER           -- 1 if agent is currently active
);

-- Similarity search: vec_distance_cosine returns a *distance*,
-- so lower values mean closer vectors; order ascending
SELECT
  id AS agent_id,
  vec_distance_cosine(embedding, :lead_embedding) AS distance
FROM agents
WHERE lead_recipient = 1 AND active = 1
ORDER BY distance ASC
LIMIT 1;
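Where the sqlite-vec extension is unavailable, the same pattern can be approximated in plain Python: pack vectors into BLOBs and score them in application code. A self-contained sketch with toy 3-dimensional vectors:

```python
import sqlite3
import struct
from math import sqrt

def to_blob(vec):
    """Pack a list of floats into a BLOB (the storage format above)."""
    return struct.pack(f"{len(vec)}f", *vec)

def from_blob(blob):
    """Unpack a BLOB back into a list of floats (4 bytes per float)."""
    return list(struct.unpack(f"{len(blob) // 4}f", blob))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE agents (id INTEGER PRIMARY KEY, name TEXT, embedding BLOB)")
db.execute("INSERT INTO agents VALUES (1, 'Alice', ?)", (to_blob([0.9, 0.1, 0.2]),))
db.execute("INSERT INTO agents VALUES (2, 'Bob',   ?)", (to_blob([0.1, 0.9, 0.3]),))

# Brute-force search: highest cosine similarity wins.
lead = [0.88, 0.12, 0.25]
best = max(db.execute("SELECT name, embedding FROM agents"),
           key=lambda row: cosine(lead, from_blob(row[1])))
print(best[0])  # Alice
```

sqlite-vec performs this scan inside the database engine, which is why the search stays fast as the agent table grows.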

Performance characteristics:

  • Embedding generation: ~50-200ms per text block
  • Similarity search: <10ms across hundreds of agents
  • Scalability: Linear with agent count (easily handles 1000+ agents)

Possible future enhancements:

  1. Learning from outcomes — Track which matches led to successful transactions
  2. Seasonal adjustments — Weight vacation home specialists higher in summer
  3. Multi-language support — Generate embeddings for non-English inquiries
  4. Confidence scoring — Surface low-confidence matches for manual review

Longer-term directions under consideration:

  • Hybrid approaches — Combine semantic matching with hard constraints (geography, price)
  • Dynamic re-weighting — Adjust importance of factors based on market conditions
  • Explainability — Show agents why they received a particular lead

Semantic lead matching transforms lead routing from a mechanical process (keywords, rules) into an intelligent one (meaning, context, nuance). By understanding what clients actually want—not just what words they use—the system connects them with the agents best equipped to help.

Core principle: Match based on meaning, not just words.