How Semantic Search Works

Semantic search is the technology that lets you type “cozy cottage with mountain views” and find properties described as “charming bungalow with scenic vistas” — even though the exact words don’t match. Understanding how it works helps you use it more effectively.


Traditional search engines look for exact word matches. If you search for “large backyard,” you only get listings that contain those exact words. Properties described with synonyms like “spacious yard,” “expansive lot,” or “oversized outdoor area” won’t appear — even though they’re exactly what you’re looking for.

This forces you to:

  • Run multiple searches with different phrasings
  • Learn the specific vocabulary each listing agent uses
  • Miss great properties because they weren’t described with your exact words

Example scenario:

You search for: “modern kitchen”
Missed listings: “contemporary kitchen,” “updated appliances,” “recently renovated,” “chef’s kitchen,” “gourmet kitchen”

You’d need to run six different searches to find all relevant properties!


Semantic search uses vector embeddings — a technique that converts text into mathematical representations that capture meaning, not just words.

Think of embeddings as coordinates in “meaning space.” Words and phrases with similar meanings are placed close together, even if they use different words:

"large backyard" → [0.21, 0.89, 0.34, ...] ← Close together in space
"spacious yard" → [0.23, 0.87, 0.36, ...] ← Similar coordinates
"expansive lot" → [0.19, 0.91, 0.32, ...] ← Nearby position
"tiny closet" → [0.81, 0.12, 0.73, ...] ← Far away (opposite meaning)

When you search for “large backyard,” the system:

  1. Converts your query into a vector
  2. Finds property descriptions with similar vectors
  3. Returns matches ranked by similarity (0-1 score)

Properties described as “spacious yard” score highly because their vector coordinates are close to “large backyard” in meaning space.
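
To make “close together in meaning space” concrete, here is a small illustrative sketch (not the search system’s actual code) that scores the toy vectors above with cosine similarity, the kind of similarity score described in step 3. Real embeddings have hundreds or thousands of dimensions; the three-number vectors are for illustration only.

import math

def cosine_similarity(a, b):
    # Higher means the two vectors point in a more similar "meaning direction".
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query       = [0.21, 0.89, 0.34]   # "large backyard" (toy 3-number vector)
close_match = [0.23, 0.87, 0.36]   # "spacious yard"
poor_match  = [0.81, 0.12, 0.73]   # "tiny closet"

print(cosine_similarity(query, close_match))   # high score, roughly 0.999
print(cosine_similarity(query, poor_match))    # much lower, roughly 0.49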


The system recognizes relationships between concepts:

“modern kitchen” matches:

  • “contemporary design”
  • “updated appliances”
  • “recently renovated”
  • “stainless steel fixtures”
  • “chef’s kitchen with new cabinets”

“family-friendly” surfaces:

  • Homes near schools
  • Properties with multiple bedrooms
  • Yards suitable for play
  • Cul-de-sac locations (implied safety)
  • Neighborhoods with parks

“great for entertaining” finds:

  • Open floor plans
  • Large kitchens with islands
  • Outdoor patios or decks
  • Spacious living areas
  • Flow between indoor/outdoor spaces

The system infers that “entertaining” correlates with these features, even if the listing never uses the word “entertaining.”


Semantic search is powerful but not perfect. Here’s what can go wrong:

“Cozy” might mean:

  • Warm and inviting (positive)
  • Small and cramped (negative)

The system might match both interpretations, requiring you to filter results.

“Great investment” could mean:

  • Rental income potential
  • Fixer-upper with upside
  • Underpriced property
  • Stable neighborhood with appreciation potential

If you meant “rental property,” you might also get fixer-uppers.

Agents often use marketing language like “charming,” “potential,” or “perfect for.” Semantic search treats these as meaningful signals, but they’re sometimes just filler words. This can create noise in results.


Aspect     | Keyword Search                      | Semantic Search
Precision  | High (exact matches)                | Lower (includes related concepts)
Recall     | Low (misses synonyms)               | High (finds all relevant matches)
Best for   | Known field values (MLS#, address)  | Describing desired features
Weakness   | Requires exact wording              | Can be too broad

Recommendation: Use semantic search to discover properties, then use filters to narrow by hard requirements (price, beds, location).


For those interested in the technical implementation:

Each property listing is processed through the mxbai-embed-large model (running on Ollama), which generates a 1024-dimension vector from the listing text. This happens:

  • Automatically when new listings are synced from MLSGrid
  • Via background jobs that process unembedded listings
  • On-demand for admin-uploaded properties
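
As a rough sketch of what the embedding step can look like: the snippet below asks a local Ollama instance for an mxbai-embed-large embedding of a listing description via Ollama’s standard embeddings endpoint. The function name and the text are illustrative assumptions, not the system’s actual ingestion code.

import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/embeddings"   # default local Ollama endpoint

def embed_listing(text):
    # Ask Ollama for an mxbai-embed-large embedding of the listing text.
    payload = json.dumps({"model": "mxbai-embed-large", "prompt": text}).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["embedding"]

vector = embed_listing("Charming bungalow with scenic mountain vistas and a spacious yard.")
print(len(vector))   # 1024 dimensions for mxbai-embed-large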

Embeddings are stored in SQLite using the sqlite-vec extension, which provides:

  • Fast k-nearest-neighbors (KNN) search
  • Distance calculations (cosine similarity)
  • Index optimization for large datasets
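
A minimal sketch of the storage side, assuming the sqlite-vec Python bindings and a version that supports the distance_metric=cosine column option; the listing_embeddings table and its schema are illustrative, not the system’s actual schema.

import sqlite3
import sqlite_vec   # Python bindings that bundle the sqlite-vec extension

db = sqlite3.connect("listings.db")
db.enable_load_extension(True)
sqlite_vec.load(db)                 # load the sqlite-vec extension into this connection
db.enable_load_extension(False)

# One 1024-dimension embedding per listing, keyed by rowid (illustrative schema).
# distance_metric=cosine asks sqlite-vec to report cosine distance in KNN queries.
db.execute(
    "CREATE VIRTUAL TABLE IF NOT EXISTS listing_embeddings "
    "USING vec0(embedding float[1024] distance_metric=cosine)"
)

listing_id = 1                      # hypothetical listing key
vector = [0.1] * 1024               # placeholder; in practice, the embedding produced above
db.execute(
    "INSERT INTO listing_embeddings(rowid, embedding) VALUES (?, ?)",
    (listing_id, sqlite_vec.serialize_float32(vector)),
)
db.commit()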

When you run a search, the pipeline works through these steps (sketched in code below the list):

  1. Query conversion: Your search text → vector (via the same embedding model)
  2. Similarity search: Find top-N property vectors closest to query vector
  3. Scoring: Calculate cosine similarity (0-1 scale)
  4. Ranking: Sort by similarity, apply any filter constraints
  5. Return: Top matches with similarity scores
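
Putting the pieces together, the five steps might look roughly like this, reusing embed_listing() and the listing_embeddings table from the sketches above. This is illustrative only; the exact SQL and score conversion depend on the sqlite-vec version and how the table was declared.

def semantic_search(db, query_text, limit=10):
    # 1. Query conversion: search text -> vector via the same embedding model.
    query_vec = sqlite_vec.serialize_float32(embed_listing(query_text))
    # 2. Similarity search: KNN over the stored listing vectors, closest first.
    rows = db.execute(
        "SELECT rowid, distance FROM listing_embeddings "
        "WHERE embedding MATCH ? ORDER BY distance LIMIT " + str(int(limit)),
        (query_vec,),
    ).fetchall()
    # 3-5. Scoring and ranking: cosine distance -> similarity score, already sorted.
    return [(rowid, 1.0 - distance) for rowid, distance in rows]

for listing_id, score in semantic_search(db, "large backyard"):
    print(listing_id, round(score, 3))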

Performance: Searches complete in <100ms for databases with 10,000+ properties.


Use semantic search when:

  • You’re describing a lifestyle or feeling (“perfect for working from home”)
  • You want to explore properties matching a vibe (“mountain retreat”)
  • You’re unsure of exact terminology (“something with… character?”)
  • You want to find properties similar to one you already know

Use traditional filters when:

  • You have hard constraints (budget, location, beds)
  • You need exact matches (MLS number, specific address)
  • You’re searching structured data (year built, square footage)

Best practice: Combine both! Use filters for requirements, semantic for preferences.
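
As one last hedged sketch of that combination, building on the semantic_search() helper from the technical section and assuming a hypothetical listings table with id, price, and bedrooms columns (where the embedding rowid matches listings.id):

# Hypothetical hybrid search: semantic discovery plus hard filters.
def filtered_semantic_search(db, query_text, max_price, min_beds):
    matches = semantic_search(db, query_text)          # semantic: describe what you want
    allowed = {
        row[0]
        for row in db.execute(
            "SELECT id FROM listings WHERE price <= ? AND bedrooms >= ?",
            (max_price, min_beds),
        )
    }
    # filters: keep only matches that meet the hard requirements
    return [(lid, score) for lid, score in matches if lid in allowed]

results = filtered_semantic_search(db, "great for entertaining", max_price=750_000, min_beds=3)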

