How Semantic Search Works

Semantic search is the technology that lets you type “cozy cottage with mountain views” and find properties described as “charming bungalow with scenic vistas” — even though the exact words don’t match. Understanding how it works helps you use it more effectively.


Traditional search engines look for exact word matches. If you search for “large backyard,” you only get listings that contain those exact words. Properties described with synonyms like “spacious yard,” “expansive lot,” or “oversized outdoor area” won’t appear — even though they’re exactly what you’re looking for.

This forces you to:

  • Run multiple searches with different phrasings
  • Learn the specific vocabulary each listing agent uses
  • Miss great properties because they weren’t described with your exact words

Example scenario:

You search for: “modern kitchen”
Missed listings: “contemporary kitchen,” “updated appliances,” “recently renovated,” “chef’s kitchen,” “gourmet kitchen”

You’d need to run six different searches to find all relevant properties!


Semantic search uses vector embeddings — a technique that converts text into mathematical representations that capture meaning, not just words.

Think of embeddings as coordinates in “meaning space.” Words and phrases with similar meanings are placed close together, even if they use different words:

"large backyard" → [0.21, 0.89, 0.34, ...] ← Close together in space
"spacious yard" → [0.23, 0.87, 0.36, ...] ← Similar coordinates
"expansive lot" → [0.19, 0.91, 0.32, ...] ← Nearby position
"tiny closet" → [0.81, 0.12, 0.73, ...] ← Far away (opposite meaning)

When you search for “large backyard,” the system:

  1. Converts your query into a vector
  2. Finds property descriptions with similar vectors
  3. Returns matches ranked by similarity (0-1 score)

Properties described as “spacious yard” score highly because their vector coordinates are close to “large backyard” in meaning space.
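
To make “close together in meaning space” concrete, here is a small illustrative sketch (not the search system’s actual code) that scores the toy vectors above with cosine similarity, the kind of similarity score described in step 3. Real embeddings have hundreds or thousands of dimensions; the three-number vectors are for illustration only.

import math

def cosine_similarity(a, b):
    # Higher means the two vectors point in a more similar "meaning direction".
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query       = [0.21, 0.89, 0.34]   # "large backyard" (toy 3-number vector)
close_match = [0.23, 0.87, 0.36]   # "spacious yard"
poor_match  = [0.81, 0.12, 0.73]   # "tiny closet"

print(cosine_similarity(query, close_match))   # high score, roughly 0.999
print(cosine_similarity(query, poor_match))    # much lower, roughly 0.49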


The system recognizes relationships between concepts:

“modern kitchen” matches:

  • “contemporary design”
  • “updated appliances”
  • “recently renovated”
  • “stainless steel fixtures”
  • “chef’s kitchen with new cabinets”

“family-friendly” surfaces:

  • Homes near schools
  • Properties with multiple bedrooms
  • Yards suitable for play
  • Cul-de-sac locations (implied safety)
  • Neighborhoods with parks

“great for entertaining” finds:

  • Open floor plans
  • Large kitchens with islands
  • Outdoor patios or decks
  • Spacious living areas
  • Flow between indoor/outdoor spaces

The system infers that “entertaining” correlates with these features, even if the listing never uses the word “entertaining.”


Semantic search is powerful but not perfect. Here’s what can go wrong:

“Cozy” might mean:

  • Warm and inviting (positive)
  • Small and cramped (negative)

The system might match both interpretations, requiring you to filter results.

“Great investment” could mean:

  • Rental income potential
  • Fixer-upper with upside
  • Underpriced property
  • Stable neighborhood with appreciation potential

If you meant “rental property,” you might also get fixer-uppers.

Agents often use marketing language like “charming,” “potential,” or “perfect for.” Semantic search treats these as meaningful signals, but they’re sometimes just filler words. This can create noise in results.


Aspect     | Keyword Search                      | Semantic Search
Precision  | High (exact matches)                | Lower (includes related concepts)
Recall     | Low (misses synonyms)               | High (finds all relevant matches)
Best for   | Known field values (MLS#, address)  | Describing desired features
Weakness   | Requires exact wording              | Can be too broad

Recommendation: Use semantic search to discover properties, then use filters to narrow by hard requirements (price, beds, location).


For those interested in the technical implementation:

Each property listing is processed through the mxbai-embed-large model (running on Ollama), which generates a 1024-dimension vector from the listing text. This happens:

  • Automatically when new listings are synced from MLSGrid
  • Via background jobs that process unembedded listings
  • On-demand for admin-uploaded properties
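
As a rough sketch of what the embedding step can look like: the snippet below asks a local Ollama instance for an mxbai-embed-large embedding of a listing description via Ollama’s standard embeddings endpoint. The function name and the text are illustrative assumptions, not the system’s actual ingestion code.

import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/embeddings"   # default local Ollama endpoint

def embed_listing(text):
    # Ask Ollama for an mxbai-embed-large embedding of the listing text.
    payload = json.dumps({"model": "mxbai-embed-large", "prompt": text}).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["embedding"]

vector = embed_listing("Charming bungalow with scenic mountain vistas and a spacious yard.")
print(len(vector))   # 1024 dimensions for mxbai-embed-large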

Embeddings are stored in SQLite using the sqlite-vec extension, which provides:

  • Fast k-nearest-neighbors (KNN) search
  • Distance calculations (cosine similarity)
  • Index optimization for large datasets
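
A minimal sketch of the storage side, assuming the sqlite-vec Python bindings and a version that supports the distance_metric=cosine column option; the listing_embeddings table and its schema are illustrative, not the system’s actual schema.

import sqlite3
import sqlite_vec   # Python bindings that bundle the sqlite-vec extension

db = sqlite3.connect("listings.db")
db.enable_load_extension(True)
sqlite_vec.load(db)                 # load the sqlite-vec extension into this connection
db.enable_load_extension(False)

# One 1024-dimension embedding per listing, keyed by rowid (illustrative schema).
# distance_metric=cosine asks sqlite-vec to report cosine distance in KNN queries.
db.execute(
    "CREATE VIRTUAL TABLE IF NOT EXISTS listing_embeddings "
    "USING vec0(embedding float[1024] distance_metric=cosine)"
)

listing_id = 1                      # hypothetical listing key
vector = [0.1] * 1024               # placeholder; in practice, the embedding produced above
db.execute(
    "INSERT INTO listing_embeddings(rowid, embedding) VALUES (?, ?)",
    (listing_id, sqlite_vec.serialize_float32(vector)),
)
db.commit()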

When you run a search, the pipeline works through these steps (sketched in code below the list):

  1. Query conversion: Your search text → vector (via the same embedding model)
  2. Similarity search: Find top-N property vectors closest to query vector
  3. Scoring: Calculate cosine similarity (0-1 scale)
  4. Ranking: Sort by similarity, apply any filter constraints
  5. Return: Top matches with similarity scores
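
Putting the pieces together, the five steps might look roughly like this, reusing embed_listing() and the listing_embeddings table from the sketches above. This is illustrative only; the exact SQL and score conversion depend on the sqlite-vec version and how the table was declared.

def semantic_search(db, query_text, limit=10):
    # 1. Query conversion: search text -> vector via the same embedding model.
    query_vec = sqlite_vec.serialize_float32(embed_listing(query_text))
    # 2. Similarity search: KNN over the stored listing vectors, closest first.
    rows = db.execute(
        "SELECT rowid, distance FROM listing_embeddings "
        "WHERE embedding MATCH ? ORDER BY distance LIMIT " + str(int(limit)),
        (query_vec,),
    ).fetchall()
    # 3-5. Scoring and ranking: cosine distance -> similarity score, already sorted.
    return [(rowid, 1.0 - distance) for rowid, distance in rows]

for listing_id, score in semantic_search(db, "large backyard"):
    print(listing_id, round(score, 3))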

Performance: Searches complete in <100ms for databases with 10,000+ properties.


Use semantic search when:

  • You’re describing a lifestyle or feeling (“perfect for working from home”)
  • You want to explore properties matching a vibe (“mountain retreat”)
  • You’re unsure of exact terminology (“something with… character?”)
  • You want to find properties similar to one you already know

Use traditional filters when:

  • You have hard constraints (budget, location, beds)
  • You need exact matches (MLS number, specific address)
  • You’re searching structured data (year built, square footage)

Best practice: Combine both! Use filters for requirements, semantic for preferences.
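
As one last hedged sketch of that combination, building on the semantic_search() helper from the technical section and assuming a hypothetical listings table with id, price, and bedrooms columns (where the embedding rowid matches listings.id):

# Hypothetical hybrid search: semantic discovery plus hard filters.
def filtered_semantic_search(db, query_text, max_price, min_beds):
    matches = semantic_search(db, query_text)          # semantic: describe what you want
    allowed = {
        row[0]
        for row in db.execute(
            "SELECT id FROM listings WHERE price <= ? AND bedrooms >= ?",
            (max_price, min_beds),
        )
    }
    # filters: keep only matches that meet the hard requirements
    return [(lid, score) for lid, score in matches if lid in allowed]

results = filtered_semantic_search(db, "great for entertaining", max_price=750_000, min_beds=3)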

