Semantic Search

Find relevant memories using natural language

Memoid’s semantic search finds memories based on meaning, not just keywords.

How It Works

  1. Your query is converted to a vector embedding
  2. Memoid finds memories with similar embeddings
  3. Results are ranked by similarity score

As a result, a query such as “What food does the user like?” matches the memory “User loves pizza” even though the two share no keywords.
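The three steps can be illustrated with a small, self-contained sketch. It assumes cosine similarity as the similarity measure and uses hand-made three-dimensional vectors in place of real embeddings; the embedding model and metric Memoid actually uses are not stated here, and the helper names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, memories):
    """Return (text, score) pairs sorted from most to least similar."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in memories]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy vectors for illustration only; real embeddings have hundreds of dimensions.
toy_memories = [
    ("User loves pizza", [0.9, 0.1, 0.0]),
    ("User works as an engineer", [0.0, 0.2, 0.9]),
]
query_vec = [0.8, 0.2, 0.1]  # stands in for "What food does the user like?"

ranked = rank_by_similarity(query_vec, toy_memories)
for text, score in ranked:
    print(f"{text} (score: {score:.2f})")
```

The pizza memory ranks first because its vector points in nearly the same direction as the query vector, not because any words overlap.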

Basic Search

results = client.search(
    query="What are the user's preferences?",
    user_id="user_123"
)

for memory in results:
    print(f"{memory.memory} (score: {memory.score:.2f})")

Search Parameters

Parameter  Type    Default   Description
query      string  required  Natural language search query
user_id    string  optional  Filter by user
agent_id   string  optional  Filter by agent
limit      int     10        Maximum results
threshold  float   0.7       Minimum similarity score

Filtering

By User

results = client.search(
    query="hobbies",
    user_id="user_123"  # Only this user's memories
)

By Metadata

results = client.search(
    query="preferences",
    user_id="user_123",
    filters={"source": "chat"}
)

Relevance Threshold

Control result quality with the threshold parameter:

# High precision (fewer, more relevant results)
results = client.search(query="...", threshold=0.85)

# High recall (more results, some less relevant)
results = client.search(query="...", threshold=0.5)

Scores range from 0 to 1. As a rough guide:

  • 0.9+ — Highly relevant
  • 0.7-0.9 — Relevant
  • 0.5-0.7 — Somewhat related
  • Below 0.5 — Weak match
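To see how the threshold trades precision against recall, here is a minimal local sketch of the filtering step. The scores are made up, `apply_threshold` is a hypothetical helper rather than part of the client, and treating the threshold as inclusive is an assumption.

```python
def apply_threshold(scored, threshold=0.7, limit=10):
    """Keep results scoring at or above threshold, best first, capped at limit."""
    kept = [item for item in scored if item[1] >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:limit]

# Made-up (text, score) pairs covering each band in the list above.
scored = [
    ("User loves pizza", 0.92),
    ("User ordered pasta last week", 0.74),
    ("User mentioned a restaurant", 0.58),
    ("User lives in Berlin", 0.31),
]

high_precision = apply_threshold(scored, threshold=0.85)
high_recall = apply_threshold(scored, threshold=0.5)
print(len(high_precision), len(high_recall))
```

Raising the threshold drops the borderline matches; lowering it lets them through, so downstream code should be prepared for some loosely related results.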

Search Strategies

Context Retrieval

For chatbots, search with the user’s message:

context = client.search(
    query=user_message,
    user_id=user_id,
    limit=5
)
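Once the memories are retrieved, they usually need to be turned into text for the model prompt. The sketch below shows one possible way to do that. The `Memory` dataclass is a stand-in that mirrors only the `memory` and `score` fields used in the examples on this page, and `format_context` is a hypothetical helper, not a client method.

```python
from dataclasses import dataclass

@dataclass
class Memory:
    # Stand-in for a search result; only the fields used on this page.
    memory: str
    score: float

def format_context(memories):
    """Render memories as a bulleted block suitable for a system prompt."""
    if not memories:
        return "No stored memories for this user."
    lines = [f"- {m.memory}" for m in memories]
    return "Known facts about the user:\n" + "\n".join(lines)

# In a real chatbot this list would come from client.search(...).
context = [Memory("User loves pizza", 0.91), Memory("User is vegetarian", 0.78)]
system_prompt = format_context(context)
print(system_prompt)
```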

Question Answering

For specific questions, be explicit:

results = client.search(
    query="What is the user's job title and company?",
    user_id=user_id
)

Topic Exploration

For broad topics, use general queries:

results = client.search(
    query="user career and work history",
    user_id=user_id,
    limit=10,
    threshold=0.6
)

Combining with List

For comprehensive context, combine search with recent memories:

# Semantically relevant
relevant = client.search(query=message, user_id=user_id, limit=5)

# Recently added
recent = client.list(user_id=user_id, limit=5)

# Combine and deduplicate by memory ID (assumes both calls return lists)
all_memories = {m.id: m for m in relevant + recent}
context = list(all_memories.values())
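If the results are not plain lists, or if the first-seen copy of a duplicate should win, the merge can be written more defensively. This is a sketch with a stand-in `Memory` type and a hypothetical `merge_unique` helper; it only assumes each memory exposes an `id`.

```python
from dataclasses import dataclass

@dataclass
class Memory:
    # Stand-in for a stored memory; only the fields needed for merging.
    id: str
    memory: str

def merge_unique(*groups):
    """Merge groups of memories, keeping the first occurrence of each id."""
    merged = {}
    for group in groups:
        for m in group:
            merged.setdefault(m.id, m)  # later duplicates are ignored
    return list(merged.values())

relevant = [Memory("m1", "User loves pizza"), Memory("m2", "User is vegetarian")]
recent = [Memory("m2", "User is vegetarian"), Memory("m3", "User moved to Berlin")]

context = merge_unique(relevant, recent)
print([m.id for m in context])
```

Passing the semantically relevant group first keeps those memories at the front of the merged context.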

Performance Tips

  1. Use specific queries — “user’s favorite food” beats “food”
  2. Set appropriate limits — Don’t retrieve more than needed
  3. Use filters — Narrow scope with user_id, agent_id, metadata
  4. Tune threshold — Balance precision vs recall for your use case