Semantic Search
Semantic search finds sessions based on meaning rather than exact keywords. It uses vector embeddings to match your query against session content, returning results that are conceptually similar even when they use different words.
How it works
- When a session is synced, OpenSync generates a 1536-dimension vector embedding from the session’s searchable text using OpenAI’s text-embedding-3-small model.
- The embedding is stored in the sessionEmbeddings table alongside a text hash for change detection.
- When you search, your query is also converted to an embedding using the same model.
- Convex’s vector search compares the query embedding against all stored embeddings using cosine similarity.
- Results are ranked by similarity score (0.0 to 1.0, where 1.0 is identical).
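The comparison and ranking steps can be sketched in a few lines. This is only an illustration of cosine similarity, not OpenSync's implementation: Convex's vector index does this work internally, and the `StoredEmbedding` and `ScoredResult` types and both function names are hypothetical. Cosine similarity is mathematically bounded by -1 and 1, so a negative score is possible in principle.

```typescript
// Illustrative only: Convex's vector index performs this comparison internally.
// The types and function names here are hypothetical.
type StoredEmbedding = { sessionId: string; embedding: number[] };
type ScoredResult = { sessionId: string; score: number };

// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same number of dimensions");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as having no similarity.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored embedding against the query and sort best-first.
function rankBySimilarity(
  query: number[],
  stored: StoredEmbedding[],
  limit: number = 10,
): ScoredResult[] {
  return stored
    .map((s) => ({ sessionId: s.sessionId, score: cosineSimilarity(query, s.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

Because cosine similarity depends only on direction, two vectors that point the same way score 1.0 regardless of their lengths.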
Embedding model
| Model | Dimensions | Cost | Notes |
|---|---|---|---|
| text-embedding-3-small | 1536 | $0.02 per 1M tokens | Default model used by OpenSync |
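As a rough guide to what the listed price means in practice, the cost of a sync can be estimated from its total token count. The sketch below uses the $0.02 per 1M tokens figure from the table; the function name and the sample workload are hypothetical, and real token counts depend on the tokenizer.

```typescript
// Price taken from the table above; check current OpenAI pricing before relying on it.
const USD_PER_MILLION_TOKENS = 0.02;

// Estimate the embedding cost in USD for a given number of input tokens.
function estimateEmbeddingCostUsd(totalTokens: number): number {
  if (totalTokens < 0) {
    throw new Error("Token count cannot be negative");
  }
  return (totalTokens / 1_000_000) * USD_PER_MILLION_TOKENS;
}

// Hypothetical workload: 5,000 sessions averaging 2,000 tokens each (10M tokens).
const estimate = estimateEmbeddingCostUsd(5_000 * 2_000);
console.log(`Estimated cost: $${estimate.toFixed(2)}`); // prints "Estimated cost: $0.20"
```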
What gets embedded
Session embeddings are generated from sessions.searchableText, which concatenates:
- Session title
- All user message text
- All assistant message text
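One way such a field could be assembled is shown below. This is a minimal sketch under several assumptions: the `Message` shape, the `buildSearchableText` name, the newline separator, and the choice to keep messages in conversation order are all hypothetical, and OpenSync's actual logic may differ.

```typescript
// Hypothetical message shape; OpenSync's actual schema may differ.
type Message = { role: "user" | "assistant" | "system" | "tool"; text: string };

// Concatenate the title with all user and assistant message text.
function buildSearchableText(title: string, messages: Message[]): string {
  const parts = [
    title,
    ...messages
      .filter((m) => m.role === "user" || m.role === "assistant")
      .map((m) => m.text),
  ];
  // Drop empty pieces and join with newlines (the separator is an assumption).
  return parts
    .map((p) => p.trim())
    .filter((p) => p.length > 0)
    .join("\n");
}
```

Text from other roles, such as system prompts or tool output, is left out here because the list above names only user and assistant messages.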
Embedding storage
Each embedding is stored with these fields:

| Field | Type | Description |
|---|---|---|
| sessionId | Id | Reference to the parent session |
| embedding | float[1536] | The vector embedding |
| textHash | string | SHA-256 hash of the source text |
| createdAt | number | Timestamp of generation |
The textHash field makes embedding generation idempotent: if a session is re-synced with the same text content, the hash matches and the embedding is not regenerated.
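The change-detection check can be sketched as follows, assuming a Node.js runtime. The `EmbeddingRecord` type and the function names are hypothetical, and the hex encoding of the hash is an assumption; the source states only that the hash is SHA-256.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape mirroring the textHash field described above.
type EmbeddingRecord = { sessionId: string; textHash: string };

// SHA-256 of the source text, hex-encoded (the encoding is an assumption).
function sha256Hex(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Returns true when an embedding must be generated or regenerated:
// either none exists yet, or the source text has changed since the last sync.
function needsEmbedding(existing: EmbeddingRecord | null, searchableText: string): boolean {
  return existing === null || existing.textHash !== sha256Hex(searchableText);
}
```

Comparing hashes instead of full text keeps the check cheap and avoids a paid embedding call when nothing has changed.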
Message-level embeddings
In addition to session-level embeddings, OpenSync generates embeddings for individual messages. These are stored in the messageEmbeddings table and enable finer-grained search within specific conversations.
Examples
Natural language question
Conceptual search
Problem-based search
Requirements
Semantic search requires an OpenAI API key:
- Hosted version: Already configured. No action needed.
- Self-hosted: Set the OPENAI_API_KEY environment variable on your Convex deployment.
Using in the dashboard
- Go to the Context tab in the sidebar.
- Select Semantic as the search type.
- Type a natural language query.
- Results appear ranked by similarity score.
Using via API
Tips
- Ask questions naturally. Semantic search works best with complete questions or descriptions, not individual keywords.
- Be specific about the domain. “How to handle auth in a Next.js app” performs better than “auth.”
- Embedding latency. Newly synced sessions may take a few seconds to appear in semantic search results while embeddings are generated asynchronously.
- Score threshold. Results with scores below 0.5 are typically not relevant. The dashboard hides very low-scoring results automatically.
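If you post-process results yourself, the score threshold tip can be applied with a simple filter. This is a sketch only: the `SearchResult` shape and `filterRelevant` name are hypothetical, and the 0.5 default follows the rule of thumb above, not a documented dashboard setting.

```typescript
// Hypothetical result shape; the actual API response may carry more fields.
type SearchResult = { sessionId: string; score: number };

// 0.5 follows the rule of thumb above; the dashboard's actual cutoff may differ.
const DEFAULT_MIN_SCORE = 0.5;

// Keep only results at or above the threshold, ordered best-first.
function filterRelevant(
  results: SearchResult[],
  minScore: number = DEFAULT_MIN_SCORE,
): SearchResult[] {
  return results
    .filter((r) => r.score >= minScore)
    .sort((a, b) => b.score - a.score);
}
```

A lower `minScore` trades precision for recall, which can help when a query returns few results.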
Comparison with full-text
| Aspect | Full-text | Semantic |
|---|---|---|
| Best for | Exact terms, function names, error codes | Conceptual questions, problem descriptions |
| Speed | Faster (index lookup) | Slightly slower (embedding + vector search) |
| Cost | No additional cost | OpenAI embedding cost ($0.02/1M tokens) |
| Requires | Nothing extra | OpenAI API key |