FastEmbed (Sparse Vectors)
Sparse embeddings service providing BM25, SPLADE, and MiniCOIL methods. Sparse vectors complement dense embeddings for hybrid search.
Port: 8083
What it Does
FastEmbed generates sparse vector representations of text:
- BM25 - Traditional keyword-based scoring (multilingual)
- SPLADE - Learned sparse representations with semantic term expansion
- MiniCOIL - Contextualized sparse embeddings
Sparse vectors are useful for:
- Hybrid search (combine with dense embeddings)
- Keyword matching with semantic awareness
- Efficient retrieval with inverted indices (see the sketch after this list)
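To make the format concrete: a sparse embedding is just a list of non-zero dimension indices with their weights, and two sparse vectors score against each other by multiplying the weights at shared indices. A minimal sketch, illustrative only and not part of the service API, using the same indices/values layout the /embed endpoint returns below:

# Illustrative sketch of the sparse {indices, values} format and its scoring.
# The dot product below is what a vector database effectively computes when
# matching a sparse query against a stored sparse document vector.
def sparse_dot(a: dict, b: dict) -> float:
    """Dot product over the indices shared by two sparse vectors."""
    weights_a = dict(zip(a["indices"], a["values"]))
    return sum(weights_a.get(i, 0.0) * v for i, v in zip(b["indices"], b["values"]))

query = {"indices": [1, 5, 23], "values": [0.8, 0.5, 0.3]}
doc = {"indices": [5, 23, 40], "values": [0.6, 0.2, 0.9]}
print(sparse_dot(query, doc))  # only indices 5 and 23 overlap: 0.5*0.6 + 0.3*0.2 = 0.36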
API Endpoints
Generate Sparse Embeddings
Endpoint: POST /embed
curl -X POST "http://fastembed-server:8083/embed" \
  -H "Content-Type: application/json" \
  -d '{
    "texts": ["What is machine learning?", "Deep learning tutorial"],
    "method": "bm25"
  }'
Request:
| Parameter | Type | Description |
|---|---|---|
| texts | array | List of texts to embed |
| method | string | One of bm25, splade, or minicoil |
Response:
{
  "embeddings": [
    {"indices": [1, 5, 23, ...], "values": [0.8, 0.5, 0.3, ...]},
    {"indices": [2, 8, 15, ...], "values": [0.7, 0.6, 0.4, ...]}
  ],
  "method": "bm25"
}
Health Check
curl http://fastembed-server:8083/health
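In scripts it can be convenient to poll the health endpoint until the service is ready before sending embedding requests. A minimal sketch; the retry interval and timeout here are arbitrary choices, not service defaults:

import time
import requests

def wait_until_ready(base_url="http://fastembed-server:8083", timeout=30.0):
    """Poll GET /health until it returns 200 or the timeout expires."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            if requests.get(f"{base_url}/health", timeout=2).status_code == 200:
                return True
        except requests.RequestException:
            pass  # service not reachable yet; keep polling
        time.sleep(1)
    return False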
Usage Example
import requests

# Generate BM25 sparse vectors
response = requests.post(
    "http://fastembed-server:8083/embed",
    json={
        "texts": ["machine learning algorithms", "neural network training"],
        "method": "bm25"
    }
)
result = response.json()

for i, emb in enumerate(result["embeddings"]):
    print(f"Text {i}: {len(emb['indices'])} non-zero dimensions")
Methods Comparison
| Method | Speed | Semantic | Best For |
|---|---|---|---|
| BM25 | Fast | Low | Keyword search |
| SPLADE | Medium | High | Hybrid search |
| MiniCOIL | Slow | High | Precision retrieval |
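These trade-offs can be checked empirically: requesting the same text with each method and counting the non-zero dimensions gives a rough feel for index size and retrieval cost. A sketch against the /embed endpoint documented above:

import requests

texts = ["machine learning algorithms"]
for method in ("bm25", "splade", "minicoil"):
    resp = requests.post(
        "http://fastembed-server:8083/embed",
        json={"texts": texts, "method": method},
    )
    emb = resp.json()["embeddings"][0]
    print(f"{method}: {len(emb['indices'])} non-zero dimensions")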
Hybrid Search Pattern
Combine sparse and dense vectors for best results:
# 1. Get dense embeddings (from Jina or ColPali)
dense = get_dense_embedding(query)

# 2. Get sparse embeddings
sparse = requests.post(
    "http://fastembed-server:8083/embed",
    json={"texts": [query], "method": "splade"}
).json()

# 3. Search with both (in your vector DB)
results = vector_db.hybrid_search(
    dense_vector=dense,
    sparse_vector=sparse["embeddings"][0]
)
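vector_db.hybrid_search above is a placeholder; how the dense and sparse results are actually combined depends on the database. A common approach is to run both searches and fuse the two ranked result lists with reciprocal rank fusion (RRF). A minimal sketch, assuming each search returns an ordered list of document ids; the constant k=60 is the conventional RRF default, not anything mandated by FastEmbed:

def rrf_fuse(dense_ranking, sparse_ranking, k=60):
    """Reciprocal rank fusion of two ranked lists of document ids."""
    scores = {}
    for ranking in (dense_ranking, sparse_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Example: documents ranked differently by the dense and sparse searches
print(rrf_fuse(["doc_a", "doc_b", "doc_c"], ["doc_b", "doc_c", "doc_a"]))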