GPU Services
IntelligenceBox includes specialized GPU services for AI-intensive operations. MCP servers can call these services for document analysis, embedding generation, entity extraction, reranking, and table extraction.
Available Services
| Service | Port | Purpose |
|---|---|---|
| ColPali | 8001 | Document vision embeddings |
| Docling | 8080 | PDF processing and extraction |
| FastEmbed | 8083 | Sparse vector embeddings |
| GLiNER | 8093 | Named entity recognition |
| Jina Embeddings | 8080 | Text and image embeddings |
| Jina Reranker | 8081 | Document reranking |
| Table Extraction | 8098 | PDF table extraction |
Accessing from MCP Servers
GPU services run on the internal Docker network. From your MCP server container, access them at:
http://<service-name>:<port>

Example:

// From your MCP server
const response = await fetch("http://colpali-server:8001/embed", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ images: [imageBase64] }),
});

Common Patterns
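Every call in the patterns below has the same shape as the ColPali example: a JSON POST to http://<service-name>:<port> plus an endpoint path. The following is a minimal sketch of a shared helper for that shape, not part of IntelligenceBox itself. The names `serviceUrl` and `postJson`, the 30-second default timeout, and the injectable `fetchImpl` option are all assumptions made for illustration.

```javascript
// Build a URL following the http://<service-name>:<port> pattern.
function serviceUrl(serviceName, port, path = "") {
  return `http://${serviceName}:${port}${path}`;
}

// POST a JSON body and resolve to the parsed JSON response.
// fetchImpl is injectable so the helper can be exercised without a network.
async function postJson(url, body, { timeoutMs = 30000, fetchImpl = fetch } = {}) {
  // Abort the request if the service does not answer in time.
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
      signal: controller.signal,
    });
    // fetch only rejects on network failure, so check the status explicitly.
    if (!response.ok) {
      throw new Error(`${url} responded with HTTP ${response.status}`);
    }
    return await response.json();
  } finally {
    clearTimeout(timer);
  }
}
```

With this in place, the ColPali call could be written as `await postJson(serviceUrl("colpali-server", 8001, "/embed"), { images: [imageBase64] })`. Checking `response.ok` matters because a 4xx or 5xx reply from a service would otherwise be parsed as if it were a result.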
Document Processing Pipeline
- Extract text → Docling (/process)
- Generate embeddings → ColPali (/embed) or Jina Embeddings (/embeddings/text)
- Extract entities → GLiNER (/predict)
- Extract tables → Table Extraction (/extract-tables)
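The steps above can be chained in a single function. This is a sketch under several assumptions: only the endpoint paths, the ports, the `colpali-server` hostname, and the ColPali `{ images }` payload come from this page. The other hostnames, the request bodies, and the `text` response field are hypothetical placeholders, and `post` stands for any helper that sends a JSON POST and resolves to the parsed response.

```javascript
// Hostnames other than colpali-server are placeholders (assumptions);
// substitute the service names used on your Docker network.
const SERVICES = {
  docling: "http://docling-server:8080",         // hypothetical hostname
  colpali: "http://colpali-server:8001",
  gliner: "http://gliner-server:8093",           // hypothetical hostname
  tables: "http://table-extraction-server:8098", // hypothetical hostname
};

// post(url, body) must send a JSON POST and resolve to the parsed response.
async function processDocument(post, { pdfBase64, pageImagesBase64, labels }) {
  // Step 1: extract text. The { file } payload and the .text field are assumed.
  const extracted = await post(`${SERVICES.docling}/process`, { file: pdfBase64 });

  // The remaining steps need only the inputs or the text from step 1,
  // so they can run concurrently.
  const [embeddings, entities, tables] = await Promise.all([
    post(`${SERVICES.colpali}/embed`, { images: pageImagesBase64 }),
    post(`${SERVICES.gliner}/predict`, { text: extracted.text, labels }), // assumed payload
    post(`${SERVICES.tables}/extract-tables`, { file: pdfBase64 }),       // assumed payload
  ]);

  return { text: extracted.text, embeddings, entities, tables };
}
```

Passing `post` in as an argument keeps the pipeline independent of how requests are sent, which also makes it straightforward to run against a stub before the real payload formats are confirmed.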
Search and Retrieval
- Query embedding → Jina Embeddings (/embeddings/text)
- Sparse vectors → FastEmbed (BM25/SPLADE)
- Rerank results → Jina Reranker (/rerank)
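A query flow built from these three steps might look like the sketch below. It is illustrative only: the hostnames and request payloads are assumptions, the FastEmbed endpoint path is not given on this page and is therefore left to a caller-supplied `sparseEmbed` function, and candidate retrieval is delegated to a hypothetical `retrieve` function because the vector store sits outside the GPU services.

```javascript
// Hostnames are placeholders (assumptions); ports come from the services table.
const SEARCH_SERVICES = {
  jinaEmbeddings: "http://jina-embeddings:8080", // hypothetical hostname
  jinaReranker: "http://jina-reranker:8081",     // hypothetical hostname
};

// deps.post(url, body): JSON POST helper resolving to the parsed response.
// deps.sparseEmbed(query): wraps the FastEmbed call for BM25/SPLADE vectors.
// deps.retrieve(dense, sparse): queries your vector store for candidate texts.
async function searchAndRerank(deps, query) {
  // Dense and sparse query vectors are independent, so request them together.
  const [dense, sparse] = await Promise.all([
    deps.post(`${SEARCH_SERVICES.jinaEmbeddings}/embeddings/text`, { texts: [query] }), // assumed payload
    deps.sparseEmbed(query),
  ]);

  // Candidate retrieval happens outside the GPU services.
  const candidates = await deps.retrieve(dense, sparse);
  if (candidates.length === 0) {
    // Nothing to rerank, so skip the reranker call entirely.
    return { query, candidates, reranked: null };
  }

  // Rerank the candidates against the query; the payload shape is assumed.
  const reranked = await deps.post(`${SEARCH_SERVICES.jinaReranker}/rerank`, {
    query,
    documents: candidates,
  });
  return { query, candidates, reranked };
}
```

The raw reranker response is returned unchanged because its format is not documented here; map it to ordered results once the actual response fields are known.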