GPU Services
GPU-accelerated microservices for document processing, embeddings, entity extraction, and AI operations.
Architecture
GPU services run as Docker containers on the IntelligenceBox internal network. Each service exposes HTTP endpoints that the main box-server calls for specialized AI operations.
┌──────────────────────────────────────────────────────────┐
│                      Docker Network                      │
│                                                          │
│  ┌─────────────┐    HTTP    ┌────────────────────────┐   │
│  │ box-server  │ ─────────► │ GPU Service Container  │   │
│  │   (API)     │            │ (colpali:8001)         │   │
│  └─────────────┘            │ (docling:8080)         │   │
│                             │ (fastembed:8083)       │   │
│                             │ (gliner:8093)          │   │
│                             │ (jina:8080)            │   │
│                             │ (reranker:8081)        │   │
│                             │ (tables:8098)          │   │
│                             └────────────────────────┘   │
└──────────────────────────────────────────────────────────┘
Common Patterns
HTTP Communication
All GPU services use HTTP POST for operations and return JSON responses:
const response = await fetch(`http://${serviceName}:${port}/endpoint`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(payload),
});
const result = await response.json();
Error Handling
All clients include timeout mechanisms and error handling. A non-OK response is raised as an error that carries the HTTP status code and the response body:
if (!response.ok) {
  const errorText = await response.text();
  throw new Error(`Service error (${response.status}): ${errorText}`);
}
Available Services
ColPali (:8001): Multi-vector document vision embeddings for visual document retrieval
Docling (:8080): PDF processing with text, table, and figure extraction
FastEmbed (:8083): Sparse embeddings (BM25, SPLADE) for hybrid search
GLiNER (:8093): Zero-shot named entity recognition
Jina Embeddings (:8080): Multimodal embeddings for text and images
Jina Reranker (:8081): Document reranking for improved search relevance
Table Extraction (:8098): Extract structured tables from PDF documents
Service Locations
| Service | Internal URL | Client |
|---|---|---|
| ColPali | http://colpali:8001 | colpaliClient.ts |
| Docling | http://docling:8080 | doclingClient.ts |
| FastEmbed | http://fastembed:8083 | fastEmbedClient.ts |
| GLiNER | http://gliner:8093 | glinerClient.ts |
| Jina Embeddings | http://jina:8080 | jinaClient.ts |
| Jina Reranker | http://jina-reranker:8081 | jinaRerankerClient.ts |
| Table Extraction | http://table-extractor:8098 | tableExtractionClient.ts |
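
The HTTP pattern, the error check, and the URLs in the table above can be combined into one small helper. The following is a minimal sketch, not the code in the listed client files: `GPU_SERVICES`, `serviceUrl`, and `postJson` are hypothetical names, the 30-second default timeout is an assumed value, and the timeout uses the standard `AbortController` API, which may differ from what the real clients do.

```typescript
// Base URLs copied from the Service Locations table.
const GPU_SERVICES = {
  colpali: 'http://colpali:8001',
  docling: 'http://docling:8080',
  fastembed: 'http://fastembed:8083',
  gliner: 'http://gliner:8093',
  jina: 'http://jina:8080',
  jinaReranker: 'http://jina-reranker:8081',
  tableExtractor: 'http://table-extractor:8098',
} as const;

type GpuService = keyof typeof GPU_SERVICES;

// Hypothetical helper: joins a service base URL and an endpoint path.
function serviceUrl(service: GpuService, path: string): string {
  const suffix = path.startsWith('/') ? path : `/${path}`;
  return `${GPU_SERVICES[service]}${suffix}`;
}

// Hypothetical helper: POSTs a JSON payload and parses the JSON response.
// timeoutMs defaults to an assumed 30 seconds; tune it per service.
async function postJson<T>(
  service: GpuService,
  path: string,
  payload: unknown,
  timeoutMs: number = 30000,
): Promise<T> {
  const controller = new AbortController();
  // Abort the request if the service does not answer in time;
  // fetch then rejects with an AbortError.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetch(serviceUrl(service, path), {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(payload),
      signal: controller.signal,
    });
    if (!response.ok) {
      const errorText = await response.text();
      throw new Error(`Service error (${response.status}): ${errorText}`);
    }
    return (await response.json()) as T;
  } finally {
    // Always clear the timer so it cannot fire after the request settles.
    clearTimeout(timer);
  }
}
```

A call would then look like `await postJson('gliner', '/endpoint', payload)`, where `/endpoint` is the same placeholder path used in the HTTP Communication snippet. Keeping the base URLs in one typed map means a misspelled service name is caught at compile time instead of surfacing as a DNS failure at runtime.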
