GPU Services

GPU-accelerated microservices for document processing, embeddings, entity extraction, and other AI operations.

Architecture

GPU services run as Docker containers on the IntelligenceBox internal network. Each service exposes HTTP endpoints that the main box-server calls for specialized AI operations.

Network Architecture
┌──────────────────────────────────────────────────────────┐
│                    Docker Network                         │
│                                                          │
│  ┌─────────────┐    HTTP    ┌────────────────────────┐  │
│  │ box-server  │ ─────────► │  GPU Service Container │  │
│  │   (API)     │            │  (colpali:8001)        │  │
│  └─────────────┘            │  (docling:8080)        │  │
│                             │  (fastembed:8083)      │  │
│                             │  (gliner:8093)         │  │
│                             │  (jina:8080)           │  │
│                             │  (jina-reranker:8081)  │  │
│                             │  (table-extractor:8098)│  │
│                             └────────────────────────┘  │
└──────────────────────────────────────────────────────────┘

Common Patterns

HTTP Communication

All GPU services use HTTP POST for operations and return JSON responses:

Client Pattern
const response = await fetch(`http://${serviceName}:${port}/endpoint`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(payload),
});

const result = await response.json();
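
The request-and-parse pattern above can be wrapped in one small typed helper. This is a minimal sketch, not the actual client code: the name `postJson`, the `FetchLike` type, and the injectable `fetchImpl` parameter are assumptions introduced for illustration (the last one so the helper can be exercised without a running service).

```typescript
// Minimal shape of the fetch function this sketch relies on (hypothetical type).
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
  text(): Promise<string>;
}>;

// Hypothetical helper: POSTs a JSON payload to a GPU service and parses the
// JSON response. `baseUrl` is an internal URL such as http://fastembed:8083.
async function postJson<T>(
  baseUrl: string,
  path: string,
  payload: unknown,
  fetchImpl: FetchLike = fetch, // defaults to the global fetch
): Promise<T> {
  const response = await fetchImpl(`${baseUrl}${path}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
  });

  // Surface non-2xx responses as errors instead of parsing them as results.
  if (!response.ok) {
    const errorText = await response.text();
    throw new Error(`Service error (${response.status}): ${errorText}`);
  }

  // The caller chooses T; the response is not validated against it at runtime.
  return (await response.json()) as T;
}
```

The type parameter is only a compile-time convenience, so a caller that needs stronger guarantees would validate the parsed JSON before using it.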

Error Handling

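All clients include timeout mechanisms and error handling. Each client checks the response status and throws an error that carries the status code and the response body: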

if (!response.ok) {
  const errorText = await response.text();
  throw new Error(`Service error (${response.status}): ${errorText}`);
}
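
The snippet above covers the status check but not the timeout. One common way to bound a request is an `AbortController` combined with a timer, sketched below. The names `withTimeout` and `callService` and the 30-second default are assumptions made for this example, not values taken from the actual clients.

```typescript
// Hypothetical wrapper: runs `operation` and aborts its signal after
// `timeoutMs`. The operation must honor the signal (fetch does) for the
// timeout to take effect.
async function withTimeout<T>(
  operation: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await operation(controller.signal);
  } catch (err) {
    // Report failures after an abort as timeouts; rethrow everything else.
    if (controller.signal.aborted) {
      throw new Error(`Service request timed out after ${timeoutMs} ms`);
    }
    throw err;
  } finally {
    clearTimeout(timer); // avoid leaving a pending timer behind
  }
}

// Hypothetical usage: the POST and the body read share one deadline.
async function callService(
  url: string,
  payload: unknown,
  timeoutMs = 30_000, // assumed default, not from the source
): Promise<unknown> {
  return withTimeout(async (signal) => {
    const response = await fetch(url, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(payload),
      signal,
    });
    if (!response.ok) {
      const errorText = await response.text();
      throw new Error(`Service error (${response.status}): ${errorText}`);
    }
    return response.json();
  }, timeoutMs);
}
```

Because the body is read inside the wrapped operation, a service that sends headers quickly but stalls while streaming the response is still cut off at the deadline.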

Available Services

Service Locations

Service           Internal URL                  Client
ColPali           http://colpali:8001           colpaliClient.ts
Docling           http://docling:8080           doclingClient.ts
FastEmbed         http://fastembed:8083         fastEmbedClient.ts
GLiNER            http://gliner:8093            glinerClient.ts
Jina Embeddings   http://jina:8080              jinaClient.ts
Jina Reranker     http://jina-reranker:8081     jinaRerankerClient.ts
Table Extraction  http://table-extractor:8098   tableExtractionClient.ts
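
The service locations listed above could be kept in a single typed map so that callers never hard-code hosts or ports. This is a sketch under assumptions: the internal URLs are copied from the list, while the constant `GPU_SERVICES`, its key names, and the `serviceUrl` helper are hypothetical and not part of the documented clients.

```typescript
// Internal URLs as listed under Service Locations. Key names are illustrative.
const GPU_SERVICES = {
  colpali: 'http://colpali:8001',
  docling: 'http://docling:8080',
  fastembed: 'http://fastembed:8083',
  gliner: 'http://gliner:8093',
  jina: 'http://jina:8080',
  jinaReranker: 'http://jina-reranker:8081',
  tableExtraction: 'http://table-extractor:8098',
} as const;

// Union of the keys above, so a misspelled service name fails to compile.
type GpuServiceName = keyof typeof GPU_SERVICES;

// Hypothetical helper: resolves an endpoint path against a service's base URL.
// The URL constructor handles paths given with or without a leading slash.
function serviceUrl(service: GpuServiceName, path: string): string {
  return new URL(path, GPU_SERVICES[service]).toString();
}
```

For example, `serviceUrl('gliner', '/endpoint')` resolves against `http://gliner:8093`, and a call such as `serviceUrl('glinr', '/endpoint')` is rejected by the type checker.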