GPU Services

IntelligenceBox includes specialized GPU services for AI-intensive operations. MCP servers can call these services to perform document analysis, embedding generation, entity extraction, and more.

Available Services

| Service | Port | Purpose |
|---|---|---|
| ColPali | 8001 | Document vision embeddings |
| Docling | 8080 | PDF processing and extraction |
| FastEmbed | 8083 | Sparse vector embeddings |
| GLiNER | 8093 | Named entity recognition |
| Jina Embeddings | 8080 | Text and image embeddings |
| Jina Reranker | 8081 | Document reranking |
| Table Extraction | 8098 | PDF table extraction |

Accessing from MCP Servers

GPU services run on the internal Docker network. From your MCP server container, access them at:

http://<service-name>:<port>

Example:

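// From your MCP server: send a base64-encoded image to ColPali for embedding
const response = await fetch("http://colpali-server:8001/embed", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ images: [imageBase64] }),
});
// Fail loudly on non-2xx responses instead of continuing with a bad result
if (!response.ok) {
  throw new Error(`ColPali request failed with status ${response.status}`);
}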
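
The URL pattern above can be wrapped in a small helper so that each call site does not repeat the request boilerplate. What follows is a minimal sketch, not part of IntelligenceBox: `serviceUrl`, `postJson`, and the `FetchLike` type are hypothetical names, and the sketch assumes every service accepts and returns JSON. The caller passes in `fetch` (or a stub), which keeps the helper easy to test.

```typescript
// Minimal structural type covering the parts of fetch used here, so the
// global fetch or a test stub can be passed in. (Hypothetical name.)
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Build http://<service-name>:<port><path> for a service on the Docker network.
function serviceUrl(serviceName: string, port: number, path: string): string {
  const normalizedPath = path.charAt(0) === "/" ? path : `/${path}`;
  return `http://${serviceName}:${port}${normalizedPath}`;
}

// POST a JSON body and return the parsed response.
// Assumption: the service speaks JSON in both directions.
async function postJson(
  fetchImpl: FetchLike,
  url: string,
  body: unknown,
): Promise<unknown> {
  const response = await fetchImpl(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!response.ok) {
    throw new Error(`Request to ${url} failed with status ${response.status}`);
  }
  return response.json();
}
```

With this in place, the ColPali call from the example could be written as `postJson(fetch, serviceUrl("colpali-server", 8001, "/embed"), { images: [imageBase64] })`.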

Common Patterns

Document Processing Pipeline

  1. Extract text → Docling (/process)
  2. Generate embeddings → ColPali (/embed) or Jina Embeddings (/embeddings/text)
  3. Extract entities → GLiNER (/predict)
  4. Extract tables → Table Extraction (/extract-tables)
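
The four steps above can be sketched as a single function. This is an illustration under stated assumptions, not the product's own implementation: the ports and paths come from this page, but `processDocument`, `PipelineHosts`, `DocumentInput`, and `PostJson` are hypothetical names, the hostnames other than `colpali-server` must come from your deployment, and every request body except ColPali's `{ images: [...] }` is a guess. Because the Docling response format is not described here, the caller supplies `extractText` to pull the text out of it.

```typescript
// Any function that POSTs JSON and resolves to the parsed response.
type PostJson = (url: string, body: unknown) => Promise<unknown>;

// Hostnames on the Docker network. Only "colpali-server" appears on this
// page; the rest are deployment-specific and supplied by the caller.
interface PipelineHosts {
  docling: string;
  colpali: string;
  gliner: string;
  tableExtraction: string;
}

interface DocumentInput {
  pdfBase64: string;
  pageImagesBase64: string[];
}

async function processDocument(
  post: PostJson,
  hosts: PipelineHosts,
  input: DocumentInput,
  extractText: (doclingResponse: unknown) => string,
) {
  // 1. Extract text with Docling. Body shape { file } is an assumption.
  const doclingResponse = await post(
    `http://${hosts.docling}:8080/process`,
    { file: input.pdfBase64 },
  );
  const text = extractText(doclingResponse);

  // 2. Generate embeddings with ColPali (body shape taken from the example).
  const embeddings = await post(
    `http://${hosts.colpali}:8001/embed`,
    { images: input.pageImagesBase64 },
  );

  // 3. Extract entities with GLiNER. Body shape { text } is an assumption.
  const entities = await post(`http://${hosts.gliner}:8093/predict`, { text });

  // 4. Extract tables. Body shape { file } is an assumption.
  const tables = await post(
    `http://${hosts.tableExtraction}:8098/extract-tables`,
    { file: input.pdfBase64 },
  );

  return { text, embeddings, entities, tables };
}
```

Here `post` can be any thin wrapper around `fetch` that sends a JSON body and parses the JSON reply. The steps run sequentially to mirror the list; steps 2 to 4 do not depend on each other in this sketch, so they could be run concurrently if throughput matters.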

Search and Retrieval

  1. Query embedding → Jina Embeddings (/embeddings/text)
  2. Sparse vectors → FastEmbed (BM25/SPLADE)
  3. Rerank results → Jina Reranker (/rerank)
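
The retrieval flow above can be sketched in the same way. Again, this is a hedged illustration: `search`, `SearchDeps`, and `PostJson` are hypothetical names, all request bodies are assumptions, and the hostnames come from your deployment. This page does not give a FastEmbed path, so the caller passes it in as `fastEmbedPath`. The candidate lookup between steps 2 and 3 depends on your vector store and is left to a caller-supplied `retrieve` function.

```typescript
// Any function that POSTs JSON and resolves to the parsed response.
type PostJson = (url: string, body: unknown) => Promise<unknown>;

interface SearchDeps {
  post: PostJson;
  // Deployment-specific hostnames on the Docker network (not given on this page).
  hosts: { jinaEmbeddings: string; fastEmbed: string; jinaReranker: string };
  // FastEmbed endpoint path; not documented here, so supplied by the caller.
  fastEmbedPath: string;
  // Looks up candidate documents in your vector store from both query vectors.
  retrieve: (dense: unknown, sparse: unknown) => Promise<string[]>;
}

async function search(deps: SearchDeps, query: string): Promise<unknown> {
  const { post, hosts } = deps;

  // 1. Dense query embedding. Body shape { texts } is an assumption.
  const dense = await post(
    `http://${hosts.jinaEmbeddings}:8080/embeddings/text`,
    { texts: [query] },
  );

  // 2. Sparse query vector (BM25/SPLADE). Path and body are assumptions.
  const sparse = await post(
    `http://${hosts.fastEmbed}:8083${deps.fastEmbedPath}`,
    { texts: [query] },
  );

  // Candidate lookup happens outside the GPU services.
  const candidates = await deps.retrieve(dense, sparse);

  // 3. Rerank candidates. Body shape { query, documents } is an assumption.
  return post(
    `http://${hosts.jinaReranker}:8081/rerank`,
    { query, documents: candidates },
  );
}
```

Keeping `post` and `retrieve` as parameters means the flow can be exercised with stubs before any service is reachable.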