
Docling (PDF Processing)

A GPU-accelerated PDF processing service built on the Docling VLM (vision-language model). It extracts text, tables, and figures from PDF documents and can convert them to Markdown.

Port: 8080
Model: GraniteDocling-258M

What it Does

Docling processes PDFs and extracts structured content:

  • Text extraction with document structure preservation
  • Table detection and structured output
  • Figure extraction with bounding boxes
  • Markdown conversion for readable output
  • OCR support for scanned documents

API Endpoints

Process PDF (Base64)

Endpoint: POST /process

curl -X POST "http://docling-server:8080/process" \
  -H "Content-Type: application/json" \
  -d '{
    "pdf_base64": "<base64-encoded-pdf>",
    "source_name": "document.pdf",
    "collection_name": "my-collection",
    "return_markdown": true
  }'

Request:

Parameter         Type      Description
pdf_base64        string    Base64-encoded PDF bytes
source_name       string    Original filename
collection_name   string    Collection for organizing
chunk_size        number    Text chunk size (default: 1000)
chunk_overlap     number    Overlap between chunks (default: 100)
return_markdown   boolean   Include markdown output
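
The parameters above map directly onto a JSON body. The following is a minimal sketch of a client-side helper that builds it. `build_process_payload` is a hypothetical name and not part of the service; it assumes that optional fields can simply be left out so that the documented defaults apply.

```python
import base64
import json


def build_process_payload(pdf_bytes, source_name, collection_name=None,
                          chunk_size=None, chunk_overlap=None,
                          return_markdown=False):
    """Build the JSON body for POST /process (hypothetical helper)."""
    payload = {
        "pdf_base64": base64.b64encode(pdf_bytes).decode("ascii"),
        "source_name": source_name,
        "return_markdown": return_markdown,
    }
    # Assumption: omitted optional fields fall back to the documented
    # defaults (chunk_size=1000, chunk_overlap=100) on the server.
    optional = {
        "collection_name": collection_name,
        "chunk_size": chunk_size,
        "chunk_overlap": chunk_overlap,
    }
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


if __name__ == "__main__":
    body = build_process_payload(b"%PDF-1.4 demo", "document.pdf",
                                 chunk_size=500, chunk_overlap=50)
    print(json.dumps(body, indent=2))
```

The returned dictionary can be passed as the `json=` argument of `requests.post`, as in the usage example further down.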

Response:

{
  "success": true,
  "chunks": [
    {"type": "text", "page_no": 1, "chunk_id": 0, "content": "..."}
  ],
  "tables": [...],
  "figures": [...],
  "markdown": "# Document Title\n\nContent...",
  "metadata": {
    "source_name": "document.pdf",
    "status": "completed"
  }
}
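
A response shaped like the one above can be regrouped by page before further processing. This is a sketch under the assumption that every text chunk carries the `type`, `page_no`, `chunk_id`, and `content` fields shown in the sample; `text_by_page` is a hypothetical helper name.

```python
from collections import defaultdict


def text_by_page(result):
    """Group text chunk contents by page number, in chunk_id order."""
    if not result.get("success"):
        raise ValueError("processing did not succeed")
    pages = defaultdict(list)
    chunks = sorted(result.get("chunks", []), key=lambda c: c["chunk_id"])
    for chunk in chunks:
        # Only "text" chunks appear in the sample response; any other
        # chunk types the service may return are skipped here.
        if chunk.get("type") == "text":
            pages[chunk["page_no"]].append(chunk["content"])
    return {page: "\n".join(parts) for page, parts in sorted(pages.items())}


if __name__ == "__main__":
    sample = {
        "success": True,
        "chunks": [
            {"type": "text", "page_no": 1, "chunk_id": 0, "content": "Intro"},
            {"type": "text", "page_no": 2, "chunk_id": 1, "content": "Terms"},
        ],
    }
    for page, text in text_by_page(sample).items():
        print(page, text)
```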

Process PDF (File Upload)

Endpoint: POST /process/upload

curl -X POST "http://docling-server:8080/process/upload" \
  -F "file=@document.pdf" \
  -F "collection_name=my-collection"

Response:

{
  "success": true,
  "markdown": "# Document Title...",
  "metadata": {
    "source_name": "document.pdf",
    "docling_pipeline_enabled": true,
    "ocr_mode": "auto"
  },
  "page_count": 5,
  "processing_time": 2.34
}
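
The same upload can be sent from Python. The sketch below uses only the standard library and encodes the multipart form by hand; the field names `file` and `collection_name` come from the curl command above, while the helper names and the 120-second timeout are assumptions chosen for illustration.

```python
import json
import os
import urllib.request
import uuid


def encode_multipart(fields, filename, file_bytes):
    """Encode text fields plus one PDF under the form name "file"."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        header = (f"--{boundary}\r\n"
                  f'Content-Disposition: form-data; name="{name}"\r\n\r\n')
        parts.append(f"{header}{value}\r\n".encode("utf-8"))
    file_header = (f"--{boundary}\r\n"
                   f'Content-Disposition: form-data; name="file"; '
                   f'filename="{filename}"\r\n'
                   f"Content-Type: application/pdf\r\n\r\n")
    parts.append(file_header.encode("utf-8"))
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def upload_pdf(path, collection_name,
               url="http://docling-server:8080/process/upload",
               timeout=120):  # timeout is an assumed value
    """POST a PDF file to /process/upload and return the parsed JSON."""
    with open(path, "rb") as f:
        body, content_type = encode_multipart(
            {"collection_name": collection_name},
            os.path.basename(path), f.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type},
        method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the `requests` library installed, passing `files=` and `data=` to `requests.post` produces an equivalent request with less code.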

Health Check

Endpoint: GET /health

curl http://docling-server:8080/health
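
A client that starts alongside the service may want to wait for the health endpoint before sending documents. The response format of `/health` is not described here, so this sketch assumes only that an HTTP 200 status means the service is ready; the function names, attempt count, and delay are illustrative.

```python
import time
import urllib.request


def http_ok(url, timeout=5):
    """Return True if a GET to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, and socket timeouts all derive from OSError.
        return False


def wait_until_healthy(url="http://docling-server:8080/health",
                       attempts=30, delay=2.0, probe=http_ok):
    """Poll the health endpoint until it responds or attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # no sleep after the final attempt
    return False


if __name__ == "__main__":
    ready = wait_until_healthy(attempts=3, delay=1.0)
    print("ready" if ready else "not ready")
```

The `probe` parameter exists so the polling logic can be exercised without a running server.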

Usage Example

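import base64

import requests

# Read the PDF and base64-encode it for the JSON body
with open("contract.pdf", "rb") as f:
    pdf_b64 = base64.b64encode(f.read()).decode("ascii")

# Send the document to the /process endpoint
response = requests.post(
    "http://docling-server:8080/process",
    json={
        "pdf_base64": pdf_b64,
        "source_name": "contract.pdf",
        "return_markdown": True,
    },
    timeout=300,  # assumed value; requests applies no timeout by default
)
response.raise_for_status()  # fail early on HTTP errors

result = response.json()
print(result["markdown"])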

Configuration

Variable                            Default   Description
DOCLING_ENABLE_PIPELINE             true      Full VLM pipeline; false for fast text-only
DOCLING_USE_CUDA                    true      Enable GPU acceleration
DOCLING_CUDA_USE_FLASH_ATTENTION2   true      Flash Attention 2

Fast Mode

Set DOCLING_ENABLE_PIPELINE=false to skip the VLM pipeline and run fast text-only extraction (no OCR, no model inference):

docker run -e DOCLING_ENABLE_PIPELINE=false docling-server