🔌 Ollama Bridge Service

FastAPI sidecar service that adds resilience and reliability in front of Ollama

Sidecar Bridge Architecture

┌────────────────────────────────────┐
│   COSMIC Orchestrator (Main API)   │
│         Port 8100                  │
└──────────────┬─────────────────────┘
               │
               │ HTTP Request
               │ (internal network)
               ▼
    ┌──────────────────────┐
    │  🔌 Ollama Bridge    │
    │  FastAPI Service     │
    │  Port 8200           │
    │                      │
    │  • Retry Logic       │
    │  • Circuit Breaker   │
    │  • Health Checks     │
    └──────────┬───────────┘
               │
               │ Resilient Connection
               │ (with retries)
               ▼
    ┌──────────────────────┐
    │  🏠 Ollama Host      │
    │  Mac Studio M2       │
    │  Port 11434          │
    │                      │
    │  • deepseek-r1:1.5b  │
    │  • nomic-embed-text  │
    └──────────────────────┘

┌─────────────────────────────────┐
│  Failure Handling               │
│  ────────────────               │
│  1. Request fails → Retry       │
│  2. 3 retries fail → Circuit    │
│  3. Circuit open → Fast fail    │
│  4. After 60s → Half-open test  │
│  5. Success → Circuit closed    │
└─────────────────────────────────┘

4 Process Steps

1. Request Interception

The API gateway intercepts requests bound for Ollama

  • FastAPI endpoint exposure
  • Request validation
  • Authentication check
  • Rate limiting
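
The validation and rate-limiting bullets above could be sketched as follows. This is a minimal, standard-library illustration rather than the bridge's actual code: the `SlidingWindowLimiter` class, the `validate_generate_request` helper, the limit of 10 requests per second, and the assumption that the bridge forwards Ollama-style `model`/`prompt` payloads are all hypothetical.

```python
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each client."""
    limit: int = 10        # hypothetical value, not taken from the spec
    window: float = 1.0    # hypothetical value, not taken from the spec
    _hits: Dict[str, deque] = field(default_factory=dict, repr=False)

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits.setdefault(client_id, deque())
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


def validate_generate_request(payload: dict) -> List[str]:
    """Return validation errors; an empty list means the payload is acceptable."""
    errors = []
    model = payload.get("model")
    if not isinstance(model, str) or not model:
        errors.append("model must be a non-empty string")
    if not isinstance(payload.get("prompt"), str):
        errors.append("prompt must be a string")
    return errors
```

In a FastAPI handler, a failed `allow()` would typically map to HTTP 429 and a non-empty error list to HTTP 422, before anything is forwarded to Ollama.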

2. Retry Logic

Smart handling of transient failures

  • Exponential backoff (2s → 4s → 8s)
  • Max 3 retry attempts
  • Jitter randomization
  • Request idempotency
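
As a rough sketch of the backoff schedule listed above (2s, 4s, 8s, at most 3 retries, with jitter), assuming a ±25% jitter range and that only connection and timeout errors count as transient. Neither assumption comes from the source, and `backoff_delays` and `call_with_retry` are illustrative names.

```python
import random
import time


def backoff_delays(max_retries=3, base=2.0, jitter=0.25, rng=random.random):
    """Yield the wait before each retry: base * 2**n, scaled by +/- `jitter`."""
    for attempt in range(max_retries):
        delay = base * (2 ** attempt)             # 2s, 4s, 8s with the defaults
        spread = 1.0 + jitter * (2 * rng() - 1)   # factor in [1 - jitter, 1 + jitter)
        yield delay * spread


def call_with_retry(func, max_retries=3, base=2.0, jitter=0.25,
                    sleep=time.sleep,
                    retry_on=(ConnectionError, TimeoutError)):
    """Call `func`; on a transient error wait and retry, up to `max_retries` times."""
    delays = backoff_delays(max_retries, base, jitter)
    while True:
        try:
            return func()
        except retry_on:
            delay = next(delays, None)
            if delay is None:
                raise  # retries exhausted: surface the last error to the caller
            sleep(delay)
```

Retrying is only safe when the wrapped call is idempotent, which is why idempotency appears in the list above. Jitter keeps several clients that failed at the same moment from retrying in lockstep.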

3. Circuit Breaker

Protection against cascading failures

  • Failure threshold: 5 errors/30s
  • Half-open state after 60s
  • Health check endpoints
  • Automatic recovery
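
The thresholds above (5 errors in 30s, half-open after 60s) suggest a small state machine along these lines. This is a simplified sketch, not the service's implementation: the `CircuitBreaker` class and its method names are hypothetical, and the half-open state here lets every call through rather than a single trial request.

```python
import time
from collections import deque


class CircuitBreaker:
    """closed -> open after `threshold` failures within `window` seconds;
    open -> half-open after `recovery` seconds; half-open -> closed on success."""

    def __init__(self, threshold=5, window=30.0, recovery=60.0,
                 clock=time.monotonic):
        self.threshold = threshold
        self.window = window
        self.recovery = recovery
        self.clock = clock
        self.failures = deque()   # timestamps of recent failures
        self.opened_at = None     # None while the circuit is closed

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.recovery:
            return "half-open"
        return "open"

    def allow_request(self):
        # Open means fail fast; closed and half-open let the call through.
        return self.state != "open"

    def record_success(self):
        self.failures.clear()
        self.opened_at = None

    def record_failure(self):
        now = self.clock()
        if self.state == "half-open":
            self.opened_at = now  # trial call failed: stay open for another `recovery`
            return
        self.failures.append(now)
        # Forget failures that fell outside the counting window.
        while self.failures and now - self.failures[0] > self.window:
            self.failures.popleft()
        if len(self.failures) >= self.threshold:
            self.opened_at = now
```

A caller would check `allow_request()` before contacting Ollama and report the outcome with `record_success()` or `record_failure()` afterwards. Injecting `clock` makes the timing testable without sleeping.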

4. Response Processing

Response normalization and validation

  • JSON schema validation
  • Error handling
  • Metric collection
  • Logging & tracing
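
One way the validation and metric-collection bullets above might fit together, using only the standard library. The required fields (`model`, `response`, `done`) are assumed from the shape of a non-streaming Ollama generate response. The `normalize_response` helper, the trimmed output shape, and the plain-dict metrics counter are illustrative stand-ins for whatever schema and metrics backend the service really uses.

```python
import json

# Assumed minimal shape of an upstream response body.
REQUIRED_FIELDS = {"model": str, "response": str, "done": bool}


class InvalidUpstreamResponse(ValueError):
    """Raised when the upstream body is not the JSON object we expect."""


def normalize_response(raw: str, metrics: dict) -> dict:
    """Parse and check an upstream body, count the outcome, return a trimmed dict."""
    try:
        body = json.loads(raw)
        if not isinstance(body, dict):
            raise InvalidUpstreamResponse("expected a JSON object")
        for name, expected in REQUIRED_FIELDS.items():
            if not isinstance(body.get(name), expected):
                raise InvalidUpstreamResponse(
                    f"field {name!r} missing or not {expected.__name__}"
                )
    except (json.JSONDecodeError, InvalidUpstreamResponse) as exc:
        metrics["invalid"] = metrics.get("invalid", 0) + 1
        raise InvalidUpstreamResponse(str(exc)) from exc
    metrics["ok"] = metrics.get("ok", 0) + 1
    # Extra upstream fields are dropped so callers see one stable shape.
    return {"model": body["model"], "text": body["response"], "done": body["done"]}
```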

Technical Specifications

Service Port        8200     FastAPI HTTP
Max Retries         3        with backoff
Circuit Threshold   5/30s    errors per window
Recovery Time       60s      half-open state
Request Timeout     30s      per attempt
Uptime              100%     production validated

💡 Key Advantages

  • Resilience: retry logic absorbs transient failures
  • Protection: the circuit breaker prevents cascading failures
  • Decoupling: the main API does not depend on Ollama's stability
  • Monitoring: detailed health and performance metrics
  • Production-ready: 100% uptime validated under real load

Docker Configuration

cosmic-ollama-bridge:
  image: tiangolo/uvicorn-gunicorn-fastapi:python3.11
  container_name: cosmic-ollama-bridge
  ports:
    - "8200:8200"
  environment:
    OLLAMA_HOST: host.docker.internal:11434
    MAX_RETRIES: 3
    CIRCUIT_THRESHOLD: 5
    CIRCUIT_TIMEOUT: 60
  networks:
    - cosmic-network
  restart: unless-stopped
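
On the Python side, the service might read the environment variables from this configuration roughly as follows. The variable names and default values mirror the snippet above. The `BridgeSettings` dataclass and the `load_settings` helper are hypothetical, and treating `CIRCUIT_TIMEOUT` as a number of seconds is an assumption based on the 60s recovery time.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class BridgeSettings:
    ollama_host: str
    max_retries: int
    circuit_threshold: int
    circuit_timeout: int   # assumed to be seconds


def load_settings(env=None) -> BridgeSettings:
    """Build settings from a mapping (defaults to os.environ), with fallbacks."""
    env = os.environ if env is None else env
    return BridgeSettings(
        ollama_host=env.get("OLLAMA_HOST", "host.docker.internal:11434"),
        max_retries=int(env.get("MAX_RETRIES", "3")),
        circuit_threshold=int(env.get("CIRCUIT_THRESHOLD", "5")),
        circuit_timeout=int(env.get("CIRCUIT_TIMEOUT", "60")),
    )
```

Environment values always arrive as strings, hence the explicit `int()` conversions. A malformed value raises `ValueError` at startup rather than at the first failing request.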