📊 Monitoring Stack

Production observability with Prometheus + Grafana + Alertmanager

Monitoring Architecture

┌──────────────────────────────────────────┐
│   COSMIC Services (Instrumented)         │
│   ────────────────────────────────       │
│   • Orchestrator API (8100)              │
│   • Ollama Bridge (8200)                 │
│   • /metrics endpoints                   │
└──────────────┬───────────────────────────┘
               │
               │ Scrape every 15s
               │ (pull-based collection)
               ▼
    ┌──────────────────────┐
    │  📊 Prometheus       │
    │  Time-series DB      │
    │  Port 9090           │
    │                      │
    │  • Metrics storage   │
    │  • Query engine      │
    │  • Alert rules       │
    └──────────┬───────────┘
               │
        ┌──────┴────────┐
        │               │
        ▼               ▼
┌──────────────┐  ┌──────────────┐
│  📈 Grafana  │  │  🔔 Alert-   │
│  Dashboard   │  │  manager     │
│  Port 3100   │  │  Port 9093   │
│              │  │              │
│  Visualize   │  │  Notify      │
│  metrics     │  │  on issues   │
└──────────────┘  └──────┬───────┘
                         │
                         ▼
                  Email/Slack/PagerDuty

┌─────────────────────────────────┐
│  Key Metrics Tracked            │
│  ──────────────────             │
│  • API latency (p50/p95/p99)    │
│  • Error rate percentage        │
│  • Embedding generation time    │
│  • DuckDB query performance     │
│  • Ollama bridge uptime         │
│  • Cost tracking (Claude usage) │
└─────────────────────────────────┘
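
To make the /metrics endpoints in the diagram more concrete, here is a minimal sketch of the Prometheus text exposition format, rendered with the Python standard library only. The metric name `cosmic_requests_total`, the `service` label, and the sample values are hypothetical and not taken from the COSMIC services. A real service would more likely rely on a Prometheus client library, which also handles label escaping (omitted here).

```python
def render_counter(name, help_text, samples):
    """Render one counter family in the Prometheus text exposition format.

    `samples` maps a tuple of (label_name, label_value) pairs to a value.
    Label values are not escaped in this sketch.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in sorted(samples.items()):
        label_str = ",".join(f'{key}="{val}"' for key, val in labels)
        suffix = f"{{{label_str}}}" if label_str else ""
        lines.append(f"{name}{suffix} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric name, label, and values, for illustration only.
body = render_counter(
    "cosmic_requests_total",
    "Total requests handled.",
    {
        (("service", "orchestrator"),): 118,
        (("service", "ollama_bridge"),): 42,
    },
)
print(body)
```

Prometheus would fetch a body like this from each service on every scrape and store one sample per line.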

4 Pillars of Observability

1

Metrics Collection

Prometheus scrapes the endpoints every 15s

  • API response times
  • Ollama bridge health
  • DuckDB query latency
  • Embedding generation metrics
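
The 15s pull-based collection could be configured roughly as sketched below. The code builds a dictionary that mirrors the layout of a `prometheus.yml` scrape section (`global.scrape_interval`, `scrape_configs`, `static_configs`) and prints it as JSON to stay dependency-free; it would need to be written out as YAML before Prometheus could load it. The job names and the `localhost` hosts are assumptions; only the ports 8100 and 8200 come from the architecture above.

```python
import json


def build_scrape_config(targets, interval="15s"):
    """Build a dict shaped like the scrape section of prometheus.yml."""
    return {
        "global": {"scrape_interval": interval},
        "scrape_configs": [
            {
                "job_name": job,
                "metrics_path": "/metrics",
                "static_configs": [{"targets": [address]}],
            }
            for job, address in targets.items()
        ],
    }


# Hypothetical job names; hosts are assumed to be local.
config = build_scrape_config(
    {"orchestrator": "localhost:8100", "ollama_bridge": "localhost:8200"}
)
print(json.dumps(config, indent=2))
```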

2

Data Storage

Time-series database with configurable retention

  • 15-day retention policy
  • Automatic downsampling
  • Disk-based persistence
  • Query-optimized indexes
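
As a back-of-the-envelope check on what a 15-day retention at a 15s scrape interval implies, the sketch below applies the usual sizing rule: retention time × ingested samples per second × bytes per sample. The figure of 1,000 active series is a made-up example, and 2 bytes per sample is a rough upper estimate for compressed samples, not a measurement from this stack.

```python
SECONDS_PER_DAY = 86_400


def samples_per_series(retention_days, scrape_interval_s):
    """Number of samples kept for one series over the retention window."""
    return retention_days * SECONDS_PER_DAY // scrape_interval_s


def estimated_disk_bytes(retention_days, scrape_interval_s,
                         active_series, bytes_per_sample=2.0):
    """Rough disk estimate: retention * samples/second * bytes/sample."""
    retention_s = retention_days * SECONDS_PER_DAY
    return retention_s * active_series * bytes_per_sample / scrape_interval_s


# 15d retention and 15s interval come from the text; 1,000 series is hypothetical.
print(samples_per_series(15, 15))
print(estimated_disk_bytes(15, 15, active_series=1_000) / 1e6, "MB")
```

With these inputs each series holds 86,400 samples, and 1,000 series would need on the order of 170 MB.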

3

Visualization

Real-time Grafana dashboards

  • System health overview
  • Query performance tracking
  • Error rate monitoring
  • Cost analysis (Claude vs Ollama)

4

Alerting

Alertmanager for proactive notifications

  • API latency > 500ms threshold
  • Error rate > 1% alert
  • Ollama downtime detection
  • Email/Slack integration
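
The thresholds above can be read as simple predicates. The sketch below mirrors that logic in plain Python; in the actual stack these conditions would live in Prometheus alerting rules and be routed by Alertmanager. The alert names and the `Snapshot` type are hypothetical, and the text does not say which latency statistic is compared against 500ms, so a single `latency_ms` value is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """One point-in-time view of the service (hypothetical structure)."""
    latency_ms: float
    requests: int
    errors: int
    bridge_up: bool


def firing_alerts(snapshot):
    """Return the names of alerts whose threshold is crossed."""
    alerts = []
    if snapshot.latency_ms > 500:
        alerts.append("HighApiLatency")
    error_rate = snapshot.errors / snapshot.requests if snapshot.requests else 0.0
    if error_rate > 0.01:
        alerts.append("HighErrorRate")
    if not snapshot.bridge_up:
        alerts.append("OllamaBridgeDown")
    return alerts


# Healthy case using the figures reported on this page (78ms, 118/118 success).
print(firing_alerts(Snapshot(latency_ms=78, requests=118, errors=0, bridge_up=True)))
# Degraded case with made-up values.
print(firing_alerts(Snapshot(latency_ms=650, requests=100, errors=2, bridge_up=False)))
```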

Current Production Metrics

API Uptime: 100% (no downtime)
Median Latency: 78ms (p50 response time)
Error Rate: 0% (118/118 success)
Bridge Health: 100% (Ollama stable)
Scrape Interval: 15s (Prometheus)
Retention: 15d (time-series data)

Available Dashboards

📊 System Overview

Overall health at a glance

  • Total requests/s
  • P95 latency trends
  • Error rate timeline
  • Service health status
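
For readers unfamiliar with percentile latencies such as P95, the sketch below computes them from raw samples with the nearest-rank method. This is a simplification: Prometheus typically estimates quantiles from histogram buckets rather than from raw samples, so dashboard values may differ slightly. The sample latencies are synthetic.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Synthetic latencies of 1ms..100ms, for illustration only.
latencies_ms = list(range(1, 101))
for p in (50, 95, 99):
    print(f"p{p} = {percentile(latencies_ms, p)}ms")
```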

🔍 Query Performance

Detailed analysis of query performance

  • Neural vs Symbolic vs Hybrid breakdown
  • DuckDB query times
  • Embedding generation latency
  • Context size distribution

💰 Cost Tracking

Cost monitoring: Claude vs Ollama

  • API tokens consumed
  • Estimated monthly cost
  • Ollama local savings
  • Cost per query type
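
The cost figures on this dashboard boil down to simple arithmetic over token counts. The sketch below shows one way to derive an estimated cost and a monthly projection. The per-million-token prices and token counts are placeholders, not actual Claude pricing or measured usage, and the function names are hypothetical.

```python
def estimate_cost_usd(input_tokens, output_tokens,
                      usd_per_mtok_in, usd_per_mtok_out):
    """Cost of API usage given prices per million input/output tokens."""
    total = input_tokens * usd_per_mtok_in + output_tokens * usd_per_mtok_out
    return total / 1_000_000


def monthly_projection(cost_so_far, days_elapsed, days_in_month=30):
    """Linear projection of month-to-date spend to a full month."""
    return cost_so_far * days_in_month / days_elapsed


# Placeholder prices and token counts, for illustration only.
spent = estimate_cost_usd(1_000_000, 200_000,
                          usd_per_mtok_in=3.0, usd_per_mtok_out=15.0)
print(spent)
print(monthly_projection(spent, days_elapsed=10))
```

The same `estimate_cost_usd` call, applied to tokens served locally by Ollama, gives a rough figure for the savings line.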

🔌 Infrastructure

Docker infrastructure metrics

  • Container CPU/memory usage
  • Network I/O rates
  • Disk space utilization
  • Ollama bridge uptime

💡 Key Benefits

  • Visibility: Real-time metrics across the entire stack
  • Proactivity: Alerts fire before users notice a problem
  • Debugging: 15 days of history for root cause analysis
  • Optimization: Identify performance bottlenecks
  • Cost Control: Precise tracking of Claude vs Ollama costs

Service Access

Prometheus: http://localhost:9090 (raw metrics + query explorer)
Grafana: http://localhost:3100 (visual dashboards; login admin/admin)
Alertmanager: http://localhost:9093 (alert configuration)
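
A quick way to confirm that the three services are reachable is to probe their health endpoints. The sketch below assumes the stock health paths (`/-/healthy` for Prometheus and Alertmanager, `/api/health` for Grafana) are exposed unchanged on the ports listed above; a reverse proxy or custom route prefix would require different URLs.

```python
from urllib.error import URLError
from urllib.request import urlopen

# Base URLs come from this page; the health paths are assumed defaults.
HEALTH_ENDPOINTS = {
    "prometheus": "http://localhost:9090/-/healthy",
    "grafana": "http://localhost:3100/api/health",
    "alertmanager": "http://localhost:9093/-/healthy",
}


def is_healthy(url, timeout=2.0):
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, or a non-2xx status.
        return False


if __name__ == "__main__":
    for name, url in HEALTH_ENDPOINTS.items():
        print(f"{name}: {'up' if is_healthy(url) else 'down'}")
```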