Structum Observability Stack

Version: 0.1.0 (Alpha)
Last Updated: 2025-01-12
Status: Alpha


Contents

  1. Architectural Overview

  2. Logging System

  3. Metrics System

  4. Context Propagation

  5. Plugin Observability (Enterprise)

  6. Setup Guide

  7. Web Framework Integration

  8. Best Practices

  9. Troubleshooting

  10. API Reference


1. Architectural Overview

1.1 Dual-Mode Design

Structum Observability follows the same pattern as the configuration system: it works out of the box with a simple fallback and scales up to enterprise use through plugins.

┌─────────────────────────────────────────────────────────┐
│              APPLICATION CODE                           │
│   log.info("msg"), metrics.increment("counter")        │
└────────────────────┬────────────────────────────────────┘
                     │
┌────────────────────▼────────────────────────────────────┐
│           OBSERVABILITY FACADE (Core)                   │
│  • get_logger() → LoggerInterface                       │
│  • get_metrics() → MetricsInterface                     │
└────────────┬──────────────────────┬─────────────────────┘
             │                      │
    ┌────────▼────────┐    ┌────────▼────────┐
    │   LOGGING       │    │    METRICS      │
    └────────┬────────┘    └────────┬────────┘
             │                      │
    ┌────────▼────────┐    ┌────────▼────────┐
    │  Mode Detection │    │  Mode Detection │
    └────────┬────────┘    └────────┬────────┘
             │                      │
       ┌─────┴─────┐          ┌─────┴─────┐
       │           │          │           │
┌──────▼──┐  ┌─────▼────┐  ┌─▼──────┐  ┌─▼─────────┐
│Fallback │  │Enterprise│  │ No-Op  │  │Prometheus │
│(stdlib) │  │(Structlog│  │(dummy) │  │ (Plugin)  │
└─────────┘  └──────────┘  └────────┘  └───────────┘

1.2 Mode Detection (Automatic)

The system automatically detects which backend to use:

Logging:

# On import of structum_lab.logging
try:
    import structlog
    _LOGGER_BACKEND = "structlog"  # Enterprise Mode
except ImportError:
    _LOGGER_BACKEND = "stdlib"     # Fallback Mode

Metrics:

# On import of structum_lab.monitoring
try:
    from prometheus_client import Counter, Histogram
    _METRICS_BACKEND = "prometheus"
except ImportError:
    _METRICS_BACKEND = "noop"

⚠️ Note: Detection happens at import time, not at runtime. If you install the plugin after the modules have already been imported, the process must be restarted.
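The detection step can also be written without importing the optional package at all. The following is a minimal sketch, assuming a hypothetical helper named detect_backend; it is not Structum's actual implementation, only an illustration of why the result is fixed once the module has been loaded.

```python
from importlib.util import find_spec


def detect_backend(module_name: str, enterprise: str, fallback: str) -> str:
    """Return `enterprise` if `module_name` is importable, else `fallback`."""
    # find_spec looks the module up on the import path without executing it.
    return enterprise if find_spec(module_name) is not None else fallback


# Evaluated once, when this module is first imported. Packages installed
# later are not noticed until the process restarts and this runs again.
LOGGER_BACKEND = detect_backend("structlog", "structlog", "stdlib")
METRICS_BACKEND = detect_backend("prometheus_client", "prometheus", "noop")
```

Because the two constants are computed at import time, a long-running process keeps whatever value it saw first.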


2. Logging System

2.1 API Unificata

All loggers implement LoggerInterface:

from typing import Protocol, Any

class LoggerInterface(Protocol):
    """Contract for logging backends."""
    
    def debug(self, message: str, **kwargs: Any) -> None: ...
    def info(self, message: str, **kwargs: Any) -> None: ...
    def warning(self, message: str, **kwargs: Any) -> None: ...
    def error(self, message: str, **kwargs: Any) -> None: ...
    def critical(self, message: str, **kwargs: Any) -> None: ...

Usage:

from structum_lab.logging import get_logger

log = get_logger(__name__)

# The same call works in both Fallback and Enterprise mode
log.info("User logged in", user_id=123, ip_address="192.168.1.1")

2.2 Fallback Mode (Default)

Activation: Automatic when structlog is not installed.

Implementation: A thin wrapper around the stdlib logging.Logger.

Output:

2025-01-11 10:30:45 INFO     myapp.service: User logged in

Characteristics:

  • ✅ Zero extra dependencies

  • ✅ Compatible with stdlib logging (the two can coexist)

  • ❌ **kwargs end up in the extra dict (not visible by default)

  • ❌ Fixed default format; customization is limited to a stdlib Formatter

  • ❌ No structured output (JSON)

Configuration:

from structum_lab.logging import configure_logging

# Setup fallback logger
configure_logging(
    level="INFO",               # Global log level
    format="text",              # "text" or "json" (limited in fallback)
    handlers=["console"],       # Output destinations
    capture_warnings=True       # Capture warnings.warn()
)

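Accessing kwargs in Fallback mode:

import logging

# Custom formatter that displays a known extra field
formatter = logging.Formatter(
    '%(asctime)s %(levelname)s %(name)s: %(message)s [%(user_id)s]'
)
handler = logging.StreamHandler()
handler.setFormatter(formatter)

log = logging.getLogger(__name__)
log.addHandler(handler)
log.setLevel(logging.INFO)

# Extra fields are available to the formatter as %(name)s.
# Every record must then carry user_id, or formatting fails.
log.info("User login", extra={"user_id": 123})  # ⚠️ Different syntax from Enterprise mode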
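A formatter that names each extra field in its format string breaks as soon as a record lacks that field. One possible alternative, sketched below with stdlib logging only, is a formatter that appends whatever extra fields a record carries. The names ExtraFormatter and FallbackLogger are hypothetical and are not part of structum_lab.

```python
import io
import logging

# Attributes present on every LogRecord; anything else arrived via `extra`.
_STANDARD_ATTRS = set(vars(logging.LogRecord("", 0, "", 0, "", (), None)))
_STANDARD_ATTRS |= {"message", "asctime"}  # added while formatting


class ExtraFormatter(logging.Formatter):
    """Append key=value pairs for all non-standard record attributes."""

    def format(self, record: logging.LogRecord) -> str:
        base = super().format(record)
        extras = {k: v for k, v in vars(record).items() if k not in _STANDARD_ATTRS}
        if not extras:
            return base
        suffix = " ".join(f"{k}={v!r}" for k, v in sorted(extras.items()))
        return f"{base} [{suffix}]"


class FallbackLogger:
    """Thin wrapper that forwards **kwargs to the stdlib `extra` dict."""

    def __init__(self, logger: logging.Logger) -> None:
        self._logger = logger

    def info(self, message: str, **kwargs) -> None:
        # Keys must not clash with LogRecord attributes such as "name".
        self._logger.info(message, extra=kwargs)


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(ExtraFormatter("%(levelname)s %(name)s: %(message)s"))

raw = logging.getLogger("demo.fallback")
raw.addHandler(handler)
raw.setLevel(logging.INFO)
raw.propagate = False

log = FallbackLogger(raw)
log.info("User logged in", user_id=123, ip_address="192.168.1.1")
output = stream.getvalue()
```

With this approach the call site keeps the keyword-argument style used elsewhere in this guide, and records without extra fields are formatted unchanged.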

2.3 Enterprise Mode (Plugin)

Activation: Automatic when structum-observability is installed.

Implementation: Powered by structlog with optimized processors.

Output:

{
  "timestamp": "2025-01-11T10:30:45.123456Z",
  "level": "info",
  "logger": "myapp.service",
  "message": "User logged in",
  "user_id": 123,
  "ip_address": "192.168.1.1",
  "request_id": "req-abc123",
  "process": 12345,
  "thread": "MainThread"
}

Characteristics:

  • ✅ Structured JSON output (machine-readable)

  • ✅ kwargs are merged into the log event as first-class fields

  • ✅ Automatic context (request_id, user, correlation_id)

  • ✅ Performance: ~3x faster than stdlib under high throughput

  • ✅ Customizable processor chain

Configuration:

from structum_lab.logging import configure_logging

configure_logging(
    level="INFO",
    format="json",                    # Structured JSON
    processors=[                      # Custom processor chain
        "timestamp",                  # ISO 8601 timestamp
        "add_log_level",              # Normalizes the "level" key
        "add_logger_name",            # Adds the "logger" key
        "contextvars",                # Injects request_id, user, etc.
        "stack_info",                 # Traceback on errors
        "json_renderer"               # Final JSON encoding
    ],
    context_class="dict",             # Internal context format
    wrapper_class="BoundLogger"       # Custom logger class
)

Available processors:

| Processor           | Function                      |
|---------------------|-------------------------------|
| timestamp           | Adds an ISO 8601 timestamp    |
| add_log_level       | Normalizes the level name     |
| add_logger_name     | Adds the logger name          |
| contextvars         | Injects context from contextvars |
| stack_info          | Captures the stack trace      |
| exception_formatter | Formats exceptions            |
| json_renderer       | Final JSON encoding           |
| console_renderer    | Pretty-print for development  |
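Conceptually, a processor chain is a list of functions that each receive the event dictionary and return an enriched one, with a renderer as the last step. The sketch below illustrates that idea with plain Python. It is deliberately simplified (real structlog processors also receive the logger and the method name), and every function name here is illustrative rather than Structum's API.

```python
import json
from datetime import datetime, timezone
from typing import Any, Callable

EventDict = dict[str, Any]
Processor = Callable[[EventDict], EventDict]


def timestamp(event: EventDict) -> EventDict:
    """Add an ISO 8601 UTC timestamp."""
    return {**event, "timestamp": datetime.now(timezone.utc).isoformat()}


def add_log_level(level: str) -> Processor:
    """Build a processor that records the level name in lowercase."""
    def processor(event: EventDict) -> EventDict:
        return {**event, "level": level.lower()}
    return processor


def render_json(event: EventDict) -> str:
    """Final step: serialize the event with a stable key order."""
    return json.dumps(event, sort_keys=True, default=str)


def run_chain(processors: list[Processor], event: EventDict) -> str:
    """Pass the event through each processor in order, then render it."""
    for processor in processors:
        event = processor(event)
    return render_json(event)


line = run_chain(
    [timestamp, add_log_level("INFO")],
    {"message": "User logged in", "user_id": 123},
)
```

Order matters in such a chain: the renderer has to come last because it turns the dictionary into a string that later steps could no longer enrich.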

2.4 Detailed Comparison

| Feature           | Fallback Mode        | Enterprise Mode          |
|-------------------|----------------------|--------------------------|
| Backend           | logging (stdlib)     | structlog                |
| Output Format     | Human-readable text  | Structured JSON          |
| kwargs Handling   | extra dict (hidden)  | First-class fields       |
| Performance       | Standard             | ~3x faster               |
| Context Injection | Manual               | Automatic (contextvars)  |
| Customization     | Limited (Formatter)  | Full (Processors)        |
| Dependencies      | None                 | structlog>=23.1.0        |
| Use Case          | CLI, scripts, dev    | Production, microservices |
| Parsing Tools     | grep, awk            | jq, ELK, Splunk          |

2.5 Logger Naming Convention

# ✅ Use __name__ for an automatic hierarchy
log = get_logger(__name__)
# myapp.services.user → Logger("myapp.services.user")

# ✅ Explicit namespace per component
log = get_logger("myapp.database")

# ❌ Don't use generic hardcoded strings
log = get_logger("logger")  # Not traceable

2.6 Checking the Active Mode

from structum_lab.logging import get_logger_backend

backend = get_logger_backend()
print(f"Active logging backend: {backend}")
# Output: "stdlib" or "structlog"

3. Metrics System

3.1 API Unificata

from typing import Protocol, Any

class MetricsInterface(Protocol):
    """Contract for metrics backends."""
    
    def increment(
        self,
        name: str,
        value: int = 1,
        tags: dict[str, str] | None = None
    ) -> None:
        """Increment a counter."""
    
    def gauge(
        self,
        name: str,
        value: float,
        tags: dict[str, str] | None = None
    ) -> None:
        """Set a gauge value."""
    
    def timing(
        self,
        name: str,
        value: float,
        tags: dict[str, str] | None = None
    ) -> None:
        """Record a duration (converted to a histogram)."""
    
    def histogram(
        self,
        name: str,
        value: float,
        tags: dict[str, str] | None = None
    ) -> None:
        """Record a distribution of values."""

Usage:

from structum_lab.monitoring import get_metrics

metrics = get_metrics("myapp")

# Counter: only increases
metrics.increment("requests.total", tags={"method": "GET", "status": "200"})

# Gauge: can go up or down
metrics.gauge("connections.active", 42)

# Timing: duration of an operation
metrics.timing("request.duration", 0.123, tags={"endpoint": "/api/users"})

# Histogram: distribution of values
metrics.histogram("payload.size_bytes", 1024)
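Because application code depends only on the four methods of MetricsInterface, a test can swap in a recorder that keeps values in memory. The class below is a hypothetical test double, not something shipped by structum_lab. It is a sketch of how the contract could be satisfied without any backend.

```python
from collections import defaultdict
from typing import Optional


def _key(name: str, tags: Optional[dict[str, str]]) -> tuple:
    """Build a hashable key from a metric name and its (sorted) tags."""
    return (name, tuple(sorted((tags or {}).items())))


class InMemoryMetrics:
    """Records calls instead of exporting them (illustrative only)."""

    def __init__(self) -> None:
        self.counters = defaultdict(int)
        self.gauges = {}
        self.observations = defaultdict(list)

    def increment(self, name: str, value: int = 1,
                  tags: Optional[dict[str, str]] = None) -> None:
        self.counters[_key(name, tags)] += value

    def gauge(self, name: str, value: float,
              tags: Optional[dict[str, str]] = None) -> None:
        self.gauges[_key(name, tags)] = value  # last write wins

    def histogram(self, name: str, value: float,
                  tags: Optional[dict[str, str]] = None) -> None:
        self.observations[_key(name, tags)].append(value)

    def timing(self, name: str, value: float,
               tags: Optional[dict[str, str]] = None) -> None:
        # Durations are stored as plain observations, like a histogram.
        self.histogram(name, value, tags)
```

Sorting the tags when building the key means that the same tag set passed in a different order still lands on the same series.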

3.2 No-Op Mode (Default)

Activation: Automatic when prometheus-client is not installed.

Behavior:

  • All methods are no-ops (they do nothing)

  • No errors are raised

  • ⚠️ No warning is emitted that metrics are being discarded

How to detect it:

from structum_lab.monitoring import get_metrics_backend

backend = get_metrics_backend()
if backend == "noop":
    print("⚠️ WARNING: Metrics are disabled (No-Op mode)")

3.3 Prometheus Mode (Plugin)

Activation: Automatic when structum-observability is installed.

Registry: Uses the global prometheus_client.REGISTRY by default.

Metric conversion:

| API Call    | Prometheus Type | Notes                        |
|-------------|-----------------|------------------------------|
| increment() | Counter         | Only increases               |
| gauge()     | Gauge           | Can go up or down            |
| timing()    | Histogram       | With auto-configured buckets |
| histogram() | Histogram       | Custom distribution          |

Namespace Prefix:

metrics = get_metrics("myapp")
metrics.increment("requests.total")

# Prometheus metric name:
# myapp_requests_total
#
# Pattern: {namespace}_{metric_name}

Tags → Labels:

metrics.increment(
    "requests.total",
    tags={"method": "GET", "status": "200"}
)

# Prometheus:
# myapp_requests_total{method="GET", status="200"} 1
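The naming rule above can be sketched as a small pure function. This is an assumption about how the mapping might work, based only on the examples in this section (dots become underscores, the namespace is prefixed). The helper names prometheus_name and format_sample are hypothetical, and label-value escaping is left out for brevity.

```python
import re


def prometheus_name(namespace: str, metric: str) -> str:
    """Join namespace and metric, replacing characters outside [a-zA-Z0-9_:]."""
    raw = f"{namespace}_{metric}"
    # Classic Prometheus metric names allow only letters, digits, "_" and ":".
    return re.sub(r"[^a-zA-Z0-9_:]", "_", raw)


def format_sample(name: str, tags: dict[str, str], value: float) -> str:
    """Render one sample in a style close to the text exposition format."""
    labels = ",".join(f'{k}="{v}"' for k, v in sorted(tags.items()))
    return f"{name}{{{labels}}} {value}" if labels else f"{name} {value}"


name = prometheus_name("myapp", "requests.total")
sample = format_sample(name, {"status": "200", "method": "GET"}, 1)
```

A dotted namespace such as "myapp.api" is flattened the same way, which is consistent with the myapp_api_requests_total example in the best practices below.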

3.4 Structum Built-in Metrics

Configuration Provider (structum.config)

If you use DynaconfConfigProvider, you automatically get:

# Counter: config operations
structum_config_operations_total{
    operation="get|set|has",
    status="success|error",
    cache="hit|miss"
}

# Histogram: operation latency
structum_config_operation_duration_seconds_bucket{
    operation="get|set|has"
}

# Gauge: cache size
structum_config_cache_size

# Gauge: cache hit rate (0.0-1.0)
structum_config_cache_hit_rate

Example Prometheus queries:

# Operations per second
rate(structum_config_operations_total[5m])

# P99 latency (bucket rates summed per le before taking the quantile)
histogram_quantile(
    0.99,
    sum by (le) (rate(structum_config_operation_duration_seconds_bucket[5m]))
)

# Cache effectiveness
structum_config_cache_hit_rate > 0.8

3.5 Best Practices

✅ Naming Convention:

# Pattern: namespace.component.metric_name
metrics = get_metrics("myapp.api")
metrics.increment("requests.total")  # myapp_api_requests_total

# ❌ Avoid redundant prefixes
metrics.increment("myapp_api_requests_total")  # becomes myapp_api_myapp_api_...

✅ Tags vs Metric Names:

# ✅ Use tags for dimensions
metrics.increment("requests.total", tags={"method": "GET", "status": "200"})
metrics.increment("requests.total", tags={"method": "POST", "status": "201"})

# ❌ Don't create separate metrics
metrics.increment("requests.get.200")
metrics.increment("requests.post.201")

✅ Cardinality Control:

# ✅ Low cardinality (controlled)
tags = {"method": "GET", "status": "200"}  # ~10 methods * ~10 status = 100 series

# ❌ High cardinality (combinatorial explosion)
tags = {"user_id": user_id, "request_id": req_id}  # Millions of time series!
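The arithmetic behind the comments above is a product: the worst-case number of time series for one metric is the number of distinct values of each label multiplied together. A minimal sketch follows. The function name and the figures for user_id and request_id are made up for illustration.

```python
from math import prod


def series_upper_bound(distinct_values: dict[str, int]) -> int:
    """Upper bound on time series: product of distinct values per label."""
    return prod(distinct_values.values())


# Bounded label sets stay small.
low = series_upper_bound({"method": 10, "status": 10})

# Unbounded identifiers multiply into an unmanageable number of series.
high = series_upper_bound({"user_id": 1_000_000, "request_id": 50_000_000})
```

Running an estimate like this before adding a new tag is a cheap way to spot a label that should be a log field instead.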

✅ Counter vs Gauge:

# ✅ Counter: monotonically increasing values
metrics.increment("requests_total")      # Only increases
metrics.increment("errors_total")        # Only increases

# ✅ Gauge: values that fluctuate
metrics.gauge("connections_active", 42)  # Can go up or down
metrics.gauge("queue_size", 100)         # Can go up or down

# ❌ Mistake: gauge used as a counter
metrics.gauge("requests_total", total_requests)  # Loses rate information

4. Context Propagation

4.1 What Is Context?

Context is a set of key-value pairs that propagates automatically through:

  • Function calls

  • Threads (with limitations)

  • Log events

  • Metrics tags (optional)

Example:

HTTP Request → request_id="abc123", user="alice"
    ↓
Service Layer
    ↓ (context propagates)
Database Call
    ↓ (context present in logs)
Log: "Query executed" + request_id="abc123" + user="alice"

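4.2 Implementation (contextvars)

Structum uses contextvars (Python stdlib, 3.7+) for context propagation.

⚠️ Important: context variables are async-safe, because each asyncio task runs in its own copy of the context. A new thread, however, starts with an empty context, so values do not carry over to other threads automatically.

from contextvars import ContextVar

# Define the context vars (None until a value is set)
request_id_var: ContextVar[str | None] = ContextVar("request_id", default=None)
user_var: ContextVar[str | None] = ContextVar("user", default=None)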
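To make the mechanism concrete, here is a minimal sketch of how helpers with the same names as set_context, get_context, and clear_context could be built on a single ContextVar. This is an assumed design for illustration, not the code in structum_lab.logging. It stores one dictionary and always replaces it rather than mutating it, so copies of the context taken by tasks stay independent.

```python
from contextvars import ContextVar
from typing import Any, Optional

# One variable holding the whole mapping. The default dict is never mutated.
_context: ContextVar[dict[str, Any]] = ContextVar("log_context", default={})


def set_context(**fields: Any) -> None:
    """Merge fields into the current context by storing a new dict."""
    _context.set({**_context.get(), **fields})


def get_context() -> dict[str, Any]:
    """Return a copy so callers cannot mutate the stored mapping."""
    return dict(_context.get())


def clear_context(key: Optional[str] = None) -> None:
    """Remove one key, or everything when no key is given."""
    if key is None:
        _context.set({})
    else:
        _context.set({k: v for k, v in _context.get().items() if k != key})
```

Replacing the dictionary on every change is what keeps one request's fields from showing up in another's logs when both share the same process.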

4.3 Setting Context

Manually:

from structum_lab.logging import set_context, get_logger

log = get_logger(__name__)

# Set context for this request
set_context(request_id="req-abc123", user="alice")

# All subsequent logs will include these values
log.info("Processing order")
# {... "request_id": "req-abc123", "user": "alice", ...}

log.info("Order completed", order_id=456)
# {... "request_id": "req-abc123", "user": "alice", "order_id": 456, ...}

With a context manager:

from structum_lab.logging import bind_context, get_logger

log = get_logger(__name__)

with bind_context(request_id="req-xyz", user="bob"):
    log.info("Inside context")  # Has request_id and user

log.info("Outside context")  # request_id and user are absent

With a decorator:

from structum_lab.logging import get_logger, with_context

log = get_logger(__name__)

@with_context(user="system")
def background_job():
    log.info("Job started")  # Automatically has user="system"

4.4 Reading Context

from structum_lab.logging import get_context

context = get_context()
print(context)  # {"request_id": "...", "user": "..."}

# Access a single value
request_id = context.get("request_id")

4.5 Clearing Context

from structum_lab.logging import clear_context

clear_context()  # Removes the entire context
clear_context("request_id")  # Removes only request_id

4.6 Thread Safety

✅ Async-Safe:

import asyncio
from structum_lab.logging import set_context, get_logger

log = get_logger(__name__)

async def handler():
    set_context(request_id="req-123")
    log.info("Async handler")  # Context OK
    
    await other_async_func()  # Context propagates
    
asyncio.run(handler())
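The async-safety claim can be checked with the standard library alone. The sketch below uses a bare ContextVar rather than Structum's helpers and runs three handlers concurrently. Each one sets its own request_id, awaits another coroutine, and still reads back its own value.

```python
import asyncio
from contextvars import ContextVar
from typing import Optional

request_id_var: ContextVar[Optional[str]] = ContextVar("request_id", default=None)


async def inner() -> Optional[str]:
    """Awaited coroutines run in the caller's context and see its values."""
    await asyncio.sleep(0)  # yield to the event loop so the tasks interleave
    return request_id_var.get()


async def handler(request_id: str) -> Optional[str]:
    request_id_var.set(request_id)
    return await inner()


async def main() -> list:
    # gather wraps each coroutine in a Task, and each Task gets a context copy.
    return await asyncio.gather(handler("req-1"), handler("req-2"), handler("req-3"))


results = asyncio.run(main())
leaked = request_id_var.get()  # values set inside the tasks do not leak out
```

The same isolation is why a value set inside a task is invisible to the code that started the event loop.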

⚠️ Per-thread (does not propagate across threads):

import threading
from structum_lab.logging import set_context, get_logger

log = get_logger(__name__)

def worker():
    log.info("Worker thread")  # ❌ Context LOST (different thread)

set_context(request_id="req-456")
log.info("Main thread")  # ✅ Context OK

thread = threading.Thread(target=worker)
thread.start()

Solution for threading:

import threading
from contextvars import copy_context
from structum_lab.logging import set_context, get_logger

log = get_logger(__name__)

def worker():
    log.info("Worker")  # ✅ Context present

set_context(request_id="req-789")

# Run the worker inside a copy of the current context
ctx = copy_context()
thread = threading.Thread(target=ctx.run, args=(worker,))
thread.start()
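The difference between the two threading variants can be observed side by side with a bare ContextVar. This sketch does not use Structum's helpers. It records what each thread sees, assuming a standard CPython build where new threads start with an empty context.

```python
import threading
from contextvars import ContextVar, copy_context
from typing import Optional

request_id_var: ContextVar[Optional[str]] = ContextVar("request_id", default=None)
seen: dict[str, Optional[str]] = {}


def worker(label: str) -> None:
    """Record the request_id visible from inside the thread."""
    seen[label] = request_id_var.get()


request_id_var.set("req-789")

# Plain thread: it does not inherit the caller's context.
plain = threading.Thread(target=worker, args=("plain",))

# Copied context: ctx.run executes worker inside a snapshot of the caller's context.
ctx = copy_context()
copied = threading.Thread(target=ctx.run, args=(worker, "copied"))

for thread in (plain, copied):
    thread.start()
for thread in (plain, copied):
    thread.join()
```

Note that copy_context takes a snapshot: changes made in the main thread after the copy are not visible to the worker, and changes made by the worker do not flow back.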

5. Plugin Observability (Enterprise)

5.1 Installation

pip install structum-observability

Included dependencies:

  • structlog>=23.1.0 (logging)

  • prometheus-client>=0.18.0 (metrics)

  • python-json-logger>=2.0.0 (JSON formatting)

5.2 Automatic Activation

The plugin activates automatically on import:

# main.py
from structum_lab.logging import get_logger, configure_logging
from structum_lab.monitoring import get_metrics

# ✅ If structum-observability is installed, Enterprise mode is used
# ❌ Otherwise, Fallback/No-Op mode is used

log = get_logger(__name__)
metrics = get_metrics("myapp")

⚠️ Note: An explicit import of structum_lab.plugins.observability is not required.

5.3 Enterprise Configuration

from structum_lab.logging import configure_logging

configure_logging(
    level="INFO",
    format="json",
    processors=[
        "timestamp",
        "add_log_level",
        "add_logger_name",
        "contextvars",           # ← Context injection
        "stack_info",
        "exception_formatter",
        "json_renderer"
    ],
    context_vars=[               # ← Declare which context vars to track
        "request_id",
        "user",
        "correlation_id",
        "tenant_id"
    ]
)

5.4 Decorator: track_operation

Automatically emits logs and metrics for each operation:

from structum_lab.plugins.observability import track_operation

@track_operation("process_order")
def process_order(order_id: int):
    # Logic here
    return {"status": "completed"}

# Automatically:
# 1. Log "Starting process_order" (with timestamp)
# 2. Metrics: myapp_process_order_total{status="success"}
# 3. Metrics: myapp_process_order_duration_seconds (histogram)
# 4. Log "Completed process_order" (with duration)

With context:

@track_operation("checkout", include_args=["user_id"])
def checkout(user_id: int, cart_id: int):
    # ...
    pass

# Log includes: user_id=123 (cart_id excluded)
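For readers curious how a decorator like this can work, the following is a rough stdlib-only re-creation. It is a sketch under stated assumptions, not the plugin's implementation: a Counter and a dict stand in for the Prometheus counter and histogram, and only the name and include_args parameters are modelled.

```python
import functools
import inspect
import logging
import time
from collections import Counter
from typing import Any, Callable, Optional

log = logging.getLogger("track_demo")
calls: Counter = Counter()                # stand-in for a metrics counter
durations: dict[str, list[float]] = {}    # stand-in for a histogram


def track_operation(name: str, include_args: Optional[list[str]] = None) -> Callable:
    """Hypothetical re-creation: count, time, and log each call."""
    def decorator(func: Callable) -> Callable:
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # Resolve positional and keyword arguments to parameter names.
            bound = signature.bind(*args, **kwargs).arguments
            fields = {k: bound[k] for k in (include_args or []) if k in bound}
            log.info("Starting %s %s", name, fields)
            start = time.perf_counter()
            status = "error"
            try:
                result = func(*args, **kwargs)
                status = "success"
                return result
            finally:
                elapsed = time.perf_counter() - start
                calls[(name, status)] += 1
                durations.setdefault(name, []).append(elapsed)
                log.info("Completed %s status=%s duration=%.6f", name, status, elapsed)
        return wrapper
    return decorator


@track_operation("checkout", include_args=["user_id"])
def checkout(user_id: int, cart_id: int) -> str:
    if cart_id < 0:
        raise ValueError("invalid cart")
    return "ok"
```

Recording the outcome in a finally clause means failed calls are counted and timed too, with status "error", while the exception still propagates to the caller.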

5.5 Tracing Support (Future)

⚠️ Alpha: OpenTelemetry tracing support is on the roadmap.

# Future API (not yet available)
from structum_lab.plugins.observability import trace

@trace(span_name="database_query")
def query_user(user_id):
    # Automatically creates span + exports to OTLP
    pass

6. Setup Guide

6.1 Quick Start (Fallback Mode)

For: CLI tools, scripts, rapid prototyping.

# No extra installation needed
pip install structum-lab

# app.py
from structum_lab.logging import configure_logging, get_logger

# Setup (optional, the defaults are fine)
configure_logging(level="INFO")

log = get_logger(__name__)
log.info("Application started")

Output:

2025-01-11 10:45:00 INFO     __main__: Application started

6.2 Upgrade to Enterprise Mode

# Install the plugin
pip install structum-observability

# app.py (same imports and calls; only the format argument changes)
from structum_lab.logging import configure_logging, get_logger

configure_logging(
    level="INFO",
    format="json"  # ← Now supported
)

log = get_logger(__name__)
log.info("Application started")

Output:

{"timestamp": "2025-01-11T10:45:00.123Z", "level": "info", "logger": "__main__", "message": "Application started"}

6.3 Production Setup (Complete)

# config/observability.py
from structum_lab.logging import configure_logging
from structum_lab.monitoring import configure_metrics
import os

def setup_observability():
    """Setup production-grade observability."""
    
    # Environment
    env = os.getenv("ENV", "development")
    log_level = os.getenv("LOG_LEVEL", "INFO")
    
    # Logging
    configure_logging(
        level=log_level,
        format="json" if env == "production" else "console",
        processors=[
            "timestamp",
            "add_log_level",
            "add_logger_name",
            "contextvars",
            "stack_info",
            "exception_formatter",
            "json_renderer" if env == "production" else "console_renderer"
        ],
        context_vars=["request_id", "user", "tenant_id"]
    )
    
    # Metrics (if the plugin is available)
    try:
        configure_metrics(
            namespace="myapp",
            enable_default_metrics=True,  # Process/Runtime metrics
            histogram_buckets=[0.001, 0.01, 0.1, 0.5, 1.0, 5.0, 10.0]
        )
    except ImportError:
        print("⚠️ Metrics plugin not available (No-Op mode)")

# main.py
from config.observability import setup_observability

if __name__ == "__main__":
    setup_observability()
    
    # App code
    from structum_lab.logging import get_logger
    log = get_logger(__name__)
    log.info("Application initialized")

7. Web Framework Integration

7.1 Flask

# app.py
from flask import Flask, request, g
from structum_lab.logging import get_logger, set_context
from structum_lab.monitoring import get_metrics
import uuid
import time

app = Flask(__name__)
log = get_logger(__name__)
metrics = get_metrics("myapp.api")

@app.before_request
def before_request():
    """Inject per-request context."""
    # Reuse the incoming request ID or generate one
    request_id = request.headers.get("X-Request-ID", str(uuid.uuid4()))
    
    # Set context (propagates to every log)
    set_context(
        request_id=request_id,
        method=request.method,
        path=request.path
    )
    
    # Store the start time for latency
    g.start_time = time.perf_counter()
    
    log.info("Request started")

@app.after_request
def after_request(response):
    """Emit metrics after the request."""
    # Latency
    duration = time.perf_counter() - g.start_time
    
    # Tag values are strings, so the status code is converted
    status = str(response.status_code)
    
    metrics.timing(
        "request.duration",
        duration,
        tags={
            "method": request.method,
            "endpoint": request.endpoint or "unknown",
            "status": status
        }
    )
    
    # Counter
    metrics.increment(
        "requests.total",
        tags={
            "method": request.method,
            "status": status
        }
    )
    
    log.info("Request completed", status=response.status_code, duration=duration)
    
    return response

@app.route("/users/<int:user_id>")
def get_user(user_id):
    log.info("Fetching user", user_id=user_id)
    # ... business logic ...
    return {"user_id": user_id, "name": "Alice"}

@app.route("/metrics")
def metrics_endpoint():
    """Expose Prometheus metrics."""
    from prometheus_client import generate_latest, CONTENT_TYPE_LATEST
    from flask import Response
    
    return Response(generate_latest(), mimetype=CONTENT_TYPE_LATEST)

if __name__ == "__main__":
    from config.observability import setup_observability
    setup_observability()
    
    app.run()

7.2 FastAPI

# app.py
from fastapi import FastAPI, Request
from structum_lab.logging import get_logger, set_context, clear_context
from structum_lab.monitoring import get_metrics
import uuid
import time

app = FastAPI()
log = get_logger(__name__)
metrics = get_metrics("myapp.api")

@app.middleware("http")
async def observability_middleware(request: Request, call_next):
    """Inject context + metrics per request."""
    # Setup context
    request_id = request.headers.get("x-request-id", str(uuid.uuid4()))
    set_context(
        request_id=request_id,
        method=request.method,
        path=request.url.path
    )
    
    log.info("Request started")
    
    # Track timing
    start = time.perf_counter()
    
    try:
        response = await call_next(request)
        duration = time.perf_counter() - start
        
        # Metrics
        metrics.timing(
            "request.duration",
            duration,
            tags={
                "method": request.method,
                # Raw paths can be high-cardinality (see 3.5)
                "path": request.url.path,
                "status": str(response.status_code)
            }
        )
        
        metrics.increment(
            "requests.total",
            tags={"method": request.method, "status": str(response.status_code)}
        )
        
        log.info("Request completed", status=response.status_code, duration=duration)
        
        return response
        
    finally:
        # Cleanup context
        clear_context()

@app.get("/users/{user_id}")
async def get_user(user_id: int):
    log.info("Fetching user", user_id=user_id)
    return {"user_id": user_id, "name": "Bob"}

@app.get("/metrics")
async def metrics_endpoint():
    """Prometheus metrics endpoint."""
    from prometheus_client import generate_latest, CONTENT_TYPE_LATEST
    from fastapi.responses import Response
    
    return Response(content=generate_latest(), media_type=CONTENT_TYPE_LATEST)

if __name__ == "__main__":
    import uvicorn
    from config.observability import setup_observability
    
    setup_observability()
    uvicorn.run(app, host="0.0.0.0", port=8000)

7.3 Django (Middleware)

# middleware.py
from structum_lab.logging import set_context, clear_context, get_logger
from structum_lab.monitoring import get_metrics
import uuid
import time

log = get_logger(__name__)
metrics = get_metrics("myapp.api")

class ObservabilityMiddleware:
    """Django middleware for logging + metrics."""
    
    def __init__(self, get_response):
        self.get_response = get_response
    
    def __call__(self, request):
        # Setup context
        request_id = request.META.get("HTTP_X_REQUEST_ID", str(uuid.uuid4()))
        set_context(
            request_id=request_id,
            method=request.method,
            path=request.path
        )
        
        log.info("Request started")
        start = time.perf_counter()
        
        try:
            response = self.get_response(request)
            duration = time.perf_counter() - start
            
            # Metrics
            metrics.timing(
                "request.duration",
                duration,
                tags={
                    "method": request.method,
                    "status": str(response.status_code)
                }
            )
            
            metrics.increment(
                "requests.total",
                tags={"method": request.method, "status": str(response.status_code)}
            )
            
            log.info("Request completed", status=response.status_code, duration=duration)
            
            return response
            
        finally:
            clear_context()

# settings.py
MIDDLEWARE = [
    # ... other middleware ...
    "myapp.middleware.ObservabilityMiddleware",
]

8. Best Practices

8.1 Logging Best Practices

✅ Structured Data over String Formatting:

# ❌ String formatting (loses structure)
log.info(f"User {user_id} logged in from {ip}")

# ✅ Structured kwargs
log.info("User logged in", user_id=user_id, ip_address=ip)

✅ Log Levels:

# DEBUG: Development information, verbose
log.debug("Cache lookup", key="user:123", found=True)

# INFO: Significant business events
log.info("Order placed", order_id=456, user_id=123)

# WARNING: Abnormal but handled situations
log.warning("Rate limit approached", user_id=789, current_rate=95)

# ERROR: Errors that need attention
log.error("Payment failed", order_id=456, error="Insufficient funds")

# CRITICAL: Catastrophic failures
log.critical("Database connection lost", attempts=3)

✅ Exception Logging:

try:
    risky_operation()
except Exception:
    log.error("Operation failed", exc_info=True)  # Includes the traceback
    # In Enterprise mode the exception is formatted automatically

❌ Avoid Log Spam:

# ❌ Logging inside a loop
for item in items:
    log.info("Processing item", item_id=item.id)  # Too verbose!

# ✅ Aggregated logging
log.info("Processing batch", item_count=len(items))
# ... process ...
log.info("Batch completed", processed=len(items), failed=failed_count)

8.2 Metrics Best Practices

✅ Counters for Events:

# ✅ Use a counter to count events
metrics.increment("orders.placed")
metrics.increment("emails.sent", tags={"type": "welcome"})

✅ Gauges for State:

# ✅ Use a gauge for instantaneous values
metrics.gauge("queue.depth", current_queue_size)
metrics.gauge("connections.active", active_connections)

✅ Histograms for Distributions:

# ✅ Use a histogram for latency/size
metrics.histogram("request.duration", duration_seconds)
metrics.histogram("response.size_bytes", response_size)

❌ Don't Overuse Tags:

# ❌ High cardinality (millions of time series)
metrics.increment("views", tags={"user_id": user_id, "page_url": url})

# ✅ Low cardinality (controlled)
metrics.increment("views", tags={"page_type": "article", "category": "tech"})

8.3 Context Best Practices

✅ Set Context on Entry:

# ✅ Set up context as early as possible
@app.before_request
def setup_context():
    set_context(request_id=generate_id(), user=get_current_user())

✅ Clear Context on Exit:

# ✅ Clean up to avoid leaks
@app.after_request
def cleanup_context(response):
    clear_context()
    return response

❌ Don't Mutate Context Midway:

# ❌ Confusing: context changes halfway through the request
def process():
    log.info("Start")  # user="alice"
    set_context(user="bob")
    log.info("End")  # user="bob" ← Inconsistent!

# ✅ Use a nested context if needed
with bind_context(subprocess="worker"):
    log.info("Subprocess work")  # user="alice", subprocess="worker"

9. Troubleshooting

9.1 “Logging doesn't work”

Symptom: log.info() prints nothing.

Checklist:

  1. Is the log level too high?

    configure_logging(level="DEBUG")  # Lower the level
    
  2. Is a handler configured?

    import logging
    logging.basicConfig()  # Fallback stdlib
    
  3. Is stdout buffered?

    python -u app.py  # Unbuffered output
    # or
    export PYTHONUNBUFFERED=1
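
In Fallback mode the first two checklist items can be automated with the stdlib logging API. The helper below is a hypothetical diagnostic, not part of structum_lab. It reports why a record at a given level would not reach any handler.

```python
import logging


def diagnose_logger(name: str, level: int = logging.INFO) -> list[str]:
    """Return reasons why a record at `level` would not be emitted."""
    logger = logging.getLogger(name)
    problems = []
    if not logger.isEnabledFor(level):
        effective = logging.getLevelName(logger.getEffectiveLevel())
        problems.append(f"effective level is {effective}")
    # With no handler anywhere, stdlib only prints WARNING and above
    # through its last-resort handler, so INFO and DEBUG are dropped.
    if not logger.hasHandlers():
        problems.append("no handler on this logger or its ancestors")
    return problems
```

An empty list means the stdlib side is configured to emit the record, and the remaining suspect is output buffering.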
    

9.2 “kwargs don't appear in the logs”

Cause: You are in Fallback mode, where kwargs end up in extra.

Check the mode:

from structum_lab.logging import get_logger_backend
print(get_logger_backend())  # "stdlib" or "structlog"?

Fix: Install the enterprise plugin:

pip install structum-observability

9.3 “Metrics are not being collected”

Symptom: the /metrics endpoint is empty or metrics are missing.

Checklist:

  1. Is the backend active?

    from structum_lab.monitoring import get_metrics_backend
    print(get_metrics_backend())  # "noop" or "prometheus"?
    
  2. Is the namespace correct?

    metrics = get_metrics("myapp")
    metrics.increment("test")
    
    # Look in Prometheus for: myapp_test
    
  3. Is the registry shared?

    from prometheus_client import REGISTRY
    print([metric.name for metric in REGISTRY.collect()])  # Lists registered metric families
    

9.4 “Context doesn't propagate”

Cause: Threading without copy_context().

Fix:

from contextvars import copy_context
import threading

def worker():
    log.info("Worker")  # Context present

set_context(request_id="123")

ctx = copy_context()
thread = threading.Thread(target=ctx.run, args=(worker,))
thread.start()

9.5 “Plugin doesn't activate”

Symptom: structum-observability is installed but Fallback is still used.

Cause: The structum_lab modules were imported before the plugin was installed, for example in a long-running server, REPL, or notebook.

Fix: Restart the Python process:

# ❌ Installing while the process keeps running is not enough
pip install structum-observability
# (the running process still uses Fallback)

# ✅ Restart required
pip install structum-observability
# Stop Python
# Restart
python app.py  # Now uses Enterprise mode

10. API Reference

10.1 Logging API

# Get logger
from structum_lab.logging import get_logger
log = get_logger(name: str) -> LoggerInterface

# Configure
from structum_lab.logging import configure_logging
configure_logging(
    level: str = "INFO",
    format: str = "text",  # "text", "json", "console"
    processors: list[str] | None = None,
    handlers: list[str] = ["console"],
    capture_warnings: bool = True,
    context_vars: list[str] | None = None
) -> None

# Context management
from structum_lab.logging import (
    set_context,
    get_context,
    clear_context,
    bind_context,
    with_context
)

set_context(**kwargs) -> None
get_context() -> dict[str, Any]
clear_context(key: str | None = None) -> None

with bind_context(**kwargs): ...  # Context manager
@with_context(**kwargs)           # Decorator

# Backend detection
from structum_lab.logging import get_logger_backend
get_logger_backend() -> str  # "stdlib" | "structlog"

10.2 Metrics API

# Get metrics emitter
from structum_lab.monitoring import get_metrics
metrics = get_metrics(namespace: str) -> MetricsInterface

# Configure
from structum_lab.monitoring import configure_metrics
configure_metrics(
    namespace: str,
    enable_default_metrics: bool = True,
    histogram_buckets: list[float] | None = None,
    registry: Any | None = None  # Prometheus Registry
) -> None

# Backend detection
from structum_lab.monitoring import get_metrics_backend
get_metrics_backend() -> str  # "noop" | "prometheus"

10.3 Plugin API

# Decorator
from structum_lab.plugins.observability import track_operation

@track_operation(
    operation_name: str,
    include_args: list[str] | None = None,
    include_result: bool = False,
    metric_tags: dict[str, str] | None = None
)

Changelog

v0.1.0 (Alpha)

  • 📝 Unified documentation (logging + metrics + context)

  • ✨ Complete examples for Flask/FastAPI/Django

  • 🐛 Clarified mode detection and plugin activation

  • 📊 Best practices and troubleshooting guide


References


Maintainer: Structum Observability Team
Repository: https://github.com/structum-lab/structum
Issues: https://github.com/structum-lab/structum/issues