MODULO 7.5

💰 Cost Analytics

6
Topicos
~60
Minutos
Deep
Nivel
Source
Tipo
1

📊 Two-Pipeline Architecture

Two separate observability stacks in parallel: 1P pipeline using OpenTelemetry with proto-serialized events to /api/event_logging/batch, and Datadog HTTP-intake for operational monitoring. Events queue before sink attaches using queueMicrotask drain.

💡 Independence

1P and Datadog pipelines completely independent. A killswitch can silence one without affecting the other.

2

💵 Cost Tracking

addToTotalSessionCost() fans out to three destinations: in-memory accumulators, OTel counters, and analytics event log. Cost stored as microdollars (integer) in analytics, raw float for OTel counters.

💡 Session Persistence

saveCurrentSessionCosts() writes snapshot to project config JSON on exit. restoreCostStateForSession() rehydrates only when sessionId matches.

3

📨 Analytics Sink & Queue

Zero-dependency analytics module. Three functions: logEvent, logEventAsync, attachAnalyticsSink. Event queue drains via queueMicrotask when sink attaches. Sink routes through sampling gate, then fans out to Datadog (stripped _PROTO_* keys) and 1P (full payload).

💡 Sink Routing

shouldSampleEvent(): null = log 100%, 0 = drop, 0-1 = log with sample_rate metadata for downstream correction.

4

📱 Datadog Pipeline

Allowlist

~40 events in DATADOG_ALLOWED_EVENTS. Others silently dropped.

HTTP Batch

15s timer or 100 entries, whichever first.

SHA-256 Buckets

User ID hashed into 1 of 30 slots for privacy-preserving unique user counts.

First-Party Only

Skipped for Bedrock, Vertex, Foundry providers.

5

💾 First-Party Pipeline

Built on OpenTelemetry Logs SDK. Dedicated LoggerProvider (NEVER registered globally). BatchLogRecordProcessor buffers then exports to /api/event_logging/batch.

💡 Disk-Backed Retry

Failed events appended to per-session JSONL file. Retried with quadratic backoff (500ms, 2s, 4.5s, 8s...). Max 8 attempts. Survives process crashes.

6

🔒 PII Segregation

TypeScript never-typed markers force explicit 'as' cast at every logging call - visible code-review signal. _PROTO_* key convention: stripped before Datadog, hoisted to dedicated proto fields in 1P exporter.

💡 Provider Isolation

1P LoggerProvider is module-local, never registered globally. Customer OTLP uses global provider. Complete isolation.

📋 Resumo do Modulo

Every token usage flows through addToTotalSessionCost() fanning out to three destinations simultaneously.
Analytics sink is dependency-free with queue-then-drain pattern - no events lost during startup.
Datadog and 1P receive different payloads: Datadog gets _PROTO_* stripped; 1P gets full payload.
never-typed markers create compile-time paper trail - explicit cast required for every logging call.
GrowthBook controls three dimensions: Datadog gate, per-event sampling, and per-sink killswitch.
Failed events durably persisted as JSONL on disk with quadratic backoff retry.
Voltar Proximo