Commit Graph

2 Commits

Author SHA1 Message Date
vincent 0894a86af8 ADR-006 v2.1: final revision, NVIDIA provider keys, reply to 徐聪
v2.1 changes from 2nd-round review:
+ Emergency channel RPM: max(1, max_rpm * 0.1)
+ Queue 503: add Retry-After: 30 header
+ sidecar_backup_success Prometheus metric
+ Startup crypto.py key validation on boot
+ SQLite size limits: 100MB practical, 500MB WAL
+ RPM flow: per-request counting, not token-based
+ SSE streaming: TTFT for avg_latency_ms
+ Merge proxy/retry.py into core/cooldown.py

Added sidecar-v2-nvidia-providers.yaml (11 keys)

Co-authored-by: multica-agent <github@multica.ai>
2026-06-25 15:19:21 +08:00
vincent 82edded30c ADR-006 v2.0: Sidecar V2 architecture revision based on review feedback
Incorporated feedback from 4 reviewers:
- 徐聪: AES key management, emergency channel, concurrency control, DDL indexes
- 陆怀瑾: P0 phase, schedule buffer, deployment topology, V1 compat checklist
- 严维序: SQLite backup, monitoring, cooldown persistence, port plan, rollback
- 沈路明: queue design, health check, per-model RPM decision, key validation, dashboard panels

Key additions:
+ Queue flow control design (FIFO + priority, capacity 500, REJECT overflow)
+ Provider health check (active probe + passive stats hybrid)
+ Per-model RPM decision (Provider-level V2, Model-level V3)
+ Key validation on add (test call with error feedback)
+ AES key management (SIDECAR_ENCRYPTION_KEY env var, backup SOP)
+ Emergency channel (10% RPM during full cooldown)
+ SQLite backup strategy (cron .backup, 7-day retention)
+ SQLite monitoring Prometheus metrics (db_size, wal_size, integrity)
+ Full DDL with indexes (ON CONFLICT, BEGIN IMMEDIATE patterns)
+ Dashboard panel list (5 panels: status, trends, history)
+ V1 compatibility checklist (13 items)
+ V1->V2 migration SOP with rollback plan
+ Deployment topology (systemd + Docker, port plan, firewall)
+ Log aggregation policy (logrotate: 10MB/30days)
+ Schedule revised: 71h/12days (added P0 + buffer)

Co-authored-by: multica-agent <github@multica.ai>
2026-06-25 14:52:39 +08:00