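BIZ-46 Phase3: complete 7 follow-up items

1. Architecture decoupling: SidecarContext + FastAPI Depends injection
   - Add context.py: the SidecarContext dataclass consolidates all global state
   - server.py: remove module-level globals; lifespan creates ctx → app.state.sidecar
   - webui.py: remove the reverse import of server; use Depends(get_context) instead
2. Prometheus label cardinality control: model_id → provider
   - upstream_latency_seconds / upstream_errors_total labels reduced to provider
   - Model-level detail stays in the structlog JSON logs
3. Shared SSE snapshot cache
   - Shared snapshot cache with 1s TTL + double-check locking
   - Multiple clients no longer rebuild the snapshot separately
4. Deployment support
   - Dockerfile (python:3.12-slim, non-root user, HEALTHCHECK)
   - systemd service (security hardening, resource limits)
   - .env.example (full list of environment variables)
5. Readiness HTTP client reuse
   - check_upstream() receives the main http_client instead of creating a new client on every call
6. Retreat concurrency regression tests
   - All 5 test cases pass (deadlock detection + state transitions + concurrency safety)
7. Dashboard UX improvements
   - Queue bar chart: 300ms smooth animation
   - SSE disconnect: 5s translucent overlay
   - Queue chart title shows the total queued count
   - Configuration is synced on page load

Verification: mypy strict passes (0 errors), pytest 5/5 pass, server imports cleanly (13 routes)

Co-authored-by: multica-agent <github@multica.ai>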
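As an illustration of the label change in item 2, here is a minimal sketch of collapsing a model identifier to a provider label. The commit does not show how the provider is derived, so the `vendor/model-name` convention, the `provider_label` helper, and the sample model IDs below are all assumptions made for the example.

```python
from collections import Counter


def provider_label(model_id: str) -> str:
    """Collapse 'vendor/model-name' to 'vendor' (hypothetical convention).

    IDs without a vendor prefix fall into a single 'unknown' bucket so the
    label set stays bounded no matter how many models are served.
    """
    provider, sep, _ = model_id.partition("/")
    return provider if sep and provider else "unknown"


# Sample model IDs are invented for the example.
model_ids = [
    "nvidia/llama-3.1-70b",
    "nvidia/nemotron-4-340b",
    "meta/llama-3-8b",
    "local-model",
]

# Four distinct model IDs collapse to three label values.
errors_by_provider = Counter(provider_label(m) for m in model_ids)
```

The number of time series then grows with the number of providers rather than the number of models, while the full model ID can still be written to the structured log line for each request.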
@@ -0,0 +1,75 @@
"""
NVIDIA Sidecar: SidecarContext dependency-injection container (§BIZ-46 Phase3)

Consolidates all module-level global state into a single dataclass that is
injected through FastAPI app.state. This removes the reverse import
webui.py → server and improves testability and multi-instance support.

Design doc: docs/architecture/BIZ-46_Phase3_Architecture_Design.md §1
"""

from __future__ import annotations

import asyncio
import time
from dataclasses import dataclass, field
from typing import TYPE_CHECKING, Any

import httpx

if TYPE_CHECKING:
    from nvidia_sidecar.config import SidecarConfig
    from nvidia_sidecar.rate_limiter import AdaptiveTokenBucket
    from nvidia_sidecar.priority_queue import PriorityRequestQueue
    from nvidia_sidecar.metrics import PrometheusMetrics
    from nvidia_sidecar.health import HealthService


@dataclass
class SidecarContext:
    """Global runtime context for the sidecar: the single container for all core components.

    Injected into FastAPI via ``app.state.sidecar``; routes obtain it with ``Depends(get_context)``.
    """

    # ---- Core components ----
    config: SidecarConfig
    http_client: httpx.AsyncClient
    token_bucket: AdaptiveTokenBucket
    priority_queue: PriorityRequestQueue
    prometheus: PrometheusMetrics
    health: HealthService

    # ---- Runtime state ----
    pending_requests: dict[str, tuple["asyncio.Future[Any]", float]] = field(default_factory=dict)
    """Maps request_id → (response future, enqueued_at)."""

    stats: dict[str, int] = field(default_factory=lambda: {
        "total_requests": 0,
        "nvidia_requests": 0,
        "passthrough_requests": 0,
        "ratelimited_requests": 0,
        "queue_full_rejects": 0,
        "upstream_errors": 0,
        "start_time": 0,
    })

    stats_lock: asyncio.Lock = field(default_factory=asyncio.Lock)

    # ---- Cache ----
    snapshot_cache: tuple["dict[str, Any]", float] | None = None
    """Shared SSE snapshot cache: (data, timestamp)."""
    snapshot_cache_lock: asyncio.Lock = field(default_factory=asyncio.Lock)
    SNAPSHOT_CACHE_TTL: float = 1.0

    # ---- Convenience methods ----

    async def increment_stat(self, key: str, delta: int = 1) -> None:
        """Increment a stats counter under stats_lock (coroutine-safe; asyncio.Lock is not thread-safe)."""
        async with self.stats_lock:
            self.stats[key] = self.stats.get(key, 0) + delta

    @property
    def uptime_seconds(self) -> int:
        """Service uptime in seconds."""
        st = self.stats.get("start_time", 0)
        return int(time.time() - st) if st else 0
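The `snapshot_cache`, `snapshot_cache_lock` and `SNAPSHOT_CACHE_TTL` fields above only hold the cache state; the lookup logic lives outside this file. A minimal sketch of how a 1s TTL cache with double-check locking could be built on equivalent fields follows. The `SnapshotCache` class, its `get` method, the `build` callback and the use of `time.monotonic()` are assumptions for illustration, not the code from this commit.

```python
from __future__ import annotations

import asyncio
import time
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable


@dataclass
class SnapshotCache:
    """Hypothetical stand-in for the snapshot_cache* fields on SidecarContext."""

    ttl: float = 1.0
    entry: tuple[dict[str, Any], float] | None = None
    lock: asyncio.Lock = field(default_factory=asyncio.Lock)

    def _fresh(self) -> dict[str, Any] | None:
        # Return the cached data only while it is younger than the TTL.
        if self.entry is not None and time.monotonic() - self.entry[1] < self.ttl:
            return self.entry[0]
        return None

    async def get(self, build: Callable[[], Awaitable[dict[str, Any]]]) -> dict[str, Any]:
        cached = self._fresh()  # first check, without taking the lock
        if cached is not None:
            return cached
        async with self.lock:
            cached = self._fresh()  # second check: another task may have rebuilt it
            if cached is not None:
                return cached
            data = await build()
            self.entry = (data, time.monotonic())
            return data


async def demo() -> int:
    """Five concurrent readers should trigger exactly one build."""
    cache = SnapshotCache()
    builds = 0

    async def build() -> dict[str, Any]:
        nonlocal builds
        builds += 1
        await asyncio.sleep(0.01)  # simulate the cost of assembling a snapshot
        return {"queue_depth": 3}

    await asyncio.gather(*(cache.get(build) for _ in range(5)))
    return builds
```

The unlocked first check keeps the common path cheap when the snapshot is still fresh; the second check under the lock is what stops tasks that queued up behind the first builder from rebuilding the same snapshot.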