b18d243ef2
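1. Architecture decoupling: SidecarContext + FastAPI Depends injection
   - Add context.py: the SidecarContext dataclass consolidates all global state
   - server.py: remove module-level globals; lifespan creates ctx → app.state.sidecar
   - webui.py: remove the reverse import of server; use Depends(get_context) instead
2. Prometheus label cardinality control: model_id → provider
   - upstream_latency_seconds / upstream_errors_total labels reduced to provider
   - Model-level detail is kept in the structlog JSON logs
3. Shared SSE snapshot cache
   - Shared snapshot cache with a 1s TTL + double-check locking
   - Multiple clients no longer each rebuild the snapshot
4. Deployment support
   - Dockerfile (python:3.12-slim, non-root user, HEALTHCHECK)
   - systemd service (security hardening, resource limits)
   - .env.example (full list of environment variables)
5. Readiness HTTP client reuse
   - check_upstream() receives the main http_client by injection instead of creating a new client on every call
6. Retreat concurrency regression tests
   - All 5 test cases pass (deadlock detection + state transitions + concurrency safety)
7. Dashboard UX improvements
   - Queue bar chart: 300ms smooth animation
   - SSE disconnect: 5s translucent overlay
   - Queue chart title shows the total queued count
   - Configuration is synced on page load

Verification: mypy strict passes (0 errors), pytest 5/5 passes, server imports cleanly (13 routes)

Co-authored-by: multica-agent <github@multica.ai>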
40 lines
1.1 KiB
Docker
# NVIDIA Sidecar rate-limiting proxy: production Docker image (BIZ-46 Phase3 §4)
#
# Build:
#   docker build -t nvidia-sidecar:latest .
#
# Run:
#   docker run -d --name nvidia-sidecar \
#     -p 127.0.0.1:9190:9190 \
#     -p 127.0.0.1:9191:9191 \
#     -e SIDECAR_API_KEY="nvapi-xxx" \
#     -e SIDECAR_RATE_RPM=40 \
#     -v $(pwd)/logs:/opt/nvidia-sidecar/logs \
#     nvidia-sidecar:latest

FROM python:3.12-slim AS base

WORKDIR /app

# Install dependencies (uses Docker layer caching); specifiers are quoted so the shell does not treat ">" as a redirect
COPY pyproject.toml .
RUN pip install --no-cache-dir "fastapi>=0.115" \
    "uvicorn[standard]>=0.34" "httpx>=0.28" "PyYAML>=6.0" \
    "structlog>=24.4" "prometheus-client>=0.21" "pydantic>=2.0"

# Copy the source code
COPY . .

# Run as a non-root user
RUN useradd -r -m -s /bin/false sidecar \
    && mkdir -p /opt/nvidia-sidecar/logs \
    && chown -R sidecar:sidecar /app /opt/nvidia-sidecar/logs
USER sidecar

# Health check
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
    CMD python -c "import sys, httpx; r = httpx.get('http://127.0.0.1:9190/health'); sys.exit(0 if r.status_code == 200 else 1)"

EXPOSE 9190 9191

CMD ["uvicorn", "nvidia_sidecar.server:app", "--host", "0.0.0.0", "--port", "9190"]
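
The inline HEALTHCHECK one-liner could also live in a small standalone script, which is easier to read and test. The following is a rough sketch only: the function names `probe`, `exit_code_for`, and `main` are hypothetical, the script is not part of this repository, and it assumes the `/health` endpoint on port 9190 shown in the Dockerfile above. It uses only the standard library, so the probe does not depend on `httpx` being importable.

```python
import http.client
import urllib.error
import urllib.request
from typing import Optional

# Assumption: same endpoint the Dockerfile HEALTHCHECK polls.
DEFAULT_URL = "http://127.0.0.1:9190/health"


def probe(url: str = DEFAULT_URL, timeout: float = 3.0) -> Optional[int]:
    """Return the HTTP status code, or None if the server could not be reached."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # urlopen raises for 4xx/5xx responses; the status is still informative.
        return exc.code
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, malformed response, and similar failures.
        return None


def exit_code_for(status: Optional[int]) -> int:
    """Map a probe result to a process exit code: 0 = healthy, 1 = unhealthy."""
    return 0 if status == 200 else 1


def main() -> int:
    return exit_code_for(probe())
```

If this were saved as, say, `healthcheck.py` (a hypothetical file name), the entry point would call `sys.exit(main())`, and Docker would treat exit code 0 as healthy and 1 as unhealthy.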