EnterpriseArchitect/services/nvidia_sidecar/__init__.py
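vincent 6b5f53a0fd BIZ-40: NVIDIA Sidecar rate-limiting proxy, Phase 1: core proxy module
Deliverables:
- config.py: configuration management (SidecarConfig + load_config); fixes a PEP 563 type-inference bug
- rate_limiter.py: token bucket (TokenBucket) + gateway identification (is_nvidia_gateway)
- priority_queue.py: four-level priority queue; fixes a PASSTHROUGH semantics bug
- server.py: FastAPI proxy entry point; fixes a bug where worker_loop retries left futures hanging
- __init__.py: package declaration and public exports
- pyproject.toml: dependency declarations + mypy configuration
- README.md: quick-start guide + list of environment variables

Review fixes:
- worker_loop token retry changed from re-enqueueing to poll-wait (prevents hanging futures)
- Added return type annotations to the route functions and lifespan
- Moved the duplicate heapq import to the top of the file
- Removed dead code lines from config.py
- Installed the types-PyYAML stub
- Added README.md

Verification: mypy reports 0 issues; all unit tests pass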

Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 08:32:47 +08:00
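
The review fix for worker_loop (poll-wait instead of re-enqueueing) can be illustrated with a minimal sketch. Everything below is hypothetical and written only for illustration: `SketchTokenBucket`, `worker_loop`'s signature, the tuple layout of queue items, and `poll_interval` are assumptions, not the actual code in rate_limiter.py or server.py. The point is that the worker keeps the dequeued item and its future in hand while waiting for a token, so the future is always resolved by the same loop iteration that dequeued it.

```python
import asyncio
import time


class SketchTokenBucket:
    """Hypothetical token bucket; not the real nvidia_sidecar TokenBucket."""

    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.updated = time.monotonic()

    def try_acquire(self) -> bool:
        # Refill based on elapsed time, then take one token if available.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


async def worker_loop(
    queue: asyncio.PriorityQueue,
    bucket: SketchTokenBucket,
    poll_interval: float = 0.01,
) -> None:
    while True:
        # Items are (priority, sequence, payload, future); sequence is unique,
        # so tuple comparison never reaches the payload or the future.
        _priority, _seq, payload, future = await queue.get()
        try:
            # Poll-wait: hold the item and sleep until a token is available,
            # instead of pushing it back onto the queue.
            while not bucket.try_acquire():
                await asyncio.sleep(poll_interval)
            if not future.done():
                future.set_result(f"forwarded:{payload}")
        finally:
            queue.task_done()


async def demo() -> list[str]:
    queue: asyncio.PriorityQueue = asyncio.PriorityQueue()
    bucket = SketchTokenBucket(rate=100.0, capacity=1.0)
    loop = asyncio.get_running_loop()
    futures = []
    for seq, (priority, payload) in enumerate([(2, "low"), (0, "high"), (1, "mid")]):
        future = loop.create_future()
        futures.append(future)
        queue.put_nowait((priority, seq, payload, future))
    worker = asyncio.create_task(worker_loop(queue, bucket))
    # Every future resolves even though only one token is available at a time.
    results = await asyncio.wait_for(asyncio.gather(*futures), timeout=5.0)
    worker.cancel()
    return list(results)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the futures are gathered in submission order, so the returned list follows the order in which requests were created, while the worker serves them in priority order.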

41 lines
963 B
Python
"""
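NVIDIA Sidecar rate-limiting proxy: core proxy module.
Provides four layers of protection for OpenAI Chat Completions-compatible APIs:
1. Request intake (FastAPI)
2. Gateway identification → non-NVIDIA traffic passes straight through
3. Priority queuing → token-bucket rate limiting
4. Asynchronous forwarding to the NVIDIA upstream via httpx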
"""
from __future__ import annotations

from nvidia_sidecar.config import SidecarConfig, load_config
from nvidia_sidecar.rate_limiter import (
    Priority,
    TokenBucket,
    is_nvidia_gateway,
    normalize_gateway_name,
)
from nvidia_sidecar.priority_queue import (
    PriorityQueueItem,
    PriorityRequestQueue,
    QueueFullError,
    QueueFullPassthrough,
    QueueFullPolicy,
)

__version__ = "0.1.0"

__all__ = [
    "SidecarConfig",
    "load_config",
    "Priority",
    "TokenBucket",
    "is_nvidia_gateway",
    "normalize_gateway_name",
    "PriorityQueueItem",
    "PriorityRequestQueue",
    "QueueFullError",
    "QueueFullPassthrough",
    "QueueFullPolicy",
]
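
The exports above name a priority queue with a queue-full policy, but this file does not show their signatures. As a rough illustration of the idea only, here is a self-contained sketch of a four-level, heapq-backed queue with two full-queue behaviours. All names (`SketchPriority`, `SketchFullPolicy`, `SketchQueueFull`, `SketchPriorityQueue`), the level names, and the meaning given to PASSTHROUGH (tell the caller to bypass the queue rather than raise) are assumptions, not the actual API of nvidia_sidecar.priority_queue.

```python
import heapq
import itertools
from enum import Enum, IntEnum


class SketchPriority(IntEnum):
    # Four hypothetical levels; a lower value is served first.
    CRITICAL = 0
    HIGH = 1
    NORMAL = 2
    LOW = 3


class SketchFullPolicy(Enum):
    REJECT = "reject"            # raise when the queue is full
    PASSTHROUGH = "passthrough"  # assumed: caller bypasses the queue


class SketchQueueFull(Exception):
    """Raised under REJECT when the queue is at capacity."""


class SketchPriorityQueue:
    def __init__(self, max_size: int, policy: SketchFullPolicy) -> None:
        self.max_size = max_size
        self.policy = policy
        self._heap: list[tuple[int, int, str]] = []
        # Monotonic counter keeps FIFO order within one priority level.
        self._counter = itertools.count()

    def put(self, priority: SketchPriority, payload: str) -> bool:
        """Return True if queued, False if the caller should bypass the queue."""
        if len(self._heap) >= self.max_size:
            if self.policy is SketchFullPolicy.PASSTHROUGH:
                return False
            raise SketchQueueFull(f"queue full ({self.max_size} items)")
        heapq.heappush(self._heap, (int(priority), next(self._counter), payload))
        return True

    def pop(self) -> str:
        # Smallest (priority, sequence) pair comes out first.
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

A short usage example under the same assumptions:

```python
queue = SketchPriorityQueue(max_size=3, policy=SketchFullPolicy.PASSTHROUGH)
queue.put(SketchPriority.LOW, "a")
queue.put(SketchPriority.CRITICAL, "b")
queue.put(SketchPriority.LOW, "c")
queued = queue.put(SketchPriority.NORMAL, "d")  # queue is full, so not queued
order = [queue.pop() for _ in range(len(queue))]
```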