6b5f53a0fd
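Deliverables:
- config.py: configuration management (SidecarConfig + load_config); fixes a PEP 563 type-inference bug
- rate_limiter.py: token bucket (TokenBucket) + gateway identification (is_nvidia_gateway)
- priority_queue.py: four-level priority queue; fixes a PASSTHROUGH semantics bug
- server.py: FastAPI proxy entry point; fixes a worker_loop retry bug that left futures hanging
- __init__.py: package declaration and public exports
- pyproject.toml: dependency declarations + mypy configuration
- README.md: quick-start guide + list of environment variables

Review fixes:
- worker_loop token retry changed from re-enqueueing to poll-wait (prevents futures from hanging)
- Added return type annotations to the route functions and lifespan
- Moved the duplicate heapq import to the top of the file
- Removed unused lines from config.py
- Installed the types-PyYAML stub
- Added README.md

Verification: mypy reports 0 issues; all unit tests pass

Co-authored-by: multica-agent <github@multica.ai>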
41 lines
963 B
Python
"""
NVIDIA Sidecar rate-limiting proxy: core proxy module.

Provides four layers of protection for OpenAI Chat Completions-compatible APIs:
1. Request intake (FastAPI)
2. Gateway identification → non-NVIDIA traffic passes straight through
3. Priority queueing → token-bucket rate limiting
4. Async forwarding to the NVIDIA upstream via httpx
"""

from __future__ import annotations

from nvidia_sidecar.config import SidecarConfig, load_config
from nvidia_sidecar.rate_limiter import (
    Priority,
    TokenBucket,
    is_nvidia_gateway,
    normalize_gateway_name,
)
from nvidia_sidecar.priority_queue import (
    PriorityQueueItem,
    PriorityRequestQueue,
    QueueFullError,
    QueueFullPassthrough,
    QueueFullPolicy,
)

__version__ = "0.1.0"
__all__ = [
    "SidecarConfig",
    "load_config",
    "Priority",
    "TokenBucket",
    "is_nvidia_gateway",
    "normalize_gateway_name",
    "PriorityQueueItem",
    "PriorityRequestQueue",
    "QueueFullError",
    "QueueFullPassthrough",
    "QueueFullPolicy",
]
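
The token-bucket stage named in the module docstring, together with the poll-wait retry mentioned in the commit notes, might look roughly like the sketch below. This is an illustration only: the `TokenBucket` exported above is not shown in this file, so `SketchTokenBucket`, its `rate` and `capacity` parameters, `try_acquire`, and `acquire_with_poll` are all hypothetical names, not the package's API.

```python
import asyncio
import time


class SketchTokenBucket:
    """Hypothetical token bucket; NOT the package's real TokenBucket."""

    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate          # tokens added per second (assumed parameter)
        self.capacity = capacity  # maximum burst size (assumed parameter)
        self.tokens = capacity    # the bucket starts full
        self.updated = time.monotonic()

    def _refill(self) -> None:
        # Credit tokens for the time elapsed since the last update, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def try_acquire(self) -> bool:
        # Non-blocking: take one token if available, otherwise report failure.
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


async def acquire_with_poll(bucket: SketchTokenBucket, poll_interval: float = 0.01) -> None:
    # Poll-wait: the caller keeps hold of its request and sleeps until a token
    # is available, rather than pushing the request back onto a queue.
    while not bucket.try_acquire():
        await asyncio.sleep(poll_interval)
```

With poll-wait, the coroutine that owns a request's future stays responsible for it until a token arrives, which is one plausible way to avoid a future that is never resolved after a re-enqueue.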