sequenceDiagram
    participant OC as OpenClaw
    participant GW as API Gateway
    participant LB as Load Balancer
    participant QM as Queue Manager
    participant RL as Rate Limiter
    participant P as Provider
    participant CD as Cooldown Detector
    participant ST as Stats Engine

    OC->>GW: POST /v1/chat/completions
    GW->>LB: Route to target pool
    Note over LB: Weighted RR, weights refreshed every 5-10s<br/>weight = (max_rpm - current_rpm) / max_rpm
    LB->>RL: Check RPM and reserve a slot (BEGIN IMMEDIATE transaction)
    alt RPM insufficient
        RL->>QM: Enqueue and wait (30s timeout)
        QM-->>RL: Token available
    end
    RL-->>LB: Forwarding allowed
    LB->>P: Forward request
    P-->>LB: Response
    alt 200 OK
        LB->>ST: Record in usage_logs (INSERT ON CONFLICT)
        LB-->>GW: Normal response
    else 429 Too Many Requests
        LB->>CD: Report 429
        CD->>P: Move to cooldown pool, cooldown_until = now + 30s × 2^n
        LB->>LB: Reselect Provider B
        alt Provider B healthy
            LB->>P: Forward to Provider B
            P-->>LB: 200 OK
        end
        alt Entire primary pool cooling down
            Note over LB: Degrade to Fallback pool<br/>Check for Providers about to recover<br/>Wait if under 10s of cooldown remains
            alt Fallback available
                LB->>P: Forward to Fallback Provider
                P-->>LB: 200 OK + degraded flag
            else Fallback pool also fully cooling down
                LB->>P: Emergency channel (1 Provider at 10% RPM)
                alt Emergency channel succeeds
                    P-->>LB: 200 OK
                else Emergency channel fails
                    LB-->>OC: 503 Service Unavailable
                    OC->>OC: OpenClaw's own fallback
                end
            end
        end
    end
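
The two formulas in the diagram, the load-balancer weight and the cooldown backoff, can be illustrated with a minimal Python sketch. All names here (`Provider`, `weight`, `report_429`, `pick_provider`) are hypothetical and not taken from the actual system. The sketch assumes that `n` is the number of earlier consecutive 429s, so the first 429 yields a 30s cooldown, and it uses weighted random selection as a simple stand-in for the weighted round-robin named in the diagram.

```python
import random
from dataclasses import dataclass

BASE_COOLDOWN_S = 30  # base cooldown from the diagram: 30s × 2^n


@dataclass
class Provider:
    # Hypothetical record; field names are illustrative only.
    name: str
    max_rpm: int
    current_rpm: int = 0
    consecutive_429s: int = 0    # assumed meaning of "n" in the diagram
    cooldown_until: float = 0.0  # timestamp in seconds


def weight(p: Provider) -> float:
    """weight = (max_rpm - current_rpm) / max_rpm, clamped at 0."""
    if p.max_rpm <= 0:
        return 0.0
    return max(0.0, (p.max_rpm - p.current_rpm) / p.max_rpm)


def report_429(p: Provider, now: float) -> None:
    """Move a provider into cooldown: cooldown_until = now + 30s × 2^n."""
    p.cooldown_until = now + BASE_COOLDOWN_S * 2 ** p.consecutive_429s
    p.consecutive_429s += 1


def pick_provider(pool, now: float, rng=random):
    """Pick a provider that is not cooling down, in proportion to its weight.

    Returns None when every provider is cooling down or saturated, which is
    the point where the diagram degrades to the Fallback pool.
    """
    candidates = [p for p in pool if p.cooldown_until <= now and weight(p) > 0]
    if not candidates:
        return None
    weights = [weight(p) for p in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

Under these assumptions, a provider at half of its RPM budget gets weight 0.5, and repeated 429s at the same instant push `cooldown_until` out by 30s, then 60s, then 120s.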