Throttle Gemini RAG embedding calls #73
The live reload now reaches the current Gemini embedding endpoint, but comprehensive user-data indexing can still outrun provider quota and fail with 429.

Serialize embedding requests through a configurable per-process interval and retry transient provider failures with Retry-After aware backoff.

Constraint: Gemini embedding quota can reject bursty comprehensive reloads with 429
Rejected: Enable deterministic fallback vectors | would store low-quality vectors and hide provider failures
Confidence: high
Scope-risk: narrow
Tested: ./gradlew test --tests com.fairing.fairplay.ai.rag.service.EmbeddingServiceTest --no-daemon
Tested: ./gradlew test --no-daemon
Co-authored-by: OmX <omx@oh-my-codex.dev>
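
For reviewers who want a concrete picture of the approach, here is a minimal Java sketch of serialized, interval-spaced calls with Retry-After aware backoff. It is not the code in this PR. The names `ThrottledEmbeddingCaller` and `TransientProviderException`, the constructor parameters, and the injected clock and sleeper are all assumptions made for illustration; the actual `EmbeddingService` may be structured differently.

```java
import java.util.function.LongConsumer;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical exception for a retryable provider failure such as HTTP 429. */
class TransientProviderException extends RuntimeException {
    final long retryAfterMillis; // -1 when the provider sent no Retry-After value

    TransientProviderException(String message, long retryAfterMillis) {
        super(message);
        this.retryAfterMillis = retryAfterMillis;
    }
}

/** Hypothetical per-process throttle: one request at a time, spaced by a minimum interval. */
class ThrottledEmbeddingCaller {
    private final long minIntervalMillis;
    private final int maxAttempts;
    private final long baseBackoffMillis;
    private final LongSupplier clock;   // current time in milliseconds
    private final LongConsumer sleeper; // pauses for the given number of milliseconds
    private long nextAllowedAt = Long.MIN_VALUE;

    ThrottledEmbeddingCaller(long minIntervalMillis, int maxAttempts, long baseBackoffMillis,
                             LongSupplier clock, LongConsumer sleeper) {
        this.minIntervalMillis = minIntervalMillis;
        this.maxAttempts = maxAttempts;
        this.baseBackoffMillis = baseBackoffMillis;
        this.clock = clock;
        this.sleeper = sleeper;
    }

    /** Convenience factory that uses the real system clock and Thread.sleep. */
    static ThrottledEmbeddingCaller withSystemClock(long minIntervalMillis, int maxAttempts,
                                                    long baseBackoffMillis) {
        return new ThrottledEmbeddingCaller(minIntervalMillis, maxAttempts, baseBackoffMillis,
                System::currentTimeMillis,
                ms -> {
                    try {
                        Thread.sleep(ms);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        throw new IllegalStateException("interrupted while throttling", e);
                    }
                });
    }

    /** synchronized serializes callers, so only one request is in flight per process. */
    synchronized <T> T call(Supplier<T> request) {
        for (int attempt = 1; ; attempt++) {
            waitForSlot();
            try {
                return request.get();
            } catch (TransientProviderException e) {
                if (attempt >= maxAttempts) {
                    throw e; // surface the failure instead of hiding it
                }
                // Exponential backoff, but never shorter than the provider's Retry-After.
                long backoff = baseBackoffMillis << (attempt - 1);
                sleeper.accept(Math.max(backoff, e.retryAfterMillis));
            }
        }
    }

    /** Sleeps until the minimum interval since the previous request start has elapsed. */
    private void waitForSlot() {
        long now = clock.getAsLong();
        if (now < nextAllowedAt) {
            sleeper.accept(nextAllowedAt - now);
        }
        nextAllowedAt = clock.getAsLong() + minIntervalMillis;
    }
}
```

Injecting the clock and sleeper keeps the timing logic testable without real delays. Rethrowing after the last attempt matches the rejected-alternative note above: a persistent provider failure stays visible rather than being replaced by fallback vectors.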
Summary
Why
After PR #71 was deployed, the comprehensive RAG reload no longer hit the removed endpoint, but UserData indexing failed once Gemini returned 429 Too Many Requests. This change keeps the reload path and the reservation-triggered user-data reindex path from dropping embeddings when a burst exceeds quota.
Verification
Review notes