feat: v15 AI Infrastructure — provider abstraction, split models, latency, local models, AA benchmarks #162
Merged
seanthimons merged 18 commits into integration on Mar 20, 2026
…nd migration nudge

REVERSION POINT — Phase 1 of v15 Set A complete. Safe to revert to this commit if Phase 2 (Cost Tracker UI + Sidebar Badge) introduces issues.

- Add migration 011: oa_usage_log table for tracking API credit usage
- Add perform_oa_request() centralized wrapper with header parsing
- Replace all req_perform() calls with perform_oa_request()
- Add parse_oa_usage_headers() and log_oa_usage() functions
- Add openalex_api_key to Settings UI, effective config, and env var support
- Add migration nudge banner (dismissible) for email-only users
- Add should_show_oa_migration_nudge() helper
- Add 43 unit tests across test-oa-usage-tracking.R and test-oa-migration.R

Resolves groundwork for #157
…d toast warning

REVERSION POINT — Phase 2 of v15 Set A complete. Safe to revert to this commit if Phase 3 (Split VSS/BM25 with RRF Fusion) introduces issues.

- Add get_oa_daily_usage() and get_oa_usage_history() query functions
- Add oa_budget_percentage(), oa_budget_color(), oa_toast_should_fire() helpers
- Add OA usage value box to Cost Tracker tab (daily budget, requests, credit usage)
- Add sidebar OA budget badge with green/yellow/red color tiers
- Add one-time-per-day toast notification at >= 90% budget consumption
- Badge and tracker section hidden for polite-pool users (no API key)
- Add openalex_search, openalex_fetch, openalex_topics, query_reformulation to COST_OPERATION_META

Continues #157
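The budget helpers listed above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code from the PR: the function names, their signatures, the grey fallback, and the 70% warning tier are all hypothetical. Only the three color tiers and the 90% toast trigger come from the commit message.

```r
# Illustrative sketch only: names, signatures, and the 70% warning
# threshold are assumptions. Only the 90% toast trigger is from the commit.

# Percentage of the daily budget consumed; NA when no usable limit is known.
budget_percentage_sketch <- function(used, limit) {
  if (is.null(limit) || is.na(limit) || limit <= 0) return(NA_real_)
  100 * used / limit
}

# Map a percentage onto the green/yellow/red badge tiers.
budget_color_sketch <- function(pct, warn_at = 70, danger_at = 90) {
  if (is.na(pct)) return("grey")
  if (pct >= danger_at) "red" else if (pct >= warn_at) "yellow" else "green"
}

# Fire the toast at most once per (UTC) day, and only at >= 90% usage.
toast_should_fire_sketch <- function(pct, last_fired_date,
                                     today = format(Sys.time(), "%Y-%m-%d", tz = "UTC")) {
  if (is.na(pct) || pct < 90) return(FALSE)
  is.null(last_fired_date) || !identical(last_fired_date, today)
}
```

Passing `today` as an argument keeps the once-per-day check easy to unit test without mocking the clock.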
…tract title bug

REVERSION POINT — Phase 3 of v15 Set A complete. Safe to revert to this commit if Phase 4 (Query Reformulation) introduces issues.

- Add rrf_merge() function implementing Reciprocal Rank Fusion (k=60)
- Replace single ragnar_retrieve() with split ragnar_retrieve_vss() + ragnar_retrieve_bm25()
- Add enrich_retrieval_results() shared metadata parser
- Fix #159: abstract chunks now show actual paper titles from DB instead of "[Abstract]"
- retrieve_with_ragnar() now accepts multiple queries (prep for Phase 4 RAG-Fusion)
- Pass con to retrieve_with_ragnar() for abstract title lookup
- Add 15 unit tests for RRF merge algorithm

Resolves #159, continues #12, #48
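For readers unfamiliar with Reciprocal Rank Fusion, the idea behind rrf_merge() can be sketched as follows: each result list contributes 1 / (k + rank) to an item's score, and items are ordered by the summed score. The function below is a hypothetical stand-in operating on plain ID vectors; the real rrf_merge() signature and its data-frame handling are not shown in this PR page and are not assumed here.

```r
# Minimal Reciprocal Rank Fusion sketch (hypothetical helper, not the PR's
# rrf_merge()). `rankings` is a list of character vectors of chunk IDs,
# each ordered best-first.
rrf_merge_sketch <- function(rankings, k = 60) {
  scores <- numeric(0)
  for (ids in rankings) {
    ids <- unique(ids)  # a list should count an ID only once
    for (rank in seq_along(ids)) {
      id <- ids[[rank]]
      prev <- if (id %in% names(scores)) scores[[id]] else 0
      scores[[id]] <- prev + 1 / (k + rank)
    }
  }
  # Named numeric vector of fused scores, highest first
  sort(scores, decreasing = TRUE)
}

# Example: fuse a vector-search ranking with a BM25 ranking
vss  <- c("a", "b", "c")
bm25 <- c("b", "d", "a")
fused <- rrf_merge_sketch(list(vss, bm25))
names(fused)  # "b" "a" "d" "c"
```

Because only ranks are used, the VSS and BM25 scores never need to be put on a common scale, which is the main reason RRF suits hybrid retrieval.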
REVERSION POINT — Phase 4 of v15 Set A complete. Safe to revert to this commit if Phase 5 (Contextual Chunk Headers) introduces issues.

- Add generate_query_variants() for LLM-powered query expansion (3 variants)
- Add parse_query_variants() supporting numbered, dashed, and plain line formats
- Wire reformulation into search_chunks_hybrid() with config/session params
- Log reformulation cost as "query_reformulation" operation
- Add Settings toggle: "Query Reformulation (RAG-Fusion)" (enabled by default)
- When disabled, falls back to single-query retrieval (still with RRF)
- Add 7 unit tests for query variant parsing

Continues #12, #48, #142
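Parsing LLM output that may arrive as a numbered list, a dashed list, or plain lines can be sketched as below. This is an assumption-laden illustration, not the PR's parse_query_variants(): the function name, the regular expression, and the de-duplication step are hypothetical.

```r
# Hypothetical parser sketch: accepts "1. foo", "2) foo", "- foo", "* foo",
# or bare lines, and returns at most `max_variants` cleaned queries.
parse_variants_sketch <- function(text, max_variants = 3) {
  lines <- trimws(unlist(strsplit(text, "\r?\n")))
  # Strip a leading list marker (number + "." or ")", or a dash/asterisk)
  lines <- sub("^([0-9]+[.)]|[-*])[[:space:]]+", "", lines)
  lines <- trimws(lines)
  lines <- unique(lines[nzchar(lines)])  # drop blanks and duplicates
  head(lines, max_variants)
}

parse_variants_sketch("1. foo bar\n- baz qux\nplain line\n\n4) extra")
```

Capping the result keeps the number of downstream retrieval calls bounded even when the model returns more lines than requested.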
REVERSION POINT — Phase 5 of v15 Set A complete. Safe to revert to this commit if subsequent changes introduce issues.

- Add prepend_contextual_header() to prefix chunks with [Paper Title] or [Title | Section: X]
- Modify chunk_with_ragnar() to accept optional paper_title parameter
- Prepend paper titles to abstract chunks during rebuild (replaces old [Abstract] approach)
- Use filename (minus extension) as document chunk header during rebuild
- Add RAGNAR_INDEX_SCHEMA_VERSION (v2) for tracking chunk format changes
- Add is_ragnar_store_stale() and mark_ragnar_store_current() for stale detection
- Mark stores as current (v2) after successful rebuild
- Add 17 unit tests for headers, backward compatibility, and stale detection

Completes v15 Set A. Resolves #12, #48. Continues #142, #157.
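The two header formats named above, [Paper Title] and [Title | Section: X], can be illustrated with a small helper. The sketch below is hypothetical: the function name, the argument names, and the newline separator between header and chunk are assumptions, since the PR page does not show the body of prepend_contextual_header().

```r
# Hypothetical sketch of contextual chunk headers. The newline separator
# and the "no title means no header" rule are assumptions.
prepend_header_sketch <- function(chunk, title = NULL, section = NULL) {
  if (is.null(title) || !nzchar(trimws(title))) return(chunk)
  title <- trimws(title)
  header <- if (!is.null(section) && nzchar(trimws(section))) {
    sprintf("[%s | Section: %s]", title, trimws(section))
  } else {
    sprintf("[%s]", title)
  }
  paste(header, chunk, sep = "\n")
}

prepend_header_sketch("We propose a method...", "Attention Is All You Need")
prepend_header_sketch("We use 8 heads.", "Attention Is All You Need", "Model")
```

Returning the chunk unchanged when no title is available keeps older, header-less chunks valid, which is one way to preserve backward compatibility.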
…endpoints

Create R/api_provider.R with unified interface (provider_chat_completion, provider_get_embeddings, provider_list_models, provider_check_health) that wraps any OpenAI-compatible LLM endpoint with automatic duration_ms timing and NULL usage token handling.

Big-bang migration of all 15+ call sites across rag.R, slides.R, mod_query_builder.R, _ragnar.R, db.R, mod_document_notebook.R, mod_search_notebook.R, and mod_slides.R from direct chat_completion/get_embeddings calls to the provider layer.

Key changes:
- All LLM calls now route through provider_chat_completion/provider_get_embeddings
- duration_ms captured for every call via proc.time()
- log_cost() accepts optional duration_ms parameter (forward-compat with Phase 2)
- estimate_cost() returns $0 for local models (is_local=TRUE) instead of DEFAULT_PRICING
- NULL usage tokens default to 0 (graceful handling for local models)
- mirai workers source api_provider.R and config.R for async reindex tasks

Tests cover: provider config creation, usage normalization (NULL/partial), config bridge, health check (offline), cost estimation (local vs cloud), and log_cost with optional duration_ms parameter.

Also adds brainstorm and plan documents for v15 Set B.
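Two of the behaviours described above, proc.time()-based timing and defaulting missing usage tokens to 0, can be sketched in isolation. Both helpers below are hypothetical and assume the wrapped call returns a list; they are not the functions in R/api_provider.R.

```r
# Hypothetical sketches; names and return shapes are assumptions.

# Null-coalescing helper (base R ships its own `%||%` from 4.4.0 onward).
`%||%` <- function(x, y) if (is.null(x)) y else x

# Local models often omit the usage block; default every counter to 0.
normalize_usage_sketch <- function(usage) {
  list(
    prompt_tokens     = usage$prompt_tokens %||% 0,
    completion_tokens = usage$completion_tokens %||% 0,
    total_tokens      = usage$total_tokens %||% 0
  )
}

# Run `fn(...)`, which is assumed to return a list, and attach the
# wall-clock duration in milliseconds.
with_duration_ms_sketch <- function(fn, ...) {
  start <- proc.time()[["elapsed"]]
  result <- fn(...)
  result$duration_ms <- round((proc.time()[["elapsed"]] - start) * 1000)
  result$usage <- normalize_usage_sketch(result$usage)
  result
}

# Usage with a stand-in for a provider call that reports no usage
fake_call <- function(prompt) list(content = toupper(prompt), usage = NULL)
res <- with_duration_ms_sketch(fake_call, "hello")
```

Timing inside one wrapper, rather than at each call site, is what makes it feasible to guarantee a duration for every call.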
… Tracker UI

Phase 2 of v15 Set B AI Infrastructure:
- Migration 012: add duration_ms column to cost_log
- 4 latency query functions (by model, by operation, trend, summary) with p50/p95 percentiles and NULL-safe handling
- Latency accordion section in Cost Tracker with value box, per-model table, per-operation table, and sparkline trend
- 22 new tests covering all latency queries and edge cases
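As an illustration of NULL-safe p50/p95 aggregation, the sketch below computes the percentiles in R over a vector of durations. It is an assumption that this happens in R at all; the PR's query functions may well compute percentiles in SQL, and the function name and return shape here are hypothetical.

```r
# Hypothetical latency summary; rows logged before the duration_ms column
# existed would carry NA and are dropped before aggregation.
latency_summary_sketch <- function(duration_ms) {
  x <- duration_ms[!is.na(duration_ms)]
  if (length(x) == 0) {
    return(list(n = 0L, p50 = NA_real_, p95 = NA_real_))
  }
  q <- stats::quantile(x, probs = c(0.5, 0.95), names = FALSE)
  list(n = length(x), p50 = q[[1]], p95 = q[[2]])
}

latency_summary_sketch(c(100, NA, 200, 300))
# n = 3, p50 = 200, p95 = 290 (default type-7 interpolation)
```

Returning NA rather than erroring on an empty input lets a UI render a placeholder until timed calls have been logged.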
…operation-based resolution

Phase 3 of v15 Set B AI Infrastructure:
- Add slot field to all 17 COST_OPERATION_META entries (fast/quality/embedding/NA)
- Add resolve_model_for_operation() for centralized model routing
- Settings UI: 3 dropdowns (quality, fast optional, embedding) replacing 2
- Fast slot falls back to quality model when not configured
- Runtime migration: chat_model → quality_model for existing users
- Migrate all call sites across rag.R, mod_slides.R, mod_query_builder.R, mod_document_notebook.R, mod_search_notebook.R, db.R
- 51 new tests for slot resolution, fallback, migration, COST_OPERATION_META validation
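Slot-based routing with a fast-to-quality fallback can be sketched as below. The metadata table, the operation-to-slot assignments, the settings field names other than quality_model, and the function signature are all hypothetical; only the slot names and the fallback rule come from the commit message.

```r
# Hypothetical operation metadata; which operation uses which slot is an
# assumption made for illustration.
operation_meta_sketch <- list(
  chat                = list(slot = "quality"),
  query_reformulation = list(slot = "fast"),
  embedding           = list(slot = "embedding"),
  openalex_search     = list(slot = NA_character_)  # no LLM involved
)

# Resolve the model for an operation; NULL means "no model applies".
resolve_model_sketch <- function(operation, models, meta = operation_meta_sketch) {
  slot <- meta[[operation]]$slot
  if (is.null(slot) || is.na(slot)) return(NULL)
  model <- models[[paste0(slot, "_model")]]
  # The fast slot is optional: fall back to the quality model when unset
  if (slot == "fast" && (is.null(model) || !nzchar(model))) {
    model <- models$quality_model
  }
  model
}

models <- list(quality_model = "big-model", embedding_model = "embed-model")
resolve_model_sketch("query_reformulation", models)  # falls back to "big-model"
```

Keeping the slot on the operation metadata means call sites only name the operation, so changing which model serves an operation touches one table.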
…etection, and stale index extension

Phase 4 of v15 Set B AI Infrastructure:
- Migration 013: providers table with OpenRouter seeded as default
- Provider CRUD: save/get/delete with default-provider protection
- Settings UI: Providers section with add/edit/delete/test connection modals
- Embedding dimension detection via known table + provider probe fallback
- Stale index detection extended to check embedding model mismatch
- is_local_provider() helper for zero-cost detection
- get_all_available_models() for multi-provider model aggregation
- 27 new tests for CRUD, upsert, dimension detection, stale index, local provider
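The "known table + provider probe fallback" approach to embedding dimensions can be sketched as a lookup followed by a single test embedding. Everything below is hypothetical: the function name, the two-entry table (chosen only as well-known examples), and the convention that the probe is a function returning one numeric vector.

```r
# Hypothetical lookup table; the PR's actual table contents are not shown.
known_dims_sketch <- c(
  "text-embedding-3-small" = 1536L,
  "nomic-embed-text"       = 768L
)

# `probe` is an assumed callback: function(text) -> numeric embedding vector.
detect_dim_sketch <- function(model, probe = NULL, known = known_dims_sketch) {
  if (model %in% names(known)) return(known[[model]])
  if (is.null(probe)) return(NA_integer_)
  # Unknown model: embed a short string once and measure the vector length
  vec <- tryCatch(probe("dimension probe"), error = function(e) NULL)
  if (is.null(vec)) NA_integer_ else length(vec)
}

detect_dim_sketch("text-embedding-3-small")
detect_dim_sketch("my-local-model", probe = function(text) runif(384))
```

Checking the table first avoids a network round trip for common models, while the probe covers arbitrary local models whose dimensions cannot be known ahead of time.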
… matching, and smart defaults

Phase 5 of v15 Set B AI Infrastructure:
- New R/api_artificialanalysis.R: fetch/load/cache AA model data
- Bundled snapshot: 14 models with quality, speed, price data
- Model ID matching: manual mapping (18 entries) + fuzzy normalization
- Model picker enrichment: shows Q:score, tok/s, $/M when AA data available
- Smart defaults: cheapest competent for fast, smartest affordable for quality
- Settings UI: Model Benchmarks section with refresh button and AA API key
- Model info panel enriched with quality, coding, speed, TTFT from AA
- DB caching for refreshed AA data with proper JSON round-trip
- 37 new tests covering data loading, matching, enrichment, smart defaults, caching
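Matching provider model IDs to benchmark entries by "manual mapping + fuzzy normalization" might work along the lines below. The normalization rules (lowercasing, dropping a provider prefix and a colon suffix, removing punctuation) and both function names are assumptions for illustration; the PR's actual rules and its 18-entry mapping are not shown on this page.

```r
# Hypothetical normalizer: "openai/gpt-4o-mini" and "GPT-4o mini" both
# reduce to "gpt4omini".
normalize_model_id_sketch <- function(id) {
  id <- tolower(id)
  id <- sub("^[^/]+/", "", id)  # drop a provider prefix such as "openai/"
  id <- sub(":.*$", "", id)     # drop a variant suffix such as ":free"
  gsub("[^a-z0-9]+", "", id)    # drop spaces, dashes, dots
}

# Manual mapping wins; otherwise compare normalized forms.
match_model_sketch <- function(model_id, aa_names, manual_map = character()) {
  if (model_id %in% names(manual_map)) return(manual_map[[model_id]])
  key <- normalize_model_id_sketch(model_id)
  normalized <- vapply(aa_names, normalize_model_id_sketch, character(1))
  hit <- which(normalized == key)
  if (length(hit) == 0) NA_character_ else aa_names[[hit[1]]]
}

aa_names <- c("GPT-4o mini", "Claude 3.5 Sonnet")
match_model_sketch("openai/gpt-4o-mini", aa_names)  # "GPT-4o mini"
```

A manual map is still needed for the cases normalization cannot bridge, such as IDs that include release dates or vendor-specific aliases.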
Pull request overview
This PR completes the v15 AI Infrastructure milestone by introducing a provider abstraction layer (to support OpenRouter + local OpenAI-compatible endpoints), adding split-model routing (fast/quality/embedding), integrating Artificial Analysis benchmark enrichment, and expanding telemetry with latency + OpenAlex usage tracking.
Changes:
- Added provider abstraction (provider_chat_completion/provider_get_embeddings), provider CRUD (DB-backed), and model slot routing.
- Added latency persistence + analytics (migration + queries + Cost Tracker UI).
- Added retrieval quality upgrades (query reformulation parsing, RRF merge, contextual chunk headers) and bundled AA benchmark snapshot + mapping.
Reviewed changes
Copilot reviewed 37 out of 37 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| R/api_provider.R | Introduces provider abstraction, local/provider helpers, slot resolution, and model aggregation. |
| R/api_openalex.R | Adds OA usage header parsing + centralized request wrapper. |
| R/cost_tracking.R | Adds slots to operation metadata, latency queries, OA usage queries, and cost logging w/ duration. |
| R/_ragnar.R | Switches embedding to provider layer; adds contextual headers + stale-index schema tracking; adds RRF retrieval. |
| R/db.R | Updates hybrid search to use provider embeddings + query reformulation variants + RRF retrieval. |
| R/mod_cost_tracker.R | Adds latency section UI + OpenAlex usage section UI. |
| app.R | Adds OA sidebar badge rendering logic. |
| migrations/011_*.sql, migrations/012_*.sql, migrations/013_*.sql | Adds OA usage log table, latency column, providers table. |
| tests/testthat/* | Adds unit/integration coverage for RRF merge, query reformulation parsing, OA usage tracking, providers, AA, latency. |
| README.md, TODO.md, docs/plans/*, docs/brainstorms/* | Updates documentation/planning to reflect v15 milestone completion. |
R/cost_tracking.R (outdated)
Comment on lines +612 to +617

    today <- as.character(Sys.Date())
    last_fired <- tryCatch(get_db_setting(con, "oa_toast_last_fired_date"), error = function(e) NULL)

    if (!is.null(last_fired) && last_fired == today) return(FALSE)

    TRUE
Comment on lines +399 to +409

    #' Build provider config from effective_config
    #'
    #' Extracts the OpenRouter API key from the effective_config list
    #' and returns a provider_config. This bridges the existing settings
    #' system with the provider layer.
    #'
    #' @param config effective_config list (from mod_settings_server)
    #' @return provider_config for OpenRouter
    provider_from_config <- function(config) {
      api_key <- get_setting(config, "openrouter", "api_key")
      openrouter_provider(api_key)
Comment on lines +525 to 531

    body <- tryCatch({
      perform_oa_request(req, con = NULL, operation = "search")
    }, error = function(e) {
      stop_api_error(e, "OpenAlex")
    })

    body <- resp_body_json(resp)
    parse_search_response(body)
    -- Track OpenAlex API usage from response headers
    -- Supports the new freemium API key model (Feb 2026)
    CREATE TABLE IF NOT EXISTS oa_usage_log (
      id VARCHAR PRIMARY KEY DEFAULT (gen_random_uuid()::VARCHAR),
Comment on lines +185 to +192

    provider_list_models <- function(provider) {
      req <- build_provider_request(provider, "models")

      resp <- tryCatch({
        req_perform(req)
      }, error = function(e) {
        return(data.frame(id = character(), name = character(), stringsAsFactors = FALSE))
      })
Comment on lines +116 to +124

    cost <- estimate_cost(model,
                          result$usage$prompt_tokens %||% 0,
                          result$usage$completion_tokens %||% 0)
    log_cost(con, "query_reformulation", model,
             result$usage$prompt_tokens %||% 0,
             result$usage$completion_tokens %||% 0,
             result$usage$total_tokens %||% 0,
             cost, session_id,
             duration_ms = result$duration_ms)
R/_ragnar.R (outdated)

    # Mark store schema as current
    if (!is.null(con)) {
      mark_ragnar_store_current(con, notebook_id)
R/_ragnar.R (outdated)
Comment on lines +171 to +173

    # Require provider with API key (needed for embed function)
    if (is.null(provider) || is.null(provider$api_key) || nchar(provider$api_key) == 0) {
      stop("Provider with API key required to create/open ragnar store for embedding")
R/db.R (outdated)

    # Attach embed function for query vectorization (ragnar_retrieve needs it)
    if (!is.null(store) && !is.null(api_key) && nchar(api_key) > 0) {
      store@embed <- make_embed_function(api_key, embed_model)
    has_provider <- !is.null(provider) && !is.null(provider$api_key) && nchar(provider$api_key) > 0
R/mod_search_notebook.R (outdated)
Comment on lines 515 to 537

    @@ -531,7 +533,7 @@ mod_search_notebook_server <- function(id, con, notebook_id, config, notebook_re
       )
       result
       }, notebook_id = notebook_id, documents = documents, abstracts = abstracts,
    -  api_key = api_key, embed_model = embed_model, interrupt_flag = interrupt_flag,
    +  provider = provider, embed_model = embed_model, interrupt_flag = interrupt_flag,
       progress_file = progress_file, app_dir = app_dir)
…back

provider_from_config() now resolves the default provider from the DB providers table instead of always returning OpenRouter. This unblocks local models (Ollama, LM Studio) for all LLM operations.

- provider_from_config: add con param, resolve DB default, fall back to OpenRouter
- build_provider_request: handle NA api_key from DuckDB (not just NULL)
- get_ragnar_store: remove hard api_key requirement, allow local providers
- search_chunks_hybrid: simplify has_provider guard to !is.null(provider)
- estimate_cost: pass is_local flag at all 11 call sites in rag.R
- mark_ragnar_store_current: pass embed_model for mismatch detection
- async reindex: plumb db_path through mirai worker for stale-index metadata
- OA usage: use UTC dates for daily usage/toast dedup
- migration 011: fix gen_random_uuid() to DuckDB-native uuid()
- api_key guards: update mod_query_builder, mod_search_notebook to allow local providers
- Add 23 integration tests verified against live LM Studio (Gemma 3 + nomic embeddings)
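The NA-versus-NULL distinction mentioned for build_provider_request matters because a key read from a database column can come back as NA, which `is.null()` does not catch. A small guard along these lines covers both cases; the helper name and its exact checks are hypothetical, not taken from the PR.

```r
# Hypothetical guard: TRUE only for a single, non-missing, non-empty string.
# Local providers without a key would simply skip the auth header.
has_api_key_sketch <- function(key) {
  !is.null(key) && length(key) == 1 && !is.na(key) && nzchar(key)
}

has_api_key_sketch(NULL)           # FALSE
has_api_key_sketch(NA_character_)  # FALSE, the case is.null() alone misses
has_api_key_sketch("")             # FALSE
has_api_key_sketch("sk-example")   # TRUE
```

Checking `length(key) == 1` before `is.na()` keeps the `&&` chain from being handed a zero-length or multi-element value.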
This was referenced Mar 20, 2026
- cost_tracking.R: merge HEAD's slot field with integration's refiner_eval entry
- api_openalex.R: keep perform_oa_request() (v15 usage tracking), wire through perform_openalex() for verbose logging
- TODO.md: combine completed items from both branches
- Add bounds checks on choices/data arrays in provider API responses
- Add tryCatch in make_embed_function for provider error context
- Add removeModal() before showModal() in slide generation to prevent modal stacking
- Wrap resp_body_json in tryCatch in fetch_aa_models for non-JSON fallback
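A bounds check on the choices array of an OpenAI-compatible chat response can be sketched as follows. The helper name and error messages are hypothetical; the `choices[[1]]$message$content` path is the standard shape of such responses once parsed into an R list.

```r
# Hypothetical extractor: fail with a clear message instead of a
# "subscript out of bounds" error when the provider returns no choices.
extract_first_choice_sketch <- function(body) {
  choices <- body$choices
  if (is.null(choices) || length(choices) < 1) {
    stop("Provider response contained no choices", call. = FALSE)
  }
  content <- choices[[1]]$message$content
  if (is.null(content)) {
    stop("Provider response choice had no message content", call. = FALSE)
  }
  content
}

ok <- list(choices = list(list(message = list(content = "hello"))))
extract_first_choice_sketch(ok)  # "hello"
```

An explicit message is easier to surface in a UI notification than the raw indexing error a missing element would otherwise raise.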
Summary
Complete v15 AI Infrastructure milestone: 5 phases, 14 commits, 215 new tests.
Set A — Retrieval & Usage Tracking
Set B — AI Infrastructure (5 phases)
- provider_chat_completion/provider_get_embeddings interface, all 15+ call sites migrated, duration_ms timing on every call
- COST_OPERATION_META slots and resolve_model_for_operation(), Settings UI with 3 dropdowns, chat_model → quality_model migration
- R/api_artificialanalysis.R with fetch/load/cache, bundled snapshot (14 models), model ID matching (manual + fuzzy), enriched model picker labels (Q:score, tok/s, $/M), smart defaults algorithm, model info panel enrichment

Key files
- R/api_provider.R
- R/api_artificialanalysis.R
- R/cost_tracking.R
- R/mod_settings.R
- R/mod_cost_tracker.R
- migrations/012_*.sql
- migrations/013_*.sql

Test plan