feat(search): add Jina.ai as web search provider #513

Open
leszek3737 wants to merge 5 commits into moltis-org:main from leszek3737:add-jina.ai-search

@leszek3737 leszek3737 commented Mar 28, 2026

Summary

  • Add jina as a third web search provider alongside brave and perplexity
  • Jina Search API (s.jina.ai) accepts gl (country) and hl (language) query params, mapped from the existing country and search_lang
    tool parameters
  • DuckDuckGo fallback error messages and all docs updated to reference JINA_API_KEY
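The parameter mapping described above can be sketched roughly as follows. This is a minimal, std-only illustration and not the PR's actual code: `encode_path_segment` and `build_jina_url` are hypothetical names, the real implementation presumably goes through an HTTP client, and the `count` parameter is included only because the PR forwards it (the review below notes it is not in Jina's documented parameters).

```rust
/// Percent-encode a string for use as a single URL path segment.
/// RFC 3986 unreserved characters pass through; every other byte becomes %XX.
fn encode_path_segment(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char);
            }
            other => {
                out.push_str(&format!("%{:02X}", other));
            }
        }
    }
    out
}

/// Hypothetical helper: build the search URL from the tool parameters.
/// `country` is forwarded as `gl` and `search_lang` as `hl`.
fn build_jina_url(
    base_url: &str,
    query: &str,
    count: usize,
    country: Option<&str>,
    search_lang: Option<&str>,
) -> String {
    // The query goes into the URL path, so it must be percent-encoded.
    let mut url = format!(
        "{}/{}?count={}",
        base_url.trim_end_matches('/'),
        encode_path_segment(query),
        count
    );
    // Locale parameters are optional and only appended when provided.
    if let Some(gl) = country {
        url.push_str(&format!("&gl={}", encode_path_segment(gl)));
    }
    if let Some(hl) = search_lang {
        url.push_str(&format!("&hl={}", encode_path_segment(hl)));
    }
    url
}
```

Under these assumptions, calling `build_jina_url("https://s.jina.ai", "rust async", 5, Some("pl"), Some("pl"))` yields a URL whose path holds the encoded query and whose query string carries `gl=pl&hl=pl`, which is what Manual QA step 3 checks for.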

Validation

Completed

  • cargo test -p moltis-tools jina — 17 tests pass
  • cargo check -p moltis-tools — clean
  • biome check --write — no JS changes
  • Config template, configuration.md, docker.md updated

Remaining

  • just format (pinned nightly rustfmt)
  • just lint
  • just test
  • ./scripts/local-validate.sh <PR_NUMBER>

Manual QA

  1. Set JINA_API_KEY and add provider = "jina" to [tools.web.search] in moltis.toml
  2. Run moltis and send a message that triggers web search — verify results are returned
  3. Repeat with country = "pl" / search_lang = "pl" in a search call — verify gl=pl&hl=pl are forwarded
  4. Remove the API key and verify the error hint mentions JINA_API_KEY
  5. With duckduckgo_fallback = true and no key — verify DDG fallback works normally

Adrian Rogala and others added 3 commits March 28, 2026 07:35
Add Jina.ai (s.jina.ai) as a third search provider alongside Brave and
Perplexity. Uses the same shared api_key field — no extra config section
needed. Supports JINA_API_KEY env var for key resolution.
Add tests for HTTP error paths, malformed JSON responses, env var
resolution, cache key isolation, and api_key_candidates dispatch.
- Add country (gl) and search_lang (hl) query params for Jina provider
- Update DuckDuckGo fallback error messages to mention JINA_API_KEY
- Update doc comment on WebSearchTool to include Jina
- Trim content field in parse_jina_results for consistency
- Update configuration.md with Jina provider table and env vars
- Update docker.md to mention Jina API key
- Update config template api_key comment to include Jina
- Add test for gl/hl param forwarding

greptile-apps bot commented Mar 28, 2026

Greptile Summary

This PR adds Jina (s.jina.ai) as a third web search provider alongside Brave and Perplexity. The implementation follows established patterns: a new SearchProvider::Jina config variant, an API client method (search_jina) that encodes the query in the URL path and forwards gl/hl locale params, a JSON response parser, and a solid 17-test suite covering unit, mock-server, and edge cases.

Key changes:

  • crates/config/src/schema.rs — adds Jina variant to SearchProvider enum
  • crates/tools/src/web_search.rs — full provider implementation with JinaSearchResponse / JinaSearchResult structs, from_config branch, search + parse methods, and comprehensive tests
  • Config template, configuration.md, and docker.md updated consistently

Issues found:

  • The count query parameter forwarded to s.jina.ai is not listed in Jina's public SERP API docs (only an offset pagination param is documented). If the API silently ignores count, the max_results config setting will have no effect for Jina searches, and parse_jina_results has no client-side truncation to compensate
  • schema.rs api_key field doc comment still references only "Brave Search API key" — was not updated alongside template/docs
  • parse_jina_results returns an empty vec on deserialization failure without logging, making future response-format changes hard to diagnose

Confidence Score: 5/5

Safe to merge; all remaining findings are P2 style or speculative correctness concerns.

The implementation is well-structured and follows existing provider patterns closely. The count parameter concern is speculative (Jina's docs are incomplete; the parameter may work in practice), and the other findings are minor doc/observability improvements. No P0 or confirmed P1 defects found.

crates/tools/src/web_search.rs — verify whether count is actually supported by the Jina SERP API and add client-side truncation if uncertain.

Important Files Changed

| Filename | Overview |
| --- | --- |
| crates/tools/src/web_search.rs | Core Jina provider implementation: new structs, from_config branch, search_jina/search_jina_with_base_url methods, parse_jina_results, and 17 tests. Two concerns: the count query param may be silently ignored by the Jina SERP API (no client-side truncation fallback), and parse errors are swallowed without a log warning. |
| crates/config/src/schema.rs | Adds Jina variant to SearchProvider enum; the api_key field doc comment still only references Brave. |
| crates/config/src/template.rs | Config template updated to document Jina as a valid provider option alongside Brave and Perplexity. |
| docs/src/configuration.md | Adds Jina to the provider table with env var and supported params noted; all mentions of provider options updated correctly. |
| docs/src/docker.md | Docker deployment docs updated to include JINA_API_KEY examples in both -e flag and [env] config block. |

Sequence Diagram

sequenceDiagram
    participant LLM
    participant WebSearchTool
    participant Cache
    participant JinaAPI as s.jina.ai

    LLM->>WebSearchTool: execute({ query, count, country, search_lang })
    WebSearchTool->>Cache: cache_get(provider:key_state:query:count)
    alt Cache hit
        Cache-->>WebSearchTool: cached result
    else Cache miss
        alt api_key present
            WebSearchTool->>JinaAPI: GET /{encoded_query}?count=N&gl=..&hl=..
            Note over WebSearchTool,JinaAPI: Authorization: Bearer {JINA_API_KEY}
            alt HTTP 2xx
                JinaAPI-->>WebSearchTool: { data: [ {title, url, content}, ... ] }
                WebSearchTool->>WebSearchTool: parse_jina_results(body)
                WebSearchTool-->>LLM: { provider:"jina", query, results }
            else HTTP error
                JinaAPI-->>WebSearchTool: error status + body
                WebSearchTool-->>LLM: Err("Jina Search API returned {status}")
            end
        else api_key empty AND fallback_enabled
            WebSearchTool->>WebSearchTool: search_duckduckgo(query, count)
            WebSearchTool-->>LLM: DDG results
        else api_key empty AND !fallback_enabled
            WebSearchTool-->>LLM: { error: "not configured", hint: "Set JINA_API_KEY" }
        end
        WebSearchTool->>Cache: cache_set(key, result)
    end
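
The branching in the diagram can be approximated with a small routing function. This is a hedged sketch only: `SearchRoute`, `route_search`, and `cache_key` are hypothetical names, and the `"key"` / `"nokey"` strings for the key-state component are assumptions; the diagram only shows that the cache key has the shape `provider:key_state:query:count`.

```rust
/// Outcome of choosing a search backend for one request.
#[derive(Debug, PartialEq)]
enum SearchRoute {
    /// An API key is present: call the Jina API.
    Jina,
    /// No key, but the DuckDuckGo fallback is enabled.
    DuckDuckGoFallback,
    /// No key and no fallback: return an error with a hint.
    NotConfigured { hint: &'static str },
}

/// Hypothetical routing helper mirroring the three branches in the diagram.
fn route_search(api_key: &str, fallback_enabled: bool) -> SearchRoute {
    if !api_key.trim().is_empty() {
        SearchRoute::Jina
    } else if fallback_enabled {
        SearchRoute::DuckDuckGoFallback
    } else {
        SearchRoute::NotConfigured { hint: "Set JINA_API_KEY" }
    }
}

/// Hypothetical cache key in the `provider:key_state:query:count` shape.
/// Including the provider keeps entries from different providers isolated.
fn cache_key(provider: &str, has_key: bool, query: &str, count: usize) -> String {
    // The concrete key-state strings are an assumption made for illustration.
    let key_state = if has_key { "key" } else { "nokey" };
    format!("{provider}:{key_state}:{query}:{count}")
}
```

Keeping the provider and key state in the cache key means a result fetched through the DuckDuckGo fallback is never served for a later request that has a Jina key configured, which appears to be what the "cache key isolation" tests in the PR cover.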

Comments Outside Diff (1)

  1. crates/config/src/schema.rs, lines 1590-1596

    P2 Stale api_key doc comment still refers only to Brave

    The field comment still says "Brave Search API key" but this same field is now also used to configure the Jina API key. Template, docs, and docker.md were all updated to mention this dual use, but the Rust doc comment was missed.

Reviews (1): Last reviewed commit: "fix(search): add Jina gl/hl params, upda..."

- Fix stale api_key doc comment in schema.rs to mention Jina
- Add warn! log on Jina response deserialization failure
- Add client-side .take(max_results) truncation in parse_jina_results
  since Jina SERP API does not document a count parameter
- Add test for client-side truncation
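
The client-side truncation in this commit might look something like the sketch below. It is an illustration under assumptions, not the PR's code: the struct is a simplified stand-in for `JinaSearchResult` with the fields shown in the sequence diagram, `finalize_results` is a hypothetical name, and JSON deserialization and the `warn!` log are left out to keep the example std-only.

```rust
/// Simplified stand-in for the PR's `JinaSearchResult`
/// (field names taken from the sequence diagram).
#[derive(Debug, Clone, PartialEq)]
struct JinaSearchResult {
    title: String,
    url: String,
    content: String,
}

/// Hypothetical post-processing step: trim `content` and keep at most
/// `max_results` items, regardless of how many the API returned.
fn finalize_results(
    results: Vec<JinaSearchResult>,
    max_results: usize,
) -> Vec<JinaSearchResult> {
    results
        .into_iter()
        // Enforce the limit client-side, since the server may ignore `count`.
        .take(max_results)
        .map(|mut r| {
            r.content = r.content.trim().to_string();
            r
        })
        .collect()
}
```

Truncating on the client makes `max_results` effective even if the server ignores the undocumented `count` parameter, at the cost of possibly downloading more results than are used.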