fix(tools): rank natural-language tool search #260

Open
dom-mp wants to merge 1 commit into NimbleBrainInc:main from dom-mp:fix-natural-language-tool-search-clean

@dom-mp
Contributor

@dom-mp dom-mp commented May 20, 2026

Summary

Fixes #33.
nb__search previously used literal substring matching over the full query, so natural-language queries like "todo task create" failed unless that exact phrase appeared contiguously in a tool name or description.
This PR adds deterministic tokenized ranking for tool discovery:

  • Tokenizes tool source/name/description on separators like -, _, and spaces.
  • Matches multi-term queries across source names, tool names, and descriptions.
  • Ranks fuller query-term coverage above partial matches.
  • Preserves empty-query browse behavior.
  • Reuses the same ranked search for invalid-tool-name suggestions.

This allows queries like "todo task create" and "todo board" to discover synapse-todo-board__create_board_task.
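
The ranking described above could look roughly like the following. This is a minimal sketch, not the code from the PR: the ToolEntry and RankedTool shapes, the tokenize and rankTools names, and the scoring weights are all assumptions made for illustration.

```typescript
// Hypothetical tool shape; the real registry types may differ.
interface ToolEntry {
  source: string;
  name: string;
  description: string;
}

interface RankedTool {
  tool: ToolEntry;
  matchedTerms: number;
  score: number;
}

// Lowercase and split on anything that is not a letter or digit ("-", "_", spaces, ...).
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

function rankTools(query: string, tools: ToolEntry[]): RankedTool[] {
  const queryTerms = Array.from(new Set(tokenize(query)));

  // Empty query: browse mode, return every tool in its original order.
  if (queryTerms.length === 0) {
    return tools.map((tool) => ({ tool, matchedTerms: 0, score: 0 }));
  }

  const ranked: RankedTool[] = [];
  for (const tool of tools) {
    const nameTokens = new Set(tokenize(tool.source).concat(tokenize(tool.name)));
    const descriptionTokens = new Set(tokenize(tool.description));

    let matchedTerms = 0;
    let score = 0;
    for (const term of queryTerms) {
      if (nameTokens.has(term)) {
        matchedTerms += 1;
        score += 2; // assumed weight: source/name hits count more than description hits
      } else if (descriptionTokens.has(term)) {
        matchedTerms += 1;
        score += 1;
      }
    }

    if (matchedTerms === 0) continue; // no query term matched any whole token
    ranked.push({ tool, matchedTerms, score });
  }

  // Fuller query-term coverage first, then score, then name so the order is deterministic.
  ranked.sort(
    (a, b) =>
      b.matchedTerms - a.matchedTerms ||
      b.score - a.score ||
      a.tool.name.localeCompare(b.tool.name),
  );
  return ranked;
}
```

With a tool whose source is synapse-todo-board and whose name is create_board_task, the query "todo task create" matches all three terms even though they never appear contiguously. Note that this sketch matches whole tokens only.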

Testing

  • bun test test/unit/system-tools.test.ts
  • bun run check
  • bun run format:check
  • bun run lint

Tokenize tool names and descriptions so nb__search can match multi-term queries like "todo task create" across hyphenated source names, tool names, and descriptions.
Reuse the ranked search for invalid-tool suggestions and add regressions for todo-board discovery.
Fixes NimbleBrainInc#33
@mgoldsborough
Contributor

Thanks for picking this up! The tokenization + multi-term matching is the right shape for the bug, and the new tests cover the three motivating queries from the issue cleanly. Two small things I'd want to address before merging — both are narrow fixes, not asking for a rework.

1. Single-word prefix queries regress. The old substring filter matched "greet" against tool "greeting"; the new ranker tokenizes both into atomic terms ({"greeting"} ∌ "greet"), so matchedTerms === 0 and the tool gets dropped at search-ranking.ts:72. The name.includes(normalizedQuery) bonus a few lines up is actually dead: it raises the score, but the if (matchedTerms === 0) continue gate drops the tool before the score is ever read. This regresses common short queries: "auth" → authenticate_user, "config" → configure, "doc" → documents. One-line fix:

if (matchedTerms === 0 && !name.includes(normalizedQuery) && !description.includes(normalizedQuery)) continue;

Worth a regression test (query: "greet" finds "greeting").
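
To make the suggested gate concrete, here is a small self-contained sketch of how it behaves. The survivesGate helper and its tokenizer are hypothetical stand-ins for the logic in search-ranking.ts, which is not reproduced here; only the condition itself comes from the one-line fix above.

```typescript
// Assumed tokenizer: lowercase, split on any non-alphanumeric separator.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

// Hypothetical helper: returns true when a tool should stay in the results.
function survivesGate(query: string, toolName: string, toolDescription: string): boolean {
  const normalizedQuery = query.trim().toLowerCase();
  const name = toolName.toLowerCase();
  const description = toolDescription.toLowerCase();

  // Count query terms that equal a whole token of the name or description.
  const toolTokens = new Set(tokenize(name).concat(tokenize(description)));
  const matchedTerms = tokenize(normalizedQuery).filter((term) => toolTokens.has(term)).length;

  // The gate from the suggested fix: drop the tool only when no token matched
  // AND the raw query is not a substring of the name or the description.
  if (
    matchedTerms === 0 &&
    !name.includes(normalizedQuery) &&
    !description.includes(normalizedQuery)
  ) {
    return false;
  }
  return true;
}

// Regression-style checks for the short queries mentioned in the review.
console.log(survivesGate("greet", "greeting", "Say hello")); // true
console.log(survivesGate("auth", "authenticate_user", "")); // true
console.log(survivesGate("billing", "greeting", "Say hello")); // false
```

One property worth keeping in mind: every string includes the empty string, so an empty normalizedQuery always passes this gate.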

2. The system-tools handler is uncapped. registry.ts:236 already does .slice(0, 5) on the suggestion path, but system-tools.ts:163 returns every match. Multi-term queries that the old filter would have returned 0 for can now return 50+ tools, each with full description. Suggest capping at 25 and adjusting the message: Found N tool(s) (showing top 25).
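
A sketch of what the cap could look like, assuming the handler already has the ranked matches in an array. The capResults name, its return shape, and the MAX_SEARCH_RESULTS constant are hypothetical; only the limit of 25 and the message wording come from the suggestion above.

```typescript
// Assumed constant name; 25 is the cap suggested in the review.
const MAX_SEARCH_RESULTS = 25;

// Hypothetical helper: truncate ranked matches and build the summary line.
function capResults<T>(
  matches: T[],
  limit: number = MAX_SEARCH_RESULTS,
): { shown: T[]; message: string } {
  const shown = matches.slice(0, limit);
  const message =
    matches.length > limit
      ? `Found ${matches.length} tool(s) (showing top ${limit})`
      : `Found ${matches.length} tool(s)`;
  return { shown, message };
}
```

Because the input is already ranked, slicing from the front keeps the best matches, and reporting the full count tells the caller that narrowing the query would surface more.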

Tests, typecheck, and lint all pass locally. Nice clean module separation — having search-ranking.ts as a pure function makes both fixes one-liners.

Development

Successfully merging this pull request may close these issues.

nb__search matches literal prefixes, not natural-language tool queries
