
Conversation

@heaven00 heaven00 commented Nov 2, 2025

This is an experimental testing strategy that uses the descriptions of the tools plus a set of curated queries for each tool. It will let us measure:

  • an indicator of whether a tool is likely to be picked for a certain kind of query
  • possible overlaps between tool descriptions that can reduce the chance of the right tool being used

A rough sketch of the idea follows below.
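A minimal sketch of what those two measurements could look like, assuming a placeholder similarity function (the PR itself may score queries differently, e.g. via an LLM); the tool names, descriptions, and queries here are made up for illustration:

```python
# Hypothetical sketch of the evaluation idea described above.
# TOOLS (name -> description) and CURATED_QUERIES (name -> example queries)
# stand in for whatever the actual test suite defines.
from difflib import SequenceMatcher

TOOLS = {
    "list_pipelines": "List all dlt pipelines available in the current project.",
    "inspect_schema": "Show the schema of a dlt pipeline's dataset.",
}
CURATED_QUERIES = {
    "list_pipelines": ["what pipelines do I have?", "show me every pipeline"],
    "inspect_schema": ["what columns are in the orders table?"],
}

def score(query: str, description: str) -> float:
    # Placeholder similarity; the real strategy could use an LLM or embeddings.
    return SequenceMatcher(None, query.lower(), description.lower()).ratio()

def best_tool(query: str) -> str:
    # Pick the tool whose description scores highest for the query.
    return max(TOOLS, key=lambda name: score(query, TOOLS[name]))

# 1) Is the right tool likely to be picked for its curated queries?
for name, queries in CURATED_QUERIES.items():
    hits = sum(best_tool(q) == name for q in queries)
    print(f"{name}: {hits}/{len(queries)} queries routed correctly")

# 2) Do any two descriptions overlap enough to confuse routing?
names = list(TOOLS)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        overlap = score(TOOLS[a], TOOLS[b])
        if overlap > 0.5:  # arbitrary threshold for illustration
            print(f"possible overlap: {a} <-> {b} ({overlap:.2f})")
```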

There are probably better approaches, such as using https://sbert.net/examples/sentence_transformer/applications/semantic-search/README.html and improving our tool descriptions to contain examples, etc., but this PR is meant to showcase the strategy.
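For reference, a minimal semantic-search sketch with sentence-transformers along the lines of that link; the model name, descriptions, and query below are illustrative and not part of this PR:

```python
# Rank tool descriptions against a query by embedding cosine similarity.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

descriptions = [
    "List all dlt pipelines available in the current project.",
    "Show the schema of a dlt pipeline's dataset.",
]
desc_embeddings = model.encode(descriptions, convert_to_tensor=True)

query = "what columns are in the orders table?"
query_embedding = model.encode(query, convert_to_tensor=True)

# semantic_search returns the best-matching descriptions for each query.
hits = util.semantic_search(query_embedding, desc_embeddings, top_k=len(descriptions))[0]
for hit in hits:
    print(descriptions[hit["corpus_id"]], round(hit["score"], 3))
```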

It's also functional in its current form, so we can improve on it and have a discussion :)

@heaven00 heaven00 requested a review from zilto November 2, 2025 15:57
@heaven00 heaven00 self-assigned this Nov 2, 2025
@heaven00 heaven00 marked this pull request as draft November 2, 2025 15:57

zilto commented Nov 12, 2025

I really like the direction of this! I think it's a great occasion for dog-fooding dlt-hub/dlt. I'll open a separate repository to build the eval pipelines :)

@zilto zilto closed this Nov 12, 2025