refactor(search): remove composable search taskset (migrated to research-environments v1) #1854

Open
hallerite wants to merge 2 commits into main from feat/search-v1

hallerite (Member) commented Jun 24, 2026

What

Removes the v0 composable search taskset family (QUEST / OpenSeeker / REDSearcher) from verifiers. The search environments are migrated to the new v1 taskset/harness and now live entirely in research-environments (PrimeIntellect-ai/research-environments#530): a harness-agnostic search-v1 taskset + the rlm-search-v1 agent env.

Deletes verifiers/envs/experimental/composable/tasksets/search/ (and its make_search_taskset / make_quest_taskset / make_openseeker_taskset / make_redsearcher_taskset factories).

Why

The v1 port reproduces v0 behavior at parity (scoring is byte-identical; verified by a live old-vs-new comparison — see #530), so the v0 composable copy is redundant. Keeping search in one place (research-environments, on v1) avoids drift.
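
For anyone who wants to spot-check parity on their own data, a minimal sketch of an old-vs-new scoring comparison follows. This is not the comparison run in #530; `find_score_mismatches`, the `(answer, reference)` sample shape, and both toy scorers are hypothetical stand-ins for the real v0 and v1 scoring paths.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical sample shape: (model answer, reference answer).
Sample = Tuple[str, str]
Scorer = Callable[[str, str], float]


def find_score_mismatches(
    old_score: Scorer, new_score: Scorer, samples: Iterable[Sample]
) -> List[Tuple[Sample, float, float]]:
    """Return every sample on which the two scorers disagree."""
    mismatches = []
    for sample in samples:
        answer, reference = sample
        old = old_score(answer, reference)
        new = new_score(answer, reference)
        # Compare repr() so formatting-level differences (1 vs 1.0) are
        # reported too, in the spirit of a byte-identical check.
        if repr(old) != repr(new):
            mismatches.append((sample, old, new))
    return mismatches


# Toy scorers for illustration only; neither is the real v0 or v1 scorer.
def toy_v0(answer: str, reference: str) -> float:
    return 1.0 if answer.strip().lower() == reference.strip().lower() else 0.0


def toy_v1(answer: str, reference: str) -> float:
    return float(answer.strip().lower() == reference.strip().lower())


SAMPLES = [("Paris", "paris"), (" 42 ", "42"), ("Rome", "Paris")]
MISMATCHES = find_score_mismatches(toy_v0, toy_v1, SAMPLES)
```

An empty `MISMATCHES` list means the two scorers agreed on every sample; each entry otherwise carries the sample and both scores for inspection.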

Impact / migration

Downstream v0 rlm_search usage (the composable env that imported …composable.tasksets.search) should migrate to rlm-search-v1. No other code in verifiers imports the composable search taskset.
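
Downstream repos can check whether they are affected with a quick import scan. The sketch below uses only the standard library; `find_stale_imports` and `scan_tree` are hypothetical helper names, and because the module path above is elided, the scan matches only on the `composable.tasksets.search` suffix rather than assuming a full dotted path.

```python
import ast
from pathlib import Path
from typing import List, Tuple

# Suffix of the removed package; the full prefix is elided in the PR text.
REMOVED_SUFFIX = "composable.tasksets.search"


def _mentions_removed(module: str) -> bool:
    # Pad with dots so "search" matches only as a whole path component.
    return f".{REMOVED_SUFFIX}." in f".{module}."


def find_stale_imports(source: str) -> List[Tuple[int, str]]:
    """Return (line number, dotted name) for imports of the removed package."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            candidates = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            base = node.module or ""
            # Cover both "from x.search import y" and "from x import search".
            candidates = [base] + [f"{base}.{alias.name}" for alias in node.names]
        else:
            continue
        for name in candidates:
            if _mentions_removed(name):
                hits.append((node.lineno, name))
                break
    return sorted(hits)


def scan_tree(root: str) -> List[Tuple[str, int, str]]:
    """Scan every .py file under root and collect stale imports."""
    results = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            hits = find_stale_imports(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that are not parseable Python
        results.extend((str(path), lineno, name) for lineno, name in hits)
    return results
```

Running `scan_tree(".")` from a repo root lists each file, line, and dotted name that still points at the removed package. An empty list suggests nothing there needs migrating, though dynamic imports (for example via `importlib`) would not be caught.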

🤖 Generated with Claude Code


Note

Medium Risk
Large removal of public experimental APIs; breakage only for code still importing composable search from verifiers, but the surface area and vendored QUEST runtime make this a significant delete rather than a trivial cleanup.

Overview
Removes the experimental composable search taskset from verifiers now that equivalent v1 search support lives in research-environments (search-v1 + rlm-search-v1).

The deleted tree under verifiers/envs/experimental/composable/tasksets/search/ included the make_search_taskset dispatcher and three backends—QUEST (objective eval scripts + vendored obj_task_eval, open-ended rubric judging), OpenSeeker (binary LLM semantic judge), and REDSearcher (exact-match shortcut + BROWSECOMP-style judge)—plus their READMEs and public exports from search/__init__.py.

Callers that used v0 composable search or rlm_search wired to these factories should switch to the v1 stack in research-environments instead of importing from verifiers.

Reviewed by Cursor Bugbot for commit c30669f. Bugbot is set up for automated code reviews on this repo.

Note

Remove composable search taskset migrated to research-environments v1

Deletes the README.md for the composable search taskset, which has been migrated to research-environments v1.

Macroscope summarized c30669f.

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit 80c3286.

Comment thread environments/search_v1/search_v1/_base.py Outdated
Comment thread environments/search_v1/search_v1/openseeker.py Outdated
Comment thread environments/search_v1/search_v1/quest/quest.py Outdated
Comment thread environments/search_v1/search_v1/quest/obj_task_eval/eval_toolkit.py Outdated
Comment thread environments/search_v1/search_v1/quest/obj_task_eval/eval_toolkit.py Outdated

macroscopeapp Bot commented Jun 24, 2026

Approvability

Verdict: Needs human review

Diff is too large for automated approval analysis. A human reviewer should evaluate this PR.

No code changes detected at c30669f. Prior analysis still applies.

hallerite changed the title from "feat(search-v1): port QUEST/OpenSeeker/REDSearcher search tasksets to v1" to "refactor(search): remove composable search taskset (migrated to research-environments v1)" on Jun 24, 2026
hallerite and others added 2 commits June 24, 2026 05:29
Ports the composable (v0) search taskset family to one harness-agnostic
`vf.Taskset` (`search-v1`) with a `backend` config selecting QUEST /
OpenSeeker / REDSearcher. QUEST's obj_task_eval evaluator (16 files) and
open_ended.py are vendored byte-identical to v0; OpenSeeker/REDSearcher
judge prompts, parse, exact-match and normalization match v0 exactly.
The agent writes /task/answer.txt; scoring reads it from the live runtime.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
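
The commit message above says the QUEST evaluator files are vendored byte-identical to v0. One way to spot-check a claim like that is to hash both directory trees and compare; the sketch below is an illustrative approach under that assumption, not the check the author ran, and `tree_digests` / `differing_files` are hypothetical helper names.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def tree_digests(root: str) -> Dict[str, str]:
    """Map each file path (relative to root, POSIX style) to its SHA-256 digest."""
    base = Path(root)
    return {
        path.relative_to(base).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(base.rglob("*"))
        if path.is_file()
    }


def differing_files(old_root: str, new_root: str) -> List[str]:
    """Return relative paths that are missing on one side or differ in content."""
    old, new = tree_digests(old_root), tree_digests(new_root)
    return sorted(
        name for name in old.keys() | new.keys() if old.get(name) != new.get(name)
    )
```

Calling `differing_files` with the v0 and v1 copies of the vendored directory returns an empty list when the two trees match byte for byte. Stray files such as `__pycache__` contents would show up as differences, so those may need filtering first.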
…rch-environments v1)

The QUEST/OpenSeeker/REDSearcher search tasksets move to the new v1
taskset/harness in research-environments (search-v1 + rlm-search-v1).
Removes the v0 composable search taskset family from verifiers; the v1
port lives entirely in research-environments.

Note: this removes the v0 `make_search_taskset` family; downstream v0
`rlm_search` usage should migrate to `rlm-search-v1`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>