feat: multi-turn search R1 example #914
I have a feeling that the retrieval server needs to be moved into `nemo_rl/environments/search/`, because `SearchEnv` seems to be coupled with this particular retrieval server and its particular (HTTP) API design.
```bash
uv pip install -U cmake
git clone https://github.com/facebookresearch/faiss.git thirdparty/faiss
```
Following the existing convention, it might make sense to make faiss a submodule of this repo.
The following instructions install Faiss-GPU from source. Please refer to [this](https://github.com/facebookresearch/faiss/blob/main/INSTALL.md) for more details.
To me, it seems there are enough shell commands in here to automate them into a script.
examples/searchR1/readme.md
Outdated
```
local_dir=./data/searchR1
uv run --active python searchr1_download.py --local_dir $local_dir
cat $local_dir/part_* > $local_dir/e5_Flat.index
gzip -d $local_dir/wiki-18.jsonl.gz
uv run --active python searchr1_dataset.py --local_dir $local_dir
```
Did you intend to replace this with `bash prepare.sh`?
yes, fixed.
numpy==1.26.4
transformers
datasets
pyserini
huggingface_hub
# faiss-gpu-cu12
uvicorn
fastapi
torch==2.6.0
Suggested change: reduce the list to the following lines.
pyserini
huggingface_hub
# faiss-gpu-cu12
uvicorn
fastapi

Only include dependencies that are not already present in pyproject.toml.
    return corpus


def read_jsonl(file_path):
Could you annotate the types and add docstrings for all the functions in this file?
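As a rough illustration of what this request might look like for `read_jsonl`, here is a minimal sketch. The function body is an assumption, since the PR's actual implementation is not shown in this thread; it assumes the function reads a JSON Lines file with one object per line and returns the objects as a list.

```python
import json
from pathlib import Path
from typing import Any, Union


def read_jsonl(file_path: Union[str, Path]) -> list[dict[str, Any]]:
    """Read a JSON Lines file into a list of dictionaries.

    Args:
        file_path: Path to a file containing one JSON object per line.

    Returns:
        The parsed objects, in file order. Blank lines are skipped.
    """
    records: list[dict[str, Any]] = []
    with open(file_path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # tolerate blank or trailing empty lines
                records.append(json.loads(line))
    return records
```

The same pattern (typed parameters, a typed return value, and an `Args`/`Returns` docstring) would apply to the other functions in the file.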
    tokenizer: TokenizerType,
    max_seq_length: int,
    idx: int,
) -> DatumSpec:
Add docstrings to the functions in this file.
# See the License for the specific language governing permissions and
# limitations under the License.

# Adapted from https://github.com/NovaSky-AI/SkyRL/blob/main/skyrl-train/examples/search/searchr1_dataset.py
(This is just a side note; no need to address anything)
Both files are from NovaSky-AI/SkyRL, but somehow this one has noticeably better coding style than examples/searchR1/retrieval_server.py.
INITIAL_RETRY_DELAY = 1


class SearchEnvConfig(TypedDict):
Add docstrings for all classes and functions/methods in this file.
    answer: Optional[str]


def call_search_api(
Suggested change: rename `def call_search_api(` to `def _call_search_api(`.
This seems more like an internal helper function to me.
fixed
@ray.remote
class SearchEnv(EnvironmentInterface[SearchEnvMetadata]):
We should probably add a unit test for this. I suspect it won't be easy to bring up a 2-GPU retrieval server for unit tests, so you might need to mock `requests.session` for unit testing this.
@soodoshll will you be able to address the comments ASAP?
@soodoshll is currently working on #883, which, as I understand, has a much higher priority than this Search-R1 example (i.e., a nice-to-have). If the priority should be reversed, we can adapt accordingly.
A general issue is that the retrieval server might run in an environment (quite) different from the nemo-rl env. For example, faiss depends on numpy 1.x.
Signed-off-by: Qidong Su <[email protected]>
update
Signed-off-by: Qidong Su <[email protected]>
upd
Signed-off-by: Qidong Su <[email protected]>
stash
Signed-off-by: Qidong Su <[email protected]>
fix many things
Signed-off-by: Qidong Su <[email protected]>
stash
Signed-off-by: Qidong Su <[email protected]>
clean
Signed-off-by: Qidong Su <[email protected]>
clean
Signed-off-by: Qidong Su <[email protected]>
update copyright
Signed-off-by: Qidong Su <[email protected]>
update copyright
Signed-off-by: Qidong Su <[email protected]>
clean fix
Signed-off-by: Qidong Su <[email protected]>
fix
Signed-off-by: Qidong Su <[email protected]>
fix
Signed-off-by: Qidong Su <[email protected]>
Signed-off-by: Qidong Su <[email protected]>
Signed-off-by: Qidong Su <[email protected]>
What does this PR do?
Address #657