Skip to content

Avoid shell invocation for local log tailing#2580

Closed
resolvicomai wants to merge 1 commit into
PrimeIntellect-ai:mainfrom
resolvicomai:codex/no-shell-tail
Closed

Avoid shell invocation for local log tailing#2580
resolvicomai wants to merge 1 commit into
PrimeIntellect-ai:mainfrom
resolvicomai:codex/no-shell-tail

Conversation

@resolvicomai
Copy link
Copy Markdown

@resolvicomai resolvicomai commented May 21, 2026

Summary

  • add a small helper to start tail/sed log streaming with argv lists instead of shell strings
  • use it for local RL and SFT log tailing
  • cover both plain tailing and torchrun-prefix stripping with unit tests, including paths containing shell metacharacters

Testing

  • PYTHONPATH=src uv run --no-project --with pytest --with psutil --with setproctitle --with loguru pytest -q tests/unit/utils/test_process.py
  • uv run --no-project --with ruff ruff check src/prime_rl/utils/process.py src/prime_rl/entrypoints/sft.py src/prime_rl/entrypoints/rl.py tests/unit/utils/test_process.py
  • python3 -m py_compile src/prime_rl/utils/process.py src/prime_rl/entrypoints/sft.py src/prime_rl/entrypoints/rl.py tests/unit/utils/test_process.py
  • git diff --check

Note: I used --no-project on macOS because the checked-in uv lockfile only supports Linux environments.


Note

Low Risk
Low risk: only changes how local launcher log streaming spawns tail/sed, reducing shell-injection exposure while keeping core training/orchestration execution unchanged.

Overview
Updates local rl and sft launchers to stream logs via a new start_tail_processes() helper instead of Popen(..., shell=True).

Adds start_tail_processes() in utils/process.py to run tail -F (optionally piping through sed to strip torchrun rank prefixes) using argv lists, and includes unit tests covering both modes and log paths containing shell metacharacters.

Reviewed by Cursor Bugbot for commit ddf3b39. Bugbot is set up for automated code reviews on this repo. Configure here.

@mikasenghaas
Copy link
Copy Markdown
Member

hey! not sure what the purpose of this pr is, can you explain the current painpoint?

@resolvicomai
Copy link
Copy Markdown
Author

Thanks for asking. The pain point is local launcher robustness around log paths.

Both local entrypoints currently build shell strings for log streaming (tail -F ..., and tail | sed in SFT). If the configured output/log path contains a single quote or shell metacharacters, shell parsing can break; with user-controlled config paths, that can also turn into unintended shell execution.

This PR keeps the training/orchestration processes unchanged. It only starts the log tailing pipeline with argv lists so the log path is passed literally, and the tests cover a path containing shell metacharacters plus the torchrun-prefix stripping pipeline.

If this defensive hardening is not worth the extra helper in this codebase, I can close or reduce the patch.

@mikasenghaas
Copy link
Copy Markdown
Member

@resolvicomai if we can remove the tests form the pr i'd be okay with merging this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants