Skip to content

Agent install bootstrap uses set -o pipefail but runs under sh/dash — install fails on line 1 #341

@xdotli

Description

@xdotli

Summary

benchflow eval create --agent <js-agent> --sandbox docker fails at the agent-install step for any task whose container image has dash as /bin/sh (e.g. an ubuntu:24.04 base). The Node/JS agent bootstrap script begins with set -o pipefail, which dash does not support, so the script aborts on line 1.

Repro

benchflow eval create --tasks-dir <dir with an ubuntu-based task> --agent gemini --sandbox docker

Result:

benchflow.models.AgentInstallError: Agent gemini install failed (rc=2)

agent/install-stdout.txt in the job dir shows:

$ set -o pipefail; export DEBIAN_FRONTEND=noninteractive; BF_NODE_DIR=/opt/benchflow/node; ...
sh: 1: set: Illegal option -o pipefail

Root cause

src/benchflow/agents/registry.py — the _NODE_INSTALL template starts with "set -o pipefail; ". The install command is executed in the sandbox under /bin/sh, which on Debian/Ubuntu images is dash; dash's set has no -o pipefail, so the whole script exits non-zero immediately.

Notably src/benchflow/sandbox/setup.py:108 already handles exactly this — shell = "/bin/bash" if "pipefail" in body else "/bin/sh" — but the agent-install path (agents/install.py) does not apply that bash-detection.

Suggested fix

Either:

  • run the agent-install script with /bin/bash when it contains bash-isms (reuse the setup.py:108 pattern), or
  • drop set -o pipefail from _NODE_INSTALL — the script already uses && chaining for error propagation, so pipefail is not load-bearing.

Environment

benchflow @ main, Docker 29.1.3, task base image ubuntu:24.04, host Ubuntu 22.04.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions