Skip to content

Delegation serializes tokens back to string for re-parsing (lossy round-trip) #116

@ldayton

Description

@ldayton

Context

The bash_join fix for #109 corrected the immediate parse error by re-quoting tokens before joining them into a delegation string. This fixed uv, env, arch, and caffeinate handlers.

However, the fix papers over a deeper architectural issue: delegation works by serializing tokens back into a bash string and re-parsing it.

The round-trip problem

Original command:  uv run python -c "print('hello')"
                          │
                   Parable parse + _strip_quotes
                          │
                          ▼
Tokens:            ['python', '-c', "print('hello')"]
                          │
                   bash_join (serialize back to string)
                          │
                          ▼
Reconstructed:     python -c 'print('"'"'hello'"'"')'
                          │
                   Parable parse again
                          │
                          ▼
Re-parsed tokens:  ['python', '-c', "print('hello')"]

This works today, but it relies on bash_quote perfectly inverting whatever _strip_quotes did during the first parse. That's a coupling between the parser's quote-stripping and bash_quote's reconstruction that could break silently if either side changes, or if a token contains quoting constructs that bash_quote can't faithfully reconstruct (e.g., ANSI-C $'...' quotes, or nested quoting edge cases).

Proposed fix

Let Classification accept tokens directly so delegating handlers can skip the serialize/reparse cycle:

@dataclass(frozen=True)
class Classification:
    action: Literal["allow", "ask", "delegate"]
    inner_command: str | None = None
    inner_tokens: list[str] | None = None  # new: bypass re-parsing
    ...

When inner_tokens is set, _analyze_simple_command uses them directly instead of re-parsing inner_command. Handlers that currently do:

inner_cmd = bash_join(inner_tokens)
return Classification("delegate", inner_command=inner_cmd)

Would become:

return Classification("delegate", inner_tokens=inner_tokens)

inner_command would remain for cases where the inner command is a raw string (e.g., extracted from bash -c '...' where it genuinely needs parsing).

Scope

Delegating handlers that would benefit: uv, env, arch, caffeinate, xargs, fd, script, shell, kubectl, docker.

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions