Skip to content

Discuss: FindFilesStd / FindFiles path-result semantics (dataDir prefix in matches) #3063

Description

@tomjn

Context

While reviewing the macOS draft PR #3060, one cross-platform (non-macOS-specific) change stood out as worth discussing on its own rather than riding along in a platform PR.

In rts/System/FileSystem/FileSystem.cpp, Impl::FindFilesStd() currently emits each match using the full iterated path:

auto entryPathStr = entry.path().generic_u8string();
...
matches.emplace_back(Impl::StoreUTF8AsString(entryPathStr));

That full path includes the dataDir prefix that was only supposed to be used to locate the search directory. #3060 proposes reconstructing each result as dir + <path relative to dir> instead, so the dataDir root never leaks into results:

const std::string dirRelPrefix = FileSystem::ForwardSlashes(dirStr);
...
const std::u8string entryRelStr = entry.path().lexically_relative(dirFullPath).generic_u8string();
std::string entryPathStr = dirRelPrefix + Impl::StoreUTF8AsString(entryRelStr);

(plus a header doc comment formalising the contract: result is absolute iff dir is absolute; dataDir is never part of the result).

Why this needs discussion, not a quiet port

This changes the observable result of FileSystem::FindFiles for every caller on every platform, not just macOS. It's either:

  • a genuine latent bug fix (callers that pass a non-empty dataDir getting the root spliced into their results), or
  • a behaviour change that could regress callers that currently rely on (or are unaffected by) the existing output.

Questions to resolve before adopting:

  1. Is the dataDir-prefix-in-results behaviour actually wrong today, or do all current callers pass dataDir = "" (so it never manifests)?
  2. Which call sites would change output, and are any of them sensitive (archive scanning, VFS path matching, Lua VFS.DirList, etc.)?
  3. Should this be paired with a small test covering both the relative-dir and absolute-dir cases described in the proposed header contract?

Flagging here so it can be decided on its merits independently of the macOS bring-up. The other clearly-portable bits of #3060 (Mac crash-handler arm64 symbolication, the <X11/Xlib.h> guard) are being split into their own focused PRs; the ArchiveScanner empty-pool guard from #3060 is already in master via #2961.

Ref: #3060

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Fields

    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions