Skip to content

Conversation

@BrianLang
Copy link

Hierarchical .rscignore Implementation

Summary

This PR implements hierarchical .gitignore-style behavior for .rscignore files as discussed in #1178, making them work like .gitignore files where parent directory patterns affect subdirectories and allowing general wildcard matching as in .gitignore files.

Key Changes

Core Behavior Change

  • Before: .rscignore patterns only affected files in the same directory and use literal string matching to select ignored files.
  • After: .rscignore patterns inherit to subdirectories and support most .gitignore supported patterns (see below)
# Example: Root .rscignore with "*.log"
project/
├── .rscignore          # "*.log"
├── app.log             # ✅ Ignored (same as before)
├── src/
│   ├── debug.log       # ✅ NOW ignored (new behavior)
│   └── main.R          # ✅ Preserved
└── tests/
    └── error.log       # ✅ NOW ignored (new behavior)

SUPPORTED Features

Basic Patterns

  • *.log (wildcard patterns)
  • temp* (prefix patterns)
  • cache/ (directory patterns)
  • /README (root-relative patterns)

Hierarchical Behavior

  • Parent .rscignore affects subdirectories
  • Child .rscignore overrides parent patterns
  • !pattern (negation patterns work across directories)

Advanced Patterns

  • **/cache.tmp (matches at any depth)
  • **/logs/*.log (complex globstar patterns)
  • # comments and empty lines
  • Pattern precedence (most specific wins)

Examples That Work

# Root .rscignore
*.log
!important.log
cache/
**/node_modules/

# src/.rscignore  
!debug.log
temp*

NOT SUPPORTED Features

Advanced Pattern Syntax

  • [abc] (character classes)
  • [a-z] (character ranges)
  • *.{js,css} (brace expansion)
  • file*.txt (escape sequences)

Implementation Details

New Functions Added

  • collectHierarchicalPatterns() - Collects patterns from current directory up to project root
  • applyRscignorePatterns() - Enhanced to use hierarchical patterns with fallback
  • applyRscignorePatternsLegacy() - Preserves old behavior during transition

Modified Functions

  • bundleFiles() - Now passes bundle root context
  • ignoreBundleFiles() - Uses hierarchical pattern collection
  • All recursive file functions - Thread bundle root through call chain

Pattern Precedence Rules

  • Child overrides parent: Subdirectory .rscignore patterns take precedence
  • Negation support: !pattern in child can un-ignore parent's pattern
  • Standard .gitignore behavior: Follows established gitignore precedence rules

Lifecycle Management

Breaking Change Strategy

Following RStudio/Posit patterns for managed breaking changes:

# Default: New hierarchical behavior
options(rsconnect.rscignore.legacy = FALSE)  # Default

# Temporary: Legacy behavior with deprecation warning
options(rsconnect.rscignore.legacy = TRUE)   # Deprecated

Test Coverage

  • Hierarchical inheritance behavior
  • Pattern precedence rules (child overrides parent)
  • Negation patterns across directories
  • Gitignore-style patterns (wildcards, directory patterns)
  • Error handling and graceful fallback
  • Legacy vs hierarchical behavioral differences
  • Edge cases and parameter validation

Files Modified

Core Implementation

  • R/bundleFiles.R - Main hierarchical logic and lifecycle management
  • man/*.Rd - Documentation for new exported functions

Test Suite

  • tests/testthat/test-hierarchical-rscignore.R - Comprehensive hierarchical behavior tests
  • tests/testthat/_snaps/bundleFiles.md - Updated snapshots for new behavior

Documentation

  • NEWS.md - Suggestion for breaking change announcement and migration guide

Implement full gitignore pattern syntax (wildcards, negation, relative paths,
double-asterisk) while maintaining backward compatibility and existing
hierarchical behavior. Includes comprehensive test suite.
Introduce breaking changes to .rscignore files, enabling hierarchical pattern matching similar to .gitignore. Patterns in parent directories now affect subdirectories, allowing for more intuitive file ignoring. Added migration guidance and temporary compatibility option for legacy behavior. Update documentation and tests to reflect these changes.
Move existing tests for .rscignore functionality, including gitignore-style patterns and hierarchical behavior, to a new dedicated test file.

Revert test-bundleFiles.R to original state.
R/bundleFiles.R Outdated
if (legacy_mode) {
# Issue deprecation warning for legacy behavior
lifecycle::deprecate_warn(
when = "1.6.0",
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unclear when this should be.

Comment on lines +326 to +329
"Directory-scoped .rscignore patterns are deprecated.",
"Hierarchical .gitignore-style behavior is now the default.",
"Update your .rscignore files to use '!' negation patterns if needed.",
"Set `options(rsconnect.rscignore.legacy = FALSE)` to remove this warning."
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy for feedback on what this message would say

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since people will only see this if they've explicitly opted into the legacy behavior, I think we don't need to tell them what option to use to silence it. Could we instead include a link to the documentation that describes the change and how to update your .rscignore files?

Update to be more specific about what patterns are and are not supported in the new implementation.

Refresh package documentation.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

happy to integrate this (include these) into the test-bundleFiles.R to be consistent with normal approach.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that would be great, thanks!

Copy link
Contributor

@karawoo karawoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @BrianLang, thank you so much for taking this on! This feels really on the right track and I appreciate the effort and the thorough test coverage.

I know this PR is a draft still, but I hope you don't mind me chiming in early with some comments. Feel free to disregard for now if you're still working on things, and you can tag me back in whenever you're ready for another look.

Comment on lines -130 to +143
ignoreFiles = TRUE
ignoreFiles = TRUE,
bundle_root = rootDir
) {
# generate a list of files at this level
if (is.null(contents)) {
contents <- list.files(dir, all.files = TRUE, no.. = TRUE)
}
if (ignoreFiles) {
contents <- ignoreBundleFiles(dir, contents)
contents <- ignoreBundleFiles(dir, contents, bundle_root)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the new bundle_root argument to this function or could we pass the existing rootDir argument to ignoreBundleFiles instead?

Comment on lines +316 to +321
# Handle NULL case
if (is.null(legacy_mode)) {
legacy_mode <- FALSE
}

if (legacy_mode) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Handle NULL case
if (is.null(legacy_mode)) {
legacy_mode <- FALSE
}
if (legacy_mode) {
if (isTRUE(legacy_mode)) {

this way would be just a little simpler

Comment on lines +326 to +329
"Directory-scoped .rscignore patterns are deprecated.",
"Hierarchical .gitignore-style behavior is now the default.",
"Update your .rscignore files to use '!' negation patterns if needed.",
"Set `options(rsconnect.rscignore.legacy = FALSE)` to remove this warning."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since people will only see this if they've explicitly opted into the legacy behavior, I think we don't need to tell them what option to use to silence it. Could we instead include a link to the documentation that describes the change and how to update your .rscignore files?

Comment on lines +338 to +340
if (is.null(bundle_root)) {
bundle_root <- dir # Fallback for backward compatibility
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason not to have the bundle_root argument default to dir instead of setting it here?

Comment on lines +342 to +358
tryCatch({
patterns <- collectHierarchicalPatterns(dir, bundle_root)
if (length(patterns) > 0) {
contents <- applyIgnorePatterns(contents, patterns, dir)
}

# Always exclude .rscignore files themselves
contents <- setdiff(contents, ".rscignore")
return(contents)

}, error = function(e) {
# Fallback to directory-scoped behavior on error
warning("Error in hierarchical pattern processing: ", e$message, call. = FALSE)
warning("Falling back to directory-scoped patterns", call. = FALSE)
return(applyDirectoryScopedPatterns(contents, dir))
})
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this tryCatch() has the potential to obfuscate the behavior a little too much. I think we should go ahead and fail fast if an error occurs, and the error message can point people to the rsconnect.rscignore.legacy option.

# Later patterns override earlier patterns
for (pattern in patterns) {
for (file in file_list) {
# Hierarchical pattern matching logic (formerly in matchPatternHierarchical)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Hierarchical pattern matching logic (formerly in matchPatternHierarchical)

seems like this references a previous implementation that's not relevant anymore

Comment on lines +669 to +689
if (pattern$relative && startsWith(pattern$raw, "/")) {
# Get the directory containing this pattern's .rscignore file
pattern_dir <- pattern$source_dir

# Get the directory containing the current file
if (grepl("/", file)) {
file_dir <- file.path(current_dir, dirname(file))
} else {
file_dir <- current_dir
}

# Normalize paths for comparison
pattern_dir <- normalizePath(pattern_dir, mustWork = FALSE)
file_dir <- normalizePath(file_dir, mustWork = FALSE)

# If directories match, compare just the filename
if (pattern_dir == file_dir) {
file_basename <- basename(file)
matches <- matchPattern(file_basename, pattern, current_dir)
}
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason not to make this logic part of matchPattern?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that would be great, thanks!

Comment on lines +175 to +178
expect_false("temp.txt" %in% result) # Root: ignored
expect_true("src/temp.txt" %in% result) # Child: un-ignored (overrides parent)
expect_false("src/utils/temp.txt" %in% result) # Grandchild: re-ignored (overrides parent negation)
expect_true("src/main.R" %in% result) # Unaffected
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome 🎉

Comment on lines +16 to +21
#' * You can exclude additional files by listing them in a `.rscignore`
#' file. This file follows .gitignore-style syntax with one pattern per line.
#' Patterns support wildcards (* and ?), directory patterns (dir/), and
#' negation (!pattern). Character classes ([abc], [0-9]) and brace expansion
#' ({a,b,c}, {1..3}) are not currently supported. Patterns in parent
#' directories affect subdirectories hierarchically.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#' * You can exclude additional files by listing them in a `.rscignore`
#' file. This file follows .gitignore-style syntax with one pattern per line.
#' Patterns support wildcards (* and ?), directory patterns (dir/), and
#' negation (!pattern). Character classes ([abc], [0-9]) and brace expansion
#' ({a,b,c}, {1..3}) are not currently supported. Patterns in parent
#' directories affect subdirectories hierarchically.
#' * You can exclude additional files by listing them in a `.rscignore`
#' file. This file follows .gitignore-style syntax with one pattern per line.
#' Patterns support wildcards (* and ?), directory patterns (dir/), and
#' negation (!pattern). Character classes (`[abc]`, `[0-9]`) and brace
#' expansion (`{a,b,c}`, `{1..3}`) are not currently supported. Patterns in
#' parent directories affect subdirectories hierarchically.

The brackets here were causing r cmd check complaints, which should be fixed by code formatting them.


❯ checking Rd cross-references ... WARNING
  Missing link(s) in Rd file 'listDeploymentFiles.Rd':
    ‘abc’ ‘0-9’
  
  See section 'Cross-references' in the 'Writing R Extensions' manual.
  
  Found the following Rd file(s) with Rd \link{} targets missing package
  anchors:
    listDeploymentFiles.Rd: abc, 0-9
  Please provide package anchors for all Rd \link{} targets not in the
  package itself and the base packages.

❯ checking Rd files ... NOTE
  checkRd: (-1) listDeploymentFiles.Rd:47: Lost braces; missing escapes or markup?
      47 | ({a,b,c}, {1..3}) are not currently supported. Patterns in parent
         |  ^
  checkRd: (-1) listDeploymentFiles.Rd:47: Lost braces; missing escapes or markup?
      47 | ({a,b,c}, {1..3}) are not currently supported. Patterns in parent
         |           ^

@BrianLang
Copy link
Author

Thanks so much for the thorough review, i appreciate the time you took, I will address them in the coming days.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants