Skip to content

fix: stop s3contents .s3keep 403s on read-only prefixes#148

Merged
Tianhao-Gu merged 1 commit intomainfrom
fix/s3keep-403-log-noise
Mar 27, 2026
Merged

fix: stop s3contents .s3keep 403s on read-only prefixes#148
Tianhao-Gu merged 1 commit intomainfrom
fix/s3keep-403-log-noise

Conversation

@Tianhao-Gu
Copy link
Copy Markdown
Contributor

@Tianhao-Gu Tianhao-Gu commented Mar 27, 2026

Summary

  • s3contents init() calls mkdir("") which PUTs .s3keep at each managed prefix root. For tenant/group prefixes where users have read-only access (e.g. tenant-sql-warehouse/nmdc/), this causes 403 errors that flood MinIO logs.
  • The previous patch wrapped mkdir in try/except — this caught the crash but the PUT still happened, generating 403 log entries on every S3ContentsManager init.
  • This fix replaces mkdir("") with pass entirely, since MMS already creates .s3keep markers during provisioning, and the genericmanager.py patch handles directories without .s3keep via DUMMY_CREATED_DATE fallback.

Context

"PUT /cdm-lake/tenant-sql-warehouse/nmdc/.s3keep HTTP/1.1" 403
"PUT /cdm-lake/tenant-sql-warehouse/kescience/.s3keep HTTP/1.1" 403
"PUT /cdm-lake/tenant-general-warehouse/kbase/.s3keep HTTP/1.1" 403

User agent aiobotocore/3.3.0 confirms these come from s3contents (Jupyter S3 file browser).

Test plan

  • Rebuild spark_notebook image with this patch
  • Open Jupyter file browser as a user with read-only group access
  • Verify no new .s3keep 403 entries appear in MinIO logs
  • Verify tenant directories still display correctly in the file browser

s3contents init() calls mkdir("") which PUTs a .s3keep file at the root
of each managed prefix. For tenant/group prefixes where users have
read-only access, this causes 403 errors that fill MinIO logs.

The previous patch wrapped mkdir in try/except, which caught the error
but still sent the PUT request (generating 403 log entries). This change
skips mkdir entirely since:
- MMS already creates .s3keep markers during user/group provisioning
- The genericmanager.py patch handles dirs without .s3keep via
  DUMMY_CREATED_DATE fallback
Copilot AI review requested due to automatic review settings March 27, 2026 17:06
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the s3contents patching script used in the container image build to prevent init-time .s3keep writes that generate repeated 403s for read-only tenant/group prefixes, reducing MinIO log noise while relying on existing provisioning and the genericmanager fallback behavior.

Changes:

  • Removes the self.mkdir("") call from s3contents s3_fs.py init by patching it to a no-op (pass) instead of wrapping in try/except.
  • Expands the rationale in patch_s3fs_init() docstring to explain why skipping .s3keep creation is safe and desirable.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@github-actions
Copy link
Copy Markdown

Coverage report

This PR does not seem to contain any modification to coverable code.

@Tianhao-Gu Tianhao-Gu merged commit 3e294ee into main Mar 27, 2026
10 checks passed
@Tianhao-Gu Tianhao-Gu deleted the fix/s3keep-403-log-noise branch March 27, 2026 17:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants