S3 FUSE home directories and API-based credential fetching#138

Open
Tianhao-Gu wants to merge 6 commits into main from feature/s3_mount

Conversation

@Tianhao-Gu
Contributor

Summary

  • Mount user home directories via s3fs-fuse (cdm-lake/users-general-warehouse/{user}/workspace/home/{user}/) in entrypoint.sh
  • Fetch MinIO credentials from governance API (GET /credentials) in entrypoint instead of reading from local file cache — needed because the FUSE mount requires credentials before the home directory exists
  • Skip recursive chown on FUSE-mounted home directory (s3fs handles ownership via uid/gid mount options)
  • Update refresh.py to use _get_credentials_cache_path() helper instead of inline Path.home() / CREDENTIALS_CACHE_FILE
  • Remove unused credential cache tests (cache functions still exist but are no longer the primary credential path)
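Taken together, the startup flow in these bullets could be sketched roughly as below. This is an illustrative sketch, not the PR's exact code: the helper names (`parse_cred`, `fetch_credentials`, `mount_home`) are hypothetical, and the credentials payload is assumed to be JSON with `access_key`/`secret_key` fields as described above.

```shell
# Illustrative sketch of the entrypoint flow (hypothetical helper names).
# Assumes GOVERNANCE_API_URL, KBASE_AUTH_TOKEN, NB_USER, MINIO_URL are set.

# Pull one field out of the JSON credentials payload.
parse_cred() {
  python3 -c "import sys, json; print(json.load(sys.stdin).get('$1', ''))"
}

fetch_credentials() {
  # Retry a few times; the governance service may still be starting.
  local body attempt
  for attempt in 1 2 3 4 5; do
    if body=$(curl -fsSL -H "Authorization: Bearer ${KBASE_AUTH_TOKEN}" \
        "${GOVERNANCE_API_URL}/credentials"); then
      export AWS_ACCESS_KEY_ID=$(printf '%s' "$body" | parse_cred access_key)
      export AWS_SECRET_ACCESS_KEY=$(printf '%s' "$body" | parse_cred secret_key)
      return 0
    fi
    echo "Attempt ${attempt} failed; retrying in 3s..." >&2
    sleep 3
  done
  return 1
}

mount_home() {
  # s3fs maps ownership via uid/gid mount options, so no recursive chown.
  printf '%s:%s\n' "$AWS_ACCESS_KEY_ID" "$AWS_SECRET_ACCESS_KEY" > /tmp/.s3fs-passwd
  chmod 600 /tmp/.s3fs-passwd
  s3fs "cdm-lake:/users-general-warehouse/${NB_USER}/workspace/home/${NB_USER}" \
    "/home/${NB_USER}" \
    -o passwd_file=/tmp/.s3fs-passwd,url="${MINIO_URL}",use_path_request_style \
    -o uid="$(id -u "$NB_USER")",gid="$(id -g "$NB_USER")"
}
```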

Test plan

  • Verify entrypoint.sh fetches credentials from governance API with retry logic
  • Verify s3fs mount succeeds with correct permissions (uid/gid)
  • Verify notebook startup works with FUSE-mounted home directory
  • Verify refresh_spark_environment() clears credential cache correctly
  • Run full test suite: cd notebook_utils && uv run pytest (839 passed)

Takes main's credential caching (file-lock based), relative imports, error handling,
deprecation warnings, list_user_names, and regenerate_policies. Removes Polaris-specific
code (polaris credential cache, get_polaris_credentials, httpx-based rotate) that is
not in main's governance client v0.0.14.
Copilot AI review requested due to automatic review settings March 16, 2026 02:43
@github-actions

github-actions bot commented Mar 16, 2026

Coverage report

This PR does not seem to contain any modification to coverable code.


Copilot AI left a comment


Pull request overview

This PR updates notebook startup to support S3 FUSE–backed home directories by fetching MinIO credentials from the governance API early in container startup, and adjusts related utilities/tests to align with the new startup flow.

Changes:

  • Update scripts/entrypoint.sh to fetch credentials from the governance API with retries and mount /home/$NB_USER (and a shared subdir) via s3fs.
  • Remove recursive chown of the home directory to avoid expensive operations on FUSE mounts.
  • Update refresh/test code to use the credential cache path helper and simplify credential tests.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
scripts/entrypoint.sh Fetch credentials from governance API, set AWS endpoint env, and mount S3-backed home/share via s3fs.
configs/jupyter_docker_stacks_hooks/11-setup_env.sh Remove recursive chown -R of the home directory to avoid slow FUSE traversal.
notebook_utils/berdl_notebook_utils/minio_governance/operations.py Update get_minio_credentials() docstring to reflect API-based credential sourcing.
notebook_utils/berdl_notebook_utils/refresh.py Use _get_credentials_cache_path() helper when clearing credential cache.
notebook_utils/tests/minio_governance/test_operations.py Remove credential cache helper tests and adjust get_minio_credentials() tests toward API behavior.


@codecov

codecov bot commented Mar 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


Resolve conflicts in test_operations.py:
- Keep main's comprehensive test classes for coverage
- Fix list_user_names tests to patch correct module-level function
- Add missing Path and _write_credentials_cache imports

Resolve conflicts in operations.py, test_operations.py, and
entrypoint.sh by keeping the feature/s3_mount versions (retry logic
for credential fetch, s3fs FUSE mounts).
Copilot AI review requested due to automatic review settings March 23, 2026 18:36

Copilot AI left a comment


Pull request overview

This PR updates the notebook container startup flow to (1) fetch MinIO credentials early via the governance API and (2) mount S3-backed “home” directories via s3fs-fuse, avoiding slow recursive ownership fixes on FUSE mounts.

Changes:

  • Add governance API credential fetch + retry logic to entrypoint.sh and use it to drive s3fs mounts for /home/$NB_USER and a global shared directory.
  • Remove recursive chown of the home directory in the docker-stacks hook to avoid expensive FUSE/S3 object traversal.
  • Introduce/adjust local credential cache helpers + locking in minio_governance/operations.py, and update/add tests around cache/lock behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 14 comments.

File Description
scripts/entrypoint.sh Fetches credentials from governance API with retries and mounts S3 paths into home via s3fs.
configs/jupyter_docker_stacks_hooks/11-setup_env.sh Skips recursive chown to avoid FUSE-triggered S3 HEAD storms.
notebook_utils/berdl_notebook_utils/minio_governance/operations.py Adds credential cache path/helpers and file-locking around cache usage.
notebook_utils/tests/minio_governance/test_operations.py Updates credential tests and adds coverage for cache/lock error handling.


Comment on lines +84 to +86
local s3fs_opts="passwd_file=/tmp/.s3fs-passwd"
s3fs_opts="${s3fs_opts},url=${MINIO_URL}"
s3fs_opts="${s3fs_opts},use_path_request_style"

Copilot AI Mar 23, 2026


s3fs is configured with url=${MINIO_URL}, but MINIO_URL is constructed earlier in the script by defaulting to http:// when no scheme is provided. Since that scheme selection currently ignores MINIO_SECURE, secure deployments that set MINIO_SECURE=true but provide a bare host/port will still use http here. Consider deriving MINIO_URL’s scheme from MINIO_SECURE (as done in configs/jupyter_server_config.py) so s3fs/boto3 consistently use https when configured.
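One way to derive the scheme as the comment suggests, sketched as a standalone helper (the function name `minio_url` and the truthy values accepted for MINIO_SECURE are assumptions, not the repo's actual code):

```shell
# Hypothetical helper: build MINIO_URL from an endpoint that may lack a
# scheme, using MINIO_SECURE to decide between https and http.
minio_url() {
  local endpoint="$1" secure="${2:-true}"
  case "$endpoint" in
    http://*|https://*) printf '%s\n' "$endpoint"; return ;;  # scheme given
  esac
  case "$(printf '%s' "$secure" | tr '[:upper:]' '[:lower:]')" in
    true|1|yes) printf 'https://%s\n' "$endpoint" ;;
    *)          printf 'http://%s\n' "$endpoint" ;;
  esac
}
```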

Comment on lines +109 to +112
# Create s3fs password file
echo "${AWS_ACCESS_KEY_ID}:${AWS_SECRET_ACCESS_KEY}" > /tmp/.s3fs-passwd
chmod 600 /tmp/.s3fs-passwd


Copilot AI Mar 23, 2026


/tmp/.s3fs-passwd contains long-lived access credentials; it’s created with 0600 but never removed. If s3fs doesn’t need it after the mount is established, remove it (or install an EXIT trap to clean it up) to reduce the window for credential exposure in the container filesystem.
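A minimal cleanup sketch along those lines, assuming s3fs no longer needs the file once the mount is established (true for a plain passwd_file mount, but worth verifying against the s3fs version in use). The dummy credential defaults exist only so the sketch is self-contained:

```shell
# Write the s3fs password file, guard it with an EXIT trap, and remove it
# eagerly once it has served its purpose.
PASSWD_FILE="/tmp/.s3fs-passwd.$$"
trap 'rm -f "$PASSWD_FILE"' EXIT
printf '%s:%s\n' "${AWS_ACCESS_KEY_ID:-AK}" "${AWS_SECRET_ACCESS_KEY:-SK}" > "$PASSWD_FILE"
chmod 600 "$PASSWD_FILE"
# ... s3fs "$bucket" "$mountpoint" -o passwd_file="$PASSWD_FILE",... would run here ...
rm -f "$PASSWD_FILE"   # s3fs has read it by now; the trap is a safety net
```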

Comment on lines +62 to 63
# requests for every object in S3.


Copilot AI Mar 23, 2026


The finalization step now skips the recursive chown unconditionally. If the S3 FUSE mount is not active (e.g., missing creds, mount failure, or FUSE disabled), files/directories created in /home/$NB_USER by this hook may remain root-owned and break runtime writes. Consider conditionally chown only when $HOME_DIRECTORY is not a FUSE mount (e.g., via findmnt/stat -f -c %T) or only chown the specific local paths created here when not mounted.

Suggested change
# requests for every object in S3.
# requests for every object in S3.
#
# However, if $HOME_DIRECTORY is *not* a FUSE mount (for example, when the S3
# mount is disabled or failed), files and directories created by this hook may
# be owned by root and thus unwritable by $NB_USER. In that case, fix ownership
# only for the specific paths created here, not the entire home directory.
# Detect filesystem type of the home directory. On FUSE-based mounts, %T
# typically reports a value starting with "fuse" (e.g., "fuseblk", "fuse").
FS_TYPE="$(stat -f -c %T "$HOME_DIRECTORY" 2>/dev/null || echo unknown)"
# If the home directory is not on a FUSE filesystem, correct ownership of the
# specific paths created/managed by this hook.
case "$FS_TYPE" in
fuse* )
# On FUSE (e.g., s3fs), rely on mount-time uid/gid options; do not chown.
;;
* )
NB_UID="$(id -u "$NB_USER")"
NB_GID="$(id -g "$NB_USER")"
# Only chown paths this script created or modified, to avoid expensive
# operations on large home directories.
chown "$NB_UID:$NB_GID" \
"$HOME_DIRECTORY/.bash_profile" \
"$CUSTOM_PROFILE"
chown -R "$NB_UID:$NB_GID" \
"$JUPYTER_DIR" \
"$IPYTHON_STARTUP_DIR"
;;
esac

Comment on lines +180 to +200
cache_path = _get_credentials_cache_path()
lock_path = cache_path.with_suffix(".lock")

# Use file locking to prevent concurrent access
with open(lock_path, "w") as lock_file:
try:
# Acquire exclusive lock (blocks until available)
fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX)

# Try to load from cache first (double-check after acquiring lock)
cached_credentials = _read_cached_credentials(cache_path)
if cached_credentials:
credentials = cached_credentials
else:
# No cache or cache corrupted, fetch fresh credentials
client = get_governance_client()
api_response = get_credentials_credentials_get.sync(client=client)
if isinstance(api_response, CredentialsResponse):
credentials = api_response
_write_credentials_cache(cache_path, credentials)
else:

Copilot AI Mar 23, 2026


PR description says credentials are fetched from the governance API "instead of reading from local file cache" and that the API is the cache. However get_minio_credentials() now reads/writes a local credentials cache file and only calls the API on a cache miss. Either update the PR description to reflect that a local cache is still used, or remove the local cache path from this function to match the stated design.

Comment on lines +245 to +256
with open(lock_path, "w") as lock_file:
try:
fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX)
_write_credentials_cache(cache_path, api_response)
finally:
pass

try:
lock_path.unlink(missing_ok=True)
except OSError:
pass


Copilot AI Mar 23, 2026


rotate_minio_credentials() also unconditionally opens the lock file for writing. If the cache directory/home is not writable, rotation will raise after the API call succeeds, preventing env vars from being updated. Similar to get_minio_credentials(), consider handling OSError around lock/cache operations and still updating env vars even when caching/locking fails.

Suggested change
with open(lock_path, "w") as lock_file:
try:
fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX)
_write_credentials_cache(cache_path, api_response)
finally:
pass
try:
lock_path.unlink(missing_ok=True)
except OSError:
pass
try:
with open(lock_path, "w") as lock_file:
fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX)
_write_credentials_cache(cache_path, api_response)
try:
lock_path.unlink(missing_ok=True)
except OSError:
# Failure to remove the lock file is non-fatal
pass
except OSError as exc:
# If we cannot write to the cache/lock, log and continue so env vars are still updated
logging.warning(
"Failed to update MinIO credentials cache at %s: %s",
cache_path,
exc,
)

Comment on lines +170 to +171
Calls GET /credentials on the governance API, which returns DB-cached credentials
without rotating. No local file cache needed — the API is the cache.

Copilot AI Mar 23, 2026


The get_minio_credentials docstring says "No local file cache needed — the API is the cache", but the implementation now explicitly reads/writes a local credentials cache file before calling the API. Please update the docstring (and/or the implementation) so the documented behavior matches the actual caching logic.

Suggested change
Calls GET /credentials on the governance API, which returns DB-cached credentials
without rotating. No local file cache needed — the API is the cache.
Credentials are read from a local cache file when available; otherwise, the
function calls GET /credentials on the governance API (which returns
DB-cached credentials without rotating) and updates the local cache.

Comment on lines +122 to +124
with open(cache_path, "w") as f:
json.dump(credentials.to_dict(), f)
except (OSError, TypeError):

Copilot AI Mar 23, 2026


The credentials cache file stores the MinIO secret key in plaintext, but _write_credentials_cache doesn’t set restrictive permissions (e.g., 0600). Depending on the container/user umask this can create group/world-readable credentials. Consider writing with explicit mode 0o600 (or chmod after write) and/or placing it under a private directory like ~/.cache with controlled permissions.

Suggested change
with open(cache_path, "w") as f:
json.dump(credentials.to_dict(), f)
except (OSError, TypeError):
# Use os.open with explicit 0o600 mode to avoid relying on the process umask
fd = os.open(str(cache_path), os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
with os.fdopen(fd, "w") as f:
json.dump(credentials.to_dict(), f)
except (OSError, TypeError):
# On any error, silently skip caching to preserve existing behavior

Comment on lines +183 to +204
# Use file locking to prevent concurrent access
with open(lock_path, "w") as lock_file:
try:
# Acquire exclusive lock (blocks until available)
fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX)

# Try to load from cache first (double-check after acquiring lock)
cached_credentials = _read_cached_credentials(cache_path)
if cached_credentials:
credentials = cached_credentials
else:
# No cache or cache corrupted, fetch fresh credentials
client = get_governance_client()
api_response = get_credentials_credentials_get.sync(client=client)
if isinstance(api_response, CredentialsResponse):
credentials = api_response
_write_credentials_cache(cache_path, credentials)
else:
raise RuntimeError("Failed to fetch credentials from API")
finally:
# Lock is automatically released when file is closed
pass

Copilot AI Mar 23, 2026


get_minio_credentials() always opens the lock file with open(lock_path, "w") before any API call. If the home directory is not writable (read-only, permission issues, etc.), this will raise and prevent credential fetching entirely, even though the API call could still succeed. Consider catching OSError around lock/cache setup and falling back to a no-cache/no-lock path (fetch from API directly) when locking isn’t possible.

Suggested change
# Use file locking to prevent concurrent access
with open(lock_path, "w") as lock_file:
try:
# Acquire exclusive lock (blocks until available)
fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX)
# Try to load from cache first (double-check after acquiring lock)
cached_credentials = _read_cached_credentials(cache_path)
if cached_credentials:
credentials = cached_credentials
else:
# No cache or cache corrupted, fetch fresh credentials
client = get_governance_client()
api_response = get_credentials_credentials_get.sync(client=client)
if isinstance(api_response, CredentialsResponse):
credentials = api_response
_write_credentials_cache(cache_path, credentials)
else:
raise RuntimeError("Failed to fetch credentials from API")
finally:
# Lock is automatically released when file is closed
pass
credentials: CredentialsResponse
try:
# Use file locking to prevent concurrent access
with open(lock_path, "w") as lock_file:
try:
# Acquire exclusive lock (blocks until available)
fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX)
# Try to load from cache first (double-check after acquiring lock)
cached_credentials = _read_cached_credentials(cache_path)
if cached_credentials:
credentials = cached_credentials
else:
# No cache or cache corrupted, fetch fresh credentials
client = get_governance_client()
api_response = get_credentials_credentials_get.sync(client=client)
if isinstance(api_response, CredentialsResponse):
credentials = api_response
_write_credentials_cache(cache_path, credentials)
else:
raise RuntimeError("Failed to fetch credentials from API")
finally:
# Lock is automatically released when file is closed
pass
except OSError:
# If we cannot create or lock the cache file (e.g. read-only home directory),
# fall back to fetching credentials directly from the API without using
# any local cache or file locking.
client = get_governance_client()
api_response = get_credentials_credentials_get.sync(client=client)
if isinstance(api_response, CredentialsResponse):
credentials = api_response
else:
raise RuntimeError("Failed to fetch credentials from API")

Comment on lines 106 to 127
  class TestGetMinioCredentials:
      """Tests for get_minio_credentials function."""

-     @patch("berdl_notebook_utils.minio_governance.operations.get_settings")
      @patch("berdl_notebook_utils.minio_governance.operations.os")
      @patch("berdl_notebook_utils.minio_governance.operations.get_credentials_credentials_get")
      @patch("berdl_notebook_utils.minio_governance.operations.get_governance_client")
-     def test_fetches_credentials_from_api(self, mock_get_client, mock_get_creds, mock_os, mock_get_settings):
+     def test_fetches_credentials_from_api(self, mock_get_client, mock_get_creds, mock_os):
          """Test fetches credentials from API and sets env vars."""
          mock_client = Mock()
          mock_get_client.return_value = mock_client

          mock_creds = Mock(spec=CredentialsResponse)
-         mock_creds.access_key = "new_key"
-         mock_creds.secret_key = "new_secret"
+         mock_creds.access_key = "api_key"
+         mock_creds.secret_key = "api_secret"
          mock_get_creds.sync.return_value = mock_creds

          result = get_minio_credentials()

          assert result == mock_creds
          mock_get_creds.sync.assert_called_once_with(client=mock_client)
-         mock_os.environ.__setitem__.assert_any_call("MINIO_ACCESS_KEY", "new_key")
-         mock_os.environ.__setitem__.assert_any_call("MINIO_SECRET_KEY", "new_secret")
-         mock_get_settings.cache_clear.assert_called_once()
+         mock_os.environ.__setitem__.assert_any_call("MINIO_ACCESS_KEY", "api_key")
+         mock_os.environ.__setitem__.assert_any_call("MINIO_SECRET_KEY", "api_secret")


Copilot AI Mar 23, 2026


TestGetMinioCredentials.test_fetches_credentials_from_api doesn’t isolate the on-disk cache path (Path.home() / .berdl_minio_credentials). Since get_minio_credentials() now reads the cache before calling the API, the test can become order/environment dependent if that file exists on the runner. Patch _get_credentials_cache_path to a tmp_path and _read_cached_credentials to None (or ensure the cache file is removed) so this test deterministically exercises the API-fetch branch.

Comment on lines +12 to +26
if [ -n "${GOVERNANCE_API_URL:-}" ] && [ -n "${KBASE_AUTH_TOKEN:-}" ]; then
# Fetch per-user credentials from governance API (idempotent — returns cached).
# Retry up to 5 times — the service may still be starting.
echo "Fetching MinIO credentials from governance API..."
CRED_FETCHED=false
for attempt in 1 2 3 4 5; do
CRED_RESPONSE=$(curl -sL -w "\n%{http_code}" \
-H "Authorization: Bearer ${KBASE_AUTH_TOKEN}" \
"${GOVERNANCE_API_URL}/credentials" 2>/dev/null)
HTTP_CODE=$(echo "$CRED_RESPONSE" | tail -1)
BODY=$(echo "$CRED_RESPONSE" | sed '$d')
if [ "$HTTP_CODE" = "200" ] && [ -n "$BODY" ]; then
export AWS_ACCESS_KEY_ID=$(echo "$BODY" | python3 -c "import sys,json; print(json.load(sys.stdin).get('access_key',''))")
export AWS_SECRET_ACCESS_KEY=$(echo "$BODY" | python3 -c "import sys,json; print(json.load(sys.stdin).get('secret_key',''))")
echo "Loaded credentials from governance API"

Copilot AI Mar 23, 2026


set -x combined with the credential fetch/export will log the auth token and the AWS secret/access keys (both the curl command and the export ... expansions). This is a credential leak in container logs; disable xtrace around the credential fetch/parsing (or run those commands in a subshell with set +x), and avoid placing secrets on the command line where possible.
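One way to scope that, sketched with a hypothetical wrapper that disables xtrace only around the sensitive command and restores the previous state afterwards:

```shell
# Run a command with xtrace disabled, then restore the prior xtrace state,
# so secrets expanded inside the command never reach the trace log.
with_xtrace_off() {
  local had_x=0 rc=0
  case "$-" in *x*) had_x=1 ;; esac
  set +x
  "$@" || rc=$?
  if [ "$had_x" -eq 1 ]; then set -x; fi
  return "$rc"
}
# In the entrypoint one might wrap the whole credential fetch:
#   with_xtrace_off fetch_credentials
```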

Tianhao-Gu and others added 2 commits March 23, 2026 13:44
Align with main branch (PR #142): remove local .berdl_minio_credentials
file cache and file locking from get_minio_credentials() and
rotate_minio_credentials(). The governance API already caches
credentials server-side in PostgreSQL.
Copilot AI review requested due to automatic review settings March 26, 2026 21:36

Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.



Comment on lines +24 to +30
export AWS_ACCESS_KEY_ID=$(echo "$BODY" | python3 -c "import sys,json; print(json.load(sys.stdin).get('access_key',''))")
export AWS_SECRET_ACCESS_KEY=$(echo "$BODY" | python3 -c "import sys,json; print(json.load(sys.stdin).get('secret_key',''))")
echo "Loaded credentials from governance API"
CRED_FETCHED=true
break
fi
echo " Attempt $attempt: HTTP $HTTP_CODE — retrying in 3s..." >&2

Copilot AI Mar 26, 2026


Credential parsing marks CRED_FETCHED=true solely based on HTTP 200 and non-empty body. If the body is invalid JSON or missing keys, the python json.load calls will fail (or return empty strings) but the script will still treat the fetch as successful and proceed with empty AWS_* values. Please validate JSON parsing succeeded and that both access_key/secret_key are non-empty before setting CRED_FETCHED=true / breaking the loop.

Suggested change
export AWS_ACCESS_KEY_ID=$(echo "$BODY" | python3 -c "import sys,json; print(json.load(sys.stdin).get('access_key',''))")
export AWS_SECRET_ACCESS_KEY=$(echo "$BODY" | python3 -c "import sys,json; print(json.load(sys.stdin).get('secret_key',''))")
echo "Loaded credentials from governance API"
CRED_FETCHED=true
break
fi
echo " Attempt $attempt: HTTP $HTTP_CODE — retrying in 3s..." >&2
ACCESS_KEY=$(echo "$BODY" | python3 -c "import sys, json; data = json.load(sys.stdin); print(data.get('access_key', '')) if isinstance(data, dict) else None" 2>/dev/null || echo "")
SECRET_KEY=$(echo "$BODY" | python3 -c "import sys, json; data = json.load(sys.stdin); print(data.get('secret_key', '')) if isinstance(data, dict) else None" 2>/dev/null || echo "")
if [ -n "$ACCESS_KEY" ] && [ -n "$SECRET_KEY" ]; then
export AWS_ACCESS_KEY_ID="$ACCESS_KEY"
export AWS_SECRET_ACCESS_KEY="$SECRET_KEY"
echo "Loaded credentials from governance API"
CRED_FETCHED=true
break
else
echo " Attempt $attempt: HTTP $HTTP_CODE — received invalid or incomplete credentials payload — retrying in 3s..." >&2
fi
else
echo " Attempt $attempt: HTTP $HTTP_CODE — retrying in 3s..." >&2
fi

Comment on lines +100 to +105
echo "Mounting s3fs: ${bucket_prefix} -> ${mountpoint}"
if s3fs "${bucket_prefix}" "${mountpoint}" -o "${s3fs_opts}" ${S3FS_SSL_OPTS} ${extra_opts}; then
echo "Successfully mounted ${mountpoint}"
else
echo "ERROR: Failed to mount ${mountpoint} — falling back to empty directory" >&2
fi

Copilot AI Mar 26, 2026


If s3fs fails to mount, the script only logs an error and continues, leaving the mountpoint as a root-owned local directory. Subsequent startup steps (e.g., setup_env hook) will create files under /home/$NB_USER that may then be unwritable by the notebook user. Please make mount_s3fs return non-zero on failure and either (a) chown the mountpoint to NB_USER on failure, or (b) abort startup / switch to a clear local-home fallback path.
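A sketch of that shape, with the mount command passed in as a parameter so the failure path is visible (the helper name `mount_or_fallback` is hypothetical; in the real entrypoint the first argument would simply be `s3fs`):

```shell
# Try to mount; on failure, report it, optionally repair ownership of the
# local fallback directory, and propagate a non-zero status to the caller.
mount_or_fallback() {
  local mnt_cmd="$1" bucket="$2" mountpoint="$3"; shift 3
  if "$mnt_cmd" "$bucket" "$mountpoint" "$@" 2>/dev/null; then
    echo "Successfully mounted ${mountpoint}"
    return 0
  fi
  echo "ERROR: failed to mount ${mountpoint}; leaving local directory" >&2
  # In the real entrypoint one might also fix ownership here, e.g.:
  #   chown "$(id -u "$NB_USER"):$(id -g "$NB_USER")" "$mountpoint"
  return 1
}
```

Because the helper returns non-zero, the caller can decide whether to abort startup or continue with a clearly local home.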

Comment on lines +18 to +20
CRED_RESPONSE=$(curl -sL -w "\n%{http_code}" \
-H "Authorization: Bearer ${KBASE_AUTH_TOKEN}" \
"${GOVERNANCE_API_URL}/credentials" 2>/dev/null)

Copilot AI Mar 26, 2026


With set -x enabled, this curl invocation will log the expanded Authorization: Bearer ${KBASE_AUTH_TOKEN} header to stdout/stderr, potentially exposing the user token in container logs. Please disable xtrace (e.g., set +x) around the credential fetch (and re-enable after) or remove set -x entirely.

Comment on lines +85 to +86
s3fs_opts="${s3fs_opts},url=${MINIO_URL}"
s3fs_opts="${s3fs_opts},use_path_request_style"

Copilot AI Mar 26, 2026


s3fs_opts uses url=${MINIO_URL}, but MINIO_URL is currently built with an unconditional http:// prefix when MINIO_ENDPOINT_URL lacks a scheme. That ignores MINIO_SECURE and can cause HTTPS MinIO endpoints to be contacted over HTTP. Please derive the scheme from MINIO_SECURE when constructing MINIO_URL so s3fs/boto3 use the correct protocol.

Comment on lines +69 to +70
if [[ "${MINIO_SECURE:-True}" != "True" ]]; then
S3FS_SSL_OPTS="-o no_check_certificate"

Copilot AI Mar 26, 2026


MINIO_SECURE handling for s3fs is inconsistent and likely inverted: it defaults to "True", compares case-sensitively to "True", and when it’s not exactly "True" it sets no_check_certificate (which disables TLS cert validation). This can unexpectedly disable certificate validation for values like "true"/"1"/"yes", and it doesn’t clearly separate "use https" vs "skip cert verify". Please parse MINIO_SECURE case-insensitively (as done elsewhere in the repo) and only apply no_check_certificate when using HTTPS and you explicitly want to skip verification.

Suggested change
if [[ "${MINIO_SECURE:-True}" != "True" ]]; then
S3FS_SSL_OPTS="-o no_check_certificate"
# Normalize MINIO_SECURE (case-insensitive); default to secure/verified HTTPS.
MINIO_SECURE_NORMALIZED="${MINIO_SECURE:-true}"
MINIO_SECURE_NORMALIZED="${MINIO_SECURE_NORMALIZED,,}"
# Only disable certificate verification when:
# 1) The endpoint URL is HTTPS, and
# 2) MINIO_SECURE explicitly requests insecure/skip-verify behavior.
if [[ "${MINIO_URL}" =~ ^https:// ]]; then
case "${MINIO_SECURE_NORMALIZED}" in
skip-verify|skip_verify|insecure|allow-insecure|allow_insecure)
S3FS_SSL_OPTS="-o no_check_certificate"
;;
*)
S3FS_SSL_OPTS=""
;;
esac

Comment on lines +116 to +118
s3 = boto3.client('s3', endpoint_url='${MINIO_URL}',
aws_access_key_id='${AWS_ACCESS_KEY_ID}', aws_secret_access_key='${AWS_SECRET_ACCESS_KEY}',
region_name='us-east-1')

Copilot AI Mar 26, 2026


The inline python3 -c embeds AWS access/secret keys directly in the command string (aws_access_key_id=..., aws_secret_access_key=...). That can leak credentials via process listings and shell debug output. Please rely on the exported AWS_* env vars (boto3 picks them up automatically) or pass credentials via a file/FD with restricted permissions rather than putting secrets in the command line.

Suggested change
s3 = boto3.client('s3', endpoint_url='${MINIO_URL}',
aws_access_key_id='${AWS_ACCESS_KEY_ID}', aws_secret_access_key='${AWS_SECRET_ACCESS_KEY}',
region_name='us-east-1')
s3 = boto3.client('s3', endpoint_url='${MINIO_URL}', region_name='us-east-1')
