
Driver can display the secret content of node-expand-secret-name #1372

Open
mpatlasov opened this issue Sep 15, 2023 · 14 comments · May be fixed by #1462

@mpatlasov

What happened?

A user with permissions to create or modify a StorageClass can get any Secret printed in the driver's logs.

The problem comes from a klog call that is not sanitized:

	// Note that secrets are not included in any RPC message. In the past protosanitizer and other log
	// stripping was shown to cause a significant increase of CPU usage (see
	// https://github.com/kubernetes-sigs/gcp-compute-persistent-disk-csi-driver/issues/356#issuecomment-550529004).
	klog.V(4).Infof("%s called with request: %s", info.FullMethod, req)

The comment says that secrets are not included in any RPC message; however, there are many RPC messages that might include secrets, for example NodeExpandVolume, as explained here. If one adds:

parameters:
  csi.storage.k8s.io/node-expand-secret-name: test-secret

to the StorageClass, then after a PV resize one can see in the logs:

I0915 03:09:16.116829       1 utils.go:66] /csi.v1.Node/NodeExpandVolume called with request: volume_id:"projects/openshift-gce-devel/zones/us-central1-a/disks/pvc-11fd4016-7702-4c0c-9fae-71eddea3e4fa" volume_path:"/var/lib/kubelet/pods/2df06db8-b19a-4b7d-8aab-4d6eb65f0df0/volumes/kubernetes.io~csi/pvc-11fd4016-7702-4c0c-9fae-71eddea3e4fa/mount" capacity_range:<required_bytes:3221225472 > staging_target_path:"/var/lib/kubelet/plugins/kubernetes.io/csi/pd.csi.storage.gke.io/0b0a5be8b1e2d20b1e05a9f1f6d7b1ea36fcf6acfade1a9e448a088e20fe38bf/globalmount" volume_capability:<mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > > secrets:<key:"password" value:"t0p-Secret0" > secrets:<key:"username" value:"admin" > 

Note that there are 7 more RPC message types with secrets listed here.

The GCP CSI driver does not seem to use these features, so the threat may come only from a malicious user with elevated permissions who intentionally wants to expose secrets.

What did you expect to happen?

The log output should be sanitized, e.g.:

I0915 03:29:10.005165       1 utils.go:81] /csi.v1.Node/NodeExpandVolume called with request: {"capacity_range":{"required_bytes":4294967296},"secrets":"***stripped***","staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pd.csi.storage.gke.io/0b0a5be8b1e2d20b1e05a9f1f6d7b1ea36fcf6acfade1a9e448a088e20fe38bf/globalmount","volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}},"volume_id":"projects/openshift-gce-devel/zones/us-central1-a/disks/pvc-11fd4016-7702-4c0c-9fae-71eddea3e4fa","volume_path":"/var/lib/kubelet/pods/2df06db8-b19a-4b7d-8aab-4d6eb65f0df0/volumes/kubernetes.io~csi/pvc-11fd4016-7702-4c0c-9fae-71eddea3e4fa/mount"}
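
For reference, the "***stripped***" placeholder in the expected output above looks like what the protosanitizer helper from csi-lib-utils produces. Below is a minimal sketch of a sanitized logging interceptor built on it, assuming the driver is willing to take that dependency - not necessarily the final fix:

    package main

    import (
        "context"

        "github.com/kubernetes-csi/csi-lib-utils/protosanitizer"
        "google.golang.org/grpc"
        "k8s.io/klog/v2"
    )

    // logGRPC logs every unary call, but strips any field that the CSI proto
    // definitions mark as csi_secret before the request reaches the log.
    func logGRPC(ctx context.Context, req interface{}, info *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (interface{}, error) {
        // StripSecrets returns a fmt.Stringer; the (comparatively expensive)
        // string conversion only happens when the log line is actually
        // emitted, i.e. when verbosity is at least 4.
        klog.V(4).Infof("%s called with request: %s", info.FullMethod, protosanitizer.StripSecrets(req))
        return handler(ctx, req)
    }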

How can we reproduce it?

  1. Create a secret:
apiVersion: v1
kind: Secret
metadata:
  name: test-secret
  namespace: default
stringData:
  username: admin
  password: t0p-Secret
  2. Create a StorageClass:
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: test5
parameters:
  csi.storage.k8s.io/node-expand-secret-name: test-secret
  csi.storage.k8s.io/node-expand-secret-namespace: default
provisioner: pd.csi.storage.gke.io
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
  3. Create a PVC and a pod.

  4. Increase the CSI driver log verbosity to debug (V(4)).

  5. Resize the volume.

  6. Check the CSI driver logs:

I0915 03:09:16.116829       1 utils.go:66] /csi.v1.Node/NodeExpandVolume called with request: volume_id:"projects/openshift-gce-devel/zones/us-central1-a/disks/pvc-11fd4016-7702-4c0c-9fae-71eddea3e4fa" volume_path:"/var/lib/kubelet/pods/2df06db8-b19a-4b7d-8aab-4d6eb65f0df0/volumes/kubernetes.io~csi/pvc-11fd4016-7702-4c0c-9fae-71eddea3e4fa/mount" capacity_range:<required_bytes:3221225472 > staging_target_path:"/var/lib/kubelet/plugins/kubernetes.io/csi/pd.csi.storage.gke.io/0b0a5be8b1e2d20b1e05a9f1f6d7b1ea36fcf6acfade1a9e448a088e20fe38bf/globalmount" volume_capability:<mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > > secrets:<key:"password" value:"t0p-Secret0" > secrets:<key:"username" value:"admin" > 

Anything else we need to know?

n/a

/cc @msau42, @mattcary, @cjcullen, @tallclair, @jsafrane

@mpatlasov
Author

/assign

I have a patch to sanitize specific RPC request types, but before opening a PR I'd like to get feedback on the issue's severity. The severity probably depends on how likely it is for someone to have permissions high enough to create or modify a StorageClass, but not as high as a cluster admin, who can print secrets in a simpler way.

@jsafrane
Contributor

We discussed this at today's sig-storage triage meeting and agreed that it should be fixed by sanitizing requests, but only when the debug log level is enabled:

  if klog.V(4).Enabled() {
    klog.Infof("%s called with request: %s", info.FullMethod, sanitize(req))
  }

Then sanitization won't run in production or wherever performance is required.
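
For illustration, the sanitize helper in the snippet above could be a thin wrapper around protosanitizer.StripSecrets from csi-lib-utils (my assumption - the comment does not name a specific helper):

    import "github.com/kubernetes-csi/csi-lib-utils/protosanitizer"

    // sanitize is a hypothetical helper: StripSecrets replaces every field
    // marked csi_secret in the CSI proto definitions with "***stripped***".
    func sanitize(req interface{}) string {
        return protosanitizer.StripSecrets(req).String()
    }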

@msau42
Contributor

msau42 commented Sep 20, 2023

For production we actually use log level 4. The pdcsi driver does not actually require any secrets though? Maybe we need to look at ways to disallow/ignore secrets being passed in the first place.
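
As a sketch of what "disallow" could look like at the gRPC interceptor level, assuming the generated CSI Go bindings expose a GetSecrets() accessor on every request type that can carry secrets; whether to reject or silently drop them is exactly the open question here:

    import (
        "google.golang.org/grpc/codes"
        "google.golang.org/grpc/status"
    )

    // secretCarrier matches any generated CSI request type that has a Secrets
    // field (NodeExpandVolumeRequest, ControllerExpandVolumeRequest, ...).
    type secretCarrier interface {
        GetSecrets() map[string]string
    }

    // rejectSecrets is a hypothetical guard for a driver that never uses
    // secrets: fail the call early instead of logging or silently ignoring them.
    func rejectSecrets(req interface{}) error {
        if r, ok := req.(secretCarrier); ok && len(r.GetSecrets()) > 0 {
            return status.Error(codes.InvalidArgument, "this driver does not accept secrets")
        }
        return nil
    }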

@jsafrane
Contributor

Yes, the CSI driver does not need any secrets; it just logs them and then ignores them. Currently it's not possible to tell the CSI sidecars / kubelet which secrets the driver actually uses, and I am not sure that's the right direction - it could be error-prone when adding a new method / secret.

@mattcary
Contributor

If there were a CSIDriver field that said whether the driver uses secrets at all, then there could be, e.g., an admission webhook that only allows known drivers that expose secrets to be installed (e.g., this driver doesn't use secrets at all).

I understand there'd be backwards compatibility issues with this, but it's an idea.

mpatlasov added a commit to mpatlasov/gcp-pd-csi-driver that referenced this issue Oct 20, 2023
Malicious user can put a secret in request as explained here: kubernetes-sigs#1372.
mpatlasov added a commit to mpatlasov/gcp-pd-csi-driver that referenced this issue Oct 30, 2023
Malicious user can put a secret in request as explained here: kubernetes-sigs#1372.
mpatlasov added a commit to mpatlasov/gcp-pd-csi-driver that referenced this issue Nov 3, 2023
Malicious user can put a secret in request as explained here: kubernetes-sigs#1372.
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 28, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 27, 2024
@msau42
Contributor

msau42 commented Mar 9, 2024

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Mar 9, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 7, 2024
@mpatlasov
Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 7, 2024
mpatlasov added a commit to mpatlasov/gcp-pd-csi-driver that referenced this issue Aug 27, 2024
Malicious user can put a secret in request as explained here: kubernetes-sigs#1372.
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 5, 2024
@mpatlasov
Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 6, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 5, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 4, 2025