What happened?
I have provisioned a Kubernetes cluster on AWS EC2 instances using Kubespray. After the cluster was successfully provisioned and all nodes were healthy and running, I installed the EBS CSI driver by following the recommended steps and then running the cluster.yml Ansible playbook.
Initially, the ebs-csi-controller pod was in a CrashLoopBackOff state, with the ebs-plugin container inside the pod failing with the error 'CSI node name not set'. I was able to fix this by editing the deployment and adding an environment variable to the ebs-csi-controller. The storage class was created as expected.
When running the sample PVC and pod from the official Kubespray GitHub repo, the PVC stayed in a Pending state.
https://github.com/kubernetes-sigs/kubespray/blob/master/docs/CSI/aws-ebs-csi.md
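For reference, a minimal sketch of the objects implied by the events below; the names ebs-sc-new and ebs-pvc are taken from the log, while the remaining fields (binding mode, requested size) are assumptions:

```yaml
# StorageClass backed by the EBS CSI driver (the provisioner name is fixed by the driver)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-sc-new
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
---
# PVC that should trigger dynamic provisioning of an EBS volume
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ebs-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ebs-sc-new
  resources:
    requests:
      storage: 4Gi
```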
Error log of ebs-csi-controller pod:
Warning ProvisioningFailed 15m ebs.csi.aws.com_ebs-csi-controller-75d79769b8-bbftz_1cfa04d6-8ed3-42f2-9834-1dfaa7687054 failed to provision volume with StorageClass "ebs-sc-new": rpc error: code = Internal desc = AuthFailure: AWS was not able to validate the provided access credentials
status code: 401, request id: 16eac760-e2e5-4182-a2c1-89cff669f3bd
Warning ProvisioningFailed 14m ebs.csi.aws.com_ebs-csi-controller-75d79769b8-bbftz_1cfa04d6-8ed3-42f2-9834-1dfaa7687054 failed to provision volume with StorageClass "ebs-sc-new": rpc error: code = Internal desc = RequestCanceled: request context canceled
caused by: context deadline exceeded
Normal Provisioning 95s (x12 over 16m) ebs.csi.aws.com_ebs-csi-controller-75d79769b8-bbftz_1cfa04d6-8ed3-42f2-9834-1dfaa7687054 External provisioner is provisioning volume for claim "default/ebs-pvc"
Warning ProvisioningFailed 85s (x7 over 15m) ebs.csi.aws.com_ebs-csi-controller-75d79769b8-bbftz_1cfa04d6-8ed3-42f2-9834-1dfaa7687054 failed to provision volume with StorageClass "ebs-sc-new": rpc error: code = DeadlineExceeded desc = context deadline exceeded
Normal ExternalProvisioning 57s (x62 over 16m) persistentvolume-controller Waiting for a volume to be created either by the external provisioner 'ebs.csi.aws.com' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.
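The AuthFailure (HTTP 401) points at the credentials the controller is using rather than at provisioning itself. A hedged diagnostic sketch, assuming the credentials secret follows the upstream aws-ebs-csi-driver convention (a Secret named aws-secret in kube-system with key_id and access_key fields — adjust the names if your setup differs):

```shell
# Check that the secret exists and its keys decode to the expected credentials
kubectl -n kube-system get secret aws-secret -o jsonpath='{.data.key_id}' | base64 -d; echo
# Confirm the controller actually references the secret (env or envFrom)
kubectl -n kube-system get deployment ebs-csi-controller -o yaml | grep -B2 -A4 -i secret
```

These commands assume a live cluster, so run them on a host with kubectl configured against it.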
What did you expect to happen?
The PVC to bind and a volume to be created in AWS for the pod.
How can we reproduce it (as minimally and precisely as possible)?
Provision a Kubernetes cluster on AWS with EC2 instances using Kubespray.
To install the ebs-csi-driver:
Uncommented the aws_ebs_csi_enabled option in group_vars/all/aws.yml and set it to true.
Set persistent_volumes_enabled in group_vars/k8s_cluster/k8s_cluster.yml to true.
Attached an IAM role to all the EC2 instances allowing all EBS actions.
Created and applied a secret to provide AWS credentials (access key ID and secret access key).
Ran cluster.yml playbook.
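The two variable changes above can be sketched as follows (values as stated above; file paths follow the standard Kubespray inventory layout and may differ in your checkout):

```yaml
# inventory/mycluster/group_vars/all/aws.yml
aws_ebs_csi_enabled: true

# inventory/mycluster/group_vars/k8s_cluster/k8s_cluster.yml
persistent_volumes_enabled: true
```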
To fix the CSI_NODE_NAME env var not being set:
kubectl edit deployment.apps/ebs-csi-controller -n kube-system
and add the following under the ebs-plugin container:
env:
  - name: CSI_NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
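As a non-interactive alternative to kubectl edit, a hedged one-shot patch; this assumes the ebs-plugin is the first container in the deployment's pod spec and that it already has an env list (adjust the container index, or use "replace", if not):

```shell
kubectl -n kube-system patch deployment ebs-csi-controller --type=json -p='[
  {"op": "add",
   "path": "/spec/template/spec/containers/0/env/-",
   "value": {"name": "CSI_NODE_NAME",
             "valueFrom": {"fieldRef": {"fieldPath": "spec.nodeName"}}}}
]'
```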
OS
Linux 6.5.0-1022-aws x86_64
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
Version of Ansible
ansible [core 2.14.17]
config file = /etc/ansible/ansible.cfg
configured module search path = ['/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/local/lib/python3.10/dist-packages/ansible
ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
executable location = /usr/local/bin/ansible
python version = 3.10.12 (main, Jul 29 2024, 16:56:48) [GCC 11.4.0] (/usr/bin/python3)
jinja version = 3.1.2
libyaml = True
Version of Python
Python 3.10.12
Version of Kubespray (commit)
kubespray:v2.25.0
Network plugin used
cilium
Full inventory with variables
[all]
master1 ansible_host=10.0.0.101 ip=10.0.0.101
master2 ansible_host=10.0.4.70 ip=10.0.4.70
master3 ansible_host=10.0.15.218 ip=10.0.15.218
worker1 ansible_host=10.0.21.128 ip=10.0.21.128
worker2 ansible_host=10.0.24.96 ip=10.0.24.96
etcd1 ansible_host=10.0.5.14 ip=10.0.5.14
[kube_control_plane]
master2
master1
[etcd]
etcd1
[kube_node]
worker1
worker2
[calico_rr]
[k8s_cluster:children]
kube_control_plane
kube_node
calico_rr
Command used to invoke ansible
sudo docker run --rm -it --mount type=bind,source=/home/ubuntu/kubespray/inventory/mycluster/,dst=/inventory --mount type=bind,source=/home/ubuntu/.ssh/id_rsa,dst=/root/.ssh/id_rsa --mount type=bind,source=/home/ubuntu/.ssh/id_rsa,dst=/home/ubuntu/.ssh/id_rsa quay.io/kubespray/kubespray:v2.25.0 bash ansible-playbook -i /inventory/inventory.ini cluster.yml --user=ubuntu --become --become-user=root --private-key=/home/ubuntu/.ssh/id_rsa -e kube_network_plugin=cilium --flush-cache
Output of ansible run
PLAY RECAP *************************************************************************************
etcd1 : ok=137 changed=11 unreachable=0 failed=0 skipped=340 rescued=0 ignored=0
master1 : ok=491 changed=14 unreachable=0 failed=0 skipped=950 rescued=0 ignored=1
master2 : ok=540 changed=22 unreachable=0 failed=0 skipped=1040 rescued=0 ignored=1
worker1 : ok=412 changed=15 unreachable=0 failed=0 skipped=638 rescued=0 ignored=1
worker2 : ok=412 changed=15 unreachable=0 failed=0 skipped=633 rescued=0 ignored=1
Thursday 29 August 2024 00:22:10 +0000 (0:00:00.302) 0:08:02.550 *******
container-engine/runc : Download_file | Download item ---------------------------------- 10.46s
container-engine/containerd : Download_file | Download item ---------------------------- 10.18s
container-engine/crictl : Download_file | Download item -------------------------------- 10.09s
container-engine/nerdctl : Download_file | Download item -------------------------------- 9.99s
container-engine/crictl : Extract_file | Unpacking archive ------------------------------ 7.79s
kubernetes/preinstall : Update package management cache (APT) --------------------------- 7.64s
container-engine/nerdctl : Extract_file | Unpacking archive ----------------------------- 6.92s
kubernetes-apps/ansible : Kubernetes Apps | Start Resources ----------------------------- 5.63s
download : Download_file | Download item ------------------------------------------------ 5.38s
kubernetes-apps/ansible : Kubernetes Apps | Lay Down CoreDNS templates ------------------ 4.90s
etcdctl_etcdutl : Download_file | Download item ----------------------------------------- 4.84s
kubernetes-apps/ingress_controller/ingress_nginx : NGINX Ingress Controller | Create manifests --- 4.83s
download : Download | Download files / images ------------------------------------------- 4.57s
kubernetes-apps/ingress_controller/ingress_nginx : NGINX Ingress Controller | Apply manifests --- 4.44s
network_plugin/cilium : Cilium | Create Cilium node manifests --------------------------- 4.28s
container-engine/containerd : Containerd | Unpack containerd archive -------------------- 4.10s
etcdctl_etcdutl : Extract_file | Unpacking archive -------------------------------------- 3.94s
kubernetes-apps/metrics_server : Metrics Server | Create manifests ---------------------- 3.75s
network_plugin/cilium : Cilium | Start Resources ---------------------------------------- 3.69s
container-engine/containerd : Download_file | Create dest directory on node ------------- 3.61s
Anything else we need to know
No response