Unable to connect to csi.sock #56

Open
jlpedrosa opened this issue Mar 31, 2024 · 3 comments
Labels: documentation (Improvements or additions to documentation), upstream (Something is broken elsewhere)

Comments

@jlpedrosa

jlpedrosa commented Mar 31, 2024

I'm unable to get the CSI driver to start correctly, specifically the HPE part.

The csi-node-driver-registrar container of the DaemonSet logs:

W0331 13:44:10.456786       1 connection.go:183] Still connecting to unix:///csi/csi.sock
W0331 13:44:20.456542       1 connection.go:183] Still connecting to unix:///csi/csi.sock
W0331 13:44:30.461561       1 connection.go:183] Still connecting to unix:///csi/csi.sock
W0331 13:44:40.456518       1 connection.go:183] Still connecting to unix:///csi/csi.sock
...

In the other container I see a lot of errors. I don't know whether they are all fatal, but the last one looks like it is:

+ echo 'starting csi plugin...'
+ exec /bin/csi-driver --endpoint=unix:///csi/csi.sock --node-service --flavor=kubernetes
starting csi plugin...
time="2024-03-31T13:43:11Z" level=info msg="Initialized logging." alsoLogToStderr=true logFileLocation=/var/log/hpe-csi-node.log logLevel=info
time="2024-03-31T13:43:11Z" level=info msg="**********************************************" file="csi-driver.go:54"
time="2024-03-31T13:43:11Z" level=info msg="*************** HPE CSI DRIVER ***************" file="csi-driver.go:55"
time="2024-03-31T13:43:11Z" level=info msg="**********************************************" file="csi-driver.go:56"
time="2024-03-31T13:43:11Z" level=info msg=">>>>> CMDLINE Exec, args: []" file="csi-driver.go:58"
time="2024-03-31T13:43:11Z" level=info msg="got OS details as [redhat 9 2 5.15.0-1049-raspi]\n" file="os.go:95"
time="2024-03-31T13:43:16Z" level=warning msg="Distro section: Ubuntu , not present for deviceType: Nimble , using default config" file="config.go:247"
time="2024-03-31T13:43:16Z" level=info msg="No further iSCSI recommendations are found for this host" file="iscsi.go:399"
time="2024-03-31T13:43:16Z" level=error msg="open /sys/firmware/dmi/tables/DMI: no such file or directory" file="file.go:117"
time="2024-03-31T13:43:16Z" level=error msg="unable to get system information using sysfs as well open /sys/firmware/dmi/tables/DMI: no such file or directory" file="system.go:108"
time="2024-03-31T13:43:16Z" level=error msg="unable to determine if system is running as a virtual machine cannot determine if system is of type virtual machine, parameter not found in system information string: manufacturer" file="multipath.go:182"
time="2024-03-31T13:43:16Z" level=error msg="unable to determine if multipath is required cannot determine if system is of type virtual machine, parameter not found in system information string: manufacturer" file="multipath.go:202"
time="2024-03-31T13:43:16Z" level=error msg="Failed to execute CLI handler, Err: Unable to configure multipathd service, err cannot determine if system is of type virtual machine, parameter not found in system information string: manufacturer" file="csi-driver.go:62"

I see multiple things that look wrong.
The distro is actually Ubuntu, but the log first says "got OS details as [redhat 9 2 5.15.0-1049-raspi]" and later says "Distro section: Ubuntu , not present for deviceType: Nimble , using default config" (config.go:247).

Indeed, the path /sys/firmware/dmi/ does not exist.

The "server" is a Raspberry Pi that is also booted over iSCSI. I hope we can fix this before you actually release 2.4.1?

Also, a heads-up: I modified the kubelet path because the one created by default was incorrect and contained a double slash (//).

This is the deployment:

  values:
    hpe-csi-driver:
      kubeletRootDir: "/var/lib/kubelet"
    service:
      type: LoadBalancer
      port: 8080
    ingress:
      enabled: true
      className: contour
      annotations:
        cert-manager.io/cluster-issuer: letsencrypt-prod
        kubernetes.io/tls-acme: "true"
        kubernetes.io/ingress.class: contour
      hosts:
        - host: mydomain
          paths:
            - path: /
              pathType: ImplementationSpecific
      tls:
      - secretName: truenas-csp-tls
        hosts:
        - mydomain
@datamattsson
Collaborator

Hi! Thanks for filing this issue. I'm digging around and finding a few skeletons.

The workaround you need to apply here is to install the Helm chart and disable node configuration and conformance.

To apply hpe-csi-driver parameters to the TrueNAS CSP chart, prefix the values with hpe-csi-driver, like --set hpe-csi-driver.disableNodeConformance --set hpe-csi-driver.disableNodeConfiguration. Somehow I missed documenting this, and I'm not in front of a computer to verify it.

There won't be a 2.4.1 of the TrueNAS CSP. We're working on 2.4.2 of the CSI driver because we uncovered an issue with 3PAR that needs immediate attention.

It won't include any fix for this issue, so the workaround will be to manually install and configure iSCSI/multipath and disable node configuration/conformance.
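
A minimal sketch of that workaround as Helm values rather than `--set` flags. It assumes both parameters are booleans that take `true`, and that they nest under the `hpe-csi-driver` key the same way `kubeletRootDir` does in the deployment posted above. Like the flags, this has not been verified against the chart here.

```yaml
# Values passed through to the hpe-csi-driver subchart of the TrueNAS CSP chart.
hpe-csi-driver:
  kubeletRootDir: "/var/lib/kubelet"
  # Assumption: boolean switches. With these set, iSCSI and multipath
  # must already be installed and configured on every node by hand.
  disableNodeConformance: true
  disableNodeConfiguration: true
```

With node configuration disabled, the driver should no longer attempt the multipathd setup that fails in the log above.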

@datamattsson datamattsson added the documentation Improvements or additions to documentation label Mar 31, 2024
@jlpedrosa
Author

Hi @datamattsson

Thanks for the help, the errors are gone now. Let me know if you need me to run any tests to help solve this in a more permanent way. If the recommended way is to disable those, there should probably be docs about the required packages?

Thanks again!

@datamattsson
Collaborator

> Let me know if you need me to run any tests to solve it in a more permanent way or if the recommended way is to disable those, then probably there should be docs about the packages required?

Here are the docs for the "disable" parameters: https://scod.hpedev.io/csi_driver/operations.html#manual_node_configuration

The permanent fix for this issue is to make the error that is raised non-fatal. I'll file an internal JIRA for this.

@datamattsson datamattsson added the upstream Something is broken elsewhere label May 6, 2024