update troubleshooting guide #216

Open
wants to merge 1 commit into
base: main
from

a-mccarthy
Collaborator

Updates to troubleshooting guide to include more details on common issues.

To do:

  • Align on formatting for presenting troubleshooting info. Do we want to stick with the current format: Issue, Observation, Root Cause, Actions? I feel like it's a bit too rigid for some of the new content. For example, a root cause may depend on the user's system, so that section could lead to more confusion. Also, to me, Issue and Observation are very similar sections and could be removed in favor of having the information as the first paragraph under the section title.

Signed-off-by: Abigail McCarthy <[email protected]>
@@ -306,3 +624,17 @@ EFI Secure Boot is currently not supported with GPU Operator
:class: h4

Disable EFI Secure Boot on the server.

File an issue
Collaborator Author

@tariq1890 do we want to point folks to the GitHub repo for filing issues if they can't find their issue on this page? Is the must-gather script the best way for folks to share info about their system in a GitHub issue?

Contributor

So the troubleshooting guide is not meant to be an exhaustive list of gpu-operator issues. Users can either file a GitHub issue, or they are free to contribute to the docs if they feel the troubleshooting guide should cover more issues.

Documentation preview

https://nvidia.github.io/cloud-native-docs/review/pr-216

@tariq1890
Contributor

Do we want to stick with current format: Issue, Observation, Root Cause, Actions?

I am definitely open to a better format. This was the format I had gone with when creating the first draft of the troubleshooting guide. If there is a better and more reader-friendly format, we should definitely switch to that.

@tariq1890
Contributor

Thanks for this PR, @a-mccarthy!
