HAMi-WebUI cannot monitor GPU resources on nodes #5

Open
lsj-x opened this issue Oct 24, 2024 · 6 comments


lsj-x commented Oct 24, 2024

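Problem: The Resource Overview, Node Management, and other pages show no data, and the metrics exposed on hami-webui port 8000 contain no metrics whose names start with hami.

Cause: The WebUI selects its node list using the single label gpu=on. If a GPU node carries a label other than gpu=on, no data is returned for it.
image

Workaround: Label the GPU nodes with gpu=on.

Suggested improvement: As in HAMi, the GPU node label (gpu=on) should be configurable.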
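
The selection behaviour described above can be illustrated with a small Python sketch. This is not the WebUI's actual code: the node inventory, the `parse_selector` and `select_nodes` helpers, and the `GPU_NODE_SELECTOR` environment variable are all hypothetical, chosen only to show why a GPU node with a different label is filtered out and how a configurable selector would avoid that.

```python
import os

# Hypothetical node inventory: node name -> labels (not taken from a real cluster).
NODES = {
    "node-a": {"gpu": "on"},
    "node-b": {"accelerator": "nvidia"},  # a GPU node, but labeled differently
    "node-c": {},                         # no GPU-related labels
}


def parse_selector(selector: str) -> dict[str, str]:
    """Parse a simple equality selector such as "gpu=on,zone=a" into a dict."""
    pairs = (item.split("=", 1) for item in selector.split(",") if item.strip())
    return {key.strip(): value.strip() for key, value in pairs}


def select_nodes(nodes: dict[str, dict[str, str]], selector: str) -> list[str]:
    """Return the sorted names of nodes whose labels match every key=value pair."""
    wanted = parse_selector(selector)
    return sorted(
        name
        for name, labels in nodes.items()
        if all(labels.get(key) == value for key, value in wanted.items())
    )


# Hard-coded selector, as described in the report: node-b is silently skipped.
hard_coded = select_nodes(NODES, "gpu=on")

# Configurable selector (assumed variable name), falling back to gpu=on.
configured = select_nodes(NODES, os.environ.get("GPU_NODE_SELECTOR", "gpu=on"))
```

With the hard-coded selector only `node-a` is selected; passing `accelerator=nvidia` instead would select `node-b`. That is the flexibility the suggestion asks for.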

lsj-x changed the title from "HAMi-WebUI无法监控导节点gpu资源情" to "HAMi-WebUI无法监控导节点gpu资源情况" (HAMi-WebUI cannot monitor GPU resources on nodes) on Oct 24, 2024
@yule-sun

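Hello, why are all the dashboard figures at the top 0? After deploying on my side, everything shows 0 as well.
Also, how are the vGPU count and compute power calculated?
image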

@liujunfei980

Problem: The HAMi monitoring page shows no data, as in the screenshot:
image


mel3c commented Nov 8, 2024

image
Which exporter collects the metrics whose names begin with "hami_"? This is not described anywhere in the documentation.

@liujunfei980

I deployed with: helm install my-hami-webui hami-webui/hami-webui --namespace hami -f values.yaml
I changed only the following parameters in values.yaml, as shown in the screenshot:
image
Is my deployment method wrong?


mel3c commented Nov 8, 2024

I don't think it's a deployment issue. I can't query any metrics starting with "hami_" in Prometheus either. Some HAMi exporter is probably missing, so those metrics are never collected.

@liujunfei980

I can see my-hami-webui-serviceMonitor in the Targets page of Prometheus, and I can also see metrics when I open the metrics endpoint, but none of them start with hami_. What is the reason for this? I'm out of ideas.
image
image
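
One way to confirm the symptom reported in this thread is to scan the text served by the metrics endpoint for metric names with the hami_ prefix. The Python sketch below does this on made-up samples in the Prometheus text exposition format. The `metric_names_with_prefix` helper and every metric name in the samples, including `hami_example_metric`, are illustrative assumptions, not actual hami-webui output.

```python
import re

# Matches a metric name at the start of a sample line in the Prometheus text format.
METRIC_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_:]*")


def metric_names_with_prefix(exposition_text: str, prefix: str = "hami_") -> list[str]:
    """Return the sorted, de-duplicated metric names that start with `prefix`."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE comments
            continue
        match = METRIC_NAME.match(line)
        if match and match.group(0).startswith(prefix):
            names.add(match.group(0))
    return sorted(names)


# Made-up sample resembling the reported state: metrics exist, none named hami_*.
SAMPLE_WITHOUT_HAMI = """\
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 12
process_cpu_seconds_total 0.42
"""

# Made-up sample with a hypothetical hami_ metric, for contrast.
SAMPLE_WITH_HAMI = SAMPLE_WITHOUT_HAMI + 'hami_example_metric{node="node-a"} 1\n'

missing = metric_names_with_prefix(SAMPLE_WITHOUT_HAMI)
present = metric_names_with_prefix(SAMPLE_WITH_HAMI)
```

An empty result on the real endpoint text would match what is described here: the scrape target is up, but no hami_ metrics are being exported.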
