pool server - don't issue MANY_STORAGE_ISSUES after scaling up (dfbgus 4152) #9235
    @@ -1190,8 +1190,11 @@ function calc_hosts_pool_mode(pool_info, storage_by_mode, s3_by_mode) {
         const storage_count = hosts.by_service.STORAGE;
         const storage_offline = storage_by_mode.OFFLINE || 0;
         const storage_optimal = storage_by_mode.OPTIMAL || 0;
    +    const storage_low_capacity = storage_by_mode.LOW_CAPACITY || 0;
         const storage_offline_ratio = (storage_offline / host_count) * 100;
    -    const storage_issues_ratio = ((storage_count - storage_optimal) / storage_count) * 100;
    +    //don't count individual storage with low capacity as having issues.
    +    //low capacity is handled for the entire BS by free_ratio check below
    +    const storage_issues_ratio = ((storage_count - storage_optimal - storage_low_capacity) / storage_count) * 100;

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

Verify handling of zero storage_count

If storage_count is 0, the storage_issues_ratio division has a zero denominator. Confirm whether that case is guarded. Note: Line 1202 has similar logic for s3_issues_ratio.

Run the following script to check if there are any guards or validations for zero counts elsewhere in the codebase:

🏁 Script executed:

    #!/bin/bash
    # Description: Search for similar ratio calculations and guards for zero denominators
    # Search for ratio calculations with potential division by zero
    rg -n -C3 'ratio.*=.*\/.*count' src/server/system_services/pool_server.js
    # Search for guards checking if count is zero before division
    rg -n -C3 'count.*===.*0|count.*>.*0.*ratio' src/server/system_services/pool_server.js

Length of output: 1635

🏁 Script executed:

    #!/bin/bash
    # Search for storage_issues_ratio definition and nearby guards
    rg -n -C3 'storage_issues_ratio' src/server/system_services/pool_server.js
    # Search for any if-statements checking storage_count before use
    rg -n -C3 'if.*storage_count' src/server/system_services/pool_server.js
    # Search for s3_issues_ratio definition and nearby guards
    rg -n -C3 's3_issues_ratio' src/server/system_services/pool_server.js
    # Search for any if-statements checking s3_issues_ratio or its denominator
    rg -n -C3 'if.*s3_issues_ratio' src/server/system_services/pool_server.js

Length of output: 2783

Guard against zero denominators when computing storage_issues_ratio and s3_issues_ratio.

         const hosts_initializing = hosts.by_mode.INITIALIZING || 0;
         const hosts_migrating = (hosts.by_mode.INITIALIZING || 0) + (hosts.by_mode.MIGRATING || 0);
         const s3_count = hosts.by_service.GATEWAY;

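The zero-denominator concern raised in the review comment above can be illustrated with a minimal sketch. The helper names `safeRatio` and `storageIssuesRatio` are hypothetical and are not part of `pool_server.js`. Returning 0 for an empty pool is an assumption, since the thread does not say what the desired behavior is.

```javascript
// Hypothetical helper: percentage of `part` in `total`, returning 0 when
// `total` is not a positive number instead of NaN (0 / 0) or Infinity (n / 0).
function safeRatio(part, total) {
    return total > 0 ? (part / total) * 100 : 0;
}

// Sketch of the ratio from the diff: hosts that are neither OPTIMAL nor
// LOW_CAPACITY count as having issues.
function storageIssuesRatio(storageCount, storageByMode) {
    const optimal = storageByMode.OPTIMAL || 0;
    const lowCapacity = storageByMode.LOW_CAPACITY || 0;
    return safeRatio(storageCount - optimal - lowCapacity, storageCount);
}

// 4 storage hosts: 2 optimal, 2 low on capacity, none otherwise degraded.
console.log(storageIssuesRatio(4, { OPTIMAL: 2, LOW_CAPACITY: 2 })); // 0
// The unguarded arithmetic with no storage hosts yields NaN.
console.log(((0 - 0 - 0) / 0) * 100); // NaN
// The guarded version returns 0 in the same situation.
console.log(storageIssuesRatio(0, {})); // 0
```

With the pre-change formula, the first example would report 50% of hosts as having issues, which is the situation the PR title describes after scaling up.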
What about NO_CAPACITY mode?

Worth going over all other modes (enum here) and see if they can also be excluded from MANY_STORAGE_ISSUES.

Maybe anything that can be a result of a user operation (e.g. DELETING) or a temp thing (INITIALIZING) can also be ignored. @alphaprinz @nimrod-becker WDYT?

I tend to go with a no.
- Temp states will go away and no one will care about them.
- User actions affecting the state make sense.
- NO_CAPACITY is tricky to ignore because if some other issue incapacitates other hosts, you're left with nothing (as opposed to LOW_CAPACITY, which would still allow you some operational uptime).
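
One way to make the trade-off in this thread explicit is to keep the modes that do not count toward `MANY_STORAGE_ISSUES` in a single place. This is only a sketch: `NON_ISSUE_MODES` and `countStorageIssues` are hypothetical names, and it assumes the per-mode counts add up to the total storage count. The set holds just `OPTIMAL` and `LOW_CAPACITY`, matching the diff, so `NO_CAPACITY`, `DELETING`, and `INITIALIZING` still count as issues, in line with the reply above.

```javascript
// Hypothetical set of modes that do not count as storage issues.
const NON_ISSUE_MODES = new Set(['OPTIMAL', 'LOW_CAPACITY']);

// Sum the hosts whose mode is not in the excluded set.
function countStorageIssues(storageByMode, excluded = NON_ISSUE_MODES) {
    return Object.entries(storageByMode)
        .filter(([mode]) => !excluded.has(mode))
        .reduce((sum, [, count]) => sum + (count || 0), 0);
}

const byMode = { OPTIMAL: 3, LOW_CAPACITY: 2, NO_CAPACITY: 1, INITIALIZING: 1 };
console.log(countStorageIssues(byMode)); // 2
```

Excluding another mode would then be a one-line change to the set, which keeps the policy decision discussed here separate from the arithmetic.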