[Accton][wedge800cact] Fix warmboot VerifyHostToQueueMappingClassID failure #1334
Open
BrandonCheng0121 wants to merge 1 commit into
@Tianyu-Meta
Regarding the T1 warmboot.AgentQueuePerHostL2Test.VerifyHostToQueueMappingClassID failure: the issue is a warmboot stats race condition.
The test gets stale cached ACL stats within milliseconds of warmboot completing because the background stats thread updates them only every 1 second.
Solution:
Explicitly call updateStats() in SwSwitch::initialConfigApplied() after warmboot completes to immediately sync hardware stats to the software cache.
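The race and the fix can be illustrated with a minimal, deterministic sketch. Everything here is hypothetical: `ToySwitch`, its members, and `readRightAfterWarmboot` are illustrative stand-ins, not the FBOSS `SwSwitch` implementation or the actual diff in this PR. The periodic background thread is deliberately left out so the behavior is reproducible; the sketch only shows what a reader of the cache sees before the first periodic refresh.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for a switch agent; NOT the real FBOSS SwSwitch.
class ToySwitch {
 public:
  // Simulates the hardware counting one packet that hit an ACL entry.
  void hwCountPacket() {
    std::lock_guard<std::mutex> guard(mutex_);
    ++hwAclCounter_;
  }

  // Copies the hardware counter into the software cache. In the scenario
  // described above, a background thread would call this about once per
  // second; that thread is omitted here to keep the sketch deterministic.
  void updateStats() {
    std::lock_guard<std::mutex> guard(mutex_);
    cachedAclCounter_ = hwAclCounter_;
  }

  // Called once the initial config has been applied after warmboot.
  // With syncStats == true, the cache is refreshed immediately instead of
  // waiting for the next periodic update (the idea behind the fix).
  void initialConfigApplied(bool syncStats) {
    if (syncStats) {
      updateStats();
    }
  }

  // What a test sees when it reads stats: the cache, not the hardware.
  std::uint64_t getCachedAclCounter() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return cachedAclCounter_;
  }

 private:
  mutable std::mutex mutex_;
  std::uint64_t hwAclCounter_{0};
  std::uint64_t cachedAclCounter_{0};
};

// Returns the cached counter a test would observe if it read stats
// right after warmboot, before any periodic refresh has run.
inline std::uint64_t readRightAfterWarmboot(bool syncStats) {
  ToySwitch sw;
  sw.hwCountPacket();  // hardware counters already hold traffic
  sw.hwCountPacket();
  sw.initialConfigApplied(syncStats);
  return sw.getCachedAclCounter();
}

// Small self-check contrasting the two behaviors.
inline void checkSketch() {
  assert(readRightAfterWarmboot(false) == 0);  // stale cache
  assert(readRightAfterWarmboot(true) == 2);   // cache synced explicitly
}
```

Without the explicit sync, the reader gets the stale cached value even though the hardware counter has advanced; with it, the cache matches hardware as soon as `initialConfigApplied` returns.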
Pre-submission checklist
pip install -r requirements-dev.txt && pre-commit install
pre-commit run --files fboss/agent/SwSwitch.cpp fboss/agent/test/agent_hw_tests/AgentQueuePerHostL2Tests.cpp
Summary
Tests that read stats immediately after warmboot get stale cached values because the background stats thread refreshes the cache only once per second.
Test Plan
Test command:
for i in {1..10}; do
  echo "=== run $i attempts ==="
  time ./bin/run_test.py sai_agent --agent-run-mode mono \
    --filter=AgentQueuePerHostL2Test.VerifyHostToQueueMappingClassID \
    --skip-known-bad-tests "leaba/25.11.4210/25.11.4210/graphene202x" \
    --enable-production-features g202x \
    --config /opt/fboss/share/hw_test_configs/wedge800cact.agent.materialized_JSON \
    --fruid-path /home/Go_FBOSS_Test/W800CA-Fix/./fboss-configs/fboss/oss/scripts/run_configs/fruid.json \
    --mgmt-if eth0 \
    --platform_mapping_override_path /home/Go_FBOSS_Test/W800CA-Fix/./fboss-configs/fboss/lib/platform_mapping_v2/generated_platform_mappings/wedge800cact_platform_mapping-2026-0418-v0.7-honglim_20260409-del_pie.json \
    2>&1 | tee 463e_W800CACT_VerifyHostToQueueMappingClassID_DUT35_DVT1_v5_$(date '+%Y-%m-%d-%H:%M')_$i.log
done
The test consistently passes for 10 consecutive runs.