DAOS-18705 rebuild: set rebuild flag before creating rebuild_pool_tls #17941
- stop refreshing aggregation epoch while rebuilding
- set rebuilding flag before setting rebuild fence

Signed-off-by: Liang Zhen <gnailzenh@gmail.com>
Ticket title is 'DAOS 2.6.5: Interrupt rebuild with reintegration and interrupt with exclude with active IO'
Test stage Unit Test completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-17941/1/testReport/
Signed-off-by: Wang Shilong <shilong.wang@hpe.com>
/* local rebuild epoch mainly to constrain the VOS aggregation
 * to make sure aggregation will not cross the epoch
/*
 * XX: remove this.
Do you mean replacing it with another method?
Signed-off-by: Liang Zhen <gnailzenh@gmail.com>
Test stage Functional on EL 9 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17941/7/execution/node/1037/log
Signed-off-by: Liang Zhen <gnailzenh@gmail.com>
Test stage Functional on EL 9 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17941/8/execution/node/1032/log
;

epoch_range.epr_lo = epoch_min != 0 ? epoch_min + 1 : 0;
if (i == 0)
[minor] Perhaps add a comment explaining that even though epr_lo is set to zero here, filter optimizations are applied later, and that this is used to guarantee that aggregation cannot cross a snapshot.
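To make the point about snapshot boundaries concrete, here is a minimal sketch of how the lower bound in the quoted line behaves when consecutive ranges are built. Only the expression for `epr_lo` comes from the diff; the struct, the `epr_hi` field as used here, the `snapshot` parameter, and the helper name are assumptions for illustration and not the PR's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an epoch range; not the DAOS type. */
struct epoch_range_sketch {
	uint64_t epr_lo;
	uint64_t epr_hi;
};

/*
 * Build the range that ends at 'snapshot'. 'epoch_min' is the upper bound
 * of the previous range, or 0 when there is no previous range.
 */
static struct epoch_range_sketch
next_range_sketch(uint64_t epoch_min, uint64_t snapshot)
{
	struct epoch_range_sketch r;

	/* Assumed precondition: ranges are visited in increasing order. */
	assert(epoch_min == 0 || epoch_min < snapshot);

	/* Same expression as the line under review. */
	r.epr_lo = epoch_min != 0 ? epoch_min + 1 : 0;
	r.epr_hi = snapshot;
	return r;
}
```

With a previous bound of 100, the next range starts at 101, so two neighbouring ranges never share an epoch; with no previous bound, the range starts at 0.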
Test stage Unit Test completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-17941/9/testReport/
Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17941/12/execution/node/640/log
Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17941/12/execution/node/630/log
Test stage Functional Hardware Large MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17941/12/execution/node/765/log
…#17941) (#17962)

- set rebuild flag before creating rebuild_pool_tls, otherwise aggregation can progress to higher epoch than rebuild.
- aggregation doesn't do full scan anymore after rebuild
- fix a rpt refcount leak in rebuild_tgt_scan_handler()

Signed-off-by: Liang Zhen <gnailzenh@gmail.com>
Co-authored-by: Wang Shilong <shilong.wang@hpe.com>
Reviewed-by: Xuezhao Liu <xuezhao.liu@hpe.com>
Reviewed-by: Niu Yawei <yawei.niu@hpe.com>
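The ordering argument in the first item can be illustrated with a toy model. This is a sketch under stated assumptions, not DAOS code: `pool_sketch`, `agg_refresh`, and `rebuild_begin_sketch` are hypothetical names, and the race is simulated by calling the aggregation refresh between the two steps rather than from a second thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names are DAOS symbols. */
struct pool_sketch {
	bool     rebuilding;    /* raised while a rebuild is in flight */
	uint64_t rebuild_fence; /* epoch the rebuild needs to read at */
	uint64_t agg_epoch;     /* highest epoch aggregation may reach */
};

/* Aggregation side: refresh the bound only while no rebuild is flagged. */
static void
agg_refresh(struct pool_sketch *p, uint64_t now)
{
	if (!p->rebuilding)
		p->agg_epoch = now;
}

/*
 * Start a rebuild with fence epoch 'fence' and let one aggregation refresh
 * at epoch 'racing_now' land between the two steps. Returns the resulting
 * aggregation bound.
 */
static uint64_t
rebuild_begin_sketch(struct pool_sketch *p, uint64_t fence,
		     uint64_t racing_now, bool flag_first)
{
	if (flag_first) {
		p->rebuilding = true;
		agg_refresh(p, racing_now); /* no-op: flag is already set */
		p->rebuild_fence = fence;
	} else {
		p->rebuild_fence = fence;
		agg_refresh(p, racing_now); /* bound moves past the fence */
		p->rebuilding = true;
	}
	assert(p->rebuilding);
	return p->agg_epoch;
}
```

In this model, raising the flag first leaves the aggregation bound where it was, while recording the fence first lets a racing refresh push the bound above the fence, which is the situation the commit message describes.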
Signed-off-by: Liang Zhen <gnailzenh@gmail.com>
Signed-off-by: Wang Shilong <shilong.wang@hpe.com>
82a7a0f
Signed-off-by: Liang Zhen <gnailzenh@gmail.com>
Test stage Test RPMs on EL 9.6 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17941/14/execution/node/988/log
…#17941) (#17985)

- set rebuild flag before creating rebuild_pool_tls, otherwise aggregation can progress to higher epoch than rebuild.
- aggregation doesn't do full scan anymore after rebuild
- fix a rpt refcount leak in rebuild_tgt_scan_handler()

Signed-off-by: Liang Zhen <gnailzenh@gmail.com>
Co-authored-by: Wang Shilong <shilong.wang@hpe.com>
Reviewed-by: Xuezhao Liu <xuezhao.liu@hpe.com>
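The commit message does not say where the rpt reference was leaked in rebuild_tgt_scan_handler(). As a generic illustration only, the sketch below shows the usual single-exit pattern that keeps a get/put pair balanced on every path; `rpt_sketch`, `rpt_get_sketch`, `rpt_put_sketch`, and `scan_handler_sketch` are hypothetical names and do not reflect the actual handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reference-counted object; not the DAOS rpt structure. */
struct rpt_sketch {
	int refcount;
};

static void
rpt_get_sketch(struct rpt_sketch *rpt)
{
	rpt->refcount++;
}

static void
rpt_put_sketch(struct rpt_sketch *rpt)
{
	assert(rpt->refcount > 0);
	rpt->refcount--;
}

/* Every path, including early failures, leaves through 'out'. */
static int
scan_handler_sketch(struct rpt_sketch *rpt, bool fail_early)
{
	int rc = 0;

	rpt_get_sketch(rpt);
	if (fail_early) {
		rc = -1;
		goto out; /* a bare 'return rc' here would leak the reference */
	}
	/* ... scan work would go here ... */
out:
	rpt_put_sketch(rpt);
	return rc;
}
```

The reference count is the same before and after the call on both the success path and the early-failure path.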