Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Improved location claim algorithm (#1008)
* Configuration and module alignment Add claim function and target_n_val configuration into cuttlefish. Move modules around to try and make it more obvious where functions used in membership reside. * Remove deprecated v1 claim/wants * Update framework for Claim Stops fake wants being required to prompt claim on a location change. Allow for a claim module to implement a sort_members_for_choose(Ring, Members, Owners) -> SortedMembers function, to pre-sort the members being passed into claim_rebalance. Add further specs. * Add choose_claim_v4 * Location claim improvements Location claim improved so that it will try to balance the spread of vnodes, if it reaches the end and is still unbalanced. Also uses a stronger meets_target_n to fallback to sequential_claim more reliably on incorrect spacing (of vnodes across nodes, but not yet across locations). * Refinements to claim_v4 Extended potential for test by determining what nodes are safe to both add and remove from loops, rather than simply relying on sequential order. * Correction following removal of log * Count remove list not Excess to determine loops * Better order of initial striping Resolves some test issues. Also try harder to do safe removals when looking back in the ring (as opposed to the removal other additions) * A new claim algorithm (#1003) * Support two transition changes Where the second transition is triggered by a change of location. Need to ensure that the location_changed status update is recognised in the ring * Unrelated fix to remove reference to gen_fsm_compat * unrelated fix to get rid of deprecation warning * Testing claim * The new claim algorithm as purely functional algorithm * add new entry for version 5 claiming * Refactor v5 into v4 * move impossible config test to place where we actually may enter recursion * Documentation The algorithm should be described in more detail in a markup document * Allow configurations with zero nodes in location for better placement update This works better when a location is emptied on nodes. Less transfers. * Keep order of nodes to avoid back translate issue --------- Co-authored-by: Martin Sumner <[email protected]> * Claim API requires export of 2-arity choose function as well as 3-arity * Always return indices Otherwise, if all vnodes have become excluded there is no escape from this condition (unless other traffic can trigger the creation of vnodes). This is helpful in situations where transfers are performed on standby clusters with no other traffic. This commit also logs a timing of the claim each time it is called. * Calculate swaps only once * Remember v4 solutions via claimant To allow for the riak_core_ring_manager and riak_core_claimant to remember v4 solutions, they are shared via the state of the claimant * Long-running tests * Adding an extra test (#1004) * Add an extra test to show owners may stay the same if only location changes This is not always the case, but holds when there is a solution in the first place * Fix type error that dialyzer could not find * Introduce necessary conditions to fallback to version 2 * update tests * Check whether it is worth to use brute force * make historic values the norm * Introduce nvals map type * Take nr nodes into account when checking for brute force cond. * Property to evaluate skipping brute force strategy * QuickCheck property starts with choosing ring size. * Remove fallback for necessary conditions * Filter tests to get away with flakyness * In order to re-run tests suite, remove strict precondition * Check in test suite * Replace claim_suite.suite by larger claim.suite * Sometimes it is worth to brute_force to a zero node violation * better documentation binring algorithm * Run property with a sufficient condition * Revert "Long-running tests" This reverts commit 6527033. * Test adjustments * Test adjustments * Add support for configured target_location_n_val * Memoise fixes The cache of v4 solutions is required by the ring_manager and the claimant - so specifically update both of these processes each time. Otherwise cache will be missed when the ring_manager calls to riak_core_claimant:ring_changed/2. There is a fix to the last gasp check before writing the ring file. prune_write_notify_ring function does not care if the write of a ring errors - so error for this function rather than crashing the ring manager. This otherwise causes instability in location tests. * Example configurations saves in source format (#1005) * Remove pre-computed test suite * cleanup * Make claim_eqc tests not fail on weird configs by supplying a diverse list of options * Add full-rebalance for v4 The leave call on a failure of simple_transfer will call sequential_claim - which is part of the v2 claim family. Now we have v4, if this is configured it should call v4 as it does handle leaves. * Support leave in prop_claim * Update - to use correct claim_fun on leave Property temporarily changed to consider only failures with locations * Use application env to read target_n_val (#1007) * Use application env to read target_n_val * Re-introduce v2 in riak_core_claim_eqc * Move precondition to postcondition to also test less perfect cases * Fixed error in transfer_node usage * cleanup not using remove_from_cluster/5. * Add warning if simple_transfer produces unbalanced result In this case - full_rebalance should be enabled * only_swap/swap_only confusion Add recommendation to use full_rebalance_on_leave for locations * Update riak_core_claim_eqc.erl * Mas i1001 docupdate (#1009) * Change doc Change introduction to refer to vnodes and nodes. Removes the recommendation not to vary location_n_val and node_n_val. * Update comments on having different target n_vals * Further doc updates * Update docs/claim-version4.md * Update docs/claim-version4.md * Update docs/claim-version4.md Co-authored-by: Thomas Arts <[email protected]> * Update docs/claim-version4.md Co-authored-by: Thomas Arts <[email protected]> --------- Co-authored-by: Thomas Arts <[email protected]> --------- Co-authored-by: Thomas Arts <[email protected]>
- Loading branch information