Right now when we run a test, we blow away the contents of the etcd store before every test, to prevent side effects of user interactions causing a collision between tests, and making them flaky.
There are some downsides to this, but the return has justified it for its current usage. The one project pushing the development of this ticket is deployments, which is working loosely towards a system in which service tests can be coupled into a shared infrastructure before running all of their respective e2e tests.
It should also make our current test runs more reliable, as most of our test failures come from timeouts in the /debug/provision step (aka: waiting for the cluster to recover from wiping test data).
The gist of the solution is to rejigger the debug server to handle pre-test provisioning a little differently, fitting this rough description:
Given that we have a shared database amongst all the other tests across all the other teams:
1. We need to generate a random org + user + token combo to use for the test runner, and handle collisions during creation, but only at the org level.
2. Before every test, we cycle through the /api/v2/:resourceType -> /api/v2/:resourceType/delete/:id dance for that org to fetch all resources that could ever be generated, and delete them all.
3. After all of the tests, do that dance one more time, and then kill the org + user + token combo.
4. Flush the test env every month or so for those times we didn't get to 3, for whatever reason.
The last point is mostly to acknowledge that there is a trade-off between correctness and cost of development, and that there are operational means to solve technical problems. We can find the balance later.
As always, feel free to solve it a different way; this is just a starting point for the intent.
> The gist of the solution is to rejigger the debug server to handle pre-test provisioning a little differently, fitting this rough description:
Is this the only solution to this problem? It seems from my outside perspective that it's a solution constrained by the current system. It's a solution that is meant to work within the current confines. I'm curious if those confines are relevant.
> we blow away the contents of the etcd store before every test, to prevent side effects of user interactions causing a collision between tests, and making them flaky.
Is this a reality of the world, or is this just how our system has evolved based on our tests? Can we change the underlying assumptions our tests make about what resources exist when the test starts?