GC improvements: GC only on a single node and add a missing index in PG #2159
Conversation
Force-pushed from a96f3fb to 62550af
// RunWithLocksClient runs the provided function with a pglock.Client. Should only be used
// for migrations.
func RunWithLocksClient(conn *pgx.Conn, runner func(client *pglock.Client) error) error {
Make it very clear this is only for migrations.
Suggested change:
- func RunWithLocksClient(conn *pgx.Conn, runner func(client *pglock.Client) error) error {
+ func RunWithLocksClientForMigrations(conn *pgx.Conn, runner func(client *pglock.Client) error) error {
client, err := pglock.UnsafeNew(db,
	pglock.WithCustomTable(locksTableName),
	pglock.WithLeaseDuration(timeout),
	pglock.WithHeartbeatFrequency(heartbeatFrequency),
The default gcTimeout is 1m, meaning the lease will last one minute. The heartbeat is 2 seconds by default. If folks change the gcTimeout via SpiceDB configuration, the heartbeat won't scale with it. The WithHeartbeatFrequency docs indicate the heartbeat should never be greater than half of the timeout:
// WithHeartbeatFrequency defines the frequency of the heartbeats. Heartbeats
// should have no more than half of the duration of the lease.
Given that user-provided configuration could violate this, I propose we use a fraction of the timeout, e.g. 1/3. You should also use max(heartbeatFrequency, timeout/3) to ensure we never go to zero or end up with an excessively frequent heartbeat.
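A minimal sketch of the derivation I have in mind, assuming `time` is imported; `heartbeatFor` is a hypothetical helper, not existing code, and would be called wherever the pglock client is constructed:

```go
// heartbeatFor scales the heartbeat with the lease duration (1/3 of it), but
// never lets it drop below the current default, so we don't heartbeat
// excessively when the configured timeout is small.
func heartbeatFor(gcTimeout, defaultHeartbeat time.Duration) time.Duration {
	scaled := gcTimeout / 3
	if scaled < defaultHeartbeat {
		return defaultHeartbeat
	}
	return scaled
}
```

That way a larger user-provided gcTimeout keeps the heartbeat comfortably under half of the lease.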
internal/datastore/postgres/gc.go
func (pgd *pgDatastore) LockGCRun(ctx context.Context, timeout time.Duration, gcRun func(context.Context) error) (bool, error) {
	if pgd.gcInterval < lockMinimumInterval {
		return true, gcRun(ctx)
It seems like with a gcInterval of less than 30 seconds, locks will be bypassed. If a customer reconfigures SpiceDB GC, they could end up regressing the singleflighted GC runs, overloading the datastore, and causing an incident. That seems like dangerous behaviour we should prevent.

I don't think this should be compared against lockMinimumInterval, but against timeout, which is what's used for the lock duration. If that's the case, we can validate this during application bootstrap instead of failing later on, when the service is already considered healthy. I vaguely recall we already had such safeguards; better double-check.
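Something like the following is what I mean by validating at bootstrap. This is only a sketch: `validateGCConfig` and its parameters are hypothetical names, not actual SpiceDB config fields, and `time`/`fmt` imports are assumed:

```go
// validateGCConfig rejects a GC interval shorter than the lock timeout at
// startup, so we fail fast instead of silently bypassing the lock at runtime.
func validateGCConfig(gcInterval, gcTimeout time.Duration) error {
	if gcInterval < gcTimeout {
		return fmt.Errorf("gc interval (%s) must not be shorter than the gc lock timeout (%s)", gcInterval, gcTimeout)
	}
	return nil
}
```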
internal/datastore/postgres/gc.go
// Run the GC process under the lock.
currentTimestampData, err := time.Now().UTC().MarshalBinary()
if err != nil {
	return fmt.Errorf("failed to marshal current timestamp: %w", err)
}
It does not seem like this is being used to drive the lock. If the reason for it is being able to jump into the Postgres instance and troubleshoot the locks, then please clarify that with a comment accordingly.
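If troubleshooting is indeed the intent, a comment along these lines would be enough (suggested wording only):

```go
// The timestamp stored as the lock's data is purely informational: it lets an
// operator inspect the locks table and see when GC last ran. It plays no part
// in acquiring or holding the lock.
currentTimestampData, err := time.Now().UTC().MarshalBinary()
if err != nil {
	return fmt.Errorf("failed to marshal current timestamp: %w", err)
}
```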
internal/datastore/common/gc.go
// If the implementation does not support locking, it should just execute
// the function and return true.
// If GC was run within the last interval, the function should return false.
LockGCRun(ctx context.Context, timeout time.Duration, gcRun func(context.Context) error) (bool, error)
I don't see any new tests added. Please add one to the Postgres integration test suite, simulating multiple GC instances running at once.
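Roughly what I have in mind, as a sketch only: `newTestDatastore` is a hypothetical helper standing in for however the suite builds a datastore against the shared Postgres instance, and the usual `errgroup`/`require`/`atomic` imports are assumed:

```go
func TestGCRunsOnOnlyOneNode(t *testing.T) {
	var runs atomic.Int64
	gcRun := func(ctx context.Context) error {
		runs.Add(1)
		return nil
	}

	// Simulate several nodes contending for the same GC lock at once.
	g, ctx := errgroup.WithContext(context.Background())
	for i := 0; i < 5; i++ {
		g.Go(func() error {
			ds := newTestDatastore(t) // hypothetical helper
			_, err := ds.LockGCRun(ctx, time.Minute, gcRun)
			return err
		})
	}
	require.NoError(t, g.Wait())

	// Only one of the contending "nodes" should have executed the GC pass.
	require.Equal(t, int64(1), runs.Load())
}
```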
if err := DatabaseMigrations.Register("add-gc-lock-table", "add-expiration-support",
	func(ctx context.Context, conn *pgx.Conn) error {
		return common.RunWithLocksClient(conn, func(client *pglock.Client) error {
			return client.TryCreateTable()
It makes me uncomfortable to be at the mercy of a third-party library's migration code. How do we know that bumping the library won't completely change the definition of the table and cause subsequent migrations to fail?
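One way to make this less fragile would be to pin the DDL in our own migration instead of calling into the library. The statement below is only a placeholder to illustrate the idea; the real one would have to be copied verbatim from the pglock version we pin:

```go
// Placeholder DDL, to be copied from the pinned pglock release at the time of
// the migration, so a later library bump cannot silently change the table shape.
const createGCLocksTable = `CREATE TABLE locks (
	name VARCHAR(255) PRIMARY KEY,
	record_version_number BIGINT,
	data BYTEA,
	owner VARCHAR(255)
)`
```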
"github.com/jackc/pgx/v5" | ||
) | ||
|
||
const addGCIndexForRelationTupleTransaction = `CREATE INDEX CONCURRENTLY |
Please add documentation to the PR body on why this index was added, along with the EXPLAIN output before and after the change.
Added a comment in the code.
Force-pushed from 62550af to 9fcc5c1
Redesigned to use the native locks as discussed.
1) Have a GC lock so that GC only runs on a single node at a time
2) Add a missing index in the Postgres datastore for GC

This should reduce datastore CPU pressure.
Force-pushed from 9fcc5c1 to c5d52b1
LGTM, though need to resolve the test failures
Force-pushed from 405bec5 to b7d77b7
Changes look good to me, BUT:

Please note this only solves part of the problem: it prevents multiple nodes from GC'ing simultaneously, which avoids contention and spikes, but it does not elect a leader node that is the only one running GC.

For example, if 100 nodes are doing GC and have their timers sufficiently skewed, it's possible to have 100 nodes GC'ing one after the other (an extreme, worst-case scenario). It won't happen at the same time, but it is still an inefficient use of datastore compute and can cause sustained load on the datastore, proportional to the number of nodes.
internal/datastore/postgres/migrations/zz_migration.0022_add_missing_gc_index.go
That's true, but following the first GC, the next N GC passes should be more or less a no-op. I considered adding a "last time GCed", but that would require a new table.
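For what it's worth, the check itself would be small. A minimal sketch, with `lockData` standing in for wherever the last-run timestamp ends up being stored (names are illustrative only):

```go
// Skip the GC pass if another node already completed one within the interval.
var lastGC time.Time
if err := lastGC.UnmarshalBinary(lockData); err == nil && time.Since(lastGC) < pgd.gcInterval {
	return false, nil
}
```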
This is necessary because GC takes an exclusive lock now
Force-pushed from b7d77b7 to 4f1ed5f
Updated.