GC improvements: GC only on a single node and add a missing index in PG #2159
base: main
Conversation
Force-pushed from a96f3fb to 62550af
```go
// RunWithLocksClient runs the provided function with a pglock.Client. Should only be used
// for migrations.
func RunWithLocksClient(conn *pgx.Conn, runner func(client *pglock.Client) error) error {
```
Make it very clear this is only for migrations.
```diff
-func RunWithLocksClient(conn *pgx.Conn, runner func(client *pglock.Client) error) error {
+func RunWithLocksClientForMigrations(conn *pgx.Conn, runner func(client *pglock.Client) error) error {
```
```go
client, err := pglock.UnsafeNew(db,
	pglock.WithCustomTable(locksTableName),
	pglock.WithLeaseDuration(timeout),
	pglock.WithHeartbeatFrequency(heartbeatFrequency),
```
The default `gcTimeout` is 1m, meaning the lease will last 1 minute. The heartbeat is 2 seconds by default. If folks change the `gcTimeout` via SpiceDB configuration, the heartbeat won't scale with it. The `WithHeartbeatFrequency` docs indicate it should never be greater than half of the timeout:

```go
// WithHeartbeatFrequency defines the frequency of the heartbeats. Heartbeats
// should have no more than half of the duration of the lease.
```

Given that user-provided configuration could violate this, I propose we use a fraction of the timeout, e.g. 1/3. You should also use `max(heartbeatFrequency, timeout/3)` to ensure the heartbeat never goes to zero or becomes overly frequent.
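A minimal sketch of that derivation (assumes Go 1.21+ for the built-in `max`; the variable names mirror the diff above, and the surrounding wiring is illustrative rather than the actual SpiceDB code):

```go
// Scale the heartbeat with the configured GC timeout (1/3 of the lease), using the
// default heartbeatFrequency as a floor so the heartbeat never becomes overly frequent.
heartbeat := max(heartbeatFrequency, timeout/3)

client, err := pglock.UnsafeNew(db,
	pglock.WithCustomTable(locksTableName),
	pglock.WithLeaseDuration(timeout),
	pglock.WithHeartbeatFrequency(heartbeat),
)
```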
internal/datastore/postgres/gc.go
```go
func (pgd *pgDatastore) LockGCRun(ctx context.Context, timeout time.Duration, gcRun func(context.Context) error) (bool, error) {
	if pgd.gcInterval < lockMinimumInterval {
		return true, gcRun(ctx)
```
It seems like with a `gcInterval` of less than 30 seconds, locks will be bypassed. If a customer reconfigures SpiceDB GC, they could end up regressing the singleflighted GC runs, overloading the datastore, and causing an incident. That seems like dangerous behaviour we should prevent.

I don't think this should be compared against `lockMinimumInterval`, but against `timeout`, which is what's used for the lock duration. If that's the case, we can validate this during application bootstrap, instead of failing later on, when the service is already considered healthy. I vaguely recall we already had such safeguards; better double-check.
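A rough sketch of validating this at bootstrap (the helper name is hypothetical; only the comparison against the lock timeout is the point being suggested):

```go
// Hypothetical bootstrap-time check: refuse to start with a GC interval shorter
// than the GC lock timeout, rather than silently bypassing the lock at runtime.
func validateGCConfig(gcInterval, lockTimeout time.Duration) error {
	if gcInterval < lockTimeout {
		return fmt.Errorf("gc interval (%s) must be at least the gc lock timeout (%s)", gcInterval, lockTimeout)
	}
	return nil
}
```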
internal/datastore/postgres/gc.go
```go
// Run the GC process under the lock.
currentTimestampData, err := time.Now().UTC().MarshalBinary()
if err != nil {
	return fmt.Errorf("failed to marshal current timestamp: %w", err)
}
```
It does not seem like this is being used to drive the lock. If the reason for it is to make troubleshooting easier when jumping into the Postgres instance and inspecting the locks, then please clarify this with a comment accordingly.
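If that is indeed the purpose, the clarifying comment could look something like this (a sketch, assuming the timestamp is stored purely for operators inspecting the locks table):

```go
// Run the GC process under the lock.
//
// Note: this timestamp does not drive the lock itself; it is stored only so that
// an operator inspecting the locks table in Postgres can see when the most recent
// GC run started.
currentTimestampData, err := time.Now().UTC().MarshalBinary()
if err != nil {
	return fmt.Errorf("failed to marshal current timestamp: %w", err)
}
```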
internal/datastore/common/gc.go
```go
// If the implementation does not support locking, it should just execute
// the function and return true.
// If GC was run within the last interval, the function should return false.
LockGCRun(ctx context.Context, timeout time.Duration, gcRun func(context.Context) error) (bool, error)
```
I don't see any new tests added - please add one to the PG integration test suite, simulating multiple GC instances running at once.
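A rough sketch of the kind of test being asked for (the fixture helper and exact assertions are hypothetical; the real suite's Postgres setup would be reused):

```go
// Hypothetical integration test: several datastore instances share one Postgres
// database and attempt GC concurrently; only one of them should actually run.
func TestGCLockAllowsSingleRunner(t *testing.T) {
	var runs atomic.Int64
	g, ctx := errgroup.WithContext(context.Background())

	for i := 0; i < 3; i++ {
		ds := newTestPostgresDatastore(t) // hypothetical fixture sharing one database
		g.Go(func() error {
			_, err := ds.LockGCRun(ctx, time.Minute, func(ctx context.Context) error {
				runs.Add(1)
				return nil
			})
			return err
		})
	}

	require.NoError(t, g.Wait())
	require.Equal(t, int64(1), runs.Load(), "expected exactly one GC run across concurrent nodes")
}
```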
```go
if err := DatabaseMigrations.Register("add-gc-lock-table", "add-expiration-support",
	func(ctx context.Context, conn *pgx.Conn) error {
		return common.RunWithLocksClient(conn, func(client *pglock.Client) error {
			return client.TryCreateTable()
```
Makes me uncomfortable being at the mercy of a third-party library's migration code. How do we know bumping the library won't completely change the definition of the table and cause subsequent migrations to fail?
"github.com/jackc/pgx/v5" | ||
) | ||
|
||
const addGCIndexForRelationTupleTransaction = `CREATE INDEX CONCURRENTLY |
Please add documentation to the PR body on why this was added, along with the previous and the new EXPLAIN output.
Added a comment into the code
Force-pushed from 62550af to 9fcc5c1
Redesigned to use the native locks as discussed
1) Have GC lock so that it only runs on a single node at a time
2) Add a missing index in the Postgres datastore for GC

This should reduce datastore CPU pressure
Force-pushed from 9fcc5c1 to c5d52b1
LGTM, though we need to resolve the test failures
Force-pushed from 11e2df9 to 62c8697
This is necessary because GC takes an exclusive lock now
Force-pushed from 405bec5 to b7d77b7
Changes look good to me, BUT:
Please note this only solves a part of the problem: it prevents multiple nodes from GC'ing simultaneously, which will avoid contention and spikes, but it does not elect a leader node that is the only one running GC.
For example, if 100 nodes are doing GC and have their timers sufficiently skewed, it's possible to have 100 nodes GC'ing one after the other (this is an extreme, worst-case scenario). It won't happen at the same time, but it is still an inefficient use of datastore compute and can cause a sustained load on the datastore, proportional to the number of nodes.
```go
}

func (mds *Datastore) releaseLock(ctx context.Context, lockName lockName) error {
	_, err := mds.db.ExecContext(ctx, `
```
Please add a comment from the MySQL docs:
```diff
+	// A lock obtained with GET_LOCK() is released explicitly by executing RELEASE_LOCK()
+	// or implicitly when your session terminates (either normally or abnormally).
 	_, err := mds.db.ExecContext(ctx, `
```
```go
ON relation_tuple_transaction (xid DESC, timestamp);`

func init() {
	if err := DatabaseMigrations.Register("add-missing-gc-index", "add-expiration-support",
```
Please rename to `add-index-for-transaction-gc`.
```go
)

func (mds *Datastore) tryAcquireLock(ctx context.Context, lockName lockName) (bool, error) {
	// Acquire the lock, with max 1s timeout.
```
Please add a comment indicating `GET_LOCK` is reentrant. From the docs:

> It is even possible for a given session to acquire multiple locks for the same name. Other sessions cannot acquire a lock with that name until the acquiring session releases all its locks for the name.
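A possible shape for that comment in the acquire path (the `GET_LOCK` query shown here is illustrative; the real query lives in the diff above):

```go
// GET_LOCK is reentrant: a session can acquire multiple locks for the same name,
// and other sessions cannot acquire a lock with that name until the acquiring
// session has released all of its locks for that name.
//
// Acquire the lock, with max 1s timeout. GET_LOCK returns 1 on success, 0 on
// timeout, and NULL on error.
var acquired sql.NullInt64
if err := mds.db.QueryRowContext(ctx, `SELECT GET_LOCK(?, 1)`, lockName).Scan(&acquired); err != nil {
	return false, err
}
return acquired.Valid && acquired.Int64 == 1, nil
```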
```go
)

func (pgd *pgDatastore) tryAcquireLock(ctx context.Context, lockID lockID) (bool, error) {
	// Acquire the lock.
```
Please also mention that pg locks are reentrant, but with a caveat:

> A lock can be acquired multiple times by its owning process; for each completed lock request there must be a corresponding unlock request before the lock is actually released.
>
> ...
>
> If a session already holds a given advisory lock, additional requests by it will always succeed, even if other sessions are awaiting the lock; this statement is true regardless of whether the existing lock hold and new request are at session level or transaction level.
```go
_, err := pgd.writePool.Exec(ctx, `
	SELECT pg_advisory_unlock($1)
`, lockID)
```
Please check the result. If it returns false, we know something is wrong, so we should at least log it.
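A minimal sketch of that check (the logging call is a stand-in for whatever logger the datastore already uses; only the handling of the result is the point):

```go
// pg_advisory_unlock returns a boolean: true if the lock was released, false if it
// was not held by this session. Surface the false case instead of discarding it.
var released bool
if err := pgd.writePool.QueryRow(ctx, `
	SELECT pg_advisory_unlock($1)
`, lockID).Scan(&released); err != nil {
	return fmt.Errorf("failed to release gc lock: %w", err)
}
if !released {
	log.Ctx(ctx).Warn().Msg("pg_advisory_unlock returned false: lock was not held by this session")
}
return nil
```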