CASSGO-22 Changes to Query and Batch to make them safely reusable #1868

Open
wants to merge 22 commits into
base: trunk
from

Conversation

joao-r-reis
Contributor

@joao-r-reis joao-r-reis commented Mar 13, 2025

API Changes

ExecutableQuery

ExecutableQuery is currently an interface that Query and Batch implement (and that is referenced by HostSelectionPolicy). However, it is also used in driver internals, so the interface contains private methods, which makes it impossible for users to "mock" the interface for testing purposes.

In this PR, ExecutableQuery is changed so it contains only public methods and is no longer implemented by Query and Batch. Now, ExecutableQuery is used exclusively as a "hook" in HostSelectionPolicy. This does mean that users can no longer attempt to cast an ExecutableQuery to Query or Batch, but I've added a new method that provides this functionality (Statement).

type ExecutableQuery interface {
	GetRoutingKey() ([]byte, error)
	Keyspace() string
	Table() string
	IsIdempotent() bool
	Statement() Statement
}
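
For illustration, here is a minimal sketch of the kind of test double this change makes possible. `Iter`, `Statement` and `ExecutableQuery` below are local stand-ins that mirror the interfaces quoted in this description; `mockQuery` and `pickHost` are hypothetical names and not part of the driver.

```go
package main

import "fmt"

// Iter and Statement are local stand-ins for the driver types of the same name.
type Iter struct{}

type Statement interface {
	Iter() *Iter
	Exec() error
}

// ExecutableQuery mirrors the interface proposed in the PR description.
type ExecutableQuery interface {
	GetRoutingKey() ([]byte, error)
	Keyspace() string
	Table() string
	IsIdempotent() bool
	Statement() Statement
}

// mockQuery is a hypothetical test double; it is not part of the driver.
type mockQuery struct {
	routingKey []byte
	keyspace   string
	table      string
	idempotent bool
}

func (m mockQuery) GetRoutingKey() ([]byte, error) { return m.routingKey, nil }
func (m mockQuery) Keyspace() string               { return m.keyspace }
func (m mockQuery) Table() string                  { return m.table }
func (m mockQuery) IsIdempotent() bool             { return m.idempotent }
func (m mockQuery) Statement() Statement           { return nil }

// pickHost is a hypothetical stand-in for policy logic under test: it maps
// the first byte of the routing key onto the host list.
func pickHost(q ExecutableQuery, hosts []string) string {
	if len(hosts) == 0 {
		return ""
	}
	key, err := q.GetRoutingKey()
	if err != nil || len(key) == 0 {
		return hosts[0]
	}
	return hosts[int(key[0])%len(hosts)]
}

func main() {
	q := mockQuery{routingKey: []byte{5}, keyspace: "ks", table: "users"}
	fmt.Println(pickHost(q, []string{"a", "b", "c"}))
}
```

Because every method on the interface is exported, the mock needs nothing from the driver's internals.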

Statement

This is the new interface that represents (and is implemented by) Query and Batch. In this PR it's only referenced by ExecutableQuery.

type Statement interface {
	Iter() *Iter
	Exec() error
}
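
A sketch of how a policy could recover the concrete type through `Statement()`, assuming it hands back the original `*Query` or `*Batch` (the pointer types are an assumption). All types are local stand-ins, `ExecutableQuery` is trimmed to the one method that matters here, and `executable` and `describe` are hypothetical helpers.

```go
package main

import "fmt"

// Local stand-ins for the driver types.
type Iter struct{}

type Statement interface {
	Iter() *Iter
	Exec() error
}

type Query struct{ stmt string }

func (q *Query) Iter() *Iter { return &Iter{} }
func (q *Query) Exec() error { return nil }

type Batch struct{ entries []string }

func (b *Batch) Iter() *Iter { return &Iter{} }
func (b *Batch) Exec() error { return nil }

// ExecutableQuery is trimmed to the single method used in this sketch.
type ExecutableQuery interface {
	Statement() Statement
}

// executable is a hypothetical wrapper that plays the role of the driver's
// internal request object.
type executable struct{ s Statement }

func (e executable) Statement() Statement { return e.s }

// describe shows the type switch a HostSelectionPolicy could use instead of
// casting the ExecutableQuery itself.
func describe(eq ExecutableQuery) string {
	switch s := eq.Statement().(type) {
	case *Query:
		return "query: " + s.stmt
	case *Batch:
		return fmt.Sprintf("batch with %d entries", len(s.entries))
	default:
		return "unknown statement"
	}
}

func main() {
	fmt.Println(describe(executable{&Query{stmt: "SELECT 1"}}))
	fmt.Println(describe(executable{&Batch{entries: []string{"a", "b"}}}))
}
```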

internalRequest

New interface that is used by driver internals to decouple the public API of Query/Batch from the internal API.
When creating an internal request object, the driver now copies most if not all of the query/batch properties, so that users can submit a Query/Batch for execution and then reuse it immediately without having to be concerned about the object being modified by the driver during execution. It is also less error-prone, because the driver is free to modify these properties (e.g. page state, consistency, query metrics) without causing a change on the objects that the users are holding.

type internalRequest interface {
	execute(ctx context.Context, conn *Conn) *Iter
	attempt(keyspace string, end, start time.Time, iter *Iter, host *HostInfo)
	retryPolicy() RetryPolicy
	speculativeExecutionPolicy() SpeculativeExecutionPolicy
	getQueryMetrics() *queryMetrics
	RetryableQuery
	ExecutableQuery
}
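
The snapshot idea can be sketched roughly as follows. This is not the PR's code: `Query`, `queryOptions` and `internalQuery` are simplified stand-ins with only a few fields, and `newInternalQuery` is a hypothetical constructor.

```go
package main

import "fmt"

// Query is a simplified stand-in for the user-facing object.
type Query struct {
	stmt       string
	pageSize   int
	idempotent bool
}

// queryOptions holds the values copied from the Query at submission time.
type queryOptions struct {
	stmt       string
	pageSize   int
	idempotent bool
}

// internalQuery is the driver-owned request; fields such as pageState and
// attempts change during execution without touching the user's Query.
type internalQuery struct {
	opts      queryOptions
	pageState []byte
	attempts  int
}

// newInternalQuery snapshots the user-visible properties by value.
func newInternalQuery(q *Query) *internalQuery {
	return &internalQuery{opts: queryOptions{
		stmt:       q.stmt,
		pageSize:   q.pageSize,
		idempotent: q.idempotent,
	}}
}

func main() {
	q := &Query{stmt: "SELECT ...", pageSize: 100}
	req := newInternalQuery(q)

	// The user reuses the Query right away; the in-flight request is unaffected.
	q.pageSize = 5000

	// The driver mutates its own state; the user's Query is unaffected.
	req.pageState = []byte{0x01}
	req.attempts++

	fmt.Println(req.opts.pageSize, q.pageSize)
}
```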

Query Metrics (i.e. Query/Batch.Attempts(), Query/Batch.Latency())

This functionality makes the API a bit awkward because the user submits a query for execution and then inspects the query object to retrieve the side effects of that execution after it's done. It's a better design (imo) to have this kind of data available in the Iter object since this is the type that represents the return value of a query/batch.

Due to this, AddAttempts(), Attempts(), Latency() and AddLatency() have been removed from Query and Batch. Latency() and Attempts() have been added to Iter and a new Batch.Iter() method has been added so that users can obtain the Iter object when executing a batch (even though it doesn't make a lot of sense conceptually due to the name but Iter is what the driver uses to return data about a request so it's not just about "iterating").
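
A sketch of what a call site could look like once the metrics live on the returned object. `Iter` and `Batch` here are local stand-ins; the `int64` nanosecond return type of `Latency()` and its "average per attempt" semantics are assumptions carried over from how the existing Query methods behave, not something this description states.

```go
package main

import (
	"fmt"
	"time"
)

// Iter is a local stand-in that carries the execution metrics.
type Iter struct {
	attempts     int
	totalLatency int64 // nanoseconds summed over all attempts
	err          error
}

func (it *Iter) Attempts() int { return it.attempts }

// Latency returns the average latency per attempt in nanoseconds (assumed semantics).
func (it *Iter) Latency() int64 {
	if it.attempts == 0 {
		return 0
	}
	return it.totalLatency / int64(it.attempts)
}

func (it *Iter) Close() error { return it.err }

// Batch is a stand-in; Iter pretends the request took two attempts of 15ms each.
type Batch struct{}

func (b *Batch) Iter() *Iter {
	return &Iter{attempts: 2, totalLatency: int64(30 * time.Millisecond)}
}

func (b *Batch) Exec() error { return b.Iter().Close() }

func main() {
	iter := (&Batch{}).Iter()
	if err := iter.Close(); err != nil {
		fmt.Println("batch failed:", err)
		return
	}
	fmt.Println(iter.Attempts(), time.Duration(iter.Latency()))
}
```

The result of the execution and its side effects are read from the same value, and the Batch itself is never inspected after submission.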

queryPool

Query objects were allocated from a sync.Pool but I don't see the gain of keeping it especially now that queryMetrics have been moved to Iter. Also users can create these query objects once and store them if they are worried about memory/GC.

Query/Batch.GetRoutingKey()

Having this method as part of the public API of Query and Batch makes it difficult to manage from the driver maintainer's POV. It's supposed to be an internal method that can be called by implementations of HostSelectionPolicy, but since ExecutableQuery was implemented by Query and Batch, these two types had to have this method on their public API as well.

With this PR, this method is deprecated on Query and Batch since these two types no longer implement ExecutableQuery so we can keep the Query/Batch API much simpler.

@joao-r-reis joao-r-reis marked this pull request as ready for review March 17, 2025 13:58
Contributor

@worryg0d worryg0d left a comment

Hello @joao-r-reis,

I spent some time reviewing your PR. Great work! It simplifies the public Query and Batch API, which is pretty good.

Removing queryPool is a good idea. If anyone really needs this, it can be handled on their application side.
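
As a rough idea of what application-side pooling could look like, here is a sketch built on the standard library's `sync.Pool`. The `Query` type is an application-level stand-in (with the real driver, query objects come from the session), and `acquireQuery` and `releaseQuery` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"sync"
)

// Query is an application-level stand-in for a reusable query object.
type Query struct {
	stmt   string
	values []interface{}
}

// queryPool recycles Query objects to reduce allocations.
var queryPool = sync.Pool{
	New: func() interface{} { return new(Query) },
}

// acquireQuery takes an object from the pool and initialises it.
func acquireQuery(stmt string, values ...interface{}) *Query {
	q := queryPool.Get().(*Query)
	q.stmt = stmt
	q.values = append(q.values[:0], values...)
	return q
}

// releaseQuery clears the object and returns it to the pool. The caller must
// not use q afterwards.
func releaseQuery(q *Query) {
	q.stmt = ""
	for i := range q.values {
		q.values[i] = nil // drop references so they can be collected
	}
	q.values = q.values[:0]
	queryPool.Put(q)
}

func main() {
	q := acquireQuery("SELECT * FROM users WHERE id = ?", 42)
	fmt.Println(q.stmt, len(q.values))
	releaseQuery(q)
}
```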

I left some minor comments, but overall the implementation looks good.

I am only a bit concerned about the changes this PR provides to the driver internals, especially the conn part which overlaps with #1822.

@joao-r-reis
Contributor Author

> I am only a bit concerned about the changes this PR provides to the driver internals, especially the conn part which overlaps with #1822.

Yeah, I'll have to spend some time rebasing this branch and even writing some tests. For now it's good enough if reviewers can provide feedback on the current state of the PR, especially the public API changes.

"time"
)

// ExecutableQuery is an interface that represents a query or batch statement that
// exposes the correct functions for the HostSelectionPolicy to operate correctly.
type ExecutableQuery interface {
Member

What do you think about renaming type to HostPolicyQuery?

Contributor Author

I posted a comment about the renaming of this interface below in a response to James

qry.releaseAfterExecution()
}

type queryOptions struct {
Member

Shall we group here all relevant options into sub-types, for example:

Paging
    pageSize
    initialPageState
    disableAutoPage
Monitoring
    observer
    trace
Parameters
    values
    binding
WriteTimestamp
    defaultTimestamp
    defaultTimestampValue
Consistency
    initialConsistency
    serialConsistency
FrameOptions
    customPayload
    disableSkipMetadata

stmt
prefetch
rt
spec
context
idempotent
keyspace
skipPrepare
routingKey

Grouping is not definitive, it is just an idea.
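
One possible shape for the sub-type grouping is sketched below. Only three of the groups are shown, the sub-type names (`pagingOptions`, `timestampOptions`, `frameOptions`) are hypothetical, and fields whose types are driver-specific are left out so the sketch stays self-contained.

```go
package main

import "fmt"

// pagingOptions groups the paging-related settings.
type pagingOptions struct {
	pageSize         int
	initialPageState []byte
	disableAutoPage  bool
}

// timestampOptions groups the write-timestamp settings.
type timestampOptions struct {
	defaultTimestamp      bool
	defaultTimestampValue int64
}

// frameOptions groups settings that affect the protocol frame.
type frameOptions struct {
	customPayload       map[string][]byte
	disableSkipMetadata bool
}

// queryOptions composes the groups; ungrouped fields stay at the top level.
type queryOptions struct {
	stmt      string
	paging    pagingOptions
	timestamp timestampOptions
	frame     frameOptions
}

func main() {
	opts := queryOptions{
		stmt:   "SELECT ...",
		paging: pagingOptions{pageSize: 5000},
	}
	fmt.Println(opts.paging.pageSize, opts.timestamp.defaultTimestamp, len(opts.frame.customPayload))
}
```

The trade-off is a few extra types in exchange for call sites that name the group they touch, such as `opts.paging.pageSize`.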

Contributor Author

yeah I'll do this

Contributor Author

On second thought, I'm not sure I like this idea that much. I'd have to create a few new types when maybe I can just group them a bit more by arranging the way they are declared... WDYT?

Contributor Author

Something like this:

type queryOptions struct {
	stmt string

	// Paging
	pageSize         int
	initialPageState []byte
	disableAutoPage  bool

	// Monitoring
	trace    Tracer
	observer QueryObserver

	// Parameters
	values  []interface{}
	binding func(q *QueryInfo) ([]interface{}, error)

	// Timestamp
	defaultTimestamp      bool
	defaultTimestampValue int64

	// Consistency
	initialConsistency Consistency
	serialCons         SerialConsistency

	// Protocol flag
	disableSkipMetadata bool

	customPayload map[string][]byte
	prefetch      float64
	rt            RetryPolicy
	spec          SpeculativeExecutionPolicy
	context       context.Context
	idempotent    bool
	keyspace      string
	skipPrepare   bool
	routingKey    []byte

	// getKeyspace is a field so that it can be overridden in tests
	getKeyspace func() string
}

qryOpts *queryOptions
pageState []byte
metrics *queryMetrics
refCount uint32
Member

Unused?

Contributor Author

Good catch, leftover from a copy paste

Contributor

It's still there, it just changed position 😅

Contributor Author

oops, maybe I added it back during rebase, fixed

session.go Outdated
// Iter executes a batch operation and returns an Iter object
// that can be used to access properties related to the execution like Iter.Attempts and Iter.Latency
func (b *Batch) Iter() *Iter {
return b.session.executeBatch(b)
Member

I feel that people may execute a sequence of Exec() and Iter(). Shall we make an assertion that only one of them is invoked?

Contributor

Hmm, I see what you mean but I don't think we can lock them into one or another if we want it to be reusable.

Contributor Author

> I feel that people may execute a sequence of Exec() and Iter(). Shall we make an assertion that only one of them is invoked?

They can't chain the calls (Exec().Iter()); they can only call them one after the other, like:

err := b.Exec()
iter := b.Iter()

Which I believe is already something that can be done with the current API and it will trigger two requests. We can try to improve documentation if this is a concern.
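
To make the "two requests" point concrete, here is a small sketch with a counting stand-in session. `session`, `Batch` and `Iter` are simplified local types, and implementing `Exec` as `Iter().Close()` is an assumption about how the two methods relate.

```go
package main

import "fmt"

// Iter is a stand-in for the driver's result type.
type Iter struct{ err error }

func (it *Iter) Close() error { return it.err }

// session is a stand-in that only counts how many requests were sent.
type session struct{ requests int }

func (s *session) executeBatch(b *Batch) *Iter {
	s.requests++
	return &Iter{}
}

// Batch is a stand-in; every Iter or Exec call sends a new request.
type Batch struct{ session *session }

func (b *Batch) Iter() *Iter { return b.session.executeBatch(b) }
func (b *Batch) Exec() error { return b.Iter().Close() }

func main() {
	s := &session{}
	b := &Batch{session: s}

	// Calling both methods executes the batch twice.
	_ = b.Exec()
	_ = b.Iter()
	fmt.Println("requests after Exec then Iter:", s.requests)

	// To execute once and still inspect the Iter, call Iter alone.
	iter := b.Iter()
	if err := iter.Close(); err != nil {
		fmt.Println("batch failed:", err)
	}
	fmt.Println("requests in total:", s.requests)
}
```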

+1 to what James said

@jameshartig
Contributor

> executing a batch (even though it doesn't make a lot of sense conceptually due to the name but Iter is what the driver uses to return data about a request so it's not just about "iterating").

Note that Yugabyte lets you iterate over the row status from a batch operation.

@jameshartig
Contributor

> With this PR, this method is deprecated on Query and Batch since these two types no longer implement ExecutableQuery so we can keep the Query/Batch API much simpler.

Why not just remove it? Seems like any HostSelectionPolicy implementation would need to make major changes already.

@jameshartig
Contributor

I don't want to bikeshed about names but ExecutableQuery is a bit confusing since it contains Query and Batch. Did that name come from somewhere? Would ExecutableStatement make more sense since it has a statement?

qry.routingInfo.keyspace = info.request.keyspace
qry.routingInfo.table = info.request.table
qry.routingInfo.mu.Unlock()
q.routingInfo.mu.Lock()
Contributor

Does this still need to lock? I don't think it does unless we expect GetRoutingKey to be called from separate goroutines?

Contributor

Also can we just change routingInfo to not be a struct anymore? Seems like it only was because...

// routingInfo is a pointer because Query can be copied and copyable struct can't hold a mutex.

Do we still allow copies of internalQuery? If so, maybe we should make a Clone method?

Contributor Author

> Does this still need to lock? I don't think it does unless we expect GetRoutingKey to be called from separate goroutines?

I think this is still the case due to speculative executions (and possibly retries I'm not sure).

> Also can we just change routingInfo to not be a struct anymore? Seems like it only was because...
>
> // routingInfo is a pointer because Query can be copied and copyable struct can't hold a mutex.
>
> Do we still allow copies of internalQuery? If so, maybe we should make a Clone method?

I don't think it needs to be a struct anymore, I'll change it.

Contributor Author

So I made some changes to the routingInfo part but I didn't remove the struct... I think it still serves a purpose by making it clear that the mutex is used to protect access against the two fields within the struct. Do you still prefer to remove the struct and just have the mutex and two fields in the request type itself?
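
A sketch of the "struct that owns its mutex" arrangement discussed in this thread. The type and field names follow the diff context (`routingInfo`, `mu`, `keyspace`, `table`), while the `set` and `get` helpers and the goroutine loop standing in for speculative executions are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// queryRoutingInfo keeps the mutex next to the two fields it protects.
type queryRoutingInfo struct {
	mu       sync.Mutex
	keyspace string
	table    string
}

// set stores both fields under the lock.
func (r *queryRoutingInfo) set(keyspace, table string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.keyspace = keyspace
	r.table = table
}

// get reads both fields under the lock.
func (r *queryRoutingInfo) get() (string, string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.keyspace, r.table
}

func main() {
	info := &queryRoutingInfo{}

	// Several goroutines stand in for speculative executions that resolve
	// routing information for the same request concurrently.
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			info.set("ks", "users")
			_, _ = info.get()
		}()
	}
	wg.Wait()

	ks, tbl := info.get()
	fmt.Println(ks, tbl)
}
```

Since the struct holds a mutex, it is shared by pointer and never copied by value.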

}

func newQueryOptions(q *Query) *queryOptions {
return &queryOptions{
Contributor

Should we be copying values, customPayload, initialPageState, routingKey? It might not be obvious that those are not copied and must not be mutated.

Contributor Author

I think copying values and customPayload might be a bit too much for performance reasons. initialPageState should probably be copied for safety, since its length is usually pretty small. routingKey I'm not so sure about; I think it will be nil in the vast majority of cases anyway, but I can make sure it gets copied as well.
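
The aliasing concern behind this thread can be shown with a small sketch. `cloneBytes` is a hypothetical helper, not a function from the driver.

```go
package main

import "fmt"

// cloneBytes returns an independent copy of b, preserving nil.
func cloneBytes(b []byte) []byte {
	if b == nil {
		return nil
	}
	out := make([]byte, len(b))
	copy(out, b)
	return out
}

func main() {
	userPageState := []byte{1, 2, 3}

	aliased := userPageState            // shares the backing array
	copied := cloneBytes(userPageState) // independent backing array

	// The user mutates the slice after submitting the query.
	userPageState[0] = 9

	fmt.Println(aliased[0], copied[0])
}
```

The copy costs one allocation per request, which is why it is reasonable for short slices such as a page state and less attractive for large value lists.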

}

type batchOptions struct {
Type BatchType
Contributor

Any reason why some of these are public?

Contributor Author

The type itself is not public and it is never exposed so it wasn't a deliberate choice, just a result of copy paste. I can make them all private for consistency

@jameshartig
Copy link
Contributor

What about making the methods Exec(context.Context) and Iter(context.Context) error so there's no need to shallow copy query in WithContext? This is a bigger change though that we might not want to make. That said, I think Query would be a bit easier to re-use then. I know this differs from http.Request but in that case it might be a symptom of not wanting to break the existing API when contexts were introduced.

@joao-r-reis
Contributor Author

> > With this PR, this method is deprecated on Query and Batch since these two types no longer implement ExecutableQuery so we can keep the Query/Batch API much simpler.
>
> Why not just remove it? Seems like any HostSelectionPolicy implementation would need to make major changes already.

Hmm I don't think HostSelectionPolicy implementations have to make major changes after this PR unless they rely on casting the ExecutableQuery object into Query or Batch (and even this is a small change, they just need to call .Statement() before doing so).

> I don't want to bikeshed about names but ExecutableQuery is a bit confusing since it contains Query and Batch. Did that name come from somewhere? Would ExecutableStatement make more sense since it has a statement?

I kept the ExecutableQuery name unchanged so HostSelectionPolicy implementations can pretty much continue to work without many changes (or even any change at all). cc @lukasz-antoniak because you also brought this up in one of your comments.
I'm concerned that renaming this interface will increase the chance users will have to change their code when upgrading the driver but I do agree that the current name isn't good.

@joao-r-reis
Contributor Author

> What about making the methods Exec(context.Context) and Iter(context.Context) error so there's no need to shallow copy query in WithContext? This is a bigger change though that we might not want to make. That said, I think Query would be a bit easier to re-use then. I know this differs from http.Request but in that case it might be a symptom of not wanting to break the existing API when contexts were introduced.

I like the idea but I think it would be too much to ask of users when upgrading since it would affect every single statement on their app... I'd be down to adding an overload and deprecating WithContext though... I just don't know what names would fit for the new Iter/Exec overloads...

@joao-r-reis joao-r-reis changed the title from 'CASSGO-22 Changes to Query and Batch to make them safely reusable and "threadsafe"' to 'CASSGO-22 Changes to Query and Batch to make them safely reusable' on Mar 25, 2025
@worryg0d
Contributor

> I'm concerned that renaming this interface will increase the chance users will have to change their code when upgrading the driver but I do agree that the current name isn't good.

We can add a type alias to leave the ExecutableQuery name for the migration period:

// Deprecated: Will be removed in the future major release.
// Please use Statement instead.
type ExecutableQuery = Statement

type Statement interface {
...
}

I tested it on go 1.19.13 and it works fine
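
A runnable sketch of the alias mechanism itself (type aliases have been part of the language since Go 1.9). The interface bodies are local stand-ins, `legacyPick` is a hypothetical function written against the old name, and the pairing of the two names simply follows the snippet above.

```go
package main

import "fmt"

// Local stand-ins for the driver types.
type Iter struct{}

type Statement interface {
	Iter() *Iter
	Exec() error
}

// Deprecated: ExecutableQuery is kept as an alias during the migration period.
type ExecutableQuery = Statement

type Query struct{}

func (q *Query) Iter() *Iter { return &Iter{} }
func (q *Query) Exec() error { return nil }

// legacyPick is hypothetical user code that still refers to the old name.
func legacyPick(q ExecutableQuery) error {
	return q.Exec()
}

func main() {
	var s Statement = &Query{}

	// No conversion is needed: the alias and the new name denote the same type.
	fmt.Println(legacyPick(s))
}
```

Because an alias is the same type rather than a new one, existing code compiles unchanged while new code can adopt the new name.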

@worryg0d
Contributor

> I just don't know what names would fit for the new Iter/Exec overloads...

Well, we can use the common Context suffix for those overloads, as is done in the database/sql package from the standard library:
https://pkg.go.dev/database/sql#Conn.QueryContext
https://pkg.go.dev/database/sql#DB.ExecContext

@joao-r-reis
Contributor Author

I've addressed all PR comments, I'll work on rebasing the branch now and then I'll work on the change mentioned here.

@joao-r-reis joao-r-reis force-pushed the cassgo-22-prototype branch from 26570d6 to 4ad60b6 Compare April 10, 2025 12:16
@joao-r-reis
Contributor Author

Rebase is done, it was a bit more complex than I thought it would be so I'd appreciate if you guys could take a look again @jameshartig @lukasz-antoniak @worryg0d

Comment on lines 295 to 301
func newQueryOptions(q *Query, ctx context.Context) *queryOptions {
	var newPageState, newRoutingKey []byte
	if q.initialPageState != nil {
		pageState := q.initialPageState
		newPageState = make([]byte, len(pageState))
		copy(newPageState, pageState)
	}
Contributor

@ribaraka ribaraka Apr 15, 2025

newPageState seems to be redundant here.

Contributor Author

Well the reason for this is to ensure that a query execution isn't affected if the user tries to modify the page byte slice after query has been submitted for execution

Contributor Author

I do wonder if we're just adding a performance penalty to safeguard against something that users won't attempt to do 99% of the time

Contributor

I mean that newPageState is copied but never used. And it doesn't seem the queryOptions struct has such a field.

Contributor Author

oh you're right, I removed this field from this type and moved it up to internalQuery but I forgot to remove this part, nice catch. Done

Comment on lines 559 to 567
func newInternalBatch(batch *Batch, ctx context.Context) *internalBatch {
	return &internalBatch{
		originalBatch: batch,
		batchOpts:     newBatchOptions(batch, ctx),
		metrics:       &queryMetrics{m: make(map[string]*hostMetrics)},
		routingInfo:   &queryRoutingInfo{},
		session:       batch.session,
	}
}
Contributor

consistency is missing

Contributor Author

nice catch, done

session.go Outdated
Comment on lines 753 to 761
// Iter executes a batch operation and returns an Iter object
// that can be used to access properties related to the execution like Iter.Attempts and Iter.Latency
func (b *Batch) Iter() *Iter { return b.session.executeBatch(b, nil) }

// Iter executes a batch operation with the provided context and returns an Iter object
// that can be used to access properties related to the execution like Iter.Attempts and Iter.Latency
func (b *Batch) IterContext(ctx context.Context) *Iter {
	return b.session.executeBatch(b, ctx)
}
Contributor

nitpicks:

  1. The Iter method should call the IterContext
  2. The comment for IterContext should start with the method name
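
Both nitpicks could be addressed along these lines. This is a sketch with stand-in types: `executeBatch` is simplified to report only the context's state, passing `context.Background()` from the context-free methods is an assumption (the reviewed code passes nil), and `demo` is a hypothetical helper.

```go
package main

import (
	"context"
	"fmt"
)

// Iter is a stand-in for the driver's result type.
type Iter struct{ err error }

func (it *Iter) Close() error { return it.err }

// Batch is a stand-in for the driver's batch type.
type Batch struct{}

// executeBatch is a simplified stand-in: it only reports whether ctx is done.
func (b *Batch) executeBatch(ctx context.Context) *Iter {
	return &Iter{err: ctx.Err()}
}

// IterContext executes the batch with the provided context and returns an Iter.
func (b *Batch) IterContext(ctx context.Context) *Iter {
	return b.executeBatch(ctx)
}

// Iter executes the batch by delegating to IterContext with a background context.
func (b *Batch) Iter() *Iter {
	return b.IterContext(context.Background())
}

// ExecContext executes the batch with the provided context and returns its error.
func (b *Batch) ExecContext(ctx context.Context) error {
	return b.IterContext(ctx).Close()
}

// Exec executes the batch by delegating to ExecContext with a background context.
func (b *Batch) Exec() error {
	return b.ExecContext(context.Background())
}

// demo runs one plain call and one call with an already cancelled context.
func demo() (plain, cancelled error) {
	b := &Batch{}
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return b.Exec(), b.ExecContext(ctx)
}

func main() {
	plain, cancelled := demo()
	fmt.Println(plain, cancelled)
}
```

With this layout there is a single execution path, and the context-free methods are thin wrappers in the style of `database/sql`.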

Contributor Author

done

Comment on lines 50 to 55
// Statement is an interface that represents a CQL statement that the driver can execute
// (currently Query and Batch via Session.Query and Session.Batch)
type Statement interface {
	Iter() *Iter
	Exec() error
}
Contributor

What about adding IterContext and ExecContext here?

Contributor Author

good suggestion 👍 done

@joao-r-reis joao-r-reis requested a review from ribaraka April 15, 2025 14:44
@joao-r-reis
Contributor Author

@jameshartig @worryg0d do you guys have some free time to take a look at this again during the next few days?

# Conflicts:
#	cassandra_test.go
#	session.go
5 participants