Load Tests - Cassandra Reverse Replication #2163

Open: wants to merge 27 commits into base: main

@taherkl taherkl commented Jan 31, 2025

This PR introduces load tests for the Spanner to Cassandra pipeline to evaluate performance and reliability under high data loads.

Key Changes:

  • Added Load Test Scenarios
    • Simulates high-throughput data migration from Spanner to Cassandra
    • Tests different batch sizes and concurrent transactions
    • Measures latency, throughput, and error rates
  • Implemented Performance Metrics Logging
    • Tracks execution time, failed vs. successful transactions
    • Captures Cassandra write latencies and Spanner read times
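
To make the metrics listed above concrete, here is a minimal sketch of how latency, throughput, and error rate could be tracked per write attempt. `LoadTestMetrics` and its methods are hypothetical names chosen for illustration; they are not the classes added in this PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical metrics holder for illustration; not the class used in this PR.
class LoadTestMetrics {
  private final List<Long> latenciesMillis = new ArrayList<>();
  private long succeeded;
  private long failed;

  /** Records one write attempt with its latency and outcome. */
  synchronized void record(long latencyMillis, boolean success) {
    latenciesMillis.add(latencyMillis);
    if (success) {
      succeeded++;
    } else {
      failed++;
    }
  }

  /** Total number of recorded attempts. */
  synchronized long total() {
    return succeeded + failed;
  }

  /** Fraction of attempts that failed, or 0 when nothing was recorded. */
  synchronized double errorRate() {
    long total = succeeded + failed;
    return total == 0 ? 0.0 : (double) failed / total;
  }

  /** Mean latency across all recorded attempts, or 0 when nothing was recorded. */
  synchronized double averageLatencyMillis() {
    return latenciesMillis.stream().mapToLong(Long::longValue).average().orElse(0.0);
  }

  /** Attempts per second over the given wall-clock window. */
  synchronized double throughputPerSecond(long elapsedMillis) {
    if (elapsedMillis <= 0) {
      throw new IllegalArgumentException("elapsedMillis must be positive");
    }
    return total() * 1000.0 / elapsedMillis;
  }
}
```

The methods are synchronized on the assumption that concurrent writers share one recorder.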

taherkl and others added 19 commits November 27, 2024 20:19
Use [self-hosted, it] for prepare java cache workflow (GoogleCloudPlatform#2080)
Support avro arrays for postgres insertion. (GoogleCloudPlatform#2154)
* Addition of Load Tests in SpannerToSourceDB For Cassandra
@taherkl taherkl marked this pull request as ready for review February 5, 2025 07:07
@taherkl taherkl requested a review from a team as a code owner February 5, 2025 07:07

codecov bot commented Feb 5, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 46.90%. Comparing base (2631a1d) to head (f9b1e3e).
Report is 10 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##               main    #2163   +/-   ##
=========================================
  Coverage     46.89%   46.90%           
- Complexity     4346     4351    +5     
=========================================
  Files           874      874           
  Lines         52070    52090   +20     
  Branches       5461     5468    +7     
=========================================
+ Hits          24420    24433   +13     
- Misses        25906    25909    +3     
- Partials       1744     1748    +4     
Components Coverage Δ
spanner-templates 68.82% <ø> (+0.01%) ⬆️
spanner-import-export 65.70% <ø> (+0.05%) ⬆️
spanner-live-forward-migration 76.50% <ø> (ø)
spanner-live-reverse-replication 78.67% <ø> (ø)
spanner-bulk-migration 87.87% <ø> (ø)

see 2 files with indirect coverage changes

throw new IllegalArgumentException("CassandraResourceManager must not be null.");
}

// Use reflection to call executeStatement if it exists

Why is reflection needed? Can you give some context?

@darshan-sj The resourceManager object's type (T) is generic, and executeStatement is not part of a well-defined interface or superclass that resourceManager is guaranteed to implement. Reflection lets us invoke executeStatement dynamically on the resourceManager object when the method is available.
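
For context, the reflection approach described in this reply can be sketched roughly as follows. This is an illustration under assumptions, not the code in the PR: `ReflectiveInvoker`, `FakeCassandraManager`, and `executeIfAvailable` are hypothetical names, and the real `executeStatement` returns a `ResultSet` rather than a `String`.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

class ReflectiveInvoker {

  /** Hypothetical stand-in for a resource manager that exposes executeStatement. */
  public static class FakeCassandraManager {
    public String executeStatement(String statement) {
      return "executed: " + statement;
    }
  }

  /**
   * Calls the public method executeStatement(String) on the given object if it exists.
   * Returns null when the object's class has no such method.
   */
  static <T> Object executeIfAvailable(T resourceManager, String statement) {
    if (resourceManager == null) {
      throw new IllegalArgumentException("resourceManager must not be null.");
    }
    try {
      // Look the method up by name and parameter type on the runtime class.
      Method method = resourceManager.getClass().getMethod("executeStatement", String.class);
      return method.invoke(resourceManager, statement);
    } catch (NoSuchMethodException e) {
      // The generic type does not offer executeStatement; nothing to call.
      return null;
    } catch (IllegalAccessException | InvocationTargetException e) {
      throw new IllegalStateException("executeStatement could not be invoked", e);
    }
  }
}
```

A common alternative is to bound T by a small interface that declares executeStatement, which moves the check to compile time; whether that fits the existing resource manager hierarchy is a separate question.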

*/
public synchronized ResultSet executeStatement(String statement) {
LOG.info("Executing statement: {}", statement);
return this.executeStatement(statement, DEFAULT_CASSANDRA_TIMEOUT);

We are introducing a 2-second timeout on this default executeStatement method. We should not modify the behavior for other tests that use this method.

@darshan-sj I believe the request timeout is 2 seconds anyway, so this should not require any changes for the other methods.
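
The delegation pattern under discussion, where a no-timeout overload forwards to a timeout-aware one, can be sketched as below. `StatementRunner` is a hypothetical class that returns a string instead of talking to Cassandra; the 2-second value simply mirrors the figure mentioned in this thread.

```java
import java.time.Duration;

// Hypothetical class for illustration; it does not contact a database.
class StatementRunner {
  // Assumed default, taken from the 2-second figure mentioned in the review.
  static final Duration DEFAULT_TIMEOUT = Duration.ofSeconds(2);

  /** Existing entry point: callers that pass no timeout get the default. */
  String executeStatement(String statement) {
    return executeStatement(statement, DEFAULT_TIMEOUT);
  }

  /** New overload: callers such as load tests can pass a longer timeout explicitly. */
  String executeStatement(String statement, Duration timeout) {
    if (timeout == null || timeout.isZero() || timeout.isNegative()) {
      throw new IllegalArgumentException("timeout must be positive");
    }
    // A real implementation would apply the timeout to the driver request here.
    return statement + " [timeout=" + timeout.toMillis() + "ms]";
  }
}
```

With this shape, existing callers keep their behavior only if the default matches the timeout that was in effect before the overload was added, which is the point raised in the review.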

import org.apache.beam.it.gcp.storage.GcsResourceManager;

/**
* Base class for Spanner to sourcedb Load tests. It provides helper functions related to

Please also correct the comment to reflect Spanner to Cassandra load tests.

Addressed @darshan-sj

private static final String TEMPLATE_SPEC_PATH =
MoreObjects.firstNonNull(
TestProperties.specPath(),
"gs://dataflow-templates-spanner-to-cassandra/templates/flex/Spanner_to_SourceDb");

Where is this bucket gs://dataflow-templates-spanner-to-cassandra/ ?

Addressed @darshan-sj

MoreObjects.firstNonNull(
TestProperties.specPath(),
"gs://dataflow-templates-spanner-to-cassandra/templates/flex/Spanner_to_SourceDb");
public CassandraResourceManager cassandraSharedResourceManager;

Why is this variable called `SharedResourceManager`? How is it shared?

Addressed, @darshan-sj. It was a typo.

getGcsPath(artifactBucket, "input/shard.json", gcsResourceManager));
getGcsPath(
artifactBucket,
!Objects.equals(sourceType, MYSQL_SOURCE_TYPE)

Please use template parameter overriding instead of a ternary operator here.
Example:

@darshan-sj Addressed.

5 participants