Cloud Spanner Change Streams Subscriber for Go
This library is an implementation to subscribe a change stream's records of Google Cloud Spanner in Go. It is heavily inspired by the SpannerIO connector of the Apache Beam SDK and is compatible with the PartitionMetadata data model.
To read a change streams, Google Cloud offers Dataflow connector as a scalable and reliable solution, but in some cases the abstraction and capabilities of Dataflow pipelines can be too much (or is simply too expensive). This library aims to make reading change streams native for non beam/dataflow use cases.
package main
import (
"context"
"encoding/json"
"errors"
"fmt"
"io"
"os"
"os/signal"
"sync"
"cloud.google.com/go/spanner"
"github.com/anicoll/screamer"
"github.com/anicoll/screamer/pkg/model"
"github.com/anicoll/screamer/partitionstorage"
)
func main() {
ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, os.Kill)
defer stop()
database := fmt.Sprintf("projects/%s/instances/%s/databases/%s", "foo-project", "bar-instance", "baz-database")
spannerClient, err := spanner.NewClient(ctx, database)
if err != nil {
panic(err)
}
defer spannerClient.Close()
partitionMetadataTableName := "PartitionMetadata_FooStream"
partitionStorage := partitionstorage.NewSpanner(spannerClient, partitionMetadataTableName)
if err := partitionStorage.CreateTableIfNotExists(ctx); err != nil {
panic(err)
}
changeStreamName := "FooStream"
subscriber := screamer.NewSubscriber(spannerClient, changeStreamName, partitionStorage)
fmt.Fprintf(os.Stderr, "Reading the stream...\n")
logger := &Logger{out: os.Stdout}
if err := subscriber.Subscribe(ctx, logger); err != nil && !errors.Is(ctx.Err(), context.Canceled) {
panic(err)
}
}
type Logger struct {
out io.Writer
mu sync.Mutex
}
func (l *Logger) Consume(change *model.DataChangeRecord) error {
l.mu.Lock()
defer l.mu.Unlock()
return json.NewEncoder(l.out).Encode(change)
}
$ go install github.com/anicoll/screamer@latest
NAME:
screamer screamer
USAGE:
screamer screamer [command options]
OPTIONS:
--dsn value [$DSN]
--stream value [$STREAM]
--metadata-table value [$METADATA_TABLE]
--start value (default: Start timestamp with RFC3339 format, default: current timestamp) [$START]
--end value (default: End timestamp with RFC3339 format default: indefinite) [$END]
--heartbeat-interval value (default: 10s) [$HEARTBEAT_INTERVAL]
--partition-dsn value (default: Database dsn for use by the partition metadata table. If not provided, the main dsn will be used.) [$PARTITION_DSN]
--help, -h show help
Heavily inspired by below projects.
- spream (https://github.com/toga4/spream)