Skip to content

feat: postgres dialect suppport #82

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
Open

Conversation

mrsufgi
Copy link

@mrsufgi mrsufgi commented Aug 7, 2025

It was silently failing due to incorrect ID used, now, documents reuse the node list even if the document itself already exists.

image

@mrsufgi
Copy link
Author

mrsufgi commented Aug 7, 2025

closes #48 ?

@mrsufgi
Copy link
Author

mrsufgi commented Aug 10, 2025

I will clean this commit, I mistakenly used the same branch for addtional work (pg backend)

Comment on lines 455 to +459
annotations = append(annotations, &ent.Annotation{
DocumentID: &documentID,
NodeID: &nodes[idx].ID,
Name: name,
Value: value,
IsUnique: true,
NodeID: &nodes[idx].ID,
Name: name,
Value: value,
IsUnique: true,
Copy link
Author

@mrsufgi mrsufgi Aug 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Hooks, we never allowed setting both NodeID and DocumentID; they are mutually exclusive. This fixes it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch!

@mrsufgi mrsufgi changed the title fix: missing SetID call cause issues on duplocate NodeList feat: postgres dialect suppport Aug 13, 2025
Comment on lines 152 to 175
func (backend *Backend) withTx(fns ...TxFunc) error {
return backend.withTxContext(backend.ctx, fns...)
}

// withTxContext creates a transaction with a specific context to avoid race conditions
func (backend *Backend) withTxContext(ctx context.Context, fns ...TxFunc) error {
if backend.client == nil {
return fmt.Errorf("%w", errUninitializedClient)
}

tx, err := backend.client.Tx(backend.ctx)
// Create a NEW context for this transaction instead of modifying the shared one
txCtx := ctx
tx, err := backend.client.Tx(txCtx)
if err != nil {
return fmt.Errorf("creating transactional client: %w", err)
}

backend.ctx = ent.NewTxContext(backend.ctx, tx)
// Use transaction-local context instead of polluting backend.ctx
// This prevents concurrent transactions from interfering with each other
localTxCtx := ent.NewTxContext(txCtx, tx)
_ = localTxCtx // Mark as used for now

for _, fn := range fns {
if err := fn(tx); err != nil {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is extremely important, as using Protobom with a real concurrent database (unlike SQLite) is crucial. Reusing the same context introduces a data leak, mainly where we set documentId on the context. There is a major race condition if "backend" is used by multiple consumers and we attempt to store in parallel. This isolates context per transaction.

SetIsUnique(annotations[idx].IsUnique)

builders = append(builders, builder)
var (
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Postgres does not support OnConflict ("any conflict"), therefore, we need to break it down into each possible conflict and use OnConflictColumns instead

@@ -128,7 +197,7 @@ func (backend *Backend) saveDocumentTypes(docTypes []*sbom.DocumentType, opts ..
fn(newDocType)
}

if err := newDocType.OnConflict().Ignore().Exec(backend.ctx); err != nil && !ent.IsConstraintError(err) {
if err := newDocType.OnConflictColumns("type", "name", "description").Ignore().Exec(backend.ctx); err != nil && !ent.IsConstraintError(err) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In cases where only one unique index exists, we can just use it as is.

@@ -29,7 +29,6 @@ func (pm ProtoMessage[T]) Fields() []ent.Field {
field.Bytes(protoMessageField).
GoType(goType).
Nillable().
Unique().
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove the unique constraint here? Is there a use case for having the ability to store a raw wire format proto message twice in a table?

Comment on lines +59 to +64
test-integration: # Run integration tests
@printf "Running integration tests for ${CYAN}backends/ent${RESET}...\n"
@go test -failfast -v -tags=integration -coverprofile=coverage.out -covermode=atomic ./backends/ent/...
@printf "${GREEN}DONE${RESET}\n\n"

${call coverage-report}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Use .PHONY since test-integration isn't a real file
  • nit: ensure file contains a single trailing newline
Suggested change
test-integration: # Run integration tests
@printf "Running integration tests for ${CYAN}backends/ent${RESET}...\n"
@go test -failfast -v -tags=integration -coverprofile=coverage.out -covermode=atomic ./backends/ent/...
@printf "${GREEN}DONE${RESET}\n\n"
${call coverage-report}
.PHONY: test-integration
test-integration: # Run integration tests
@printf "Running integration tests for ${CYAN}backends/ent${RESET}...\n"
@go test -failfast -v -tags=integration -coverprofile=coverage.out -covermode=atomic ./backends/ent/...
@printf "${GREEN}DONE${RESET}\n\n"
${call coverage-report}

Comment on lines 455 to +459
annotations = append(annotations, &ent.Annotation{
DocumentID: &documentID,
NodeID: &nodes[idx].ID,
Name: name,
Value: value,
IsUnique: true,
NodeID: &nodes[idx].ID,
Name: name,
Value: value,
IsUnique: true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch!

Comment on lines +40 to +43
field.String("value_key").
Optional().
Immutable().
StorageKey("value_key"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mind explaining the purpose of this new field? I forget the syntax, but you can throw an ent annotation comment on there if you like

Comment on lines +92 to +97
if u, ok := mutation.IsUnique(); ok && u {
mutation.SetValueKey("")
} else if v, ok := mutation.Value(); ok {
mutation.SetValueKey(v)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
if u, ok := mutation.IsUnique(); ok && u {
mutation.SetValueKey("")
} else if v, ok := mutation.Value(); ok {
mutation.SetValueKey(v)
}
if unique, ok := mutation.IsUnique(); ok && unique {
mutation.SetValueKey("")
} else if valueKey, ok := mutation.Value(); ok {
mutation.SetValueKey(valueKey)
}

@@ -634,7 +641,7 @@ func (as *annotationsSuite) getTestResult(annotationName string) ent.Annotations

result, err := as.Backend.Ent().QueryContext(
as.Context(),
"SELECT * FROM annotations WHERE name == ?",
"SELECT id, document_id, node_id, name, value, is_unique, value_key FROM annotations WHERE name = ?",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing == to = is still valid for SQLite right? Just making sure it was verified

Comment on lines 17 to +18
sqlite "github.com/glebarez/go-sqlite"
_ "github.com/lib/pq" // PostgreSQL driver
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably my main concern. Downstream code might not want the extra dependencies if only one specific database provider is desired.

Is there a way we can split into multiple go modules or something like that to keep the dependencies discrete?

if backend.client == nil {
return fmt.Errorf("%w", errUninitializedClient)
}

tx, err := backend.client.Tx(backend.ctx)
// Create a NEW context for this transaction instead of modifying the shared one
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
// Create a NEW context for this transaction instead of modifying the shared one
// Create a new context for this transaction instead of modifying the shared one.

backend.ctx = ent.NewTxContext(backend.ctx, tx)
// Use transaction-local context instead of polluting backend.ctx
// This prevents concurrent transactions from interfering with each other
localTxCtx := ent.NewTxContext(txCtx, tx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this variable need to be created if unused?

@jhoward-lm
Copy link
Contributor

It was silently failing due to incorrect ID used, now, documents reuse the node list even if the document itself already exists.

I don't understand the problem statement, can you provide some background info?

Not sure if there's overlap here, and it's been a while since I was able to work with it, but at the beginning of the year I had some work in progress that would deduplicate all redundant proto messages in the database and allow reuse of identical messages by multiple edges that referenced it. IIRC it's a pretty big overhaul so I didn't want to stomp on each other's work if I ever am able to get back to it

Copy link
Contributor

@jhoward-lm jhoward-lm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll also need to generate the migration and commit it, as well as atlas.sum. Should happen automatically as part of make generate-ent. The problem is your branch is main, which is where the migration files pull their generated name from, and the migration files are intended to have descriptive names.

As a hacky workaround, you might be able to just temporarily set a static value for migrationName and comment out the first chunk of main() in internal/backends/ent/migrate/main.go.

func main() {
	// // Get name of current git branch.
	// output, err := exec.Command("git", "branch", "--show-current").Output()
	// if err != nil {
	// 	log.Fatal("failed getting current Git branch")
	// }

	// migrationName := string(output)

	// // Strip leading non-alpha characters, any set of characters ending with a slash, and trailing newline.
	// migrationName = regexp.MustCompile(`^([^A-Za-z]|[^\/]+?\/)+|\n$`).ReplaceAllString(migrationName, "")

	// // Replace non-alphanumeric characters with underscore.
	// migrationName = regexp.MustCompile(`[^\w]`).ReplaceAllString(migrationName, "_")

	migrationName := "postgres_dialect_support"

	// Register the SQLite driver as "sqlite3".
	if !slices.Contains(sql.Drivers(), "sqlite3") {
		sqlite.RegisterAsSQLITE3()
	}

	// ...
}

Although that alerted me to another significant issue. Currently migrations are only generated for sqlite3 as shown on the very next part after the static migration name. I'm not sure this will work without a major overhaul; the migrations are pretty critical

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants