HOTFIX: Fix rapi_bind length mismatch when saving citation networks#261
HOTFIX: Fix rapi_bind length mismatch when saving citation networks#261seanthimons wants to merge 3 commits intointegrationfrom
Conversation
Two issues caused "Bind parameter values need to have the same length": 1. NULL source_notebook_id passed directly to dbExecute created a zero-length bind parameter. Now coerced to NA_character_. 2. Scalar fallbacks for missing is_overlap/community columns produced length-1 vectors mixed with length-N columns. Now uses rep() to ensure proper length-N vectors. Adds regression test covering NULL source_notebook_id + missing columns.
There was a problem hiding this comment.
Pull request overview
Hotfix to prevent DuckDB bind-parameter length mismatches when saving citation networks, especially in “standalone” contexts where source_notebook_id is NULL and some node columns aren’t present.
Changes:
- Coerces
source_notebook_idfromNULLtoNA_character_beforedbExecute()to avoid zero-length bind parameters. - Ensures
is_overlap/communityfallbacks are length-N vectors (viarep(...)) to avoiddbWriteTable()column length mismatches. - Adds regression tests covering
NULL source_notebook_idand missingis_overlap/community.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
R/db.R |
Fixes parameter binding and data.frame column length handling in save_network() |
tests/testthat/test-save-network.R |
Adds regression coverage for the failing save scenario and a “full columns present” case |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| is_overlap = as.logical(if (!is.null(nodes_df$is_overlap)) nodes_df$is_overlap else rep(FALSE, nrow(nodes_df))), | ||
| community = as.character(if (!is.null(nodes_df$community)) nodes_df$community else rep(NA_character_, nrow(nodes_df))), |
There was a problem hiding this comment.
save_network() always includes a community column in the data.frame passed to dbWriteTable(…, "network_nodes", …, append = TRUE). In the repo migrations, network_nodes is created in migrations/006_create_citation_networks.sql without community, and migrations/010_add_multi_seed_support.sql adds is_overlap but not community. A fresh DB built from migrations will therefore likely fail when saving a network. Please add a migration to ALTER TABLE network_nodes ADD COLUMN IF NOT EXISTS community VARCHAR (and consider making save_network() resilient by omitting community when the column doesn’t exist).
| DBI::dbExecute(con, "CREATE TABLE IF NOT EXISTS network_nodes ( | ||
| network_id VARCHAR NOT NULL, paper_id VARCHAR NOT NULL, | ||
| is_seed BOOLEAN DEFAULT FALSE, title VARCHAR NOT NULL, authors VARCHAR, | ||
| year INTEGER, venue VARCHAR, doi VARCHAR, cited_by_count INTEGER DEFAULT 0, | ||
| x_position DOUBLE, y_position DOUBLE, | ||
| is_overlap BOOLEAN DEFAULT FALSE, community VARCHAR, | ||
| PRIMARY KEY (network_id, paper_id), | ||
| FOREIGN KEY (network_id) REFERENCES citation_networks(id) | ||
| )") |
There was a problem hiding this comment.
The test creates a custom network_nodes schema that includes community and is_overlap. This diverges from the schema managed by repo migrations (e.g., migrations/006_create_citation_networks.sql doesn’t define community and migrations/010_add_multi_seed_support.sql doesn’t add it), so the regression test can pass even when a fresh migrated DB would fail. Consider initializing the schema via the project’s migration/init helpers (or otherwise deriving the test table definitions from the migration SQL) so the test stays aligned with production schema.
|
Recreating branch from integration to resolve conflicts |
Hotfix
Issue: When saving a citation network:
Error saving network: rapi_bind: Bind parameter values need to have the same lengthRoot cause: Two issues in
save_network()(R/db.R):source_notebook_id = NULLpassed directly todbExecuteas a zero-length bind parameter — DuckDB requires all bind params to have equal lengthnodes_df$community %||% NA_character_andnodes_df$is_overlap %||% FALSEproduce scalar (length-1) fallbacks when columns are absent, mixed with length-N columns in the data.frame passed todbWriteTableFix:
NULL→NA_character_forsource_notebook_idbefore binding%||%fallbacks withrep()to ensure length-N vectorsTriage findings:
dbExecute(metadata insert) anddbWriteTable(nodes bulk insert) had parameter length mismatchescommunitycolumn addition and missing NULL handling were introduced at different timessource_notebook_id = "nb-1"and includedis_overlap/communitycolumns, masking the bugTest plan