_rowaddr and _rowid not exposed for merge_insert #3439

Open
oceanusxiv opened this issue Feb 9, 2025 · 1 comment

@oceanusxiv
Sort of a follow-up on #3251: I noticed that _rowid and _rowaddr don't seem to be usable with merge_insert, while they work with merge. When I try to use them for a subcolumn update, something like

import lance
import polars as pl
import pyarrow as pa

initial_data = pa.table(
    {
        "a": range(10),
        "b": range(10),
        "c": range(10, 20),
    }
)

dataset = lance.write_dataset(
    initial_data, "/tmp/lance/test2.lance"
)

new_values = pl.from_arrow(dataset.to_table(with_row_id=True)).select(pl.col("_rowid"), pl.col("a") * 2)

(dataset.merge_insert("a").when_matched_update_all().execute(new_values))

gives me

OSError: Append with different schema: fields did not match, missing=[b, c], unexpected=[_rowid], location: /Users/runner/work/lance/lance/rust/lance-core/src/datatypes/schema.rs:142:27
@chenkovsky
Contributor

I think this is quite different from #3251. Because _rowid is managed by Lance, we cannot insert _rowid into Lance.
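
To make the error message above concrete, here is a minimal sketch of the field-name comparison it implies. This is plain Python and not Lance's actual schema check; `diff_fields` is a hypothetical helper introduced only for illustration, and the field lists are taken from the report (dataset columns a, b, c versus the _rowid and a columns in new_values).

```python
def diff_fields(target_fields, source_fields):
    """Hypothetical helper: list fields the source lacks and fields the target does not know."""
    missing = [f for f in target_fields if f not in source_fields]
    unexpected = [f for f in source_fields if f not in target_fields]
    return missing, unexpected


# Dataset schema from the report vs. the columns selected into new_values.
missing, unexpected = diff_fields(["a", "b", "c"], ["_rowid", "a"])
print(f"missing={missing}, unexpected={unexpected}")
```

Under this reading, the source table would need to carry every dataset column (b and c here) and no Lance-managed column such as _rowid for the schemas to match.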
