-
Notifications
You must be signed in to change notification settings - Fork 129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New [[.tbl_df
shouldn't use vec_slice()
when doing [[i, j]]
#681
Comments
I agree We define
in the invariants vignette, because we want it to work with data frame and array columns. Is this wrong? |
I think that is wrong because it implies the attributes are kept. library(tibble)
df <- tibble(x = 1:2, y = structure(1:2, extra = "stuff"))
# df[[1, 2]]
df[1,][[2]]
#> [1] 1
#> attr(,"extra")
#> [1] "stuff" Created on 2019-12-02 by the reprex package (v0.3.0.9000) I think it should be more like:
Which is very close to |
It also depends on what you want list column extraction to return library(vctrs)
library(tibble)
tbl <- tibble(y = 1:2, x = list(1, 2))
# tbl[[1, 2]]
tbl[1,][[2]]
#> [[1]]
#> [1] 1
vctrs:::vec_get(tbl[[2]], 1)
#> [1] 1 Created on 2019-12-02 by the reprex package (v0.3.0.9000) |
A potential short term solution is to do This isn't exactly right because it still would not remove extra attributes and row names from data frames and arrays, but would be close. Something like: library(tibble)
library(vctrs)
vec_get_ish <- function(x, i) {
can_use_bracket_bracket <- !is.data.frame(x) && !is.array(x)
if (can_use_bracket_bracket) {
return(x[[i]])
}
out <- vec_slice(x, i)
row.names(out) <- NULL
# Now strip all non-core attributes? Not sure how though
# ...
out
}
vec_get_ish(1:5, 1)
#> [1] 1
vec_get_ish(mtcars, 1)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> 1 21 6 160 110 3.9 2.62 16.46 0 1 4 4
vec_get_ish(list(1, 2), 1)
#> [1] 1
tbl_bracket_bracket_ij <- function(x, i, j) {
vec_get_ish(x[[j]], i)
}
df <- tibble(x = tibble(a = 1:2))
df2 <- tibble(x = list(a = 1:2))
tbl_bracket_bracket_ij(df, 1, 1)
#> # A tibble: 1 x 1
#> a
#> <int>
#> 1 1
tbl_bracket_bracket_ij(df2, 1, 1)
#> [1] 1 2 Created on 2019-12-02 by the reprex package (v0.3.0.9000) |
What should happen here? tibble::tibble(hms::hms(1:3))[[1, 1]]
#> 00:00:01 Created on 2019-12-02 by the reprex package (v0.3.0) I don't understand what's wrong with keeping attributes. |
Ah, you're right. I forgot that the main difference between
library(vctrs)
x <- structure(1:5, extra = "stuff", names = letters[1:5])
# vec_slice keeps attributes
vec_slice(x, 1)
#> a
#> 1
#> attr(,"extra")
#> [1] "stuff"
# `[` drops attributes, except names
x[1]
#> a
#> 1
# `[[` drops attributes, including names
x[[1]]
#> [1] 1
# vec_get(), then, should keep attributes, except names?
strip_names <- function(x) {
if (is.data.frame(x) || is.array(x)) {
row.names(x) <- NULL
} else {
names(x) <- NULL
}
x
}
vec_get <- function(x, i) {
stopifnot(length(i) == 1L)
if (is.list(x) && !is.data.frame(x)) {
x <- x[[i]]
} else {
x <- vec_slice(x, i)
x <- strip_names(x)
}
x
}
vec_get(x, 1)
#> [1] 1
#> attr(,"extra")
#> [1] "stuff"
vec_get(mtcars, 1)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> 1 21 6 160 110 3.9 2.62 16.46 0 1 4 4
vec_get(list(x = c(a = 1)), 1)
#> a
#> 1
vec_get(hms::hms(1:3), 1)
#> 00:00:01 Created on 2019-12-03 by the reprex package (v0.3.0.9000) |
@DavisVaughan: Do we need to do anything here? |
I think the main question is what you want this behavior with lists to be. library(tibble)
y <- tibble(z = list(1, 2))
i <- 1
j <- 1
# Current behavior. But is this right? It doesn't match CRAN tibble
y[[i, j]]
#> [[1]]
#> [1] 1
# I expect the invariant to be more like this, which
# is more accurately defined (for data frames) as `vec_get(y[[j]], i)`
y[[j]][[i]]
#> [1] 1 If you can definitively answer that, the answer to that also implies if names should be dropped here, as you would have to use library(tibble)
y <- tibble(z = structure(1:2, stuff = "foo", names = letters[1:2]))
i <- 1
j <- 1
# Current behavior. But names should be dropped too if we index into lists with
# `[[` behavior. (Other attributes would be kept).
y[[i, j]]
#> a
#> 1
#> attr(,"stuff")
#> [1] "foo" We still don't have a library(tibble)
library(vctrs)
strip_vec_names <- function(x) {
maybe_row_names <- is.data.frame(x) || is.array(x)
if (maybe_row_names) {
row.names(x) <- NULL
} else {
names(x) <- NULL
}
x
}
vec_get <- function(x, i) {
# Assuming you already do this
# i <- vec_as_location2(i, vec_size(x))
if (vec_is_list(x)) {
out <- x[[i]]
return(out)
}
out <- vec_slice(x, i)
out <- strip_vec_names(out)
out
}
x <- tibble(z = list(1, 2))
y <- tibble(z = structure(1:2, stuff = "foo", names = letters[1:2]))
vec_get(x[[1]], 1)
#> [1] 1
vec_get(y[[1]], 1)
#> [1] 1
#> attr(,"stuff")
#> [1] "foo" |
It's in the invariants, https://tibble.tidyverse.org/dev/articles/invariants.html#definition-of-xi-j-1:
It doesn't talk about names yet. IIUC, we need to return an unnamed scalar in the last example: x <- tibble::tibble(a = c(b = 1))
x$a
#> b
#> 1
x[["a"]]
#> b
#> 1
x[["a"]][[1]]
#> [1] 1
x[[1, "a"]]
#> b
#> 1 Created on 2020-03-20 by the reprex package (v0.3.0) |
Thanks! |
This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary. |
CRAN tibble
Dev tibble
I think this is definitely not right.
[[
should always strip attributes from atomic vectors.The issue comes from using
vec_slice()
in[[.tbl_df
here:tibble/R/subsetting.R
Line 122 in e9663b9
The issue is that
vec_slice()
works more like[
which won't strip the attributes from the column.[[
would. The vctrs equivalent for[[
doesn't exist yet, but would bevec_get()
r-lib/vctrs#626
Note that
base:::[[.data.frame
uses[[
for the row extraction after selecting the column, which is why we previously had the correct behavior:vec_get()
also behaves differently thanvec_slice()
with lists, so list column extraction is also different now.CRAN tibble:
Dev tibble:
The text was updated successfully, but these errors were encountered: