Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New linter to recommend list2DF() base R function #2596

Open
Bisaloo opened this issue Jun 7, 2024 · 4 comments · May be fixed by #2834
Open

New linter to recommend list2DF() base R function #2596

Bisaloo opened this issue Jun 7, 2024 · 4 comments · May be fixed by #2834

Comments

@Bisaloo
Copy link
Collaborator

Bisaloo commented Jun 7, 2024

As an alternative to do.call(cbind.data.frame, x).

x <- list(
  ind = seq_along(letters),
  let = letters
)

do.call(cbind.data.frame, x)
#>    ind let
#> 1    1   a
#> 2    2   b
#> 3    3   c
#> 4    4   d
#> 5    5   e
#> 6    6   f
#> 7    7   g
#> 8    8   h
#> 9    9   i
#> 10  10   j
#> 11  11   k
#> 12  12   l
#> 13  13   m
#> 14  14   n
#> 15  15   o
#> 16  16   p
#> 17  17   q
#> 18  18   r
#> 19  19   s
#> 20  20   t
#> 21  21   u
#> 22  22   v
#> 23  23   w
#> 24  24   x
#> 25  25   y
#> 26  26   z
list2DF(x)
#>    ind let
#> 1    1   a
#> 2    2   b
#> 3    3   c
#> 4    4   d
#> 5    5   e
#> 6    6   f
#> 7    7   g
#> 8    8   h
#> 9    9   i
#> 10  10   j
#> 11  11   k
#> 12  12   l
#> 13  13   m
#> 14  14   n
#> 15  15   o
#> 16  16   p
#> 17  17   q
#> 18  18   r
#> 19  19   s
#> 20  20   t
#> 21  21   u
#> 22  22   v
#> 23  23   w
#> 24  24   x
#> 25  25   y
#> 26  26   z
y <- list(
  var1 = runif(1e6),
  var2 = rnorm(1e6),
  var3 = rnorm(1e6),
  var4 = rnorm(1e6)
)

microbenchmark::microbenchmark(
  do.call(cbind.data.frame, y),
  list2DF(y),
  check = "identical"
)
#> Unit: microseconds
#>                          expr     min       lq      mean   median       uq
#>  do.call(cbind.data.frame, y) 125.398 131.2090 158.49506 139.6595 166.9540
#>                    list2DF(y)   6.227   7.1795  10.40463   8.3165  10.6615
#>      max neval
#>  344.463   100
#>   67.065   100

Created on 2024-06-07 with reprex v2.1.0

Some hits in the wild: https://github.com/search?q=org%3Acran+%2Fdo%5C.call%5C%28cbind%5C.data%5C.frame%2F&type=code

Are there some edge cases one needs to be aware of? Happy to submit a PR for this.

@TimTaylor
Copy link

Are there some edge cases one needs to be aware of?

list2DF() doesn't recycle vectors whereas cbind.data.frame() will. I think list2DF() did recycle when first introduced but then became strict in a subsequent release.

x <- list(ind = 1:2, let = letters[1:6])
try(list2DF(x))
#> Error in list2DF(x) : all variables should have the same length
do.call(cbind.data.frame, x)
#>   ind let
#> 1   1   a
#> 2   2   b
#> 3   1   c
#> 4   2   d
#> 5   1   e
#> 6   2   f

even length 1 not recycled

x <- list(ind = 1, let = letters[1:6])
try(list2DF(x))
#> Error in list2DF(x) : all variables should have the same length
do.call(cbind.data.frame, x)
#>   ind let
#> 1   1   a
#> 2   1   b
#> 3   1   c
#> 4   1   d
#> 5   1   e
#> 6   1   f

@MichaelChirico
Copy link
Collaborator

In that case it makes it tough to recommend list2DF() statically. OTOH, even just data.frame() (which does do recycling) seems preferable?

I do see a surprising number of call sites for do.call(cbind.data.frame:

https://github.com/search?q=lang%3AR+%2Fdo%5B.%5Dcall%5B%28%5D%5Cs*%5B%27%22%5D%3Fcbind%5B.%5Ddata%5B.%5Dframe%2F&type=code

So we could recommend just using data.frame() instead, and mentioning list2DF() as another alternative to consider if recycling is not needed?

@AshesITR
Copy link
Collaborator

Is there different behavior of data.frame() vs. cbind.data.frame() if the argument lists contain complex types such as data frames / matrices / lists?

@MichaelChirico
Copy link
Collaborator

MichaelChirico commented Jun 15, 2024

Here's cbind.data.frame()

function (..., deparse.level = 1) 
data.frame(..., check.names = FALSE)
<bytecode: 0x55bca92f06a0>
<environment: namespace:base>

🙃

Bisaloo added a commit to Bisaloo/lintr that referenced this issue Mar 12, 2025
Bisaloo added a commit to Bisaloo/lintr that referenced this issue Mar 12, 2025
Bisaloo added a commit to Bisaloo/lintr that referenced this issue Mar 12, 2025
@Bisaloo Bisaloo linked a pull request Mar 12, 2025 that will close this issue
Bisaloo added a commit to Bisaloo/lintr that referenced this issue Mar 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants