as.data.frame.function does not return a data frame #264

Open
josilber opened this issue Sep 11, 2015 · 1 comment

Comments

@josilber

The as.data.frame.function (ADFF) function returns a function that can be used to generate a data frame. However, because ADFF's name follows the S3 method naming pattern (generic.class), it is dispatched whenever as.data.frame is called on a function f. This violates the return type specified in ?as.data.frame:

as.data.frame returns a data frame, normally with all row names "" if optional = TRUE.

The reason it's problematic (imo) to have as.data.frame not always return a data frame is that it technically requires library implementers to use is.data.frame every time they invoke as.data.frame to make sure they're now working with a data frame. As an example, consider running the following in a clean R session with no packages loaded (imagine somebody was using df for their data frame but the data failed to load in a previous step, leaving df as the density function for the F distribution):

merge(df, iris)
# Error in as.data.frame.default(x) : 
#   cannot coerce class ""function"" to a data.frame

This looks good -- merge has a helpful error message saying we can't convert df (a function) to a data frame for merging.

Now, consider what happens if we have the plyr package loaded:

library(plyr)
merge(df, iris)
# Error: C stack usage  7969596 is too close to the limit

The merge has failed with infinite recursion. The culprit is the merge.default function:

function (x, y, ...) 
merge(as.data.frame(x), as.data.frame(y), ...)
<bytecode: 0x7fa71cb288b0>
<environment: namespace:base>

merge.default calls as.data.frame on both arguments and then invokes merge again. Since as.data.frame(x) doesn't return a data frame (as the merge.default implementers assumed it would), that second call dispatches to merge.default once more, and the recursion never terminates.
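For illustration, the defensive check described earlier (calling is.data.frame after every as.data.frame) might look roughly like the sketch below. The helper name as_data_frame_strict is hypothetical; it is not part of base R or plyr, and it is only one possible way a caller could guard against a method that returns something other than a data frame.

```r
# Hypothetical helper (not in base R or plyr): coerce, then verify the result.
as_data_frame_strict <- function(x, ...) {
  out <- as.data.frame(x, ...)
  if (!is.data.frame(out)) {
    # Reached only if some as.data.frame method returned a non-data-frame.
    stop("as.data.frame() returned an object of class ",
         paste(class(out), collapse = "/"),
         ", not a data frame", call. = FALSE)
  }
  out
}

# Ordinary inputs pass through unchanged.
str(as_data_frame_strict(iris[1:3, ]))
```

If merge.default used a check like this, a function argument would produce an error on the first call rather than recursing, whether or not plyr is loaded. Requiring every caller to write such a wrapper is exactly the burden the issue is objecting to.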

Would it be possible to rename as.data.frame.function to something else (e.g. as.df.function) to avoid this?

@andypea

andypea commented Apr 11, 2019

I'd like to second the request to rename as.data.frame.function. Doing so would have saved me an hour of debugging today.

As @josilber said, it gets called by the as.data.frame generic function, but it doesn't return a data frame. This silently breaks some of the base R functions.
