Improve list.chunked() + List<List<T>>.toDataFrame use case #1486
I agree that this is a popular use case. I ran into the same problem and handled it by parsing with the File API. It would be great if a tiny example, say with subtitles, were also added.
public fun <T> List<List<T>>.toDataFrame(containsColumns: Boolean = false): AnyFrame =
@Refine
@Interpretable("ValuesListsToDataFrame")
public fun <T> List<List<T>>.toDataFrame(header: List<String>? = null, containsColumns: Boolean = false): AnyFrame =
This is a breaking change. Can we keep the old function and deprecate it?
Deprecated, and moved from the io package to the api package.
Seems like a useful addition :) It's common in Excel and CSV files too to take the first row as headers unless headers are supplied explicitly, so it makes sense here too, I guess.
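To make the "first row as headers unless supplied" convention concrete, here is a minimal sketch in plain Kotlin with no DataFrame dependency. `rowsToColumns` is a hypothetical helper written for illustration only; it is not the function from this PR and does not claim to match its behavior.

```kotlin
// Hypothetical helper (not part of the library): turns row-oriented data into named columns.
// If `header` is null, the first row is consumed as the column names.
fun <T> rowsToColumns(rows: List<List<T>>, header: List<String>? = null): Map<String, List<T>> {
    val names = header ?: rows.firstOrNull()?.map { it.toString() } ?: emptyList()
    // When the names came from the first row, that row is not data.
    val data = if (header == null) rows.drop(1) else rows
    return names.withIndex().associate { (i, name) -> name to data.map { row -> row[i] } }
}

fun main() {
    val rows = listOf(listOf("a", "b"), listOf("1", "2"))
    println(rowsToColumns(rows))                              // first row used as header
    println(rowsToColumns(rows, header = listOf("x", "y")))   // explicit header, both rows are data
}
```

The sketch assumes every row has at least as many cells as there are column names.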
Consider a file structured like this:
I believe this layout is popular; one example is SRT, and I personally have had to deal with it a lot.
It can be parsed into a dataframe:
Surprisingly, toDataFrame here is not the generic Iterable.toDataFrame, but a completely different function.
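As a rough illustration of the shape of data involved, the sketch below chunks the lines of a made-up SRT-like input into a `List<List<String>>` using only the Kotlin standard library. The sample lines, the record size of four, and the `parseRecords` name are assumptions chosen for the example, not taken from this PR.

```kotlin
// Hypothetical input: each record spans four lines (index, time range, text, blank separator).
val sampleLines = listOf(
    "1", "00:00:01,000 --> 00:00:02,500", "Hello", "",
    "2", "00:00:03,000 --> 00:00:04,000", "World",
)

// Groups the flat list of lines into one inner list per record.
fun parseRecords(lines: List<String>): List<List<String>> =
    lines
        .chunked(4)           // the last chunk may be shorter than 4
        .map { it.take(3) }   // drop the blank separator line, if present

fun main() {
    // Each inner list is one row: index, time range, text.
    parseRecords(sampleLines).forEach(::println)
}
```

The result is row-oriented data without column names, which is the kind of value a `List<List<T>>.toDataFrame` overload with an explicit header would receive.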

Problem: in its current shape it's not helpful.
I end up with something very close to what I want, but the required change to the code is somewhat non-trivial.
I'd either have to:
Or switch to a completely different route:
With the compiler plugin
With this API change:
The plugin will understand the resulting schema too.