Custom aggregation functions (or: NA handling in aggregation functions) #97

Open
hol430 opened this issue Mar 7, 2025 · 1 comment

@hol430
Contributor

hol430 commented Mar 7, 2025

I have a field which was created by reading from multiple sources and then calling copyLayers(..., keep.all.to = TRUE). My sources cover (somewhat) different time periods, so the end result is that on some dates, I have NA values for one field. I then call aggregateYears() (though I suppose this applies to other aggregation functions as well), which returns NA values for days which contain at least one missing value.

What would be nice would be if aggregateYears() and friends accepted an na.rm argument or similar, though I'm not sure if that argument is supported by all of the currently-implemented aggregators. Otherwise, if we could pass in a custom aggregation function, that would solve the issue as well. I'm not sure how easy that would be to implement - if it's too hard, then having another aggregation method called something like "mean_na_rm" would be easy to implement and would solve the problem.
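To illustrate the behaviour described above outside of DGVMTools, here is a minimal base R sketch. The data frame, its column names, and the `mean_na_rm` helper are all made up for illustration (the helper just mirrors the name suggested above); none of it is DGVMTools code.

```r
# Toy stand-in for a combined Field: two layers, the second missing in 2001.
# Column names are illustrative only, not DGVMTools internals.
dt <- data.frame(
  Year   = rep(2000:2002, each = 2),
  Day    = rep(1:2, times = 3),
  layerA = c(1, 2, 3, 4, 5, 6),
  layerB = c(10, 20, NA, NA, 30, 40)
)
layers <- c("layerA", "layerB")

# Plain mean: a single NA in a group makes that group's result NA.
strict <- aggregate(dt[layers], by = list(Day = dt$Day), FUN = mean)

# Hypothetical "mean_na_rm" aggregator: drop missing values before averaging.
mean_na_rm <- function(x) mean(x, na.rm = TRUE)
lenient <- aggregate(dt[layers], by = list(Day = dt$Day), FUN = mean_na_rm)
```

In this sketch `strict$layerB` is NA for both days, while `lenient$layerB` averages only the years in which that layer has data.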

I have a workaround, so it's not really urgent or anything, but I think this use case (comparing the seasonality of two not-quite-temporally-overlapping layers) is not totally far-fetched and it would be nice if the package was able to handle this for us.

@MagicForrest
Owner

Hi Drew. I see the problem. I think something can be done. Exactly which option will take a little thinking.

One thing to be aware of: DGVMTools wasn't really designed for combining objects with different time periods with copyLayers() like that. Rather, two sources with different time periods can be kept separate but plotted together by bundling them in a list. Or, if you want to compare them directly, compareLayers() is the tool for that. Exactly what processing do you need to do on the two datasets? Is there a way to do it without combining data from multiple Sources in one Field? Also, would it work to apply aggregateYears() before combining the layers?
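The last idea above, aggregating each source over its own years first and only then putting the results side by side, might look roughly like this in plain base R. This is only a sketch of the ordering, assuming made-up data frames and a hypothetical `day_mean` helper; it does not use or represent the DGVMTools functions.

```r
# Two toy sources covering different (partly overlapping) years.
srcA <- data.frame(Year = rep(2000:2001, each = 2), Day = rep(1:2, times = 2),
                   value = c(1, 2, 3, 4))
srcB <- data.frame(Year = rep(2001:2002, each = 2), Day = rep(1:2, times = 2),
                   value = c(10, 20, 30, 40))

# Hypothetical helper: average over years separately for each day.
day_mean <- function(df) {
  aggregate(df["value"], by = list(Day = df$Day), FUN = mean)
}

# Aggregate first, then combine. Each source is averaged over its own
# years, so the mismatch in time periods never produces NA rows.
combined <- merge(day_mean(srcA), day_mean(srcB),
                  by = "Day", suffixes = c(".A", ".B"))
```

Because the year dimension is gone before the merge, `combined` has one row per day and no missing values.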

But, yes, you are absolutely right. It would be good to have something that also works for this use case. I think that adding na.rm would work and be a very decent fix, and totally consistent with the normal R conventions. So I'll probably look at that.
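For reference, the usual R convention is that `na.rm` defaults to FALSE and is simply forwarded to the summary function. A minimal sketch of that pattern follows, using a hypothetical `aggregate_by_day` helper (not a DGVMTools function) and made-up data. It also shows one thing worth deciding up front: what a group with no non-missing values should return, since base R functions differ here.

```r
# Hypothetical helper (not part of DGVMTools): aggregate one column by day,
# forwarding na.rm to the summary function as base R's mean() and sum() accept it.
aggregate_by_day <- function(df, column, FUN = mean, na.rm = FALSE) {
  aggregate(df[column], by = list(Day = df$Day), FUN = FUN, na.rm = na.rm)
}

# Day 1 has one valid value; day 2 has none at all.
df <- data.frame(Day = c(1, 1, 2, 2), value = c(1, NA, NA, NA))

default_mean <- aggregate_by_day(df, "value", mean)                 # na.rm = FALSE
lenient_mean <- aggregate_by_day(df, "value", mean, na.rm = TRUE)
lenient_sum  <- aggregate_by_day(df, "value", sum,  na.rm = TRUE)
```

With `na.rm = TRUE`, the all-missing day comes back as NaN from `mean()` but as 0 from `sum()`, so different aggregation methods may need slightly different handling.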

hol430 added a commit to hie-dave/lpjg-output-analysis that referenced this issue Mar 17, 2025
This is a common use case when plotting layers which cover different
time periods.

See:

MagicForrest/DGVMTools#97