rewriting gts metrics to pull from ERDDAP #102

Merged
merged 5 commits into from
Jan 29, 2025

MathewBiddle
Contributor

Starting to address #35

This update removes the requirement to host the gts/GTS_regional_totals*.csv files in this repo. Instead, we are harvesting the data directly from the IOOS ERDDAP.
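
As a rough illustration of what harvesting from ERDDAP involves, the sketch below builds a tabledap CSV request URL with the standard library only. It is not the code in this PR (which lists erddapy as a dependency); the dataset ID and variable names are hypothetical placeholders, and only the server address comes from this thread.

```python
from urllib.parse import quote


def tabledap_csv_url(server, dataset_id, variables, constraints=()):
    """Build an ERDDAP tabledap CSV request URL.

    The general shape is <server>/tabledap/<dataset_id>.csv?<vars>&<constraint>...
    """
    query = ",".join(variables)
    for constraint in constraints:
        # Percent-encode everything except "=" so that operators such as
        # ">=" and the colons in timestamps are safe to put in a URL.
        query += "&" + quote(constraint, safe="=")
    return f"{server.rstrip('/')}/tabledap/{dataset_id}.csv?{query}"


# Hypothetical dataset ID and variable names, for illustration only.
url = tabledap_csv_url(
    "https://erddap.ioos.us/erddap/",
    "example_gts_dataset",
    ["time", "met"],
    ["time>=2024-10-01T00:00:00Z"],
)
print(url)
```

A URL of this shape can be passed to `pandas.read_csv`; note that ERDDAP's plain `.csv` output carries a units row under the header, which usually needs to be skipped.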

I've spot checked a couple of the quarterly summaries and the numbers are the same. However, I would like another set of eyes on it before it gets merged.

@ocefpaf mind taking a look when you get a chance?

We might need to update the Action for generating the webpage to install a few other packages. See https://github.com/ioos/ioos_metrics/blob/main/.github/workflows/website_create_and_deploy.yml

Mainly:

  • erddapy
  • fiscalyear
  • datetime
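
For context on why a fiscal-year package shows up next to the quarterly summaries, here is a minimal standard-library sketch of mapping a date to a fiscal year and quarter. It assumes the US federal convention (fiscal year starts on October 1 and is named for the calendar year in which it ends); the function name is hypothetical, and this is not the `fiscalyear` API.

```python
from datetime import date


def federal_fiscal_quarter(d: date) -> tuple[int, int]:
    """Return (fiscal_year, quarter), assuming the fiscal year starts on October 1."""
    # October through December belong to the next calendar year's fiscal year.
    fiscal_year = d.year + 1 if d.month >= 10 else d.year
    # Shift months so that October maps to 0, then bucket into groups of three.
    quarter = ((d.month - 10) % 12) // 3 + 1
    return fiscal_year, quarter


merge_day_quarter = federal_fiscal_quarter(date(2025, 1, 29))
print(merge_day_quarter)
```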

There are probably other opportunities to make this code more readable. Feel free to suggest any changes.

@MathewBiddle
Contributor Author

This also means we no longer need to have this script
https://github.com/ioos/ioos_metrics/blob/main/gts_regional_metrics.py

and we don't need to run that script in this action

- 'gts_regional_metrics.py'

@MathewBiddle
Contributor Author

I have some bigger changes coming in a minute. Switching this to draft for now.

@MathewBiddle MathewBiddle marked this pull request as draft January 28, 2025 14:47
@MathewBiddle
Contributor Author

This should now close #20

I added two new functions.

  1. get_ndbc_full_stats(), which harvests all of the GTS statistics we host on the IOOS ERDDAP: https://erddap.ioos.us/erddap/search/index.html?page=1&itemsPerPage=1000&searchFor=GTS
  2. stacked_bar_plot(), which generates a Plotly stacked bar chart in HTML to add to https://ioos.github.io/ioos_metrics/gts_regional.html

These two additions are replacements for the notebook https://github.com/ioos/ioos_metrics/blob/main/notebooks/GTS_Totals_weather_act.ipynb.

The way I've coded it is very clunky ATM. I'm making two calls to the IOOS ERDDAP to grab the IOOS regional statistics. I think that can be consolidated into one call which grabs all the GTS data, then we filter for what is needed for each plotting routine.
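
The consolidation described above could look roughly like the sketch below: fetch once, then derive each plot's input from the same DataFrame. The function name, the `fetch_all` callable, and the column names (`date`, `region`, `met`) are all hypothetical stand-ins, not names taken from this repository.

```python
import pandas as pd


def build_plot_inputs(fetch_all):
    """Call the data source once, then derive each plot's input from that one frame."""
    all_gts = fetch_all()  # the only request to the server

    # Input for a time series plot: one total per month.
    monthly_totals = all_gts.groupby("date", as_index=False)["met"].sum()

    # Input for a stacked bar plot: one column per region, one row per month.
    by_region = all_gts.pivot_table(
        index="date", columns="region", values="met", aggfunc="sum", fill_value=0
    )
    return monthly_totals, by_region


# A stand-in for the real ERDDAP request, so the sketch runs offline.
calls = []


def fake_fetch():
    calls.append(1)
    return pd.DataFrame(
        {
            "date": ["2024-10", "2024-10", "2024-11"],
            "region": ["A", "B", "A"],
            "met": [1, 2, 3],
        }
    )


monthly_totals, by_region = build_plot_inputs(fake_fetch)
```

Passing the fetch step in as a callable keeps the filtering logic testable without network access.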

I am also getting a bunch of FutureWarning, SettingWithCopyWarning, and UserWarning messages when I run create_gts_regional_landing_page.py. As they are warnings, I'm ignoring them for now in an effort to make some progress on this topic and get my thoughts out there.

I am open to any suggestions on how to make this code more readable/efficient.

@MathewBiddle MathewBiddle marked this pull request as ready for review January 28, 2025 16:19
@MathewBiddle MathewBiddle requested a review from ocefpaf January 28, 2025 16:34
updating action to no longer ref unnecessary files
@ocefpaf
Member

ocefpaf commented Jan 29, 2025

I am also getting a bunch of FutureWarning, SettingWithCopyWarning, and UserWarning messages when I run create_gts_regional_landing_page.py.

The only one that can lead to wrong results is SettingWithCopyWarning; the other two are safe to ignore, and we can resolve them as we update the code. Do you want me to take a look at what is causing the SettingWithCopyWarning before we merge? Or do you want to merge this and address it later? (If you trust the results, it should be OK to merge; SettingWithCopyWarning does not always mean that something you use later was overwritten.)

Contributor Author

@MathewBiddle MathewBiddle left a comment

I added comments as to where I think SettingWithCopyWarning comes in.

@@ -30,7 +34,7 @@ def write_templates(configs, org_config):


 def timeseries_plot(output):
-    output["date"] = pd.to_datetime(output["date"])
+    output["date"] = pd.to_datetime(output.index.strftime("%Y-%m"))
Contributor Author

I think this is where SettingWithCopyWarning comes in.

Member

Those are probably fine. Is that date column new or is it getting overwritten?

Contributor Author

date column is new.


print(f_out)
key = "{} {}".format(f_out.split("_")[3], f_out.split("_")[4].split(".")[0])
totals_subset['date'] = totals_subset.index.strftime("%Y-%m")
Contributor Author

I think this is where SettingWithCopyWarning comes in.
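
One common way to avoid this warning, sketched here under assumptions, is to take an explicit copy of the subset before adding the column. The frame contents, the `met` column, and the function name are hypothetical; whether `totals_subset` in this PR is actually built from a slice has not been confirmed in this thread.

```python
import pandas as pd


def add_month_column(totals: pd.DataFrame, start: str, end: str) -> pd.DataFrame:
    """Slice a date-indexed frame and add a 'date' column without touching the original."""
    # .copy() makes the subset an independent frame, so the assignment below
    # cannot be ambiguous about whether it writes to a view of `totals`.
    subset = totals.loc[start:end].copy()
    subset["date"] = subset.index.strftime("%Y-%m")
    return subset


totals = pd.DataFrame(
    {"met": [10, 20, 30]},
    index=pd.to_datetime(["2024-10-01", "2024-11-01", "2024-12-01"]),
)
totals_subset = add_month_column(totals, "2024-10", "2024-11")
```

The explicit copy costs some memory, but it makes the intent clear: the new column belongs to the subset only.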

@MathewBiddle
Contributor Author

From what I can tell, these changes return exactly the same metrics as the previous process calculated. So I'd like to get these changes in, then we can adjust to reduce the number of warnings. How does that sound?

I added some line comments on where I think the issue is occurring.

@MathewBiddle MathewBiddle merged commit dac7d97 into ioos:main Jan 29, 2025
4 checks passed
@MathewBiddle MathewBiddle deleted the refactor_gts branch January 29, 2025 19:01
2 participants