function keeps track of data provenance - wraps up complicated analyses
Setting up infrastructure is a huge challenge - beyond using the tools themselves
Hi @NASA-Openscapes/2022-nasa-champions-team !
Thanks for a great fifth (final!) Cohort Call last week. We talked about everyone's pathways forward and plans for more open and inclusive data science to migrate Earthdata workflows to the Cloud. Thank you all for sharing your progress with the group. We are so impressed with how much you all have learned through this cohort! We're looking forward to supporting you with what you need - we're dedicated to helping you migrate your workflows to the cloud.
In the next two months, we'll be supporting you in a few ways:
Slack. Our #2022-nasa-champions channel will be our primary place to get help asynchronously. Please check in, share links, screenshots, error messages, questions (and wins!) and we (NASA Openscapes) will review code, test things, confer with colleagues, and share back (we'll reply and use 👀 to let everyone know progress). When it makes sense to have video calls to screenshare and talk, we'll share the video link in the channel and anyone is welcome to join as well to watch or ask questions. We can also schedule these as additional co-working sessions, see next -
Co-working. On alternating Thursdays, 11:30am-1pm PT. We'll continue to have breakout rooms for you to work with Mentors, and we'll also use these times to teach/screenshare, and will record. Some topics we're working on:
Practicing GitHub workflows and teaching others on your team
Cloud spatial subsetting
Environment management for creating cloud computing space that is reproducible and scalable (e.g. Docker images)
Dask/Pangeo software stack to enable scalable processing
Cloud costs and setup
NetCDF to Zarr
1:1 chats. We are available if you or your team wants to chat. We know that identifying what you need is hard, and talking and screensharing how you work can help you think things through, and through this we'll likely see places to help. We'll be checking in with your teams - and please know this is an offer for anyone in the cohort, not only leads. We're all here to help each other learn.
Survey. This is an anonymous survey we ask each Champions Cohort to complete so we can improve future cohorts - and for our Cohort it's also another chance to share what you need. We'd appreciate your feedback - please fill out this survey by May 20.
Hope you have a great weekend - below is a light digest of Call 05.
Cheers,
The NASA-Openscapes Mentors @NASA-Openscapes/mentors-2021 @erinmr @jules32
Digest: Cohort Call 05 [ 2022-nasa-champions ]
Cohort Folder - contains agendas, video recordings, pathways folder
Cohort webpage: https://nasa-openscapes.github.io/2022-nasa-champions
Goals: Each team shared their Pathways and we discussed next steps.
A few lines from shared notes in the Agenda doc during Pathway shares:
use MATLAB & moving into Python
no more emailing code! working to share code via GitHub
"Roiling chaos" and legacy code - how do you transition?
challenge: only 3 people in Openscapes but team is actually large. How might we get others to work in GitHub?
"kindness" is key
now Jupyter Notebooks are shared, not just local
Kodi created earthdata R function and will work on documentation and CRAN
bringing new folks in - there's so much tech stuff - how do we help folks with onboarding
need to track which version of data they gave people cause they have lots of open collabs
challenge of bridging uses of R and Python by the different communities they deal with in the cloud
Pathways & Landscape resonates
hard to share openly if you don't feel safe. kindness = :-)
understanding power of markdown and ability to put it on GitHub
Cloud needs, cloud barriers - we'd need to set up billing - administrative and budgeting needs
Costing in the cloud is a question that comes up a lot as the cloud paradigm unfolds. I wonder if one approach to understanding cost and setting things up "on your own" (at your own institution) is to first become more familiar and comfortable with doing data analysis in the cloud through "training" cloud accounts like Openscapes/2i2c, taking advantage of programs like this. That could give a better understanding of how much cloud computing you may need and what environment is required, so that when you go to do this within your lab or at your institution, you have a better idea of what to ask for and set up. I also wonder if there is a parallel between setting up cloud computing and how we've traditionally set up and paid for HPC environments.
Realized "how informal I am" in proj setup and this could be improved to work with teams
xarray is the magic access package, but slow in cloud -- working on optimization with GES-DISC wizards and gurus
if using NASA ESDIS data, could use the zarr-eosdis-store e.g. https://nasa-openscapes.github.io/2021-Cloud-Hackathon/tutorials/09_Zarr_Access.html
Aaron Friesz is working on a prototype with kerchunk & Alexis's colleague Chris has a *very* similar prototype going with fsspec. Stay tuned!
Erin Robinson has really helped me - MATLAB & Fortran code > running in the Cloud
This example can be generalized to a use case where one needs to use cloud-archived data with non-cloud data (e.g. GCM output, in-situ data, other institutions' data, etc.)
Allan: I like the idea of learning about "guardrails" so you don't mess up ($$$) on the cloud -- akin to benefits of adopting version control but for cloud credits
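One way to picture the "guardrails" idea from the notes above is a back-of-the-envelope check of estimated monthly usage against a budget. This is only a minimal sketch: the class and function names are hypothetical, and the hourly and storage rates are made-up placeholders, not real cloud prices. Swap in numbers from your own provider or from what you learn on a training hub.

```python
from dataclasses import dataclass


@dataclass
class UsageEstimate:
    """Rough monthly usage for one team (all values are illustrative)."""
    compute_hours: float      # hours of notebook/server time per month
    hourly_rate_usd: float    # ASSUMED price per compute hour, not a real quote
    storage_gb: float         # amount of data kept in cloud storage
    storage_rate_usd: float   # ASSUMED price per GB per month, not a real quote


def monthly_cost(usage: UsageEstimate) -> float:
    """Return the estimated monthly cost in USD (compute plus storage)."""
    compute = usage.compute_hours * usage.hourly_rate_usd
    storage = usage.storage_gb * usage.storage_rate_usd
    return compute + storage


def check_guardrail(usage: UsageEstimate, budget_usd: float,
                    warn_fraction: float = 0.8) -> str:
    """Return "over", "warn", or "ok" by comparing the estimate to the budget."""
    cost = monthly_cost(usage)
    if cost > budget_usd:
        return "over"
    if cost >= warn_fraction * budget_usd:
        return "warn"
    return "ok"


# Hypothetical team: 40 compute hours and 200 GB of storage per month.
team = UsageEstimate(compute_hours=40, hourly_rate_usd=0.25,
                     storage_gb=200, storage_rate_usd=0.02)
print(f"Estimated monthly cost: ${monthly_cost(team):.2f}")
print("Guardrail status with a $15 budget:", check_guardrail(team, budget_usd=15))
```

Even a rough estimate like this can help when working out what to ask for at your institution. Real guardrails would normally also use the budget alerts your cloud provider offers.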
Friendly learning resources
Intro NASA Earthdata on the Cloud
NASA Earthdata Cloud tutorials - NASA Openscapes Mentors. Follow-along tutorials for the Openscapes 2i2c JupyterHub
NASA Earthdata Glossary
Intro Python
Duke STA-663 - Colin Rundel. Lecture slides & recordings, code & notebooks. Features Jupyter, git, numpy, scipy, pandas, scikit-learn...
Intro to Geospatial Raster and Vector Data with Python - Carpentries. Follow-along tutorials & code. Features NEON data, intro to rasters & geostats, rioxarray, geopandas...
Intro to Earth and Environmental Data Science - Ryan Abernathey. Intro to Python, JupyterLab, Unix, Git, some packages & workflows
Intro R
Intro to Open Data Science with R - Lowndes & Horst. Follow-along tutorials & code. Features workflows with RMarkdown, tidyverse, RStudio, GitHub...
What they forgot to teach you about R - Bryan & Hester. Reinforcing lessons for moderately experienced R users
R for Data Science - Wickham & Grolemund. All things tidyverse, including dates, plots, modeling, programming, RMarkdown