Cropland: Zambia 2019 #221

Closed
9 tasks done
adebowaledaniel opened this issue Sep 29, 2022 · 45 comments
Assignees
Labels
crop map Generate new crop map

Comments

@adebowaledaniel
Collaborator

adebowaledaniel commented Sep 29, 2022

Start year: 2019
Start month: November

@adebowaledaniel adebowaledaniel added the crop map Generate new crop map label Sep 29, 2022
@adebowaledaniel adebowaledaniel self-assigned this Sep 29, 2022
@adebowaledaniel adebowaledaniel changed the title from "Zambia" to "Cropland mask: Zambia" Sep 29, 2022
@adebowaledaniel

This comment was marked as resolved.

@ivanzvonkov
Collaborator

Merging script: https://github.com/nasaharvest/openmapflow/blob/main/openmapflow/scripts/merging.sh

@adebowaledaniel
Collaborator Author

@ivanzvonkov, there are missing predictions in the Zambia cropland mask (see here). I ran the inference multiple times, and each run produced the same number of predictions (see the screenshot below).
image

@ivanzvonkov
Collaborator

I think this is most likely due to NaN values in some of the tifs; you can go ahead and merge.
The logs can be investigated to find out why this is happening.

@adebowaledaniel
Collaborator Author

I already merged; see the link I provided above (https://code.earthengine.google.com/a9522bd391a18cd98268994b6bffe317?hideCode=true)

The error messages vary; one is a request timeout, and the other is not specific.

@ivanzvonkov
Collaborator

Some of the errors I am seeing look like this:
image

This means there's a NaN value in the tif. This was fixed in a newer version of OpenMapFlow (nasaharvest/openmapflow#109).

The newer version will be deployed if you install it manually before deploying:

pip install openmapflow==0.2.1rc2
export OPENMAPFLOW_MODELS="..."
openmapflow deploy

@ivanzvonkov
Collaborator

ivanzvonkov commented Oct 31, 2022

The new version can now be deployed, provided it's on master, by manually running the GitHub Action here: https://github.com/nasaharvest/crop-mask/actions/workflows/deploy.yml

@adebowaledaniel
Collaborator Author

EE link

@ivanzvonkov
Collaborator

ivanzvonkov commented Oct 31, 2022

  • Look into the logs to see which bands have NaNs most often (logs)
  • Skip those in the training of the next model
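
One way to approach the first bullet, sketched under assumptions: if the model inputs are loaded as a NumPy array of shape (samples, timesteps, bands), the NaN fraction per band can be ranked as below. The function name, the band names, and the toy data are illustrative only; they are not taken from OpenMapFlow or from the actual logs.

```python
import numpy as np


def nan_fraction_per_band(samples, band_names):
    """Return {band: NaN fraction}, ordered from most to least NaN.

    `samples` is assumed to have shape (n_samples, n_timesteps, n_bands).
    """
    samples = np.asarray(samples, dtype=float)
    if samples.shape[-1] != len(band_names):
        raise ValueError("last axis must match the number of band names")
    # Average the NaN mask over every axis except the band axis.
    fractions = np.isnan(samples).mean(axis=tuple(range(samples.ndim - 1)))
    order = np.argsort(-fractions, kind="stable")
    return {band_names[i]: float(fractions[i]) for i in order}


# Hypothetical toy data: 2 samples, 2 timesteps, 3 bands.
bands = ["VV", "VH", "B2"]
data = np.array([
    [[np.nan, np.nan, 0.1], [0.2, np.nan, 0.3]],
    [[0.5, 0.4, 0.2], [0.6, 0.1, 0.4]],
])
report = nan_fraction_per_band(data, bands)
for band, fraction in report.items():
    print(f"{band}: {fraction:.0%} NaN")
```

Bands at the top of the report would be the first candidates to drop from the next training run.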

@ivanzvonkov
Collaborator

Blocked by #230

@ivanzvonkov
Collaborator

The above is merged; you can now retrain with missing values in the training data.

@ivanzvonkov ivanzvonkov changed the title from "Cropland mask: Zambia" to "Cropland: Zambia 2019" Nov 14, 2022
@adebowaledaniel
Collaborator Author

Here is the error I got while trying to retrain the model @ivanzvonkov.

image

@adebowaledaniel
Collaborator Author

Nit: I think this warning might be of concern later.

image

@ivanzvonkov
Collaborator

@adebowaledaniel re: the arrow error, it happens because of this line:

print(f"Upsampling: local crop{arrow}non-crop: {local_crop}{arrow}{local_non_crop}")

Do you see the bug? 🐛

@ivanzvonkov
Collaborator

#248 will be merged soon

@ivanzvonkov
Collaborator

Consider creating a small map and checking it visually.

@adebowaledaniel
Collaborator Author

Still the same problem @ivanzvonkov
image

@hannah-rae
Contributor

hannah-rae commented Jan 9, 2023

Map quality is poor due to large- and small-scale blockiness and blatantly wrong predictions.

@adebowaledaniel to investigate a few things to debug:

  • check the GEE logs to see if there are any errors being thrown to explain why there is missing data
  • run error analysis notebook to plot errors in evaluation set to see if there is geographical pattern to errors (may need to update notebook)

@adebowaledaniel
Collaborator Author

The three error codes in the logs:

  1. 500: The request was aborted because there was no available instance, as a result of (i) a sudden increase in traffic or (ii) long request processing time.
  2. 503: The request failed because either the HTTP response was malformed or connection to the instance had an error. The container instance was found to be using too much memory and was terminated. This is likely to cause a new container instance to be used for the next request to this revision.
  3. 504: The request has been terminated because it has reached the maximum request timeout.

Potential solution: increase the memory limit and reduce the number of requests per container.
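
To see which of the three failure modes dominates before changing the memory limit or the requests per container, the log entries could be tallied by status code. This is a minimal sketch that assumes the logs have already been parsed into dicts with a "status" key. That format, the function name, and the sample entries are hypothetical, and the short descriptions paraphrase the messages listed above.

```python
from collections import Counter

# Short paraphrases of the three log messages listed above.
STATUS_MEANING = {
    500: "no available instance",
    503: "instance error or out of memory",
    504: "request timeout",
}


def summarize_status_codes(entries):
    """Count failed requests (status >= 500) per HTTP status code.

    `entries` is an iterable of dicts with a "status" key
    (a hypothetical, simplified log format).
    """
    counts = Counter(e["status"] for e in entries if e["status"] >= 500)
    return [
        (code, n, STATUS_MEANING.get(code, "other"))
        for code, n in counts.most_common()
    ]


# Hypothetical sample entries, not real log data.
logs = [
    {"status": 200},
    {"status": 503},
    {"status": 504},
    {"status": 503},
    {"status": 500},
]
summary = summarize_status_codes(logs)
for code, n, meaning in summary:
    print(f"{code}: {n} request(s) ({meaning})")
```

If 503 dominates, memory is the likelier bottleneck; if 500 or 504 dominates, fewer requests per container or a longer timeout may matter more.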

@adebowaledaniel
Collaborator Author

@ivanzvonkov
Collaborator

  • error analysis notebook updates
  • ERA5 grid overlay on map
  • If possible export grid overlay on map

@ivanzvonkov
Collaborator

Ivan to send Zambia data if he has it
Adebowale to create PR for error analysis notebook update

@hannah-rae
Contributor

During the operational meeting, @bhyeh mentioned that @adebowaledaniel noted that, since the Zambia_CEO_2019 dataset had a 0/0.5/0.5 train/val/test split, the model may not have been trained with any points from Zambia. We usually set the CEO datasets to this ratio when we assume or know there are local samples in the other datasets (e.g. a ground-based dataset independent from the CEO dataset), but in the case of Zambia it's very possible there were few or no points in the other datasets. (One could check in GeoWiki how many are in Zambia, but this is probably the only dataset that has Zambia points.) @adebowaledaniel, can you post your updated results/plan based on your 0.6/0.2/0.2 split here?
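
The effect of the two ratios can be illustrated with a small sketch. It assigns points to subsets at random, which is an assumption made for illustration and not necessarily how the project assigns subsets; the function name and the point count are hypothetical.

```python
import random


def assign_splits(n_points, ratios=(0.6, 0.2, 0.2), seed=0):
    """Randomly label each point train/val/test according to `ratios`."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    rng = random.Random(seed)
    indices = list(range(n_points))
    rng.shuffle(indices)
    n_train = round(ratios[0] * n_points)
    n_val = round(ratios[1] * n_points)
    labels = [None] * n_points
    for rank, idx in enumerate(indices):
        if rank < n_train:
            labels[idx] = "train"
        elif rank < n_train + n_val:
            labels[idx] = "val"
        else:
            labels[idx] = "test"
    return labels


# With a 0/0.5/0.5 split, no point from this dataset reaches training.
old = assign_splits(100, ratios=(0.0, 0.5, 0.5))
new = assign_splits(100, ratios=(0.6, 0.2, 0.2))
print(old.count("train"), new.count("train"))
```

With the first ratio, the model sees local points only if another dataset supplies them, which is the concern raised above.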

@adebowaledaniel
Collaborator Author

Thank you, @hannah-rae. As you mentioned, GeoWiki is the only dataset with Zambia data; its training subset has 336 sample points (positive class: 5.6%). Here is the result for the 0.6/0.2/0.2 split.

crop-mask/data/models.json

Lines 325 to 341 in 1785602

"Zambia_2019_v1": {
    "params": "https://wandb.ai/nasa-harvest/crop-mask/runs/4k39krn6",
    "test_metrics": {
        "accuracy": 0.9726,
        "f1_score": 0.7,
        "precision_score": 0.7,
        "recall_score": 0.7,
        "roc_auc_score": 0.9703
    },
    "val_metrics": {
        "accuracy": 0.9749,
        "f1_score": 0.7222,
        "precision_score": 0.619,
        "recall_score": 0.8667,
        "roc_auc_score": 0.9662
    }
},
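
As a quick sanity check on the metrics above, the reported F1 scores can be recomputed from precision and recall using the standard harmonic-mean definition. The helper name is illustrative; the input values are copied from the snippet.

```python
def f1_from_precision_recall(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the models.json snippet above.
test_f1 = f1_from_precision_recall(0.7, 0.7)
val_f1 = f1_from_precision_recall(0.619, 0.8667)
print(round(test_f1, 4), round(val_f1, 4))
```

Both values agree with the reported `f1_score` entries to four decimal places.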

I applied the post-classification NDVI filtering method by Ben on a subset produced by the model; the output is here.

@adebowaledaniel
Collaborator Author

June 5 - Check for cloud presence in the tif files

@adebowaledaniel
Collaborator Author

adebowaledaniel commented Jun 14, 2023

@hannah-rae, contrary to our expectation that clouds were the cause, the Sentinel-1 bands were absent in those oddly shaped regions on the map. I shared my observations in this slide and also included a notebook (link in the slide) in case you want to reproduce what I did.

@hannah-rae
Contributor

Very interesting... was that not captured in the logs at all? Maybe we should add a test when the data are exported to check that none of the bands are missing data.

For now, perhaps it makes sense to train a new model without S1?
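
A check along the lines suggested above could look like the following sketch. It assumes an exported tile has been read into a NumPy array of shape (bands, height, width) with missing data stored as NaN; the function name, the band names, and the threshold are hypothetical and not part of the existing export code.

```python
import numpy as np


def find_missing_bands(array, band_names, max_nan_fraction=0.0):
    """Return the bands whose NaN fraction exceeds `max_nan_fraction`.

    `array` is assumed to have shape (n_bands, height, width).
    """
    array = np.asarray(array, dtype=float)
    if array.shape[0] != len(band_names):
        raise ValueError("first axis must match the number of band names")
    # Flatten the spatial axes and compute the NaN fraction per band.
    nan_fraction = np.isnan(array).reshape(array.shape[0], -1).mean(axis=1)
    return [
        name
        for name, fraction in zip(band_names, nan_fraction)
        if fraction > max_nan_fraction
    ]


# Hypothetical tile in which the first band is entirely absent.
tile = np.ones((3, 4, 4))
tile[0] = np.nan
missing = find_missing_bands(tile, ["VV", "VH", "B2"])
print(missing)
```

An export test could fail, or at least log a warning, whenever the returned list is non-empty.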

@adebowaledaniel
Collaborator Author

Here are the outputs of the new model trained without S1: map (as expected, it's without the weird features) and metrics.
Let me know what you observe in the generated subset map. Should we continue with this model for the entire country?

I will create an issue regarding the missing S1 bands and also check the EO export log for any clues.

@adebowaledaniel
Collaborator Author

@hannah-rae Crop Mask + Postclassification processing: here

@hannah-rae
Contributor

@adebowaledaniel can you make the assets public?

@adebowaledaniel
Collaborator Author

Done @hannah-rae

@hannah-rae
Contributor

Loading is crashing for me. @cnakalembe will try loading it and do the expert sign-off.

@cnakalembe

I reviewed the map. I think the next step is manual cleanup to remove obvious features like roads; I've seen some mines too. We could develop some clear guidance for this, and I think Diana can do it in QGIS/ArcGIS.

@hannah-rae hannah-rae assigned hannah-rae and unassigned cnakalembe Aug 28, 2023
@hannah-rae
Contributor

hannah-rae commented Aug 28, 2023

@hannah-rae will make GEE script in repo to export ensemble map for Zambia (and other future countries)

update: should be addressed by notebook/GEE app created by @ivanzvonkov in #315

@hannah-rae
Contributor

@ivanzvonkov will make this map and update intercomparison re #346

@hannah-rae hannah-rae assigned ivanzvonkov and unassigned hannah-rae Sep 11, 2023
@ivanzvonkov
Collaborator

ivanzvonkov commented Sep 11, 2023

After running the intercomparison on Zambia with the full evaluation set (validation and test), the ensemble ties the GLAD map.
image

There are also not that many points to begin with, because many of them were sampled outside of Zambia's boundaries (old CEO project).
Should we proceed with just exporting the GLAD map?
@hannah-rae

@hannah-rae hannah-rae assigned cnakalembe and unassigned ivanzvonkov Sep 18, 2023
@hannah-rae
Contributor

Next step: @cnakalembe to check whether the GLAD map looks OK and suits the use case, or whether there is some reason to export the ensemble map instead.

@cnakalembe

GLAD map is okay for the use case!

@hannah-rae
Contributor

Next step: @ivanzvonkov to run the export code for the GLAD map

@hannah-rae hannah-rae assigned ivanzvonkov and unassigned cnakalembe Sep 25, 2023
@ivanzvonkov
Collaborator

Shared the exported map on Slack.

4 participants