Freshdesk: #220943 & #222192 SA Museum data load: 18/12/2024 #1147

Open
timhicks-ala opened this issue Dec 18, 2024 · 2 comments

Comments

timhicks-ala commented Dec 18, 2024

From https://support.ehelp.edu.au/a/tickets/220943:

"There are two files, one for our tissues data and one for everything else. They are both intended as complete uploads to replace the appropriate existing data."

sam ala 2024-12-18.zip

Additional record removal from: https://support.ehelp.edu.au/a/tickets/222192

This record needs to be removed:

https://biocache.ala.org.au/occurrences/7a20dc1a-30ce-457c-b03c-18db2b56f8c7

@timhicks-ala
Author

Note that they have requested R71850 be removed from the dataset: https://support.ehelp.edu.au/a/tickets/222192

@rosemaryjoconnor rosemaryjoconnor changed the title SA Museum data load: 18/12/2024 Freshdesk: #220943 & #222192 SA Museum data load: 18/12/2024 Jan 7, 2025
@rosemaryjoconnor
Contributor

rosemaryjoconnor commented Jan 7, 2025

07/01/2025

Code: https://github.com/AtlasOfLivingAustralia/databox/blob/master/data-resources/dr346-SAMA/sourceCode/dr346-SAMA-update.py

The data has stray backslashes and double-quote characters in the Locality field that break loading the CSV files into a dataframe. Some cells also contain spurious newlines.
The code removes all of these before processing.
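
For anyone reading without the script open, the cleanup described above could look roughly like the sketch below. This is not the code in dr346-SAMA-update.py: the function name `clean_raw_text`, the tab delimiter, and the rule of rejoining lines until the delimiter count matches the header are all assumptions made for illustration.

```python
import csv
import io


def clean_raw_text(raw: str, delimiter: str = "\t") -> str:
    """Strip backslashes and double quotes, then rejoin rows split by stray newlines.

    Hypothetical helper. Assumes the first line is a complete header row and
    that, once quotes are removed, no cell contains the delimiter.
    """
    # Drop the characters that confuse the CSV parser.
    text = raw.replace("\\", "").replace('"', "")
    lines = text.splitlines()
    if not lines:
        return ""
    expected = lines[0].count(delimiter)  # delimiters in a complete row
    rows, buffer = [], ""
    for line in lines:
        # A non-empty buffer means the previous line ended inside a cell.
        buffer = f"{buffer} {line}" if buffer else line
        if buffer.count(delimiter) >= expected:
            rows.append(buffer)
            buffer = ""
    if buffer:  # keep a trailing incomplete fragment rather than drop it silently
        rows.append(buffer)
    return "\n".join(rows)


def parse_rows(raw: str, delimiter: str = "\t") -> list[list[str]]:
    """Parse the cleaned text with quoting disabled, since quotes were removed."""
    cleaned = clean_raw_text(raw, delimiter)
    reader = csv.reader(io.StringIO(cleaned), delimiter=delimiter, quoting=csv.QUOTE_NONE)
    return list(reader)
```

One known limit of this heuristic: a stray newline inside the last cell of a row is not detected, because the first fragment already has the full delimiter count.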

  • Removed record with catalogNumber: R71850
  • Clean and merge datasets.
  • Test in databox
  • Load production
  • Index check
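
A minimal sketch of the remove-and-merge steps above, assuming both files have already been read into lists of dicts keyed by column name and that "merge" means simple concatenation. The function name `merge_and_drop` is hypothetical, and which of the two files holds R71850 is not stated in this issue.

```python
def merge_and_drop(main_rows, tissue_rows, remove_ids=frozenset({"R71850"})):
    """Concatenate the two datasets and drop records whose catalogNumber is listed.

    Hypothetical helper: the real script may merge differently (for example on a key).
    """
    return [
        row
        for row in list(main_rows) + list(tissue_rows)
        if row.get("catalogNumber") not in remove_ids
    ]
```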

Record counts

  • Current Prod: 565,382
  • New: 570,362
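
As a quick sanity check on the counts above, the change between production and the new load can be computed as below. The function name `count_change` is made up for this sketch.

```python
def count_change(current: int, new: int) -> tuple[int, float]:
    """Return the absolute and percentage change in record count."""
    delta = new - current
    return delta, 100.0 * delta / current


delta, pct = count_change(565_382, 570_362)
print(f"{delta:+,} records ({pct:+.2f}%)")  # prints "+4,980 records (+0.88%)"
```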
