Conversation

emiglietta
Contributor

add clarification on how to verify correct transfer to staging bucket

Member

@ErinWeisbart ErinWeisbart left a comment

Thanks for adding this! I simplified and clarified a bit through suggestions.

- Total file size on origin: `du -sh --apparent-size PATH/TO/YOUR/FILES`
- Number of files on origin: `find PATH/TO/YOUR/FILES -type f | wc -l`
with
- Total object size on the Staging bucket: `aws s3 ls s3://staging-cellpainting-gallery/$PROJECT_PREFIX/$SOURCE/$YOUR_FILES --summarize --human-readable --recursive | grep Total`
Member

Suggested change
- Total object size on the Staging bucket: `aws s3 ls s3://staging-cellpainting-gallery/$PROJECT_PREFIX/$SOURCE/$YOUR_FILES --summarize --human-readable --recursive | grep Total`

@@ -105,4 +105,15 @@ Run your transfer commands to `staging-cellpainting-gallery`.

Once the transfers are complete, either you (Imaging Platform internal) or your data champion (if external) must verify the data transferred to `staging-cellpainting-gallery` is complete.
(Currently this is done manually, though this will be programatic in the future.)

To verify if the transfer was succesful you (Imaging Platform internal) can compare:
Member

Suggested change
To verify if the transfer was succesful you (Imaging Platform internal) can compare:
To verify if the transfer was successful, compare object counts between your source and destination.
Because of differences in the way file sizes are calculated between file systems and object storage, file size is not a reliable metric for comparison.

@@ -105,4 +105,15 @@ Run your transfer commands to `staging-cellpainting-gallery`.

Once the transfers are complete, either you (Imaging Platform internal) or your data champion (if external) must verify the data transferred to `staging-cellpainting-gallery` is complete.
(Currently this is done manually, though this will be programatic in the future.)

To verify if the transfer was succesful you (Imaging Platform internal) can compare:
- Total file size on origin: `du -sh --apparent-size PATH/TO/YOUR/FILES`
Member

Suggested change
- Total file size on origin: `du -sh --apparent-size PATH/TO/YOUR/FILES`


To verify if the transfer was succesful you (Imaging Platform internal) can compare:
- Total file size on origin: `du -sh --apparent-size PATH/TO/YOUR/FILES`
- Number of files on origin: `find PATH/TO/YOUR/FILES -type f | wc -l`
Member

Suggested change
- Number of files on origin: `find PATH/TO/YOUR/FILES -type f | wc -l`
- Number of files on origin (for a file system): `find PATH/TO/YOUR/FILES -type f | wc -l`
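
The count comparison suggested in this review could be scripted roughly as follows. This is only a sketch, not part of the PR: the helper names `count_local_files`, `parse_total_objects`, and `compare_counts` are hypothetical, and it assumes the source is a regular file system and that `aws s3 ls --recursive --summarize` ends its output with a `Total Objects:` line.

```shell
#!/usr/bin/env bash

# Count regular files under a local directory
# (same approach as the find | wc -l command in the docs).
count_local_files() {
  find "$1" -type f | wc -l | tr -d '[:space:]'
}

# Read the output of `aws s3 ls ... --recursive --summarize` on stdin
# and print the number from its "Total Objects:" summary line.
parse_total_objects() {
  awk '/Total Objects:/ {print $3}'
}

# Compare two counts; return 0 when they match, 1 otherwise.
compare_counts() {
  local src="$1" dst="$2"
  if [ "$src" -eq "$dst" ]; then
    echo "OK: $src files on source, $dst objects on destination"
  else
    echo "MISMATCH: $src files on source, $dst objects on destination"
    return 1
  fi
}

# Example usage (placeholders as in the docs; needs configured AWS credentials):
# src=$(count_local_files PATH/TO/YOUR/FILES)
# dst=$(aws s3 ls "s3://staging-cellpainting-gallery/$PROJECT_PREFIX/$SOURCE/$YOUR_FILES" \
#         --recursive --summarize | parse_total_objects)
# compare_counts "$src" "$dst"
```

Note that zero-byte "folder" placeholder objects, if any exist under the prefix, would be counted on the S3 side but not by `find -type f`, so a small mismatch may need a manual look before concluding the transfer failed.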

2 participants