Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Import validation updates #1227

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Conversation

vish-cs
Copy link
Contributor

@vish-cs vish-cs commented Feb 4, 2025

  • Invoke dc import tool to generate resolved mcf
  • Upload validation output files to GCS
  • Validation status check for upload
  • Default executor type: CLOUD_RUN

@vish-cs vish-cs requested a review from ajaits February 4, 2025 13:55
@vish-cs vish-cs self-assigned this Feb 4, 2025
@vish-cs vish-cs force-pushed the executor branch 2 times, most recently from 63babc5 to 222cd59 Compare February 9, 2025 06:31
output_file = open(self.validation_output, mode='w', encoding='utf-8')
output_file.write('test,status,message\n')
status = True
for config in self.validation_config:
result = self._run_validation(config)
output_file.write(
f'{result.name},{result.status},{result.message}\n')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is a csv row. If yes, comma in result.message might interfere with parsing later. Perhaps consider usign a csv lib. An example code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are yet to work on refining the messages. They are currently dummy messages. I have added a TODO to switch to a CSV lib as a follow up when we do that.

config_file)
import_prefix, 'validation')
current_data_path = os.path.join(validation_output_path, '*.mcf')
previous_data_path = latest_version + f'/{import_prefix}/validation/*.mcf'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previous data path might not contain mcfs (for already onbaorded auto-imports). Should we skip if mcfs are missing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are uploading mcfs as part of the import output upload step now. In that case, why shall we not find mcfs in GCS?

- Invoke dc import tool to generate resolved mcf
- Upload validation output files to GCS
- Validation status check for upload
- Default executor type: CLOUD_RUN
)
return
# blob.download_to_filename(previous_data_path)
'Skipping validation due to missing latest version.')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can the summary_report.csv be validated without the differ for the first run?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will require some logic to only run certain validations (and skip the differ ones). We can do that as a follow up as needed.

self.config.import_metadata_mcf_filename))
else:
logging.error(
"Skipping latest version update due to validation failure.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be better to upload output to GCS even if validation fails to help with debugging.
We can skip updating latest_version.txt if validation fails so the next diff is with the previous valid output.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is the case actually. We are uploading validation output files as the last step in _invoke_import_validation but not updating latest_version in case of failure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants