Implement user event import from Bigquery #351

Merged 1 commit, Nov 28, 2024

@csutter csutter commented Nov 27, 2024

Rewrite the existing Python Google Cloud function as a Ruby service class. This receives an event type and a date, and makes the necessary request to Discovery Engine to import BigQuery user event data.

  • Add DiscoveryEngine::UserEvents::Import service to implement importing of user events from BigQuery along the lines of the original Python code
  • Add GOOGLE_CLOUD_PROJECT_ID app configuration
  • Add a rake task to call the service
  • Ensure tests require Google API namespace for stubbing

see the vertex_events_push function on the infrastructure repo: https://github.com/alphagov/search-v2-infrastructure/blob/2991588b5dae11a20fc80432393fd134c7acb53c/terraform/environment/files/vertex_events_push/main.py
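
As a rough illustration of the shape described above (not the code merged in this PR), a service of this kind could look something like the sketch below. The class name `UserEventImportSketch`, the dataset and table naming, the `parent` path, and the injected `client` are all assumptions made for illustration; the real `DiscoveryEngine::UserEvents::Import` class may well differ. The client only needs to respond to `import_user_events`, and a fake stands in for it here so the example runs without credentials.

```ruby
require "date"

# Hypothetical sketch of a service that imports one day's user events of a
# given type from BigQuery into Discovery Engine. Names and paths below are
# illustrative assumptions, not taken from the PR.
class UserEventImportSketch
  def initialize(event_type, date, client:, project_id: ENV.fetch("GOOGLE_CLOUD_PROJECT_ID", "example-project"))
    @event_type = event_type
    @date = date
    @client = client
    @project_id = project_id
  end

  # Send the import request through the injected client.
  def call
    @client.import_user_events(request)
  end

  # Build the request payload: which BigQuery table to read and which
  # partition (day) of it to import.
  def request
    {
      # Assumed data store path
      parent: "projects/#{@project_id}/locations/global/collections/default_collection/dataStores/example_datastore",
      bigquery_source: {
        project_id: @project_id,
        dataset_id: "example_dataset",     # assumed dataset name
        table_id: "#{@event_type}-event",  # assumed table naming
        partition_date: { year: @date.year, month: @date.month, day: @date.day },
      },
    }
  end
end

# Stand-in for the real API client so the sketch runs offline.
class FakeClient
  attr_reader :requests

  def initialize
    @requests = []
  end

  def import_user_events(request)
    @requests << request
    :ok
  end
end

client = FakeClient.new
UserEventImportSketch.new("search", Date.new(2024, 11, 27), client: client, project_id: "example-project").call
```

Injecting the client keeps the request-building logic testable in isolation, which is roughly why the tests in this PR need the Google API namespace available for stubbing.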

@csutter csutter force-pushed the user-event-import branch 2 times, most recently from bf7626d to 9de4f1a Compare November 27, 2024 11:29
csutter added a commit to alphagov/govuk-helm-charts that referenced this pull request Nov 27, 2024
This adds two scheduled tasks to run the two user events import Rake
tasks, in integration only for now to verify it all works properly.

These will eventually replace the following GCP Cloud Scheduler runs:
https://github.com/alphagov/search-v2-infrastructure/blob/main/terraform/environment/events_ingestion.tf#L366-L484

See alphagov/search-api-v2#351

jackbot commented Nov 28, 2024

Nice one. Code looks good to me. 👍

Is the idea that each day we'll run the rake task for the intraday events multiple times a day (using rake user_events:import_intraday_events) then once per day we'll import the previous day's events (using rake user_events:import_yesterdays_events)?

Review comment on spec/spec_helper.rb (outdated, resolved)
Rewrite the existing Python Google Cloud function as a Ruby service
class. This receives an event type and a date, and makes the necessary
request to Discovery Engine to import BigQuery user event data.

- Add `DiscoveryEngine::UserEvents::Import` service to implement
  importing of user events from BigQuery along the lines of the original
  Python code
- Add `GOOGLE_CLOUD_PROJECT_ID` app configuration
- Add a rake task to call the service
- Ensure tests `require` Google API namespace for stubbing

see the `vertex_events_push` function on the infrastructure repo:
https://github.com/alphagov/search-v2-infrastructure/blob/2991588b5dae11a20fc80432393fd134c7acb53c/terraform/environment/files/vertex_events_push/main.py

Co-Authored-By: Chae Cramb <[email protected]>

csutter commented Nov 28, 2024

> Is the idea that each day we'll run the rake task for the intraday events multiple times a day (using rake user_events:import_intraday_events) then once per day we'll import the previous day's events (using rake user_events:import_yesterdays_events)?

Spot on – I've already drafted scheduled tasks for this in integration: https://github.com/alphagov/govuk-helm-charts/pull/2803/files

There is also a third Rake task that imports events for a specific date. This has occasionally come up as a manual job (for example, if the import fails for some reason), so this way someone can run the Rake task by hand in an environment.
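
To make the three entry points concrete, the date each task hands to the import could be resolved roughly as follows. This is only a sketch under assumptions: `import_date_for` and the `:specific_date` label are hypothetical (the name of the third Rake task is not given in this thread), and treating the intraday task as targeting the current day is an inference from its name.

```ruby
require "date"

# Hypothetical helper: pick the date to import for each kind of run.
def import_date_for(task, today: Date.today, date_arg: nil)
  case task
  when :intraday
    today # run several times a day for the current day
  when :yesterday
    today - 1 # run once a day for the previous day
  when :specific_date
    # Manual re-run, e.g. after a failed import
    raise ArgumentError, "a date is required" if date_arg.nil?

    Date.iso8601(date_arg)
  else
    raise ArgumentError, "unknown task: #{task}"
  end
end

today = Date.new(2024, 11, 28)
puts import_date_for(:intraday, today: today)
puts import_date_for(:yesterday, today: today)
puts import_date_for(:specific_date, date_arg: "2024-11-20")
```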


@jackbot jackbot left a comment


Lovely stuff

@csutter csutter merged commit 3d69d7c into main Nov 28, 2024
8 checks passed
@csutter csutter deleted the user-event-import branch November 28, 2024 11:37
chaecramb added a commit to alphagov/govuk-helm-charts that referenced this pull request Dec 3, 2024
This adds two scheduled tasks to run the two user events import Rake
tasks. It has already been deployed and tested in integration.

These replace the following GCP Cloud Scheduler runs:
https://github.com/alphagov/search-v2-infrastructure/blob/main/terraform/environment/events_ingestion.tf#L366-L484

See alphagov/search-api-v2#351
chaecramb added a commit to alphagov/govuk-helm-charts that referenced this pull request Dec 3, 2024
chaecramb added a commit to alphagov/govuk-helm-charts that referenced this pull request Dec 4, 2024
chaecramb added a commit to alphagov/govuk-helm-charts that referenced this pull request Dec 5, 2024
chaecramb added a commit to alphagov/search-v2-infrastructure that referenced this pull request Dec 5, 2024
The functionality of these Google Cloud Functions has been integrated
directly into the Search API repo via two rake tasks
alphagov/search-api-v2#351.
chaecramb added a commit to alphagov/search-v2-infrastructure that referenced this pull request Dec 13, 2024
chaecramb added a commit to alphagov/search-v2-infrastructure that referenced this pull request Dec 13, 2024
The functionality of this Google Cloud Function has been integrated
directly into the Search API repo via two rake tasks
alphagov/search-api-v2#351.

This commit removes the function itself and related Terraform
scheduling.