@erikamov
Contributor

@erikamov erikamov commented Oct 22, 2025

Description

This PR adds documentation on how to run download_schedule_feeds from the command line.

It also includes a few small changes:

  • The log message reporting how long the process took to run was wrong: INFO - took 2 minutes ago to process 272 configs. Replaced humanize.naturaltime with humanize.naturaldelta so the message no longer includes the word "ago".
  • Using print instead of logging.info so that progress is visible in the terminal
  • Commenting out xcom_push (temporarily disabled) so that no error message is sent when running from the command line

[#4354]

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

How has this been tested?

Tested locally by running the script against the cal-itp-data-infra project:

GOOGLE_CLOUD_PROJECT=cal-itp-data-infra CALITP_BUCKET__GTFS_DOWNLOAD_CONFIG="gs://calitp-gtfs-download-config" CALITP_BUCKET__GTFS_SCHEDULE_RAW="gs://calitp-gtfs-schedule-raw-v2" poetry run python download_schedule_feeds.py

and against the cal-itp-data-infra-staging project:

GOOGLE_CLOUD_PROJECT=cal-itp-data-infra-staging CALITP_BUCKET__GTFS_DOWNLOAD_CONFIG="gs://calitp-staging-gtfs-download-config" CALITP_BUCKET__GTFS_SCHEDULE_RAW="gs://calitp-staging-gtfs-schedule-raw-v2" poetry run python download_schedule_feeds.py
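
Since the commands above depend on three environment variables, a small pre-flight check can catch a missing or malformed value before any download starts. The sketch below is an assumption: `check_environment` and `REQUIRED_VARS` are hypothetical names, the script itself may validate its configuration differently, and the `gs://` check simply mirrors the values used in the commands.

```python
import os
from typing import Mapping

# Variable names copied from the test commands; whether the script
# requires all three is an assumption.
REQUIRED_VARS = (
    "GOOGLE_CLOUD_PROJECT",
    "CALITP_BUCKET__GTFS_DOWNLOAD_CONFIG",
    "CALITP_BUCKET__GTFS_SCHEDULE_RAW",
)


def check_environment(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return a list of problems; an empty list means the environment looks usable."""
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name, "")
        if not value:
            problems.append(f"{name} is not set")
        elif name.startswith("CALITP_BUCKET__") and not value.startswith("gs://"):
            # Bucket variables in the commands are all gs:// URIs.
            problems.append(f"{name} should be a gs:// URI, got {value!r}")
    return problems


if __name__ == "__main__":
    # Print each problem on its own line; prints nothing when all is well.
    for problem in check_environment():
        print(problem)
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise without touching the real environment.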

Post-merge follow-ups

  • No action required
  • Actions required (specified below)

@github-actions

github-actions bot commented Oct 22, 2025

Terraform plan in iac/cal-itp-data-infra/airflow/us

Plan: 0 to add, 4 to change, 0 to destroy.
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
!~  update in-place

Terraform will perform the following actions:

  # google_storage_bucket_object.calitp-composer["dags/download_gtfs_schedule_v2/README.md"] will be updated in-place
!~  resource "google_storage_bucket_object" "calitp-composer" {
!~      crc32c              = "KHF6mA==" -> (known after apply)
!~      detect_md5hash      = "3HjxKRPPTT8Y84NgGgLIAA==" -> "different hash"
!~      generation          = 1751416670936608 -> (known after apply)
        id                  = "calitp-composer-dags/download_gtfs_schedule_v2/README.md"
!~      md5hash             = "3HjxKRPPTT8Y84NgGgLIAA==" -> (known after apply)
        name                = "dags/download_gtfs_schedule_v2/README.md"
#        (17 unchanged attributes hidden)
    }

  # google_storage_bucket_object.calitp-composer["dags/download_gtfs_schedule_v2/download_schedule_feeds.py"] will be updated in-place
!~  resource "google_storage_bucket_object" "calitp-composer" {
!~      crc32c              = "3CFrIg==" -> (known after apply)
!~      detect_md5hash      = "tuDGKx58gvxzc6Anuo4Sxg==" -> "different hash"
!~      generation          = 1751416672559844 -> (known after apply)
        id                  = "calitp-composer-dags/download_gtfs_schedule_v2/download_schedule_feeds.py"
!~      md5hash             = "tuDGKx58gvxzc6Anuo4Sxg==" -> (known after apply)
        name                = "dags/download_gtfs_schedule_v2/download_schedule_feeds.py"
#        (17 unchanged attributes hidden)
    }

  # google_storage_bucket_object.calitp-composer-catalog will be updated in-place
!~  resource "google_storage_bucket_object" "calitp-composer-catalog" {
!~      content             = (sensitive value)
!~      crc32c              = "HFLpFA==" -> (known after apply)
!~      detect_md5hash      = "AxlvqGlF0XQytL7X6uBrow==" -> "different hash"
!~      generation          = 1761674676592506 -> (known after apply)
        id                  = "calitp-composer-data/warehouse/target/catalog.json"
!~      md5hash             = "AxlvqGlF0XQytL7X6uBrow==" -> (known after apply)
        name                = "data/warehouse/target/catalog.json"
#        (16 unchanged attributes hidden)
    }

  # google_storage_bucket_object.calitp-composer-manifest will be updated in-place
!~  resource "google_storage_bucket_object" "calitp-composer-manifest" {
!~      content             = (sensitive value)
!~      crc32c              = "JvfK6A==" -> (known after apply)
!~      detect_md5hash      = "/ncG1pOau+Ob3onRgEJBiQ==" -> "different hash"
!~      generation          = 1761674678163486 -> (known after apply)
        id                  = "calitp-composer-data/warehouse/target/manifest.json"
!~      md5hash             = "/ncG1pOau+Ob3onRgEJBiQ==" -> (known after apply)
        name                = "data/warehouse/target/manifest.json"
#        (16 unchanged attributes hidden)
    }

Plan: 0 to add, 4 to change, 0 to destroy.

📝 Plan generated in Plan Terraform for Warehouse and DAG changes #885

@github-actions

github-actions bot commented Oct 22, 2025

Terraform plan in iac/cal-itp-data-infra-staging/airflow/us

Plan: 0 to add, 2 to change, 0 to destroy.
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
!~  update in-place

Terraform will perform the following actions:

  # google_storage_bucket_object.calitp-staging-composer["dags/download_gtfs_schedule_v2/README.md"] will be updated in-place
!~  resource "google_storage_bucket_object" "calitp-staging-composer" {
!~      crc32c              = "KHF6mA==" -> (known after apply)
!~      detect_md5hash      = "3HjxKRPPTT8Y84NgGgLIAA==" -> "different hash"
!~      generation          = 1749661095007111 -> (known after apply)
        id                  = "calitp-staging-composer-dags/download_gtfs_schedule_v2/README.md"
!~      md5hash             = "3HjxKRPPTT8Y84NgGgLIAA==" -> (known after apply)
        name                = "dags/download_gtfs_schedule_v2/README.md"
#        (17 unchanged attributes hidden)
    }

  # google_storage_bucket_object.calitp-staging-composer["dags/download_gtfs_schedule_v2/download_schedule_feeds.py"] will be updated in-place
!~  resource "google_storage_bucket_object" "calitp-staging-composer" {
!~      crc32c              = "3CFrIg==" -> (known after apply)
!~      detect_md5hash      = "tuDGKx58gvxzc6Anuo4Sxg==" -> "different hash"
!~      generation          = 1749661091724383 -> (known after apply)
        id                  = "calitp-staging-composer-dags/download_gtfs_schedule_v2/download_schedule_feeds.py"
!~      md5hash             = "tuDGKx58gvxzc6Anuo4Sxg==" -> (known after apply)
        name                = "dags/download_gtfs_schedule_v2/download_schedule_feeds.py"
#        (17 unchanged attributes hidden)
    }

Plan: 0 to add, 2 to change, 0 to destroy.

📝 Plan generated in Plan Terraform for Warehouse and DAG changes #885

@vevetron
Contributor

We should document the instructions for running this in prod as well.

@erikamov erikamov force-pushed the mov/4354-run-download-schedule-feed branch 2 times, most recently from 855d988 to ffe62e3 Compare October 23, 2025 00:42
@github-actions

github-actions bot commented Oct 23, 2025

Terraform plan in iac/cal-itp-data-infra/composer/us

No changes. Your infrastructure matches the configuration.

Terraform has compared your real infrastructure against your configuration
and found no differences, so no changes are needed.

📝 Plan generated in Plan Terraform for Warehouse and DAG changes #885

@erikamov erikamov force-pushed the mov/4354-run-download-schedule-feed branch 6 times, most recently from 23ecb22 to f07e267 Compare October 27, 2025 18:33
@erikamov erikamov force-pushed the mov/4354-run-download-schedule-feed branch from f07e267 to 1ebff49 Compare October 29, 2025 05:23
3 participants