pg_cleanup fails with psycopg2 exception if used with --object-store-id #19880

Closed
bernt-matthias opened this issue Mar 24, 2025 · 3 comments · Fixed by #19925
Labels: area/database (Galaxy's database or data access layer), kind/bug

Comments

@bernt-matthias
Contributor

Describe the bug

When pgcleanup.py is run with --object-store-id "files-24.1-scratch", disk usage is recalculated with the following statement:

UPDATE galaxy_user SET disk_usage = (
    WITH per_user_histories AS (
        SELECT id
        FROM history
        WHERE user_id = %(id)s
        AND NOT purged
    ),
    per_hist_hdas AS (
        SELECT DISTINCT dataset_id
        FROM history_dataset_association
        WHERE NOT purged
        AND history_id IN (SELECT id FROM per_user_histories)
    )
    SELECT COALESCE(SUM(COALESCE(dataset.total_size, dataset.file_size, 0)), 0)
    FROM dataset
    LEFT OUTER JOIN library_dataset_dataset_association ON dataset.id = library_dataset_dataset_association.dataset_id
    WHERE dataset.id IN (SELECT dataset_id FROM per_hist_hdas)
    AND library_dataset_dataset_association.id IS NULL
    AND (dataset.object_store_id NOT LIKE 'user_objects://%' OR dataset.object_store_id IS NULL)
    AND ( dataset.object_store_id IS NULL  OR  dataset.object_store_id NOT IN %(exclude_object_store_ids)s )
)
WHERE id = %(id)s

Called with args={'id': 1, 'exclude_object_store_ids': ('files-24.1', 'files-24.1-scratch', 'legacy')}

This fails with

  File "/gpfs1/data/galaxy_server/galaxy-dev/./scripts/cleanup_datasets/pgcleanup.py", line 1328, in _execute
    sql_str = cur.mogrify(sql, args).decode("utf-8")
              ^^^^^^^^^^^^^^^^^^^^^^
psycopg2.ProgrammingError: argument formats can't be mixed
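
A possible reading of this error, offered as an assumption rather than a confirmed diagnosis: when parameters are passed, psycopg2 treats every unescaped % in the query as the start of a placeholder, so the bare % in LIKE 'user_objects://%' would be parsed as a positional placeholder alongside the named %(id)s ones. The psycopg2 documentation states that a literal % has to be written as %% when parameters are used. The sketch below shows that escaping on a shortened version of the query. The helper escape_bare_percents is hypothetical (it is not part of Galaxy or psycopg2), and this is not necessarily the change made in the eventual fix.

```python
import re

# One token per match: an already escaped "%%", a named placeholder
# "%(name)s", or a bare "%" that psycopg2 would try to read as a placeholder.
_TOKEN = re.compile(r"%%|%\(\w+\)s|%")


def escape_bare_percents(sql: str) -> str:
    """Double every '%' that is neither '%%' nor part of '%(name)s'.

    Hypothetical helper for illustration only. It assumes the query uses
    named placeholders exclusively; a positional '%s' would be escaped too.
    """
    def _fix(match: re.Match) -> str:
        token = match.group(0)
        return "%%" if token == "%" else token

    return _TOKEN.sub(_fix, sql)


# Shortened stand-in for the disk usage statement from this report.
query = (
    "SELECT id FROM dataset "
    "WHERE id = %(id)s "
    "AND object_store_id NOT LIKE 'user_objects://%'"
)

escaped = escape_bare_percents(query)
print(escaped)
```

With the pattern written as 'user_objects://%%', the only placeholders left in the string are the named ones, so a call such as cur.mogrify(escaped, {"id": 1}) should no longer see mixed argument formats.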

Galaxy Version and/or server at which you observed the bug
Galaxy Version: 24.1
Commit: 4607d88

Additional context

My object store config:

  object_store_config:
    type: distributed
    search_for_missing: true
    backends:
      - id: "files-24.1"
        type: disk
        device: "files-24.1"
        weight: 1
        store_by: uuid
        allow_selection: true
        private: false
        name: "Permanent Storage"
        description: Data in Permanent Storage is not deleted automatically. Default quota is {{ galaxy_default_quota }}. Data that has been marked as deleted will be purged after {{ galaxy_maxage }} days. Histories of users that left the UFZ will be archived and purged.
        files_dir: database/files_24.1/
        badges:
          - type: not_backed_up
      - id: "files-24.1-scratch"
        type: disk
        device: "files-24.1"
        weight: 0
        store_by: uuid
        allow_selection: true
        private: true
        quota:
          source: scratch
        name: "Scratch: {{ galaxy_scratch_maxage }}-day storage"
        description: "Data in scratch storage is scheduled for automatic removal after {{ galaxy_scratch_maxage }} days. Default quota is {{ galaxy_scratch_default_quota }}."
        files_dir: database/files_24.1/
        badges:
          - type: not_backed_up
          - type: short_term
            message: "Data stored here is scheduled for removal after 30 days"
      - id: legacy
        type: disk
        store_by: id
        weight: 0
        files_dir: database/files/

@mvdbeek added the kind/bug and area/database labels Mar 25, 2025

@jdavcs
Member

jdavcs commented Mar 26, 2025

Reproduced, although indirectly. @bernt-matthias can you please provide the full command you used to call the script and the complete error traceback?

@bernt-matthias
Contributor Author

Here are two example commands:

  • ./scripts/cleanup_datasets/pgcleanup.py --dry-run -o 1 --object-store-id files-24.1-scratch -l /var/log/galaxy/cleanup-galaxy-dev purge_error_hdas
  • ./scripts/cleanup_datasets/pgcleanup.py --dry-run -o 1 --object-store-id files-24.1-scratch -l /var/log/galaxy/cleanup-galaxy-dev purge_old_hdas

I also tested without --dry-run.

The traceback:

2025-03-27 10:12:03,749 INFO  recalculate_disk_usage(): Recalculating disk usage for users whose data were purged
==== Log closed: 2025-03-27T10:12:03.749653 ============================
2025-03-27 10:12:03,749 ERROR <module>(): Caught exception in run sequence:
Traceback (most recent call last):
  File "/gpfs1/data/galaxy_server/galaxy-dev/./scripts/cleanup_datasets/pgcleanup.py", line 1390, in <module>
    app.run()
  File "/gpfs1/data/galaxy_server/galaxy-dev/./scripts/cleanup_datasets/pgcleanup.py", line 1383, in run
    self._run_action(action)
  File "/gpfs1/data/galaxy_server/galaxy-dev/./scripts/cleanup_datasets/pgcleanup.py", line 1374, in _run_action
    action.handle_results(cur)
  File "/gpfs1/data/galaxy_server/galaxy-dev/./scripts/cleanup_datasets/pgcleanup.py", line 247, in handle_results
    method()
  File "/gpfs1/data/galaxy_server/galaxy-dev/./scripts/cleanup_datasets/pgcleanup.py", line 410, in recalculate_disk_usage
    self._update(sql, new_args, add_event=False)
  File "/gpfs1/data/galaxy_server/galaxy-dev/./scripts/cleanup_datasets/pgcleanup.py", line 1355, in _update
    cur = self._execute(sql, args)
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/gpfs1/data/galaxy_server/galaxy-dev/./scripts/cleanup_datasets/pgcleanup.py", line 1326, in _execute
    sql_str = cur.mogrify(sql, args).decode("utf-8")
              ^^^^^^^^^^^^^^^^^^^^^^
psycopg2.ProgrammingError: argument formats can't be mixed

@jdavcs
Member

jdavcs commented Mar 31, 2025

Fixed in #19925

@jdavcs closed this as completed Mar 31, 2025