add query to remove user ids from filepaths #1259

Open
wants to merge 3 commits into
base: main
20 changes: 20 additions & 0 deletions postgresql/migratedb.d/17.sql
@@ -0,0 +1,20 @@

DO
$$
DECLARE
  -- The version we know how to do migration from, at the end of a successful migration
  -- we will no longer be at this version.
  sourcever INTEGER := 16;
  changes VARCHAR := 'Remove user ids from filepaths';
BEGIN
  IF (select max(version) from sda.dbschema_version) = sourcever then
    RAISE NOTICE 'Changes: %', changes;

    UPDATE sda.files
    SET submission_file_path = regexp_replace(submission_file_path, '^[^/]*/', '');
Collaborator

This solution is a bit harsh, since it will remove the first part of the path even if it is not a username.

Suggested change
- SET submission_file_path = regexp_replace(submission_file_path, '^[^/]*/', '');
+ SET submission_file_path = REPLACE(submission_file_path, CONCAT(REPLACE(submission_user, '@', '_'),'/'), '');
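
To make the difference between the two expressions concrete, here is a small sketch that runs both on made-up rows. The user names and paths are hypothetical and not taken from the database; only the column names and the two expressions come from this PR.

```sql
-- Hypothetical sample rows for illustration only.
SELECT submission_user,
       submission_file_path,
       -- Original expression: drops everything up to and including the first '/'.
       regexp_replace(submission_file_path, '^[^/]*/', '') AS regex_result,
       -- Suggested expression: drops the user-derived prefix wherever it occurs.
       REPLACE(submission_file_path,
               CONCAT(REPLACE(submission_user, '@', '_'), '/'),
               '') AS replace_result
FROM (VALUES
    ('alice@example.org', 'alice_example.org/run1/a.c4gh'),  -- regex: run1/a.c4gh  replace: run1/a.c4gh
    ('alice@example.org', 'run1/a.c4gh'),                    -- regex: a.c4gh       replace: run1/a.c4gh
    ('bob',               'data/bob/b.c4gh')                 -- regex: bob/b.c4gh   replace: data/b.c4gh
) AS t(submission_user, submission_file_path);
```

The second row shows the concern with the regex: a path without a user prefix still loses its first directory. The third row shows that REPLACE is not anchored to the start of the path, so a user name followed by a slash further down the path would be removed as well.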

Contributor Author

@pahatz pahatz Feb 6, 2025

The reason I went this way is that we have some old entries in BP where the user is missing the @domain suffix and the file path is missing the corresponding _domain suffix.
But the following seems to work for those cases too:

UPDATE sda.files
SET submission_file_path = CASE
	WHEN LENGTH(submission_file_path) = 40 THEN regexp_replace(submission_file_path, '[@_]/^[^/]', '')
	ELSE REPLACE(submission_file_path, CONCAT(REPLACE(submission_user, '@', '_'),'/'), '')
END;
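
One possible way to cover both the old and the new entries without touching unrelated paths is to strip the user-derived prefix only when the path actually starts with it. This is a sketch on hypothetical rows. It assumes that in the old entries the path prefix is exactly the user name without any domain part, which would need to be checked against the real data.

```sql
-- Sketch only: sample rows and the "prefix" alias are hypothetical.
SELECT submission_user,
       submission_file_path,
       CASE
           -- Strip the prefix only when the path begins with it.
           WHEN left(submission_file_path, length(prefix)) = prefix
               THEN substr(submission_file_path, length(prefix) + 1)
           ELSE submission_file_path
       END AS stripped_path
FROM (
    SELECT submission_user,
           submission_file_path,
           CONCAT(REPLACE(submission_user, '@', '_'), '/') AS prefix
    FROM (VALUES
        ('alice@example.org', 'alice_example.org/run1/a.c4gh'),  -- stripped_path: run1/a.c4gh
        ('bob',               'bob/b.c4gh'),                     -- old-style entry, stripped_path: b.c4gh
        ('bob',               'data/bob/b.c4gh')                 -- no leading prefix, left unchanged
    ) AS t(submission_user, submission_file_path)
) AS s;
```

Rows where the user has a domain but the path prefix does not would be left unchanged by this sketch and would still need separate handling.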

Collaborator

I think this maybe shouldn't even be a .sql file but rather a README, so the DB admin knows that it needs to be done. This is not something that should be allowed to happen automatically.
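
If this ends up as a manual step, a read-only preview could help the DB admin judge the impact before changing anything. The query below is only a sketch: it assumes the sda.files columns used in this PR and counts how many paths start with the prefix derived from submission_user. The "prefix" alias and the output column names are made up for illustration.

```sql
-- Read-only preview; nothing is modified.
SELECT count(*) AS total_rows,
       count(*) FILTER (
           WHERE left(submission_file_path, length(prefix)) = prefix
       ) AS rows_with_user_prefix
FROM (
    SELECT submission_file_path,
           CONCAT(REPLACE(submission_user, '@', '_'), '/') AS prefix
    FROM sda.files
) AS f;
```

A large gap between the two counts would suggest that many paths do not follow the expected user-prefix layout and should be inspected by hand first.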

Contributor Author

I agree, let's discuss this on a daily.


  ELSE
    RAISE NOTICE 'Schema migration from % to % does not apply now, skipping', sourcever, sourcever+1;
  END IF;
END
$$