Versioned personal data stored by PaperTrail left behind #11

@ahukkanen

Decidim stores versions of the user data through PaperTrail as part of the Decidim::Traceable module as defined here:
https://github.com/decidim/decidim/blob/bfc862f2308c3215c52b324e16de8680ed64fe16/decidim-core/lib/decidim/traceable.rb#L19

The data is stored in the object_changes column of the versions table in YAML format, e.g. as follows:

id:
-
- 221
email:
- ''
- testuser@example.org
encrypted_password:
- ''
- "$2a$11$R4LAq...L690yl8Sy7VUw6vB.6"
created_at:
-
- !ruby/object:ActiveSupport::TimeWithZone
  utc: &1 2023-03-01 09:59:33.930443106 Z
  zone: &2 !ruby/object:ActiveSupport::TimeZone
    name: Etc/UTC
  time: 2023-03-01 09:59:33.930443106 Z
updated_at:
- 
- !ruby/object:ActiveSupport::TimeWithZone
  utc: *1
  zone: *2
  time: 2023-03-01 09:59:33.930443106 Z
decidim_organization_id:
- 
- 1
confirmed_at:
- 
- !ruby/object:ActiveSupport::TimeWithZone
  utc: 2023-03-01 09:59:33.830071364 Z
  zone: *2
  time: 2023-03-01 09:59:33.830071364 Z
name:
- 
- Lisabeth Schiller 4 4 endr4
nickname:
- ''
- coleman_gleichner
type:
- 
- Decidim::User

You can find this data, for example, with the following command in the Rails console:

Decidim::User.all.sample.versions[0].object_changes
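To get a quick picture of which personal attributes a given version carries, the payload can be parsed and its keys checked against a list of sensitive attribute names. The sketch below is a minimal illustration in plain Ruby: `PERSONAL_ATTRIBUTES` and `personal_attributes_in` are hypothetical names, and the sample payload is a simplified form of the one above, without the serialized `ActiveSupport::TimeWithZone` objects that `YAML.safe_load` rejects by default.

```ruby
require "yaml"

# Hypothetical list of attributes treated as personal data in this sketch.
PERSONAL_ATTRIBUTES = %w(email encrypted_password name nickname).freeze

# Returns the personal attribute names present in an object_changes payload.
def personal_attributes_in(object_changes_yaml)
  changes = YAML.safe_load(object_changes_yaml) || {}
  changes.keys & PERSONAL_ATTRIBUTES
end

# Simplified sample payload: each attribute maps to [old_value, new_value].
sample = {
  "id" => [nil, 221],
  "email" => ["", "testuser@example.org"],
  "name" => [nil, "Lisabeth Schiller"],
  "nickname" => ["", "coleman_gleichner"],
  "decidim_organization_id" => [nil, 1]
}.to_yaml

puts personal_attributes_in(sample).inspect
```

Real payloads that contain serialized Ruby objects would need the relevant classes permitted explicitly when loading, so this check is only a starting point.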

I think this data should also be cleared for deleted users after a certain period of time, as it contains personal details, especially in the versions of the user-related models.

Note that this data can sometimes be useful for tracing changes to the user model, e.g. if an account is deleted by accident or if we need to investigate an issue with an account.

I would suggest a defined (preferably configurable) "cutoff" period after which the versioned user data is also deleted for deleted accounts.
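As a rough illustration of the cutoff rule, the following plain-Ruby sketch selects the versions whose user was deleted longer ago than the configured period. `VersionRow`, `versions_past_cutoff`, and the 90-day default are all hypothetical; the proposal does not specify a period, only that it should be configurable.

```ruby
# Hypothetical stand-in for a version row joined with the deletion time of its user.
VersionRow = Struct.new(:id, :item_id, :user_deleted_at, keyword_init: true)

# Hypothetical default; the proposal only asks that the period be configurable.
DEFAULT_CUTOFF_DAYS = 90
SECONDS_PER_DAY = 24 * 60 * 60

# Selects the versions whose user was deleted at or before the cutoff threshold.
def versions_past_cutoff(rows, now:, cutoff_days: DEFAULT_CUTOFF_DAYS)
  threshold = now - (cutoff_days * SECONDS_PER_DAY)
  rows.select { |row| row.user_deleted_at && row.user_deleted_at <= threshold }
end

now = Time.utc(2023, 6, 1)
rows = [
  VersionRow.new(id: 1, item_id: 221, user_deleted_at: Time.utc(2023, 1, 15)), # past the cutoff
  VersionRow.new(id: 2, item_id: 222, user_deleted_at: Time.utc(2023, 5, 20)), # still inside the period
  VersionRow.new(id: 3, item_id: 223, user_deleted_at: nil)                    # account not deleted
]

puts versions_past_cutoff(rows, now: now).map(&:id).inspect
```

In the application itself, the same condition could presumably be expressed as an extra constraint on `decidim_users.deleted_at` in the query for deleted user accounts.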

Note that the same issue applies to the Decidim::Authorization model, which also holds personal data. Admins can already delete those records from the admin panel, but the versions table is currently not cleaned up after the removal. A similar "cutoff" period should also apply to the versioned authorization data.

To fetch the version data for deleted user accounts:

PaperTrail::Version.joins(
  <<~SQL.squish
    INNER JOIN decidim_users ON decidim_users.id = versions.item_id
      AND versions.item_type IN ('Decidim::User', 'Decidim::UserBaseEntity')
  SQL
).where.not(decidim_users: { deleted_at: nil })

To fetch the version data for deleted authorizations:

PaperTrail::Version.joins(
  <<~SQL.squish
    LEFT JOIN decidim_authorizations ON decidim_authorizations.id = versions.item_id
      AND versions.item_type = 'Decidim::Authorization'
  SQL
).where(item_type: "Decidim::Authorization", decidim_authorizations: { id: nil })
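Because deleted authorization rows no longer exist, there is no `deleted_at` value to measure a cutoff from. One possible approach, sketched here in plain Ruby, is to treat the newest version of each orphaned item as an approximation of its deletion time. `AuthVersion`, `orphaned_versions_past_cutoff`, and the 90-day default are hypothetical, and the sketch assumes that a version is written when an authorization is destroyed, which depends on the PaperTrail configuration.

```ruby
require "set"

# Hypothetical stand-in for a versions row with item_type "Decidim::Authorization".
AuthVersion = Struct.new(:id, :item_id, :created_at, keyword_init: true)

# Returns the versions of items that no longer exist and whose newest version
# is at least cutoff_days old (used here as a proxy for the deletion time).
def orphaned_versions_past_cutoff(versions, existing_ids, now:, cutoff_days: 90)
  threshold = now - (cutoff_days * 24 * 60 * 60)
  existing = existing_ids.to_set

  versions
    .reject { |version| existing.include?(version.item_id) }
    .group_by(&:item_id)
    .select { |_item_id, group| group.map(&:created_at).max <= threshold }
    .values
    .flatten
end

now = Time.utc(2023, 6, 1)
versions = [
  AuthVersion.new(id: 1, item_id: 10, created_at: Time.utc(2023, 1, 1)),  # authorization still exists
  AuthVersion.new(id: 2, item_id: 11, created_at: Time.utc(2022, 12, 1)), # orphaned, old
  AuthVersion.new(id: 3, item_id: 11, created_at: Time.utc(2023, 2, 1)),  # orphaned, newest is past the cutoff
  AuthVersion.new(id: 4, item_id: 12, created_at: Time.utc(2023, 5, 1))   # orphaned, but too recent
]

puts orphaned_versions_past_cutoff(versions, [10], now: now).map(&:id).inspect
```

Grouping by item keeps all versions of an authorization together, so its history is either retained in full or purged in full.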
