Remove unused fields on retention purge #789

Open
logut opened this issue Apr 29, 2024 · 4 comments
Labels
bug Something isn't working

Comments

logut commented Apr 29, 2024

During a retention cleanup, fields that were added automatically should be removed if no remaining events use them.
Failing that, at least add an API endpoint to remove a field from a stream.

@nitisht nitisht added the bug Something isn't working label Apr 30, 2024
MKLeb commented Apr 30, 2024

+1 here, would be nice to have the fields cleaned up if nothing is using them.

@mrchypark

I believe that before archiving data and removing columns, it is crucial to give ample notice and opportunity for teams to update any code that depends on those columns. If there isn't a clear process to communicate which columns will be removed and ensure necessary code changes are made, then I don't think the columns should be eliminated.
Archiving data is important, but it has to be done carefully and with plenty of advance notice. Abruptly removing columns could break existing functionality and lead to disruptions. The teams responsible for the depending code should have sufficient time to adapt before the archiving process moves forward.

@nikhilsinhaparseable
Contributor

@logut @MKLeb @mrchypark Thank you all for the comments.
If we remove the fields and an event is later ingested with the same fields, Parseable adds them to the schema again, because the schema evolves as events arrive.
Rather than deleting individual fields from the schema, which can become tricky, the best we can do is check whether any events are left in the stream and, if none are, delete all the schema fields so that fresh events reconstruct the schema.
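
To make the proposal concrete, here is a minimal Python sketch of the behavior described above. It is an in-memory toy model, not Parseable's actual code: the `Stream` class and its `ingest` and `purge` methods are hypothetical names chosen for illustration.

```python
class Stream:
    """Hypothetical toy model of a stream; not Parseable's implementation."""

    def __init__(self):
        self.schema = set()  # field names seen so far
        self.events = []     # retained events, each a dict

    def ingest(self, event):
        # The schema evolves: any field not seen before is added on ingestion.
        self.schema.update(event.keys())
        self.events.append(event)

    def purge(self, keep):
        # Retention: keep only the events for which keep(event) is true.
        self.events = [e for e in self.events if keep(e)]
        # Proposed behavior: reset the schema only when no events remain.
        if not self.events:
            self.schema.clear()


s = Stream()
s.ingest({"ts": 1, "old_field": "x"})
s.purge(lambda e: False)          # every event has expired
s.ingest({"ts": 2, "msg": "hi"})  # fresh events rebuild the schema
print(sorted(s.schema))           # ['msg', 'ts']
```

In this model, `old_field` disappears only because the stream became empty. If a single event survives the purge, the whole schema stays as it was.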

Do let us know your thoughts on this.

Author

logut commented Jun 12, 2024

Isn't this almost the same as deleting the stream entirely and letting the first event recreate the stream and set the fields? I would prefer to be able to delete unused fields while keeping recent events in the stream.
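
A rough Python sketch of the per-field cleanup requested here, assuming the schema can be treated as a set of field names and the retained events as dictionaries. The function name `prune_unused_fields` is hypothetical and does not correspond to any existing Parseable API.

```python
def prune_unused_fields(schema, events):
    """Return the schema fields that appear in at least one retained event."""
    used = set()
    for event in events:
        used.update(event.keys())
    # Fields that no retained event uses are dropped; all others are kept.
    return schema & used


schema = {"ts", "msg", "old_field"}
retained = [{"ts": 2, "msg": "hi"}, {"ts": 3}]
print(sorted(prune_unused_fields(schema, retained)))  # ['msg', 'ts']
```

Unlike a full schema reset, this keeps recent events and their fields and drops only the fields nothing refers to anymore.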

6 participants