Add utility to deduplicate ZIM items and replace them with redirects at ZIM creation time #261

Merged
merged 1 commit into from
May 9, 2025

benoit74
Collaborator

@benoit74 benoit74 commented May 6, 2025

Fix #33

Note that this is kind of a resurrection of #86, where the important points have already been discussed (e.g. that we do not want to extend the Creator API but rather add a new, distinct API, so that memory issues are easier to trace).
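
For readers unfamiliar with the idea, here is a minimal sketch of checksum-based deduplication kept outside the Creator. It is not the API added by this PR: the class `DedupSketch`, its `add` method, and the two callbacks are hypothetical names chosen for illustration, and SHA-256 is an assumed choice of checksum. The first path seen for a given content is stored as a real item; any later path with identical content is recorded as a redirect to that first path.

```python
import hashlib
from typing import Callable


class DedupSketch:
    """Hypothetical helper for illustration; not the API added by this PR."""

    def __init__(
        self,
        add_item: Callable[[str, bytes], None],
        add_redirect: Callable[[str, str], None],
    ) -> None:
        # Callbacks stand in for whatever actually writes to the ZIM.
        self._add_item = add_item
        self._add_redirect = add_redirect
        # Content digest -> first path stored with that content.
        self._seen: dict[bytes, str] = {}

    def add(self, path: str, content: bytes) -> bool:
        """Store content, or a redirect if identical content was already stored.

        Returns True when a redirect was written instead of an item.
        """
        digest = hashlib.sha256(content).digest()  # assumed checksum choice
        target = self._seen.get(digest)
        if target is None:
            self._seen[digest] = path
            self._add_item(path, content)
            return False
        self._add_redirect(path, target)
        return True


# Usage with in-memory stand-ins for the ZIM writer.
items: dict[str, bytes] = {}
redirects: dict[str, str] = {}
dedup = DedupSketch(items.__setitem__, redirects.__setitem__)

dedup.add("img/logo.png", b"same-bytes")
dedup.add("img/logo-copy.png", b"same-bytes")  # duplicate content
dedup.add("img/other.png", b"other-bytes")
```

Because the digest map lives in the helper rather than in the Creator, its memory footprint can be observed (or the helper dropped entirely) without touching the rest of the ZIM creation code, which is the motivation mentioned above.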

@benoit74 benoit74 self-assigned this May 6, 2025
@benoit74
Collaborator Author

benoit74 commented May 6, 2025

@rgaudin can you give me some initial feedback on the proposed API?

The code still has no tests, and the versions of the new libraries in pyproject.toml are wrong, but I would rather get feedback on the API first: I'm really not convinced this is the optimal approach, yet I have failed to find anything more convenient.

codecov bot commented May 6, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (dc21be3) to head (f86b79c).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #261   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           40        41    +1     
  Lines         2480      2512   +32     
  Branches       334       339    +5     
=========================================
+ Hits          2480      2512   +32     

@benoit74
Collaborator Author

benoit74 commented May 6, 2025

And btw, should we prefer an alias to a redirect? It is still unclear to me when one should be preferred over the other.

Member

@rgaudin rgaudin left a comment

LGTM; thank you.

I appreciate that it's completely optional, explicit, independent and discrete.
Now we need to use it in the wild to see how it performs.

@benoit74 benoit74 force-pushed the dedup_zim_items branch 2 times, most recently from 37b6539 to 11f4df5 Compare May 9, 2025 11:36
@benoit74 benoit74 marked this pull request as ready for review May 9, 2025 11:39
@benoit74 benoit74 force-pushed the dedup_zim_items branch from 11f4df5 to f86b79c Compare May 9, 2025 11:43
@benoit74 benoit74 changed the title Add utility to deduplicate ZIM items and replace them with redirects at ZIM creation time (WIP) Add utility to deduplicate ZIM items and replace them with redirects at ZIM creation time May 9, 2025
@benoit74 benoit74 merged commit 001108d into main May 9, 2025
9 checks passed
@benoit74 benoit74 deleted the dedup_zim_items branch May 9, 2025 11:52
Successfully merging this pull request may close these issues.

Automatically redirect to articles with same checksum
2 participants