WIP: 3056 compact db by copying single docs #3082
Draft
Description
Reduce disk usage by copying docs to a second db to drop old revs, then copying them back.
Type of Change
Proposed Solution
Thread w/ NH:
We have some tabs that have used almost all of the available space. At one point we had revs_limit: 1 in our options, but somehow it stopped working, so there are many revisions taking up a lot of space. We tried using the pouch compact() function; however, we get a memory error (range error) when using it on a huge db with over 50,000 documents.
We have tried looping through the changes feed in batches of 200 (limit:200 and using a placeholder for the since property) and doing the following for each change:
Replicate the doc to the target db that uses the pouchdb memory adapter configured with the revs_limit=1 setting
Delete the doc from the source db
Replicate from the target memory db that same doc.
Using the auto_compaction:true flag on the source db.
This does not work – the deleted flag is still set in a new revision.
The goal is to avoid creating any new revs, so that we don’t have a ton of docs to sync from our tabs in the field. Removing the deleted property therefore won’t help us.
Do you have any suggestions how we could do this?
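For reference, the batched walk over the changes feed described above might look roughly like the sketch below. It assumes `db` is a PouchDB instance (or any object whose `changes()` resolves to `{ results, last_seq }`); `forEachChangeInBatches` and `handleChange` are hypothetical names that are not part of this PR. Deleted changes are skipped, because removing docs from the source appends new deleted entries to the same feed.

```javascript
// Sketch only: walks the changes feed in pages of `batchSize`, using
// `since` as the placeholder for the last sequence already processed.
// `handleChange` is a hypothetical per-change callback (for example:
// replicate the doc, delete it, replicate it back).
async function forEachChangeInBatches(db, handleChange, batchSize = 200) {
  let since = 0;
  for (;;) {
    const page = await db.changes({ since, limit: batchSize });
    if (page.results.length === 0) break; // reached the end of the feed
    for (const change of page.results) {
      // Deleting a doc adds a new "deleted" change; do not reprocess it.
      if (change.deleted) continue;
      await handleChange(change);
    }
    since = page.last_seq; // resume after the last change in this page
  }
  return since;
}
```

Only one page of changes is held in memory at a time, which is the point of batching on a db this size.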
I’m currently trying this out:
Loop through allDocs instead of changes.
Create a normal file-based Pouch target db instead of one based on the memory adapter.
Replicate each doc, and then delete it from the source db. The tab is probably tight on disk space, so the hope is that, with the auto_compaction flag set, most of a doc's disk space will be freed when the doc is deleted. (We understand that a little space is still consumed by the _revisions status for each deleted doc...) This process will build a mirror of the source db that has only 1 rev per doc.
Once this process is complete, the source db will be empty.
Delete the source db.
Create a new source db (same name) and replicate data from the target db to this new source db – again, one doc at a time. Delete the doc on the target db after each replication to the new source db.
Once that is complete, the target db will be empty and can be deleted.
Of course, it would be much easier to replicate the whole source db to the target in one go, but disk space is so low that this is not possible.
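The steps above could be sketched roughly as follows. This is an untested illustration, not the code in this PR: `moveDocsOneByOne`, `rebuildDatabase`, and the `mirrorName` parameter are hypothetical, and the `PouchDB` constructor is passed in as an argument. It assumes the standard PouchDB calls `allDocs`, `replicate.to` with `doc_ids`, `remove`, and `destroy`, plus the `revs_limit` and `auto_compaction` constructor options. Whether enough disk space is actually freed along the way still has to be verified on a real tab.

```javascript
// Sketch only: move every doc from `source` to `target`, one doc at a time,
// deleting each doc from `source` as soon as it has been copied.
async function moveDocsOneByOne(source, target, pageSize = 200) {
  let moved = 0;
  for (;;) {
    // Moved docs are removed from `source`, so the first page always
    // contains docs that have not been moved yet.
    const page = await source.allDocs({ limit: pageSize });
    if (page.rows.length === 0) break;
    for (const row of page.rows) {
      const result = await source.replicate.to(target, { doc_ids: [row.id] });
      if (result.doc_write_failures) {
        // Stop rather than delete a doc that was not copied.
        throw new Error('Failed to copy ' + row.id);
      }
      await source.remove(row.id, row.value.rev);
      moved += 1;
    }
  }
  return moved;
}

// Sketch only: source -> mirror, recreate source, then mirror -> source.
async function rebuildDatabase(PouchDB, name, mirrorName) {
  const options = { revs_limit: 1, auto_compaction: true };
  let source = new PouchDB(name, { auto_compaction: true });
  const mirror = new PouchDB(mirrorName, options);
  await moveDocsOneByOne(source, mirror); // source is now empty
  await source.destroy();
  source = new PouchDB(name, options);    // same name, fresh db
  await moveDocsOneByOne(mirror, source); // mirror is now empty
  await mirror.destroy();
  return source;
}
```

Throwing on a failed copy is a deliberate choice here: a doc is only removed after its replication reported no write failures, so an interrupted run should leave every doc in at least one of the two dbs.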