Fix index generation for Sitemapper.stream #1
Open
newtrat wants to merge 3 commits into
Re-organizes Builder and Streamer into InMemoryBuilder and StreamBuilder respectively.

Updates file generation for streaming so that:
- The last file includes a page number, instead of just being named sitemap.xml
- The index includes all generated pages

Needs tests and possibly test fixes as well.
This reverts commit 76ff1d6.
Description
Re-organizes Builder and Streamer into InMemoryBuilder and StreamBuilder respectively.
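Updates file generation for streaming so that:
- The last file includes a page number, instead of just being named sitemap.xml
- The index includes all generated pages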
Also removes the encoding option passed when generating XML files, because:
Motivation and Context
Streamer was working fine for generating the sitemap files themselves, but it didn't interact well with Builder's `generate` and `generate_index` methods. The last generated sitemap would have no page number, and the generated index only knew about that single, last page. Modifying these to work for streamed sitemaps reduced Builder's and Streamer's overlap enough that I thought they belonged as two subclasses of a single base class, rather than having Streamer as a subclass of Builder. Their shared functionality is now inside Builder, and Streamer and the old Builder have been renamed StreamBuilder and InMemoryBuilder respectively.

I don't have a minimal reproduction ready yet for demonstrating the libxml2 crash, because I'm using sitemapper inside a private repo. However, I could definitely make one and report it as an issue if you'd like. It seems to happen only when generating the second or later pages of large sitemaps.
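
For reviewers who want the shape of the restructuring at a glance, here is a rough Python sketch of the idea, not the code in this PR. The class names match the description above. Everything else is an assumption made for illustration: the `paginate` hook, the `filenames` list, the `max_urls` parameter, and the `sitemap{n}.xml` naming scheme. No files are actually written. The point is only that the base class numbers every page (including the last) and builds the index from the full list of pages, while the two subclasses differ just in how they split URLs into pages.

```python
from abc import ABC, abstractmethod
from itertools import islice
from typing import Iterable, Iterator, List


class Builder(ABC):
    """Shared behaviour: number every page and index all of them."""

    def __init__(self, host: str, max_urls: int) -> None:
        self.host = host
        self.max_urls = max_urls
        self.filenames: List[str] = []  # every page generated so far

    @abstractmethod
    def paginate(self, urls: Iterable[str]) -> Iterator[List[str]]:
        """Yield one list of URLs per sitemap page."""

    def generate(self, urls: Iterable[str]) -> List[str]:
        # Every page gets a number, so the last one is not a bare sitemap.xml.
        # A real builder would write each page to disk here; this sketch
        # only records the (hypothetical) filename.
        self.filenames = [
            f"sitemap{number}.xml"
            for number, _page in enumerate(self.paginate(urls), start=1)
        ]
        return self.filenames

    def generate_index(self) -> List[str]:
        # The index is built from all recorded pages, not just the last one.
        return [f"{self.host}/{name}" for name in self.filenames]


class InMemoryBuilder(Builder):
    """Collects all URLs first, then slices them into pages."""

    def paginate(self, urls: Iterable[str]) -> Iterator[List[str]]:
        all_urls = list(urls)
        for start in range(0, len(all_urls), self.max_urls):
            yield all_urls[start:start + self.max_urls]


class StreamBuilder(Builder):
    """Consumes URLs lazily, holding at most one page at a time."""

    def paginate(self, urls: Iterable[str]) -> Iterator[List[str]]:
        iterator = iter(urls)
        while True:
            page = list(islice(iterator, self.max_urls))
            if not page:
                return
            yield page
```

With five URLs and `max_urls=2`, both subclasses would produce three numbered pages in this sketch, and `generate_index` would list all three.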