Check WMArchive doc size and cut off big docs #11967
There is one component configuration to be fixed.
In addition, the problem with this solution is that we will keep piling up large documents in the database, and the component will continue to load them cycle after cycle.
A better solution would be to find the worst offender field, truncate it and move on with document injection.
However, Andrea and I have been trying to put this on hold so that we can focus on the containerization. Given that this problem is under control at the moment, I'd suggest coming back to this in the future (or if things are still failing), once we are done with other developments.
src/python/WMComponent/ArchiveDataReporter/ArchiveDataPoller.py
Alan, I doubt your suggestion for a better solution stands. We still need to find what is causing the large data size, and for that we need some way to identify the rejected docs. Without actual details of how you want to hunt for the worst offenders (please keep in mind that a combination of fields may be contributing to the large data size), I think there is no such solution; we still need to reject and dump these docs to get a full sense of the actual issue.
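For illustration only, a per-field size breakdown of a dumped document could be computed along the lines below. This is a minimal sketch, not code from this PR: the helper name `field_sizes` is hypothetical, and it assumes the document is a JSON-serializable dict and that only top-level fields are ranked.

```python
import json


def field_sizes(doc):
    """Return (field, size_in_bytes) pairs for top-level fields, largest first.

    Hypothetical helper: sizes are measured on the JSON-serialized value of
    each top-level field, so nested structures count towards their parent.
    """
    sizes = {
        key: len(json.dumps(value).encode("utf-8"))
        for key, value in doc.items()
    }
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)
```

Because the result lists every field rather than a single maximum, it also shows the case where several mid-sized fields together push a document over the limit.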
@vkuznet In my understanding, the situation is similar to three weeks ago, so it would probably be better to keep this issue on hold until we finalize the containerization work, even though I see your concern.
As briefly discussed in the issue, I am closing this PR. Eventually we will revisit all this WMArchive integration. Thanks Valentin!
Fixes #11960
Status
In development
Description
Provide a size check for newly created WMArchive documents before sending them to the WMArchive service. The size threshold is configurable and defaults to 8MB (the current threshold on CMSWEB nginx). To avoid flooding the log with very large docs, a short version of the WMArchive document (a slice) is printed to the logger together with the full size of the document and the threshold used.
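The check described above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the code in this PR: the names `accept_doc`, `DEFAULT_THRESHOLD` and `slice_len` are hypothetical, 8MB is taken as 8 * 1024 * 1024 bytes, and the document size is measured on its JSON serialization.

```python
import json
import logging

# Assumption: 8MB interpreted as 8 * 1024 * 1024 bytes.
DEFAULT_THRESHOLD = 8 * 1024 * 1024


def accept_doc(doc, threshold=DEFAULT_THRESHOLD, slice_len=200, logger=None):
    """Return True if the serialized doc fits within the threshold.

    Hypothetical helper: oversized docs are not logged in full; only the
    first `slice_len` characters of the payload are written, together with
    the full size and the threshold used.
    """
    logger = logger or logging.getLogger(__name__)
    payload = json.dumps(doc)
    size = len(payload.encode("utf-8"))
    if size <= threshold:
        return True
    logger.warning(
        "WMArchive doc rejected: size=%d bytes, threshold=%d bytes, slice=%s",
        size, threshold, payload[:slice_len],
    )
    return False
```

A caller would send only the documents for which `accept_doc` returns True and skip the rest, so a single oversized document does not block the whole batch.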
Is it backward compatible (if not, which system it affects?)
YES
Related PRs
External dependencies / deployment changes