-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Fix download issues #1874
WIP: Fix download issues #1874
Conversation
I made progress last night on figuring out where the invalid download index issue is. Essentially, when the download list looks for pending downloads, it’s using the list row index minus the total active downloads to hope that that will align to the vector index of pending downloads. My plan is to create/update a signal/slot to manage a mapping of pending downloads on the list side based on the lists row index and pending downloads index. Which will have to be continually updated when the index row updates. |
Interesting that this was commented out. I wonder if there was a reason for that... |
Right? I looked back to see if there was any notes, but looks like it'd been commented out for some time now, like over 5+ years. I've been running it locally, haven't seen any issues so far with it on. I've tested like 20 mods in download and turning off my wifi causes them all to be paused, which that can then be resumed. Would be nice if there was a resume all, but for now this should be good. |
Has anyone looked through git blame to see what the last few commits to touch the relevant lines claimed they were doing? It can be a bit of a pain sometimes as the repo split means you may have to track down an equivalent commit in the old unsplit repo under Tannin's org, but there's some kind of explanation reasonably often. |
Just pushed an update that includes the unique id for downloads. Most methods are now set to use the DownloadId instead of the index. There's a couple I've left that may still need it, but for the most part I've been able to get everything to start using the DownloadId. Also, the DownloadList has it's own m_downloads model of type QList. This will allow the list to keep track of the downloads via DownloadId and it's row. |
Just looked through the code and found the commit. Seems this was originally used to try and fix stuck downloads. It was then commented out due to downloads getting stuck when paused. So, the implementation in use here is reusing a similar idea to pause when a timeout occurs, but removed the downloadFinished and resumeDownload calls so that it only pauses on timeout. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This a "code style" review - I'll have to test the code to see if I can find issue. Right now, the changes make senses, except for most of the flow of DownloadList
.
You also need to apply clang-format when committing.
That makes a lot of since. Looking through this it does seem a bit hobbled together from different evolutions of past refactoring attempts. So, what I've tried to do is separate the download manager from the view. That's why there's now two different lists, one to keep track of what's in the view vs one that is keeping track of the download files and its respective meta files. If this is not the direction that you'd like to go, that's ok, but I would like clear direction before I move forward. If you'd rather just leave it the way downloads are now and just patch the index issue that's ok too. |
Having two lists trying to be in sync. is just asking for troubles in the long run - There should be a single list of downloads like there is a single list of mods, then you can proxy it via a Qt model to display in the view. Right now (in For me, it'd be better to just fix the index issue and do a complete overall of the manager in a later PR. |
I think the only thing I'm not sure about is why we're maintaining a second set of lists in the model class. I think this can be reworked to just get the download list from the manager. The main reason why I have two lists in my revision above is because you have to maintain parity between the view rows and the model data which requires an indexed list. But the uuids ensure we're always referencing the same download. So having a map indexed to the uuids makes sense to improve lookup efficiency. I think keeping the two in sync within the download manager isn't all that difficult. (Just take a look at the modlist, pretty sure there are three lists in that one.) If we can just rework the list model to pull the data from the manager rather than maintaining another set of lists, that will make things simpler I think. |
I think the main difficulty with that is that the tree view does require certain signals to be sent from the data class to update the rows after additions and deletions. |
98a232c corrects another issue with dropping downloads to the mod pane. Because the indexes of the DownloadManager DownloadInfo list do not match up with the DownloadList list, we were loading the wrong DownloadInfo during the drop event. This reworks the List display to associate the index of the DownloadManager list as the canonical row position for the TreeView. We can then use the UUID map to get the correct info object based on the DownloadInfo. Ultimately we should probably scrap the extra lists in the DownloadList and just use the Manager data. By adding the DownloadList as a friend class we can access these directly rather than trying to pipe data to the DownloadList and save it separately. We just need move the logic for translating that DownloadInfo data into the display data into the |
People reported 10 seconds delays when removing downloads and other operations on 1k plus download lists. This is almost certainly caused by DirectoryWatcher not being disabled before stuff is changed in the downloads folder (this includes any metadata being changed, as qt creates a .meta.lock file during metadata update.). I'm tempted to consider ripping out the watcher and introducing our own folder iteration and indexing code, that we can run periodically and at key moments after we know we have concluded all operations. This should avoid refresh spam caused by our own changes. An alternative could be changing the signal we are listening on from It still doesn't solve a lot of problems with batch operations, and we might still be leveraging the watcher to trigger refreshes for us after updating some data, so in all I think that manual indexing might be a better solution. Could be a separate PR if we go down that road, but we have problems with it right now. |
If we do that, then either we have something running at a very high rate, far from ideal, or this will be pretty bad UX for people saving files to the download folder.
We should analyze what actually trigger extra/recursive refreshes instead of trying to completely disable the watcher, hoping it works this way. If we move the refresh callback to a thread, then we avoid many "freezing" issue, and we don't even need to disable the watcher since we can simply check if a refresh-thread is running. This may trigger many signals but will never cause a chain reaction.
I think this PR should be reduce to the initial index fix so that we can merge it and release a proper 2.5 at some point. |
So any file addition/deletion rename in the downloads folder causes the callback, which in turn causes a refresh.
The idea of not disabling the watcher and only refreshing if there isn't a refresh already running has the following downsides/problems:
What I think we need is to block any refresh during other operations (like bulk operations, or metadata update), and do a single refresh at the end of that operation, or series of operations. This way any outside changes don't risk interrupting the operation, but can still be picked up after the operation is finished, since we trigger the refresh, and the refresh is guaranteed to only be triggered once per operation. |
I'm fairly sure most of the additional delay with the current PR is caused by the creation of the display data lists in the DownloadList class. That being said I do think threading the refresh is a decent idea as long as we visually indicate the refresh is ongoing. I also think it's possible to check for specific download updates by maintaining a last check time and only updating downloads that were modified later or if the download is not currently in the list. I'm fairly certain this would ultimately be much more performant than rebuilding the entire list every time even accounting for checking all of the files for existing downloads and the filetime since we still avoid having to read the metadata and recreate the DownloadInfo. |
In the short term I think the most performant option which should resolve race issues with the index would be to assign the UUIDs and both a map with UUIDs as keys and a list with basic indices. Unfortunately we can't totally get rid of that because the TreeView can only reference the model data by the row number so something needs to be associated with the rows. I think maintaining both, despite the additional complexity, is ultimately better because it should reduce the reliance on lookup searches once we have either of those keys. Any time we are forced to use the row as the lookup we should be able to pull the UUID and then use that for any additional access and thus avoid suddenly referencing the wrong download. |
The performance issues I think are related to how I'm checking for MoID in the meta file on every refresh (via createFromMeta). This is only needing to happen once ever. Once all meta files have a valid UUID, then this doesn't need to happen again. Any new download will get a MoID, so this is just to backfill meta files that are already there. I was planning, to move this call to the initialization of the download tab and possibly place a progress popup to show something is happening on load? The other thing is to persist that this event has occurred and don't run again, I'm assuming that would be a setting?
This is the bit of code that is running on each refresh at the moment. Which, it should only get into the first part once and then fall into the else afterwards, but it's still probably a performance hit for those that have a lot of downloads. |
What do you need the meta file to contain the mod ID? As far as I understand this PR, the uuid is here to uniquely identify downloads within MO2, there should be no need to put them in meta file unless you intend to use them elsewhere in MO2, e.g., to reference installation archive, but that would break backward compatibility, so this should be explained somewhere. |
I'm not talking about what can trigger the watcher, I'm talking about what can trigger the watcher inside a refresh. A refresh will never create an archive, starts a download, etc., so that does not need to be considered. I'm not sure why we are talking about bulk operations everywhere since there is no bulk operations on the download folder through MO2 currently, so there is no way to "bulk" do something. The other (or actually main) issue here is that the "refresh" does way more than it should in a very bad way, e.g., a refresh call should never update a metadata file.
You can just add a new refresh to the queue. You'll have two refreshes, or more, but you will not have a chain reaction since you will only queue a single refresh at any time.
This is a separate issue from refresh, this is thread safety.
I don't really see why that would be an issue. Bulk operations will trigger a few refresh, so what? A download list refresh should not be a huge thing, that's another issue.
Again, what are "bulk operations"? I don't really see the problem here, you can check which metadata needs update and work with that. Right now we are using a tool, the file watcher, and trying to work against/around it rather than with it. |
It's not the refresh itself that causes multiple refreshes, it's the normal operations or the consequences of those other operations. If you update a metadata value on a DL file, like changing the pause state, adding nexus data etc, what actually happens is the following:
So any change to a single download metadata triggers at least two refreshes. If you do multiple changes to metadata, you can trigger more. If you change more than one download you trigger even more.
There are currently multiple bulk operations:
Without suspending the refreshes triggered by the watcher, these operations used to take 30+ minutes with 200 downloads. People used to close Mo2, remove downloads manually from the folder and then start it again to avoid performance degradation when going over 200 or so downloads. Plus there is the fact that our users would very much like to be able to perform bulk operations on downloads other than the ones we already support. So not considering bulk operations at this stage would just hamper any improvement in that area in the future.
That is not acceptable considering that we can have hundreds or thousands of queued refreshes (this is what happened with Hide Installed bulk operation for example). Currently, triggering multiple refreshes for operation would cripple the performance of the downloads tab to the point of making it unusable. I'm very familiar with the performance of the downloads tab previous to my hack, and doing multiple refreshes is not a viable solution.
Right, so we keep the watcher. We put it on a thread, we don't disable the watcher, but we do use a lock or flag or something from the main code, to stop refreshes from being triggered while we are performing any operation that could touch the file system multiple times, or that requires a refresh in the end regardless. In the watcher FileChanged() callback, if the flag is set, we queue a refresh to be performed after the flag is unset (or lock freed or whatever). If a refresh is already queued, don't queue more than one. This way, we can avoid multiple refreshes being triggered when we know that our operation will make multiple filesystem changes (or thousands of them). |
Ok, so, the reason it's being persisted to the meta file is because I'm trying to separate the Download List from the Download Manager. So currently, when ever a refresh occurs, it removes everything from the download manager's list that is downloaded and in a ready state and then reloads them. This used to occur in the view as well. In order to allow the download manager to continue refreshing its list without affecting the display view, that's why I've created a new initializeList method that is very similar to the refreshList. In the InitializeList, I'm adding everything to the view. In the refreshList, I'm only updating what's changed in the refresh to the view. In order to correlate the download each time it's added/removed I needed the unique ID to persist with the meta so that it was consistent through out a refresh cycle instead of just assigning a new ID every time. |
Ah, I'm wondering if in the refreshList where I'm setting the uuid in the meta file if it's not there is causing this cycle. So, I need to wrap that with the disable dirwatcher call. That may be the performance issue then. |
I don't even know why we're talking about multiple refreshes. Whether or not we disable the watcher or ignore refreshes during an existing refresh, the result is the same. Since the refresh builds the download list over again either the data is already out of date because something changed in the middle of the refresh operation or everything gets updated anyway. I still think a better strategy is to find new or deleted relevant files or newly modified files and only update those as a result of the watcher firing. I think we currently run the full refresh far more than it needs to be run, and without a major rework to the way that data is stored and parsed we're always going to be running up against performance issues when we reach thousands of downloads. I do think we can make a more limited update in the short term to use the UUID to avoid the problems with ambiguous download IDs. This does need to be persistent, which is why it's stored on the download meta. That being said it should be created either if it's missing when we load the meta data or if it's newly created. This would only add time to process the first time we find a missing UUID. Running a generation pass at startup should also avoid most of this. Finally, I don't see the point in displaying it on the download pane, particularly not by default. The only thing it's really useful for is potentially debugging the code. Users will have no clue what it is and the use is purely internal. |
The result isn't the same. The problem isn't that the Refresh operation itself causes multiple refreshes (although in this PR there is also this problem), it's the fact that other operations do multiple changes to the DL folder, causing multiple refreshes. Like setting Hidden flag on 100 downloads causes a refresh after each meta file is changed, effectively 100 refreshes. You want to ignore the refreshes from the watcher while performing the bulk hiding operation, and only do a single refresh after you have finished hiding all the 100 downloads. This is to explain why I was arguing against the idea from Holt to remove the disable watcher code entirely. The watcher currently doesn't return any information regarding which file was changed and how because we are watching an entire folder, so it just returns the path of the folder (the watcher could watch multiple folders). From the documentation it isn't clear if the fileChanged() signal is fired as well when watching a folder and a file in it changed. |
This isn't what I mean. I mean if we ignore 'new' refresh triggers or disable the watcher, the result is the same (since the watch triggers most of the refreshes). The only reason to run the refresh twice in a row is because data changed during the refresh, which is just something that we'll have to live with without a major rework to how we detect changes to downloads. (And this is something that can happen with or without MO2 making the changes.) |
Yes, anything writing to the meta or touching files in general should be wrapped in the disable watcher code. I would just do the entire Refresh function. |
The refresh function has been and still is disabling the watcher. We could also just rework it to monitoring existing refreshes and ignore additional refresh calls during an active refresh. The result is the same. I do not believe this PR or the existing code is triggering excessive refreshes at the time being. This is not where the current slowdowns are happening, it is within the span of a single refresh. Although I do think bulk operations can be reworked to only trigger the refresh one time after updating all of the downloads. I think these may still trigger a refresh after each download is changed, which is completely unnecessary unless we implement a strategy that does not rebuild the entire download list every refresh. |
Keep in mind that the way the current disable refresh code is done is pretty crap. It's quite possible that it isn't getting disabled correctly or there is an extra endDisableDirWatcher call somewhere. It is just an int that is getting increased and decreased. It should really be either atomic or a lock of some sort and all the calls be changed to use a scoped version. Also Bulk operations currently should only trigger one refresh at the end. If that is not the case then there is an issue with how the watcher is disabled. |
The scoped disabler should prevent forgetting to call the 'end' function, which should make this less of an issue. Unless that disabler never goes out of scope, but that would indicate another problem. I've had some discussions with Trout and I think we're looking at starting with a leaner version that just addresses the unique ID issue for the time being. But for a post 2.5 release I think we've decided there's still some utility in caching the display data in the List class but keeping the indexed DownloadInfo list maintained in one location (the manager). So essentially the DownloadManager list determines the initial 'rows' while the DownloadList references this to get the UUID and fetch the saved display info from there. All it would need to do is add and prune the display info data (which should help performance in the long run since we don't have to calculate that data with every display update). |
If everyone's good with this approach, I'll close this PR. I'll leave this branch and create a new branch and push another PR once that's ready. |
Sounds good. |
ok, will do. Closing this PR |
Fix download issues