
WIP: Fix download issues #1874

Closed
wants to merge 35 commits into from

Conversation

SaltyTrout

Fix download issues

  • Updated DownloadManager to handle timeouts
    • The DownloadManager now checks whether download progress has timed out and, if so, pauses the download. I also added a log message notifying the user that connectivity to the server was lost and that the file is being paused.
  • Working on the invalid download index issue
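The timeout-pause behavior described above can be sketched roughly as follows. This is a minimal illustration with hypothetical names and std::chrono in place of the Qt timer machinery the real DownloadManager uses; it is not the PR's actual implementation.

```cpp
#include <chrono>

// Hypothetical stand-in for the per-download state tracked by the manager.
struct DownloadProgress {
  std::chrono::steady_clock::time_point lastProgress;  // last time bytes arrived
  bool paused = false;
};

// A download is considered stalled when no progress has been observed for
// longer than timeoutSecs. Instead of failing, it is paused so the user can
// resume it later; the caller is expected to log the "connectivity lost" message.
inline bool checkTimeout(DownloadProgress& d,
                         std::chrono::steady_clock::time_point now,
                         int timeoutSecs) {
  if (d.paused) return false;  // already paused, nothing to do
  if (now - d.lastProgress > std::chrono::seconds(timeoutSecs)) {
    d.paused = true;
    return true;
  }
  return false;
}
```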

@SaltyTrout
Author

I made progress last night on figuring out where the invalid download index issue comes from. Essentially, when the download list looks for pending downloads, it uses the list row index minus the total active downloads and hopes that this aligns with the vector index of the pending downloads. My plan is to create/update a signal/slot to manage a mapping of pending downloads on the list side, based on the list's row index and the pending-downloads index. This mapping will have to be continually updated whenever the row index changes.
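The mapping idea above can be sketched like this: instead of deriving a pending-download index from "row minus active count", each row carries a stable id and a map resolves ids back to entries, so row churn cannot misalign the lookup. All names here are hypothetical, and plain std containers stand in for the Qt model types.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

struct DownloadEntry { std::uint64_t id; std::string name; };

// Keeps a row-ordered list plus an id-to-row map that is updated on every
// structural change, mirroring the signal/slot-maintained mapping described above.
class DownloadIndex {
public:
  void add(DownloadEntry e) {
    byId_[e.id] = rows_.size();
    rows_.push_back(std::move(e));
  }
  void removeRow(std::size_t row) {
    byId_.erase(rows_[row].id);
    rows_.erase(rows_.begin() + row);
    for (std::size_t r = row; r < rows_.size(); ++r)
      byId_[rows_[r].id] = r;  // re-sync rows shifted by the removal
  }
  const DownloadEntry* byId(std::uint64_t id) const {
    auto it = byId_.find(id);
    return it == byId_.end() ? nullptr : &rows_[it->second];
  }
private:
  std::vector<DownloadEntry> rows_;
  std::unordered_map<std::uint64_t, std::size_t> byId_;
};
```

The point of the sketch is that lookups go through the id, never through an arithmetic guess about where an entry sits relative to the active downloads.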

@Silarn
Member

Silarn commented Sep 16, 2023

Interesting that this was commented out. I wonder if there was a reason for that...

@SaltyTrout
Author

Interesting that this was commented out. I wonder if there was a reason for that...

Right? I looked back to see if there were any notes, but it looks like it'd been commented out for some time now, over 5+ years. I've been running it locally and haven't seen any issues so far with it on. I've tested around 20 mods downloading, and turning off my wifi causes them all to be paused; they can then be resumed. Would be nice if there were a resume-all, but for now this should be good.

@AnyOldName3
Member

Has anyone looked through git blame to see what the last few commits to touch the relevant lines claimed they were doing? It can be a bit of a pain sometimes as the repo split means you may have to track down an equivalent commit in the old unsplit repo under Tannin's org, but there's some kind of explanation reasonably often.

@SaltyTrout
Author

SaltyTrout commented Sep 22, 2023

Just pushed an update that includes the unique id for downloads. Most methods are now set to use the DownloadId instead of the index. There are a couple I've left that may still need it, but for the most part I've been able to get everything to start using the DownloadId. Also, the DownloadList now has its own m_downloads model of type QList. This allows the list to keep track of downloads via DownloadId and their row.

@SaltyTrout
Author

Has anyone looked through git blame to see what the last few commits to touch the relevant lines claimed they were doing? It can be a bit of a pain sometimes as the repo split means you may have to track down an equivalent commit in the old unsplit repo under Tannin's org, but there's some kind of explanation reasonably often.

Just looked through the code and found the commit. Seems this was originally used to try and fix stuck downloads.
decbbb6

It was then commented out due to downloads getting stuck when paused.
b7d6bbb

So, the implementation in use here is reusing a similar idea to pause when a timeout occurs, but removed the downloadFinished and resumeDownload calls so that it only pauses on timeout.

@SaltyTrout SaltyTrout marked this pull request as ready for review September 22, 2023 01:05
@SaltyTrout SaltyTrout changed the title WIP: Fix download issues Fix download issues Sep 22, 2023
Member

@Holt59 Holt59 left a comment


This is a "code style" review - I'll have to test the code to see if I can find issues. Right now, the changes make sense, except for most of the flow of DownloadList.

You also need to apply clang-format when committing.

Review threads (now resolved) on: src/downloadlist.cpp, src/downloadlist.h, src/downloadlistview.cpp, src/downloadlistview.h, src/organizer_en.ts
@SaltyTrout SaltyTrout marked this pull request as draft September 23, 2023 01:49
@SaltyTrout SaltyTrout changed the title Fix download issues WIP: Fix download issues Sep 23, 2023
@SaltyTrout
Author

I'm not sure I really like this PR; basically we are duplicating the list of downloads in two places and trying to maintain consistency between both, and this is just going to break at some point... It's already very complicated to follow the code, since everything is 100% public, so you have no clue what happens when, making it impossible to follow the flow of modification of objects.
What is the initial reason the list of items is duplicated in DownloadList instead of simply being consolidated in the manager and fetched from there?

I think originally there were multiple lists of items, Active, pending etc, and custom logic to display them correctly in the downloads tab. The downloads tab itself was rewritten a couple of times. Before it was a list of custom views instead of standard QTreeWidget or whatever we are using now. Ideally we would rewrite everything to be as simple as possible instead of having all these different layers and having to keep them in sync.

That makes a lot of sense. Looking through this, it does seem a bit cobbled together from different evolutions of past refactoring attempts. So what I've tried to do is separate the download manager from the view. That's why there are now two different lists: one to keep track of what's in the view and one to keep track of the download files and their respective meta files.

If this is not the direction you'd like to go, that's OK, but I would like clear direction before I move forward. If you'd rather just leave downloads the way they are now and only patch the index issue, that's OK too.

@Holt59
Member

Holt59 commented Oct 22, 2023

I think originally there were multiple lists of items, Active, pending etc, and custom logic to display them correctly in the downloads tab. The downloads tab itself was rewritten a couple of times. Before it was a list of custom views instead of standard QTreeWidget or whatever we are using now. Ideally we would rewrite everything to be as simple as possible instead of having all these different layers and having to keep them in sync.

That makes a lot of sense. Looking through this, it does seem a bit cobbled together from different evolutions of past refactoring attempts. So what I've tried to do is separate the download manager from the view. That's why there are now two different lists: one to keep track of what's in the view and one to keep track of the download files and their respective meta files.

If this is not the direction you'd like to go, that's OK, but I would like clear direction before I move forward. If you'd rather just leave downloads the way they are now and only patch the index issue, that's OK too.

Having two lists trying to stay in sync is just asking for trouble in the long run - there should be a single list of downloads, like there is a single list of mods; then you can proxy it via a Qt model to display in the view.

Right now (in master), it's pretty ugly and badly done, but at least there is a single place where downloads are stored, in the manager, and DownloadList is just a proxy, similar to ModList for ModInfo.

For me, it'd be better to just fix the index issue and do a complete overhaul of the manager in a later PR.

@Silarn
Member

Silarn commented Oct 22, 2023

I think the only thing I'm not sure about is why we're maintaining a second set of lists in the model class. I think this can be reworked to just get the download list from the manager.

The main reason I have two lists in my revision above is that you have to maintain parity between the view rows and the model data, which requires an indexed list. But the UUIDs ensure we're always referencing the same download, so having a map indexed by the UUIDs makes sense to improve lookup efficiency. I think keeping the two in sync within the download manager isn't all that difficult. (Just take a look at the mod list; pretty sure there are three lists in that one.)

If we can just rework the list model to pull the data from the manager rather than maintaining another set of lists, that will make things simpler I think.

@Silarn
Member

Silarn commented Oct 22, 2023

I think the main difficulty with that is that the tree view does require certain signals to be sent from the data class to update the rows after additions and deletions.

@Silarn
Member

Silarn commented Oct 23, 2023

98a232c corrects another issue with dropping downloads to the mod pane. Because the indexes of the DownloadManager DownloadInfo list do not match up with the DownloadList list, we were loading the wrong DownloadInfo during the drop event.

This reworks the List display to associate the index of the DownloadManager list as the canonical row position for the TreeView. We can then use the UUID map to get the correct info object based on the DownloadInfo.

Ultimately we should probably scrap the extra lists in the DownloadList and just use the Manager data. By adding the DownloadList as a friend class we can access these directly rather than trying to pipe data to the DownloadList and save it separately.

We just need to move the logic for translating the DownloadInfo data into display data into the data function, and the comparisons into the predicate function.

@Al12rs
Member

Al12rs commented Oct 23, 2023

People reported 10-second delays when removing downloads and other operations on 1k-plus download lists. This is almost certainly caused by the DirectoryWatcher not being disabled before things are changed in the downloads folder (this includes any metadata being changed, as Qt creates a .meta.lock file during a metadata update).

I'm tempted to consider ripping out the watcher and introducing our own folder iteration and indexing code, that we can run periodically and at key moments after we know we have concluded all operations.

This should avoid refresh spam caused by our own changes.

An alternative could be changing the signal we are listening on from QFileSystemWatcher::directoryChanged(QString directoryPath) to QFileSystemWatcher::fileChanged(QString filePath), and then do some filtering on the filePath to determine whether the file that was changed is interesting or not.

It still doesn't solve a lot of problems with batch operations, and we might still be leveraging the watcher to trigger refreshes for us after updating some data, so in all I think that manual indexing might be a better solution.
It will take longer to detect outside changes but it is probably worth it for stability.

Could be a separate PR if we go down that road, but we have problems with it right now.

@Holt59
Member

Holt59 commented Oct 23, 2023

People reported 10-second delays when removing downloads and other operations on 1k-plus download lists. This is almost certainly caused by the DirectoryWatcher not being disabled before things are changed in the downloads folder (this includes any metadata being changed, as Qt creates a .meta.lock file during a metadata update).

I'm tempted to consider ripping out the watcher and introducing our own folder iteration and indexing code, that we can run periodically and at key moments after we know we have concluded all operations.

If we do that, then either we have something running at a very high rate, far from ideal, or this will be pretty bad UX for people saving files to the download folder.

This should avoid refresh spam caused by our own changes.

We should analyze what actually triggers extra/recursive refreshes instead of trying to completely disable the watcher and hoping it works this way.

If we move the refresh callback to a thread, then we avoid many "freezing" issues, and we don't even need to disable the watcher, since we can simply check whether a refresh thread is running. This may trigger many signals but will never cause a chain reaction.

Could be a separate PR if we go down that road, but we have problems with it right now.

I think this PR should be reduced to the initial index fix so that we can merge it and release a proper 2.5 at some point.

@Al12rs
Member

Al12rs commented Oct 23, 2023

We should analyze what actually triggers extra/recursive refreshes instead of trying to completely disable the watcher and hoping it works this way.

If we move the refresh callback to a thread, then we avoid many "freezing" issues, and we don't even need to disable the watcher, since we can simply check whether a refresh thread is running. This may trigger many signals but will never cause a chain reaction.

So any file addition/deletion/rename in the downloads folder causes the callback, which in turn causes a refresh.
The actual cases are:

  • A new archive is added to the folder
  • A dl is started
  • A temporary dl file is renamed to the final file
  • A dl file is removed
  • A dl meta file is edited (this creates a .meta.lock file)
    • meta status update
    • meta nexus info update
    • meta installed/uninstalled status update
    • meta hidden status update
    • any other metadata change.
  • All bulk operations can trigger multiple refreshes, one for each change:
    • Mass deletion
    • Mass additions
    • Mass metadata changes

The idea of not disabling the watcher and only refreshing if there isn't a refresh already running has the following downsides/problems:

  • We trigger refreshes early during complex or bulk operations and we don't read the final state during the refresh.
  • Early refreshes can interrupt or mess with operations midway.
  • This is still likely to refresh multiple times during bulk operations, which is wasted performance.

What I think we need is to block any refresh during other operations (like bulk operations or metadata updates) and do a single refresh at the end of that operation, or series of operations. This way, outside changes don't risk interrupting the operation but can still be picked up after the operation is finished, since we trigger the refresh ourselves, and the refresh is guaranteed to be triggered only once per operation.
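The "block refreshes during an operation, then do exactly one at the end" idea could be sketched as a re-entrant guard like the one below. This is a hedged illustration with hypothetical names and plain C++ in place of the Qt watcher machinery: while any guard is alive, watcher callbacks only mark a refresh as pending, and the single deferred refresh runs when the last guard goes out of scope.

```cpp
#include <functional>

class RefreshGate {
public:
  explicit RefreshGate(std::function<void()> refresh)
      : refresh_(std::move(refresh)) {}

  // Called from the directory-watcher callback.
  void onWatcherEvent() {
    if (depth_ > 0) { pending_ = true; return; }  // coalesce while suppressed
    refresh_();
  }

  // RAII guard held for the duration of a bulk operation.
  class Guard {
  public:
    explicit Guard(RefreshGate& g) : g_(g) { ++g_.depth_; }
    ~Guard() {
      if (--g_.depth_ == 0 && g_.pending_) {
        g_.pending_ = false;
        g_.refresh_();  // the one deferred refresh
      }
    }
  private:
    RefreshGate& g_;
  };

private:
  std::function<void()> refresh_;
  int depth_ = 0;
  bool pending_ = false;
};
```

With this shape, an operation that touches a hundred meta files produces exactly one refresh instead of a hundred, and outside changes made during the operation are still picked up by that final refresh.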

@Silarn
Member

Silarn commented Oct 23, 2023

I'm fairly sure most of the additional delay with the current PR is caused by the creation of the display data lists in the DownloadList class.

That being said I do think threading the refresh is a decent idea as long as we visually indicate the refresh is ongoing.

I also think it's possible to check for specific download updates by maintaining a last-check time and only updating downloads that were modified after it, or that are not currently in the list.

I'm fairly certain this would ultimately be much more performant than rebuilding the entire list every time, even accounting for checking all of the files for existing downloads and the file time, since we still avoid having to read the metadata and recreate the DownloadInfo.
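The last-check-time approach above could be sketched as a simple diff pass. Hypothetical names; std types stand in for QFileInfo and friends. Only files that are new or touched after the last check come back for re-parsing, so metadata reads for unchanged downloads are skipped.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

struct FileStamp { std::string path; std::int64_t mtime; };

// Returns the files whose DownloadInfo needs to be (re)created: files not in
// the known set, or files modified after the last refresh.
std::vector<std::string>
filesNeedingUpdate(const std::vector<FileStamp>& onDisk,
                   const std::unordered_map<std::string, std::int64_t>& known,
                   std::int64_t lastCheck) {
  std::vector<std::string> out;
  for (const auto& f : onDisk) {
    auto it = known.find(f.path);
    if (it == known.end() || f.mtime > lastCheck)
      out.push_back(f.path);  // new file, or touched since last refresh
  }
  return out;
}
```

Deletions would be handled symmetrically by checking which known paths no longer appear on disk.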

@Silarn
Member

Silarn commented Oct 23, 2023

In the short term, I think the most performant option, which should resolve race issues with the index, would be to assign the UUIDs and maintain both a map with UUIDs as keys and a list with basic indices. Unfortunately, we can't totally get rid of the list because the TreeView can only reference the model data by row number, so something needs to be associated with the rows.

I think maintaining both, despite the additional complexity, is ultimately better, because it should reduce the reliance on lookup searches once we have either of those keys. Any time we are forced to use the row as the lookup, we should be able to pull the UUID and then use that for any additional access, thus avoiding suddenly referencing the wrong download.

@SaltyTrout
Author

SaltyTrout commented Oct 24, 2023

The performance issues, I think, are related to how I'm checking for the MoID in the meta file on every refresh (via createFromMeta). This only needs to happen once ever. Once all meta files have a valid UUID, this doesn't need to happen again. Any new download will get a MoID, so this is just to backfill meta files that are already there. I was planning to move this call to the initialization of the download tab and possibly show a progress popup so something visible is happening on load. The other thing would be to persist that this event has occurred and not run it again; I'm assuming that would be a setting?

```cpp
if (metaFile.value("moId").isNull() || metaFile.value("moId").toUuid().isNull()) {
  info->m_moId = QUuid::createUuid();
  metaFile.setValue("moId", info->m_moId.toString());
} else {
  info->m_moId = metaFile.value("moId").toUuid();
}
```

This is the bit of code that runs on each refresh at the moment. It should only enter the first branch once and fall into the else afterwards, but it's still probably a performance hit for those that have a lot of downloads.
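The one-time startup backfill described above could look roughly like this. It is a hedged sketch with hypothetical names; a std::map stands in for the QSettings-backed meta files, and a counter stands in for QUuid::createUuid(). Run once at startup, it assigns ids to any meta entry missing one, so the per-refresh path only ever reads.

```cpp
#include <map>
#include <string>

// Assigns an id to every meta entry that lacks one; returns how many were
// backfilled. The real code would iterate meta files and call
// QUuid::createUuid() instead of using a counter.
int backfillIds(std::map<std::string, std::string>& metaIds, int& nextId) {
  int assigned = 0;
  for (auto& [file, id] : metaIds) {
    if (id.empty()) {  // missing or invalid moId
      id = "uuid-" + std::to_string(nextId++);
      ++assigned;
    }
  }
  return assigned;
}
```

After this pass, every subsequent refresh falls into the read-only branch, which is the cheap case.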

@Holt59
Member

Holt59 commented Oct 24, 2023

The performance issues, I think, are related to how I'm checking for the MoID in the meta file on every refresh (via createFromMeta). This only needs to happen once ever. Once all meta files have a valid UUID, this doesn't need to happen again. Any new download will get a MoID, so this is just to backfill meta files that are already there. I was planning to move this call to the initialization of the download tab and possibly show a progress popup so something visible is happening on load. The other thing would be to persist that this event has occurred and not run it again; I'm assuming that would be a setting?

```cpp
if (metaFile.value("moId").isNull() || metaFile.value("moId").toUuid().isNull()) {
  info->m_moId = QUuid::createUuid();
  metaFile.setValue("moId", info->m_moId.toString());
} else {
  info->m_moId = metaFile.value("moId").toUuid();
}
```

This is the bit of code that runs on each refresh at the moment. It should only enter the first branch once and fall into the else afterwards, but it's still probably a performance hit for those that have a lot of downloads.

Why do you need the meta file to contain the mod ID? As far as I understand this PR, the UUID is there to uniquely identify downloads within MO2; there should be no need to put it in the meta file unless you intend to use it elsewhere in MO2, e.g., to reference the installation archive, but that would break backward compatibility, so it should be explained somewhere.

@Holt59
Member

Holt59 commented Oct 24, 2023

We should analyze what actually triggers extra/recursive refreshes instead of trying to completely disable the watcher and hoping it works this way.
If we move the refresh callback to a thread, then we avoid many "freezing" issues, and we don't even need to disable the watcher, since we can simply check whether a refresh thread is running. This may trigger many signals but will never cause a chain reaction.

So any file addition/deletion/rename in the downloads folder causes the callback, which in turn causes a refresh. The actual cases are:

  • A new archive is added to the folder
  • A dl is started
  • A temporary dl file is renamed to the final file
  • A dl file is removed
  • A dl meta file is edited (this creates a .meta.lock file)
    • meta status update
    • meta nexus info update
    • meta installed/uninstalled status update
    • meta hidden status update
    • any other metadata change
  • All bulk operations can trigger multiple refreshes, one for each change:
    • Mass deletion
    • Mass additions
    • Mass metadata changes

I'm not talking about what can trigger the watcher; I'm talking about what can trigger the watcher inside a refresh. A refresh will never create an archive, start a download, etc., so that does not need to be considered.

I'm not sure why we are talking about bulk operations everywhere, since there are currently no bulk operations on the download folder through MO2, so there is no way to "bulk" do something.

The other (or actually main) issue here is that the "refresh" does way more than it should in a very bad way, e.g., a refresh call should never update a metadata file.

The idea of not disabling the watcher and only refreshing if there isn't a refresh already running has the following downsides/problems:

* We trigger refreshes early during complex or bulk operations and we don't read the final state during the refresh.

You can just add a new refresh to the queue. You'll have two refreshes, or more, but you will not have a chain reaction since you will only queue a single refresh at any time.

* Early refreshes can interrupt or mess with operations midway.

This is a separate issue from refresh, this is thread safety.

* This is still likely to refresh multiple times during bulk operations, which is wasted performance.

I don't really see why that would be an issue. Bulk operations will trigger a few refreshes, so what? A download list refresh should not be a huge thing; that's another issue.

What I think we need is to block any refresh during other operations (like bulk operations or metadata updates) and do a single refresh at the end of that operation, or series of operations. This way, outside changes don't risk interrupting the operation but can still be picked up after the operation is finished, since we trigger the refresh ourselves, and the refresh is guaranteed to be triggered only once per operation.

Again, what are "bulk operations"? I don't really see the problem here, you can check which metadata needs update and work with that.

Right now we are using a tool, the file watcher, and trying to work against/around it rather than with it.

@Al12rs
Member

Al12rs commented Oct 24, 2023

I'm not talking about what can trigger the watcher; I'm talking about what can trigger the watcher inside a refresh. A refresh will never create an archive, start a download, etc., so that does not need to be considered.

It's not the refresh itself that causes multiple refreshes; it's the normal operations, or the consequences of those operations.

If you update a metadata value on a DL file, like changing the pause state, adding nexus data, etc., what actually happens is the following:

  • Qt creates a meta.lock file as a backup and as a signal to other Qt threads not to edit the file; this triggers a refresh.
  • Qt edits the .meta file. This could cause a refresh depending on the type of write (delete-write or whatever).
  • Qt finishes writing and removes/renames the lock file, which triggers another refresh.

So any change to a single download's metadata triggers at least two refreshes. If you make multiple changes to metadata, you can trigger more. If you change more than one download, you trigger even more.

I'm not sure why we are talking about bulk operations everywhere, since there are currently no bulk operations on the download folder through MO2, so there is no way to "bulk" do something.

There are currently multiple bulk operations:

  • Hide Installed
  • Hide Uninstalled
  • Hide All
  • Delete Installed
  • Delete Uninstalled
  • Delete All
  • Anything that updates the metadata of multiple download files

Without suspending the refreshes triggered by the watcher, these operations used to take 30+ minutes with 200 downloads.
People have upwards of 2k+ downloads.
Those operations and the appalling performance were the reason I implemented the frankly dreadful stop-watcher hack a long time ago.
Without those changes, even a single operation on a large list would take triple or more the time it does now, and any bigger operation would make the downloads tab completely unusable.

People used to close MO2, remove downloads manually from the folder, and then start it again to avoid performance degradation when going over 200 or so downloads.

Plus, our users would very much like to be able to perform bulk operations on downloads beyond the ones we already support. So not considering bulk operations at this stage would just hamper any improvement in that area in the future.

You can just add a new refresh to the queue. You'll have two refreshes, or more, but you will not have a chain reaction since you will only queue a single refresh at any time.

I don't really see why that would be an issue. Bulk operations will trigger a few refreshes, so what? A download list refresh should not be a huge thing; that's another issue.

That is not acceptable considering that we can have hundreds or thousands of queued refreshes (this is what happened with Hide Installed bulk operation for example).

Currently, triggering multiple refreshes per operation would cripple the performance of the downloads tab to the point of making it unusable.
Improving the performance of the refresh would just limit the impact of the problem without actually addressing the fact that we are spamming refreshes for nothing. We are perfectly capable of avoiding spamming refreshes and wasting performance, so we should do so.

I'm very familiar with the performance of the downloads tab previous to my hack, and doing multiple refreshes is not a viable solution.

Right now we are using a tool, the file watcher, and trying to work against/around it rather than with it.

Right, so we keep the watcher. We put it on a thread, we don't disable the watcher, but we do use a lock or flag or something from the main code to stop refreshes from being triggered while we are performing any operation that could touch the file system multiple times, or that requires a refresh at the end regardless.

In the watcher fileChanged() callback, if the flag is set, we queue a refresh to be performed after the flag is unset (or the lock freed or whatever). If a refresh is already queued, we don't queue more than one.

This way, we can avoid multiple refreshes being triggered when we know that our operation will make multiple filesystem changes (or thousands of them).

@SaltyTrout
Author

The performance issues, I think, are related to how I'm checking for the MoID in the meta file on every refresh (via createFromMeta). This only needs to happen once ever. Once all meta files have a valid UUID, this doesn't need to happen again. Any new download will get a MoID, so this is just to backfill meta files that are already there. I was planning to move this call to the initialization of the download tab and possibly show a progress popup so something visible is happening on load. The other thing would be to persist that this event has occurred and not run it again; I'm assuming that would be a setting?

```cpp
if (metaFile.value("moId").isNull() || metaFile.value("moId").toUuid().isNull()) {
  info->m_moId = QUuid::createUuid();
  metaFile.setValue("moId", info->m_moId.toString());
} else {
  info->m_moId = metaFile.value("moId").toUuid();
}
```

This is the bit of code that runs on each refresh at the moment. It should only enter the first branch once and fall into the else afterwards, but it's still probably a performance hit for those that have a lot of downloads.

What do you need the meta file to contain the mod ID? As far as I understand this PR, the uuid is here to uniquely identify downloads within MO2, there should be no need to put them in meta file unless you intend to use them elsewhere in MO2, e.g., to reference installation archive, but that would break backward compatibility, so this should be explained somewhere.

OK, so, the reason it's being persisted to the meta file is that I'm trying to separate the DownloadList from the DownloadManager. Currently, whenever a refresh occurs, it removes everything from the download manager's list that is downloaded and in a ready state, then reloads it. This used to occur in the view as well. To allow the download manager to continue refreshing its list without affecting the display view, I've created a new initializeList method that is very similar to refreshList. In initializeList, I'm adding everything to the view. In refreshList, I'm only pushing what changed during the refresh to the view. To correlate each download every time it's added/removed, I needed the unique ID to persist with the meta file so that it stays consistent throughout a refresh cycle instead of being newly assigned every time.

@SaltyTrout
Author

I'm not talking about what can trigger the watcher; I'm talking about what can trigger the watcher inside a refresh. A refresh will never create an archive, start a download, etc., so that does not need to be considered.

It's not the refresh itself that causes multiple refreshes; it's the normal operations, or the consequences of those operations.

If you update a metadata value on a DL file, like changing the pause state, adding nexus data, etc., what actually happens is the following:

  • Qt creates a meta.lock file as a backup and as a signal to other Qt threads not to edit the file; this triggers a refresh.
  • Qt edits the .meta file. This could cause a refresh depending on the type of write (delete-write or whatever).
  • Qt finishes writing and removes/renames the lock file, which triggers another refresh.

So any change to a single download's metadata triggers at least two refreshes. If you make multiple changes to metadata, you can trigger more. If you change more than one download, you trigger even more.

Ah, I'm wondering whether setting the UUID in the meta file during refreshList (when it's not there) is what's causing this cycle. So I need to wrap that with the disable-dirwatcher call. That may be the performance issue, then.

@Silarn
Member

Silarn commented Oct 24, 2023

I don't even know why we're talking about multiple refreshes. Whether we disable the watcher or ignore refreshes during an existing refresh, the result is the same. Since the refresh builds the download list over again, either the data is already out of date because something changed in the middle of the refresh operation, or everything gets updated anyway.

I still think a better strategy is to find new, deleted, or newly modified relevant files and only update those when the watcher fires. I think we currently run the full refresh far more often than it needs to run, and without a major rework of how that data is stored and parsed, we're always going to run up against performance issues when we reach thousands of downloads.

I do think we can make a more limited update in the short term to use the UUID to avoid the problems with ambiguous download IDs. This does need to be persistent, which is why it's stored in the download meta. That said, it should be created either when it's missing as we load the metadata or when the download is newly created. This would only add processing time the first time we find a missing UUID. Running a generation pass at startup should also avoid most of this.

Finally, I don't see the point in displaying it on the download pane, particularly not by default. The only thing it's really useful for is potentially debugging the code. Users will have no clue what it is and the use is purely internal.

@Al12rs
Member

Al12rs commented Oct 24, 2023

I don't even know why we're talking about multiple refreshes. Whether we disable the watcher or ignore refreshes during an existing refresh, the result is the same. Since the refresh builds the download list over again, either the data is already out of date because something changed in the middle of the refresh operation, or everything gets updated anyway.

The result isn't the same. The problem isn't that the Refresh operation itself causes multiple refreshes (although in this PR there is also this problem), it's the fact that other operations do multiple changes to the DL folder, causing multiple refreshes.

Like setting the Hidden flag on 100 downloads causes a refresh after each meta file change, effectively 100 refreshes. You want to ignore the watcher's refreshes while performing the bulk hiding operation, and only do a single refresh after you have finished hiding all 100 downloads.

This is to explain why I was arguing against the idea from Holt to remove the disable watcher code entirely.

The watcher currently doesn't return any information regarding which file was changed and how because we are watching an entire folder, so it just returns the path of the folder (the watcher could watch multiple folders).

From the documentation it isn't clear whether the fileChanged() signal also fires when watching a folder and a file inside it changes.

@Silarn
Copy link
Member

Silarn commented Oct 24, 2023

The result isn't the same. The problem isn't that the Refresh operation itself causes multiple refreshes (although in this PR there is also this problem), it's the fact that other operations do multiple changes to the DL folder, causing multiple refreshes.

This isn't what I mean. I mean if we ignore 'new' refresh triggers or disable the watcher, the result is the same (since the watch triggers most of the refreshes).

The only reason to run the refresh twice in a row is because data changed during the refresh, which is just something that we'll have to live with without a major rework to how we detect changes to downloads. (And this is something that can happen with or without MO2 making the changes.)

@Al12rs
Copy link
Member

Al12rs commented Oct 24, 2023

Ah, I'm wondering if setting the uuid in the meta file (when it's missing) inside refreshList is what's causing this cycle. So I need to wrap that with the disable dirwatcher call. That may be the performance issue, then.

Yes, anything writing to the meta or touching files in general should be wrapped in the disable watcher code. I would just do the entire Refresh function.

@Silarn
Copy link
Member

Silarn commented Oct 24, 2023

The refresh function has been and still is disabling the watcher. We could also just rework it to monitor existing refreshes and ignore additional refresh calls during an active refresh. The result is the same.

I do not believe this PR or the existing code is triggering excessive refreshes at the moment. That is not where the current slowdowns are happening; they occur within the span of a single refresh.

Although I do think bulk operations can be reworked to only trigger the refresh one time after updating all of the downloads. I think these may still trigger a refresh after each download is changed, which is completely unnecessary unless we implement a strategy that does not rebuild the entire download list every refresh.

@Al12rs
Copy link
Member

Al12rs commented Oct 24, 2023

The refresh function has been and still is disabling the watcher. We could also just rework it to monitor existing refreshes and ignore additional refresh calls during an active refresh. The result is the same.

Keep in mind that the way the current disable refresh code is done is pretty crap. It's quite possible that it isn't getting disabled correctly or there is an extra endDisableDirWatcher call somewhere. It is just an int that is getting increased and decreased. It should really be either atomic or a lock of some sort and all the calls be changed to use a scoped version.

Also Bulk operations currently should only trigger one refresh at the end. If that is not the case then there is an issue with how the watcher is disabled.

@Silarn
Copy link
Member

Silarn commented Oct 25, 2023

Keep in mind that the way the current disable refresh code is done is pretty crap. It's quite possible that it isn't getting disabled correctly or there is an extra endDisableDirWatcher call somewhere. It is just an int that is getting increased and decreased. It should really be either atomic or a lock of some sort and all the calls be changed to use a scoped version.

The scoped disabler should prevent forgetting to call the 'end' function, which should make this less of an issue. Unless that disabler never goes out of scope, but that would indicate another problem.

I've had some discussions with Trout and I think we're looking at starting with a leaner version that just addresses the unique ID issue for the time being. But for a post 2.5 release I think we've decided there's still some utility in caching the display data in the List class but keeping the indexed DownloadInfo list maintained in one location (the manager).

So essentially the DownloadManager list determines the initial 'rows' while the DownloadList references this to get the UUID and fetch the saved display info from there. All it would need to do is add and prune the display info data (which should help performance in the long run since we don't have to calculate that data with every display update).

@SaltyTrout
Copy link
Author

Keep in mind that the way the current disable refresh code is done is pretty crap. It's quite possible that it isn't getting disabled correctly or there is an extra endDisableDirWatcher call somewhere. It is just an int that is getting increased and decreased. It should really be either atomic or a lock of some sort and all the calls be changed to use a scoped version.

The scoped disabler should prevent forgetting to call the 'end' function, which should make this less of an issue. Unless that disabler never goes out of scope, but that would indicate another problem.

I've had some discussions with Trout and I think we're looking at starting with a leaner version that just addresses the unique ID issue for the time being. But for a post 2.5 release I think we've decided there's still some utility in caching the display data in the List class but keeping the indexed DownloadInfo list maintained in one location (the manager).

So essentially the DownloadManager list determines the initial 'rows' while the DownloadList references this to get the UUID and fetch the saved display info from there. All it would need to do is add and prune the display info data (which should help performance in the long run since we don't have to calculate that data with every display update).

If everyone's good with this approach, I'll close this PR. I'll leave this branch and create a new branch and push another PR once that's ready.

@Al12rs
Copy link
Member

Al12rs commented Oct 26, 2023

So essentially the DownloadManager list determines the initial 'rows' while the DownloadList references this to get the UUID and fetch the saved display info from there. All it would need to do is add and prune the display info data (which should help performance in the long run since we don't have to calculate that data with every display update).

If everyone's good with this approach, I'll close this PR. I'll leave this branch and create a new branch and push another PR once that's ready.

Sounds good.
For the future PR with the bigger changes, as Holt also noted, we should look at avoiding the use of the Row as the ID in QAbstractItemModel::createIndex() so that we may properly use QAbstractItemModel::index() to get a reliable reference to the actual model item.

@SaltyTrout
Copy link
Author

So essentially the DownloadManager list determines the initial 'rows' while the DownloadList references this to get the UUID and fetch the saved display info from there. All it would need to do is add and prune the display info data (which should help performance in the long run since we don't have to calculate that data with every display update).

If everyone's good with this approach, I'll close this PR. I'll leave this branch and create a new branch and push another PR once that's ready.

Sounds good. For the future PR with the bigger changes, as Holt also noted, we should look at avoiding the use of the Row as the ID in QAbstractItemModel::createIndex() so that we may properly use QAbstractItemModel::index() to get a reliable reference to the actual model item.

OK, will do. Closing this PR.

@SaltyTrout SaltyTrout closed this Oct 26, 2023