feat(server): library refresh go brrr #14456
0eb1440 to 80aa615
80aa615 to 8ecde3b
Nice start! I think there are still a lot of untapped potential improvements here.
}

private async handleSyncAsset(id: string, importPaths: string[], exclusionPatterns: string[]): Promise<JobStatus> {
  const asset = await this.assetRepository.getById(id);
  if (!asset) {
    return JobStatus.SKIPPED;
  }

  const markOffline = async (explanation: string) => {
Just log directly without this function since the offline status will be set at the batch level at the end.
@OnJob({ name: JobName.LIBRARY_SYNC_ASSETS, queue: QueueName.LIBRARY })
async handleSyncAssets(job: JobOf<JobName.LIBRARY_SYNC_ASSETS>): Promise<JobStatus> {
  for (const id of job.ids) {
    await this.handleSyncAsset(id, job.importPaths, job.exclusionPatterns);
Fetch all the assets first with `getByIds`. Group the ones to be marked offline, queued for metadata extraction, etc. as you check them, then do batched async calls at the end as needed. The try/catch for `stat` should still be scoped to the asset so one error won't torpedo the batch.
The update to
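The grouping described in this review comment could look roughly like the sketch below. It is only an illustration under assumptions: `Asset`, `SyncPlan`, `StatFn`, `planSync`, and the `fileModifiedAt` comparison are hypothetical stand-ins, not code from this PR. The point it shows is that the try/catch wraps a single `stat` call, so one unreadable file only moves that asset into the offline group.

```typescript
// Hypothetical shapes for illustration; the real entity has many more fields.
interface Asset {
  id: string;
  originalPath: string;
  fileModifiedAt: Date;
}

interface SyncPlan {
  offlineIds: string[]; // assets whose file could not be stat-ed
  refreshIds: string[]; // assets whose file changed and needs metadata extraction
}

// Injected so the sketch does not depend on a specific storage layer.
type StatFn = (path: string) => Promise<{ mtime: Date }>;

async function planSync(assets: Asset[], stat: StatFn): Promise<SyncPlan> {
  const plan: SyncPlan = { offlineIds: [], refreshIds: [] };
  for (const asset of assets) {
    try {
      // The try/catch is scoped to one asset, so one failure cannot abort the batch.
      const { mtime } = await stat(asset.originalPath);
      if (mtime.getTime() !== asset.fileModifiedAt.getTime()) {
        plan.refreshIds.push(asset.id);
      }
    } catch {
      plan.offlineIds.push(asset.id);
    }
  }
  return plan;
}
```

With a plan like this in hand, the job handler would need at most one batched update for `offlineIds` and one batched queue call for `refreshIds`, instead of one call per asset.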
Thanks for your comments, @mertalev! I'll first attempt to do the import path and exclusion pattern checks in SQL and then move on to your suggestions.
d394654 to 8b2a48c
6d69307 to c26f6aa
c26f6aa to a3be620
.where({ isOffline: false })
.andWhere(
  new Brackets((qb) => {
    qb.where('originalPath NOT SIMILAR TO :paths', {
Use `LIKE` instead of `SIMILAR TO`.
The exclusions and import paths are also specific to a particular library, right? So you need to specify the library in the query.
Also, can you generate SQL for this and confirm with `EXPLAIN ANALYZE` that it uses an index?
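As a rough illustration of the first two points, the sketch below builds a library-scoped query with one `NOT LIKE` prefix check per import path. Everything in it is an assumption made for the example: the helper names `escapeLike` and `buildOutsideImportPathsQuery`, the `"assets"` table name, and the `"libraryId"` column are hypothetical, and exclusion patterns are left out because translating globs to `LIKE` needs more care.

```typescript
// Escape LIKE wildcards so a literal path cannot act as a pattern.
// Backslash is the default LIKE escape character in PostgreSQL.
function escapeLike(value: string): string {
  return value.replace(/[\\%_]/g, (ch) => `\\${ch}`);
}

interface Query {
  sql: string;
  params: string[];
}

// Hypothetical helper: selects assets of one library that sit outside every import path.
function buildOutsideImportPathsQuery(libraryId: string, importPaths: string[]): Query {
  const params: string[] = [libraryId];
  const clauses = importPaths.map((path) => {
    // Force a trailing slash so "/mnt/photos" does not also match "/mnt/photos2".
    const prefix = path.endsWith('/') ? path : `${path}/`;
    params.push(`${escapeLike(prefix)}%`);
    return `AND "originalPath" NOT LIKE $${params.length}`;
  });
  const sql = [
    'SELECT "id" FROM "assets"', // table and column names are assumptions
    'WHERE "libraryId" = $1 AND "isOffline" = false',
    ...clauses,
  ].join('\n');
  return { sql, params };
}
```

Prefixing the generated statement with `EXPLAIN ANALYZE` in `psql` would then show the actual plan, which is the only reliable way to tell whether an index is used for a given schema.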
.update()
.set({
  isOffline: true,
  deletedAt: new Date(),
The `status` also needs to be set. This is why I don't really like the status field. The same info is stored in multiple places, so it's easy for it to go out of sync like this.
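One way to reduce the risk of these fields drifting apart is to build them in a single place. The sketch below is hypothetical: `OfflineFields` and `offlineFields` are invented names, and the status value is passed in as a parameter because the correct enum member is not stated in this thread.

```typescript
// Hypothetical shape: the three columns that have to change together.
interface OfflineFields<S> {
  isOffline: boolean;
  deletedAt: Date;
  status: S;
}

// Building the patch in one helper means a caller cannot set two fields and forget the third.
function offlineFields<S>(offlineStatus: S, now: Date = new Date()): OfflineFields<S> {
  return { isOffline: true, deletedAt: now, status: offlineStatus };
}
```

The result could be passed to `.set(...)` in the update above, so every code path that marks assets offline writes the same three columns.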
@OnJob({ name: JobName.LIBRARY_SYNC_ASSETS, queue: QueueName.LIBRARY })
async handleSyncAssets(job: JobOf<JobName.LIBRARY_SYNC_ASSETS>): Promise<JobStatus> {
  for (const id of job.ids) {
    const asset = await this.assetRepository.getById(id);
Do `getByIds` in one call.
a3be620 to ef4db4e
For a library, we currently queue one job per asset to check whether it is still online. It is more efficient to create one job per 10k assets instead: a tight loop inside the library service is cheaper than 10k tiny jobs.
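A minimal sketch of the batching idea, assuming the job payload carries `ids`, `importPaths`, and `exclusionPatterns` as in the handler shown in the review. The helper names `chunk` and `buildSyncJobs` and the `SyncAssetsJob` interface are hypothetical, not taken from the PR.

```typescript
// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Hypothetical payload shape, mirroring the fields read from `job` in the handler.
interface SyncAssetsJob {
  ids: string[];
  importPaths: string[];
  exclusionPatterns: string[];
}

// One job per batch of asset ids instead of one job per asset.
function buildSyncJobs(
  ids: string[],
  importPaths: string[],
  exclusionPatterns: string[],
  batchSize = 10000,
): SyncAssetsJob[] {
  return chunk(ids, batchSize).map((batch) => ({ ids: batch, importPaths, exclusionPatterns }));
}
```

For a library of 22573 assets, this would produce 3 jobs rather than 22573.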
Other things changed:
A quick performance test for an external library with 22573 assets on my 32-core server got the following stats for a rescan
This performance improvement is important since libraries rarely change and a library rescan is run quite often.