Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: extend smart search, match exif-description too #14380

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

TheHamkerCat
Copy link

@TheHamkerCat TheHamkerCat commented Nov 27, 2024

Adding a feature to search images based on description.

I've extended /search/smart instead of creating /search/description, implemented due to personal need but wanted to share so everyone can benefit. Feedback welcome!

changelog:
When searching "beach vacation", results will first show images that have "beach vacation" in their description, and then visually similar images by embeddings (past behavior)

@mertalev
Copy link
Contributor

mertalev commented Nov 27, 2024

Thanks for the PR! This has been a much-requested feature that I'm sure will bring a smile to many faces. There are some changes that need to be made first, though.

You should create a trigram index and adjust this query to use it. They perform better, both for search time, finding results and in working with different languages. You can see searchPlaces and the migrations containing "admin1Name" etc. to get an idea of how to do this. npm run typeorm:migrations:create will create an empty migration file to add the index. Also use the f_unaccent function like the other indices.

The query will also need to be restructured. As written, it will do a brute force search on both all embeddings and all exif rows. This is a more performant way to write this query as it lets both text and embeddings leverage indices for the bulk of the query and rerank on a much smaller set of rows. You can make the query from searchAssetBuilder its own CTE that the other two CTEs reference.

@TheHamkerCat
Copy link
Author

Thanks @mertalev will do this on weekend!

@mertalev
Copy link
Contributor

Sounds good! I'd also like to mention that we're changing from TypeORM to Kysely very soon (#12857), so it may be the case that the change happens before this PR is merged. The DB migration won't be affected by this, but the searchSmart changes would need to be refactored to use Kysely if this happens. Feel free to continue working on this PR if this is fine for you. Otherwise, you might want to wait until that PR is merged.

@TheHamkerCat
Copy link
Author

I'll wait for it to be merged in that case, will be easier for me since I'm not that much familiar with the codebase.

@bo0tzz bo0tzz marked this pull request as draft November 28, 2024 11:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants