Queries are slow for large document collections #297

@laserval

When documents are input faster than Hydra consumes them, they will be stored in MongoDB. As the documents collection grows, queries against it get slower (as is expected).

The exists and equals queries are affected badly by this, slowing down to take seconds once the document count reaches 100 000+. MongoDB recommends putting indexes on fields that are queried like this, but that can't be done with the current document store model (there are too many possible content fields to index them all).

To speed up exists, and other queries like it, we could store a list of field names in MongoDB alongside the content map and put an index on that list. The exists query would then check the list instead of the content map. This requires changing MongoQuery and MongoDocument.
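A rough sketch of the idea, in Python rather than Hydra's own code, just to show the document shape and the query filter. The key names `contents` and `fields`, and all function names, are assumptions for illustration and not Hydra's actual schema or API. The "legacy" filter is only a guess at what an exists check against the content map looks like today.

```python
# Hypothetical key names: "contents" (the content map) and "fields"
# (the proposed list of field names). Neither is taken from Hydra's schema.

def to_stored_document(contents):
    """Build the stored document with a list of field names next to the content map."""
    return {
        "contents": dict(contents),
        "fields": sorted(contents.keys()),
    }


def exists_query(field):
    """Filter that matches documents whose "fields" list contains the given name."""
    # In MongoDB, equality against an array field matches any element,
    # so one index on "fields" can serve exists queries for every field name.
    return {"fields": field}


def legacy_exists_query(field):
    """Assumed shape of an exists check made directly against the content map."""
    return {"contents." + field: {"$exists": True}}


doc = to_stored_document({"title": "Hydra", "body": "text"})
print(doc["fields"])
print(exists_query("title"))
print(legacy_exists_query("title"))
```

The point of the extra list is that a single index on it covers all field names, whereas indexing the content map would need one index per content field.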

For equals, there is little we can do from within Hydra. Users are free to put an index on the specific content field in question directly in MongoDB.
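For illustration, a user-side index on a single content field might be created roughly like this. The `contents` key and the helper name are assumptions. `collection` is expected to behave like a PyMongo collection, whose `create_index` accepts a list of `(key, direction)` pairs. A small stand-in class is used here so the sketch runs without a database.

```python
def index_content_field(collection, field, contents_key="contents"):
    """Create an ascending index on one field inside the content map."""
    # "contents" is an assumed key name; dot notation addresses the nested field.
    key = contents_key + "." + field
    return collection.create_index([(key, 1)])  # 1 means ascending


class RecordingCollection:
    """Stand-in for a real collection; it only records the requested indexes."""

    def __init__(self):
        self.requested = []

    def create_index(self, keys):
        self.requested.append(keys)
        return "_".join(f"{name}_{direction}" for name, direction in keys)


stub = RecordingCollection()
index_name = index_content_field(stub, "title")
print(index_name, stub.requested)
```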
