docs: Add documentation for search query flow. #1326


Draft

junhaoliao wants to merge 4 commits into y-scope:main from junhaoliao:query-docs

Member

junhaoliao commented Sep 24, 2025

Description

Checklist

The PR satisfies the contribution guidelines.
This is a breaking change and that has been indicated in the PR title, OR this isn't a breaking change.
Necessary docs have been updated, OR no docs need to be updated.

Validation performed

Ran `task docs:serve` and viewed http://127.0.0.1:8080/dev-docs/design-search-query-flow.html to confirm the document rendered as expected.


          docs: Add documentation for search query flow.

401098e

junhaoliao requested a review from All-less

September 24, 2025 15:15

Contributor

coderabbitai bot commented Sep 24, 2025

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


junhaoliao added 3 commits

September 24, 2025 11:23


          Add sequence diagram for search query flow and reorganize result retrieval section

930d6f9


          fix mermaid diagram directive for sphinx

fa198d8


          normalize cases

13c6fc0

junhaoliao commented

docs/src/dev-docs/design-search-query-flow.md


		Upon receiving the query, the server:

		1. Creates two jobs in MySQL database:

Member Author

junhaoliao Sep 24, 2025

TODO: mention name of the query job table

junhaoliao commented

docs/src/dev-docs/design-search-query-flow.md

Comment on lines +109 to +110

		- Collection named after `searchJobId`
		- Collection named after `aggregationJobId`

Member Author

junhaoliao Sep 24, 2025

Suggested change

-		- Collection named after `searchJobId`
-		- Collection named after `aggregationJobId`
+		- Collection named after `searchJobId`, for receiving log event results
+		- Collection named after `aggregationJobId`, for receiving time-based aggregation results

All-less reviewed

docs/src/dev-docs/design-search-query-flow.md


		### 4. Result caching in MongoDB

		Search results are stored in MongoDB with the following characteristics:

Contributor

All-less Oct 3, 2025

(Personally, I would like more description here about how the results are handled. Below is my attempt, and feel free to correct/refine it.)

MongoDB gathers the results from the distributed search and caches them for future use (e.g., subsequent user queries). After the workers finish their search tasks, they write the results into a MongoDB collection.
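To make the description above concrete, here is a minimal sketch of the caching pattern, assuming one collection per job keyed by the job ID. It uses an in-memory dictionary as a stand-in for MongoDB so it runs without a database; the `ResultsCache` class, its method names, and the `timestamp`/`message` fields are all hypothetical and not taken from the CLP code.

```python
from collections import defaultdict


class ResultsCache:
    """In-memory stand-in for the MongoDB results database (illustrative only)."""

    def __init__(self):
        # Maps a collection name (the job ID as a string) to its documents.
        self._collections = defaultdict(list)

    def write_results(self, job_id, results):
        """Append results from one worker's finished search task."""
        self._collections[str(job_id)].extend(results)

    def read_results(self, job_id):
        """Return cached results for a job without re-running the search."""
        return list(self._collections.get(str(job_id), []))


cache = ResultsCache()
# Two hypothetical workers handle different parts of the same search job.
cache.write_results(42, [{"timestamp": 1, "message": "error: disk full"}])
cache.write_results(42, [{"timestamp": 2, "message": "error: timeout"}])
cached = cache.read_results(42)
```

A later request for job 42 can be answered from `cached` directly; a job ID with no collection yields an empty list.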

All-less reviewed

docs/src/dev-docs/design-search-query-flow.md


		Search results are stored in MongoDB with the following characteristics:

		1. Collection structure: Each job has its own collection named after its job ID

Contributor

All-less Oct 3, 2025

Collection name: Each job has its own collection named after its job ID (see bullet 2 of "2. Server-side processing")

All-less reviewed

docs/src/dev-docs/design-search-query-flow.md

+              The storage and query engine settings are configured in the package's `etc/clp-config.yml` and
+              they are passed to the WebUI client via `client/public/settings.json` and the server via
+              `server/settings.json` at startup.
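As a rough illustration of the flow these lines describe, the sketch below copies engine settings from a package-level config into the two `settings.json` files. It is an assumption-laden sketch, not the package's actual start-up code: the `write_settings` function, the `storage_engine`/`query_engine` key names, and the placeholder values are hypothetical, and the config is passed in as a dictionary rather than parsed from `etc/clp-config.yml`.

```python
import json
import tempfile
from pathlib import Path


def write_settings(package_config, webui_dir):
    """Write the shared engine settings for both the client and the server."""
    # Hypothetical key names; the real config schema may differ.
    shared = {
        "storage_engine": package_config["storage_engine"],
        "query_engine": package_config["query_engine"],
    }
    client_path = webui_dir / "client" / "public" / "settings.json"
    server_path = webui_dir / "server" / "settings.json"
    for path in (client_path, server_path):
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(shared, indent=2))
    return client_path, server_path


with tempfile.TemporaryDirectory() as tmp:
    client_path, server_path = write_settings(
        {"storage_engine": "engine-a", "query_engine": "engine-b"}, Path(tmp)
    )
    client_settings = json.loads(client_path.read_text())
    server_settings = json.loads(server_path.read_text())
```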

Contributor

All-less Oct 3, 2025

Maybe list the major databases and table/collection structures here. They are mentioned several times throughout the document, so it's helpful to give readers a quick preview.

## Major Data Structures

1. A `query_jobs` table in MariaDB/MySQL tracks the search job status.
   - `id`: The ID of the query job.
   - `status`: The status of the query job.
   - `job_config`: The configuration of the query job.

2. The `clp-query-results` database in MongoDB caches the search results.
...

3. The `results-metadata` database in MongoDB tracks ...
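For readers who want to see the `query_jobs` table in action, here is a minimal sketch using an in-memory SQLite database in place of MariaDB/MySQL. Only the table name and the three column names come from the list above; the column types, the `PENDING` status value, the `submit_job` helper, and the contents of `job_config` are assumptions made for illustration.

```python
import json
import sqlite3

# SQLite stands in for MariaDB/MySQL so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE query_jobs (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        status TEXT NOT NULL,      -- assumed type and values
        job_config TEXT NOT NULL   -- assumed: serialized job configuration
    )
    """
)


def submit_job(job_config):
    """Insert a job row and return its ID (hypothetical helper)."""
    cur = conn.execute(
        "INSERT INTO query_jobs (status, job_config) VALUES (?, ?)",
        ("PENDING", json.dumps(job_config)),
    )
    conn.commit()
    return cur.lastrowid


# One query produces two jobs: a search job and an aggregation job.
search_job_id = submit_job({"type": "search", "query": "error"})
aggregation_job_id = submit_job({"type": "aggregation", "query": "error"})
```

The two returned IDs correspond to the `searchJobId` and `aggregationJobId` that name the result collections.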
