| 1 | +--- |
| 2 | +title: Optimize indexing performance with batch statistics |
| 3 | +description: Learn how to analyze the `progressTrace` to identify and resolve indexing bottlenecks in Meilisearch. |
| 4 | +--- |
| 5 | + |
| 6 | +# Optimize indexing performance by analyzing batch statistics |
| 7 | + |
| 8 | +Indexing performance can vary significantly depending on your dataset, index settings, and hardware. The [batch object](/reference/api/batches) provides information about the progress of asynchronous indexing operations. |
| 9 | + |
| 10 | +The `progressTrace` field within the batch object offers a detailed breakdown of where time is spent during the indexing process. Use this data to identify bottlenecks and improve indexing speed. |
| 11 | + |
| 12 | +## Understanding the `progressTrace` |
| 13 | + |
| 14 | +`progressTrace` is a hierarchical trace showing each phase of indexing and how long it took. |
| 15 | +Each entry follows the structure: |
| 16 | + |
| 17 | +```json |
| 18 | +"processing tasks > indexing > extracting word proximity": "33.71s" |
| 19 | +``` |
| 20 | + |
| 21 | +This means: |
| 22 | + |
| 23 | +- The step occurred during **indexing**. |
| 24 | +- The subtask was **extracting word proximity**. |
| 25 | +- It took **33.71 seconds**. |
| 26 | + |
| 27 | +Focus on the **longest-running steps** and investigate which index settings or data characteristics influence them. |
| 28 | + |
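To find the longest-running steps, a small script can rank the trace entries by duration. The following is a minimal sketch, not an official tool. It assumes durations are printed as a number followed by a unit suffix (only `s` appears in the example above; `ms`, `µs`, and `ns` are assumptions), and that parent entries such as `processing tasks > indexing` include the time of their children, so only leaf entries are ranked. The sample trace values are hypothetical.

```python
import re

# Seconds per unit. Only the "s" suffix appears in the example above;
# the other suffixes are assumptions about how shorter durations are printed.
UNIT_SECONDS = {"s": 1.0, "ms": 1e-3, "µs": 1e-6, "us": 1e-6, "ns": 1e-9}
DURATION_RE = re.compile(r"^([0-9]+(?:\.[0-9]+)?)(ns|µs|us|ms|s)$")


def parse_duration(text):
    """Convert a trace value such as "33.71s" into seconds."""
    match = DURATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    value, unit = match.groups()
    return float(value) * UNIT_SECONDS[unit]


def slowest_steps(progress_trace, limit=5):
    """Return (key, seconds) pairs for the slowest leaf entries."""
    keys = list(progress_trace)
    leaves = [
        key for key in keys
        # Skip parent entries, which are assumed to include their children.
        if not any(other.startswith(key + " > ") for other in keys)
    ]
    timed = [(key, parse_duration(progress_trace[key])) for key in leaves]
    timed.sort(key=lambda pair: pair[1], reverse=True)
    return timed[:limit]


# Hypothetical trace; only the word proximity duration comes from the example above.
trace = {
    "processing tasks > indexing": "41.30s",
    "processing tasks > indexing > extracting word proximity": "33.71s",
    "processing tasks > indexing > extracting words": "6.02s",
    "processing tasks > indexing > waiting for database writes": "850.00ms",
}

for key, seconds in slowest_steps(trace, limit=3):
    # Print only the last segment of the key, which names the step.
    print(f"{seconds:9.2f}s  {key.split(' > ')[-1]}")
```

In practice, `trace` would be the `progressTrace` object copied from a batch returned by the API.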
| 29 | +## Key phases and how to optimize them |
| 30 | + |
| 31 | +### `computing document changes` and `extracting documents`
| 32 | + |
| 33 | +| Description | Optimization | |
| 34 | +|--------------|--------------| |
| 35 | +| Meilisearch compares incoming documents to existing ones. | No direct optimization possible. Process duration scales with the number and size of incoming documents.| |
| 36 | + |
| 37 | +### `extracting facets` and `merging facet caches` |
| 38 | + |
| 39 | +| Description | Optimization | |
| 40 | +|--------------|--------------| |
| 41 | +| Extracts and merges filterable attributes. | Keep the number of [**filterable attributes**](/reference/api/settings#filterable-attributes) to a minimum. | |
| 42 | + |
| 43 | +### `extracting words` and `merging word caches` |
| 44 | + |
| 45 | +| Description | Optimization | |
| 46 | +|--------------|--------------| |
| 47 | +| Tokenizes text and builds the inverted index. | Ensure the [searchable attributes](/reference/api/settings#searchable-attributes) list only includes the fields you want to be checked for query word matches. | |
| 48 | + |
| 49 | +### `extracting word proximity` and `merging word proximity` |
| 50 | + |
| 51 | +| Description | Optimization | |
| 52 | +|--------------|--------------| |
| 53 | +| Builds data structures for phrase and attribute ranking. | Lower the precision of this operation by setting [proximity precision](/reference/api/settings#proximity-precision) to `byAttribute`. |
| 54 | + |
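Changing the proximity precision is a single settings call. The sketch below builds the request without sending it. It assumes the sub-setting route is `settings/proximity-precision` and accepts `PUT`, mirroring the facet search route used later on this page; `http://localhost:7700` and `movies` are placeholders, and `build_setting_request` is a hypothetical helper rather than part of any SDK.

```python
import json
import urllib.request


def build_setting_request(base_url, index_uid, setting, value, api_key=None):
    """Build (but do not send) a PUT request for one index sub-setting."""
    url = f"{base_url.rstrip('/')}/indexes/{index_uid}/settings/{setting}"
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: the instance is protected by an API key.
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps(value).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# Placeholder URL and index name; replace them with your own values.
req = build_setting_request(
    "http://localhost:7700", "movies", "proximity-precision", "byAttribute"
)
print(req.get_method(), req.full_url, req.data.decode("utf-8"))

# To apply the change, send the request to a running instance:
# with urllib.request.urlopen(req) as response:
#     print(response.status, response.read().decode("utf-8"))
```

Keeping the request construction separate from the network call makes it easy to inspect the URL and body before applying a change that triggers reindexing.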
| 55 | +### `waiting for database writes` |
| 56 | + |
| 57 | +| Description | Optimization | |
| 58 | +|--------------|--------------| |
| 59 | +| Time spent writing data to disk. | No direct optimization possible. Either the disk is too slow or you are writing too much data in a single operation. Avoid HDDs (hard disk drives). |
| 60 | + |
| 61 | +### `waiting for extractors` |
| 62 | + |
| 63 | +| Description | Optimization | |
| 64 | +|--------------|--------------| |
| 65 | +| Time spent waiting for CPU-bound extraction. | No direct optimization possible. Indicates a CPU bottleneck. Use more cores or scale horizontally with [sharding](/learn/advanced/sharding). | |
| 66 | + |
| 67 | +### `post processing facets > strings bulk` / `numbers bulk` |
| 68 | + |
| 69 | +| Description | Optimization | |
| 70 | +|--------------|--------------| |
| 71 | +| Processes equality or comparison filters. | - Disable unused [**filter features**](/reference/api/settings#features), such as comparison operators on string values. <br /> - Reduce the number of [**sortable attributes**](/reference/api/settings#sortable-attributes). |
| 72 | + |
| 73 | +### `post processing facets > facet search` |
| 74 | + |
| 75 | +| Description | Optimization | |
| 76 | +|--------------|--------------| |
| 77 | +| Builds structures for the [facet search API](/reference/api/facet_search). | If you don’t use the facet search API, [disable it](/reference/api/settings#update-facet-search-settings).| |
| 78 | + |
| 79 | +### Embeddings |
| 80 | + |
| 81 | +| Trace key | Description | Optimization | |
| 82 | +|------------|--------------|--------------| |
| 83 | +| `writing embeddings to database` | Time spent saving vector embeddings. | - Use embedding vectors with fewer dimensions. <br/>- Disable [embedding regeneration on document update](/reference/api/documents#vectors). <br/>- Consider enabling [binary quantization](/reference/api/settings#binaryquantized). |
| 84 | + |
| 85 | +### `post processing words > word prefix *` |
| 86 | + |
| 87 | +| Description | Optimization | |
| 88 | +|--------------|--------------| |
| 89 | +| Builds prefix data for autocomplete. Allows matching documents that begin with a specific query term, instead of only exact matches. | Disable [**prefix search**](/reference/api/settings#prefix-search) (`prefixSearch: disabled`). _This can severely impact search result relevancy._ |
| 90 | + |
| 91 | +### `post processing words > word fst` |
| 92 | + |
| 93 | +| Description | Optimization | |
| 94 | +|--------------|--------------| |
| 95 | +| Builds the word FST (finite state transducer). | No direct action possible, as FST size reflects the number of different words in the database. Using documents with fewer searchable words may improve operation speed. |
| 96 | + |
| 97 | +## Example analysis |
| 98 | + |
| 99 | +If you see: |
| 100 | + |
| 101 | +```json |
| 102 | +"processing tasks > indexing > post processing facets > facet search": "1763.06s" |
| 103 | +``` |
| 104 | + |
| 105 | +[Facet searching](/learn/filtering_and_sorting/search_with_facet_filters#searching-facet-values) is taking significant indexing time. If your application doesn’t use facet search, disable the feature:
| 106 | + |
| 107 | +```bash |
| 108 | +curl \ |
| 109 | + -X PUT 'MEILISEARCH_URL/indexes/INDEX_UID/settings/facet-search' \ |
| 110 | + -H 'Content-Type: application/json' \ |
| 111 | + --data-binary 'false' |
| 112 | +``` |
| 113 | + |
| 114 | +## Learn more |
| 115 | + |
| 116 | +- [Indexing best practices](/learn/indexing/indexing_best_practices) |
| 117 | +- [Impact of RAM and multi-threading on indexing performance](/learn/indexing/ram_multithreading_performance)
| 119 | +- [Configuring index settings](/learn/configuration/configuring_index_settings) |