Fix sentence-transformers examples links #2843

Merged · 2 commits · May 20, 2025
embedding-quantization.md: 10 changes (5 additions & 5 deletions)
@@ -340,13 +340,13 @@ The following [demo](https://huggingface.co/spaces/sentence-transformers/quantiz
The following scripts can be used to experiment with embedding quantization for retrieval & beyond (a minimal usage sketch follows the list). There are three categories:

* **Recommended Retrieval**:
* [semantic_search_recommended.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/applications/embedding-quantization/semantic_search_recommended.py): This script combines binary search with scalar rescoring, much like the above demo, for cheap, efficient, and performant retrieval.
* [semantic_search_recommended.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/sentence_transformer/applications/embedding-quantization/semantic_search_recommended.py): This script combines binary search with scalar rescoring, much like the above demo, for cheap, efficient, and performant retrieval.
* **Usage**:
* [semantic_search_faiss.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/applications/embedding-quantization/semantic_search_faiss.py): This script showcases regular usage of binary or scalar quantization, retrieval, and rescoring using FAISS, by using the <a href="https://sbert.net/docs/package_reference/quantization.html#sentence_transformers.quantization.semantic_search_faiss"><code>semantic_search_faiss</code></a> utility function.
* [semantic_search_usearch.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/applications/embedding-quantization/semantic_search_usearch.py): This script showcases regular usage of binary or scalar quantization, retrieval, and rescoring using USearch, by using the <a href="https://sbert.net/docs/package_reference/quantization.html#sentence_transformers.quantization.semantic_search_usearch"><code>semantic_search_usearch</code></a> utility function.
* [semantic_search_faiss.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/sentence_transformer/applications/embedding-quantization/semantic_search_faiss.py): This script showcases regular usage of binary or scalar quantization, retrieval, and rescoring using FAISS, by using the <a href="https://sbert.net/docs/package_reference/quantization.html#sentence_transformers.quantization.semantic_search_faiss"><code>semantic_search_faiss</code></a> utility function.
* [semantic_search_usearch.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/sentence_transformer/applications/embedding-quantization/semantic_search_usearch.py): This script showcases regular usage of binary or scalar quantization, retrieval, and rescoring using USearch, by using the <a href="https://sbert.net/docs/package_reference/quantization.html#sentence_transformers.quantization.semantic_search_usearch"><code>semantic_search_usearch</code></a> utility function.
* **Benchmarks**:
* [semantic_search_faiss_benchmark.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/applications/embedding-quantization/semantic_search_faiss_benchmark.py): This script includes a retrieval speed benchmark of `float32` retrieval, binary retrieval + rescoring, and scalar retrieval + rescoring, using FAISS. It uses the <a href="https://sbert.net/docs/package_reference/quantization.html#sentence_transformers.quantization.semantic_search_faiss"><code>semantic_search_faiss</code></a> utility function. Our benchmarks especially show speedups for `ubinary`.
* [semantic_search_usearch_benchmark.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/applications/embedding-quantization/semantic_search_usearch_benchmark.py): This script includes a retrieval speed benchmark of `float32` retrieval, binary retrieval + rescoring, and scalar retrieval + rescoring, using USearch. It uses the <a href="https://sbert.net/docs/package_reference/quantization.html#sentence_transformers.quantization.semantic_search_usearch"><code>semantic_search_usearch</code></a> utility function. Our experiments show large speedups on newer hardware, particularly for `int8`.
* [semantic_search_faiss_benchmark.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/sentence_transformer/applications/embedding-quantization/semantic_search_faiss_benchmark.py): This script includes a retrieval speed benchmark of `float32` retrieval, binary retrieval + rescoring, and scalar retrieval + rescoring, using FAISS. It uses the <a href="https://sbert.net/docs/package_reference/quantization.html#sentence_transformers.quantization.semantic_search_faiss"><code>semantic_search_faiss</code></a> utility function. Our benchmarks especially show speedups for `ubinary`.
* [semantic_search_usearch_benchmark.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/sentence_transformer/applications/embedding-quantization/semantic_search_usearch_benchmark.py): This script includes a retrieval speed benchmark of `float32` retrieval, binary retrieval + rescoring, and scalar retrieval + rescoring, using USearch. It uses the <a href="https://sbert.net/docs/package_reference/quantization.html#sentence_transformers.quantization.semantic_search_usearch"><code>semantic_search_usearch</code></a> utility function. Our experiments show large speedups on newer hardware, particularly for `int8`.
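
To make the pipeline in these scripts concrete, here is a minimal sketch of binary quantization plus FAISS retrieval with rescoring. The model name is an arbitrary choice, and the exact keyword arguments and result format of `semantic_search_faiss` (such as `corpus_precision`, `rescore_multiplier`, and `output_index`) are assumptions based on the documented utility; treat this as an illustration, not a drop-in replacement for `semantic_search_recommended.py`:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.quantization import quantize_embeddings, semantic_search_faiss

# Any dense embedding model works; this one is a small placeholder choice.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

corpus = [
    "The weather is lovely today.",
    "He drove his car to the stadium.",
    "Quantized embeddings trade a little accuracy for much cheaper retrieval.",
]
queries = ["Is it nice outside?"]

# Encode at float32, then quantize the corpus to (unsigned) binary: ~32x smaller.
corpus_embeddings = model.encode(corpus, normalize_embeddings=True)
query_embeddings = model.encode(queries, normalize_embeddings=True)
binary_corpus = quantize_embeddings(corpus_embeddings, precision="ubinary")

# Binary search with rescoring: retrieve rescore_multiplier * top_k candidates
# via fast Hamming-distance search, then rescore them with the float32 queries.
results, search_time, corpus_index = semantic_search_faiss(
    query_embeddings,
    corpus_embeddings=binary_corpus,
    corpus_precision="ubinary",
    top_k=2,
    rescore=True,
    rescore_multiplier=4,
    output_index=True,
)
for hit in results[0]:
    print(corpus[hit["corpus_id"]], hit["score"])
```

In the same spirit, the benchmark scripts above compare this setup against plain `float32` retrieval and scalar (`int8`) retrieval plus rescoring.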

### Future work

matryoshka.md: 14 changes (7 additions & 7 deletions)
@@ -111,9 +111,9 @@ References:

See the following complete scripts as examples of how to apply the `MatryoshkaLoss` in practice (a minimal sketch of the shared setup follows the list):

* **[matryoshka_nli.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/matryoshka/matryoshka_nli.py)**: This example uses the `MultipleNegativesRankingLoss` with `MatryoshkaLoss` to train a strong embedding model using Natural Language Inference (NLI) data. It is an adaptation of the [NLI](../nli/README) documentation.
* **[matryoshka_nli_reduced_dim.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/matryoshka/matryoshka_nli_reduced_dim.py)**: This example uses the `MultipleNegativesRankingLoss` with `MatryoshkaLoss` to train a strong embedding model with a small maximum output dimension of 256. It trains using Natural Language Inference (NLI) data, and is an adaptation of the [NLI](../nli/README) documentation.
* **[matryoshka_sts.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/matryoshka/matryoshka_sts.py)**: This example uses the `CoSENTLoss` with `MatryoshkaLoss` to train an embedding model on the training set of the `STSBenchmark` dataset. It is an adaptation of the [STS](../sts/README) documentation.
* **[matryoshka_nli.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/sentence_transformer/training/matryoshka/matryoshka_nli.py)**: This example uses the `MultipleNegativesRankingLoss` with `MatryoshkaLoss` to train a strong embedding model using Natural Language Inference (NLI) data. It is an adaptation of the [NLI](../nli/README) documentation.
* **[matryoshka_nli_reduced_dim.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/sentence_transformer/training/matryoshka/matryoshka_nli_reduced_dim.py)**: This example uses the `MultipleNegativesRankingLoss` with `MatryoshkaLoss` to train a strong embedding model with a small maximum output dimension of 256. It trains using Natural Language Inference (NLI) data, and is an adaptation of the [NLI](../nli/README) documentation.
* **[matryoshka_sts.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/sentence_transformer/training/matryoshka/matryoshka_sts.py)**: This example uses the `CoSENTLoss` with `MatryoshkaLoss` to train an embedding model on the training set of the `STSBenchmark` dataset. It is an adaptation of the [STS](../sts/README) documentation.
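
For a rough idea of the core setup these scripts share, a minimal `MatryoshkaLoss` configuration using the classic `model.fit` API might look as follows. The base model, the toy training pairs, and the chosen dimensions are placeholders rather than the exact settings of `matryoshka_nli.py`, which may use a different (Trainer-based) training loop:

```python
from torch.utils.data import DataLoader

from sentence_transformers import InputExample, SentenceTransformer, losses

# Any transformer can serve as the base; mpnet-base matches the blog's experiments.
model = SentenceTransformer("microsoft/mpnet-base")

# Tiny stand-in for the (anchor, positive) pairs that the NLI scripts construct.
train_examples = [
    InputExample(texts=["A man is eating food.", "A man is eating a meal."]),
    InputExample(texts=["A woman plays the violin.", "Someone plays an instrument."]),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)

# Wrap the in-batch negatives loss so that the first 768, 512, 256, 128, and 64
# dimensions of every embedding are each trained to be useful on their own.
base_loss = losses.MultipleNegativesRankingLoss(model)
train_loss = losses.MatryoshkaLoss(model, base_loss, matryoshka_dims=[768, 512, 256, 128, 64])

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=10)
```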

## How do I use 🪆 Matryoshka Embedding models?

@@ -129,7 +129,7 @@ Keep in mind that although processing smaller embeddings for downstream tasks (r

In Sentence Transformers, you can load a Matryoshka Embedding model just like any other model, but you can specify the desired embedding size using the `truncate_dim` argument. After that, you can perform inference using the [`SentenceTransformer.encode`](https://sbert.net/docs/package_reference/SentenceTransformer.html#sentence_transformers.SentenceTransformer.encode) function, and the embeddings will be automatically truncated to the specified size.

Let's try to use a model that I trained using [`matryoshka_nli.py`](https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/matryoshka/matryoshka_nli.py) with [`microsoft/mpnet-base`](https://huggingface.co/microsoft/mpnet-base):
Let's try to use a model that I trained using [`matryoshka_nli.py`](https://github.com/UKPLab/sentence-transformers/blob/master/examples/sentence_transformer/training/matryoshka/matryoshka_nli.py) with [`microsoft/mpnet-base`](https://huggingface.co/microsoft/mpnet-base):

```python
from sentence_transformers import SentenceTransformer
@@ -202,8 +202,8 @@ similarities = cos_sim(embeddings[0], embeddings[1:])

Now that Matryoshka models have been introduced, let's look at the performance we can realistically expect from a Matryoshka embedding model versus a regular embedding model. For this experiment, I have trained two models:

* [tomaarsen/mpnet-base-nli-matryoshka](https://huggingface.co/tomaarsen/mpnet-base-nli-matryoshka): Trained by running [`matryoshka_nli.py`](https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/matryoshka/matryoshka_nli.py) with [`microsoft/mpnet-base`](https://huggingface.co/microsoft/mpnet-base).
* [tomaarsen/mpnet-base-nli](https://huggingface.co/tomaarsen/mpnet-base-nli): Trained by running a modified version of [`matryoshka_nli.py`](https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/matryoshka/matryoshka_nli.py) where the training loss is only `MultipleNegativesRankingLoss` rather than `MatryoshkaLoss` on top of `MultipleNegativesRankingLoss`. I also use [`microsoft/mpnet-base`](https://huggingface.co/microsoft/mpnet-base) as the base model.
* [tomaarsen/mpnet-base-nli-matryoshka](https://huggingface.co/tomaarsen/mpnet-base-nli-matryoshka): Trained by running [`matryoshka_nli.py`](https://github.com/UKPLab/sentence-transformers/blob/master/examples/sentence_transformer/training/matryoshka/matryoshka_nli.py) with [`microsoft/mpnet-base`](https://huggingface.co/microsoft/mpnet-base).
* [tomaarsen/mpnet-base-nli](https://huggingface.co/tomaarsen/mpnet-base-nli): Trained by running a modified version of [`matryoshka_nli.py`](https://github.com/UKPLab/sentence-transformers/blob/master/examples/sentence_transformer/training/matryoshka/matryoshka_nli.py) where the training loss is only `MultipleNegativesRankingLoss` rather than `MatryoshkaLoss` on top of `MultipleNegativesRankingLoss`. I also use [`microsoft/mpnet-base`](https://huggingface.co/microsoft/mpnet-base) as the base model.

Both of these models were trained on the AllNLI dataset, which is a concatenation of the [SNLI](https://huggingface.co/datasets/snli) and [MultiNLI](https://huggingface.co/datasets/multi_nli) datasets. I have evaluated these models on the [STSBenchmark](https://huggingface.co/datasets/mteb/stsbenchmark-sts) test set at multiple embedding dimensions. The results are plotted in the following figure:

@@ -231,4 +231,4 @@ In this demo, you can dynamically shrink the output dimensions of the [`nomic-ai
* Kusupati, A., Bhatt, G., Rege, A., Wallingford, M., Sinha, A., Ramanujan, V., ... & Farhadi, A. (2022). Matryoshka representation learning. Advances in Neural Information Processing Systems, 35, 30233-30249. https://arxiv.org/abs/2205.13147
* Matryoshka Embeddings — Sentence-Transformers documentation. (n.d.). https://sbert.net/examples/training/matryoshka/README.html
* UKPLab. (n.d.). GitHub. https://github.com/UKPLab/sentence-transformers
* Unboxing Nomic Embed v1.5: Resizable Production Embeddings with Matryoshka Representation Learning. (n.d.). https://blog.nomic.ai/posts/nomic-embed-matryoshka
* Unboxing Nomic Embed v1.5: Resizable Production Embeddings with Matryoshka Representation Learning. (n.d.). https://blog.nomic.ai/posts/nomic-embed-matryoshka
static-embeddings.md: 2 changes (1 addition & 1 deletion)
@@ -1327,7 +1327,7 @@ Additionally, there are quite a few possible extensions that are likely to impro
2. Model Souping: Combining weights from multiple models trained in the same way with different seeds or data distributions.
3. Curriculum Learning: Train on examples of increasing difficulty.
4. [Guided False In-Batch Negatives Filtering](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#sentence_transformers.losses.CachedGISTEmbedLoss): Exclude false negatives via an efficient pre-trained embedding model.
5. [Seed Optimization for the Random Weight Initialization](https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/data_augmentation/train_sts_seed_optimization.py): Train the first steps with various seeds to find one with a useful weight initialization.
5. [Seed Optimization for the Random Weight Initialization](https://github.com/UKPLab/sentence-transformers/blob/master/examples/sentence_transformer/training/data_augmentation/train_sts_seed_optimization.py): Train the first steps with various seeds to find one with a useful weight initialization.
6. Tokenizer Retraining: Retrain a tokenizer with modern texts and learnings.
7. Gradient Caching: Applying GradCache via [`CachedMultipleNegativesRankingLoss`](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#sentence_transformers.losses.CachedMultipleNegativesRankingLoss) allows for larger batches, which often result in superior performance (see the sketch after this list).
8. [Model Distillation](https://sbert.net/examples/training/distillation/README.html): Rather than training exclusively using supervised training data, we can also feed unsupervised data through a larger embedding model and distil those embeddings into the static embedding-based student model.
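
To illustrate point 7, swapping in the gradient-cached loss is essentially a one-line change; the model name and `mini_batch_size` below are arbitrary placeholder choices:

```python
from sentence_transformers import SentenceTransformer, losses

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Standard in-batch negatives loss: memory use grows with the training batch size.
loss = losses.MultipleNegativesRankingLoss(model)

# GradCache variant: each batch is processed in mini-batches, so much larger
# effective batch sizes (and thus more in-batch negatives) fit in the same
# memory, at a modest cost in training speed.
loss = losses.CachedMultipleNegativesRankingLoss(model, mini_batch_size=32)
```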
zh/embedding-quantization.md: 10 changes (5 additions & 5 deletions)
@@ -339,17 +339,17 @@ int8

- **Recommended Retrieval**:

- [semantic_search_recommended.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/applications/embedding-quantization/semantic_search_recommended.py): This script combines binary search with scalar rescoring, much like the demo above, for cheap, efficient, and performant retrieval.
- [semantic_search_recommended.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/sentence_transformer/applications/embedding-quantization/semantic_search_recommended.py): This script combines binary search with scalar rescoring, much like the demo above, for cheap, efficient, and performant retrieval.

- **Usage**:

- [semantic_search_faiss.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/applications/embedding-quantization/semantic_search_faiss.py): This script showcases regular binary or scalar quantization, retrieval, and rescoring with FAISS, using the <a href="https://sbert.net/docs/package_reference/quantization.html#sentence_transformers.quantization.semantic_search_faiss"><code>semantic_search_faiss</code></a> utility function.
- [semantic_search_usearch.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/applications/embedding-quantization/semantic_search_usearch.py): This script showcases regular binary or scalar quantization, retrieval, and rescoring with USearch, using the <a href="https://sbert.net/docs/package_reference/quantization.html#sentence_transformers.quantization.semantic_search_usearch"><code>semantic_search_usearch</code></a> utility function.
- [semantic_search_faiss.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/sentence_transformer/applications/embedding-quantization/semantic_search_faiss.py): This script showcases regular binary or scalar quantization, retrieval, and rescoring with FAISS, using the <a href="https://sbert.net/docs/package_reference/quantization.html#sentence_transformers.quantization.semantic_search_faiss"><code>semantic_search_faiss</code></a> utility function.
- [semantic_search_usearch.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/sentence_transformer/applications/embedding-quantization/semantic_search_usearch.py): This script showcases regular binary or scalar quantization, retrieval, and rescoring with USearch, using the <a href="https://sbert.net/docs/package_reference/quantization.html#sentence_transformers.quantization.semantic_search_usearch"><code>semantic_search_usearch</code></a> utility function.

- **Benchmarks**:

- [semantic_search_faiss_benchmark.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/applications/embedding-quantization/semantic_search_faiss_benchmark.py): This script includes a retrieval speed benchmark of `float32` retrieval, binary retrieval + rescoring, and scalar retrieval + rescoring, using FAISS. It uses the <a href="https://sbert.net/docs/package_reference/quantization.html#sentence_transformers.quantization.semantic_search_faiss"><code>semantic_search_faiss</code></a> utility function. Our benchmarks especially show speedups for `ubinary`.
- [semantic_search_usearch_benchmark.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/applications/embedding-quantization/semantic_search_usearch_benchmark.py): This script includes a retrieval speed benchmark of `float32` retrieval, binary retrieval + rescoring, and scalar retrieval + rescoring, using USearch. It uses the <a href="https://sbert.net/docs/package_reference/quantization.html#sentence_transformers.quantization.semantic_search_usearch"><code>semantic_search_usearch</code></a> utility function. Our experiments show large speedups on newer hardware, particularly for `int8`.
- [semantic_search_faiss_benchmark.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/sentence_transformer/applications/embedding-quantization/semantic_search_faiss_benchmark.py): This script includes a retrieval speed benchmark of `float32` retrieval, binary retrieval + rescoring, and scalar retrieval + rescoring, using FAISS. It uses the <a href="https://sbert.net/docs/package_reference/quantization.html#sentence_transformers.quantization.semantic_search_faiss"><code>semantic_search_faiss</code></a> utility function. Our benchmarks especially show speedups for `ubinary`.
- [semantic_search_usearch_benchmark.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/sentence_transformer/applications/embedding-quantization/semantic_search_usearch_benchmark.py): This script includes a retrieval speed benchmark of `float32` retrieval, binary retrieval + rescoring, and scalar retrieval + rescoring, using USearch. It uses the <a href="https://sbert.net/docs/package_reference/quantization.html#sentence_transformers.quantization.semantic_search_usearch"><code>semantic_search_usearch</code></a> utility function. Our experiments show large speedups on newer hardware, particularly for `int8`.

### Future work
