Commit b62324a

feature #1026 [AI Bundle][Store] Add Retriever (OskarStark)
This PR was squashed before being merged into the main branch.

Discussion
----------

[AI Bundle][Store] Add `Retriever`

| Q             | A
| ------------- | ---
| Bug fix?      | no
| New feature?  | yes
| Docs?         | yes
| Issues        | --
| License       | MIT

### Proof

<img width="1920" height="1010" alt="CleanShot 2025-11-29 at 14 33 48@2x" src="https://github.com/user-attachments/assets/86a0c450-06ee-451b-84dc-71069fb2a5f3" />
<img width="1876" height="824" alt="CleanShot 2025-11-29 at 14 41 30@2x" src="https://github.com/user-attachments/assets/64d010e0-91ae-4c08-ab14-ce7dee3bb488" />

Commits
-------

6ed7281 [AI Bundle][Store] Add `Retriever`
2 parents 65023b2 + 6ed7281 commit b62324a

20 files changed: +1004 -79 lines changed

.github/workflows/integration-tests.yaml

Lines changed: 4 additions & 0 deletions
@@ -104,6 +104,10 @@ jobs:
           composer-options: "--no-scripts"
           working-directory: demo
 
+      - name: Link local packages
+        working-directory: demo
+        run: ../link
+
       - run: composer run-script auto-scripts --no-interaction
         working-directory: demo

demo/AGENTS.md

Lines changed: 2 additions & 2 deletions
@@ -31,8 +31,8 @@ composer install
 echo "OPENAI_API_KEY='sk-...'" > .env.local
 
 # Initialize vector store
-symfony console app:blog:embed -vv
-symfony console app:blog:query
+symfony console ai:store:index blog -vv
+symfony console ai:store:retrieve blog "Week of Symfony"
 
 # Start server
 symfony serve -d

demo/CLAUDE.md

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@ echo "OPENAI_API_KEY='sk-...'" > .env.local
 symfony console ai:store:index blog -vv
 
 # Test vector store
-symfony console app:blog:query
+symfony console ai:store:retrieve blog "Week of Symfony"
 
 # Start development server
 symfony serve -d

demo/README.md

Lines changed: 2 additions & 2 deletions
@@ -77,10 +77,10 @@ To initialize the Chroma DB, you need to run the following command:
 symfony console ai:store:index blog -vv
 ```
 
-Now you should be able to run the test command and get some results:
+Now you should be able to retrieve documents from the store:
 
 ```shell
-symfony console app:blog:query
+symfony console ai:store:retrieve blog "Week of Symfony"
 ```
 
 **Don't forget to set up the project in your favorite IDE or editor.**

demo/config/packages/ai.yaml

Lines changed: 4 additions & 0 deletions
@@ -100,6 +100,10 @@ ai:
                 - 'Symfony\AI\Store\Document\Transformer\TextTrimTransformer'
             vectorizer: 'ai.vectorizer.openai'
             store: 'ai.store.chroma_db.symfonycon'
+    retriever:
+        blog:
+            vectorizer: 'ai.vectorizer.openai'
+            store: 'ai.store.chroma_db.symfonycon'
 
 services:
     _defaults:

demo/src/Blog/Command/QueryCommand.php

Lines changed: 0 additions & 71 deletions
This file was deleted.

docs/bundles/ai-bundle.rst

Lines changed: 63 additions & 0 deletions
@@ -972,6 +972,69 @@ Benefits of Configured Vectorizers
 * **Consistency**: Ensure all indexers using the same vectorizer have identical embedding configuration
 * **Maintainability**: Change vectorizer settings in one place
 
+Retrievers
+----------
+
+Retrievers are the opposite of indexers. While indexers populate a vector store with documents,
+retrievers allow you to search for documents in a store based on a query string.
+They vectorize the query and retrieve similar documents from the store.
+
+Configuring Retrievers
+~~~~~~~~~~~~~~~~~~~~~~
+
+Retrievers are defined in the ``retriever`` section of your configuration:
+
+.. code-block:: yaml
+
+    ai:
+        retriever:
+            default:
+                vectorizer: 'ai.vectorizer.openai_small'
+                store: 'ai.store.chroma_db.default'
+
+            research:
+                vectorizer: 'ai.vectorizer.mistral_embed'
+                store: 'ai.store.memory.research'
+
+Using Retrievers
+~~~~~~~~~~~~~~~~
+
+The retriever can be injected into your services using the ``RetrieverInterface``::
+
+    use Symfony\AI\Store\RetrieverInterface;
+
+    final readonly class MyService
+    {
+        public function __construct(
+            private RetrieverInterface $retriever,
+        ) {
+        }
+
+        public function search(string $query): array
+        {
+            $documents = [];
+            foreach ($this->retriever->retrieve($query) as $document) {
+                $documents[] = $document;
+            }
+
+            return $documents;
+        }
+    }
+
+When you have multiple retrievers configured, you can use the ``#[Autowire]`` attribute to inject a specific one::
+
+    use Symfony\AI\Store\RetrieverInterface;
+    use Symfony\Component\DependencyInjection\Attribute\Autowire;
+
+    final readonly class ResearchService
+    {
+        public function __construct(
+            #[Autowire(service: 'ai.retriever.research')]
+            private RetrieverInterface $retriever,
+        ) {
+        }
+    }
+
 Profiler
 --------
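
Reviewer sketch (not part of the commit): the documented `MyService::search()` copies the result of `retrieve()` into an array with a `foreach` loop. Assuming `retrieve()` returns an `iterable` (the diff only shows the result being iterated, so the exact return type is an assumption), the same collection step could probably be written with `iterator_to_array()`. The `FakeRetriever` class and the `collect()` helper below are hypothetical stand-ins so that the snippet runs without the bundle; they are not Symfony AI APIs.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for a retriever; NOT the Symfony AI implementation.
// It only mimics the shape used in the docs: retrieve(string) yields documents.
final class FakeRetriever
{
    /** @param list<string> $documents */
    public function __construct(private array $documents)
    {
    }

    /** @return iterable<string> */
    public function retrieve(string $query): iterable
    {
        foreach ($this->documents as $document) {
            // Naive case-insensitive substring match instead of vector similarity.
            if (str_contains(strtolower($document), strtolower($query))) {
                yield $document;
            }
        }
    }
}

/**
 * Collects any iterable into a list (hypothetical helper).
 *
 * @return list<mixed>
 */
function collect(iterable $items): array
{
    // Passing false discards keys, so duplicate generator keys cannot overwrite entries.
    return is_array($items) ? array_values($items) : iterator_to_array($items, false);
}

$retriever = new FakeRetriever(['Week of Symfony #987', 'Release notes', 'A week of Symfony news']);
$results = collect($retriever->retrieve('week of symfony'));

echo count($results), " document(s) found\n";
```

The explicit `false` argument matters when the retriever is a generator: with preserved keys, two yielded items sharing a key would collapse into one array entry.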

docs/components/store.rst

Lines changed: 29 additions & 1 deletion
@@ -33,7 +33,34 @@ used vector store::
     $document = new TextDocument('This is a sample document.');
     $indexer->index($document);
 
-You can find more advanced usage in combination with an Agent using the store for RAG in the examples folder:
+You can find more advanced usage in combination with an Agent using the store for RAG in the examples folder.
+
+Retrieving
+----------
+
+The opposite of indexing is retrieving. The :class:`Symfony\\AI\\Store\\Retriever` is a higher level feature that allows you to
+search for documents in a store based on a query string. It vectorizes the query and retrieves similar documents from the store::
+
+    use Symfony\AI\Store\Retriever;
+
+    $retriever = new Retriever($vectorizer, $store);
+    $documents = $retriever->retrieve('What is the capital of France?');
+
+    foreach ($documents as $document) {
+        echo $document->metadata->get('source');
+    }
+
+The retriever accepts optional parameters to customize the retrieval:
+
+* ``$options``: An array of options to pass to the underlying store query (e.g., limit, filters)
+
+Example Usage
+~~~~~~~~~~~~~
+
+* `Basic Retriever Example`_
+
+Similarity Search Examples
+~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 * `Similarity Search with Cloudflare (RAG)`_
 * `Similarity Search with Manticore (RAG)`_
@@ -129,6 +156,7 @@ This leads to a store implementing two methods::
     }
 
 .. _`Retrieval Augmented Generation`: https://en.wikipedia.org/wiki/Retrieval-augmented_generation
+.. _`Basic Retriever Example`: https://github.com/symfony/ai/blob/main/examples/retriever/basic.php
 .. _`Similarity Search with Cloudflare (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/cloudflare.php
 .. _`Similarity Search with Manticore (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/manticore.php
 .. _`Similarity Search with MariaDB (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/mariadb-gemini.php
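
Reviewer sketch (not part of the commit): the new docs describe retrieval as "vectorize the query, then fetch similar documents", with an `$options` array forwarded to the store query. The self-contained toy below illustrates that idea under loud assumptions: `toyEmbed()`, `cosineSimilarity()` and `toyRetrieve()` are hypothetical helpers, the keyword-count "embedding" replaces a real embedding model, and the `maxItems` key simply mirrors the option used in `examples/retriever/basic.php`. It does not reflect how `Retriever` or any store bridge is actually implemented.

```php
<?php

declare(strict_types=1);

/**
 * Toy embedding: counts occurrences of a few fixed keywords.
 * A real vectorizer would call an embedding model instead.
 *
 * @return list<float>
 */
function toyEmbed(string $text): array
{
    $vocabulary = ['paris', 'france', 'capital', 'berlin', 'germany'];
    $lower = strtolower($text);

    return array_map(
        static fn (string $word): float => (float) substr_count($lower, $word),
        $vocabulary,
    );
}

/**
 * @param list<float> $a
 * @param list<float> $b
 */
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    // Guard against zero vectors (texts without any known keyword).
    return ($normA > 0.0 && $normB > 0.0) ? $dot / sqrt($normA * $normB) : 0.0;
}

/**
 * @param array<string, list<float>> $store   source name => stored vector
 * @param array{maxItems?: int}      $options hypothetical option, mirroring the example script
 *
 * @return array<string, float> source name => score, best match first
 */
function toyRetrieve(string $query, array $store, array $options = []): array
{
    $queryVector = toyEmbed($query);

    $scores = [];
    foreach ($store as $source => $vector) {
        $scores[$source] = cosineSimilarity($queryVector, $vector);
    }

    arsort($scores); // highest similarity first, keys preserved

    return array_slice($scores, 0, $options['maxItems'] ?? null, true);
}

$store = [
    'france.md' => toyEmbed('Paris is the capital of France.'),
    'germany.md' => toyEmbed('Berlin is the capital of Germany.'),
];

$results = toyRetrieve('What is the capital of France?', $store, ['maxItems' => 1]);

echo array_key_first($results), "\n"; // prints "france.md"
```

Keeping the query and the documents in the same vector space is the point of sharing one vectorizer between indexer and retriever, which is what the demo configuration in this commit does.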

examples/retriever/basic.php

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
+<?php
+
+/*
+ * This file is part of the Symfony package.
+ *
+ * (c) Fabien Potencier <fabien@symfony.com>
+ *
+ * For the full copyright and license information, please view the LICENSE
+ * file that was distributed with this source code.
+ */
+
+use Symfony\AI\Platform\Bridge\OpenAi\PlatformFactory;
+use Symfony\AI\Store\Bridge\Local\InMemoryStore;
+use Symfony\AI\Store\Document\Loader\TextFileLoader;
+use Symfony\AI\Store\Document\Transformer\TextSplitTransformer;
+use Symfony\AI\Store\Document\Vectorizer;
+use Symfony\AI\Store\Indexer;
+use Symfony\AI\Store\Retriever;
+
+require_once dirname(__DIR__).'/bootstrap.php';
+
+$store = new InMemoryStore();
+
+$platform = PlatformFactory::create(env('OPENAI_API_KEY'), http_client());
+$vectorizer = new Vectorizer($platform, 'text-embedding-3-small');
+
+$indexer = new Indexer(
+    loader: new TextFileLoader(),
+    vectorizer: $vectorizer,
+    store: $store,
+    source: [
+        dirname(__DIR__, 2).'/fixtures/movies/gladiator.md',
+        dirname(__DIR__, 2).'/fixtures/movies/inception.md',
+        dirname(__DIR__, 2).'/fixtures/movies/jurassic-park.md',
+    ],
+    transformers: [
+        new TextSplitTransformer(chunkSize: 500, overlap: 100),
+    ],
+);
+$indexer->index();
+
+$retriever = new Retriever(
+    vectorizer: $vectorizer,
+    store: $store,
+);
+
+echo "Searching for: 'Roman gladiator revenge'\n\n";
+$results = $retriever->retrieve('Roman gladiator revenge', ['maxItems' => 1]);
+
+foreach ($results as $i => $document) {
+    echo sprintf("%d. Document ID: %s\n", $i + 1, $document->id);
+    echo sprintf(" Score: %s\n", $document->score ?? 'n/a');
+    echo sprintf(" Source: %s\n\n", $document->metadata->getSource() ?? 'unknown');
+}

examples/retriever/movies.php

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
+<?php
+
+/*
+ * This file is part of the Symfony package.
+ *
+ * (c) Fabien Potencier <fabien@symfony.com>
+ *
+ * For the full copyright and license information, please view the LICENSE
+ * file that was distributed with this source code.
+ */
+
+use Symfony\AI\Fixtures\Movies;
+use Symfony\AI\Platform\Bridge\OpenAi\PlatformFactory;
+use Symfony\AI\Store\Bridge\Local\InMemoryStore;
+use Symfony\AI\Store\Document\Loader\InMemoryLoader;
+use Symfony\AI\Store\Document\Metadata;
+use Symfony\AI\Store\Document\TextDocument;
+use Symfony\AI\Store\Document\Vectorizer;
+use Symfony\AI\Store\Indexer;
+use Symfony\AI\Store\Retriever;
+use Symfony\Component\Uid\Uuid;
+
+require_once dirname(__DIR__).'/bootstrap.php';
+
+$store = new InMemoryStore();
+
+$documents = [];
+foreach (Movies::all() as $movie) {
+    $documents[] = new TextDocument(
+        id: Uuid::v4(),
+        content: 'Title: '.$movie['title'].\PHP_EOL.'Director: '.$movie['director'].\PHP_EOL.'Description: '.$movie['description'],
+        metadata: new Metadata($movie),
+    );
+}
+
+$platform = PlatformFactory::create(env('OPENAI_API_KEY'), http_client());
+$vectorizer = new Vectorizer($platform, 'text-embedding-3-small', logger());
+
+$indexer = new Indexer(new InMemoryLoader($documents), $vectorizer, $store, logger: logger());
+$indexer->index();
+
+$retriever = new Retriever($vectorizer, $store, logger());
+
+echo "Searching for movies about 'crime family mafia'\n";
+echo "================================================\n\n";
+
+$results = $retriever->retrieve('crime family mafia');
+
+foreach ($results as $i => $document) {
+    $title = $document->metadata['title'];
+    $director = $document->metadata['director'];
+    $score = $document->score;
+
+    echo sprintf("%d. %s (Director: %s)\n", $i + 1, $title, $director);
+    echo sprintf(" Score: %.4f\n\n", $score ?? 0);
+}
