diff --git a/content/integrate/redisvl/api/cache.md b/content/integrate/redisvl/api/cache.md index f9cecf6dc2..34de495381 100644 --- a/content/integrate/redisvl/api/cache.md +++ b/content/integrate/redisvl/api/cache.md @@ -407,7 +407,7 @@ The default TTL, in seconds, for entries in the cache. -### `class EmbeddingsCache(name='embedcache', ttl=None, redis_client=None, redis_url='redis://localhost:6379', connection_kwargs={})` +### `class EmbeddingsCache(name='embedcache', ttl=None, redis_client=None, async_redis_client=None, redis_url='redis://localhost:6379', connection_kwargs={})` Bases: `BaseCache` @@ -418,9 +418,10 @@ Initialize an embeddings cache. * **Parameters:** * **name** (*str*) – The name of the cache. Defaults to “embedcache”. * **ttl** (*Optional* *[* *int* *]*) – The time-to-live for cached embeddings. Defaults to None. - * **redis_client** (*Optional* *[* *Redis* *]*) – Redis client instance. Defaults to None. + * **redis_client** (*Optional* *[* *SyncRedisClient* *]*) – Redis client instance. Defaults to None. * **redis_url** (*str*) – Redis URL for connection. Defaults to “redis://localhost:6379”. * **connection_kwargs** (*Dict* *[* *str* *,* *Any* *]*) – Redis connection arguments. Defaults to {}. + * **async_redis_client** (*Redis* *|* *RedisCluster* *|* *None*) – Async Redis client instance. Defaults to None. * **Raises:** **ValueError** – If vector dimensions are invalid diff --git a/content/integrate/redisvl/api/query.md b/content/integrate/redisvl/api/query.md index 5144f57116..a283442020 100644 --- a/content/integrate/redisvl/api/query.md +++ b/content/integrate/redisvl/api/query.md @@ -173,6 +173,8 @@ Add fields to return fields. Use a different scoring function to evaluate document relevance. Default is TFIDF. +Since Redis 8.0, the default has changed to BM25STD. + * **Parameters:** **scorer** (*str*) – The scoring function to use (e.g. TFIDF.DOCNORM or BM25) @@ -487,6 +489,8 @@ Add fields to return fields. Use a different scoring function to evaluate document relevance. Default is TFIDF. 
+Since Redis 8.0, the default has changed to BM25STD. + * **Parameters:** **scorer** (*str*) – The scoring function to use (e.g. TFIDF.DOCNORM or BM25) @@ -1069,6 +1073,8 @@ Add fields to return fields. Use a different scoring function to evaluate document relevance. Default is TFIDF. +Since Redis 8.0, the default has changed to BM25STD. + * **Parameters:** **scorer** (*str*) – The scoring function to use (e.g. TFIDF.DOCNORM or BM25) @@ -1281,6 +1287,8 @@ Add fields to return fields. Use a different scoring function to evaluate document relevance. Default is TFIDF. +Since Redis 8.0, the default has changed to BM25STD. + * **Parameters:** **scorer** (*str*) – The scoring function to use (e.g. TFIDF.DOCNORM or BM25) @@ -1498,6 +1506,8 @@ Add fields to return fields. Use a different scoring function to evaluate document relevance. Default is TFIDF. +Since Redis 8.0, the default has changed to BM25STD. + * **Parameters:** **scorer** (*str*) – The scoring function to use (e.g. TFIDF.DOCNORM or BM25) diff --git a/content/integrate/redisvl/api/router.md b/content/integrate/redisvl/api/router.md index b39639696b..ef3fb61724 100644 --- a/content/integrate/redisvl/api/router.md +++ b/content/integrate/redisvl/api/router.md @@ -20,7 +20,7 @@ Initialize the SemanticRouter. * **routes** (*List* *[*[Route](#route) *]*) – List of Route objects. * **vectorizer** (*BaseVectorizer* *,* *optional*) – The vectorizer used to embed route references. Defaults to default HFTextVectorizer. * **routing_config** ([RoutingConfig](#routingconfig) *,* *optional*) – Configuration for routing behavior. Defaults to the default RoutingConfig. - * **redis_client** (*Optional* *[* *Redis* *]* *,* *optional*) – Redis client for connection. Defaults to None. + * **redis_client** (*Optional* *[* *SyncRedisClient* *]* *,* *optional*) – Redis client for connection. Defaults to None. * **redis_url** (*str* *,* *optional*) – The redis url. Defaults to redis://localhost:6379. 
* **overwrite** (*bool* *,* *optional*) – Whether to overwrite existing index. Defaults to False. * **connection_kwargs** (*Dict* *[* *str* *,* *Any* *]*) – The connection arguments @@ -98,7 +98,7 @@ Return SemanticRouter instance from existing index. * **Parameters:** * **name** (*str*) - * **redis_client** (*Redis* *|* *None*) + * **redis_client** (*Redis* *|* *RedisCluster* *|* *None*) * **redis_url** (*str*) * **Return type:** [SemanticRouter](#semanticrouter) diff --git a/content/integrate/redisvl/api/searchindex.md b/content/integrate/redisvl/api/searchindex.md index d92838f364..8e500f4a8b 100644 --- a/content/integrate/redisvl/api/searchindex.md +++ b/content/integrate/redisvl/api/searchindex.md @@ -48,7 +48,7 @@ kwargs. * **Parameters:** * **schema** ([*IndexSchema*]({{< relref "schema/#indexschema" >}})) – Index schema object. - * **redis_client** (*Optional* *[* *redis.Redis* *]*) – An + * **redis_client** (*Optional* *[* *Redis* *]*) – An instantiated redis client. * **redis_url** (*Optional* *[* *str* *]*) – The URL of the Redis server to connect to. @@ -88,13 +88,13 @@ This method takes a list of queries and optionally query params and returns a list of Result objects for each query. Results are returned in the same order as the queries. +NOTE: Cluster users may need to incorporate hash tags into their query +to avoid cross-slot operations. + * **Parameters:** - * **queries** (*List* *[* *SearchParams* *]*) – The queries to search for. batch_size - * **(* ***int** – The number of queries to search for at a time. - Defaults to 10. - * **optional****)** – The number of queries to search for at a time. + * **queries** (*List* *[* *SearchParams* *]*) – The queries to search for. + * **batch_size** (*int* *,* *optional*) – The number of queries to search for at a time. Defaults to 10. - * **batch_size** (*int*) * **Returns:** The search results for each query. * **Return type:** @@ -105,6 +105,10 @@ returned in the same order as the queries. 
Clear all keys in Redis associated with the index, leaving the index available and in-place for future insertions or updates. +NOTE: This method requires custom behavior on Redis Cluster, because +we can’t easily give the user control of the keys being cleared so +they can group them by hash tag. + * **Returns:** Count of records deleted from Redis. * **Return type:** @@ -176,6 +180,10 @@ Remove documents from the index by their document IDs. This method converts document IDs to Redis keys automatically by applying the index’s key prefix and separator configuration. +NOTE: Cluster users will need to incorporate hash tags into their +document IDs and only call this method with documents from a single hash +tag at a time. + * **Parameters:** **ids** (*Union* *[* *str* *,* *List* *[* *str* *]* *]*) – The document ID or IDs to remove from the index. * **Returns:** @@ -261,7 +269,7 @@ Initialize from an existing search index in Redis by index name. * **Parameters:** * **name** (*str*) – Name of the search index in Redis. - * **redis_client** (*Optional* *[* *redis.Redis* *]*) – An + * **redis_client** (*Optional* *[* *Redis* *]*) – An instantiated redis client. * **redis_url** (*Optional* *[* *str* *]*) – The URL of the Redis server to connect to. @@ -436,12 +444,12 @@ Async Redis client. It is useful for cases where an external, custom-configured client is preferred instead of creating a new one. * **Parameters:** - **redis_client** (*redis.Redis*) – A Redis or Async Redis + **redis_client** (*Redis*) – A Redis or Async Redis client instance to be used for the connection. * **Raises:** **TypeError** – If the provided client is not valid. -#### `property client: Redis | None` +#### `property client: Redis | RedisCluster | None` The underlying redis-py client object. @@ -503,7 +511,7 @@ Initialize the RedisVL async search index with a schema. * **schema** ([*IndexSchema*]({{< relref "schema/#indexschema" >}})) – Index schema object. 
* **redis_url** (*Optional* *[* *str* *]* *,* *optional*) – The URL of the Redis server to connect to. - * **redis_client** (*Optional* *[* *aredis.Redis* *]*) – An + * **redis_client** (*Optional* *[* *AsyncRedis* *]*) – An instantiated redis client. * **connection_kwargs** (*Optional* *[* *Dict* *[* *str* *,* *Any* *]* *]*) – Redis client connection args. @@ -535,28 +543,42 @@ Asynchronously execute a batch of queries and process results. #### `async batch_search(queries, batch_size=10)` -Perform a search against the index for multiple queries. +Asynchronously execute a batch of search queries. + +This method takes a list of search queries and executes them in batches +to improve performance when dealing with multiple queries. -This method takes a list of queries and returns a list of Result objects -for each query. Results are returned in the same order as the queries. +NOTE: Cluster users may need to incorporate hash tags into their query +to avoid cross-slot operations. * **Parameters:** - * **queries** (*List* *[* *SearchParams* *]*) – The queries to search for. batch_size - * **(* ***int** – The number of queries to search for at a time. - Defaults to 10. - * **optional****)** – The number of queries to search for at a time. - Defaults to 10. - * **batch_size** (*int*) + * **queries** (*List* *[* *SearchParams* *]*) – A list of search queries to execute. + Each query can be either a string or a tuple of (query, params). + * **batch_size** (*int* *,* *optional*) – The number of queries to execute in each + batch. Defaults to 10. * **Returns:** - The search results for each query. + A list of search results corresponding to each query. * **Return type:** List[Result] +```python +queries = [ + "hello world", + ("goodbye world", {"num_results": 5}), +] + +results = await index.batch_search(queries) +``` + #### `async clear()` Clear all keys in Redis associated with the index, leaving the index available and in-place for future insertions or updates. 
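The `batch_search` docstrings above describe splitting the query list into groups of `batch_size`. A minimal sketch of that chunking pattern (a hypothetical standalone helper, not RedisVL's internal code):

```python
from itertools import islice


def batched(items, batch_size=10):
    """Yield successive lists of at most batch_size items (illustrative)."""
    it = iter(items)
    # islice consumes the shared iterator, so each call picks up
    # where the previous batch left off.
    while batch := list(islice(it, batch_size)):
        yield batch
```

Each yielded group would then be dispatched together, which is how executing many queries in batches reduces network roundtrips compared with issuing them one at a time.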
+NOTE: This method requires custom behavior on Redis Cluster, because we +can’t easily give the user control of the keys being cleared so they can +group them by hash tag. + * **Returns:** Count of records deleted from Redis. * **Return type:** @@ -617,6 +639,10 @@ Remove documents from the index by their document IDs. This method converts document IDs to Redis keys automatically by applying the index’s key prefix and separator configuration. +NOTE: Cluster users will need to incorporate hash tags into their +document IDs and only call this method with documents from a single hash +tag at a time. + * **Parameters:** **ids** (*Union* *[* *str* *,* *List* *[* *str* *]* *]*) – The document ID or IDs to remove from the index. * **Returns:** @@ -700,7 +726,7 @@ Initialize from an existing search index in Redis by index name. * **Parameters:** * **name** (*str*) – Name of the search index in Redis. - * **redis_client** (*Optional* *[* *redis.Redis* *]*) – An + * **redis_client** (*Optional* *[* *Redis* *]*) – An instantiated redis client. * **redis_url** (*Optional* *[* *str* *]*) – The URL of the Redis server to connect to. @@ -872,11 +898,11 @@ results = await index.query(query) #### `async search(*args, **kwargs)` -Perform a search on this index. +Perform an async search against the index. -Wrapper around redis.search.Search that adds the index name -to the search query and passes along the rest of the arguments -to the redis-py ft.search() method. +Wrapper around the search API that adds the index name +to the query and passes along the rest of the arguments +to the redis-py ft().search() method. * **Returns:** Raw Redis search results. @@ -889,9 +915,9 @@ to the redis-py ft.search() method. This method is deprecated; please provide connection parameters in \_\_init_\_. 
* **Parameters:** - **redis_client** (*Redis* *|* *Redis*) + **redis_client** (*Redis* *|* *RedisCluster* *|* *Redis* *|* *RedisCluster*) -#### `property client: Redis | None` +#### `property client: Redis | RedisCluster | None` The underlying redis-py client object. diff --git a/content/integrate/redisvl/overview/cli.md b/content/integrate/redisvl/overview/cli.md index 123b78986f..0612574105 100644 --- a/content/integrate/redisvl/overview/cli.md +++ b/content/integrate/redisvl/overview/cli.md @@ -19,7 +19,7 @@ Before running this notebook, be sure to !rvl version ``` - 19:16:18 [RedisVL] INFO RedisVL version 0.5.2 + 16:41:26 [RedisVL] INFO RedisVL version 0.7.0 ## Commands @@ -74,7 +74,7 @@ fields: !rvl index create -s schema.yaml ``` - 19:16:21 [RedisVL] INFO Index created successfully + 16:43:40 [RedisVL] INFO Index created successfully @@ -83,8 +83,8 @@ fields: !rvl index listall ``` - 19:16:24 [RedisVL] INFO Indices: - 19:16:24 [RedisVL] INFO 1. vectorizers + 16:43:43 [RedisVL] INFO Indices: + 16:43:43 [RedisVL] INFO 1. vectorizers @@ -116,7 +116,7 @@ fields: !rvl index delete -i vectorizers ``` - 19:16:29 [RedisVL] INFO Index deleted successfully + 16:43:50 [RedisVL] INFO Index deleted successfully @@ -125,7 +125,7 @@ fields: !rvl index listall ``` - 19:16:32 [RedisVL] INFO Indices: + 16:43:53 [RedisVL] INFO Indices: ## Stats @@ -139,7 +139,7 @@ The ``rvl stats`` command will return some basic information about the index. Th !rvl index create -s schema.yaml ``` - 19:16:35 [RedisVL] INFO Index created successfully + 16:43:55 [RedisVL] INFO Index created successfully @@ -148,8 +148,8 @@ The ``rvl stats`` command will return some basic information about the index. Th !rvl index listall ``` - 19:16:38 [RedisVL] INFO Indices: - 19:16:38 [RedisVL] INFO 1. vectorizers + 16:43:58 [RedisVL] INFO Indices: + 16:43:58 [RedisVL] INFO 1. 
vectorizers @@ -206,8 +206,8 @@ By default rvl first checks if you have `REDIS_URL` environment variable defined !rvl index listall --host localhost --port 6379 ``` - 19:16:43 [RedisVL] INFO Indices: - 19:16:43 [RedisVL] INFO 1. vectorizers + 16:44:03 [RedisVL] INFO Indices: + 16:44:03 [RedisVL] INFO 1. vectorizers ### Using SSL encription @@ -220,13 +220,10 @@ You can similarly specify the username and password to construct the full Redis !rvl index listall --user jane_doe -a password123 --ssl ``` - 19:16:46 [RedisVL] ERROR Error 8 connecting to rediss:6379. nodename nor servname provided, or not known. - - ```python !rvl index destroy -i vectorizers ``` - 19:16:49 [RedisVL] INFO Index deleted successfully + 16:44:13 [RedisVL] INFO Index deleted successfully diff --git a/content/integrate/redisvl/user_guide/embeddings_cache.md b/content/integrate/redisvl/user_guide/embeddings_cache.md index 731dba6588..c5d29af55f 100644 --- a/content/integrate/redisvl/user_guide/embeddings_cache.md +++ b/content/integrate/redisvl/user_guide/embeddings_cache.md @@ -45,7 +45,17 @@ vectorizer = HFTextVectorizer( /Users/tyler.hutcherson/Library/Caches/pypoetry/virtualenvs/redisvl-VnTEShF2-py3.13/lib/python3.13/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html from .autonotebook import tqdm as notebook_tqdm - Compiling the model with `torch.compile` and using a `torch.mps` device is not supported. Falling back to non-compiled mode. + + + 16:54:03 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps + 16:54:03 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: redis/langcache-embed-v1 + 16:54:03 sentence_transformers.SentenceTransformer WARNING You try to use a model that was created with version 4.1.0, however, your version is 3.4.1. This might cause unexpected behavior or errors. 
In that case, try to update to the latest version. + + + + + + Batches: 100%|██████████| 1/1 [00:00<00:00, 5.57it/s] ## Initializing the EmbeddingsCache @@ -96,9 +106,14 @@ key = cache.set( print(f"Stored with key: {key[:15]}...") ``` + Batches: 100%|██████████| 1/1 [00:00<00:00, 5.20it/s] + Stored with key: embedcache:909f... + + + ### Retrieving Embeddings To retrieve an embedding from the cache, use the `get` method with the original text and model name: @@ -250,11 +265,18 @@ cache.mdrop(texts, model_name) # cache.mdrop_by_keys(keys) # Delete by keys ``` + Batches: 100%|██████████| 1/1 [00:00<00:00, 19.83it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 8.51it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 21.52it/s] + Stored 3 embeddings with batch operation All embeddings exist: True Retrieved 3 embeddings in one operation + + + Batch operations are particularly beneficial when working with large numbers of embeddings. They provide the same functionality as individual operations but with better performance by reducing network roundtrips. For asynchronous applications, async versions of all batch methods are also available with the `am` prefix (e.g., `amset`, `amget`, `amexists`, `amdrop`). @@ -426,6 +448,19 @@ for query in set(queries): # Use set to get unique queries example_cache.drop(text=query, model_name=model_name) ``` + 16:54:13 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps + 16:54:13 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: redis/langcache-embed-v1 + 16:54:13 sentence_transformers.SentenceTransformer WARNING You try to use a model that was created with version 4.1.0, however, your version is 3.4.1. This might cause unexpected behavior or errors. In that case, try to update to the latest version. 
+ + + + + + Batches: 100%|██████████| 1/1 [00:00<00:00, 17.63it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 16.67it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 18.67it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 19.96it/s] + Statistics: Total queries: 5 @@ -434,6 +469,9 @@ for query in set(queries): # Use set to get unique queries Cache hit rate: 40.0% + + + ## Performance Benchmark Let's run benchmarks to compare the performance of embedding with and without caching, as well as batch versus individual operations. @@ -482,17 +520,36 @@ print(f"Latency reduction: {latency_reduction:.4f} seconds per query") ``` Benchmarking without caching: - Time taken without caching: 0.4735 seconds - Average time per embedding: 0.0474 seconds + + + Batches: 100%|██████████| 1/1 [00:00<00:00, 20.36it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 21.74it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 20.61it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 22.23it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 22.54it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 22.79it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 22.03it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 21.81it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 21.98it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 22.68it/s] + + + Time taken without caching: 0.4814 seconds + Average time per embedding: 0.0481 seconds Benchmarking with caching: - Time taken with caching: 0.0663 seconds - Average time per embedding: 0.0066 seconds + + + Batches: 100%|██████████| 1/1 [00:00<00:00, 22.43it/s] + + + Time taken with caching: 0.0681 seconds + Average time per embedding: 0.0068 seconds Performance comparison: - Speedup with caching: 7.14x faster - Time saved: 0.4073 seconds (86.0%) - Latency reduction: 0.0407 seconds per query + Speedup with caching: 7.07x faster + Time saved: 0.4133 seconds (85.9%) + Latency reduction: 0.0413 seconds per query ## Common Use Cases for Embedding Caching diff 
--git a/content/integrate/redisvl/user_guide/getting_started.md b/content/integrate/redisvl/user_guide/getting_started.md index bd9f294c86..a7b6dffc0e 100644 --- a/content/integrate/redisvl/user_guide/getting_started.md +++ b/content/integrate/redisvl/user_guide/getting_started.md @@ -181,8 +181,8 @@ Use the `rvl` CLI to inspect the created index and its fields: !rvl index listall ``` - 19:17:09 [RedisVL] INFO Indices: - 19:17:09 [RedisVL] INFO 1. user_simple + 16:44:38 [RedisVL] INFO Indices: + 16:44:38 [RedisVL] INFO 1. user_simple @@ -224,30 +224,33 @@ keys = index.load(data) print(keys) ``` - ['user_simple_docs:01JT4PPPNJZMSK2395RKD208T9', 'user_simple_docs:01JT4PPPNM63J55ZESZ4TV1VR8', 'user_simple_docs:01JT4PPPNM59RCKS2YQ58B1HQW'] + ['user_simple_docs:01JWEWMDE9VCXNRVBBB3T3NG55', 'user_simple_docs:01JWEWMDECWRM26BXNTHWMBY6C', 'user_simple_docs:01JWEWMDECZ8B4Q8PNMBP9TZWY'] By default, `load` will create a unique Redis key as a combination of the index key `prefix` and a random ULID. You can also customize the key by providing direct keys or pointing to a specified `id_field` on load. -### Load invalid data +### Load INVALID data This will raise a `SchemaValidationError` if `validate_on_load` is set to true in the `SearchIndex` class. 
```python # NBVAL_SKIP -keys = index.load([{"user_embedding": True}]) +try: + keys = index.load([{"user_embedding": True}]) +except Exception as e: + print(f"Failed to load data {str(e)}") ``` - 19:17:21 redisvl.index.index ERROR Schema validation error while loading data + 16:45:26 redisvl.index.index ERROR Schema validation error while loading data Traceback (most recent call last): - File "/Users/justin.cechmanek/Documents/redisvl/redisvl/index/storage.py", line 204, in _preprocess_and_validate_objects + File "/Users/tyler.hutcherson/Documents/AppliedAI/redis-vl-python/redisvl/index/storage.py", line 245, in _preprocess_and_validate_objects processed_obj = self._validate(processed_obj) - File "/Users/justin.cechmanek/Documents/redisvl/redisvl/index/storage.py", line 160, in _validate + File "/Users/tyler.hutcherson/Documents/AppliedAI/redis-vl-python/redisvl/index/storage.py", line 178, in _validate return validate_object(self.index_schema, obj) - File "/Users/justin.cechmanek/Documents/redisvl/redisvl/schema/validation.py", line 276, in validate_object + File "/Users/tyler.hutcherson/Documents/AppliedAI/redis-vl-python/redisvl/schema/validation.py", line 276, in validate_object validated = model_class.model_validate(flat_obj) - File "/Users/justin.cechmanek/.pyenv/versions/3.13/envs/redisvl-dev/lib/python3.13/site-packages/pydantic/main.py", line 627, in model_validate + File "/Users/tyler.hutcherson/Library/Caches/pypoetry/virtualenvs/redisvl-VnTEShF2-py3.13/lib/python3.13/site-packages/pydantic/main.py", line 627, in model_validate return cls.__pydantic_validator__.validate_python( ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^ obj, strict=strict, from_attributes=from_attributes, context=context @@ -262,123 +265,29 @@ keys = index.load([{"user_embedding": True}]) The above exception was the direct cause of the following exception: Traceback (most recent call last): - File "/Users/justin.cechmanek/Documents/redisvl/redisvl/index/index.py", line 686, in load + 
File "/Users/tyler.hutcherson/Documents/AppliedAI/redis-vl-python/redisvl/index/index.py", line 772, in load return self._storage.write( ~~~~~~~~~~~~~~~~~~~^ - self._redis_client, # type: ignore - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + self._redis_client, + ^^^^^^^^^^^^^^^^^^^ ...<6 lines>... validate=self._validate_on_load, ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ) ^ - File "/Users/justin.cechmanek/Documents/redisvl/redisvl/index/storage.py", line 265, in write + File "/Users/tyler.hutcherson/Documents/AppliedAI/redis-vl-python/redisvl/index/storage.py", line 306, in write prepared_objects = self._preprocess_and_validate_objects( list(objects), # Convert Iterable to List ...<3 lines>... validate=validate, ) - File "/Users/justin.cechmanek/Documents/redisvl/redisvl/index/storage.py", line 211, in _preprocess_and_validate_objects + File "/Users/tyler.hutcherson/Documents/AppliedAI/redis-vl-python/redisvl/index/storage.py", line 252, in _preprocess_and_validate_objects raise SchemaValidationError(str(e), index=i) from e redisvl.exceptions.SchemaValidationError: Validation failed for object at index 0: 1 validation error for user_simple__PydanticModel user_embedding Input should be a valid bytes [type=bytes_type, input_value=True, input_type=bool] For further information visit https://errors.pydantic.dev/2.10/v/bytes_type - - - - --------------------------------------------------------------------------- - - ValidationError Traceback (most recent call last) - - File ~/Documents/redisvl/redisvl/index/storage.py:204, in BaseStorage._preprocess_and_validate_objects(self, objects, id_field, keys, preprocess, validate) - 203 if validate: - --> 204 processed_obj = self._validate(processed_obj) - 206 # Store valid object with its key for writing - - - File ~/Documents/redisvl/redisvl/index/storage.py:160, in BaseStorage._validate(self, obj) - 159 # Pass directly to validation function and let any errors propagate - --> 160 return validate_object(self.index_schema, obj) - - - File 
~/Documents/redisvl/redisvl/schema/validation.py:276, in validate_object(schema, obj) - 275 # Validate against model - --> 276 validated = model_class.model_validate(flat_obj) - 277 return validated.model_dump(exclude_none=True) - - - File ~/.pyenv/versions/3.13/envs/redisvl-dev/lib/python3.13/site-packages/pydantic/main.py:627, in BaseModel.model_validate(cls, obj, strict, from_attributes, context) - 626 __tracebackhide__ = True - --> 627 return cls.__pydantic_validator__.validate_python( - 628 obj, strict=strict, from_attributes=from_attributes, context=context - 629 ) - - - ValidationError: 1 validation error for user_simple__PydanticModel - user_embedding - Input should be a valid bytes [type=bytes_type, input_value=True, input_type=bool] - For further information visit https://errors.pydantic.dev/2.10/v/bytes_type - - - The above exception was the direct cause of the following exception: - - - SchemaValidationError Traceback (most recent call last) - - Cell In[31], line 3 - 1 # NBVAL_SKIP - ----> 3 keys = index.load([{"user_embedding": True}]) - - - File ~/Documents/redisvl/redisvl/index/index.py:686, in SearchIndex.load(self, data, id_field, keys, ttl, preprocess, batch_size) - 656 """Load objects to the Redis database. Returns the list of keys loaded - 657 to Redis. - 658 - (...) - 683 RedisVLError: If there's an error loading data to Redis. 
- 684 """ - 685 try: - --> 686 return self._storage.write( - 687 self._redis_client, # type: ignore - 688 objects=data, - 689 id_field=id_field, - 690 keys=keys, - 691 ttl=ttl, - 692 preprocess=preprocess, - 693 batch_size=batch_size, - 694 validate=self._validate_on_load, - 695 ) - 696 except SchemaValidationError: - 697 # Pass through validation errors directly - 698 logger.exception("Schema validation error while loading data") - - - File ~/Documents/redisvl/redisvl/index/storage.py:265, in BaseStorage.write(self, redis_client, objects, id_field, keys, ttl, preprocess, batch_size, validate) - 262 return [] - 264 # Pass 1: Preprocess and validate all objects - --> 265 prepared_objects = self._preprocess_and_validate_objects( - 266 list(objects), # Convert Iterable to List - 267 id_field=id_field, - 268 keys=keys, - 269 preprocess=preprocess, - 270 validate=validate, - 271 ) - 273 # Pass 2: Write all valid objects in batches - 274 added_keys = [] - - - File ~/Documents/redisvl/redisvl/index/storage.py:211, in BaseStorage._preprocess_and_validate_objects(self, objects, id_field, keys, preprocess, validate) - 207 prepared_objects.append((key, processed_obj)) - 209 except ValidationError as e: - 210 # Convert Pydantic ValidationError to SchemaValidationError with index context - --> 211 raise SchemaValidationError(str(e), index=i) from e - 212 except Exception as e: - 213 # Capture other exceptions with context - 214 object_id = f"at index {i}" - - - SchemaValidationError: Validation failed for object at index 0: 1 validation error for user_simple__PydanticModel + Failed to load data Validation failed for object at index 0: 1 validation error for user_simple__PydanticModel user_embedding Input should be a valid bytes [type=bytes_type, input_value=True, input_type=bool] For further information visit https://errors.pydantic.dev/2.10/v/bytes_type @@ -402,7 +311,7 @@ keys = index.load(new_data) print(keys) ``` - ['user_simple_docs:01JT4PPX63CH5YRN2BGEYB5TS2'] + 
['user_simple_docs:01JWEWPKMA687KHAFRATVJ44BS'] ## Creating `VectorQuery` Objects @@ -516,7 +425,7 @@ index.schema.add_fields([ await index.create(overwrite=True, drop=False) ``` - 19:17:29 redisvl.index.index INFO Index already exists, overwriting. + 16:46:01 redisvl.index.index INFO Index already exists, overwriting. @@ -527,7 +436,7 @@ result_print(results) ``` -
vector_distanceuseragejobcredit_score
0john1engineerhigh
0mary2doctorlow
0.0566299557686tyler9engineerhigh
+
vector_distanceuseragejobcredit_score
0mary2doctorlow
0john1engineerhigh
0.0566299557686tyler9engineerhigh
## Check Index Stats @@ -559,7 +468,7 @@ Use the `rvl` CLI to check the stats for the index: │ offsets_per_term_avg │ 0 │ │ records_per_doc_avg │ 5 │ │ sortable_values_size_mb │ 0 │ - │ total_indexing_time │ 0.74400001 │ + │ total_indexing_time │ 1.58399999 │ │ total_inverted_index_blocks │ 11 │ │ vector_index_sz_mb │ 0.23560333 │ ╰─────────────────────────────┴────────────╯ diff --git a/content/integrate/redisvl/user_guide/hash_vs_json.md b/content/integrate/redisvl/user_guide/hash_vs_json.md index b141ffe93c..aff686c15a 100644 --- a/content/integrate/redisvl/user_guide/hash_vs_json.md +++ b/content/integrate/redisvl/user_guide/hash_vs_json.md @@ -44,7 +44,7 @@ table_print(data) ``` -
useragejobcredit_scoreoffice_locationuser_embedding
john18engineerhigh-122.4194,37.7749b'\xcd\xcc\xcc=\xcd\xcc\xcc=\x00\x00\x00?'
derrick14doctorlow-122.4194,37.7749b'\xcd\xcc\xcc=\xcd\xcc\xcc=\x00\x00\x00?'
nancy94doctorhigh-122.4194,37.7749b'333?\xcd\xcc\xcc=\x00\x00\x00?'
tyler100engineerhigh-122.0839,37.3861b'\xcd\xcc\xcc=\xcd\xcc\xcc>\x00\x00\x00?'
tim12dermatologisthigh-122.0839,37.3861b'\xcd\xcc\xcc>\xcd\xcc\xcc>\x00\x00\x00?'
taimur15CEOlow-122.0839,37.3861b'\x9a\x99\x19?\xcd\xcc\xcc=\x00\x00\x00?'
joe35dentistmedium-122.0839,37.3861b'fff?fff?\xcd\xcc\xcc='
+
useragejobcredit_scoreoffice_locationuser_embeddinglast_updated
john18engineerhigh-122.4194,37.7749b'\xcd\xcc\xcc=\xcd\xcc\xcc=\x00\x00\x00?'1741627789
derrick14doctorlow-122.4194,37.7749b'\xcd\xcc\xcc=\xcd\xcc\xcc=\x00\x00\x00?'1741627789
nancy94doctorhigh-122.4194,37.7749b'333?\xcd\xcc\xcc=\x00\x00\x00?'1710696589
tyler100engineerhigh-122.0839,37.3861b'\xcd\xcc\xcc=\xcd\xcc\xcc>\x00\x00\x00?'1742232589
tim12dermatologisthigh-122.0839,37.3861b'\xcd\xcc\xcc>\xcd\xcc\xcc>\x00\x00\x00?'1739644189
taimur15CEOlow-122.0839,37.3861b'\x9a\x99\x19?\xcd\xcc\xcc=\x00\x00\x00?'1742232589
joe35dentistmedium-122.0839,37.3861b'fff?fff?\xcd\xcc\xcc='1742232589
## Hash or JSON -- how to choose? @@ -138,7 +138,8 @@ data[0] 'job': 'engineer', 'credit_score': 'high', 'office_location': '-122.4194,37.7749', - 'user_embedding': b'\xcd\xcc\xcc=\xcd\xcc\xcc=\x00\x00\x00?'} + 'user_embedding': b'\xcd\xcc\xcc=\xcd\xcc\xcc=\x00\x00\x00?', + 'last_updated': 1741627789} @@ -155,29 +156,29 @@ keys = hindex.load(data) Statistics: - ╭─────────────────────────────┬─────────────╮ - │ Stat Key │ Value │ - ├─────────────────────────────┼─────────────┤ - │ num_docs │ 7 │ - │ num_terms │ 6 │ - │ max_doc_id │ 7 │ - │ num_records │ 44 │ - │ percent_indexed │ 1 │ - │ hash_indexing_failures │ 0 │ - │ number_of_uses │ 1 │ - │ bytes_per_record_avg │ 3.40909 │ - │ doc_table_size_mb │ 0.000767708 │ - │ inverted_sz_mb │ 0.000143051 │ - │ key_table_size_mb │ 0.000248909 │ - │ offset_bits_per_record_avg │ 8 │ - │ offset_vectors_sz_mb │ 8.58307e-06 │ - │ offsets_per_term_avg │ 0.204545 │ - │ records_per_doc_avg │ 6.28571 │ - │ sortable_values_size_mb │ 0 │ - │ total_indexing_time │ 1.053 │ - │ total_inverted_index_blocks │ 18 │ - │ vector_index_sz_mb │ 0.0202332 │ - ╰─────────────────────────────┴─────────────╯ + ╭─────────────────────────────┬────────────╮ + │ Stat Key │ Value │ + ├─────────────────────────────┼────────────┤ + │ num_docs │ 7 │ + │ num_terms │ 6 │ + │ max_doc_id │ 7 │ + │ num_records │ 44 │ + │ percent_indexed │ 1 │ + │ hash_indexing_failures │ 0 │ + │ number_of_uses │ 1 │ + │ bytes_per_record_avg │ 40.2954559 │ + │ doc_table_size_mb │ 7.27653503 │ + │ inverted_sz_mb │ 0.00169086 │ + │ key_table_size_mb │ 2.21252441 │ + │ offset_bits_per_record_avg │ 8 │ + │ offset_vectors_sz_mb │ 8.58306884 │ + │ offsets_per_term_avg │ 0.20454545 │ + │ records_per_doc_avg │ 6.28571414 │ + │ sortable_values_size_mb │ 0 │ + │ total_indexing_time │ 0.25699999 │ + │ total_inverted_index_blocks │ 18 │ + │ vector_index_sz_mb │ 0.02023315 │ + ╰─────────────────────────────┴────────────╯ #### Performing Queries @@ -266,8 +267,8 @@ jindex.create(overwrite=True) 
!rvl index listall ``` - 11:54:18 [RedisVL] INFO Indices: - 11:54:18 [RedisVL] INFO 1. user-json + 16:51:53 [RedisVL] INFO Indices: + 16:51:53 [RedisVL] INFO 1. user-json #### Vectors as float arrays @@ -295,7 +296,8 @@ json_data[0] 'job': 'engineer', 'credit_score': 'high', 'office_location': '-122.4194,37.7749', - 'user_embedding': [0.10000000149011612, 0.10000000149011612, 0.5]} + 'user_embedding': [0.10000000149011612, 0.10000000149011612, 0.5], + 'last_updated': 1741627789} @@ -412,8 +414,17 @@ bike_schema = { } ``` - /Users/robert.shelton/.pyenv/versions/3.11.9/lib/python3.11/site-packages/huggingface_hub/file_download.py:1142: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`. - warnings.warn( + /Users/tyler.hutcherson/Library/Caches/pypoetry/virtualenvs/redisvl-VnTEShF2-py3.13/lib/python3.13/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. 
See https://ipywidgets.readthedocs.io/en/stable/user_install.html + from .autonotebook import tqdm as notebook_tqdm + + + 16:51:55 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps + 16:51:55 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: sentence-transformers/all-mpnet-base-v2 + + + Batches: 100%|██████████| 1/1 [00:00<00:00, 7.52it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 1.03it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 13.12it/s] @@ -433,8 +444,8 @@ bike_index.load(bike_data) - ['bike-json:de92cb9955434575b20f4e87a30b03d5', - 'bike-json:054ab3718b984532b924946fa5ce00c6'] + ['bike-json:01JWEX1RA1AX32Q5K4DN7DMH5Z', + 'bike-json:01JWEX1RA1JCFKZJ5A03GGA9C2'] @@ -458,6 +469,9 @@ v = VectorQuery( results = bike_index.query(v) ``` + Batches: 100%|██████████| 1/1 [00:00<00:00, 12.64it/s] + + **Note:** As shown in the example, if you want to retrieve a field from a JSON object that was not indexed, you will also need to supply the full path, as with `$.metadata.type`.
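The hash and JSON outputs above store the same vector two ways: the hash holds packed little-endian float32 bytes (`b'\xcd\xcc\xcc=\xcd\xcc\xcc=\x00\x00\x00?'`), while JSON holds a plain float list. A quick standard-library sketch shows the two representations are interchangeable (and why `0.1` comes back as `0.10000000149011612` — float32 precision):

```python
import struct

# The JSON form of the vector, as shown above
vector = [0.1, 0.1, 0.5]

# Pack as little-endian float32 -- this is the byte string stored in the hash form
packed = struct.pack("<3f", *vector)
print(packed)  # b'\xcd\xcc\xcc=\xcd\xcc\xcc=\x00\x00\x00?'

# Unpacking recovers the floats at float32 precision
unpacked = list(struct.unpack("<3f", packed))
print(unpacked)  # [0.10000000149011612, 0.10000000149011612, 0.5]
```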
@@ -468,12 +482,12 @@ results - [{'id': 'bike-json:054ab3718b984532b924946fa5ce00c6', - 'vector_distance': '0.519989073277', + [{'id': 'bike-json:01JWEX1RA1JCFKZJ5A03GGA9C2', + 'vector_distance': '0.519989132881', 'brand': 'Trek', '$.metadata.type': 'Enduro bikes'}, - {'id': 'bike-json:de92cb9955434575b20f4e87a30b03d5', - 'vector_distance': '0.657624483109', + {'id': 'bike-json:01JWEX1RA1AX32Q5K4DN7DMH5Z', + 'vector_distance': '0.657624304295', 'brand': 'Specialized', '$.metadata.type': 'Enduro bikes'}] diff --git a/content/integrate/redisvl/user_guide/hybrid_queries.md b/content/integrate/redisvl/user_guide/hybrid_queries.md index 8b9a1d557a..047b1150c5 100644 --- a/content/integrate/redisvl/user_guide/hybrid_queries.md +++ b/content/integrate/redisvl/user_guide/hybrid_queries.md @@ -67,15 +67,16 @@ index = SearchIndex.from_dict(schema, redis_url="redis://localhost:6379") index.create(overwrite=True) ``` - 11:40:25 redisvl.index.index INFO Index already exists, overwriting. - - ```python # use the CLI to see the created index !rvl index listall ``` + 16:46:23 [RedisVL] INFO Indices: + 16:46:23 [RedisVL] INFO 1. user_queries + + ```python # load data to redis @@ -147,7 +148,7 @@ result_print(index.query(v)) ``` -
vector_distanceusercredit_scoreagejoboffice_location
0johnhigh18engineer-122.4194,37.7749
0johnhigh18engineer-122.4194,37.7749
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.158808946609timhigh12dermatologist-122.0839,37.3861
0.158808946609timhigh12dermatologist-122.0839,37.3861
0.266666650772nancyhigh94doctor-122.4194,37.7749
0.266666650772nancyhigh94doctor-122.4194,37.7749
0.653301358223joemedium35dentist-122.0839,37.3861
0.653301358223joemedium35dentist-122.0839,37.3861
+
vector_distanceusercredit_scoreagejoboffice_locationlast_updated
0johnhigh18engineer-122.4194,37.77491741627789
0.109129190445tylerhigh100engineer-122.0839,37.38611742232589
0.158808946609timhigh12dermatologist-122.0839,37.38611739644189
0.266666650772nancyhigh94doctor-122.4194,37.77491710696589
0.653301358223joemedium35dentist-122.0839,37.38611742232589
@@ -160,7 +161,7 @@ result_print(index.query(v)) ``` -
vector_distanceusercredit_scoreagejoboffice_location
0johnhigh18engineer-122.4194,37.7749
0johnhigh18engineer-122.4194,37.7749
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.158808946609timhigh12dermatologist-122.0839,37.3861
0.158808946609timhigh12dermatologist-122.0839,37.3861
0.266666650772nancyhigh94doctor-122.4194,37.7749
0.266666650772nancyhigh94doctor-122.4194,37.7749
0.653301358223joemedium35dentist-122.0839,37.3861
0.653301358223joemedium35dentist-122.0839,37.3861
+
vector_distanceusercredit_scoreagejoboffice_locationlast_updated
0johnhigh18engineer-122.4194,37.77491741627789
0.109129190445tylerhigh100engineer-122.0839,37.38611742232589
0.158808946609timhigh12dermatologist-122.0839,37.38611739644189
0.266666650772nancyhigh94doctor-122.4194,37.77491710696589
0.653301358223joemedium35dentist-122.0839,37.38611742232589
What about scenarios where you might want to dynamically generate a list of tags? Have no fear. RedisVL allows you to do this gracefully without having to check for the **empty case**. The **empty case** is when you attempt to run a Tag filter on a field with no defined values to match: @@ -179,7 +180,7 @@ result_print(index.query(v)) ``` -
vector_distanceusercredit_scoreagejoboffice_location
0johnhigh18engineer-122.4194,37.7749
0derricklow14doctor-122.4194,37.7749
0johnhigh18engineer-122.4194,37.7749
0derricklow14doctor-122.4194,37.7749
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.158808946609timhigh12dermatologist-122.0839,37.3861
0.158808946609timhigh12dermatologist-122.0839,37.3861
0.217882037163taimurlow15CEO-122.0839,37.3861
0.217882037163taimurlow15CEO-122.0839,37.3861
+
vector_distanceusercredit_scoreagejoboffice_locationlast_updated
0johnhigh18engineer-122.4194,37.77491741627789
0derricklow14doctor-122.4194,37.77491741627789
0.109129190445tylerhigh100engineer-122.0839,37.38611742232589
0.158808946609timhigh12dermatologist-122.0839,37.38611739644189
0.217882037163taimurlow15CEO-122.0839,37.38611742232589
0.266666650772nancyhigh94doctor-122.4194,37.77491710696589
0.653301358223joemedium35dentist-122.0839,37.38611742232589
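The empty-case handling can be pictured with a small helper (purely illustrative — this is not the RedisVL API, which does this internally via the `Tag` filter): when the list of tag values is empty, the filter collapses to the match-all query `*`, so the vector query simply runs unfiltered instead of raising an error.

```python
def tag_filter(field: str, values) -> str:
    """Illustrative only: build a Redis tag filter clause from a list of values."""
    values = [v for v in (values or []) if v]
    if not values:
        return "*"  # the empty case: match everything rather than fail
    return "@%s:{%s}" % (field, "|".join(values))

print(tag_filter("credit_score", ["high", "low"]))  # @credit_score:{high|low}
print(tag_filter("credit_score", []))               # *
```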
### Numeric Filters @@ -210,7 +211,7 @@ result_print(index.query(v)) ``` -
vector_distanceusercredit_scoreagejoboffice_location
0derricklow14doctor-122.4194,37.7749
0derricklow14doctor-122.4194,37.7749
+
vector_distanceusercredit_scoreagejoboffice_locationlast_updated
0derricklow14doctor-122.4194,37.77491741627789
@@ -223,7 +224,7 @@ result_print(index.query(v)) ``` -
vector_distanceusercredit_scoreagejoboffice_location
0johnhigh18engineer-122.4194,37.7749
0johnhigh18engineer-122.4194,37.7749
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.158808946609timhigh12dermatologist-122.0839,37.3861
0.158808946609timhigh12dermatologist-122.0839,37.3861
0.217882037163taimurlow15CEO-122.0839,37.3861
0.217882037163taimurlow15CEO-122.0839,37.3861
0.266666650772nancyhigh94doctor-122.4194,37.7749
0.266666650772nancyhigh94doctor-122.4194,37.7749
+
vector_distanceusercredit_scoreagejoboffice_locationlast_updated
0johnhigh18engineer-122.4194,37.77491741627789
0.109129190445tylerhigh100engineer-122.0839,37.38611742232589
0.158808946609timhigh12dermatologist-122.0839,37.38611739644189
0.217882037163taimurlow15CEO-122.0839,37.38611742232589
0.266666650772nancyhigh94doctor-122.4194,37.77491710696589
0.653301358223joemedium35dentist-122.0839,37.38611742232589
### Timestamp Filters @@ -270,7 +271,7 @@ result_print(index.query(v)) -
vector_distanceusercredit_scoreagejoboffice_locationlast_updated
0derricklow14doctor-122.4194,37.77491741627789
0johnhigh18engineer-122.4194,37.77491741627789
0.158808946609timhigh12dermatologist-122.0839,37.38611739644189
0.266666650772nancyhigh94doctor-122.4194,37.77491710696589
+
vector_distanceusercredit_scoreagejoboffice_locationlast_updated
0johnhigh18engineer-122.4194,37.77491741627789
0derricklow14doctor-122.4194,37.77491741627789
0.158808946609timhigh12dermatologist-122.0839,37.38611739644189
0.266666650772nancyhigh94doctor-122.4194,37.77491710696589
@@ -293,7 +294,7 @@ result_print(index.query(v)) -
vector_distanceusercredit_scoreagejoboffice_locationlast_updated
0derricklow14doctor-122.4194,37.77491741627789
0johnhigh18engineer-122.4194,37.77491741627789
0.158808946609timhigh12dermatologist-122.0839,37.38611739644189
+
vector_distanceusercredit_scoreagejoboffice_locationlast_updated
0johnhigh18engineer-122.4194,37.77491741627789
0derricklow14doctor-122.4194,37.77491741627789
0.158808946609timhigh12dermatologist-122.0839,37.38611739644189
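Timestamp filters compare against Unix epoch seconds, which is how the `last_updated` values above are stored. Converting a `datetime` is a one-liner with the standard library; the example date below is simply the UTC time corresponding to the `1741627789` value seen in the results:

```python
from datetime import datetime, timezone

dt = datetime(2025, 3, 10, 17, 29, 49, tzinfo=timezone.utc)
ts = int(dt.timestamp())
print(ts)  # 1741627789

# Round-trip back to a datetime for display
print(datetime.fromtimestamp(ts, tz=timezone.utc).isoformat())
```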
### Text Filters @@ -325,7 +326,7 @@ result_print(index.query(v)) ``` -
vector_distanceusercredit_scoreagejoboffice_location
0johnhigh18engineer-122.4194,37.7749
0johnhigh18engineer-122.4194,37.7749
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.158808946609timhigh12dermatologist-122.0839,37.3861
0.158808946609timhigh12dermatologist-122.0839,37.3861
0.217882037163taimurlow15CEO-122.0839,37.3861
0.217882037163taimurlow15CEO-122.0839,37.3861
0.653301358223joemedium35dentist-122.0839,37.3861
0.653301358223joemedium35dentist-122.0839,37.3861
+
vector_distanceusercredit_scoreagejoboffice_locationlast_updated
0johnhigh18engineer-122.4194,37.77491741627789
0.109129190445tylerhigh100engineer-122.0839,37.38611742232589
0.158808946609timhigh12dermatologist-122.0839,37.38611739644189
0.217882037163taimurlow15CEO-122.0839,37.38611742232589
0.653301358223joemedium35dentist-122.0839,37.38611742232589
@@ -338,7 +339,7 @@ result_print(index.query(v)) ``` -
vector_distanceusercredit_scoreagejoboffice_location
0derricklow14doctor-122.4194,37.7749
0derricklow14doctor-122.4194,37.7749
0.266666650772nancyhigh94doctor-122.4194,37.7749
0.266666650772nancyhigh94doctor-122.4194,37.7749
+
vector_distanceusercredit_scoreagejoboffice_locationlast_updated
0derricklow14doctor-122.4194,37.77491741627789
0.266666650772nancyhigh94doctor-122.4194,37.77491710696589
@@ -351,7 +352,7 @@ result_print(index.query(v)) ``` -
vector_distanceusercredit_scoreagejoboffice_location
0johnhigh18engineer-122.4194,37.7749
0johnhigh18engineer-122.4194,37.7749
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.109129190445tylerhigh100engineer-122.0839,37.3861
+
vector_distanceusercredit_scoreagejoboffice_locationlast_updated
0johnhigh18engineer-122.4194,37.77491741627789
0.109129190445tylerhigh100engineer-122.0839,37.38611742232589
@@ -364,7 +365,7 @@ result_print(index.query(v)) ``` -
vector_distanceusercredit_scoreagejoboffice_location
0johnhigh18engineer-122.4194,37.7749
0derricklow14doctor-122.4194,37.7749
0johnhigh18engineer-122.4194,37.7749
0derricklow14doctor-122.4194,37.7749
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.266666650772nancyhigh94doctor-122.4194,37.7749
0.266666650772nancyhigh94doctor-122.4194,37.7749
+
vector_distanceusercredit_scoreagejoboffice_locationlast_updated
0johnhigh18engineer-122.4194,37.77491741627789
0derricklow14doctor-122.4194,37.77491741627789
0.109129190445tylerhigh100engineer-122.0839,37.38611742232589
0.266666650772nancyhigh94doctor-122.4194,37.77491710696589
@@ -377,7 +378,7 @@ result_print(index.query(v)) ``` -
vector_distanceusercredit_scoreagejoboffice_location
0johnhigh18engineer-122.4194,37.7749
0derricklow14doctor-122.4194,37.7749
0johnhigh18engineer-122.4194,37.7749
0derricklow14doctor-122.4194,37.7749
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.158808946609timhigh12dermatologist-122.0839,37.3861
0.158808946609timhigh12dermatologist-122.0839,37.3861
0.217882037163taimurlow15CEO-122.0839,37.3861
0.217882037163taimurlow15CEO-122.0839,37.3861
+
vector_distanceusercredit_scoreagejoboffice_locationlast_updated
0johnhigh18engineer-122.4194,37.77491741627789
0derricklow14doctor-122.4194,37.77491741627789
0.109129190445tylerhigh100engineer-122.0839,37.38611742232589
0.158808946609timhigh12dermatologist-122.0839,37.38611739644189
0.217882037163taimurlow15CEO-122.0839,37.38611742232589
0.266666650772nancyhigh94doctor-122.4194,37.77491710696589
0.653301358223joemedium35dentist-122.0839,37.38611742232589
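Under the hood, the text filters above translate into Redis full-text query clauses. A sketch of the strings involved (illustrative only, not the RedisVL `Text` filter implementation) — exact terms, wildcards, and the `~` optional-term flag used in the next example:

```python
def text_filter(field: str, pattern: str) -> str:
    """Illustrative only: a Redis full-text filter clause for a TEXT field."""
    return "@%s:(%s)" % (field, pattern)

print(text_filter("job", "engineer"))   # @job:(engineer)  -- exact word match
print(text_filter("job", "engine*"))    # @job:(engine*)   -- wildcard/prefix match
print(text_filter("job", "~engineer"))  # @job:(~engineer) -- optional term, affects score only
```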
Use raw query strings as input. Below we use the `~` flag to indicate that the full text query is optional. We also choose the BM25 scorer and return document scores along with the result. @@ -393,86 +394,69 @@ index.query(v) - [{'id': 'user_queries_docs:01JMJJHE28ZW4F33ZNRKXRHYCS', - 'score': 1.8181817787737895, - 'vector_distance': '0', - 'user': 'john', - 'credit_score': 'high', - 'age': '18', - 'job': 'engineer', - 'office_location': '-122.4194,37.7749'}, - {'id': 'user_queries_docs:01JMJJHE2899024DYPXT6424N9', - 'score': 0.0, - 'vector_distance': '0', - 'user': 'derrick', - 'credit_score': 'low', - 'age': '14', - 'job': 'doctor', - 'office_location': '-122.4194,37.7749'}, - {'id': 'user_queries_docs:01JMJJPEYCQ89ZQW6QR27J72WT', - 'score': 1.8181817787737895, + [{'id': 'user_queries_docs:01JWEWQHJX670FQM0GKCV403XE', + 'score': 0.9090908893868948, 'vector_distance': '0', 'user': 'john', 'credit_score': 'high', 'age': '18', 'job': 'engineer', - 'office_location': '-122.4194,37.7749'}, - {'id': 'user_queries_docs:01JMJJPEYD544WB1TKDBJ3Z3J9', + 'office_location': '-122.4194,37.7749', + 'last_updated': '1741627789'}, + {'id': 'user_queries_docs:01JWEWQHJY8AS66TA21M8BRE09', 'score': 0.0, 'vector_distance': '0', 'user': 'derrick', 'credit_score': 'low', 'age': '14', 'job': 'doctor', - 'office_location': '-122.4194,37.7749'}, - {'id': 'user_queries_docs:01JMJJHE28B5R6T00DH37A7KSJ', - 'score': 1.8181817787737895, - 'vector_distance': '0.109129190445', - 'user': 'tyler', - 'credit_score': 'high', - 'age': '100', - 'job': 'engineer', - 'office_location': '-122.0839,37.3861'}, - {'id': 'user_queries_docs:01JMJJPEYDPF9S5328WHCQN0ND', - 'score': 1.8181817787737895, + 'office_location': '-122.4194,37.7749', + 'last_updated': '1741627789'}, + {'id': 'user_queries_docs:01JWEWQHJYR2WTWMWZJ7PQ62N1', + 'score': 0.9090908893868948, 'vector_distance': '0.109129190445', 'user': 'tyler', 'credit_score': 'high', 'age': '100', 'job': 'engineer', - 'office_location': 
'-122.0839,37.3861'}, - {'id': 'user_queries_docs:01JMJJHE28G5F943YGWMB1ZX1V', - 'score': 0.0, - 'vector_distance': '0.158808946609', - 'user': 'tim', - 'credit_score': 'high', - 'age': '12', - 'job': 'dermatologist', - 'office_location': '-122.0839,37.3861'}, - {'id': 'user_queries_docs:01JMJJPEYDKA9ARKHRK1D7KPXQ', + 'office_location': '-122.0839,37.3861', + 'last_updated': '1742232589'}, + {'id': 'user_queries_docs:01JWEWQHJY30EW4B8X2EE6PPZS', 'score': 0.0, 'vector_distance': '0.158808946609', 'user': 'tim', 'credit_score': 'high', 'age': '12', 'job': 'dermatologist', - 'office_location': '-122.0839,37.3861'}, - {'id': 'user_queries_docs:01JMJJHE28NR7KF0EZEA433T2J', + 'office_location': '-122.0839,37.3861', + 'last_updated': '1739644189'}, + {'id': 'user_queries_docs:01JWEWQHJYF2EG7YHEK8QBHJ11', 'score': 0.0, 'vector_distance': '0.217882037163', 'user': 'taimur', 'credit_score': 'low', 'age': '15', 'job': 'CEO', - 'office_location': '-122.0839,37.3861'}, - {'id': 'user_queries_docs:01JMJJPEYD9EAVGJ2AZ8K9VX7Q', + 'office_location': '-122.0839,37.3861', + 'last_updated': '1742232589'}, + {'id': 'user_queries_docs:01JWEWQHJYVBNG97WEWDRQEKCD', 'score': 0.0, - 'vector_distance': '0.217882037163', - 'user': 'taimur', - 'credit_score': 'low', - 'age': '15', - 'job': 'CEO', - 'office_location': '-122.0839,37.3861'}] + 'vector_distance': '0.266666650772', + 'user': 'nancy', + 'credit_score': 'high', + 'age': '94', + 'job': 'doctor', + 'office_location': '-122.4194,37.7749', + 'last_updated': '1710696589'}, + {'id': 'user_queries_docs:01JWEWQHJY9VYQTFS165HBYYJ2', + 'score': 0.0, + 'vector_distance': '0.653301358223', + 'user': 'joe', + 'credit_score': 'medium', + 'age': '35', + 'job': 'dentist', + 'office_location': '-122.0839,37.3861', + 'last_updated': '1742232589'}] @@ -492,7 +476,7 @@ result_print(index.query(v)) ``` -
scorevector_distanceusercredit_scoreagejoboffice_location
0.45454544469344740johnhigh18engineer-122.4194,37.7749
0.45454544469344740derricklow14doctor-122.4194,37.7749
0.45454544469344740johnhigh18engineer-122.4194,37.7749
0.45454544469344740derricklow14doctor-122.4194,37.7749
0.45454544469344740.266666650772nancyhigh94doctor-122.4194,37.7749
0.45454544469344740.266666650772nancyhigh94doctor-122.4194,37.7749
+
scorevector_distanceusercredit_scoreagejoboffice_locationlast_updated
0.45454544469344740johnhigh18engineer-122.4194,37.77491741627789
0.45454544469344740derricklow14doctor-122.4194,37.77491741627789
0.45454544469344740.266666650772nancyhigh94doctor-122.4194,37.77491710696589
@@ -505,7 +489,7 @@ result_print(index.query(v)) ``` -
scorevector_distanceusercredit_scoreagejoboffice_location
0.45454544469344740johnhigh18engineer-122.4194,37.7749
0.45454544469344740derricklow14doctor-122.4194,37.7749
0.45454544469344740johnhigh18engineer-122.4194,37.7749
0.45454544469344740derricklow14doctor-122.4194,37.7749
0.45454544469344740.109129190445tylerhigh100engineer-122.0839,37.3861
0.45454544469344740.109129190445tylerhigh100engineer-122.0839,37.3861
0.45454544469344740.158808946609timhigh12dermatologist-122.0839,37.3861
0.45454544469344740.158808946609timhigh12dermatologist-122.0839,37.3861
0.45454544469344740.217882037163taimurlow15CEO-122.0839,37.3861
0.45454544469344740.217882037163taimurlow15CEO-122.0839,37.3861
+
scorevector_distanceusercredit_scoreagejoboffice_locationlast_updated
0.45454544469344740johnhigh18engineer-122.4194,37.77491741627789
0.45454544469344740derricklow14doctor-122.4194,37.77491741627789
0.45454544469344740.109129190445tylerhigh100engineer-122.0839,37.38611742232589
0.45454544469344740.158808946609timhigh12dermatologist-122.0839,37.38611739644189
0.45454544469344740.217882037163taimurlow15CEO-122.0839,37.38611742232589
0.45454544469344740.266666650772nancyhigh94doctor-122.4194,37.77491710696589
0.45454544469344740.653301358223joemedium35dentist-122.0839,37.38611742232589
@@ -518,7 +502,7 @@ result_print(index.query(v)) ``` -
scorevector_distanceusercredit_scoreagejoboffice_location
0.00.109129190445tylerhigh100engineer-122.0839,37.3861
0.00.109129190445tylerhigh100engineer-122.0839,37.3861
0.00.158808946609timhigh12dermatologist-122.0839,37.3861
0.00.158808946609timhigh12dermatologist-122.0839,37.3861
0.00.217882037163taimurlow15CEO-122.0839,37.3861
0.00.217882037163taimurlow15CEO-122.0839,37.3861
0.00.653301358223joemedium35dentist-122.0839,37.3861
0.00.653301358223joemedium35dentist-122.0839,37.3861
+
scorevector_distanceusercredit_scoreagejoboffice_locationlast_updated
0.00.109129190445tylerhigh100engineer-122.0839,37.38611742232589
0.00.158808946609timhigh12dermatologist-122.0839,37.38611739644189
0.00.217882037163taimurlow15CEO-122.0839,37.38611742232589
0.00.653301358223joemedium35dentist-122.0839,37.38611742232589
## Combining Filters @@ -601,7 +585,7 @@ result_print(index.query(v)) ``` -
vector_distanceusercredit_scoreagejoboffice_location
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.109129190445tylerhigh100engineer-122.0839,37.3861
+
vector_distanceusercredit_scoreagejoboffice_location
0.109129190445tylerhigh100engineer-122.0839,37.3861
@@ -613,7 +597,7 @@ result_print(index.query(v)) ``` -
vector_distanceusercredit_scoreagejoboffice_location
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.266666650772nancyhigh94doctor-122.4194,37.7749
0.266666650772nancyhigh94doctor-122.4194,37.7749
+
vector_distanceusercredit_scoreagejoboffice_location
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.266666650772nancyhigh94doctor-122.4194,37.7749
@@ -625,7 +609,7 @@ result_print(index.query(v)) ``` -
vector_distanceusercredit_scoreagejoboffice_location
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.266666650772nancyhigh94doctor-122.4194,37.7749
0.266666650772nancyhigh94doctor-122.4194,37.7749
0.653301358223joemedium35dentist-122.0839,37.3861
0.653301358223joemedium35dentist-122.0839,37.3861
+
vector_distanceusercredit_scoreagejoboffice_location
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.266666650772nancyhigh94doctor-122.4194,37.7749
0.653301358223joemedium35dentist-122.0839,37.3861
@@ -637,7 +621,7 @@ result_print(index.query(v)) ``` -
vector_distanceusercredit_scoreagejoboffice_location
0johnhigh18engineer-122.4194,37.7749
0derricklow14doctor-122.4194,37.7749
0johnhigh18engineer-122.4194,37.7749
0derricklow14doctor-122.4194,37.7749
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.158808946609timhigh12dermatologist-122.0839,37.3861
0.158808946609timhigh12dermatologist-122.0839,37.3861
0.217882037163taimurlow15CEO-122.0839,37.3861
0.217882037163taimurlow15CEO-122.0839,37.3861
+
vector_distanceusercredit_scoreagejoboffice_location
0johnhigh18engineer-122.4194,37.7749
0derricklow14doctor-122.4194,37.7749
0.109129190445tylerhigh100engineer-122.0839,37.3861
0.158808946609timhigh12dermatologist-122.0839,37.3861
0.217882037163taimurlow15CEO-122.0839,37.3861
0.266666650772nancyhigh94doctor-122.4194,37.7749
0.653301358223joemedium35dentist-122.0839,37.3861
## Non-vector Queries @@ -661,7 +645,7 @@ result_print(results) ``` -
usercredit_scoreagejob
derricklow14doctor
taimurlow15CEO
derricklow14doctor
taimurlow15CEO
+
usercredit_scoreagejob
derricklow14doctor
taimurlow15CEO
## Count Queries @@ -681,7 +665,7 @@ count = index.query(filter_query) print(f"{count} records match the filter expression {str(has_low_credit)} for the given index.") ``` - 4 records match the filter expression @credit_score:{low} for the given index. + 2 records match the filter expression @credit_score:{low} for the given index. ## Range Queries @@ -706,7 +690,7 @@ result_print(results) ``` -
vector_distanceusercredit_scoreagejob
0johnhigh18engineer
0derricklow14doctor
0johnhigh18engineer
0derricklow14doctor
0.109129190445tylerhigh100engineer
0.109129190445tylerhigh100engineer
0.158808946609timhigh12dermatologist
0.158808946609timhigh12dermatologist
+
vector_distanceusercredit_scoreagejob
0johnhigh18engineer
0derricklow14doctor
0.109129190445tylerhigh100engineer
0.158808946609timhigh12dermatologist
We can also change the distance threshold of the query object between uses if we like. Here we will set ``distance_threshold==0.1``. This means that the query object will return all matches that are within 0.1 of the query object. This is a small distance, so we expect to get fewer matches than before. @@ -719,7 +703,7 @@ result_print(index.query(range_query)) ``` -
vector_distanceusercredit_scoreagejob
0johnhigh18engineer
0derricklow14doctor
0johnhigh18engineer
0derricklow14doctor
+
vector_distanceusercredit_scoreagejob
0johnhigh18engineer
0derricklow14doctor
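The effect of tightening the threshold can be checked with plain Python over the distances shown in the tables above. This is only a toy post-filter to illustrate the semantics — the server prunes by distance during the index scan, it does not fetch everything and filter afterwards:

```python
# Vector distances for each user, as returned by the earlier queries
results = [
    ("john", 0.0),
    ("derrick", 0.0),
    ("tyler", 0.109129190445),
    ("tim", 0.158808946609),
    ("taimur", 0.217882037163),
    ("nancy", 0.266666650772),
    ("joe", 0.653301358223),
]

def within(threshold):
    # A range query keeps only matches whose vector distance is <= the threshold
    return [user for user, dist in results if dist <= threshold]

print(within(0.2))  # ['john', 'derrick', 'tyler', 'tim']
print(within(0.1))  # ['john', 'derrick']
```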
Range queries can also be used with filters like any other query type. The following limits the results to only include records with a ``job`` of ``engineer`` while also being within the vector range (aka distance). @@ -734,7 +718,7 @@ result_print(index.query(range_query)) ``` -
vector_distanceusercredit_scoreagejob
0johnhigh18engineer
0johnhigh18engineer
+
vector_distanceusercredit_scoreagejob
0johnhigh18engineer
## Advanced Query Modifiers @@ -757,7 +741,7 @@ result_print(result) ``` -
vector_distanceageusercredit_scorejoboffice_location
0.109129190445100tylerhighengineer-122.0839,37.3861
0.109129190445100tylerhighengineer-122.0839,37.3861
018johnhighengineer-122.4194,37.7749
018johnhighengineer-122.4194,37.7749
+
vector_distanceageusercredit_scorejoboffice_location
0.109129190445100tylerhighengineer-122.0839,37.3861
018johnhighengineer-122.4194,37.7749
### Raw Redis Query String @@ -819,14 +803,10 @@ for r in results.docs: print(r.__dict__) ``` - {'id': 'user_queries_docs:01JMJJHE28G5F943YGWMB1ZX1V', 'payload': None, 'user': 'tim', 'age': '12', 'job': 'dermatologist', 'credit_score': 'high', 'office_location': '-122.0839,37.3861', 'user_embedding': '>>\x00\x00\x00?'} - {'id': 'user_queries_docs:01JMJJHE28ZW4F33ZNRKXRHYCS', 'payload': None, 'user': 'john', 'age': '18', 'job': 'engineer', 'credit_score': 'high', 'office_location': '-122.4194,37.7749', 'user_embedding': '==\x00\x00\x00?'} - {'id': 'user_queries_docs:01JMJJHE28B5R6T00DH37A7KSJ', 'payload': None, 'user': 'tyler', 'age': '100', 'job': 'engineer', 'credit_score': 'high', 'office_location': '-122.0839,37.3861', 'user_embedding': '=>\x00\x00\x00?'} - {'id': 'user_queries_docs:01JMJJHE28EX13NEE7BGBM8FH3', 'payload': None, 'user': 'nancy', 'age': '94', 'job': 'doctor', 'credit_score': 'high', 'office_location': '-122.4194,37.7749', 'user_embedding': '333?=\x00\x00\x00?'} - {'id': 'user_queries_docs:01JMJJPEYCQ89ZQW6QR27J72WT', 'payload': None, 'user': 'john', 'age': '18', 'job': 'engineer', 'credit_score': 'high', 'office_location': '-122.4194,37.7749', 'user_embedding': '==\x00\x00\x00?'} - {'id': 'user_queries_docs:01JMJJPEYDAN0M3V7EQEVPS6HX', 'payload': None, 'user': 'nancy', 'age': '94', 'job': 'doctor', 'credit_score': 'high', 'office_location': '-122.4194,37.7749', 'user_embedding': '333?=\x00\x00\x00?'} - {'id': 'user_queries_docs:01JMJJPEYDPF9S5328WHCQN0ND', 'payload': None, 'user': 'tyler', 'age': '100', 'job': 'engineer', 'credit_score': 'high', 'office_location': '-122.0839,37.3861', 'user_embedding': '=>\x00\x00\x00?'} - {'id': 'user_queries_docs:01JMJJPEYDKA9ARKHRK1D7KPXQ', 'payload': None, 'user': 'tim', 'age': '12', 'job': 'dermatologist', 'credit_score': 'high', 'office_location': '-122.0839,37.3861', 'user_embedding': '>>\x00\x00\x00?'} + {'id': 'user_queries_docs:01JWEWQHJX670FQM0GKCV403XE', 'payload': None, 'user': 'john', 'age': '18', 
'job': 'engineer', 'credit_score': 'high', 'office_location': '-122.4194,37.7749', 'user_embedding': '==\x00\x00\x00?', 'last_updated': '1741627789'} + {'id': 'user_queries_docs:01JWEWQHJYVBNG97WEWDRQEKCD', 'payload': None, 'user': 'nancy', 'age': '94', 'job': 'doctor', 'credit_score': 'high', 'office_location': '-122.4194,37.7749', 'user_embedding': '333?=\x00\x00\x00?', 'last_updated': '1710696589'} + {'id': 'user_queries_docs:01JWEWQHJYR2WTWMWZJ7PQ62N1', 'payload': None, 'user': 'tyler', 'age': '100', 'job': 'engineer', 'credit_score': 'high', 'office_location': '-122.0839,37.3861', 'user_embedding': '=>\x00\x00\x00?', 'last_updated': '1742232589'} + {'id': 'user_queries_docs:01JWEWQHJY30EW4B8X2EE6PPZS', 'payload': None, 'user': 'tim', 'age': '12', 'job': 'dermatologist', 'credit_score': 'high', 'office_location': '-122.0839,37.3861', 'user_embedding': '>>\x00\x00\x00?', 'last_updated': '1739644189'} diff --git a/content/integrate/redisvl/user_guide/llmcache.md b/content/integrate/redisvl/user_guide/llmcache.md index bc98756950..07ba9e93a6 100644 --- a/content/integrate/redisvl/user_guide/llmcache.md +++ b/content/integrate/redisvl/user_guide/llmcache.md @@ -43,7 +43,7 @@ def ask_openai(question: str) -> str: print(ask_openai("What is the capital of France?")) ``` - 19:17:51 httpx INFO HTTP Request: POST https://api.openai.com/v1/completions "HTTP/1.1 200 OK" + 16:49:16 httpx INFO HTTP Request: POST https://api.openai.com/v1/completions "HTTP/1.1 200 OK" The capital of France is Paris. 
@@ -53,22 +53,29 @@ print(ask_openai("What is the capital of France?")) ```python +import warnings +warnings.filterwarnings('ignore') + from redisvl.extensions.cache.llm import SemanticCache -from redisvl.utils .vectorize import HFTextVectorizer +from redisvl.utils.vectorize import HFTextVectorizer llmcache = SemanticCache( name="llmcache", # underlying search index name redis_url="redis://localhost:6379", # redis connection url string distance_threshold=0.1, # semantic cache distance threshold - vectorizer=HFTextVectorizer("redis/langcache-embed-v1"), # embedding model + vectorizer=HFTextVectorizer("redis/langcache-embed-v1"), # embedding model ) ``` - 19:17:51 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps - 19:17:51 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: redis/langcache-embed-v1 + 16:49:16 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps + 16:49:16 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: redis/langcache-embed-v1 + 16:49:16 sentence_transformers.SentenceTransformer WARNING You try to use a model that was created with version 4.1.0, however, your version is 3.4.1. This might cause unexpected behavior or errors. In that case, try to update to the latest version.
+ + + - Batches: 100%|██████████| 1/1 [00:00<00:00, 17.57it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 21.18it/s] @@ -113,7 +120,7 @@ else: print("Empty cache") ``` - Batches: 100%|██████████| 1/1 [00:00<00:00, 18.30it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 14.60it/s] Empty cache @@ -134,7 +141,7 @@ llmcache.store( ) ``` - Batches: 100%|██████████| 1/1 [00:00<00:00, 26.10it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 18.57it/s] @@ -155,12 +162,14 @@ else: print("Empty cache") ``` - Batches: 100%|██████████| 1/1 [00:00<00:00, 12.36it/s] - + Batches: 100%|██████████| 1/1 [00:00<00:00, 20.09it/s] [{'prompt': 'What is the capital of France?', 'response': 'Paris', 'metadata': {'city': 'Paris', 'country': 'france'}, 'key': 'llmcache:115049a298532be2f181edb03f766770c0db84c22aff39003fec340deaec7545'}] + + + ```python # Check for a semantically similar result @@ -168,7 +177,7 @@ question = "What actually is the capital of France?" llmcache.check(prompt=question)[0]['response'] ``` - Batches: 100%|██████████| 1/1 [00:00<00:00, 12.22it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 16.65it/s] @@ -199,7 +208,7 @@ question = "What is the capital city of the country in Europe that also has a ci llmcache.check(prompt=question)[0]['response'] ``` - Batches: 100%|██████████| 1/1 [00:00<00:00, 19.20it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 17.66it/s] @@ -214,11 +223,11 @@ llmcache.check(prompt=question)[0]['response'] # Invalidate the cache completely by clearing it out llmcache.clear() -# should be empty now +# Should be empty now llmcache.check(prompt=question) ``` - Batches: 100%|██████████| 1/1 [00:00<00:00, 26.71it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 20.65it/s] @@ -247,7 +256,7 @@ llmcache.store("This is a TTL test", "This is a TTL test response") time.sleep(6) ``` - Batches: 100%|██████████| 1/1 [00:00<00:00, 20.45it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 18.25it/s] @@ -258,7 +267,7 @@ result = llmcache.check("This is a 
TTL test") print(result) ``` - Batches: 100%|██████████| 1/1 [00:00<00:00, 17.02it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 13.91it/s] [] @@ -310,14 +319,14 @@ print(f"Without caching, a call to openAI to answer this simple question took {e llmcache.store(prompt=question, response="George Washington") ``` - Batches: 100%|██████████| 1/1 [00:00<00:00, 14.88it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 18.02it/s] - 19:18:04 httpx INFO HTTP Request: POST https://api.openai.com/v1/completions "HTTP/1.1 200 OK" - Without caching, a call to openAI to answer this simple question took 0.8826751708984375 seconds. + 16:49:27 httpx INFO HTTP Request: POST https://api.openai.com/v1/completions "HTTP/1.1 200 OK" + Without caching, a call to openAI to answer this simple question took 1.6722779273986816 seconds. - Batches: 100%|██████████| 1/1 [00:00<00:00, 18.38it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 12.05it/s] @@ -343,19 +352,19 @@ print(f"Avg time taken with LLM cache enabled: {avg_time_with_cache}") print(f"Percentage of time saved: {round(((end - start) - avg_time_with_cache) / (end - start) * 100, 2)}%") ``` - Batches: 100%|██████████| 1/1 [00:00<00:00, 13.65it/s] - Batches: 100%|██████████| 1/1 [00:00<00:00, 27.94it/s] - Batches: 100%|██████████| 1/1 [00:00<00:00, 27.19it/s] - Batches: 100%|██████████| 1/1 [00:00<00:00, 27.53it/s] - Batches: 100%|██████████| 1/1 [00:00<00:00, 28.12it/s] - Batches: 100%|██████████| 1/1 [00:00<00:00, 27.38it/s] - Batches: 100%|██████████| 1/1 [00:00<00:00, 25.39it/s] - Batches: 100%|██████████| 1/1 [00:00<00:00, 26.34it/s] - Batches: 100%|██████████| 1/1 [00:00<00:00, 28.07it/s] - Batches: 100%|██████████| 1/1 [00:00<00:00, 27.35it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 16.95it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 20.13it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 21.62it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 21.25it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 21.84it/s] + 
Batches: 100%|██████████| 1/1 [00:00<00:00, 21.82it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 21.21it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 20.62it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 21.13it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 21.55it/s] - Avg time taken with LLM cache enabled: 0.0463670015335083 - Percentage of time saved: 94.75% + Avg time taken with LLM cache enabled: 0.05201866626739502 + Percentage of time saved: 96.89% @@ -388,7 +397,7 @@ print(f"Percentage of time saved: {round(((end - start) - avg_time_with_cache) / │ offsets_per_term_avg │ 0.75862067 │ │ records_per_doc_avg │ 29 │ │ sortable_values_size_mb │ 0 │ - │ total_indexing_time │ 3.875 │ + │ total_indexing_time │ 0.29899999 │ │ total_inverted_index_blocks │ 21 │ │ vector_index_sz_mb │ 3.01609802 │ ╰─────────────────────────────┴────────────╯ @@ -425,14 +434,17 @@ private_cache.store( ) ``` - 19:18:07 [RedisVL] WARNING The default vectorizer has changed from `sentence-transformers/all-mpnet-base-v2` to `redis/langcache-embed-v1` in version 0.6.0 of RedisVL. For more information about this model, please refer to https://arxiv.org/abs/2504.02268 or visit https://huggingface.co/redis/langcache-embed-v1. To continue using the old vectorizer, please specify it explicitly in the constructor as: vectorizer=HFTextVectorizer(model='sentence-transformers/all-mpnet-base-v2') - 19:18:07 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps - 19:18:07 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: redis/langcache-embed-v1 + 16:49:30 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps + 16:49:30 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: redis/langcache-embed-v1 + 16:49:30 sentence_transformers.SentenceTransformer WARNING You try to use a model that was created with version 4.1.0, however, your version is 3.4.1. 
This might cause unexpected behavior or errors. In that case, try to update to the latest version. + + + - Batches: 100%|██████████| 1/1 [00:00<00:00, 8.98it/s] - Batches: 100%|██████████| 1/1 [00:00<00:00, 24.89it/s] - Batches: 100%|██████████| 1/1 [00:00<00:00, 26.95it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 16.14it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 19.67it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 21.51it/s] @@ -458,7 +470,7 @@ response = private_cache.check( print(f"found {len(response)} entry \n{response[0]['response']}") ``` - Batches: 100%|██████████| 1/1 [00:00<00:00, 27.98it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 21.55it/s] found 1 entry The number on file is 123-555-0000 @@ -509,16 +521,19 @@ complex_cache.store( ) ``` - 19:18:09 [RedisVL] WARNING The default vectorizer has changed from `sentence-transformers/all-mpnet-base-v2` to `redis/langcache-embed-v1` in version 0.6.0 of RedisVL. For more information about this model, please refer to https://arxiv.org/abs/2504.02268 or visit https://huggingface.co/redis/langcache-embed-v1. To continue using the old vectorizer, please specify it explicitly in the constructor as: vectorizer=HFTextVectorizer(model='sentence-transformers/all-mpnet-base-v2') - 19:18:09 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps - 19:18:09 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: redis/langcache-embed-v1 + 16:49:31 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps + 16:49:31 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: redis/langcache-embed-v1 + 16:49:31 sentence_transformers.SentenceTransformer WARNING You try to use a model that was created with version 4.1.0, however, your version is 3.4.1. This might cause unexpected behavior or errors. In that case, try to update to the latest version. 
+ + + - Batches: 100%|██████████| 1/1 [00:00<00:00, 13.54it/s] - Batches: 100%|██████████| 1/1 [00:00<00:00, 16.76it/s] - Batches: 100%|██████████| 1/1 [00:00<00:00, 21.82it/s] - Batches: 100%|██████████| 1/1 [00:00<00:00, 28.80it/s] - Batches: 100%|██████████| 1/1 [00:00<00:00, 21.04it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 20.21it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 17.24it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 16.95it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 21.26it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 19.48it/s] @@ -547,7 +562,7 @@ print(f'found {len(response)} entry') print(response[0]["response"]) ``` - Batches: 100%|██████████| 1/1 [00:00<00:00, 28.15it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 21.76it/s] found 1 entry Your most recent transaction was for $350 diff --git a/content/integrate/redisvl/user_guide/message_history.md b/content/integrate/redisvl/user_guide/message_history.md index 61a4628034..7ba8e607ba 100644 --- a/content/integrate/redisvl/user_guide/message_history.md +++ b/content/integrate/redisvl/user_guide/message_history.md @@ -15,12 +15,10 @@ This notebook will show how to use Redis to structure and store and retrieve thi ```python from redisvl.extensions.message_history import MessageHistory + chat_history = MessageHistory(name='student tutor') ``` - 12:24:11 redisvl.index.index INFO Index already exists, not overwriting. - - To align with common LLM APIs, Redis stores messages with `role` and `content` fields. The supported roles are "system", "user" and "llm". @@ -130,7 +128,23 @@ semantic_history = SemanticMessageHistory(name='tutor') semantic_history.add_messages(chat_history.get_recent(top_k=8)) ``` - 12:24:15 redisvl.index.index INFO Index already exists, not overwriting. + /Users/tyler.hutcherson/Library/Caches/pypoetry/virtualenvs/redisvl-VnTEShF2-py3.13/lib/python3.13/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. 
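The `private_cache`/`complex_cache` hunks above rely on filterable fields: tag filters narrow the candidate set before any vector distance is applied, which is how one user's cached entries stay invisible to another. A toy sketch of that filter-then-search order (the helper and field names here are illustrative, not the RedisVL API):

```python
import math

def cosine_distance(a, b):
    # 1 - cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def check_with_filter(entries, query_vec, threshold, **tags):
    results = []
    for entry in entries:
        if any(entry.get(k) != v for k, v in tags.items()):
            continue  # filtered out before any distance computation
        d = cosine_distance(query_vec, entry["vector"])
        if d <= threshold:
            results.append((d, entry["response"]))
    return [r for _, r in sorted(results)]

entries = [
    {"vector": [1.0, 0.0], "response": "123-555-0000", "user_id": "abc"},
    {"vector": [1.0, 0.0], "response": "555-867-5309", "user_id": "def"},
]
# Identical vectors, but the tag filter scopes the result to one user
print(check_with_filter(entries, [1.0, 0.0], 0.1, user_id="abc"))
```

In Redis the filter and the vector search run server-side in one query; the point of the sketch is only the ordering: tag predicates prune first, distance ranks what remains.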
See https://ipywidgets.readthedocs.io/en/stable/user_install.html + from .autonotebook import tqdm as notebook_tqdm + + + 16:52:21 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps + 16:52:21 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: sentence-transformers/all-mpnet-base-v2 + + + Batches: 100%|██████████| 1/1 [00:00<00:00, 6.25it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 3.21it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 11.70it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 13.56it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 59.68it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 63.31it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 8.70it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 13.22it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 59.67it/s] @@ -142,8 +156,12 @@ for message in context: print(message) ``` + Batches: 100%|██████████| 1/1 [00:00<00:00, 62.76it/s] + {'role': 'user', 'content': 'what is the size of England compared to Portugal?'} - {'role': 'llm', 'content': 'England is larger in land area than Portal by about 15000 square miles.'} + + + You can adjust the degree of semantic similarity needed to be included in your context. 
@@ -159,10 +177,16 @@ for message in larger_context: print(message) ``` + Batches: 100%|██████████| 1/1 [00:00<00:00, 61.35it/s] + {'role': 'user', 'content': 'what is the size of England compared to Portugal?'} {'role': 'llm', 'content': 'England is larger in land area than Portal by about 15000 square miles.'} {'role': 'user', 'content': 'What is the population of Great Britain?'} {'role': 'llm', 'content': 'As of 2023 the population of Great Britain is approximately 67 million people.'} + {'role': 'user', 'content': 'And what is the capital of Spain?'} + + + ## Conversation control @@ -174,7 +198,7 @@ LLMs can hallucinate on occasion and when this happens it can be useful to prune semantic_history.store( prompt="what is the smallest country in Europe?", response="Monaco is the smallest country in Europe at 0.78 square miles." # Incorrect. Vatican City is the smallest country in Europe - ) +) # get the key of the incorrect message context = semantic_history.get_recent(top_k=1, raw=True) @@ -186,6 +210,10 @@ for message in corrected_context: print(message) ``` + Batches: 100%|██████████| 1/1 [00:00<00:00, 61.00it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 8.48it/s] + + {'role': 'user', 'content': 'What is the population of Great Britain?'} {'role': 'llm', 'content': 'As of 2023 the population of Great Britain is approximately 67 million people.'} {'role': 'user', 'content': 'what is the size of England compared to Portugal?'} diff --git a/content/integrate/redisvl/user_guide/semantic_router.md b/content/integrate/redisvl/user_guide/semantic_router.md index 74ea12c04e..7f67610c2d 100644 --- a/content/integrate/redisvl/user_guide/semantic_router.md +++ b/content/integrate/redisvl/user_guide/semantic_router.md @@ -90,14 +90,18 @@ router = SemanticRouter( ) ``` - 19:18:32 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps - 19:18:32 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: 
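The hunk above shows widening the semantic-similarity requirement pulling extra messages (including an off-topic one) into the context. A small sketch of that threshold effect with toy vectors standing in for embeddings (`get_relevant` here is a hypothetical helper, not the `SemanticMessageHistory` method signature):

```python
import math

def cosine_distance(a, b):
    # 1 - cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def get_relevant(messages, query_vec, distance_threshold):
    # Keep only messages whose embedding is within the threshold
    return [m["content"] for m in messages
            if cosine_distance(query_vec, m["vector"]) <= distance_threshold]

messages = [
    {"content": "size of England vs Portugal", "vector": [1.0, 0.0, 0.0]},
    {"content": "population of Great Britain", "vector": [0.9, 0.4, 0.0]},
    {"content": "capital of Spain",            "vector": [0.0, 0.0, 1.0]},
]
query = [1.0, 0.0, 0.0]
print(get_relevant(messages, query, 0.05))  # tight: only the closest message
print(get_relevant(messages, query, 0.30))  # looser: a related message joins
```

Raising the threshold trades precision for recall, which is exactly the behavior the notebook output demonstrates when the Spain question starts appearing in the larger context.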
sentence-transformers/all-mpnet-base-v2 + /Users/tyler.hutcherson/Library/Caches/pypoetry/virtualenvs/redisvl-VnTEShF2-py3.13/lib/python3.13/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html + from .autonotebook import tqdm as notebook_tqdm - Batches: 100%|██████████| 1/1 [00:00<00:00, 17.78it/s] - Batches: 100%|██████████| 1/1 [00:00<00:00, 37.43it/s] - Batches: 100%|██████████| 1/1 [00:00<00:00, 27.28it/s] - Batches: 100%|██████████| 1/1 [00:00<00:00, 48.76it/s] + 16:52:49 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps + 16:52:49 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: sentence-transformers/all-mpnet-base-v2 + + + Batches: 100%|██████████| 1/1 [00:00<00:00, 7.67it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 8.97it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 5.24it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 48.90it/s] @@ -146,13 +150,13 @@ route_match = router("Can you tell me about the latest in artificial intelligenc route_match ``` - Batches: 100%|██████████| 1/1 [00:00<00:00, 6.40it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 8.83it/s] - RouteMatch(name='technology', distance=0.419145842393) + RouteMatch(name='technology', distance=0.419145941734) @@ -163,7 +167,7 @@ route_match = router("are aliens real?") route_match ``` - Batches: 100%|██████████| 1/1 [00:00<00:00, 39.83it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 12.45it/s] @@ -182,14 +186,14 @@ route_matches = router.route_many("How is AI used in basketball?", max_k=3) route_matches ``` - Batches: 100%|██████████| 1/1 [00:00<00:00, 40.50it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 10.98it/s] - [RouteMatch(name='technology', distance=0.556493878365), - RouteMatch(name='sports', distance=0.671060125033)] + [RouteMatch(name='technology', distance=0.556493639946), + 
RouteMatch(name='sports', distance=0.671060085297)] @@ -202,13 +206,13 @@ route_matches = router.route_many("How is AI used in basketball?", aggregation_m route_matches ``` - Batches: 100%|██████████| 1/1 [00:00<00:00, 66.18it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 52.93it/s] - [RouteMatch(name='technology', distance=0.556493878365), + [RouteMatch(name='technology', distance=0.556493639946), RouteMatch(name='sports', distance=0.629264354706)] @@ -232,13 +236,13 @@ route_matches = router.route_many("Lebron James") route_matches ``` - Batches: 100%|██████████| 1/1 [00:00<00:00, 41.89it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 10.93it/s] - [RouteMatch(name='sports', distance=0.663254022598)] + [RouteMatch(name='sports', distance=0.663253903389)] @@ -286,13 +290,13 @@ router2 = SemanticRouter.from_dict(router.to_dict(), redis_url="redis://localhos assert router2.to_dict() == router.to_dict() ``` - 19:18:38 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps - 19:18:38 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: sentence-transformers/all-mpnet-base-v2 + 16:52:53 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps + 16:52:53 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: sentence-transformers/all-mpnet-base-v2 - Batches: 100%|██████████| 1/1 [00:00<00:00, 54.94it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 45.24it/s] - 19:18:40 redisvl.index.index INFO Index already exists, not overwriting. + 16:52:54 redisvl.index.index INFO Index already exists, not overwriting. 
@@ -310,13 +314,13 @@ router3 = SemanticRouter.from_yaml("router.yaml", redis_url="redis://localhost:6 assert router3.to_dict() == router2.to_dict() == router.to_dict() ``` - 19:18:40 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps - 19:18:40 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: sentence-transformers/all-mpnet-base-v2 + 16:52:54 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps + 16:52:54 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: sentence-transformers/all-mpnet-base-v2 - Batches: 100%|██████████| 1/1 [00:00<00:00, 18.77it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 53.94it/s] - 19:18:41 redisvl.index.index INFO Index already exists, not overwriting. + 16:52:54 redisvl.index.index INFO Index already exists, not overwriting. @@ -329,7 +333,7 @@ assert router3.to_dict() == router2.to_dict() == router.to_dict() router.add_route_references(route_name="technology", references=["latest AI trends", "new tech gadgets"]) ``` - Batches: 100%|██████████| 1/1 [00:00<00:00, 13.22it/s] + Batches: 100%|██████████| 1/1 [00:00<00:00, 7.24it/s] @@ -352,26 +356,26 @@ refs - [{'id': 'topic-router:technology:7e4bca5853c1c3298b4d001de13c3c7a79a6e0f134f81acc2e7cddbd6845961f', - 'reference_id': '7e4bca5853c1c3298b4d001de13c3c7a79a6e0f134f81acc2e7cddbd6845961f', + [{'id': 'topic-router:technology:85cc73a1437df27caa2f075a29c497e5a2e532023fbb75378aedbae80779ab37', + 'reference_id': '85cc73a1437df27caa2f075a29c497e5a2e532023fbb75378aedbae80779ab37', 'route_name': 'technology', - 'reference': 'new tech gadgets'}, + 'reference': 'tell me about the newest gadgets'}, + {'id': 'topic-router:technology:851f51cce5a9ccfbbcb66993908be6b7871479af3e3a4b139ad292a1bf7e0676', + 'reference_id': '851f51cce5a9ccfbbcb66993908be6b7871479af3e3a4b139ad292a1bf7e0676', + 'route_name': 'technology', + 'reference': 'what are the latest advancements in AI?'}, {'id': 
'topic-router:technology:f243fb2d073774e81c7815247cb3013794e6225df3cbe3769cee8c6cefaca777', 'reference_id': 'f243fb2d073774e81c7815247cb3013794e6225df3cbe3769cee8c6cefaca777', 'route_name': 'technology', 'reference': 'latest AI trends'}, - {'id': 'topic-router:technology:851f51cce5a9ccfbbcb66993908be6b7871479af3e3a4b139ad292a1bf7e0676', - 'reference_id': '851f51cce5a9ccfbbcb66993908be6b7871479af3e3a4b139ad292a1bf7e0676', + {'id': 'topic-router:technology:7e4bca5853c1c3298b4d001de13c3c7a79a6e0f134f81acc2e7cddbd6845961f', + 'reference_id': '7e4bca5853c1c3298b4d001de13c3c7a79a6e0f134f81acc2e7cddbd6845961f', 'route_name': 'technology', - 'reference': 'what are the latest advancements in AI?'}, + 'reference': 'new tech gadgets'}, {'id': 'topic-router:technology:149a9c9919c58534aa0f369e85ad95ba7f00aa0513e0f81e2aff2ea4a717b0e0', 'reference_id': '149a9c9919c58534aa0f369e85ad95ba7f00aa0513e0f81e2aff2ea4a717b0e0', 'route_name': 'technology', - 'reference': "what's trending in tech?"}, - {'id': 'topic-router:technology:85cc73a1437df27caa2f075a29c497e5a2e532023fbb75378aedbae80779ab37', - 'reference_id': '85cc73a1437df27caa2f075a29c497e5a2e532023fbb75378aedbae80779ab37', - 'route_name': 'technology', - 'reference': 'tell me about the newest gadgets'}] + 'reference': "what's trending in tech?"}] @@ -385,10 +389,10 @@ refs - [{'id': 'topic-router:technology:7e4bca5853c1c3298b4d001de13c3c7a79a6e0f134f81acc2e7cddbd6845961f', - 'reference_id': '7e4bca5853c1c3298b4d001de13c3c7a79a6e0f134f81acc2e7cddbd6845961f', + [{'id': 'topic-router:technology:85cc73a1437df27caa2f075a29c497e5a2e532023fbb75378aedbae80779ab37', + 'reference_id': '85cc73a1437df27caa2f075a29c497e5a2e532023fbb75378aedbae80779ab37', 'route_name': 'technology', - 'reference': 'new tech gadgets'}] + 'reference': 'tell me about the newest gadgets'}] diff --git a/content/integrate/redisvl/user_guide/threshold_optimization.md b/content/integrate/redisvl/user_guide/threshold_optimization.md index 43722609ef..1b638ae3fa 100644 
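The router hunks above (`route_many` with an `aggregation_method`, plus per-route reference lists) boil down to: score each route by aggregating the query's distances to that route's references, then rank. A toy sketch of that scoring, assuming illustrative route names and hand-made vectors rather than real embeddings:

```python
import math

def cosine_distance(a, b):
    # 1 - cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def route_many(routes, query_vec, max_k=3, aggregation="min"):
    # "min" keeps the best single reference; "avg" smooths over all of them
    agg = min if aggregation == "min" else (lambda ds: sum(ds) / len(ds))
    matches = []
    for name, refs in routes.items():
        dists = [cosine_distance(query_vec, r) for r in refs]
        matches.append((agg(dists), name))
    return sorted(matches)[:max_k]

routes = {
    "technology": [[1.0, 0.0], [0.9, 0.1]],
    "sports": [[0.2, 1.0]],
}
print(route_many(routes, [1.0, 0.05]))                     # min aggregation
print(route_many(routes, [1.0, 0.05], aggregation="avg"))  # avg aggregation
```

Adding a route reference, as the hunk does with `add_route_references`, just grows one route's reference list, so a new phrasing can become that route's closest match without retraining anything.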
--- a/content/integrate/redisvl/user_guide/threshold_optimization.md +++ b/content/integrate/redisvl/user_guide/threshold_optimization.md @@ -34,18 +34,21 @@ rabat_key = sem_cache.store(prompt="what is the capital of morocco?", response=" ``` - /Users/justin.cechmanek/.pyenv/versions/3.13/envs/redisvl-dev/lib/python3.13/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html + /Users/tyler.hutcherson/Library/Caches/pypoetry/virtualenvs/redisvl-VnTEShF2-py3.13/lib/python3.13/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html from .autonotebook import tqdm as notebook_tqdm - 16:16:11 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps - 16:16:11 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: redis/langcache-embed-v1 + 16:53:13 sentence_transformers.SentenceTransformer INFO Use pytorch device_name: mps + 16:53:13 sentence_transformers.SentenceTransformer INFO Load pretrained SentenceTransformer: redis/langcache-embed-v1 + 16:53:13 sentence_transformers.SentenceTransformer WARNING You try to use a model that was created with version 4.1.0, however, your version is 3.4.1. This might cause unexpected behavior or errors. In that case, try to update to the latest version. + + + - Batches: 0%| | 0/1 [00:00
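The threshold-optimization guide this final hunk touches turns on one idea: given labeled examples of (distance, should-match), sweep candidate thresholds and keep the one that maximizes a quality metric such as F1. A self-contained sketch of that sweep (the sample distances below are invented for illustration, not taken from the notebook):

```python
def f1_at(threshold, samples):
    # samples: list of (distance, is_true_match) pairs
    tp = sum(1 for d, y in samples if d <= threshold and y)
    fp = sum(1 for d, y in samples if d <= threshold and not y)
    fn = sum(1 for d, y in samples if d > threshold and y)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def best_threshold(samples):
    # Only observed distances can change the confusion counts,
    # so they are the only candidate thresholds worth testing
    candidates = sorted({d for d, _ in samples})
    return max(candidates, key=lambda t: f1_at(t, samples))

samples = [(0.05, True), (0.12, True), (0.20, False), (0.45, False)]
print(best_threshold(samples))  # the cut that separates matches from non-matches
```

This is the shape of what a threshold optimizer does for a semantic cache or router: the tuned `distance_threshold` is simply the cut point that best separates labeled positives from negatives on test data.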