# Redis LangCache — English Demo (Gradio UI)

A fully functional demo showing **Redis LangCache** + **OpenAI** in action, implementing **semantic caching** with **scoped isolation** by Company / Business Unit / Person — all in a **Gradio web interface**.

> Main demo file: [`main_demo_released.py`](https://github.com/Redislabs-Solution-Architects/redis-langcache-python-example/blob/main/main_demo_released.py)

---

## ✨ What This Demo Does

- Demonstrates **semantic caching** for LLM responses to reduce **latency** and **API cost**.
- **Scoped reuse** of answers by **Company / Business Unit / Person** — adjustable isolation levels.
- **Domain disambiguation**: ambiguous questions (“cell”, “network”, “bank”) are automatically interpreted in the correct domain.
- **Identity handling**:
  - **Name** → not cached (displayed only when asked).
  - **Role/Function** → stored under an exact key (`[IDENTITY:ROLE]`) and supports “set” (e.g., “My role is …”).
- **Cache management UI**: clear cached entries by scope (A, B, or both) — *the index is never deleted.*
- **Real-time KPIs**: cache hits, misses, hit rate, estimated tokens saved, and estimated $ savings.
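The domain-disambiguation behavior can be illustrated with a minimal sketch: ambiguous terms trigger a prompt rewrite that appends explicit domain context before the cache lookup. The term list, business-unit mapping, and function name below are hypothetical illustrations, not taken from the demo code:

```python
# Hypothetical sketch of domain disambiguation: ambiguous terms are
# rewritten with explicit domain context before the cache lookup.
AMBIGUOUS_TERMS = {"cell", "network", "bank"}  # illustrative term list

DOMAIN_BY_BUSINESS_UNIT = {  # illustrative mapping
    "Healthcare": "healthcare",
    "Engineering": "software engineering",
    "Finance": "finance",
}

def disambiguate(prompt: str, business_unit: str) -> str:
    """Append domain context when the prompt contains an ambiguous term."""
    domain = DOMAIN_BY_BUSINESS_UNIT.get(business_unit)
    words = {w.strip("?.,!").lower() for w in prompt.split()}
    if domain and words & AMBIGUOUS_TERMS:
        return f"{prompt} (in the context of {domain})"
    return prompt

print(disambiguate("What is a cell?", "Healthcare"))
# prints: What is a cell? (in the context of healthcare)
```

Because the rewritten prompt is what gets cached, “What is a cell?” asked from a healthcare scope and from a software scope produce distinct cache entries.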

---

## 📁 Project Structure

```
.
├── main_demo_released.py   # Main Gradio app (this demo)
├── requirements.txt        # Python dependencies
├── Dockerfile               # Docker build
├── docker-compose.yml       # Example local orchestration
└── .env                     # Environment variables (not committed)
```

> The repository also includes additional examples (RAG, attribute-based caching, etc.).
> This demo uses **`main_demo_released.py`** as its entry point.

---

## 🔐 Environment Variables

Create a `.env` file in the project root with:

```env
# OpenAI
OPENAI_API_KEY=sk-proj-<your-openai-key>
OPENAI_MODEL=gpt-4o-mini

# LangCache (Redis Cloud)
LANGCACHE_SERVICE_KEY=<your-service-key>   # or LANGCACHE_API_KEY
LANGCACHE_CACHE_ID=<your-cache-id>
LANGCACHE_BASE_URL=https://gcp-us-east4.langcache.redis.io

# (Optional) local Redis or other configs
REDIS_URL=redis://localhost:6379/0

# Embedding model (for the RAG examples)
EMBED_MODEL=text-embedding-3-small
EMBED_DIM=1536
```

> `LANGCACHE_API_KEY` and `LANGCACHE_SERVICE_KEY` are interchangeable for this app — set one of them.

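A minimal sketch of loading and validating these variables at startup, using only the standard library (the function name is hypothetical; the demo itself may load its configuration differently):

```python
import os

def load_langcache_config() -> dict:
    """Read LangCache settings from the environment, accepting either
    LANGCACHE_SERVICE_KEY or LANGCACHE_API_KEY (interchangeable here)."""
    api_key = os.getenv("LANGCACHE_SERVICE_KEY") or os.getenv("LANGCACHE_API_KEY")
    cache_id = os.getenv("LANGCACHE_CACHE_ID")
    base_url = os.getenv("LANGCACHE_BASE_URL", "https://gcp-us-east4.langcache.redis.io")

    # Fail fast with a clear message if required settings are missing.
    missing = [name for name, value in
               [("LANGCACHE_SERVICE_KEY/LANGCACHE_API_KEY", api_key),
                ("LANGCACHE_CACHE_ID", cache_id)] if not value]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    return {"api_key": api_key, "cache_id": cache_id, "base_url": base_url}
```

Validating early like this surfaces a missing key as one readable error instead of a failed API call later.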

---

## 🚀 Running the Demo

### 1) Locally (Python)

```bash
python -m venv .venv
source .venv/bin/activate    # Linux/macOS
# .venv\Scripts\activate     # Windows PowerShell
pip install -r requirements.txt

# Ensure your .env is configured
python main_demo_released.py
```

The UI starts at **http://localhost:7860**.

---

### 2) With Docker (prebuilt image)

```bash
docker run -d \
  --name langcache-demo \
  --env-file .env \
  -p 7860:7860 \
  gacerioni/gabs-redis-langcache:1.0.5
```

> Apple Silicon (arm64): if needed, add `--platform linux/amd64` when running the image.

---

### 3) Docker Compose (optional)

```yaml
# docker-compose.yml
version: "3.9"
services:
  langcache-demo:
    image: gacerioni/gabs-redis-langcache:1.0.5
    # platform: linux/amd64   # uncomment on Apple Silicon if needed
    env_file:
      - .env
    ports:
      - "7860:7860"
    restart: unless-stopped
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
```

```bash
docker compose up -d
```

---

## 🧑‍💻 Using the UI

1. Set **Company**, **Business Unit**, and **Person** for both **Scenario A and B**.
2. Ask questions in both panels to observe **cache hits/misses** and **domain-aware disambiguation**.
3. Use the **🧹 Clear Cache** buttons to delete entries by scope (A, B, or both).
   > ⚠️ This clears cached **entries only** — the index is **never deleted**.

Recommended questions for the demonstration:

- “**My role is Doctor.**” / “**My role is Software Engineer.**”
- “**What is my role in the company?**”
- “**What is a cell?**” (compare the healthcare vs. software answers)
- “**Explain what machine learning is.**” / “**What is machine learning?**”
- “**What is my name?**”

---

## 🧠 How It Works

1. **Search** Redis LangCache for semantically similar prompts.
2. If a **cache hit** (above the similarity threshold) is found, return the cached response.
3. If a **miss** occurs:
   - Query OpenAI.
   - Store a **neutral** response (no user identity) in the cache.
4. Isolation is managed via attributes: `company`, `business_unit`, and `person`.
5. Ambiguous prompts are internally **rewritten** with explicit domain context (e.g., “(in the context of healthcare)”).

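The search → miss → store loop with attribute-based isolation can be sketched with a tiny in-memory stand-in. This is only an illustration of the flow: `difflib` string similarity substitutes for LangCache's real embedding-based matching, and the class name, scope tuple, and threshold are all made up for the example:

```python
from difflib import SequenceMatcher

class ScopedCacheSketch:
    """In-memory stand-in for LangCache: entries match by text similarity,
    but only within the same (company, business_unit) scope."""

    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self.entries = []  # list of (scope, prompt, response)

    def search(self, prompt, scope):
        for entry_scope, entry_prompt, response in self.entries:
            if entry_scope != scope:
                continue  # scoped isolation: other scopes never match
            score = SequenceMatcher(None, prompt.lower(), entry_prompt.lower()).ratio()
            if score >= self.threshold:
                return response  # cache hit
        return None  # cache miss -> caller queries OpenAI, then store()s

    def store(self, prompt, response, scope):
        self.entries.append((scope, prompt, response))

cache = ScopedCacheSketch()
scope_a = ("Acme", "Healthcare")
cache.store("What is machine learning?", "ML is ...", scope_a)

# Near-duplicate phrasing in the same scope is a hit:
assert cache.search("What is machine learning", scope_a) == "ML is ..."
# The same question from another scope is a miss:
assert cache.search("What is machine learning?", ("Acme", "Finance")) is None
```

The real service performs the same two checks — semantic similarity plus attribute equality — on the server side, so cached answers never leak across companies or business units.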
---

## ⚙️ CI/CD Pipeline (optional)

You can automate the Docker build and release with GitHub Actions.
The existing workflow builds a **multi-arch** image and publishes it on new tags (`vX.Y.Z`).

Required repository secrets:

- `DOCKERHUB_USERNAME`
- `DOCKERHUB_TOKEN` (Docker Hub PAT)
- `GITHUB_TOKEN` (provided automatically)

---

## 🔗 Useful Links

- **Redis LangCache documentation:** https://redis.io/docs/latest/solutions/semantic-caching/langcache/
- **Redis website:** https://redis.io/
- **LinkedIn (Gabriel Cerioni):** https://www.linkedin.com/in/gabrielcerioni/

---

## 📜 License

MIT — feel free to use, adapt, and share.