Description
Hi Memori team,
I opened an earlier issue #68 asking about per-user memories, where you reassured me that per-user memories do exist; this is further supported by your multi-user scenario example. That issue was primarily to confirm the support and to inquire about the term used to identify individual users, "namespaces", which are said to be isolated per user.
I'm opening a new issue because the repository currently advertises per-user isolation, but the shipping code can't deliver it.
How to reproduce (verbatim example)
- From repo root:
uv sync --extra dev
uv pip install python-dotenv fastapi uvicorn
uv run python examples/multiple-users/fastapi_multiuser_app.py
- In another terminal:
curl -s -X POST http://127.0.0.1:8000/chat -H 'Content-Type: application/json' -d '{"user_id":"alice","message":"alice: please remember that I like pizza"}'
curl -s -X POST http://127.0.0.1:8000/chat -H 'Content-Type: application/json' -d '{"user_id":"bob","message":"bob: remind me to call alice"}'
- Inspect the database:
sqlite3 fastapi_multiuser_memory.db "SELECT namespace, user_input FROM chat_history ORDER BY timestamp;"
Observed
- Both messages appear in both namespaces (fastapi_user_alice and fastapi_user_bob). There is no isolation.
- The DB explodes in size after just those two curls (e.g., ~8MB), filled with repeated "CONVERSATION CONTEXT" JSON blocks. This is likely fixed in #102 ("Improved performance, implemented caching, improved error handling").
Analysis
- Each Memori.enable() registers a LiteLLM success callback into a shared global list. Every completion triggers every registered callback, and each callback calls record_conversation() under its own namespace. With two enabled instances, every chat is therefore written twice, once into each namespace (see the sketch after this list).
- record_conversation() schedules long-term processing, which itself prompts the LLM for summarization, so the callback chain feeds back into itself (effectively a recursion). Because both instances record the same chat, both launch their own processing prompts, multiplying requests and writes. This is why TPM and DB size blow up. Again, this is likely fixed in #102 ("Improved performance, implemented caching, improved error handling"), but I am keeping it here in case this aspect of the issue was missed.
- The FastAPI example never routes a request to a specific Memori instance; it calls litellm.completion() globally and relies on side effects, which is exactly what causes the fan-out.
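To make the fan-out concrete, here is a minimal, self-contained sketch (no LLM calls, no Memori imports) of the pattern described above. MemoriLike, fake_completion, and the module-level callback list are illustrative stand-ins, not Memori's or LiteLLM's actual API:

```python
# Analogue of a module-level success-callback list shared by all instances.
GLOBAL_SUCCESS_CALLBACKS = []

class MemoriLike:
    """Stand-in for an enabled per-user memory instance."""
    def __init__(self, namespace):
        self.namespace = namespace
        self.rows = []

    def enable(self):
        # Each enable() appends this instance's handler to the shared list.
        GLOBAL_SUCCESS_CALLBACKS.append(self.record_conversation)

    def record_conversation(self, user_input):
        # Writes under this instance's namespace, regardless of who chatted.
        self.rows.append((self.namespace, user_input))

def fake_completion(user_input):
    # Stand-in for a global completion call: every registered callback fires.
    for callback in GLOBAL_SUCCESS_CALLBACKS:
        callback(user_input)

alice = MemoriLike("fastapi_user_alice")
bob = MemoriLike("fastapi_user_bob")
alice.enable()
bob.enable()

fake_completion("alice: please remember that I like pizza")
fake_completion("bob: remind me to call alice")

print(alice.rows)  # both messages land under fastapi_user_alice
print(bob.rows)    # both messages land under fastapi_user_bob
```

Running this prints both messages under both namespaces, which matches what the sqlite3 query above shows for the real example.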
Impact
- Data leakage across "user" namespaces (privacy / tenancy risk).
- Corrupted memory due to duplicated short-/long-term rows.
- The README markets "Simple Multi-User" and "FastAPI Multi-User App" as working isolation examples. In practice, namespaces are just a column; no request-scoped routing or callback selection ensures isolation (a sketch of what such routing could look like follows below). The examples are misleading.
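For illustration only, here is a hedged sketch of request-scoped routing in a FastAPI handler, assuming a hypothetical per-namespace store with an explicit record() method instead of a shared global callback list. UserStore, get_store, and record are made-up names, not Memori's API; the point is only that each request is routed to exactly one instance:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    user_id: str
    message: str

class UserStore:
    """Hypothetical per-user memory store (illustrative, not Memori)."""
    def __init__(self, namespace: str):
        self.namespace = namespace
        self.history: list[str] = []

    def record(self, message: str) -> None:
        # Only this user's namespace is written to; no shared global list.
        self.history.append(message)

stores: dict[str, UserStore] = {}

def get_store(user_id: str) -> UserStore:
    return stores.setdefault(user_id, UserStore(f"fastapi_user_{user_id}"))

@app.post("/chat")
def chat(req: ChatRequest):
    store = get_store(req.user_id)  # route the request to exactly one store
    store.record(req.message)       # record explicitly, per request
    # ...call the LLM here and record its response the same way...
    return {"namespace": store.namespace, "stored": len(store.history)}
```

Whatever the actual fix looks like, the key property is that recording is tied to the request's user rather than to a process-wide callback list.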
Can the maintainers please confirm the issue?