# One-command local Qdrant for Memex.
#
# docker compose up -d qdrant # or: bash scripts/start-qdrant.sh
#
# Memex talks to Qdrant over gRPC on :6334 by default (override with
# MEMEX_QDRANT_URL). The REST API + health endpoints live on :6333.
# Pinned to the same Qdrant version Memex was built and tested against.
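#
# Raising the resource limits without editing this file: a minimal sketch,
# assuming the standard Compose behavior of automatically merging a sibling
# docker-compose.override.yml into this file. The 8g / 6.0 values are
# illustrative assumptions, not tested recommendations.
#
#   # docker-compose.override.yml (hypothetical example)
#   services:
#     qdrant:
#       mem_limit: 8g
#       cpus: 6.0
#
# `docker compose config` prints the merged result, which is a quick way to
# confirm the override was picked up.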
services:
  qdrant:
    image: qdrant/qdrant:v1.18.1
    container_name: memex-qdrant
    restart: unless-stopped
    # Resource ceilings so Qdrant stays within budget on small machines
    # (e.g. an 8 GB M1 MacBook Air, where macOS + Memex + the embedder must
    # also fit). `mem_limit` is a CEILING, not a reservation: Docker does not
    # pre-allocate it; it only caps how far Qdrant may grow before the kernel
    # OOM-kills it. So a generous ceiling LOWERS risk (a too-tight cap that
    # OOM-kills a healthy Qdrant is worse than a high cap that is never
    # reached), while actual usage stays small via quantization.
    #
    # Sizing per Qdrant's official capacity-planning formula
    # (qdrant.tech/documentation/guides/capacity-planning/):
    #   memory_size ≈ num_vectors * dim * 4 bytes * 1.5
    # For a single user's session corpus (~thousands of sessions, 384-dim,
    # 5 dense + a multivector, TurboQuant bits2 + always_ram) the in-RAM index
    # is estimated at well under 1 GB. Qdrant's own docs caution these are
    # "estimates at best ... test your data set in practice", so we set the
    # ceiling at 4 GB: comfortably above the estimate, still leaving room on
    # an 8 GB host (the limit is a backstop Qdrant is not expected to reach).
    # Raise it if you index a very large corpus; lower it only after measuring.
    # `cpus` keeps Qdrant's optimizer from competing with the embedder for
    # every core during warm-up. Both are honored by `docker compose up`
    # (non-swarm); override via a compose override file for more headroom.
    mem_limit: 4g
    cpus: 4.0
    ports:
      # THR-06: bind to loopback only. Memex (host app, or the all-in-one image
      # where Qdrant + web share one container) reaches Qdrant via localhost, so
      # restricting to 127.0.0.1 keeps the indexed corpus off the LAN. Binding
      # 0.0.0.0 (the Docker default) would expose the unauthenticated REST/gRPC
      # API, and with it the whole session corpus, to anyone on the network.
      # (API-key auth via QDRANT__SERVICE__API_KEY is a future add-on; it needs
      # client-side key support in indexer::connect(), not yet wired.)
      - "127.0.0.1:6333:6333" # REST API + /healthz /readyz /livez + web dashboard
      - "127.0.0.1:6334:6334" # gRPC API (this is what Memex connects to)
    volumes:
      # Named volume so your indexed corpus survives `docker compose down`
      # (but not `docker compose down -v`, which deletes named volumes).
      # Snapshots written via `memex snapshot export` land here too.
      - qdrant_storage:/qdrant/storage
    # Qdrant's image ships no curl/wget, so an in-container HTTP healthcheck
    # is unreliable. Health is verified from the host instead; see
    # scripts/start-qdrant.sh, which polls http://localhost:6333/readyz.
volumes:
  qdrant_storage:
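#
# Worked sizing example for the qdrant `mem_limit`, using the capacity-planning
# formula quoted in the service comments. The corpus size is a hypothetical
# illustration (10,000 sessions, assuming the 5 dense vectors are stored per
# session); the multivector and the savings from quantization are ignored:
#
#   num_vectors = 10,000 sessions * 5 vectors = 50,000
#   memory_size ≈ 50,000 * 384 dim * 4 bytes * 1.5
#               = 115,200,000 bytes ≈ 115 MB
#
# Under those assumptions the unquantized dense vectors alone come to roughly
# 3% of the 4 GB ceiling; measure your own corpus before lowering the limit.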