Package hnsw
implements Hierarchical Navigable Small World graphs in Go. You
can read up about how they work here. In essence,
they allow for fast approximate nearest neighbor searches with high-dimensional
vector data.
This package can be thought of as an in-memory alternative to your favorite vector database (e.g. Pinecone, Weaviate). It implements just the essential operations:
Operation | Complexity | Description |
---|---|---|
Insert | Insert a vector into the graph | |
Delete | Delete a vector from the graph | |
Search | Search for the nearest neighbors of a vector | |
Lookup | Retrieve a vector by ID |
Note
Complexities are approximate where
go get github.com/coder/hnsw@main
g := hnsw.NewGraph[hnsw.Vector]()
g.Add(
hnsw.MakeVector("1", []float32{1, 1, 1}),
hnsw.MakeVector("2", []float32{1, -1, 0.999}),
hnsw.MakeVector("3", []float32{1, 0, -0.5}),
)
neighbors := g.Search(
[]float32{0.5, 0.5, 0.5},
1,
)
fmt.Printf("best friend: %v\n", neighbors[0].Embedding())
// Output: best friend: [1 1 1]
By and large the greatest effect you can have on the performance of the graph is reducing the dimensionality of your data. At 1536 dimensions (OpenAI default), 70% of the query process under default parameters is spent in the distance function.
If you're struggling with slowness / latency, consider:
- Reducing dimensionality
- Increasing
$M$
And, if you're struggling with excess memory usage, consider:
- Reducing
$M$ a.k.aGraph.M
(the maximum number of neighbors each node can have) - Reducing
$m_L$ a.k.aGraph.Ml
(the level generation parameter)
- #3 Persistence / serialization