distinguish types of entries in the vector DB #23

Currently we put every type of document into one vector DB:
- GitHub issues
- sections of Go documentation
- Gerrit CLs
- and so on.

Our Related Entities API (#22) may want to (a) let users ask for a subset of the possible types, and (b) classify results by type.
 
As for classifying the results: currently all the IDs are URLs, and it is easy to tell the type of doc from the form of the URL. I think we can continue that indefinitely, so we don't need separate namespaces or metadata to identify the type of doc.
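
A minimal sketch of classifying by URL form, in Go. The `DocKind` type, the `kindOf` function, and the specific host and path patterns are assumptions for illustration, not the project's actual ID scheme.

```go
package main

import (
	"net/url"
	"strings"
)

// DocKind names a type of document stored in the vector DB.
// The type and its values are hypothetical.
type DocKind string

const (
	KindGitHubIssue DocKind = "GitHubIssue"
	KindGoDoc       DocKind = "GoDocumentation"
	KindGerritCL    DocKind = "GerritCL"
	KindUnknown     DocKind = "Unknown"
)

// kindOf infers the document type from the form of its ID, which is a URL.
// The host and path patterns below are assumed, not taken from the real data.
func kindOf(id string) DocKind {
	u, err := url.Parse(id)
	if err != nil {
		return KindUnknown
	}
	switch {
	case u.Host == "github.com" && strings.Contains(u.Path, "/issues/"):
		return KindGitHubIssue
	case u.Host == "go.dev" && strings.HasPrefix(u.Path, "/doc/"):
		return KindGoDoc
	case strings.HasSuffix(u.Host, "-review.googlesource.com"):
		return KindGerritCL
	}
	return KindUnknown
}
```

Adding a new document type would then mean adding one case here, as long as its URLs stay distinguishable from the existing ones.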

To support asking for a subset of types, we can just search for more documents and throw out the ones that don't match. That can be expensive, though, since we might have to do multiple searches with increasing limits until we get the docs we want. If we only let the user provide a threshold (max distance from the query) instead of a limit (number of documents), then a single call will do.
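
The two strategies above could look roughly like this in Go. `Result`, `LimitSearch`, `ThresholdSearch`, `searchFiltered`, and `searchWithin` are hypothetical names; the search functions stand in for whatever call the vector DB actually exposes, and doubling the limit is just one possible growth policy.

```go
package main

// Result is one search hit. The type and its fields are hypothetical.
type Result struct {
	ID       string  // document URL
	Distance float64 // distance from the query; smaller is closer
}

// LimitSearch returns up to limit results, closest first.
type LimitSearch func(limit int) []Result

// ThresholdSearch returns every result within maxDist of the query.
type ThresholdSearch func(maxDist float64) []Result

// searchFiltered returns up to n results accepted by keep. It over-fetches,
// doubling the limit until it has n matches or the DB runs out of results.
func searchFiltered(search LimitSearch, keep func(id string) bool, n int) []Result {
	if n <= 0 {
		return nil
	}
	for limit := 2 * n; ; limit *= 2 {
		all := search(limit)
		var out []Result
		for _, r := range all {
			if keep(r.ID) {
				out = append(out, r)
				if len(out) == n {
					return out
				}
			}
		}
		if len(all) < limit {
			return out // the DB has nothing more to return
		}
	}
}

// searchWithin filters the results of a single threshold search.
// No retry loop is needed because the threshold bounds the result set.
func searchWithin(search ThresholdSearch, keep func(id string) bool, maxDist float64) []Result {
	var out []Result
	for _, r := range search(maxDist) {
		if keep(r.ID) {
			out = append(out, r)
		}
	}
	return out
}
```

The `keep` predicate is where the type check on the URL would go. Note that each retry in `searchFiltered` repeats the work of the previous call, which is the cost the threshold variant avoids.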

An alternative is to use a separate namespace for each type. Advantages are that the type of document would be more evident, and we could query different types concurrently. Disadvantages are that we'd have to rewrite everything, and we'd have to perform N queries instead of one and merge the results.
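
For comparison, a rough Go sketch of the namespace alternative: one query per namespace, run concurrently, then merged by distance. `Result`, `NamespaceSearch`, and `searchNamespaces` are hypothetical names, and the per-namespace search function is an assumed stand-in for the real DB call.

```go
package main

import (
	"sort"
	"sync"
)

// Result is one search hit. The type and its fields are hypothetical.
type Result struct {
	ID       string
	Distance float64 // distance from the query; smaller is closer
}

// NamespaceSearch returns up to limit results from one namespace.
type NamespaceSearch func(namespace string, limit int) []Result

// searchNamespaces queries each namespace concurrently, merges the results,
// and returns the limit closest results overall.
func searchNamespaces(search NamespaceSearch, namespaces []string, limit int) []Result {
	if limit <= 0 {
		return nil
	}
	perNS := make([][]Result, len(namespaces))
	var wg sync.WaitGroup
	for i, ns := range namespaces {
		wg.Add(1)
		go func(i int, ns string) {
			defer wg.Done()
			perNS[i] = search(ns, limit) // each goroutine writes its own slot
		}(i, ns)
	}
	wg.Wait()

	var merged []Result
	for _, rs := range perNS {
		merged = append(merged, rs...)
	}
	sort.SliceStable(merged, func(i, j int) bool {
		return merged[i].Distance < merged[j].Distance
	})
	if len(merged) > limit {
		merged = merged[:limit]
	}
	return merged
}
```

Each namespace is asked for the full `limit` because, in the worst case, all of the closest results come from a single type. That is the "N queries and a merge" overhead mentioned above.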
