Topicer Interface - Support for using DB embeddings #40

@mdocekal

I think we missed support for using database embeddings in the discover_topics_in_db_* methods. The current state is as follows:

@abstractmethod
async def discover_topics_in_db_sparse(self, db_request: DBRequest, n: int | None = None) -> DiscoveredTopicsSparse:
    """
    Discover topics based on a database request and return a sparse representation.

    :param db_request: Database request to fetch texts for topic discovery.
    :param n: Optional number of topics to propose, if None uses the default value.
    :return: DiscoveredTopicsSparse
    """
    ...

@abstractmethod
async def discover_topics_in_db_dense(self, db_request: DBRequest, n: int | None = None) -> DiscoveredTopics:
    """
    Discover topics based on a database request and return a dense representation.

    :param db_request: Database request to fetch texts for topic discovery.
    :param n: Optional number of topics to propose, if None uses the default value.
    :return: DiscoveredTopics
    """
    ...

I believe we should have

@abstractmethod
async def discover_topics_in_db_sparse(self, db_request: DBRequest, n: int | None = None, db_embeddings: bool | None = None) -> DiscoveredTopicsSparse:
    """
    Discover topics based on a database request and return a sparse representation.

    :param db_request: Database request to fetch texts for topic discovery.
    :param n: Optional number of topics to propose, if None uses the default value.
    :param db_embeddings: Whether to obtain text representations (embeddings) from the database, if None uses the default.
    :return: DiscoveredTopicsSparse
    """
    ...

@abstractmethod
async def discover_topics_in_db_dense(self, db_request: DBRequest, n: int | None = None, db_embeddings: bool | None = None) -> DiscoveredTopics:
    """
    Discover topics based on a database request and return a dense representation.

    :param db_request: Database request to fetch texts for topic discovery.
    :param n: Optional number of topics to propose, if None uses the default value.
    :param db_embeddings: Whether to obtain text representations (embeddings) from the database, if None uses the default.
    :return: DiscoveredTopics
    """
    ...

to allow usage of precomputed representations.
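
For illustration, here is a rough sketch of how an implementation might resolve the new flag. Everything except the method name and the `db_request`, `n` and `db_embeddings` parameters is an assumption on my side: the `DBRequest` and `DiscoveredTopics` classes below are simplified stand-ins for the real types, and `ExampleTopicer`, `default_n`, `default_db_embeddings` and `used_db_embeddings` are hypothetical names used only to show the `None` means "use the default" behaviour.

```python
from __future__ import annotations

import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class DBRequest:
    """Hypothetical stand-in for the real DBRequest type."""
    collection_id: str


@dataclass
class DiscoveredTopics:
    """Hypothetical stand-in; also records which representation source was chosen."""
    topics: list[str] = field(default_factory=list)
    used_db_embeddings: bool = False


class Topicer(ABC):
    @abstractmethod
    async def discover_topics_in_db_dense(
        self,
        db_request: DBRequest,
        n: int | None = None,
        db_embeddings: bool | None = None,
    ) -> DiscoveredTopics:
        ...


class ExampleTopicer(Topicer):
    """Toy implementation: only demonstrates how the optional arguments are resolved."""

    def __init__(self, default_n: int = 3, default_db_embeddings: bool = False):
        self.default_n = default_n
        self.default_db_embeddings = default_db_embeddings

    async def discover_topics_in_db_dense(
        self,
        db_request: DBRequest,
        n: int | None = None,
        db_embeddings: bool | None = None,
    ) -> DiscoveredTopics:
        # None falls back to the configured default, mirroring how `n` is handled.
        n = self.default_n if n is None else n
        use_db = self.default_db_embeddings if db_embeddings is None else db_embeddings

        # A real implementation would either read precomputed embeddings from the
        # database (use_db is True) or embed the fetched texts itself; here we only
        # return placeholder topics and the resolved flag.
        topics = [f"topic-{i}" for i in range(n)]
        return DiscoveredTopics(topics=topics, used_db_embeddings=use_db)


if __name__ == "__main__":
    topicer = ExampleTopicer(default_db_embeddings=True)
    result = asyncio.run(topicer.discover_topics_in_db_dense(DBRequest("demo")))
    print(result)
```

Using `bool | None` instead of a plain `bool` keeps the call site consistent with `n`: callers who do not care simply omit the argument and get whatever the implementation is configured to do.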
