NathalieCharbel
Contributor

Description

This PR extends the rate limiting functionality (previously available only for LLMs) to all embedding providers and ensures consistent error handling.

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Documentation update
  • Project configuration change

Complexity

Low

How Has This Been Tested?

  • Unit tests
  • E2E tests
  • Manual tests

Checklist

The following requirements should have been met (depending on the changes in the branch):

  • Documentation has been updated
  • Unit tests have been updated
  • E2E tests have been updated
  • Examples have been updated
  • New files have copyright header
  • CLA (https://neo4j.com/developer/cla/) has been signed
  • CHANGELOG.md updated if appropriate

@NathalieCharbel requested a review from a team as a code owner on September 20, 2025, 18:08
Contributor

@stellasia left a comment

Just worried about the tests, if you have some time to explain. The other comments are marginal and non blocking.


 def is_rate_limit_error(exception: Exception) -> bool:
-    """Check if an exception is a rate limit error from any LLM provider.
+    """Check if an exception is a rate limit error from any LLM provider or embedder.
Contributor

Do you think we can move this file somewhere else now that it's not only used for LLMs?

)
embedder = OpenAIEmbeddings(api_key="my key")
with pytest.raises(
    EmbeddingsGenerationError, match="Failed to generate embedding with OpenAI"
Contributor

I don't understand this test. Shouldn't this error be retried? I'm not sure the test captures that. I would add a check to make sure the function is called the expected number of times.
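A minimal sketch of the kind of call-count check suggested here. It does not use the PR's actual code: `RateLimitError`, `embed_with_retries`, and the `client.create` call are hypothetical stand-ins, and `EmbeddingsGenerationError` is redefined locally so the example runs on its own with only the standard library.

```python
from unittest.mock import MagicMock


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate limit error."""


class EmbeddingsGenerationError(Exception):
    """Local stand-in for the library's error type, so the sketch runs alone."""


def embed_with_retries(client, text, max_attempts=3):
    """Hypothetical retry loop: retry on RateLimitError, then wrap the last failure."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return client.create(input=text)
        except RateLimitError as exc:
            last_error = exc  # retryable: try again (no sleep, to keep the test fast)
    raise EmbeddingsGenerationError(
        "Failed to generate embedding with OpenAI"
    ) from last_error


def test_rate_limit_error_is_retried():
    # Every call to the mocked client raises a rate limit error.
    client = MagicMock()
    client.create.side_effect = RateLimitError("429: too many requests")

    try:
        embed_with_retries(client, "my text", max_attempts=3)
    except EmbeddingsGenerationError as exc:
        assert "Failed to generate embedding with OpenAI" in str(exc)
    else:
        raise AssertionError("expected EmbeddingsGenerationError")

    # The extra check: the call was attempted once per allowed attempt.
    assert client.create.call_count == 3


test_rate_limit_error_is_retried()
```

Without the `call_count` assertion, the test would pass even if the error were wrapped on the first failure and never retried.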

    self._rate_limit_handler = rate_limit_handler
else:
    self._rate_limit_handler = DEFAULT_RATE_LIMIT_HANDLER

Contributor

We could consider moving the embed_query method into the base class, where it would handle the common retry logic and call an _embed_query method (or another name) implemented in the subclasses. That way, the implementation in the subclasses doesn't have to change (only the method name needs updating):

In short:

base class:

@rate_limit_handler
def embed_query(self, query):
    try:
        return self._embed_query(query)
    except Exception as e:
        raise EmbeddingsGenerationError() from e

@abstractmethod
def _embed_query(self, query):
    ...

and in the children:

# no need to add the decorator
def _embed_query(self, query):
    # rest of the code

(Up to you whether you want to do it this way; it's not a change request for now.)
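The suggestion above can be sketched as a runnable, standard-library-only example. This is an illustration under assumptions, not the library's implementation: `BaseEmbedder`, `FlakyEmbedder`, `RateLimitError`, `_call_with_retries`, and the `max_attempts` budget are hypothetical names, and a plain loop stands in for the rate limit handler decorator.

```python
from abc import ABC, abstractmethod


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate limit error."""


class EmbeddingsGenerationError(Exception):
    """Local stand-in for the library's error type, so the sketch runs alone."""


class BaseEmbedder(ABC):
    """Hypothetical base class that owns retries and error wrapping."""

    max_attempts = 3  # assumed retry budget, not taken from the PR

    def embed_query(self, text):
        # Public entry point: subclasses never override this method.
        try:
            return self._call_with_retries(text)
        except Exception as exc:
            raise EmbeddingsGenerationError("Failed to generate embedding") from exc

    def _call_with_retries(self, text):
        # Plain loop standing in for the rate limit handler decorator.
        for attempt in range(1, self.max_attempts + 1):
            try:
                return self._embed_query(text)
            except RateLimitError:
                if attempt == self.max_attempts:
                    raise  # out of attempts: let embed_query wrap the error

    @abstractmethod
    def _embed_query(self, text):
        """Provider-specific call, implemented by each subclass."""


class FlakyEmbedder(BaseEmbedder):
    """Toy subclass: fails twice with a rate limit error, then succeeds."""

    def __init__(self):
        self.calls = 0

    def _embed_query(self, text):
        # No decorator and no error wrapping needed here.
        self.calls += 1
        if self.calls < 3:
            raise RateLimitError("429: too many requests")
        return [float(len(text))]


embedder = FlakyEmbedder()
vector = embedder.embed_query("hello")
```

In this sketch the retry loop wraps the raw provider call and the error wrapping happens outside it, so the retry logic sees the original rate limit error rather than an already wrapped one. Whether the real handler needs that ordering depends on how it detects rate limit errors.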
