[RFC] Executor: making Ragas faster and more reliable #394

Closed
jjmachan opened this issue Dec 19, 2023 · 2 comments


@jjmachan (Member)

Problem - ragas is slow and unreliable

  1. Ragas does not exploit the concurrency options provided by the `ThreadPoolExecutor` and `asyncio` modules. This is because ragas took a batching approach to evaluation, i.e., it evaluated metrics in batches.
  2. Not every service has async support - we need options to stay fully synchronous, or to use no concurrency at all.
  3. These primitives are also needed for [RFC] Testset Generation: making it faster and easy to use #380, and potentially others as well.

Core Components

  1. `BaseMetric` - a metric that evaluates a single row, with both `score()` and `ascore()`
  2. `RagasLLM`, based on langchain-core LLMs
    1. a `Prompt` object with provisions for instructions and demonstrations, which converts into the messages or prompts supported by both langchain chat-based and completion-based models
    2. an `LLMResult` object that supports both chat and text-based outputs
  3. `Executor`, which runs `BaseMetric`. It should also be able to run testset generators, so this should be a common paradigm
  4. a new `evaluate()` function that makes it easier to
    1. change the LLM and embeddings - in the new scheme, `BaseMetric` will have `llm=None` by default and will take the default LLM from the `evaluate()` function. If `metric.llm` is not `None`, the LLM provided on the metric is used instead
    2. switch between async and threading
    3. support callbacks throughout
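The async-vs-threading switch described above could work roughly as follows. This is a hypothetical minimal sketch, not the actual ragas implementation; the `Executor` interface here (submitting paired sync/async callables) is an assumption for illustration:

```python
import asyncio
import typing as t
from concurrent.futures import ThreadPoolExecutor


class Executor:
    """Minimal sketch: run submitted jobs either via asyncio or a thread pool."""

    def __init__(self, is_async: bool = True, max_workers: t.Optional[int] = None):
        self.is_async = is_async
        self.max_workers = max_workers
        # each job is (sync_fn, async_fn, args)
        self.jobs: t.List[t.Tuple[t.Callable, t.Callable, tuple]] = []

    def submit(self, sync_fn: t.Callable, async_fn: t.Callable, *args) -> None:
        self.jobs.append((sync_fn, async_fn, args))

    def results(self) -> list:
        if self.is_async:
            # schedule all coroutines concurrently on one event loop
            async def _run() -> list:
                return await asyncio.gather(
                    *(async_fn(*args) for _, async_fn, args in self.jobs)
                )

            return asyncio.run(_run())
        # fall back to threads for the non-async path
        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
            futures = [pool.submit(sync_fn, *args) for sync_fn, _, args in self.jobs]
            return [f.result() for f in futures]
```

Either way the caller gets an ordered list of results, so metrics and testset generators can share the same execution paradigm.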

Base classes

Metric

class Metric:
    def score(
        self,
        row: t.Dict,  # just 1 row
        callbacks: t.Optional[Callbacks] = None,
    ) -> float:
        ...

    async def ascore(
        self,
        row: t.Dict,  # just 1 row
        callbacks: t.Optional[Callbacks] = None,
    ) -> float:
        ...
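To illustrate the interface, here is a hypothetical, LLM-free metric conforming to it (the name `AnswerContains` and its logic are invented for this example):

```python
import typing as t
from dataclasses import dataclass


@dataclass
class AnswerContains:
    """Toy metric: 1.0 if the ground truth appears verbatim in the answer."""

    name: str = "answer_contains"

    def score(self, row: t.Dict, callbacks=None) -> float:
        return 1.0 if row["ground_truth"].lower() in row["answer"].lower() else 0.0

    async def ascore(self, row: t.Dict, callbacks=None) -> float:
        # nothing to await here; a real LLM-backed metric would call the LLM
        return self.score(row, callbacks)
```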

evaluate()

def evaluate(
    dataset: Dataset,
    metrics: t.Optional[t.List[Metric]] = None,
    llm: t.Optional[BaseRagasLLM] = None,
    embeddings: t.Optional[RagasEmbeddings] = None,
    callbacks: t.Optional[Callbacks] = None,
    is_async: bool = True,
    max_workers: t.Optional[int] = None,
    raise_exceptions: bool = True,
    column_map: t.Optional[t.Dict[str, str]] = None,
) -> Result:
    ...
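Inside `evaluate()`, the default-LLM behaviour for metrics (use the metric's own LLM if set, otherwise fall back to the one passed to `evaluate()`) amounts to something like this. `resolve_llms` is a hypothetical helper, not a real ragas function:

```python
import typing as t


def resolve_llms(metrics: t.List[t.Any], default_llm: t.Any) -> None:
    """Assign the evaluate()-level default LLM to any metric that has none."""
    for metric in metrics:
        if getattr(metric, "llm", None) is None:
            metric.llm = default_llm  # metric did not bring its own LLM
        # otherwise the metric's own LLM takes precedence and is left untouched
```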

BaseRagasLLM

@dataclass
class BaseRagasLLM(ABC):
    @abstractmethod
    def generate_text(
        self,
        prompt: Prompt,
        n: int = 1,
        temperature: float = 1e-8,
        stop: t.Optional[t.List[str]] = None,
        callbacks: t.Optional[Callbacks] = None,
    ) -> LLMResult:
        ...

    @abstractmethod
    async def agenerate_text(
        self,
        prompt: Prompt,
        n: int = 1,
        temperature: float = 1e-8,
        stop: t.Optional[t.List[str]] = None,
        callbacks: t.Optional[Callbacks] = None,
    ) -> LLMResult:
        ...
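For services without async support (problem 2 above), one option is a base class whose `agenerate_text` defaults to wrapping the synchronous call in a thread, so subclasses only have to implement `generate_text`. This is a sketch under that assumption, with a simplified signature; it is not the RFC's final design:

```python
import asyncio
from abc import ABC, abstractmethod


class BaseLLMSketch(ABC):
    @abstractmethod
    def generate_text(self, prompt: str) -> str:
        """Synchronous generation; the only method subclasses must provide."""

    async def agenerate_text(self, prompt: str) -> str:
        # default: run the sync call in the loop's default thread pool,
        # so purely synchronous services still work under the async path
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self.generate_text, prompt)
```

A natively async service would simply override `agenerate_text` with a true coroutine.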
@iterakhtaras

Hey @jjmachan! Thanks for all your work on ragas, I really appreciate it. I am trying to use it to evaluate my chatbot created with llama-index. Have any workarounds been discovered for issue #271?

These are my dependencies:
```
%pip install ragas==0.0.22
%pip install pypdf
%pip install llama-index==0.8.52
%pip install langchain==0.0.331rc3
%pip install openai==0.28.1
```

@dosubot (bot) added the "stale" label on May 19, 2024
@dosubot (bot) closed this as not planned (won't fix, can't repro, duplicate, stale) on Jun 1, 2024
@dosubot (bot) removed the "stale" label on Jun 1, 2024