Instability in answer relevance score #1889

Open
laughinghugs opened this issue Jan 30, 2025 · 1 comment
Labels: bug (Something isn't working), question (Further information is requested)

Comments
@laughinghugs

For evaluating my RAG pipeline in real time, I am implementing the 'Answer Relevance' score. However, I found that for the same set of question, answer, and context it generates different scores, varying by 10-15%. I understand that since it is a probabilistic measure (it uses an LLM, re-engineers questions from the answer, etc.), the score may not always be identical, but a 10-15% variation is a trust issue. Is there any solution to this?
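As a workaround while the underlying metric remains stochastic, one common mitigation is to score the same (question, answer, contexts) sample several times and report an aggregate rather than a single run. The sketch below is a minimal illustration of that idea, not a ragas API; `score_fn` is a hypothetical stand-in for whatever Answer Relevance call the pipeline already makes.

```python
import statistics
from typing import Callable, Dict, List


def stabilized_relevance(
    score_fn: Callable[[str, str, List[str]], float],
    question: str,
    answer: str,
    contexts: List[str],
    n_runs: int = 5,
) -> Dict[str, object]:
    """Run a stochastic relevance scorer several times and aggregate,
    so one noisy LLM judgment does not swing the reported score.

    `score_fn` is hypothetical: plug in whatever
    (question, answer, contexts) -> score call the pipeline uses.
    """
    scores = [score_fn(question, answer, contexts) for _ in range(n_runs)]
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "stdev": statistics.pstdev(scores),  # run-to-run spread behind the 10-15% swings
        "runs": scores,
    }
```

Pinning the judge LLM's temperature to 0 (and a fixed seed where the provider supports one) narrows the spread further, though hosted models are typically not fully deterministic even then, so some residual variance should be expected.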

@laughinghugs laughinghugs added the question Further information is requested label Jan 30, 2025
@dosubot dosubot bot added the bug Something isn't working label Jan 30, 2025
@jjmachan
Member

hey @laughinghugs this is a problem we are aware of and are hoping to solve with approaches like the ones mentioned in https://blog.ragas.io/aligning-llm-as-judge-with-human-evaluators.

@shahules786 would you have any suggestions?
