Add new metrics - Perplexity & Text Readability. #919

Closed
27 changes: 27 additions & 0 deletions ads/aqua/evaluation/evaluation.py
@@ -943,6 +943,33 @@ def get_supported_metrics(self) -> dict:
),
"args": {},
},
{
"use_case": ["text_generation"],
"key": "perplexity_score",
"name": "perplexity_score",
"description": (
"Perplexity is a metric to evaluate the quality of language models, particularly for \`Text Generation\" task type. "
"Perplexity quantifies how well a LLM can predict the next word in a sequence of words. "
"A high perplexity score indicates that the LLM is not confident in its text generation — that is, the model is \"perplexed\" — "
"whereas a low perplexity score indicates that the LLM is confident in its generation."
),
"args": {},
},
{
"use_case": ["text_generation"],
"key": "text_readability",
"name": "text_readability",
"description": (
"Text quality/readability metrics offer valuable insights into the quality and suitability of generated responses. "
"Monitoring these metrics helps ensure that Language Model (LLM) outputs are clear, concise, and appropriate for the target audience. "
"Evaluating text complexity and grade level helps tailor the generated content to the intended readers. "
"By considering aspects such as sentence structure, vocabulary, and domain-specific needs, we can make sure the LLM produces"
"responses that match the desired reading level and professional context. "
"Additionally, metrics like syllable count, word count, and character count allow you to keep track of the "
"length and structure of the generated text."
),
"args": {},
}
]

@telemetry(entry_point="plugin=evaluation&action=load_metrics", name="aqua")
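For readers unfamiliar with the metric, the perplexity description above can be grounded with a small example: perplexity is the exponential of the average negative log-likelihood the model assigns to the observed tokens. The snippet below is a minimal illustrative sketch, not the implementation used by the evaluation service; the perplexity helper and the example log-probabilities are hypothetical.

import math

def perplexity(token_log_probs):
    # Perplexity = exp(-mean log-likelihood) over the predicted tokens.
    # token_log_probs: natural-log probabilities the model assigned to each
    # observed next token.
    avg_neg_log_likelihood = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_neg_log_likelihood)

# A confident model assigns high probability to each next token, giving a low
# perplexity; an uncertain model gives a high perplexity.
confident = [math.log(0.9)] * 10
uncertain = [math.log(0.05)] * 10
print(perplexity(confident))   # ~1.11
print(perplexity(uncertain))   # 20.0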
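Similarly, the text_readability description lists scores such as grade level, syllable count, word count, and character count. The diff does not show which library backs the metric, so the following sketch assumes the open-source textstat package purely for illustration of the kinds of scores involved.

import textstat  # assumption: textstat is one common way to compute these scores

response = (
    "Monitoring readability helps ensure that generated answers are clear, "
    "concise, and appropriate for the target audience."
)

scores = {
    "flesch_reading_ease": textstat.flesch_reading_ease(response),    # higher = easier to read
    "flesch_kincaid_grade": textstat.flesch_kincaid_grade(response),  # approximate US grade level
    "syllable_count": textstat.syllable_count(response),
    "word_count": textstat.lexicon_count(response, removepunct=True),
    "character_count": textstat.char_count(response, ignore_spaces=True),
}
print(scores)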