Open
Labels: enhancement (New feature or request), question (Further information is requested)
Description
Is your feature request related to a problem? Please describe.
In RAG scenarios, I think it would be a great help to differentiate whether an LLM is hallucinating or retrieving its information from the given context. This would be possible if we could get an attention score for all input tokens per generated token.
Describe the solution you'd like
A callback mechanism invoked for every generated token, similar to the LogitsProcessor, that receives a list of attention scores over the input tokens.
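To make the request concrete, here is a minimal sketch of what such a callback could look like from the caller's side. All names (`AttentionTracer`, the callback signature, the assumption that the backend delivers one head-and-layer-averaged attention weight per input token) are hypothetical and not part of any existing llama.cpp API; the dummy scores below stand in for what the backend would supply.

```python
from typing import Callable, List

# Hypothetical callback type: (generated token id, attention score per input token)
AttentionCallback = Callable[[int, List[float]], None]

class AttentionTracer:
    """Collects per-generated-token attention mass over the context tokens.

    Assumes 'scores' holds one attention weight per input token, already
    averaged over heads and layers by the (hypothetical) backend hook.
    """

    def __init__(self, context_len: int, threshold: float = 0.5):
        self.context_len = context_len  # number of retrieved-context tokens
        self.threshold = threshold      # minimum context attention mass to count as grounded
        self.records = []               # (token_id, context_fraction, grounded)

    def __call__(self, token_id: int, scores: List[float]) -> None:
        context_mass = sum(scores[: self.context_len])
        total = sum(scores) or 1.0
        fraction = context_mass / total
        self.records.append((token_id, fraction, fraction >= self.threshold))

# Usage with dummy scores (no real model involved):
tracer = AttentionTracer(context_len=3)
tracer(token_id=42, scores=[0.3, 0.3, 0.2, 0.1, 0.1])   # attends mostly to context
tracer(token_id=7, scores=[0.05, 0.05, 0.1, 0.4, 0.4])  # attends mostly elsewhere
```

With this shape, the RAG application could flag generated tokens whose attention mass over the retrieved context falls below a threshold as potential hallucinations.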
Describe alternatives you've considered
Calculating the scores myself, but my knowledge of transformers is not sufficient for that.
Additional context
I would like to build something like the "Attention tracing" feature in this repository, but with llama.cpp as the backend.
reuank