
feat: translate instruction when adapting prompt #1529

Merged
2 commits merged Oct 19, 2024

Conversation

Yunnglin
Contributor

When adapting prompts, both the examples and the corresponding instructions are important and need to be translated.

@dosubot dosubot bot added the size:XS This PR changes 0-9 lines, ignoring generated files. label Oct 18, 2024
@shahules786 shahules786 self-requested a review October 18, 2024 10:26
@shahules786
Member

shahules786 commented Oct 18, 2024

Hey @Yunnglin Did you observe better results when translating both? We observed pretty good results with only translating a few shot examples. Another reason why we did not translate instruction was that time it introduces ambiguity. FOr example, let's say if the instruction was "Output False if a number greater than zero", this would also translate the word "False", which then causes issues while post-processing.

@Yunnglin
Contributor Author

Whether to translate the "instruction" could perhaps be made into an option. When I was generating a Chinese dataset, I found that some "extractor prompts" would lack examples, which led to the generated data not being very effective.

@shahules786
Member

@Yunnglin that suggestion makes sense. Could you make a PR with the same?
@jjmachan Please share if you have any opinion on this.

@jjmachan
Member

This makes a ton of sense.

What we can do is take another argument in

    async def adapt(
        self, target_language: str, llm: BaseRagasLLM
    ) -> "PydanticPrompt[InputModel, OutputModel]":
        """
        Adapt the prompt to a new language.
        """

that controls whether the instruction is converted, and expose it through adapt_prompts as well.

That would be really helpful 🙂

@dosubot dosubot bot added size:S This PR changes 10-29 lines, ignoring generated files. and removed size:XS This PR changes 0-9 lines, ignoring generated files. labels Oct 19, 2024
@Yunnglin
Contributor Author

  • Add an adapt_instruction: bool = False parameter.

    Now you can adapt prompts as follows:

        import asyncio

        from ragas.llms import LangchainLLMWrapper
        from ragas.metrics import Faithfulness

        # chat_model is your LangChain chat model instance
        instance = Faithfulness()
        adapted_prompts = asyncio.run(
            instance.adapt_prompts(
                language="chinese",
                llm=LangchainLLMWrapper(chat_model),
                adapt_instruction=True,
            )
        )
        print(adapted_prompts)

  • Add "Ensure that the number of output data rows is equal to the number of input data rows." to the translation prompt, since the LLM sometimes breaks a single line into multiple lines.
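The row-count concern in the second bullet can be sketched as a simple guard (a hypothetical helper, not code from this PR):

```python
def check_row_count(src_lines, translated_lines):
    """Reject a translation whose row count differs from the source.

    This mirrors the constraint added to the translation prompt: if the
    LLM splits one input line into several output lines, the rows can no
    longer be matched back to their sources.
    """
    if len(translated_lines) != len(src_lines):
        raise ValueError(
            f"translation produced {len(translated_lines)} rows, "
            f"expected {len(src_lines)}"
        )
    return translated_lines
```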

@shahules786 shahules786 left a comment
Member


LGTM, thanks a lot.

@jjmachan jjmachan merged commit 5481246 into explodinggradients:main Oct 19, 2024
16 checks passed