
feat: Implement LLM-Powered Adaptive Replay and Auto-Documentation #954

Conversation

@shephinphilip commented Jun 5, 2025

This PR introduces an LLM-powered adaptive replay strategy and an auto-documentation feature.

Key changes include:

  1. LLMAdaptiveStrategy (openadapt/strategies/llm_adaptive_strategy.py):

    • A new replay strategy that inherits from BaseReplayStrategy.
    • Intent Abstraction: I use an LLM (via the generate_action_event.j2 prompt) to determine the next action based on the recorded actions, the current UI state, and the task description.
    • Semantic Matching & Adaptation: I implemented a UI consistency check (_is_ui_consistent_for_next_original_action) that decides whether to replay a recorded action directly or to let the LLM adapt it when the UI has changed. The check compares window titles, window dimensions, and screenshot similarity (a minimal sketch follows this list).
    • Basic Error Recovery: I overrode the run method to add a post-action check using prompt_is_action_complete. If an action does not complete as expected, this is logged, and the new state is handled implicitly in the next cycle.
    • Action history is consistently managed in self.action_events.
  2. Auto-Documentation Script (openadapt/scripts/generate_documentation.py):

    • A new script that takes a recording timestamp.
    • It loads the recording and prepares context (action details, window states, screenshots).
    • It uses the describe_recording.j2 prompt to ask an LLM to generate a human-readable summary of the recording.
    • It prints the generated documentation to the console (the flow is sketched after the steps below).
  3. Integration & Prompts:

    • The new strategy is dynamically discovered by the system.
    • It leverages existing prompt infrastructure and LLM adapter configurations.
    • Relevant prompts (generate_action_event.j2, describe_recording.j2, is_action_complete.j2, system.j2) are utilized.
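
For concreteness, here is a minimal, self-contained sketch of the consistency check from item 1. The WindowState type, the similarity metric, and the threshold are illustrative stand-ins, not OpenAdapt's actual API; the real method also feeds the recorded trace and task description into generate_action_event.j2 when the check fails.

```python
"""Minimal sketch of the replay-vs-adapt decision.

WindowState, screenshot_similarity, and SIMILARITY_THRESHOLD are
illustrative stand-ins, not OpenAdapt's actual types or values.
"""

from dataclasses import dataclass


@dataclass
class WindowState:
    """Simplified stand-in for a recorded or live window event."""
    title: str
    width: int
    height: int
    screenshot: bytes = b""


SIMILARITY_THRESHOLD = 0.9  # assumed tunable cutoff


def screenshot_similarity(a: bytes, b: bytes) -> float:
    """Stand-in for an image-similarity metric such as SSIM."""
    return 1.0 if a == b else 0.0


def is_ui_consistent(current: WindowState, recorded: WindowState) -> bool:
    """Mirrors _is_ui_consistent_for_next_original_action."""
    if current.title != recorded.title:
        return False  # different window: verbatim replay is unsafe
    if (current.width, current.height) != (recorded.width, recorded.height):
        return False  # geometry changed: recorded coordinates may be stale
    # A pixel-level check catches layout changes within the same window.
    return screenshot_similarity(current.screenshot, recorded.screenshot) >= SIMILARITY_THRESHOLD


if __name__ == "__main__":
    recorded = WindowState("Invoice Editor", 1280, 800, b"px")
    resized = WindowState("Invoice Editor", 1440, 900, b"px")
    print(is_ui_consistent(recorded, recorded))  # True  -> replay the recorded action
    print(is_ui_consistent(resized, recorded))   # False -> adapt via generate_action_event.j2
```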

Steps I Took:

  • Initial planning and codebase exploration.
  • Created LLMAdaptiveStrategy class structure.
  • Implemented LLM-based intent abstraction in get_next_action_event.
  • Added semantic replay matching logic to intelligently choose between replaying original actions or using LLM for adaptation.
  • Implemented basic error detection in the strategy's run method.
  • Ensured the new strategy integrates with the existing system.
  • Developed the generate_documentation.py script for auto-documentation (sketched below).
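
The documentation script's flow, reduced to a runnable sketch (every helper below is a stand-in; the real script uses OpenAdapt's recording loader and prompt infrastructure):

```python
"""Sketch of generate_documentation.py's overall flow; all helpers are
stand-ins, not OpenAdapt's actual API."""

import sys


def load_recording(timestamp: float) -> dict:
    """Stand-in for looking the recording up by its timestamp."""
    return {
        "task_description": "example task",
        "actions": [{"name": "click", "window_title": "Example"}],
    }


def build_context(recording: dict) -> str:
    """Flatten action details and window states into prompt context."""
    lines = [f"Task: {recording['task_description']}"]
    for i, action in enumerate(recording["actions"], 1):
        lines.append(f"{i}. {action['name']} in window '{action['window_title']}'")
    return "\n".join(lines)


def prompt_llm(context: str) -> str:
    """Stand-in for rendering describe_recording.j2 and calling the adapter."""
    return f"This recording shows the user performing: {context.splitlines()[0]}"


def main() -> None:
    timestamp = float(sys.argv[1]) if len(sys.argv) > 1 else 0.0
    recording = load_recording(timestamp)
    documentation = prompt_llm(build_context(recording))
    print(documentation)  # the script prints the summary to the console


if __name__ == "__main__":
    main()
```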

This work fulfills the core requirements of the issue: an intelligent replay system that uses LLMs to generalize, abstract, and execute workflows across varying UI states, plus auto-documentation of recordings. Unit tests are the planned next step (one possible shape is sketched below).
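
For illustration, one possible shape for those tests, written against the stand-ins from the strategy sketch above (the module name in the import is hypothetical):

```python
# Illustrative pytest sketch for the planned tests; imports the stand-ins
# defined in the strategy sketch above (module name is hypothetical).
from llm_adaptive_sketch import WindowState, is_ui_consistent


def test_identical_ui_is_consistent():
    w = WindowState("Editor", 1280, 800, b"px")
    assert is_ui_consistent(w, w)


def test_title_change_forces_adaptation():
    recorded = WindowState("Editor", 1280, 800, b"px")
    current = WindowState("Save As", 1280, 800, b"px")
    assert not is_ui_consistent(current, recorded)


def test_resize_forces_adaptation():
    recorded = WindowState("Editor", 1280, 800, b"px")
    current = WindowState("Editor", 1440, 900, b"px")
    assert not is_ui_consistent(current, recorded)
```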

To run the new strategy:

python -m openadapt.replay LLMAdaptiveStrategy --timestamp YOUR_RECORDING_TIMESTAMP
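
Here YOUR_RECORDING_TIMESTAMP identifies a previously captured recording. The documentation script can be invoked the same way; the exact flag below is an assumption based on the script taking a recording timestamp:

```bash
# Flag name is illustrative; the script takes a recording timestamp.
python -m openadapt.scripts.generate_documentation --timestamp YOUR_RECORDING_TIMESTAMP
```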
