
Updating for last run #2

Open

javidsegura wants to merge 13 commits into agnosticAgent from main

Conversation

@javidsegura

No description provided.

@coderabbitai

coderabbitai bot commented Sep 19, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

✨ Finishing touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch main

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Up to 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist

Summary of Changes

Hello @javidsegura, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refines the AlphaFold protein binding pipeline by introducing and formalizing a 'Separate Pipelines Design' for handling multiple structures. It includes comprehensive documentation of the new and existing pipeline architectures, alongside core logic changes that enable child pipelines to operate more efficiently by skipping redundant initial steps. The update also incorporates minor fixes for path handling and enhances logging for better traceability of adaptive decisions within the pipeline.

Highlights

  • New Pipeline Design Documentation: A new README file has been added to explain two distinct AlphaFold pipeline designs: a 'Single Pipeline with Parallel Structures' and a 'Separate Pipelines Design', detailing how each handles multiple structures and GPU binding.
  • Enhanced Child Pipeline Management: The pipeline now supports more sophisticated management of child pipelines, introducing is_child and start_pass parameters to configure their behavior, particularly allowing them to skip initial processing steps.
  • Optimized Child Pipeline Execution: Child pipelines can now conditionally skip the MPNN and sequence ranking steps during their initial pass, preventing redundant computations and improving overall efficiency.
  • Improved Logging and Path Handling: Logging for adaptive decisions has been added, and path construction for AlphaFold output directories has been corrected and expanded to include necessary subdirectories.
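
The skip behavior described above can be sketched as follows. This is a hypothetical illustration only: the `Pipeline` class, method names, and log strings are assumptions for the sake of the example, not the PR's actual code; only the `is_child` and `start_pass` parameter names come from the summary.

```python
# Hypothetical sketch of how is_child / start_pass might gate the first pass.
class Pipeline:
    def __init__(self, is_child=False, start_pass=0):
        self.is_child = is_child      # True for pipelines spawned by a parent
        self.start_pass = start_pass  # pass index the child resumes from
        self.log = []

    def run_pass(self, pass_idx):
        # A child inherits sequences already processed by its parent, so it
        # can skip the MPNN and sequence-ranking steps on its initial pass.
        if self.is_child and pass_idx == self.start_pass:
            self.log.append(f"pass {pass_idx}: skipped MPNN + ranking")
        else:
            self.log.append(f"pass {pass_idx}: MPNN + ranking")

child = Pipeline(is_child=True, start_pass=2)
for i in range(2, 4):
    child.run_pass(i)
print(child.log)
# -> ['pass 2: skipped MPNN + ranking', 'pass 3: MPNN + ranking']
```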
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.


Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces an adaptive optimization strategy by allowing pipelines to spawn child pipelines for proteins with degrading quality scores. The changes are generally well-structured, including a new README to explain the design. However, I've identified a high-severity logic bug where proteins could be inadvertently dropped from processing. I've also noted several medium-severity issues related to Python best practices, such as the use of mutable default arguments and unnecessary code, along with a minor typo in a log message. Addressing these points will improve the robustness and maintainability of the new feature.

Comment on lines +113 to +114
else:
    pipeline.previous_scores = copy.deepcopy(pipeline.current_scores)

high

There is a potential data loss bug here. If sub_iter_seqs is not empty but a child pipeline cannot be created (e.g., pipeline.sub_order >= MAX_SUB_PIPELINES), this else block is executed. The proteins in sub_iter_seqs have already been removed from pipeline.iter_seqs on line 76. Since this block only updates previous_scores, those proteins are effectively dropped from any further processing. They should be added back to the current pipeline's iter_seqs if they are not moved to a child.

Consider this implementation:

    else:
        if sub_iter_seqs:
            # If a child pipeline could not be created, add the sequences back to the parent.
            pipeline.iter_seqs.update(sub_iter_seqs)
        pipeline.previous_scores = copy.deepcopy(pipeline.current_scores)
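
The drop scenario can be reproduced in isolation. The sketch below uses plain dicts and an illustrative `rebalance` helper to stand in for the pipeline state; the function and variable names are assumptions for demonstration, not the PR's actual code.

```python
# Minimal reproduction of the data-loss bug: sequences removed from the
# parent's iter_seqs are silently lost when no child can be created.
MAX_SUB_PIPELINES = 1

def rebalance(iter_seqs, sub_iter_seqs, sub_order, restore_on_failure):
    # sub_iter_seqs has already been removed from iter_seqs at this point.
    if sub_order < MAX_SUB_PIPELINES:
        return iter_seqs, sub_iter_seqs  # child pipeline takes them over
    if restore_on_failure and sub_iter_seqs:
        iter_seqs.update(sub_iter_seqs)  # the suggested fix: give them back
    return iter_seqs, {}

# Without the fix, "protB" vanishes once the child limit is reached:
parent, child = rebalance({"protA": "SEQ1"}, {"protB": "SEQ2"},
                          sub_order=1, restore_on_failure=False)
assert "protB" not in parent and not child  # dropped from all processing

# With the fix, it is returned to the parent pipeline:
parent, _ = rebalance({"protA": "SEQ1"}, {"protB": "SEQ2"},
                      sub_order=1, restore_on_failure=True)
assert "protB" in parent
```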


decision = await adaptive_criteria(curr_score, pipeline.previous_scores[protein])

pipeline.logger.pipeline_log(f'Adaptive descision: {decision}')

medium

There is a typo in the log message. 'descision' should be 'decision'.

Suggested change:

- pipeline.logger.pipeline_log(f'Adaptive descision: {decision}')
+ pipeline.logger.pipeline_log(f'Adaptive decision: {decision}')

- async def s1(task_description=None):
-     if task_description is None:
-         task_description = {"ranks": 1}
+ async def s1(task_description={"gpus_per_rank": 1}): # noqa: B006

medium

Using a mutable default argument like a dictionary is generally discouraged as it can lead to unexpected behavior if the object is modified. While it is not modified here, the previous pattern of using None as a default and creating the dictionary within the function is safer and improves maintainability. The noqa: B006 indicates awareness, but adhering to best practices would be better in the long run.

A safer implementation would be:

async def s1(task_description=None):
    if task_description is None:
        task_description = {"gpus_per_rank": 1}
    # ... rest of function
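
The hazard the reviewer is pointing at is easy to demonstrate: a mutable default is created once, at function-definition time, and shared across every call that omits the argument. The function names below are illustrative only.

```python
# Why mutable defaults are risky: the dict is created once and shared.
def risky(task_description={}):
    task_description.setdefault("calls", 0)
    task_description["calls"] += 1
    return task_description

assert risky() == {"calls": 1}
assert risky() == {"calls": 2}   # state leaked from the previous call!

# The None-default pattern gives each call a fresh dict:
def safe(task_description=None):
    if task_description is None:
        task_description = {}
    task_description.setdefault("calls", 0)
    task_description["calls"] += 1
    return task_description

assert safe() == {"calls": 1}
assert safe() == {"calls": 1}    # no shared state between calls
```

The `s1`/`s4`/`s5` defaults are not mutated today, so the `# noqa: B006` is technically safe, but the `None` pattern removes the trap entirely should the function body ever change.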

- async def s4(target_fasta, task_description=None):
-     if task_description is None:
-         task_description = {"gpus_per_rank": 1}
+ async def s4(target_fasta, task_description={"gpus_per_rank": 1}): # noqa: B006

medium

Similar to the comment on s1, using a mutable default argument here is risky for future maintenance. It's safer to initialize the default to None and create the dictionary inside the function body to avoid potential side effects if the function logic changes.

A safer implementation would be:

async def s4(target_fasta, task_description=None):
    if task_description is None:
        task_description = {"gpus_per_rank": 1}
    # ... rest of function

-     if task_description is None:
-         task_description = {}
  @self.auto_register_task() # pLDTT_extract
+ async def s5(task_description={}): # noqa: B006

medium

As with s1 and s4, using a mutable default argument is not recommended. To improve code safety and maintainability, please consider using None as the default and creating the dictionary inside the function.

A safer implementation would be:

async def s5(task_description=None):
    if task_description is None:
        task_description = {}
    # ... rest of function

"in the current pass only."
)

pass

medium

The pass statement here is unnecessary because the if block already contains a logging statement and is not empty. It can be safely removed for cleaner code.
