chore: add bin/fix_acceptance_tests_yml.py #761

maxi297 · 2025-09-12T18:46:07Z

What

https://airbytehq-team.slack.com/archives/C02U9R3AF37/p1757701152311889?thread_ts=1757603749.279779&cid=C02U9R3AF37

Summary by CodeRabbit

Chores
- Added a CLI tool to migrate acceptance test configuration files across connectors to a new nested format.
- Validates configs, skips already-migrated files, and reports clear outcomes and errors for each processed file.
- Preserves YAML formatting and improves list indentation to maintain consistency after migration.

github-actions · 2025-09-12T18:46:19Z

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@maxi297/update_acceptance_test_config_yml#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch maxi297/update_acceptance_test_config_yml

Helpful Resources

CDK API Reference

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

/autofix - Fixes most formatting and linting issues
/poetry-lock - Updates poetry.lock file
/test - Runs connector tests with the updated CDK
/poe build - Regenerate git-committed build artifacts, such as the pydantic models which are generated from the manifest JSON schema in YAML.
/poe <command> - Runs any poe command in the CDK environment

📝 Edit this welcome message.

coderabbitai · 2025-09-12T18:52:38Z

📝 Walkthrough

Walkthrough

Adds a new CLI script that migrates connector acceptance-test YAML files from a top-level tests mapping to a nested acceptance_tests structure, including custom YAML dumping, validation, per-file transformation, and error handling via a main entrypoint.

Changes

Cohort / File(s)	Summary
CLI migration script `bin/fix_acceptance_tests_yml.py`	New script: defines `AlreadyUpdatedError`; `FixingListIndentationDumper` to control YAML list indentation; `transform(file_path: Path)` to validate, migrate `tests` → `acceptance_tests`, and write YAML; `main()` to glob connector config files, call `transform`, and report errors; CLI entrypoint (`if __name__ == "__main__": main()`).

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Dev as Developer
  participant CLI as fix_acceptance_tests_yml.py
  participant FS as Filesystem
  participant YAML as YAML Parser

  Dev->>CLI: Run with repo_path
  CLI->>FS: Glob airbyte-integrations/connectors/source-*/acceptance-test-config.yml
  loop For each matched file
    CLI->>FS: Read file
    CLI->>YAML: safe_load(content)
    alt acceptance_tests already present
      CLI-->>Dev: Skip file (AlreadyUpdatedError)
    else "tests" present and valid
      CLI->>CLI: Build acceptance_tests structure (per-type {'tests': ...})
      CLI->>YAML: dump(updated, FixingListIndentationDumper, sort_keys=False)
      CLI->>FS: Write file
      CLI-->>Dev: Print success
    else invalid/missing "tests"
      CLI-->>Dev: Print error (ValueError or YAMLError)
    end
  end

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers

aaronsteers
aldogonzalez8
dbgold17

Would you like me to suggest small unit tests or example input/output fixtures for this script to speed review? wdyt?

Pre-merge checks (2 passed, 1 warning)

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title accurately and concisely reflects the main change—adding the new script bin/fix_acceptance_tests_yml.py—so it is on-topic and understandable to reviewers scanning the history.

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e79545c and c40a57e.

📒 Files selected for processing (1)

bin/fix_acceptance_tests_yml.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

bin/fix_acceptance_tests_yml.py

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)

GitHub Check: Check: destination-motherduck
GitHub Check: Check: source-intercom
GitHub Check: Check: source-pokeapi
GitHub Check: Check: source-shopify
GitHub Check: Check: source-hardcoded-records
GitHub Check: Pytest (All, Python 3.13, Ubuntu)
GitHub Check: Pytest (All, Python 3.11, Ubuntu)
GitHub Check: Pytest (All, Python 3.10, Ubuntu)
GitHub Check: Pytest (All, Python 3.12, Ubuntu)
GitHub Check: SDM Docker Image Build
GitHub Check: Manifest Server Docker Image Build
GitHub Check: Pytest (Fast)

✨ Finishing touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch maxi297/update_acceptance_test_config_yml

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (4)

bin/fix_acceptance_tests_yml.py (4)
49-51: Be tolerant if a test section is already nested

If some files already have test_type: { tests: [...] }, we’d double-nest. Prefer a conditional to pass through already-correct blocks, wdyt?
-    for test_type, test_content in tests_data.items():
-        data['acceptance_tests'][test_type] = {'tests': test_content}
+    for test_type, test_content in tests_data.items():
+        if isinstance(test_content, dict) and "tests" in test_content:
+            section = test_content
+        else:
+            section = {"tests": ([] if test_content is None else test_content)}
+        data["acceptance_tests"][test_type] = section
52-55: Clarify formatting claim or preserve comments via ruamel.yaml

PyYAML will drop comments; the “preserved formatting” comment is misleading. Do you want to switch to ruamel.yaml to preserve comments/round-tripping, or just update the comment and keep PyYAML, wdyt?

Option A (keep PyYAML; fix comment and use UTF-8):
-    # Write back to file with preserved formatting
-    with open(file_path, 'w') as f:
-        yaml.dump(data, f, default_flow_style=False, sort_keys=False, indent=2)
+    # Write back to file (note: comments/formatting may not be preserved)
+    with open(file_path, "w", encoding="utf-8") as f:
+        yaml.dump(data, f, default_flow_style=False, sort_keys=False, indent=2, allow_unicode=True)
Option B (preserve comments):

Replace PyYAML with ruamel.yaml’s YAML() round-trip loader/dumper (happy to provide a follow-up patch if you prefer).

60-63: Make usage resilient to script path

Would you prefer the usage use the invoked name so it’s correct whether run via bin/ or directly, wdyt?
-        print("Usage: python fix_acceptance_tests_yml.py <airbyte_repo_path>")
+        print(f"Usage: {Path(sys.argv[0]).name} <airbyte_repo_path>")
18-22: Trim unused imports

typing imports aren’t used. Shall we drop them to keep Ruff happy, wdyt?
-from typing import Dict, Any

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9a075a1 and fc3888f.

📒 Files selected for processing (1)

bin/fix_acceptance_tests_yml.py (1 hunks)

🧰 Additional context used

🪛 GitHub Actions: Linters

bin/fix_acceptance_tests_yml.py

[error] 1-1: Ruff format check failed. 1 file would be reformatted (bin/fix_acceptance_tests_yml.py). Command: 'poetry run ruff format --diff .'

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)

GitHub Check: Check: source-hardcoded-records
GitHub Check: Check: source-intercom
GitHub Check: Check: source-shopify
GitHub Check: Check: destination-motherduck
GitHub Check: Check: source-pokeapi
GitHub Check: Pytest (Fast)
GitHub Check: Manifest Server Docker Image Build
GitHub Check: Pytest (All, Python 3.11, Ubuntu)
GitHub Check: Pytest (All, Python 3.10, Ubuntu)
GitHub Check: SDM Docker Image Build
GitHub Check: Pytest (All, Python 3.13, Ubuntu)
GitHub Check: Pytest (All, Python 3.12, Ubuntu)

🔇 Additional comments (1)

bin/fix_acceptance_tests_yml.py (1)

1-79: Fix Ruff formatting failure

CI shows "Ruff format check failed." I tried running the formatter here but poetry isn't available in the environment (poetry: command not found). Could you run locally: poetry run ruff format bin/fix_acceptance_tests_yml.py and push the formatted file, or should I apply the formatting and open a follow-up patch? wdyt?

coderabbitai · 2025-09-12T18:52:41Z

bin/fix_acceptance_tests_yml.py

+def transform(file_path: Path) -> None:
+    with open(file_path, 'r') as f:
+        data = yaml.safe_load(f)
+
+    if 'acceptance_tests' in data:
+        raise AlreadyUpdatedError()
+
+    if 'tests' not in data:
+        raise ValueError(f"No 'tests' key found in {file_path}, skipping transformation")
+
+    # Extract the tests data
+    tests_data = data.pop('tests')
+
+    if not isinstance(tests_data, dict):
+        raise ValueError(f"Error: 'tests' key in {file_path} is not a dictionary")
+


🛠️ Refactor suggestion

Harden YAML load (empty/non-mapping files) and use UTF-8

Empty YAML evaluates to None and non-mapping roots will raise TypeError on key checks. Shall we guard and add explicit UTF-8, plus clearer error messages, wdyt?

def transform(file_path: Path) -> None: - with open(file_path, 'r') as f: - data = yaml.safe_load(f) - - if 'acceptance_tests' in data: - raise AlreadyUpdatedError() - - if 'tests' not in data: - raise ValueError(f"No 'tests' key found in {file_path}, skipping transformation") - - # Extract the tests data - tests_data = data.pop('tests') - - if not isinstance(tests_data, dict): - raise ValueError(f"Error: 'tests' key in {file_path} is not a dictionary") + with open(file_path, "r", encoding="utf-8") as f: + data = yaml.safe_load(f) or {} + + if not isinstance(data, dict): + raise ValueError(f"{file_path}: top-level YAML must be a mapping, got {type(data).__name__}") + + if "acceptance_tests" in data: + raise AlreadyUpdatedError() + + if "tests" not in data: + raise ValueError(f"{file_path}: no 'tests' key; skipping transformation") + + # Extract the tests data + tests_data = data.pop("tests") or {} + + if not isinstance(tests_data, dict): + raise ValueError(f"{file_path}: 'tests' must be a mapping, got {type(tests_data).__name__}")

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In bin/fix_acceptance_tests_yml.py around lines 29–44, open the file with explicit UTF-8 (open(file_path, 'r', encoding='utf-8')), call yaml.safe_load, then guard against empty or non-mapping YAML by checking if data is None and raising a clear ValueError("Empty YAML file: {file_path}") and then if not isinstance(data, dict) raise ValueError("Unexpected YAML root type: expected mapping in {file_path}"); only after those guards check for 'acceptance_tests' and 'tests' keys and validate that tests_data is a dict, keeping the existing AlreadyUpdatedError and the other error messages but making them clearer per above.

coderabbitai · 2025-09-12T18:52:41Z

bin/fix_acceptance_tests_yml.py

+    for file_path in repo_path.glob('airbyte-integrations/connectors/source-*/acceptance-test-config.yml'):
+        try:
+            transform(file_path)
+        except AlreadyUpdatedError:
+            print(f"File {file_path} has already been updated, skipping transformation")
+        except yaml.YAMLError as e:
+            print(f"Error parsing YAML file {file_path}: {e}")
+        except Exception as e:
+            print(f"Error transforming {file_path}: {e}")
+


🛠️ Refactor suggestion

Broaden file search to destinations and .yaml; print errors to stderr

Many connectors are destination-* and some configs use .yaml. Shall we cover both and route errors to stderr, wdyt?

- for file_path in repo_path.glob('airbyte-integrations/connectors/source-*/acceptance-test-config.yml'): - try: - transform(file_path) - except AlreadyUpdatedError: - print(f"File {file_path} has already been updated, skipping transformation") - except yaml.YAMLError as e: - print(f"Error parsing YAML file {file_path}: {e}") - except Exception as e: - print(f"Error transforming {file_path}: {e}") + patterns = [ + "airbyte-integrations/connectors/source-*/acceptance-test-config.yml", + "airbyte-integrations/connectors/source-*/acceptance-test-config.yaml", + "airbyte-integrations/connectors/destination-*/acceptance-test-config.yml", + "airbyte-integrations/connectors/destination-*/acceptance-test-config.yaml", + ] + for pattern in patterns: + for file_path in repo_path.glob(pattern): + try: + transform(file_path) + except AlreadyUpdatedError: + print(f"File {file_path} has already been updated, skipping transformation") + except yaml.YAMLError as e: + print(f"Error parsing YAML file {file_path}: {e}", file=sys.stderr) + except Exception as e: + print(f"Error transforming {file_path}: {e}", file=sys.stderr)

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In bin/fix_acceptance_tests_yml.py around lines 66 to 75, broaden the file search to include destination-* connectors and files ending with both .yml and .yaml, and route error output to stderr; change the glob loop to iterate over a list of patterns (e.g. for patterns like 'airbyte-integrations/connectors/source-*/acceptance-test-config.yml', '.../source-*/acceptance-test-config.yaml', '.../destination-*/acceptance-test-config.yml', '.../destination-*/acceptance-test-config.yaml') or generate matches with multiple glob calls, then for each matched file call transform(file_path) and on exceptions print the same messages to stderr using print(..., file=sys.stderr) (including AlreadyUpdatedError, yaml.YAMLError as e, and generic Exception as e).

maxi297 · 2025-09-12T18:59:34Z

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formating issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

github-actions · 2025-09-12T19:07:44Z

PyTest Results (Fast)

3 746 tests ±0 3 734 ✅ ±0 6m 20s ⏱️ +2s
1 suites ±0 12 💤 ±0
1 files ±0 0 ❌ ±0

Results for commit c40a57e. ± Comparison against base commit 9a075a1.

github-actions · 2025-09-12T19:12:23Z

PyTest Results (Full)

3 749 tests ±0 3 737 ✅ ±0 10m 59s ⏱️ ±0s
1 suites ±0 12 💤 ±0
1 files ±0 0 ❌ ±0

Results for commit c40a57e. ± Comparison against base commit 9a075a1.

add bin/fix_acceptance_tests_yml.py

fc3888f

github-actions bot added the chore label Sep 12, 2025

coderabbitai bot requested changes Sep 12, 2025

View reviewed changes

fix indentation

e79545c

Auto-fix lint and format issues

c40a57e

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

chore: add bin/fix_acceptance_tests_yml.py #761

chore: add bin/fix_acceptance_tests_yml.py #761

Uh oh!

maxi297 commented Sep 12, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

github-actions bot commented Sep 12, 2025

Uh oh!

coderabbitai bot commented Sep 12, 2025 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Suggested reviewers

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Sep 12, 2025

Uh oh!

coderabbitai bot Sep 12, 2025

Uh oh!

maxi297 commented Sep 12, 2025 •

edited by github-actions bot

Loading

Uh oh!

github-actions bot commented Sep 12, 2025

Uh oh!

github-actions bot commented Sep 12, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

chore: add bin/fix_acceptance_tests_yml.py #761

Are you sure you want to change the base?

chore: add bin/fix_acceptance_tests_yml.py #761

Uh oh!

Conversation

maxi297 commented Sep 12, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Summary by CodeRabbit

Uh oh!

github-actions bot commented Sep 12, 2025

👋 Greetings, Airbyte Team Member!

Testing This CDK Version

Helpful Resources

PR Slash Commands

Uh oh!

coderabbitai bot commented Sep 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Suggested reviewers

Pre-merge checks (2 passed, 1 warning)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Sep 12, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Sep 12, 2025

Choose a reason for hiding this comment

Uh oh!

maxi297 commented Sep 12, 2025 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Sep 12, 2025

PyTest Results (Fast)

Uh oh!

github-actions bot commented Sep 12, 2025

PyTest Results (Full)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

maxi297 commented Sep 12, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Sep 12, 2025 •

edited

Loading

maxi297 commented Sep 12, 2025 •

edited by github-actions bot

Loading