chore: add bin/fix_acceptance_tests_yml.py #761
base: main
👋 Greetings, Airbyte Team Member!
Here are some helpful tips and reminders for your convenience.
Testing This CDK Version
You can test this version of the CDK using the following:
# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@maxi297/update_acceptance_test_config_yml#egg=airbyte-python-cdk[dev]' --help
# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch maxi297/update_acceptance_test_config_yml
Helpful Resources
PR Slash Commands
Airbyte Maintainers can execute the following slash commands on your PR:
📝 Walkthrough
Adds a new CLI script that migrates connector acceptance-test YAML files from a top-level
Changes
Sequence Diagram(s)
sequenceDiagram
autonumber
actor Dev as Developer
participant CLI as fix_acceptance_tests_yml.py
participant FS as Filesystem
participant YAML as YAML Parser
Dev->>CLI: Run with repo_path
CLI->>FS: Glob airbyte-integrations/connectors/source-*/acceptance-test-config.yml
loop For each matched file
CLI->>FS: Read file
CLI->>YAML: safe_load(content)
alt acceptance_tests already present
CLI-->>Dev: Skip file (AlreadyUpdatedError)
else "tests" present and valid
CLI->>CLI: Build acceptance_tests structure (per-type {'tests': ...})
CLI->>YAML: dump(updated, FixingListIndentationDumper, sort_keys=False)
CLI->>FS: Write file
CLI-->>Dev: Print success
else invalid/missing "tests"
CLI-->>Dev: Print error (ValueError or YAMLError)
end
end
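The branches in the diagram can be sketched on an already-parsed mapping, which keeps the YAML library out of the picture. This is a minimal sketch under assumptions, not the script's actual code: `migrate_config` is a hypothetical name and the sample values are made up; only `AlreadyUpdatedError`, the `tests` key, and the `acceptance_tests` key come from the PR.

```python
from typing import Any, Dict


class AlreadyUpdatedError(Exception):
    """Raised when a config already uses the `acceptance_tests` layout."""


def migrate_config(data: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `data` with top-level `tests` moved under `acceptance_tests`.

    Hypothetical helper that mirrors the diagram's branches, not the real script.
    """
    if "acceptance_tests" in data:
        raise AlreadyUpdatedError()
    if "tests" not in data:
        raise ValueError("no 'tests' key found")

    tests_data = data["tests"]
    if not isinstance(tests_data, dict):
        raise ValueError("'tests' is not a mapping")

    # Keep every other key in its original order; drop the old `tests` key.
    migrated = {key: value for key, value in data.items() if key != "tests"}
    # Wrap each test type's content as {'tests': ...}, as in the diagram.
    migrated["acceptance_tests"] = {
        test_type: {"tests": content} for test_type, content in tests_data.items()
    }
    return migrated


# Illustrative input only; the values are invented for this example.
old = {"connector_image": "example:dev", "tests": {"spec": [{"spec_path": "spec.json"}]}}
new = migrate_config(old)
print(new)
```

Returning a new mapping rather than mutating the input makes the skip case easy to exercise: calling `migrate_config(new)` a second time raises `AlreadyUpdatedError`.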
Estimated code review effort
🎯 2 (Simple) | ⏱️ ~10 minutes
Suggested reviewers
Would you like me to suggest small unit tests or example input/output fixtures for this script to speed review? wdyt?
Pre-merge checks (2 passed, 1 warning)
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 2
🧹 Nitpick comments (4)
bin/fix_acceptance_tests_yml.py (4)
49-51: Be tolerant if a test section is already nested
If some files already have test_type: { tests: [...] }, we’d double-nest. Prefer a conditional to pass through already-correct blocks, wdyt?
-    for test_type, test_content in tests_data.items():
-        data['acceptance_tests'][test_type] = {'tests': test_content}
+    for test_type, test_content in tests_data.items():
+        if isinstance(test_content, dict) and "tests" in test_content:
+            section = test_content
+        else:
+            section = {"tests": ([] if test_content is None else test_content)}
+        data["acceptance_tests"][test_type] = section
52-55: Clarify formatting claim or preserve comments via ruamel.yaml
PyYAML will drop comments; the “preserved formatting” comment is misleading. Do you want to switch to ruamel.yaml to preserve comments/round-tripping, or just update the comment and keep PyYAML, wdyt?
Option A (keep PyYAML; fix comment and use UTF-8):
-    # Write back to file with preserved formatting
-    with open(file_path, 'w') as f:
-        yaml.dump(data, f, default_flow_style=False, sort_keys=False, indent=2)
+    # Write back to file (note: comments/formatting may not be preserved)
+    with open(file_path, "w", encoding="utf-8") as f:
+        yaml.dump(data, f, default_flow_style=False, sort_keys=False, indent=2, allow_unicode=True)
Option B (preserve comments):
- Replace PyYAML with ruamel.yaml’s YAML() round-trip loader/dumper (happy to provide a follow-up patch if you prefer).
60-63: Make usage resilient to script path
Would you prefer the usage use the invoked name so it’s correct whether run via bin/ or directly, wdyt?
-        print("Usage: python fix_acceptance_tests_yml.py <airbyte_repo_path>")
+        print(f"Usage: {Path(sys.argv[0]).name} <airbyte_repo_path>")
18-22: Trim unused imports
typing imports aren’t used. Shall we drop them to keep Ruff happy, wdyt?
-from typing import Dict, Any
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
bin/fix_acceptance_tests_yml.py
(1 hunks)
🧰 Additional context used
🪛 GitHub Actions: Linters
bin/fix_acceptance_tests_yml.py
[error] 1-1: Ruff format check failed. 1 file would be reformatted (bin/fix_acceptance_tests_yml.py). Command: 'poetry run ruff format --diff .'
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: Check: source-hardcoded-records
- GitHub Check: Check: source-intercom
- GitHub Check: Check: source-shopify
- GitHub Check: Check: destination-motherduck
- GitHub Check: Check: source-pokeapi
- GitHub Check: Pytest (Fast)
- GitHub Check: Manifest Server Docker Image Build
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: SDM Docker Image Build
- GitHub Check: Pytest (All, Python 3.13, Ubuntu)
- GitHub Check: Pytest (All, Python 3.12, Ubuntu)
🔇 Additional comments (1)
bin/fix_acceptance_tests_yml.py (1)
1-79: Fix Ruff formatting failure
CI shows "Ruff format check failed." I tried running the formatter here but poetry isn't available in the environment (poetry: command not found). Could you run locally: poetry run ruff format bin/fix_acceptance_tests_yml.py and push the formatted file, or should I apply the formatting and open a follow-up patch? wdyt?
bin/fix_acceptance_tests_yml.py
Outdated
def transform(file_path: Path) -> None:
    with open(file_path, 'r') as f:
        data = yaml.safe_load(f)

    if 'acceptance_tests' in data:
        raise AlreadyUpdatedError()

    if 'tests' not in data:
        raise ValueError(f"No 'tests' key found in {file_path}, skipping transformation")

    # Extract the tests data
    tests_data = data.pop('tests')

    if not isinstance(tests_data, dict):
        raise ValueError(f"Error: 'tests' key in {file_path} is not a dictionary")
🛠️ Refactor suggestion
Harden YAML load (empty/non-mapping files) and use UTF-8
Empty YAML evaluates to None, which raises TypeError on the key checks, and other non-mapping roots (a list or a scalar) are not handled either. Shall we guard and add explicit UTF-8, plus clearer error messages, wdyt?
def transform(file_path: Path) -> None:
- with open(file_path, 'r') as f:
- data = yaml.safe_load(f)
-
- if 'acceptance_tests' in data:
- raise AlreadyUpdatedError()
-
- if 'tests' not in data:
- raise ValueError(f"No 'tests' key found in {file_path}, skipping transformation")
-
- # Extract the tests data
- tests_data = data.pop('tests')
-
- if not isinstance(tests_data, dict):
- raise ValueError(f"Error: 'tests' key in {file_path} is not a dictionary")
+ with open(file_path, "r", encoding="utf-8") as f:
+ data = yaml.safe_load(f) or {}
+
+ if not isinstance(data, dict):
+ raise ValueError(f"{file_path}: top-level YAML must be a mapping, got {type(data).__name__}")
+
+ if "acceptance_tests" in data:
+ raise AlreadyUpdatedError()
+
+ if "tests" not in data:
+ raise ValueError(f"{file_path}: no 'tests' key; skipping transformation")
+
+ # Extract the tests data
+ tests_data = data.pop("tests") or {}
+
+ if not isinstance(tests_data, dict):
+ raise ValueError(f"{file_path}: 'tests' must be a mapping, got {type(tests_data).__name__}")
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In bin/fix_acceptance_tests_yml.py around lines 29–44, open the file with
explicit UTF-8 (open(file_path, 'r', encoding='utf-8')), call yaml.safe_load,
then guard against empty or non-mapping YAML by checking if data is None and
raising a clear ValueError("Empty YAML file: {file_path}") and then if not
isinstance(data, dict) raise ValueError("Unexpected YAML root type: expected
mapping in {file_path}"); only after those guards check for 'acceptance_tests'
and 'tests' keys and validate that tests_data is a dict, keeping the existing
AlreadyUpdatedError and the other error messages but making them clearer per
above.
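The guards described in the prompt above can be sketched as a check on whatever the YAML loader returned. `require_mapping` is a hypothetical helper, not part of the script; it takes the already-parsed object so the example needs no YAML library, and appending the actual type name to the second message is an addition made here.

```python
from typing import Any, Dict


def require_mapping(data: Any, file_path: str) -> Dict[str, Any]:
    """Validate the object returned by a YAML loader before any key lookups."""
    if data is None:
        # An empty file parses to None.
        raise ValueError(f"Empty YAML file: {file_path}")
    if not isinstance(data, dict):
        raise ValueError(
            f"Unexpected YAML root type: expected mapping in {file_path}, "
            f"got {type(data).__name__}"
        )
    return data


# None stands in for an empty file, the list for a non-mapping root.
for candidate in (None, ["spec"], {"tests": {}}):
    try:
        print("ok:", require_mapping(candidate, "acceptance-test-config.yml"))
    except ValueError as error:
        print("rejected:", error)
```

Note that this follows the prompt's wording (raise on an empty file), whereas the diff above falls back to `{}` with `or {}`; either way the later `in` checks only ever see a mapping.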
bin/fix_acceptance_tests_yml.py
Outdated
    for file_path in repo_path.glob('airbyte-integrations/connectors/source-*/acceptance-test-config.yml'):
        try:
            transform(file_path)
        except AlreadyUpdatedError:
            print(f"File {file_path} has already been updated, skipping transformation")
        except yaml.YAMLError as e:
            print(f"Error parsing YAML file {file_path}: {e}")
        except Exception as e:
            print(f"Error transforming {file_path}: {e}")
🛠️ Refactor suggestion
Broaden file search to destinations and .yaml; print errors to stderr
Many connectors are destination-* and some configs use .yaml. Shall we cover both and route errors to stderr, wdyt?
- for file_path in repo_path.glob('airbyte-integrations/connectors/source-*/acceptance-test-config.yml'):
- try:
- transform(file_path)
- except AlreadyUpdatedError:
- print(f"File {file_path} has already been updated, skipping transformation")
- except yaml.YAMLError as e:
- print(f"Error parsing YAML file {file_path}: {e}")
- except Exception as e:
- print(f"Error transforming {file_path}: {e}")
+ patterns = [
+ "airbyte-integrations/connectors/source-*/acceptance-test-config.yml",
+ "airbyte-integrations/connectors/source-*/acceptance-test-config.yaml",
+ "airbyte-integrations/connectors/destination-*/acceptance-test-config.yml",
+ "airbyte-integrations/connectors/destination-*/acceptance-test-config.yaml",
+ ]
+ for pattern in patterns:
+ for file_path in repo_path.glob(pattern):
+ try:
+ transform(file_path)
+ except AlreadyUpdatedError:
+ print(f"File {file_path} has already been updated, skipping transformation")
+ except yaml.YAMLError as e:
+ print(f"Error parsing YAML file {file_path}: {e}", file=sys.stderr)
+ except Exception as e:
+ print(f"Error transforming {file_path}: {e}", file=sys.stderr)
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In bin/fix_acceptance_tests_yml.py around lines 66 to 75, broaden the file
search to include destination-* connectors and files ending with both .yml and
.yaml, and route error output to stderr; change the glob loop to iterate over a
list of patterns (e.g. for patterns like
'airbyte-integrations/connectors/source-*/acceptance-test-config.yml',
'.../source-*/acceptance-test-config.yaml',
'.../destination-*/acceptance-test-config.yml',
'.../destination-*/acceptance-test-config.yaml') or generate matches with
multiple glob calls, then for each matched file call transform(file_path) and on
exceptions print the same messages to stderr using print(..., file=sys.stderr)
(including AlreadyUpdatedError, yaml.YAMLError as e, and generic Exception as
e).
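The broadened search can be tried against a throwaway directory tree. This is a sketch under assumptions: `find_configs` and `build_demo_repo` are hypothetical helpers, and the connector names `source-alpha` and `destination-beta` are made up; the four patterns are the ones from the suggestion.

```python
import tempfile
from pathlib import Path
from typing import List

PATTERNS = [
    "airbyte-integrations/connectors/source-*/acceptance-test-config.yml",
    "airbyte-integrations/connectors/source-*/acceptance-test-config.yaml",
    "airbyte-integrations/connectors/destination-*/acceptance-test-config.yml",
    "airbyte-integrations/connectors/destination-*/acceptance-test-config.yaml",
]


def find_configs(repo_path: Path) -> List[Path]:
    """Collect every matching config file under repo_path, sorted for stable output."""
    matches = set()
    for pattern in PATTERNS:
        matches.update(repo_path.glob(pattern))
    return sorted(matches)


def build_demo_repo(root: Path) -> None:
    """Create a tiny fake repo layout with made-up connector names."""
    connectors = root / "airbyte-integrations" / "connectors"
    for relative in (
        "source-alpha/acceptance-test-config.yml",
        "destination-beta/acceptance-test-config.yaml",
        "source-alpha/README.md",  # must not be matched
    ):
        target = connectors / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("", encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    build_demo_repo(repo)
    found = [path.relative_to(repo).as_posix() for path in find_configs(repo)]

print(found)
```

Collecting the matches first and looping over them afterwards keeps the `try`/`except` block around `transform` at a single nesting level.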
/autofix
What
https://airbytehq-team.slack.com/archives/C02U9R3AF37/p1757701152311889?thread_ts=1757603749.279779&cid=C02U9R3AF37
Summary by CodeRabbit