          fix(declarative): Pass extra_fields in global_substream_cursor
          #195
        
Conversation
📝 Walkthrough
The pull request modifies the
Changes
Sequence Diagram
sequenceDiagram
    participant GS as GlobalSubstreamCursor
    participant SS as StreamSlice
    
    GS->>SS: Create StreamSlice
    SS-->>GS: StreamSlice with extra_fields
Actionable comments posted: 0
🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/incremental/global_substream_cursor.py (1)
131-131: Balanced approach for `generate_slices_from_partition`.
Great to see the same approach for passing `extra_fields` here. Would you be open to adding a simple test to confirm that `extra_fields` is successfully set in these generated slices as well? wdyt?
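A minimal sketch of the kind of test requested above. It does not import the CDK: `generate_slices` and the `SimpleNamespace` objects are hypothetical stand-ins for `GlobalSubstreamCursor.generate_slices_from_partition` and `StreamSlice`, so only the shape of the assertion carries over to the real test suite.

```python
from types import SimpleNamespace
from unittest.mock import Mock


def generate_slices(partition_router, cursor_slices):
    """Hypothetical simplification: pair every partition with every cursor slice."""
    for partition in partition_router.stream_slices():
        for cursor_slice in cursor_slices:
            yield SimpleNamespace(
                partition=partition.partition,
                cursor_slice=cursor_slice,
                # The behaviour under test: extra_fields travels with the slice.
                extra_fields=partition.extra_fields,
            )


def test_extra_fields_are_forwarded():
    # Mocked partition router returning a single partition with extra_fields.
    router = Mock()
    router.stream_slices.return_value = [
        SimpleNamespace(
            partition={"first_partition_key": "first_partition_value"},
            extra_fields={"extra_field": "extra_value"},
        )
    ]

    slices = list(generate_slices(router, [{"cursor_slice_field": "first slice cursor value"}]))

    assert len(slices) == 1
    assert slices[0].partition == {"first_partition_key": "first_partition_value"}
    assert slices[0].extra_fields == {"extra_field": "extra_value"}
```

In the actual repository the same assertion would be made against slices produced by the real cursor, built from the `mocked_partition_router` fixture.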
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- airbyte_cdk/sources/declarative/incremental/global_substream_cursor.py (2 hunks)
🔇 Additional comments (1)
airbyte_cdk/sources/declarative/incremental/global_substream_cursor.py (1)
115-115: Consider verifying the presence of `extra_fields` in the partition.
Would you like to add a quick check or default behavior in case `extra_fields` is missing or unexpected in the partition? wdyt?
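One possible shape for such a default, sketched under assumptions: `extra_fields_or_empty` is a hypothetical helper, not part of the CDK, and it treats a missing attribute, `None`, or a non-mapping value as "no extra fields".

```python
from collections.abc import Mapping
from types import SimpleNamespace
from typing import Any, Dict


def extra_fields_or_empty(partition: Any) -> Dict[str, Any]:
    """Return a copy of partition.extra_fields, or {} if it is absent or not a mapping."""
    extra = getattr(partition, "extra_fields", None)
    if isinstance(extra, Mapping):
        return dict(extra)
    return {}


# Usage: partitions with and without the attribute are handled the same way.
with_fields = SimpleNamespace(partition={"id": 1}, extra_fields={"name": "parent"})
without_fields = SimpleNamespace(partition={"id": 2})
print(extra_fields_or_empty(with_fields))
print(extra_fields_or_empty(without_fields))
```

Whether silently defaulting is preferable to raising is a design choice; a default keeps slices flowing, while an explicit error surfaces a misconfigured partition router earlier.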
      | /autofix | 
| I do not see how the failing test has any connection with the code I am changing.... | 
| Any interest in getting this merged? | 
| Yep! We're backlogged, but yes interest. Let me kick the tires on CI again to see what's going on. | 
Actionable comments posted: 0
🧹 Nitpick comments (1)
unit_tests/sources/declarative/incremental/test_per_partition_cursor.py (1)
732-739: Consider renaming the cursor variable to avoid shadowing.
The variable `cursor` is reassigned on line 739, which shadows the previous assignment. This could be confusing to readers. What do you think about renaming the first instance to `mock_cursor` to better reflect its purpose? wdyt?

-    cursor = (
+    mock_cursor = (
         MockedCursorBuilder()
         .with_stream_slices([{CURSOR_SLICE_FIELD: "first slice cursor value"}])
         .build()
     )
-    mocked_cursor_factory.create.return_value = cursor
+    mocked_cursor_factory.create.return_value = mock_cursor
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- unit_tests/sources/declarative/incremental/test_per_partition_cursor.py (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (7)
- GitHub Check: Check: 'source-pokeapi' (skip=false)
- GitHub Check: Check: 'source-the-guardian-api' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (1)
unit_tests/sources/declarative/incremental/test_per_partition_cursor.py (1)
754-780: LGTM! Well-structured test for GlobalSubstreamCursor.
The test effectively verifies that `extra_fields` are correctly preserved in the `StreamSlice` when using `GlobalSubstreamCursor`. The assertions are comprehensive and the test setup is clear.
Actionable comments posted: 0
🧹 Nitpick comments (2)
unit_tests/sources/declarative/incremental/test_per_partition_cursor.py (2)
723-751: The test structure looks good! A few suggestions to make it even better.
The test follows good practices with clear arrange-act-assert structure. Would you consider these improvements?
- Add a docstring explaining the test's purpose
- Rename the second `cursor` variable (line 732) to `mock_underlying_cursor` to avoid confusion with the `cursor` on line 739
- Add test cases for empty or None `extra_fields`

 def test_per_partition_cursor_partition_router_extra_fields(
     mocked_cursor_factory, mocked_partition_router
 ):
+    """
+    Test that PerPartitionCursor correctly preserves extra_fields from the partition router
+    when creating stream slices.
+    """
     first_partition = {"first_partition_key": "first_partition_value"}
     mocked_partition_router.stream_slices.return_value = [
         StreamSlice(
             partition=first_partition, cursor_slice={}, extra_fields={"extra_field": "extra_value"}
         ),
     ]
-    cursor = (
+    mock_underlying_cursor = (
         MockedCursorBuilder()
         .with_stream_slices([{CURSOR_SLICE_FIELD: "first slice cursor value"}])
         .build()
     )
-    mocked_cursor_factory.create.return_value = cursor
+    mocked_cursor_factory.create.return_value = mock_underlying_cursor
754-780: Similar improvements could enhance this test too.
The test structure is solid, but what do you think about these suggestions?
- Add a docstring explaining the test's purpose
- Rename the first `cursor` variable (line 763) to `mock_underlying_cursor` for consistency
- Add test cases for edge cases with empty or None `extra_fields`

 def test_global_cursor_partition_router_extra_fields(
     mocked_cursor_factory, mocked_partition_router
 ):
+    """
+    Test that GlobalSubstreamCursor correctly preserves extra_fields from the partition router
+    when creating stream slices.
+    """
     first_partition = {"first_partition_key": "first_partition_value"}
     mocked_partition_router.stream_slices.return_value = [
         StreamSlice(
             partition=first_partition, cursor_slice={}, extra_fields={"extra_field": "extra_value"}
         ),
     ]
-    cursor = (
+    mock_underlying_cursor = (
         MockedCursorBuilder()
         .with_stream_slices([{CURSOR_SLICE_FIELD: "first slice cursor value"}])
         .build()
     )
-    global_cursor = GlobalSubstreamCursor(cursor, mocked_partition_router)
+    global_cursor = GlobalSubstreamCursor(mock_underlying_cursor, mocked_partition_router)
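The edge cases suggested in both nitpicks (empty or None `extra_fields`) could be covered with a table-driven check along these lines. This is a sketch only: `build_slice` is a hypothetical stand-in for the slice construction inside the cursors, and normalising `None` to an empty dict is an assumption about the desired behaviour, not something the PR states.

```python
from types import SimpleNamespace


def build_slice(partition, cursor_slice):
    """Hypothetical stand-in: build a slice, normalising missing extra_fields to {}."""
    return SimpleNamespace(
        partition=partition.partition,
        cursor_slice=cursor_slice,
        extra_fields=dict(partition.extra_fields or {}),
    )


# (extra_fields given by the partition router, extra_fields expected on the slice)
EDGE_CASES = [
    ({"extra_field": "extra_value"}, {"extra_field": "extra_value"}),
    ({}, {}),
    (None, {}),
]


def check_edge_cases():
    for given, expected in EDGE_CASES:
        partition = SimpleNamespace(
            partition={"first_partition_key": "first_partition_value"},
            extra_fields=given,
        )
        built = build_slice(partition, {"cursor_slice_field": "first slice cursor value"})
        assert built.extra_fields == expected, f"unexpected result for {given!r}"


check_edge_cases()
```

With pytest, the same table would normally be passed to `pytest.mark.parametrize` so that each case is reported separately.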
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- airbyte_cdk/sources/declarative/incremental/global_substream_cursor.py (2 hunks)
- unit_tests/sources/declarative/incremental/test_per_partition_cursor.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- airbyte_cdk/sources/declarative/incremental/global_substream_cursor.py
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (Fast)
| @tolik0 Any way to move this forward? | 
* main:
  fix: update cryptography package to latest version to address CVE (airbytehq#377)
  fix: (CDK) (HttpRequester) - Make the `HttpRequester.path` optional (airbytehq#370)
  feat: improved custom components handling (airbytehq#350)
  feat: add microseconds timestamp format (airbytehq#373)
  fix: Replace Unidecode with anyascii for permissive license (airbytehq#367)
  feat: add IncrementingCountCursor (airbytehq#346)
  feat: (low-code cdk) datetime format with milliseconds (airbytehq#369)
  fix: (CDK) (AsyncRetriever) - Improve UX on variable naming and interpolation (airbytehq#368)
  fix: (CDK) (AsyncRetriever) - Add the `request` and `response` to each `async` operations (airbytehq#356)
  fix: (CDK) (ConnectorBuilder) - Add `auxiliary requests` to slice; support `TestRead` for AsyncRetriever (part 1/2) (airbytehq#355)
  feat(concurrent perpartition cursor): Add parent state updates (airbytehq#343)
  fix: update csv parser for builder compatibility (airbytehq#364)
  feat(low-code cdk): add interpolation for limit field in Rate (airbytehq#353)
  feat(low-code cdk): add AbstractStreamFacade processing as concurrent streams in declarative source (airbytehq#347)
  fix: (CDK) (CsvParser) - Fix the `\\` escaping when passing the `delimiter` from Builder's UI (airbytehq#358)
  feat: expose `str_to_datetime` jinja macro (airbytehq#351)
  fix: update CDK migration for 6.34.0 (airbytehq#348)
  feat: Removes `stream_state` interpolation from CDK (airbytehq#320)
  fix(declarative): Pass `extra_fields` in `global_substream_cursor` (airbytehq#195)
  feat(concurrent perpartition cursor): Refactor ConcurrentPerPartitionCursor (airbytehq#331)
  feat(HttpMocker): adding support for PUT requests and bytes responses (airbytehq#342)
  chore: use certified source for manifest-only test (airbytehq#338)
  feat: check for request_option mapping conflicts in individual components (airbytehq#328)
  feat(file-based): sync file acl permissions and identities (airbytehq#260)
  fix: (CDK) (Connector Builder) - refactor the `MessageGrouper` > `TestRead` (airbytehq#332)
  fix(low code): Fix missing cursor for ClientSideIncrementalRecordFilterDecorator (airbytehq#334)
  feat(low-code): Add API Budget (airbytehq#314)
  chore(decoder): clean decoders and make csvdecoder available (airbytehq#326)
`per_partition_cursor` correctly passes `extra_fields` to `StreamSlice` from partition. `global_substream_cursor` does not.
I did not have an idea how to test this properly.
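To illustrate the difference described above, here is a simplified sketch. `SketchSlice` is a hypothetical stand-in for the CDK's `StreamSlice`, and the two generator functions are illustrative only; they are not the actual bodies of either cursor.

```python
from dataclasses import dataclass, field
from typing import Any, Iterable, Iterator, Mapping


@dataclass(frozen=True)
class SketchSlice:
    """Hypothetical stand-in for StreamSlice; not the real CDK class."""

    partition: Mapping[str, Any]
    cursor_slice: Mapping[str, Any]
    extra_fields: Mapping[str, Any] = field(default_factory=dict)


def slices_dropping_extra_fields(
    partitions: Iterable[SketchSlice], cursor_slices: Iterable[Mapping[str, Any]]
) -> Iterator[SketchSlice]:
    """Behaviour reported for the global cursor: extra_fields is not forwarded."""
    cursor_slices = list(cursor_slices)
    for partition in partitions:
        for cursor_slice in cursor_slices:
            yield SketchSlice(partition=partition.partition, cursor_slice=cursor_slice)


def slices_forwarding_extra_fields(
    partitions: Iterable[SketchSlice], cursor_slices: Iterable[Mapping[str, Any]]
) -> Iterator[SketchSlice]:
    """Behaviour after the fix: extra_fields from the partition is carried over."""
    cursor_slices = list(cursor_slices)
    for partition in partitions:
        for cursor_slice in cursor_slices:
            yield SketchSlice(
                partition=partition.partition,
                cursor_slice=cursor_slice,
                extra_fields=partition.extra_fields,
            )


parent = SketchSlice(partition={"id": 1}, cursor_slice={}, extra_fields={"name": "a"})
dropped = list(slices_dropping_extra_fields([parent], [{"start": "2024-01-01"}]))
forwarded = list(slices_forwarding_extra_fields([parent], [{"start": "2024-01-01"}]))
```

Here `dropped[0].extra_fields` is empty while `forwarded[0].extra_fields` still holds the parent's values, which is the property the tests added in this PR check.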
Summary by CodeRabbit
- `GlobalSubstreamCursor` to support additional contextual information in stream slices.
- `extra_fields` in `StreamSlice` objects for both `PerPartitionCursor` and `GlobalSubstreamCursor`.