@jaceklaskowski
Contributor

What changes were proposed in this pull request?

Improved the SDP Programming Guide

Why are the changes needed?

Better docs

Does this PR introduce any user-facing change?

Yes, since the changes are to user-facing docs.

How was this patch tested?

Reviewed locally

Was this patch authored or co-authored using generative AI tooling?

No (unless auto-completion in IntelliJ IDEA counts)

@github-actions bot added the DOCS label Dec 5, 2025
@jaceklaskowski
Contributor Author

@sryza Please review 🙏

@jaceklaskowski
Contributor Author

@sryza Please please please 🙏

Contributor

@sryza left a comment

Thanks a ton for this @jaceklaskowski – important to maximize clarity of the docs. My main feedback is on the headings.

```diff
-### Loading Data from a Streaming Source
+### Loading Data from Streaming Source
```
Contributor

I think "from a Streaming Source" is more grammatically correct.

Contributor Author

Not in titles, IMHO (but I'm not a native English speaker either).

Let's ask ChatGPT... it's not clear either 👎

But a "hack" is to use plurals 👍

- The function used to define a dataset must return a Spark DataFrame.
- The function used to define a dataset must return a Spark `pyspark.sql.DataFrame`.
- Never use methods that save or write to files or tables as part of your SDP dataset code.
- When using the `for` loop pattern to define datasets in Python, ensure that the list of values passed to the `for` loop is always additive.
Contributor

I think it might be clearer to actually just leave this out – we don't discuss a for loop pattern above.

Contributor Author

@jaceklaskowski Dec 15, 2025

We do, earlier, in "Creating Tables in For Loop in Python".
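
For context on the "always additive" guideline under discussion, here is a minimal sketch in plain Python, not the actual SDP API. It illustrates why shrinking the list a `for` loop iterates over changes which datasets a pipeline defines. The `REGISTRY` dict, the `define_table` and `define_pipeline` helpers, and the `orders_*` names are all hypothetical stand-ins chosen for illustration. What a real pipeline does with a dataset that is no longer defined is outside the scope of this sketch.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a pipeline's dataset registry; NOT the SDP API.
REGISTRY: Dict[str, Callable[[], str]] = {}


def define_table(name: str, query: Callable[[], str]) -> None:
    """Register a dataset definition under the given name."""
    REGISTRY[name] = query


def define_pipeline(regions: List[str]) -> List[str]:
    """Define one dataset per region and return the sorted dataset names."""
    REGISTRY.clear()
    for region in regions:
        # Bind the loop variable as a default argument so each definition
        # keeps its own region instead of the last value of the loop.
        def query(region: str = region) -> str:
            return f"SELECT * FROM orders WHERE region = '{region}'"

        define_table(f"orders_{region}", query)
    return sorted(REGISTRY)


first_run = define_pipeline(["emea", "apac"])
# Additive change: existing names stay defined and a new one appears.
additive_run = define_pipeline(["emea", "apac", "amer"])
# Non-additive change: "apac" was removed, so its dataset is no longer defined.
shrunk_run = define_pipeline(["emea", "amer"])

print(set(first_run) <= set(additive_run))  # True
print(set(first_run) - set(shrunk_run))     # {'orders_apac'}
```

The default-argument binding in `query` is a separate point worth noting for any loop-generated definitions: without it, every closure would see the final value of `region`.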

@jaceklaskowski
Contributor Author

How does it look now, Mr. @sryza? 🤔 Care to review again? 🙏

@jaceklaskowski requested a review from sryza December 15, 2025 20:46