feat: add lesson about using the platform #1424
We renamed the workflow in apify-sdk-python and apify-client-python.
Update LangChain integration to remove wrong links and information about RAG-Web-Browser @TC-MO please check English 🙏🏻 --------- Co-authored-by: Michał Olender <[email protected]>
This is a quite important part of sending Standby requests, and it was missing. I'm having a hard time coming up with some nice formatting, @TC-MO feel free to restructure as you deem fit. 
Docs of new input schema property `resourceType`
Add new datepicker `dateType` property to input schema specification.
Co-authored-by: Jiri Spilka <[email protected]>
Hide datepicker `dateType` property from docs based on: https://apify.slack.com/archives/C010Q0FBYG3/p1731684960993549

Force-pushed from a308047 to d44c772.
Thanks. Was there a discussion earlier about where this content should live? There is quite a lot of duplication with both https://docs.apify.com/platform and https://docs.apify.com/academy/apify-platform. The approach in JS was to keep the scraping tutorial separate from the platform.
I'm not against having the whole thing in the Python course (as it can specialize to Python devs' needs), but then we will have to maintain duplicate content, which tends to be a bit annoying.
@metalwarrior665 The discussion happened here: #1015 (comment). I don't want duplicate content, but this is a logical ending of the course:
The lesson is specific to the scraper we're building over the course of the lessons. You could say the same about the previous lesson about Crawlee, where the same content could be covered by the Crawlee docs.
Couldn't we use webp instead of png? The images would be about 5x smaller. Changes to the Python code are okay.
I don't mind using webp, but it's just about the size of this repo. The site has the images optimized automatically, at least that's what I remember @B4nan saying somewhere else in the comments.
The size of the repo is also important; if the difference is 5x, let's just go with webp. We use them pretty much exclusively in the crawlee blog posts too, for the same reason.
I’m already putting new images through optimizt every time before commit,
and it has an option to convert to webp. I just wonder if we could have it
as a pre-commit so I don’t have to think about it. But changing extension
breaks the path to the image, so I guess this must be a manual process…
Unless I write myself a Python script to do the magic… 😅
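A rough sketch of what the path-fixing half of such a script could look like, assuming the images are referenced with standard Markdown image syntax and that the conversion itself is still done separately (for example by optimizt). The names `rewrite_image_refs` and `rewrite_docs` are hypothetical, made up for illustration, and not part of any existing tooling in the repo:

```python
import re
from pathlib import Path

# Matches Markdown image syntax such as ![alt](path/to/file.png) and captures
# everything up to, but not including, the ".png" extension.
PNG_IMAGE_RE = re.compile(r"(!\[[^\]]*\]\([^)\s]+)\.png(?=[)\s])", re.IGNORECASE)


def rewrite_image_refs(markdown: str) -> str:
    """Point Markdown image references at .webp files instead of .png files."""
    return PNG_IMAGE_RE.sub(r"\1.webp", markdown)


def rewrite_docs(root: Path) -> list[Path]:
    """Rewrite every .md file under root in place and return the changed files."""
    changed = []
    for path in sorted(root.rglob("*.md")):
        original = path.read_text(encoding="utf-8")
        updated = rewrite_image_refs(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed
```

Calling `rewrite_docs(Path("sources"))` after converting the files would then update the references in one go. Plain links to PNG files (without the leading `!`) are left alone on purpose, since they may point at files that were not converted.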
I'm moving the discussion about images to a separate issue: #1549. Regarding this particular PR, the images here are technically already a part of this git repo now, so converting them would only add size, but if you want me to change them to webp, I'll do it.
I think that the structure that @honzajavorek proposed is the correct way of creating this course and integrating it with Academy. It aligns especially well with current work on trimming down
A few questions and suggested changes; otherwise LGTM.
sources/academy/webscraping/scraping_basics_python/13_platform.md
## Registering

First, let's [create a new Apify account](https://console.apify.com/sign-up). You'll go through a few checks to confirm you're human and your email is valid—annoying but necessary to prevent abuse of the platform.
This might be an opportunity to link to our docs? (Though it should be trivial to sign up, so I'll leave it up to your consideration.) ¯\_(ツ)_/¯
Yup, it should be trivial, but it's not. I tried it in an anonymous window.
With the extra sentence I wanted to provide some comfort to people who are following the lesson but are not 100% sure they want to register with Apify. They may feel that the lesson manipulates them into registering with a random service, so they might already go into the process with suspicion. And then, during account creation, the service also wants their phone number. Especially in Czechia, this is a big ask, and IMHO it would result in people just closing the tab and giving up on the lesson.
I cannot change your login procedure, and I kinda understand why it is like this, so I try to reduce the harm by comforting the person, or at least preparing them: they're about to go through this, it's expected, and here is why it's necessary.
I also intentionally don't mention the phone number, but dance around it. I don't even know if you require it always or just if the platform suspects a bot or something.
```text
$ apify login
...
Success: You are logged in to Apify as user1234!
```
We usually use `<YOUR_XYZ>` as placeholders; I'm not sure what your experience with that is. Is a mock username in this format better received by users?
I don't know. I think it doesn't matter here, because the only purpose of the example is to have similar sentences and a similar shape to what the user sees in their terminal. If they glance at it, it'll appear similar and they'll assume "I'm on the right path".
I don't expect them to compare it letter by letter or to check whether the username matches, and I don't expect them to juggle four Apify accounts in this case. In such cases I opt for low-key sample values for the variable parts rather than highly visible placeholders, which stand out and feel important.
There is no science behind it though, and I have no strong opinion on this. It's just my freestyle and this is how I think about it. Feel free to guide me.
They are part of your branch only, which will be wiped after we merge.
A thousand years ago I randomly stumbled upon two chapters (out of many) covering Git internals, and something tells me it's not that simple. But that's just my hunch; I'm no Git expert. I consulted an LLM, and it also says it's not this simple. I'm still happy to provide webp images if you don't mind the risk of added size.
Well, it is that simple. Nobody is going to pull your branch, especially given that it will be wiped on merge. They will pull master, and that's what we need to guard against getting fat. Edit: also, if you did this in a long-living branch, you could just fix the history and force-push; no need to "surrender because it's already there" :]
Force-pushed from f9923c9 to 6680bbb.
This didn't go well. I rewrote history with
New PR: #1556
I messed up #1424 trying to remove PNG files from commit history. This is a new PR with (hopefully) all the original commits correctly rewritten and cherry-picked. --------- Co-authored-by: Michał Olender <[email protected]>
Introducing the final lesson of the course, about deploying to the platform. This was quite challenging, as with every other sentence I grappled with bugs or behavior that wasn't really intuitive to me. On my journey I filed these:
- `apify init` (apify-cli#761)
- `apify init` to `apify actorize` (apify-cli#762)

I explored several approaches that were dead ends. The lesson now takes an approach where it starts a new project from a template and replaces parts of the template with the original scraper. That completely avoids `apify init` and should be robust with regard to possible future changes, such as migrating to `uv`, and so on.
I find the UI of the Apify console rather confusing and super complex, especially navigation, even as a user who has regularly visited the interface over the past year. Also, the UI seems to remember my last location or something like that, so every time I open it, it defaults to a different tab. One time it's Input, another time it's Last run, etc.
I'm no UX designer, so I can't help with that; I'm just sharing it here as feedback and as a fact that I took into account when creating the lesson. The only way I could think of to mitigate the confusion was to provide as many screenshots as possible. I also didn't dare to rely on where the student might land, so I make sure to reiterate which screen and which tab they should be on.
The lesson intentionally goes through updating the Actor so that the student knows how to do it and how to push new changes and build and run the Actor again and again. I opted to keep the student using the Input tab as the place from which they start the Actor, even though in reality they could press the Start button from other tabs too. I feel like that way it's less confusing, makes most sense, and they won't get distracted by all the other options that much.
I did my best to structure the lesson so that it leads from stating shortcomings of the current solution to understanding how the platform helps to solve them, because I think that's the most honest way to "sell" the platform.
Let me know what you think!