Rename the JS course from "Web scraping for beginners" to "Web scraping basics for JavaScript devs",
so that it is aligned with the Python course and the design described in #1015.
This change is deliberately limited to the course name. The name appears in many places that
could be improved or are questionable, but this change is not intended to be a complete overhaul of the course.
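Because the old title shows up in several capitalizations across the touched files (for example "Web Scraping for Beginners"), a small check can help confirm that no occurrence was missed. The following is only a sketch and is not part of this change: `findOldTitle` and `sample` are hypothetical names, and the function works on a string so that it stays independent of how the files are read.

```js
// Hypothetical helper (not part of this PR): report lines that still use the
// old course title, in any capitalization. The URL slug
// "web-scraping-for-beginners" is hyphenated, so it is intentionally not matched.
const OLD_TITLE = /web scraping for beginners/i;

function findOldTitle(text) {
    const hits = [];
    text.split('\n').forEach((line, index) => {
        if (OLD_TITLE.test(line)) {
            // Line numbers are 1-based to match editor and diff views.
            hits.push({ line: index + 1, text: line.trim() });
        }
    });
    return hits;
}

// Illustrative input only: a slug, an old title, and the new title.
const sample = [
    'slug: /web-scraping-for-beginners',
    'in our **Web Scraping for Beginners** course',
    'the **Web scraping basics for JavaScript devs** course',
].join('\n');

console.log(findOldTitle(sample));
```

Matching on spaces rather than hyphens keeps the unchanged `slug` and `link` values out of the report, which fits the intent of renaming only the displayed course name.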
sources/academy/homepage_content.json
+1-1
Original file line number
Diff line number
Diff line change
@@ -1,7 +1,7 @@
1
1
{
2
2
"Beginner courses": [
3
3
{
4
-
"title": "Web scraping for beginners",
4
+
"title": "Web scraping basics for JavaScript devs",
5
5
"link": "academy/web-scraping-for-beginners",
6
6
"description": "Learn how to develop web scrapers on your own computer with open-source tools. This web scraping course teaches you all the basics a scraper developer needs to know.",
sources/academy/platform/expert_scraping_with_apify/actors_webhooks.md
+2-2
@@ -15,7 +15,7 @@ Thus far, you've run Actors on the platform and written an Actor of your own, wh
15
15
16
16
## Advanced Actor overview {#advanced-actors}
17
17
18
-
In this course, we'll be working out of the Amazon scraper project from the **Web scraping for beginners** course. If you haven't already built that project, you can do it in three short lessons [here](../../webscraping/scraping_basics_javascript/challenge/index.md). We've made a few small modifications to the project with the Apify SDK, but 99% of the code is still the same.
18
+
In this course, we'll be working out of the Amazon scraper project from the **Web scraping basics for JavaScript devs** course. If you haven't already built that project, you can do it in three short lessons [here](../../webscraping/scraping_basics_javascript/challenge/index.md). We've made a few small modifications to the project with the Apify SDK, but 99% of the code is still the same.
19
19
20
20
Take another look at the files within your Amazon scraper project. You'll notice that there is a **Dockerfile**. Every single Actor has a Dockerfile (the Actor's **Image**) which tells Docker how to spin up a container on the Apify platform which can successfully run the Actor's code. "Apify Actors" is a serverless platform that runs multiple Docker containers. For a deeper understanding of Actor Dockerfiles, refer to the [Apify Actor Dockerfile docs](/sdk/js/docs/guides/docker-images#example-dockerfile).
21
21
@@ -41,7 +41,7 @@ Prior to moving forward, please read over these resources:
41
41
42
42
## Our task {#our-task}
43
43
44
-
In this task, we'll be building on top of what we already created in the [Web scraping for beginners](/academy/web-scraping-for-beginners/challenge) course's final challenge, so keep those files safe!
44
+
In this task, we'll be building on top of what we already created in the [Web scraping basics for JavaScript devs](/academy/web-scraping-for-beginners/challenge) course's final challenge, so keep those files safe!
45
45
46
46
Once our Amazon Actor has completed its run, we will, rather than sending an email to ourselves, call an Actor through a webhook. The Actor called will be a new Actor that we will create together, which will take the dataset ID as input, then subsequently filter through all of the results and return only the cheapest one for each product. All of the results of the Actor will be pushed to its default dataset.
sources/academy/platform/expert_scraping_with_apify/index.md
+2-2
@@ -18,15 +18,15 @@ This course will teach you the nitty gritty of what it takes to build pro-level
18
18
19
19
Before developing a pro-level Apify scraper, there are some important things you should have at least a bit of knowledge about (knowing the basics of each is enough to continue through this section), as well as some things that you should have installed on your system.
20
20
21
-
> If you've already gone through the [Web scraping for beginners course](../../webscraping/scraping_basics_javascript/index.md) and the first courses of the [Apify platform category](../apify_platform.md), you will be more than well equipped to continue on with the lessons in this course.
21
+
> If you've already gone through the [Web scraping basics for JavaScript devs](../../webscraping/scraping_basics_javascript/index.md) and the first courses of the [Apify platform category](../apify_platform.md), you will be more than well equipped to continue on with the lessons in this course.
[Puppeteer](https://pptr.dev/) is a library for running and controlling a [headless browser](../../webscraping/scraping_basics_javascript/crawling/headless_browser.md) in Node.js, and was developed at Google. The team working on it was hired by Microsoft to work on the [Playwright](https://playwright.dev/) project; therefore, many parallels can be seen between both the `puppeteer` and `playwright` packages. Proficiency in at least one of these will be good enough. -->
26
26
27
27
### Crawlee, Apify SDK, and the Apify CLI {#crawlee-apify-sdk-and-cli}
28
28
29
-
If you're feeling ambitious, you don't need to have any prior experience with Crawlee to get started with this course; however, at least 5–10 minutes of exposure is recommended. If you haven't yet tried out Crawlee, you can refer to [this lesson](../../webscraping/scraping_basics_javascript/crawling/pro_scraping.md) in the **Web scraping for beginners** course (and ideally follow along). To familiarize yourself with the Apify SDK, you can refer to the [Apify Platform](../apify_platform.md) category.
29
+
If you're feeling ambitious, you don't need to have any prior experience with Crawlee to get started with this course; however, at least 5–10 minutes of exposure is recommended. If you haven't yet tried out Crawlee, you can refer to [this lesson](../../webscraping/scraping_basics_javascript/crawling/pro_scraping.md) in the **Web scraping basics for JavaScript devs** course (and ideally follow along). To familiarize yourself with the Apify SDK, you can refer to the [Apify Platform](../apify_platform.md) category.
30
30
31
31
The Apify CLI will play a core role in the running and testing of the Actor you will build, so if you haven't gotten it installed already, please refer to [this short lesson](../../glossary/tools/apify_cli.md).
The core crawling problem comes to down to ensuring that we reliably find all detail pages on the target website or inside its categories. This is trivial for small sites. We just open the home page or category pages and paginate to the end as we did in the [Web Scraping for Beginners course](/academy/web-scraping-for-beginners).
8
+
The core crawling problem comes to down to ensuring that we reliably find all detail pages on the target website or inside its categories. This is trivial for small sites. We just open the home page or category pages and paginate to the end as we did in the [Web scraping basics for JavaScript devs](/academy/web-scraping-for-beginners) course.
9
9
10
10
Unfortunately, _most modern websites restrict pagination_ only to somewhere between 1 and 10,000 products. Solving this problem might seem relatively straightforward at first but there are multiple hurdles that we will explore in this lesson.
sources/academy/webscraping/advanced_web_scraping/index.md
+1-1
@@ -6,7 +6,7 @@ category: web scraping & automation
6
6
slug: /advanced-web-scraping
7
7
---
8
8
9
-
In [Web scraping for beginners](/academy/web-scraping-for-beginners) course, we have learned the necessary basics required to create a scraper. In the following courses, we learned more about specific practices and techniques that will help us to solve most of the problems we will face.
9
+
In the [Web scraping basics for JavaScript devs](/academy/web-scraping-for-beginners) course, we have learned the necessary basics required to create a scraper. In the following courses, we learned more about specific practices and techniques that will help us to solve most of the problems we will face.
10
10
11
11
In this course, we will take all of that knowledge, add a few more advanced concepts, and apply them to learn how to build a production-ready web scraper.
In the [**Web scraping for beginners**](../../scraping_basics_javascript/crawling/pro_scraping.md) course, we learned about the power of Crawlee, and how it can streamline the development process of web crawlers. You've already seen how powerful the `crawlee` package is; however, what you've been exposed to thus far is only the tip of the iceberg.
14
+
In the [**Web scraping basics for JavaScript devs**](../../scraping_basics_javascript/crawling/pro_scraping.md) course, we learned about the power of Crawlee, and how it can streamline the development process of web crawlers. You've already seen how powerful the `crawlee` package is; however, what you've been exposed to thus far is only the tip of the iceberg.
15
15
16
16
Because proxies are so widely used in the scraping world, Crawlee has built-in features for implementing them in an effective way. One of the main functionalities that comes baked into Crawlee is proxy rotation, which is when each request is sent through a different proxy from a proxy pool.
17
17
18
18
## Implementing proxies in a scraper {#implementing-proxies}
19
19
20
-
Let's borrow some scraper code from the end of the [pro-scraping](../../scraping_basics_javascript/crawling/pro_scraping.md) lesson in our **Web Scraping for Beginners** course and paste it into a new file called **proxies.js**. This code enqueues all of the product links on [demo-webstore.apify.org](https://demo-webstore.apify.org)'s on-sale page, then makes a request to each product page and scrapes data about each one:
20
+
Let's borrow some scraper code from the end of the [pro-scraping](../../scraping_basics_javascript/crawling/pro_scraping.md) lesson in our **Web scraping basics for JavaScript devs** course and paste it into a new file called **proxies.js**. This code enqueues all of the product links on [demo-webstore.apify.org](https://demo-webstore.apify.org)'s on-sale page, then makes a request to each product page and scrapes data about each one:
sources/academy/webscraping/puppeteer_playwright/index.md
+1-1
@@ -63,7 +63,7 @@ npm install puppeteer
63
63
</TabItem>
64
64
</Tabs>
65
65
66
-
> For a more in-depth guide on how to set up the basic environment we'll be using in this tutorial, check out the [**Computer preparation**](../scraping_basics_javascript/data_extraction/computer_preparation.md) lesson in the **Web scraping for beginners** course
66
+
> For a more in-depth guide on how to set up the basic environment we'll be using in this tutorial, check out the [**Computer preparation**](../scraping_basics_javascript/data_extraction/computer_preparation.md) lesson in the **Web scraping basics for JavaScript devs** course
sources/academy/webscraping/puppeteer_playwright/page/interacting_with_a_page.md
+1-1
@@ -55,7 +55,7 @@ With `page.click()`, Puppeteer and Playwright actually drag the mouse and click,
55
55
56
56
Notice that in the Playwright example, we are using a different selector than in the Puppeteer example. This is because Playwright supports [many custom CSS selectors](https://playwright.dev/docs/other-locators#css-elements-matching-one-of-the-conditions), such as the **has-text** pseudo class. As a rule of thumb, using text selectors is much more preferable to using regular selectors, as they are much less likely to break. If Google makes the sibling above the **Accept all** button a `<div>` element instead of a `<button>` element, our `button + button` selector will break. However, the button will always have the text **Accept all**; therefore, `button:has-text("Accept all")` is more reliable.
57
57
58
-
> If you're not already familiar with CSS selectors and how to find them, we recommend referring to [this lesson](../../scraping_basics_javascript/data_extraction/using_devtools.md) in the **Web scraping for beginners** course.
58
+
> If you're not already familiar with CSS selectors and how to find them, we recommend referring to [this lesson](../../scraping_basics_javascript/data_extraction/using_devtools.md) in the **Web scraping basics for JavaScript devs** course.
59
59
60
60
Then, we can type some text into an input field `<textarea>` with `page.type()`; passing a CSS selector as the first, and the string to input as the second parameter:
sources/academy/webscraping/scraping_basics_javascript/challenge/modularity.md
+1-1
@@ -123,7 +123,7 @@ Then, the labels can be used by importing `labels` and accessing `labels.START`,
123
123
124
124
This is not necessary, but it is best practice, as it can prevent dumb typos that can cause nasty bugs 🐞 For the rest of this lesson, all of the examples using labels will be using the imported versions.
125
125
126
-
> If you haven't already read the **Best practices** lesson in the **Web scraping for beginners** course, please [give it a read](../best_practices.md).
126
+
> If you haven't already read the **Best practices** lesson in the **Web scraping basics for JavaScript devs** course, please [give it a read](../best_practices.md).
sources/academy/webscraping/scraping_basics_javascript/challenge/scraping_amazon.md
+1-1
@@ -214,4 +214,4 @@ log.info('Crawl finished.');
214
214
215
215
Nice work! You've officially built your first scraper with Crawlee! You're now ready to take on the rest of the Apify Academy with confidence.
216
216
217
-
For now, this is the last section of the **Web scraping for beginners** course. If you want to learn more about web scraping, we recommend checking venturing out and following the other lessons in the Academy. We will keep updating the Academy with more content regularly until we cover all the advanced and expert topics we promised at the beginning.
217
+
For now, this is the last section of the **Web scraping basics for JavaScript devs** course. If you want to learn more about web scraping, we recommend checking venturing out and following the other lessons in the Academy. We will keep updating the Academy with more content regularly until we cover all the advanced and expert topics we promised at the beginning.
And this is it for the [**Basics of crawling**](./index.md) section of the [**Web scraping for beginners**](../index.md) course. If you want to learn more, test your knowledge of the methods and concepts you learned in this course by moving forward with the [**challenge**](../challenge/index.md).
112
+
And this is it for the [**Basics of crawling**](./index.md) section of the [**Web scraping basics for JavaScript devs**](../index.md) course. If you want to learn more, test your knowledge of the methods and concepts you learned in this course by moving forward with the [**challenge**](../challenge/index.md).
Welcome to the second section of our **Web scraping for beginners** course. In the [Basics of data extraction](../data_extraction/index.md) section, we learned how to extract data from a web page. Specifically, a template Shopify site called [Warehouse store](https://warehouse-theme-metal.myshopify.com/).
15
+
Welcome to the second section of our **Web scraping basics for JavaScript devs** course. In the [Basics of data extraction](../data_extraction/index.md) section, we learned how to extract data from a web page. Specifically, a template Shopify site called [Warehouse store](https://warehouse-theme-metal.myshopify.com/).
16
16
17
17

We finished off the [first section](../data_extraction/index.md) of the _Web Scraping for Beginners_ course by creating a web scraper in Node.js. The scraper collected all the on-sale products from [Warehouse store](https://warehouse-theme-metal.myshopify.com/collections/sales). Let's see the code with some comments added.
14
+
We finished off the [first section](../data_extraction/index.md) of the _Web scraping basics for JavaScript devs_ course by creating a web scraper in Node.js. The scraper collected all the on-sale products from [Warehouse store](https://warehouse-theme-metal.myshopify.com/collections/sales). Let's see the code with some comments added.
15
15
16
16
```js
17
17
// First, we imported all the libraries we needed to
sources/academy/webscraping/scraping_basics_javascript/data_extraction/node_continued.md
+1-1
@@ -164,7 +164,7 @@ After running the code, you will see this output in your terminal:
164
164
];
165
165
```
166
166
167
-
Congratulations! You completed the **Basics of data extraction** section of the Web scraping for beginners course. A quick recap of what you learned:
167
+
Congratulations! You completed the **Basics of data extraction** section of the Web scraping basics for JavaScript devs course. A quick recap of what you learned:
168
168
169
169
1. The basic terminology around web scraping, crawling, HTML, CSS and JavaScript.
170
170
2. How to use browser DevTools and Console to inspect web pages and manipulate them using CSS and JavaScript.
sources/academy/webscraping/scraping_basics_javascript/data_extraction/save_to_csv.md
+1-1
@@ -136,7 +136,7 @@ Finally, run it with `node main.js` in your terminal. After running it, you will
136
136
137
137

138
138
139
-
This marks the end of the **Basics of data extraction** section of Web scraping for beginners. If you enjoyed the course, give us a thumbs up down below and if you're eager to learn more...
139
+
This marks the end of the **Basics of data extraction** section of Web scraping basics for JavaScript devs. If you enjoyed the course, give us a thumbs up down below and if you're eager to learn more...
sources/academy/webscraping/scraping_basics_javascript/index.md
+5-5
@@ -1,18 +1,18 @@
1
1
---
2
-
title: Web scraping for beginners
2
+
title: Web scraping basics for JavaScript devs
3
3
description: Learn how to develop web scrapers with this comprehensive and practical course. Go from beginner to expert, all in one place.
4
4
sidebar_position: 1
5
5
category: web scraping & automation
6
6
slug: /web-scraping-for-beginners
7
7
---
8
8
9
-
# Web scraping for beginners {#welcome}
9
+
# Web scraping basics for JavaScript devs {#welcome}
10
10
11
11
**Learn how to develop web scrapers with this comprehensive and practical course. Go from beginner to expert, all in one place.**
12
12
13
13
---
14
14
15
-
Welcome to **Web scraping for beginners**, a comprehensive, practical and long form web scraping course that will take you from an absolute beginner to a successful web scraper developer. If you're looking for a quick start, we recommend trying [this tutorial](https://blog.apify.com/web-scraping-javascript-nodejs/) instead.
15
+
Welcome to **Web scraping basics for JavaScript devs**, a comprehensive, practical and long form web scraping course that will take you from an absolute beginner to a successful web scraper developer. If you're looking for a quick start, we recommend trying [this tutorial](https://blog.apify.com/web-scraping-javascript-nodejs/) instead.
16
16
17
17
This course is made by [Apify](https://apify.com), the web scraping and automation platform, but we will use only open-source technologies throughout all academy lessons. This means that the skills you learn will be applicable to any scraping project, and you'll be able to run your scrapers on any computer. No Apify account needed.
18
18
@@ -30,9 +30,9 @@ Scraper development is a fun and challenging way to learn web development, web t
30
30
31
31
When we set out to create the Academy, we wanted to build a complete guide to web scraping - a course that a beginner could use to create their first scraper, as well as a resource that professionals will continuously use to learn about advanced and niche web scraping techniques and technologies. All lessons include code examples and code-along exercises that you can use to immediately put your scraping skills into action.
32
32
33
-
This is what you'll learn in the **Web scraping for beginners** course:
33
+
This is what you'll learn in the **Web scraping basics for JavaScript devs** course:
34
34
35
-
*[Web scraping for beginners](./index.md)
35
+
*[Web scraping basics for JavaScript devs](./index.md)
36
36
*[Basics of data extraction](./data_extraction/index.md)