` element. And `document.querySelector('.my-class')` (notice the period `.`) will find the first element with the class `my-class`, such as `<div class="my-class">` or `<p class="my-class">`.
@@ -65,7 +64,7 @@ When we look more closely by hovering over the result in the Console, we find th

-We need a different function: [`document.querySelectorAll()`](../../../glossary/concepts/querying_css_selectors.md) (notice the `All` at the end). This function does not find only the first element, but all the elements that match the provided selector.
+We need a different function: [`document.querySelectorAll()`](../../glossary/concepts/querying_css_selectors.md) (notice the `All` at the end). This function does not find only the first element, but all the elements that match the provided selector.
Run the following function in the Console:
@@ -204,4 +203,4 @@ price.textContent.match(/((\d+,?)+.?(\d+)?)/)[0];
## Next up {#next}
-This concludes our lesson on extracting and cleaning data using DevTools. Using CSS selectors, we were able to find the HTML element that contains data about our favorite Sony subwoofer and then extract the data. In the [next lesson](./devtools_continued.md), we will learn how to extract information not only about the subwoofer, but about all the products on the page.
+This concludes our lesson on extracting and cleaning data using DevTools. Using CSS selectors, we were able to find the HTML element that contains data about our favorite Sony subwoofer and then extract the data. In the [next lesson](./05_devtools_continued.md), we will learn how to extract information not only about the subwoofer, but about all the products on the page.
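The hunk above references the lesson's price-cleaning expression, `price.textContent.match(/((\d+,?)+.?(\d+)?)/)[0]`. A quick, runnable illustration of how that regex behaves on typical price strings (the sample strings are made up for the demo):

```javascript
// Extract the numeric part of a price string using the same
// regex as the lesson's final extraction step, then convert
// it to a number by stripping thousands separators.
const priceRegex = /((\d+,?)+.?(\d+)?)/;

const samples = ['Sale price$158.99', 'From $1,499.00'];
for (const text of samples) {
  const raw = text.match(priceRegex)[0];        // e.g. "158.99" or "1,499.00"
  const price = Number(raw.replace(/,/g, ''));  // e.g. 158.99 or 1499
  console.log(raw, '->', price);
}
```

Note that the regex only isolates the digits (and separators); converting to a number is a separate step, which is why the lesson treats extraction and cleaning as distinct concerns.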
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/devtools_continued.md b/sources/academy/webscraping/scraping_basics_javascript/05_devtools_continued.md
similarity index 95%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/devtools_continued.md
rename to sources/academy/webscraping/scraping_basics_javascript/05_devtools_continued.md
index 79278386a..e798fcbcf 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/devtools_continued.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/05_devtools_continued.md
@@ -1,7 +1,6 @@
---
-title: Extracting data with DevTools
+title: "Data extraction: Extracting data with DevTools"
description: Continue learning how to extract data from a website using browser DevTools, CSS selectors, and JavaScript via the DevTools console.
-sidebar_position: 3
slug: /web-scraping-for-beginners/data-extraction/devtools-continued
---
@@ -94,4 +93,4 @@ And third, we wrapped this data extraction logic in a **loop** to automatically
And that's it! With a bit of trial and error, you will be able to extract data from any webpage that's loaded in your browser. This is a useful skill on its own. It will save you time copy-pasting stuff when you need data for a project.
-More importantly though, it taught you the basics to start programming your own scrapers. In the [next lessons](./computer_preparation.md), we will teach you how to create your own web data extraction script using JavaScript and Node.js.
+More importantly though, it taught you the basics to start programming your own scrapers. In the [next lessons](./06_computer_preparation.md), we will teach you how to create your own web data extraction script using JavaScript and Node.js.
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/computer_preparation.md b/sources/academy/webscraping/scraping_basics_javascript/06_computer_preparation.md
similarity index 94%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/computer_preparation.md
rename to sources/academy/webscraping/scraping_basics_javascript/06_computer_preparation.md
index c4b9baf78..347d1ca3d 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/computer_preparation.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/06_computer_preparation.md
@@ -1,7 +1,6 @@
---
-title: Computer preparation
+title: "Data extraction: Computer preparation"
description: Set up your computer to be able to code scrapers with Node.js and JavaScript. Download Node.js and npm and run a Hello World script.
-sidebar_position: 4
slug: /web-scraping-for-beginners/data-extraction/computer-preparation
---
@@ -64,4 +63,4 @@ You should see **Hello World** printed in your terminal. If you do, congratulati
## Next up {#next}
-You have your computer set up correctly for development, and you've run your first script. Great! In the [next lesson](./project_setup.md) we'll set up your project to download a website's HTML using Node.js instead of a browser.
+You have your computer set up correctly for development, and you've run your first script. Great! In the [next lesson](./07_project_setup.md) we'll set up your project to download a website's HTML using Node.js instead of a browser.
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/project_setup.md b/sources/academy/webscraping/scraping_basics_javascript/07_project_setup.md
similarity index 95%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/project_setup.md
rename to sources/academy/webscraping/scraping_basics_javascript/07_project_setup.md
index 72b146a40..cb2393c50 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/project_setup.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/07_project_setup.md
@@ -1,7 +1,6 @@
---
-title: Project setup
+title: "Data extraction: Project setup"
description: Create a new project with npm and Node.js. Install necessary libraries, and test that everything works before starting the next lesson.
-sidebar_position: 5
slug: /web-scraping-for-beginners/data-extraction/project-setup
---
@@ -78,4 +77,4 @@ If you see **it works!** printed in your terminal, great job! You set up everyth
## Next up {#next}
-With the project set up, the [next lesson](./node_js_scraper.md) will show you how to use **got-scraping** to download the website's HTML and extract data from it with Cheerio.
+With the project set up, the [next lesson](./08_node_js_scraper.md) will show you how to use **got-scraping** to download the website's HTML and extract data from it with Cheerio.
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/node_js_scraper.md b/sources/academy/webscraping/scraping_basics_javascript/08_node_js_scraper.md
similarity index 93%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/node_js_scraper.md
rename to sources/academy/webscraping/scraping_basics_javascript/08_node_js_scraper.md
index 746cd7103..6d79420bc 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/node_js_scraper.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/08_node_js_scraper.md
@@ -1,7 +1,6 @@
---
-title: Scraping with Node.js
+title: "Data extraction: Scraping with Node.js"
description: Learn how to use JavaScript and Node.js to create a web scraper, plus take advantage of the Cheerio and Got-scraping libraries to make your job easier.
-sidebar_position: 6
slug: /web-scraping-for-beginners/data-extraction/node-js-scraper
---
@@ -13,7 +12,7 @@ Finally, we have everything ready to start scraping! Yes, the setup was a bit da
## Downloading HTML {#downloading-html}
-We will use the `got-scraping` library to download the HTML of products that are [on sale in the Warehouse store](https://warehouse-theme-metal.myshopify.com/collections/sales). We already worked with this page earlier in the [Extracting Data with DevTools](./using_devtools.md) lessons.
+We will use the `got-scraping` library to download the HTML of products that are [on sale in the Warehouse store](https://warehouse-theme-metal.myshopify.com/collections/sales). We already worked with this page earlier in the [Extracting Data with DevTools](./04_using_devtools.md) lessons.
Replace the contents of your **main.js** file with this code:
@@ -34,7 +33,7 @@ Now run the script using the `node main.js` command from the previous lesson. Af
## Parsing HTML {#parsing-html}
-Having the HTML printed to the terminal is not very helpful. To extract the data, we first have to parse it. Parsing the HTML allows us to query the individual HTML elements, similarly to the way we did it in the browser in the [Extracting Data with DevTools](./using_devtools.md) lessons.
+Having the HTML printed to the terminal is not very helpful. To extract the data, we first have to parse it. Parsing the HTML allows us to query the individual HTML elements, similarly to the way we did it in the browser in the [Extracting Data with DevTools](./04_using_devtools.md) lessons.
To parse the HTML, we'll use the `cheerio` library. Replace the code in your **main.js** with the following code:
@@ -70,4 +69,4 @@ The script first downloaded the page's HTML using the Got Scraping library. Then
## Next up {#next}
-In the [next lesson](./node_continued.md) we will learn more about Cheerio and use it to extract all the products' data from Fakestore.
+In the [next lesson](./09_node_continued.md) we will learn more about Cheerio and use it to extract all the products' data from Fakestore.
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/node_continued.md b/sources/academy/webscraping/scraping_basics_javascript/09_node_continued.md
similarity index 96%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/node_continued.md
rename to sources/academy/webscraping/scraping_basics_javascript/09_node_continued.md
index 1fdb51e7e..bac4abacd 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/node_continued.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/09_node_continued.md
@@ -1,7 +1,6 @@
---
-title: Extracting data with Node.js
+title: "Data extraction: Extracting data with Node.js"
description: Continue learning how to create a web scraper with Node.js and Cheerio. Learn how to parse HTML and print the results of the data your scraper has collected.
-sidebar_position: 7
slug: /web-scraping-for-beginners/data-extraction/node-continued
---
@@ -9,7 +8,7 @@ slug: /web-scraping-for-beginners/data-extraction/node-continued
---
-In the first part of the Node.js tutorial we downloaded the HTML of our [Warehouse store](https://warehouse-theme-metal.myshopify.com/collections/sales) and parsed it with Cheerio. Now, we will replicate the extraction logic from the [Extracting Data with DevTools](./using_devtools.md) lessons and finish our scraper.
+In the first part of the Node.js tutorial we downloaded the HTML of our [Warehouse store](https://warehouse-theme-metal.myshopify.com/collections/sales) and parsed it with Cheerio. Now, we will replicate the extraction logic from the [Extracting Data with DevTools](./04_using_devtools.md) lessons and finish our scraper.
## Querying data with Cheerio {#querying-with-cheerio}
@@ -175,4 +174,4 @@ Great job! 🎉🎉
## Next up {#next}
-What's next? While we were able to extract the data, it's not super useful to have it printed to the terminal. In the [next, bonus lesson](./save_to_csv.md), we will learn how to convert the data to a CSV and save it to a file.
+What's next? While we were able to extract the data, it's not super useful to have it printed to the terminal. In the [next, bonus lesson](./10_save_to_csv.md), we will learn how to convert the data to a CSV and save it to a file.
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/save_to_csv.md b/sources/academy/webscraping/scraping_basics_javascript/10_save_to_csv.md
similarity index 91%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/save_to_csv.md
rename to sources/academy/webscraping/scraping_basics_javascript/10_save_to_csv.md
index b6ec1b7df..a81242ca5 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/save_to_csv.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/10_save_to_csv.md
@@ -1,7 +1,6 @@
---
-title: Saving results to CSV
+title: "Data extraction: Saving results to CSV"
description: Learn how to save the results of your scraper's collected data to a CSV file that can be opened in Excel, Google Sheets, or any other spreadsheets program.
-sidebar_position: 8
slug: /web-scraping-for-beginners/data-extraction/save-to-csv
---
@@ -140,4 +139,4 @@ This marks the end of the **Basics of data extraction** section of Web scraping
## Next up {#next}
-Next up are the [**Basics of crawling**](../crawling/index.md). You already know how to build a scraper that finds all the products on sale in the [Warehouse Store](https://warehouse-theme-metal.myshopify.com/collections/sales). In the [**Basics of crawling**](../crawling/index.md) section you will learn how to open individual product pages of those products and scrape information that's not available on the listing page, like SKUs, descriptions or reviews.
+Next up are the [**Basics of crawling**](./11_crawling.md). You already know how to build a scraper that finds all the products on sale in the [Warehouse Store](https://warehouse-theme-metal.myshopify.com/collections/sales). In the [**Basics of crawling**](./11_crawling.md) section you will learn how to open individual product pages of those products and scrape information that's not available on the listing page, like SKUs, descriptions or reviews.
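The save-to-CSV lesson renamed above uses a CSV library, but the underlying conversion is easy to sketch by hand. A minimal, illustrative version (the product records are made up; a real scraper would use the library, which also handles edge cases this sketch ignores):

```javascript
// Minimal sketch of turning scraped records into CSV text.
// Every value is quoted, and embedded quotes are doubled per
// the CSV convention.
function toCsv(records) {
  const headers = Object.keys(records[0]);
  const escape = (value) => `"${String(value).replace(/"/g, '""')}"`;
  const lines = [
    headers.map(escape).join(','),
    ...records.map((rec) => headers.map((h) => escape(rec[h])).join(',')),
  ];
  return lines.join('\n');
}

const products = [
  { title: 'Sony SACS9 Subwoofer', price: 158.99 },
  { title: 'Klipsch Soundbar', price: 249.99 },
];
console.log(toCsv(products));
```

A dedicated library is still the right choice in the course's scraper, since it handles newlines inside values, missing fields, and streaming large datasets.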
diff --git a/sources/academy/webscraping/scraping_basics_javascript/crawling/index.md b/sources/academy/webscraping/scraping_basics_javascript/11_crawling.md
similarity index 68%
rename from sources/academy/webscraping/scraping_basics_javascript/crawling/index.md
rename to sources/academy/webscraping/scraping_basics_javascript/11_crawling.md
index 14ba32761..4882f6855 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/crawling/index.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/11_crawling.md
@@ -1,7 +1,6 @@
---
-title: Basics of crawling
+title: Crawling
description: Learn how to crawl the web with your scraper. How to extract links and URLs from web pages and how to manage the collected links to visit new pages.
-sidebar_position: 1.3
category: courses
slug: /web-scraping-for-beginners/crawling
---
@@ -12,7 +11,7 @@ slug: /web-scraping-for-beginners/crawling
---
-Welcome to the second section of our **Web scraping basics for JavaScript devs** course. In the [Basics of data extraction](../data_extraction/index.md) section, we learned how to extract data from a web page. Specifically, a template Shopify site called [Warehouse store](https://warehouse-theme-metal.myshopify.com/).
+Welcome to the second section of our **Web scraping basics for JavaScript devs** course. In the [Basics of data extraction](./02_data_extraction.md) section, we learned how to extract data from a web page. Specifically, a template Shopify site called [Warehouse store](https://warehouse-theme-metal.myshopify.com/).

@@ -20,7 +19,7 @@ In this section, we will take a look at moving between web pages, which we call
## How do you crawl? {#how-to-crawl}
-Crawling websites is a fairly straightforward process. We'll start by opening the first web page and extracting all the links (URLs) that lead to the other pages we want to visit. To do that, we'll use the skills learned in the [Basics of data extraction](../data_extraction/index.md) course. We'll add some extra filtering to make sure we only get the correct URLs. Then, we'll save those URLs, so in case our scraper crashes with an error, we won't have to extract them again. And, finally, we will visit those URLs one by one.
+Crawling websites is a fairly straightforward process. We'll start by opening the first web page and extracting all the links (URLs) that lead to the other pages we want to visit. To do that, we'll use the skills learned in the [Basics of data extraction](./02_data_extraction.md) course. We'll add some extra filtering to make sure we only get the correct URLs. Then, we'll save those URLs, so in case our scraper crashes with an error, we won't have to extract them again. And, finally, we will visit those URLs one by one.
At any point, we can extract URLs, data, or both. Crawling can be separate from data extraction, but it's not a requirement and, in most projects, it's actually easier and faster to do both at the same time. To summarize, it goes like this:
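The crawl-then-extract process described above can be sketched as a simple queue-based loop. This is purely illustrative: `fetchPage` is a stand-in for a real HTTP download, operating here on a tiny in-memory "site" so the flow is visible without network access:

```javascript
// Illustrative crawl loop: start URL -> extract links -> queue
// them -> visit each, extracting data along the way.
const site = {
  '/sales': { links: ['/product/1', '/product/2'], data: null },
  '/product/1': { links: [], data: 'Subwoofer' },
  '/product/2': { links: [], data: 'Soundbar' },
};

const fetchPage = (url) => site[url]; // pretend HTTP request

const queue = ['/sales'];             // 1. visit the start URL
const visited = new Set();
const results = [];

while (queue.length > 0) {
  const url = queue.shift();
  if (visited.has(url)) continue;     // don't visit a page twice
  visited.add(url);

  const page = fetchPage(url);        // 2. download the page
  queue.push(...page.links);          // 3. save newly found URLs
  if (page.data) results.push(page.data); // 4. extract data
}

console.log(results); // -> [ 'Subwoofer', 'Soundbar' ]
```

The `visited` set is what keeps a real crawler from looping forever when pages link back to each other.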
@@ -31,4 +30,4 @@ At any point, we can extract URLs, data, or both. Crawling can be separate from
## Next up {#next}
-First, let's make sure we all understand the foundations. In the [next lesson](./recap_extraction_basics.md) we will review the scraper code we already have from the [Basics of data extraction](../data_extraction/index.md) section of the course.
+First, let's make sure we all understand the foundations. In the [next lesson](./12_recap_extraction_basics.md) we will review the scraper code we already have from the [Basics of data extraction](./02_data_extraction.md) section of the course.
diff --git a/sources/academy/webscraping/scraping_basics_javascript/crawling/recap_extraction_basics.md b/sources/academy/webscraping/scraping_basics_javascript/12_recap_extraction_basics.md
similarity index 74%
rename from sources/academy/webscraping/scraping_basics_javascript/crawling/recap_extraction_basics.md
rename to sources/academy/webscraping/scraping_basics_javascript/12_recap_extraction_basics.md
index cdeea8cd5..dec56f856 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/crawling/recap_extraction_basics.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/12_recap_extraction_basics.md
@@ -1,7 +1,6 @@
---
-title: Recap - Data extraction
+title: "Crawling: Recap - Data extraction"
description: Review our e-commerce website scraper and refresh our memory about its code and the programming techniques we used to extract and save the data.
-sidebar_position: 1
slug: /web-scraping-for-beginners/crawling/recap-extraction-basics
---
@@ -11,7 +10,7 @@ slug: /web-scraping-for-beginners/crawling/recap-extraction-basics
---
-We finished off the [first section](../data_extraction/index.md) of the _Web scraping basics for JavaScript devs_ course by creating a web scraper in Node.js. The scraper collected all the on-sale products from [Warehouse store](https://warehouse-theme-metal.myshopify.com/collections/sales). Let's see the code with some comments added.
+We finished off the [first section](./02_data_extraction.md) of the _Web scraping basics for JavaScript devs_ course by creating a web scraper in Node.js. The scraper collected all the on-sale products from [Warehouse store](https://warehouse-theme-metal.myshopify.com/collections/sales). Let's see the code with some comments added.
```js
// First, we imported all the libraries we needed to
@@ -66,16 +65,16 @@ writeFileSync('products.csv', csv);
:::tip
-If some of the code is hard for you to understand, please review the [Basics of data extraction](../data_extraction/index.md) section. We will not go through the details again in this section about crawling.
+If some of the code is hard for you to understand, please review the [Basics of data extraction](./02_data_extraction.md) section. We will not go through the details again in this section about crawling.
:::
:::caution
-We are using JavaScript features like `import` statements and top-level `await`. If you see errors like _Cannot use import outside of a module_, please review the [Project setup lesson](../data_extraction/project_setup.md#modern-javascript), where we explain how to enable those features.
+We are using JavaScript features like `import` statements and top-level `await`. If you see errors like _Cannot use import outside of a module_, please review the [Project setup lesson](./07_project_setup.md#modern-javascript), where we explain how to enable those features.
:::
## Next up {#next}
-The [next lesson](./finding_links.md) is all about finding links to crawl on the [Warehouse store](https://warehouse-theme-metal.myshopify.com/collections/sales).
+The [next lesson](./13_finding_links.md) is all about finding links to crawl on the [Warehouse store](https://warehouse-theme-metal.myshopify.com/collections/sales).
diff --git a/sources/academy/webscraping/scraping_basics_javascript/crawling/finding_links.md b/sources/academy/webscraping/scraping_basics_javascript/13_finding_links.md
similarity index 90%
rename from sources/academy/webscraping/scraping_basics_javascript/crawling/finding_links.md
rename to sources/academy/webscraping/scraping_basics_javascript/13_finding_links.md
index 785d9396a..864f27851 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/crawling/finding_links.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/13_finding_links.md
@@ -1,11 +1,10 @@
---
-title: Finding links
+title: "Crawling: Finding links"
description: Learn what a link looks like in HTML and how to find and extract their URLs when web scraping. Using both DevTools and Node.js.
-sidebar_position: 2
slug: /web-scraping-for-beginners/crawling/finding-links
---
-import Example from '!!raw-loader!roa-loader!./finding_links.js';
+import Example from '!!raw-loader!roa-loader!./code_blocks/finding_links.js';
# Finding links {#finding-links}
@@ -49,7 +48,7 @@ Go to the [Warehouse store Sales category](https://warehouse-theme-metal.myshopi
DevTools Console is a fun playground, but Node.js is way more useful. Let's create a new file in our project called **crawler.js** and add some basic crawling code that prints all the links from the [Sales category of Warehouse](https://warehouse-theme-metal.myshopify.com/collections/sales).
-We'll start from a boilerplate that's very similar to the scraper we built in [Basics of data extraction](../data_extraction/node_js_scraper.md).
+We'll start from a boilerplate that's very similar to the scraper we built in [Basics of data extraction](./08_node_js_scraper.md).
{Example}
@@ -61,4 +60,4 @@ When you run the above code, you'll see quite a lot of links in the terminal. So
## Next Up {#next}
-The [next lesson](./filtering_links.md) will teach you how to select and filter links, so that your crawler will always work only with valid and useful URLs.
+The [next lesson](./14_filtering_links.md) will teach you how to select and filter links, so that your crawler will always work only with valid and useful URLs.
diff --git a/sources/academy/webscraping/scraping_basics_javascript/crawling/filtering_links.md b/sources/academy/webscraping/scraping_basics_javascript/14_filtering_links.md
similarity index 96%
rename from sources/academy/webscraping/scraping_basics_javascript/crawling/filtering_links.md
rename to sources/academy/webscraping/scraping_basics_javascript/14_filtering_links.md
index 34d4961aa..010d1a5fd 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/crawling/filtering_links.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/14_filtering_links.md
@@ -1,7 +1,6 @@
---
-title: Filtering links
+title: "Crawling: Filtering links"
description: When you extract links from a web page, you often end up with a lot of irrelevant URLs. Learn how to filter the links to only keep the ones you need.
-sidebar_position: 3
slug: /web-scraping-for-beginners/crawling/filtering-links
---
@@ -153,4 +152,4 @@ With that said, yes, filtering with CSS selectors is often the better and more r
## Next Up {#next}
-In the [next lesson](./relative_urls.md) we'll see how rewriting this code to Node.js is not so simple and learn about absolute and relative URLs in the process.
+In the [next lesson](./15_relative_urls.md) we'll see how rewriting this code to Node.js is not so simple and learn about absolute and relative URLs in the process.
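The link-filtering idea from the lesson renamed above can be shown in a few lines. A hedged sketch: the `/products/` prefix matches the Warehouse store's product URLs, but treat both the prefix and the sample `href` values as illustrative assumptions:

```javascript
// Filter extracted href values down to product detail links
// by keeping only those that start with the product-path prefix.
const hrefs = [
  '/collections/sales',
  '/products/sony-sacs9-subwoofer',
  '/cart',
  '/products/klipsch-soundbar',
];

const productLinks = hrefs.filter((href) => href.startsWith('/products/'));
console.log(productLinks);
```

As the lesson notes, a scoped CSS selector is often more robust than string filtering, but prefix checks remain a useful second line of defense.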
diff --git a/sources/academy/webscraping/scraping_basics_javascript/crawling/relative_urls.md b/sources/academy/webscraping/scraping_basics_javascript/15_relative_urls.md
similarity index 94%
rename from sources/academy/webscraping/scraping_basics_javascript/crawling/relative_urls.md
rename to sources/academy/webscraping/scraping_basics_javascript/15_relative_urls.md
index f9487c80a..29cd38bcb 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/crawling/relative_urls.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/15_relative_urls.md
@@ -1,7 +1,6 @@
---
-title: Relative URLs
+title: "Crawling: Relative URLs"
description: Learn about absolute and relative URLs used on web pages and how to work with them when parsing HTML with Cheerio in your scraper.
-sidebar_position: 4
slug: /web-scraping-for-beginners/crawling/relative-urls
---
@@ -31,7 +30,7 @@ We'll see why the difference between relative URLs and absolute URLs is importan
## Browser vs Node.js: The Differences {#browser-vs-node}
-Let's update the Node.js code from the [Finding links lesson](./finding_links.md) to see why links with relative URLs can be a problem.
+Let's update the Node.js code from the [Finding links lesson](./13_finding_links.md) to see why links with relative URLs can be a problem.
```js title=crawler.js
import { gotScraping } from 'got-scraping';
@@ -97,4 +96,4 @@ Cheerio can't resolve the URL itself, because until you provide the necessary in
## Next up {#next}
-The [next lesson](./first_crawl.md) will teach you how to use the collected URLs to crawl all the individual product pages.
+The [next lesson](./16_first_crawl.md) will teach you how to use the collected URLs to crawl all the individual product pages.
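The fix for relative URLs discussed in this lesson boils down to one standard API: the WHATWG `URL` class, built into both browsers and Node.js, resolves a relative URL against a base URL:

```javascript
// Resolve a relative href against the page's URL to get an
// absolute URL that got-scraping can actually download.
const base = 'https://warehouse-theme-metal.myshopify.com/collections/sales';
const relative = '/products/sony-sacs9-subwoofer';

const absolute = new URL(relative, base).href;
console.log(absolute);
// -> https://warehouse-theme-metal.myshopify.com/products/sony-sacs9-subwoofer
```

This is exactly the information the browser has implicitly (the current page's address) and Cheerio lacks, which is why the base URL must be supplied by hand in Node.js.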
diff --git a/sources/academy/webscraping/scraping_basics_javascript/crawling/first_crawl.md b/sources/academy/webscraping/scraping_basics_javascript/16_first_crawl.md
similarity index 91%
rename from sources/academy/webscraping/scraping_basics_javascript/crawling/first_crawl.md
rename to sources/academy/webscraping/scraping_basics_javascript/16_first_crawl.md
index 432d06f64..d22bfab84 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/crawling/first_crawl.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/16_first_crawl.md
@@ -1,7 +1,6 @@
---
-title: Your first crawl
+title: "Crawling: Your first crawl"
description: Learn how to crawl the web using Node.js, Cheerio and an HTTP client. Extract URLs from pages and use them to visit more websites.
-sidebar_position: 5
slug: /web-scraping-for-beginners/crawling/first-crawl
---
@@ -13,7 +12,7 @@ slug: /web-scraping-for-beginners/crawling/first-crawl
In the previous lessons, we learned what crawling is and how to extract URLs from a page's HTML. The only thing that remains is to write the code. Let's get right to it!
-> If the code starts to look too complex to you, don't worry. We're showing it for educational purposes, so that you can learn how crawling works. Near the end of this course, we'll show you a much easier and faster way to crawl, using a specialized scraping library. If you want, you can skip the details and [go there now](./pro_scraping.md).
+> If the code starts to look too complex to you, don't worry. We're showing it for educational purposes, so that you can learn how crawling works. Near the end of this course, we'll show you a much easier and faster way to crawl, using a specialized scraping library. If you want, you can skip the details and [go there now](./18_pro_scraping.md).
## Processing URLs {#processing-urls}
@@ -71,7 +70,7 @@ The code above is correct, but it's not robust. If something goes wrong, it will
In programming, you handle errors by catching and handling them. Typically by printing information that the error occurred and/or retrying.
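A hedged sketch of the catch-and-retry pattern described above. The `download` function here is a stand-in that fails twice before succeeding, so the retry path is actually exercised (a real crawler would call got-scraping instead):

```javascript
// Simulated flaky operation: fails on the first two attempts.
let attemptsMade = 0;
async function download(url) {
  attemptsMade += 1;
  if (attemptsMade < 3) throw new Error(`Network error for ${url}`);
  return `<html>content of ${url}</html>`;
}

// Retry wrapper: print the error and try again, up to a limit.
async function withRetries(fn, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      console.log(`Attempt ${attempt} failed: ${err.message}`);
      if (attempt === maxRetries) throw err;
    }
  }
}

withRetries(() => download('https://example.com'))
  .then((html) => console.log(html));
```

Rethrowing on the final attempt matters: it lets the caller decide whether one permanently broken URL should stop the whole crawl or just be logged and skipped.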
-> The scraping library we'll [show you in the following lessons](./pro_scraping.md) handles errors and retries automatically for you.
+> The scraping library we'll [show you in the following lessons](./18_pro_scraping.md) handles errors and retries automatically for you.
```js title=crawler.js
import { gotScraping } from 'got-scraping';
@@ -116,8 +115,8 @@ for (const url of productUrls) {
At the time of writing, none of the links have failed; however, as you crawl more pages, you will surely hit a few errors. The important thing is that the crawler will no longer crash if an error does in fact occur, and that it will be able to download the HTML from the working product links.
-> If you thought that the crawl was taking too long to complete, the [scraping library](./pro_scraping.md) we keep referring to will help once again. It automatically parallelizes the downloads and processing of HTML, which leads to significant speed improvements.
+> If you thought that the crawl was taking too long to complete, the [scraping library](./18_pro_scraping.md) we keep referring to will help once again. It automatically parallelizes the downloads and processing of HTML, which leads to significant speed improvements.
## Next up {#next}
-In the [next lesson](./scraping_the_data.md), we will complete the scraper by extracting data about all the products from their individual pages.
+In the [next lesson](./17_scraping_the_data.md), we will complete the scraper by extracting data about all the products from their individual pages.
diff --git a/sources/academy/webscraping/scraping_basics_javascript/crawling/scraping_the_data.md b/sources/academy/webscraping/scraping_basics_javascript/17_scraping_the_data.md
similarity index 92%
rename from sources/academy/webscraping/scraping_basics_javascript/crawling/scraping_the_data.md
rename to sources/academy/webscraping/scraping_basics_javascript/17_scraping_the_data.md
index 734c637d6..e1c375a5e 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/crawling/scraping_the_data.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/17_scraping_the_data.md
@@ -1,7 +1,6 @@
---
-title: Scraping data
+title: "Crawling: Scraping data"
description: Learn how to add data extraction logic to your crawler, which will allow you to extract data from all the websites you crawled.
-sidebar_position: 6
slug: /web-scraping-for-beginners/crawling/scraping-the-data
---
@@ -11,7 +10,7 @@ slug: /web-scraping-for-beginners/crawling/scraping-the-data
---
-At the [very beginning of this course](../index.md), we learned that the term web scraping usually means a combined process of data extraction and crawling. And this is exactly what we'll do in this lesson. We will take the crawling code from the previous lesson, and we will combine it with data extraction code and turn everything into a web scraper.
+At the [very beginning of this course](./index.md), we learned that the term web scraping usually means a combined process of data extraction and crawling. And this is exactly what we'll do in this lesson. We will take the crawling code from the previous lesson, and we will combine it with data extraction code and turn everything into a web scraper.
## Extracting data from a product detail page {#extracting-data}
@@ -32,7 +31,7 @@ const $ = cheerio.load(html);
// Attribute extraction code will go here.
```
-We will use the techniques learned in the [Basics of data extraction](../data_extraction/index.md) section to find and extract the following product attributes:
+We will use the techniques learned in the [Basics of data extraction](./02_data_extraction.md) section to find and extract the following product attributes:
- title
- vendor
@@ -42,7 +41,7 @@ We will use the techniques learned in the [Basics of data extraction](../data_ex

-> For brevity, we won't explain how to extract every attribute step-by-step. Review the [Basics of data extraction](../data_extraction/index.md) section to learn about DevTools and extracting data.
+> For brevity, we won't explain how to extract every attribute step-by-step. Review the [Basics of data extraction](./02_data_extraction.md) section to learn about DevTools and extracting data.
### Title
@@ -154,7 +153,7 @@ for (const url of productUrls) {
}
```
-We can see that the code is quite similar. Both scripts download HTML and then process the HTML. To understand how to put them together, we'll go back to the [original process of crawling](./index.md).
+We can see that the code is quite similar. Both scripts download HTML and then process the HTML. To understand how to put them together, we'll go back to the [original process of crawling](./11_crawling.md).
1. Visit the start URL.
2. Extract the next URLs (and data) and save them.
@@ -329,4 +328,4 @@ That's it for the absolute basics of crawling, but we're not done yet. We scrape
## Next up {#next}
-In the [next lesson](./pro_scraping.md) we will rewrite the scraper using an open-source web scraping library called [Crawlee](https://crawlee.dev). It will make the scraper more robust while speeding up development at the same time.
+In the [next lesson](./18_pro_scraping.md) we will rewrite the scraper using an open-source web scraping library called [Crawlee](https://crawlee.dev). It will make the scraper more robust while speeding up development at the same time.
diff --git a/sources/academy/webscraping/scraping_basics_javascript/crawling/pro_scraping.md b/sources/academy/webscraping/scraping_basics_javascript/18_pro_scraping.md
similarity index 97%
rename from sources/academy/webscraping/scraping_basics_javascript/crawling/pro_scraping.md
rename to sources/academy/webscraping/scraping_basics_javascript/18_pro_scraping.md
index b4b161641..edef900b7 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/crawling/pro_scraping.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/18_pro_scraping.md
@@ -1,7 +1,6 @@
---
-title: Professional scraping
+title: "Crawling: Professional scraping"
description: Learn how to build scrapers quicker and get better and more robust results by using Crawlee, an open-source library for scraping in Node.js.
-sidebar_position: 7
slug: /web-scraping-for-beginners/crawling/pro-scraping
---
@@ -35,7 +34,7 @@ Crawlee factors away and manages the dull and repetitive parts of web scraper de
- Request concurrency
- Queueing requests
- Data storage
-- Using and rotating [proxies](../../anti_scraping/mitigation/proxies.md)
+- Using and rotating [proxies](../anti_scraping/mitigation/proxies.md)
- Puppeteer/Playwright setup overhead
- [See all the features](https://crawlee.dev/docs/introduction)
@@ -241,4 +240,4 @@ Thanks to **Crawlee**, we were able to create a **faster and more robust scraper
## Next up {#next}
-In the [next lesson](./headless_browser.md) we'll show you how to turn this plain HTTP crawler into a **headless browser** scraper in only a few lines of code.
+In the [next lesson](./19_headless_browser.md) we'll show you how to turn this plain HTTP crawler into a **headless browser** scraper in only a few lines of code.
diff --git a/sources/academy/webscraping/scraping_basics_javascript/crawling/headless_browser.md b/sources/academy/webscraping/scraping_basics_javascript/19_headless_browser.md
similarity index 96%
rename from sources/academy/webscraping/scraping_basics_javascript/crawling/headless_browser.md
rename to sources/academy/webscraping/scraping_basics_javascript/19_headless_browser.md
index b57a81064..202d3849f 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/crawling/headless_browser.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/19_headless_browser.md
@@ -1,7 +1,6 @@
---
-title: Headless browsers
+title: "Crawling: Headless browsers"
description: Learn how to scrape the web with a headless browser using only a few lines of code. Chrome, Firefox, Safari, Edge - all are supported.
-sidebar_position: 8
slug: /web-scraping-for-beginners/crawling/headless-browser
---
@@ -127,7 +126,7 @@ One of the important benefits of using a browser is that it allows you to extrac
:::tip
-We discuss dynamic data at length in the [How to scrape dynamic pages](../../../tutorials/node_js/dealing_with_dynamic_pages.md) tutorial, and we also have a special lesson dedicated to it in our [Puppeteer & Playwright course](../../puppeteer_playwright/page/waiting.md).
+We discuss dynamic data at length in the [How to scrape dynamic pages](../../tutorials/node_js/dealing_with_dynamic_pages.md) tutorial, and we also have a special lesson dedicated to it in our [Puppeteer & Playwright course](../puppeteer_playwright/page/waiting.md).
:::
@@ -201,4 +200,4 @@ When you run the code, you'll find the recommended product names correctly extra
## Next up {#next}
-We learned how to scrape with Cheerio and Playwright, but how do we export the data for further processing? Let's learn that in the [next and final lesson](./exporting_data.md) of the Basics of crawling section.
+We learned how to scrape with Cheerio and Playwright, but how do we export the data for further processing? Let's learn that in the [next and final lesson](./20_exporting_data.md) of the Basics of crawling section.
diff --git a/sources/academy/webscraping/scraping_basics_javascript/crawling/exporting_data.md b/sources/academy/webscraping/scraping_basics_javascript/20_exporting_data.md
similarity index 91%
rename from sources/academy/webscraping/scraping_basics_javascript/crawling/exporting_data.md
rename to sources/academy/webscraping/scraping_basics_javascript/20_exporting_data.md
index d0d4baad8..98d8a37e6 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/crawling/exporting_data.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/20_exporting_data.md
@@ -1,7 +1,6 @@
---
-title: Exporting data
+title: "Crawling: Exporting data"
description: Learn how to export the data you scraped using Crawlee to CSV or JSON.
-sidebar_position: 9
slug: /web-scraping-for-beginners/crawling/exporting-data
---
@@ -109,4 +108,4 @@ await Dataset.exportToCSV('results');
## Next up {#next}
-And this is it for the [**Basics of crawling**](./index.md) section of the [**Web scraping basics for JavaScript devs**](../index.md) course. If you want to learn more, test your knowledge of the methods and concepts you learned in this course by moving forward with the [**challenge**](../challenge/index.md).
+And this is it for the [**Basics of crawling**](./11_crawling.md) section of the [**Web scraping basics for JavaScript devs**](./index.md) course. If you want to learn more, test your knowledge of the methods and concepts you learned in this course by moving forward with the [**challenge**](./21_challenge.md).
diff --git a/sources/academy/webscraping/scraping_basics_javascript/challenge/index.md b/sources/academy/webscraping/scraping_basics_javascript/21_challenge.md
similarity index 89%
rename from sources/academy/webscraping/scraping_basics_javascript/challenge/index.md
rename to sources/academy/webscraping/scraping_basics_javascript/21_challenge.md
index 301501de8..aa753138f 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/challenge/index.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/21_challenge.md
@@ -1,7 +1,6 @@
---
title: Challenge
description: Test your knowledge acquired in the previous sections of this course by building an Amazon scraper using Crawlee's CheerioCrawler!
-sidebar_position: 1.4
slug: /web-scraping-for-beginners/challenge
---
@@ -13,7 +12,7 @@ slug: /web-scraping-for-beginners/challenge
Before moving on to the other courses in the academy, we recommend following along with this section, as it combines everything you've learned in the previous lessons into one cohesive project that helps you prove to yourself that you've thoroughly understood the material.
-We recommend that you make sure you've gone through both the [data extraction](../data_extraction/index.md) and [crawling](../crawling/index.md) sections of this course to ensure the smoothest development process.
+We recommend that you make sure you've gone through both the [data extraction](./02_data_extraction.md) and [crawling](./11_crawling.md) sections of this course to ensure the smoothest development process.
## Learning {#learning}
@@ -21,7 +20,7 @@ Before continuing, it is highly recommended to do the following:
- Look over [how to build a crawler in Crawlee](https://crawlee.dev/docs/introduction/first-crawler) and ideally **code along**.
- Read [this short article](https://docs.apify.com/academy/node-js/request-labels-in-apify-actors) about [**request labels**](https://crawlee.dev/api/core/class/Request#label) (this will be extremely useful later on).
-- Check out [this tutorial](../../../tutorials/node_js/dealing_with_dynamic_pages.md) about dynamic pages.
+- Check out [this tutorial](../../tutorials/node_js/dealing_with_dynamic_pages.md) about dynamic pages.
- Read about the [RequestQueue](https://crawlee.dev/api/core/class/RequestQueue).
## Our task {#our-task}
@@ -42,7 +41,7 @@ Our crawler's input will look like this:
The goal at hand is to scrape all of the products from the first page of results for whatever keyword was provided (for our test case, it will be **iPhone**), then to scrape all available offers of each product and push the results to the dataset. For context, the offers for a product look like this:
-
+
In the end, we'd like our final output to look something like this:
@@ -85,4 +84,4 @@ From this course, you should have all the knowledge to build this scraper by you
The challenge can be completed using either [CheerioCrawler](https://crawlee.dev/api/cheerio-crawler/class/CheerioCrawler) or [PlaywrightCrawler](https://crawlee.dev/api/playwright-crawler/class/PlaywrightCrawler). Playwright is significantly slower but doesn't get blocked as much. You will learn the most by implementing both.
-Let's start off this section by [initializing and setting up](./initializing_and_setting_up.md) our project with the Crawlee CLI (don't worry, no additional installation is required).
+Let's start off this section by [initializing and setting up](./22_initializing_and_setting_up.md) our project with the Crawlee CLI (don't worry, no additional installation is required).
diff --git a/sources/academy/webscraping/scraping_basics_javascript/challenge/initializing_and_setting_up.md b/sources/academy/webscraping/scraping_basics_javascript/22_initializing_and_setting_up.md
similarity index 96%
rename from sources/academy/webscraping/scraping_basics_javascript/challenge/initializing_and_setting_up.md
rename to sources/academy/webscraping/scraping_basics_javascript/22_initializing_and_setting_up.md
index c0cf40bc1..96848cd3e 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/challenge/initializing_and_setting_up.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/22_initializing_and_setting_up.md
@@ -1,7 +1,6 @@
---
-title: Initializing & setting up
+title: "Challenge: Initializing & setting up"
description: When you extract links from a web page, you often end up with a lot of irrelevant URLs. Learn how to filter the links to only keep the ones you need.
-sidebar_position: 1
slug: /web-scraping-for-beginners/challenge/initializing-and-setting-up
---
@@ -76,4 +75,4 @@ Finally, we'll add the following input file to **INPUT.json** in the project's r
## Next up {#next}
-Cool! We're ready. But [let's discuss a bit about modularity](./modularity.md) before moving forward!
+Cool! We're ready. But [let's discuss a bit about modularity](./23_modularity.md) before moving forward!
diff --git a/sources/academy/webscraping/scraping_basics_javascript/challenge/modularity.md b/sources/academy/webscraping/scraping_basics_javascript/23_modularity.md
similarity index 95%
rename from sources/academy/webscraping/scraping_basics_javascript/challenge/modularity.md
rename to sources/academy/webscraping/scraping_basics_javascript/23_modularity.md
index e6d62c7b3..c5294f8f7 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/challenge/modularity.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/23_modularity.md
@@ -1,7 +1,6 @@
---
-title: Modularity
+title: "Challenge: Modularity"
description: Before you build your first web scraper with Crawlee, it is important to understand the concept of modularity in programming.
-sidebar_position: 2
slug: /web-scraping-for-beginners/challenge/modularity
---
@@ -13,7 +12,7 @@ slug: /web-scraping-for-beginners/challenge/modularity
Now that we've gotten our first request going, the first challenge is going to be selecting all of the resulting products on the page. Back in the browser, we'll use the DevTools hover tool to inspect a product.
-
+
**Bingo!** Each product seems to have a `data-asin` attribute, which includes the ASIN (product ID) data we want. Now, we can select each of these elements with this selector: `div > div[data-asin]:not([data-asin=""])`. Then, we'll scrape some data about each product, and push a request to the main product page so we can grab hold of the description.
@@ -123,8 +122,8 @@ Then, the labels can be used by importing `labels` and accessing `labels.START`,
This is not necessary, but it is best practice, as it can prevent dumb typos that can cause nasty bugs. For the rest of this lesson, all of the examples using labels will be using the imported versions.
-> If you haven't already read the **Best practices** lesson in the **Web scraping basics for JavaScript devs** course, please [give it a read](../best_practices.md).
+> If you haven't already read the **Best practices** lesson in the **Web scraping basics for JavaScript devs** course, please [give it a read](./25_best_practices.md).
## Next up {#next}
-Now that we've gotten that out of the way, we can finally continue with our Amazon scraper. [Let's do it](./scraping_amazon.md)!
+Now that we've gotten that out of the way, we can finally continue with our Amazon scraper. [Let's do it](./24_scraping_amazon.md)!
diff --git a/sources/academy/webscraping/scraping_basics_javascript/challenge/scraping_amazon.md b/sources/academy/webscraping/scraping_basics_javascript/24_scraping_amazon.md
similarity index 93%
rename from sources/academy/webscraping/scraping_basics_javascript/challenge/scraping_amazon.md
rename to sources/academy/webscraping/scraping_basics_javascript/24_scraping_amazon.md
index fa8291593..d4159150a 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/challenge/scraping_amazon.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/24_scraping_amazon.md
@@ -1,7 +1,6 @@
---
-title: Scraping Amazon
+title: "Challenge: Scraping Amazon"
description: Before you build your first web scraper with Crawlee, it is important to understand the concept of modularity in programming.
-sidebar_position: 4
slug: /web-scraping-for-beginners/challenge/scraping-amazon
---
@@ -28,7 +27,7 @@ router.addHandler(labels.PRODUCT, async ({ $ }) => {
```
-Great! But wait, where do we go from here? We need to go to the offers page next and scrape each offer, but how can we do that? Let's take a small break from writing the scraper and open up [Proxyman](../../../glossary/tools/proxyman.md) to analyze requests which we might be difficult to find in the network tab, then we'll click the button on the product page that loads up all of the product offers:
+Great! But wait, where do we go from here? We need to go to the offers page next and scrape each offer, but how can we do that? Let's take a small break from writing the scraper and open up [Proxyman](../../glossary/tools/proxyman.md) to analyze requests that might be difficult to find in the network tab, then we'll click the button on the product page that loads up all of the product offers:

@@ -48,7 +47,7 @@ Here's what this page looks like:
Wow, that's ugly. But for our scenario, this is really great. When we click the **View offers** button, we usually have to wait for the offers to load and render, which would mean we could have to switch our entire crawler to a **PuppeteerCrawler** or **PlaywrightCrawler**. The data on this page we've just found appears to be loaded statically, which means we can still use CheerioCrawler and keep the scraper as efficient as possible.
-> It's totally possible to scrape the same data as this crawler using [Puppeteer or Playwright](../../puppeteer_playwright/index.md); however, with this offers link found in Postman, we can follow the same workflow much more quickly with static HTTP requests using CheerioCrawler.
+> It's totally possible to scrape the same data as this crawler using [Puppeteer or Playwright](../puppeteer_playwright/index.md); however, with this offers link found in Proxyman, we can follow the same workflow much more quickly with static HTTP requests using CheerioCrawler.
First, we'll create a request for each product's offers page:
diff --git a/sources/academy/webscraping/scraping_basics_javascript/best_practices.md b/sources/academy/webscraping/scraping_basics_javascript/25_best_practices.md
similarity index 99%
rename from sources/academy/webscraping/scraping_basics_javascript/best_practices.md
rename to sources/academy/webscraping/scraping_basics_javascript/25_best_practices.md
index b3e1540cc..80aee9bb8 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/best_practices.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/25_best_practices.md
@@ -1,7 +1,6 @@
---
title: Best practices
description: Understand the standards and best practices that we here at Apify abide by to write readable, scalable, and maintainable code.
-sidebar_position: 1.5
slug: /web-scraping-for-beginners/best-practices
---
diff --git a/sources/academy/webscraping/scraping_basics_javascript/crawling/finding_links.js b/sources/academy/webscraping/scraping_basics_javascript/code_blocks/finding_links.js
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/crawling/finding_links.js
rename to sources/academy/webscraping/scraping_basics_javascript/code_blocks/finding_links.js
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/browser-devtools-console-commands.png b/sources/academy/webscraping/scraping_basics_javascript/images/browser-devtools-console-commands.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/browser-devtools-console-commands.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/browser-devtools-console-commands.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/browser-devtools-console.png b/sources/academy/webscraping/scraping_basics_javascript/images/browser-devtools-console.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/browser-devtools-console.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/browser-devtools-console.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/browser-devtools-element-selection.png b/sources/academy/webscraping/scraping_basics_javascript/images/browser-devtools-element-selection.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/browser-devtools-element-selection.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/browser-devtools-element-selection.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/browser-devtools-elements-tab.png b/sources/academy/webscraping/scraping_basics_javascript/images/browser-devtools-elements-tab.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/browser-devtools-elements-tab.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/browser-devtools-elements-tab.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/browser-devtools-hover.png b/sources/academy/webscraping/scraping_basics_javascript/images/browser-devtools-hover.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/browser-devtools-hover.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/browser-devtools-hover.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/browser-devtools-wikipedia.png b/sources/academy/webscraping/scraping_basics_javascript/images/browser-devtools-wikipedia.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/browser-devtools-wikipedia.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/browser-devtools-wikipedia.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/challenge/images/crawlee-create.png b/sources/academy/webscraping/scraping_basics_javascript/images/crawlee-create.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/challenge/images/crawlee-create.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/crawlee-create.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/csv-data-in-sheets.png b/sources/academy/webscraping/scraping_basics_javascript/images/csv-data-in-sheets.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/csv-data-in-sheets.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/csv-data-in-sheets.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-clean-price.png b/sources/academy/webscraping/scraping_basics_javascript/images/devtools-clean-price.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-clean-price.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/devtools-clean-price.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-cleaning-noise.png b/sources/academy/webscraping/scraping_basics_javascript/images/devtools-cleaning-noise.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-cleaning-noise.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/devtools-cleaning-noise.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-collection-class.png b/sources/academy/webscraping/scraping_basics_javascript/images/devtools-collection-class.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-collection-class.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/devtools-collection-class.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-collection-product-hover.png b/sources/academy/webscraping/scraping_basics_javascript/images/devtools-collection-product-hover.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-collection-product-hover.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/devtools-collection-product-hover.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-collection-product-name.png b/sources/academy/webscraping/scraping_basics_javascript/images/devtools-collection-product-name.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-collection-product-name.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/devtools-collection-product-name.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-collection-query-all.png b/sources/academy/webscraping/scraping_basics_javascript/images/devtools-collection-query-all.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-collection-query-all.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/devtools-collection-query-all.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-collection-query-hover.png b/sources/academy/webscraping/scraping_basics_javascript/images/devtools-collection-query-hover.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-collection-query-hover.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/devtools-collection-query-hover.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-collection-query.png b/sources/academy/webscraping/scraping_basics_javascript/images/devtools-collection-query.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-collection-query.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/devtools-collection-query.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-collection-warehouse.png b/sources/academy/webscraping/scraping_basics_javascript/images/devtools-collection-warehouse.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-collection-warehouse.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/devtools-collection-warehouse.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-count-products.png b/sources/academy/webscraping/scraping_basics_javascript/images/devtools-count-products.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-count-products.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/devtools-count-products.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-extract-product-price.png b/sources/academy/webscraping/scraping_basics_javascript/images/devtools-extract-product-price.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-extract-product-price.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/devtools-extract-product-price.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-extract-product-title.png b/sources/academy/webscraping/scraping_basics_javascript/images/devtools-extract-product-title.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-extract-product-title.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/devtools-extract-product-title.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-find-child-elements.png b/sources/academy/webscraping/scraping_basics_javascript/images/devtools-find-child-elements.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-find-child-elements.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/devtools-find-child-elements.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-print-all-products.png b/sources/academy/webscraping/scraping_basics_javascript/images/devtools-print-all-products.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-print-all-products.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/devtools-print-all-products.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-print-parent-text.png b/sources/academy/webscraping/scraping_basics_javascript/images/devtools-print-parent-text.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-print-parent-text.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/devtools-print-parent-text.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-product-titles.png b/sources/academy/webscraping/scraping_basics_javascript/images/devtools-product-titles.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-product-titles.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/devtools-product-titles.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-split-price.png b/sources/academy/webscraping/scraping_basics_javascript/images/devtools-split-price.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/devtools-split-price.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/devtools-split-price.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/crawling/images/filtering-product-detail-link.png b/sources/academy/webscraping/scraping_basics_javascript/images/filtering-product-detail-link.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/crawling/images/filtering-product-detail-link.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/filtering-product-detail-link.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/crawling/images/filtering-product-urls.png b/sources/academy/webscraping/scraping_basics_javascript/images/filtering-product-urls.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/crawling/images/filtering-product-urls.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/filtering-product-urls.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/crawling/images/filtering-regex-urls.png b/sources/academy/webscraping/scraping_basics_javascript/images/filtering-regex-urls.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/crawling/images/filtering-regex-urls.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/filtering-regex-urls.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/crawling/images/headless-dynamic-data.png b/sources/academy/webscraping/scraping_basics_javascript/images/headless-dynamic-data.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/crawling/images/headless-dynamic-data.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/headless-dynamic-data.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/node-scraper-title.png b/sources/academy/webscraping/scraping_basics_javascript/images/node-scraper-title.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/node-scraper-title.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/node-scraper-title.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/challenge/images/offers-page.jpg b/sources/academy/webscraping/scraping_basics_javascript/images/offers-page.jpg
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/challenge/images/offers-page.jpg
rename to sources/academy/webscraping/scraping_basics_javascript/images/offers-page.jpg
diff --git a/sources/academy/webscraping/scraping_basics_javascript/crawling/images/scraping-title.png b/sources/academy/webscraping/scraping_basics_javascript/images/scraping-title.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/crawling/images/scraping-title.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/scraping-title.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/challenge/images/view-offers-button.jpg b/sources/academy/webscraping/scraping_basics_javascript/images/view-offers-button.jpg
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/challenge/images/view-offers-button.jpg
rename to sources/academy/webscraping/scraping_basics_javascript/images/view-offers-button.jpg
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/vscode-create-file.png b/sources/academy/webscraping/scraping_basics_javascript/images/vscode-create-file.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/vscode-create-file.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/vscode-create-file.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/vscode-hello-world.png b/sources/academy/webscraping/scraping_basics_javascript/images/vscode-hello-world.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/vscode-hello-world.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/vscode-hello-world.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/vscode-npm-init.png b/sources/academy/webscraping/scraping_basics_javascript/images/vscode-npm-init.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/vscode-npm-init.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/vscode-npm-init.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/vscode-open-folder.png b/sources/academy/webscraping/scraping_basics_javascript/images/vscode-open-folder.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/vscode-open-folder.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/vscode-open-folder.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/vscode-open-terminal.png b/sources/academy/webscraping/scraping_basics_javascript/images/vscode-open-terminal.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/vscode-open-terminal.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/vscode-open-terminal.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/vscode-test-setup.png b/sources/academy/webscraping/scraping_basics_javascript/images/vscode-test-setup.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/vscode-test-setup.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/vscode-test-setup.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/vscode-type-module.png b/sources/academy/webscraping/scraping_basics_javascript/images/vscode-type-module.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/data_extraction/images/vscode-type-module.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/vscode-type-module.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/crawling/images/warehouse-links.png b/sources/academy/webscraping/scraping_basics_javascript/images/warehouse-links.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/crawling/images/warehouse-links.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/warehouse-links.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/crawling/images/warehouse-store.png b/sources/academy/webscraping/scraping_basics_javascript/images/warehouse-store.png
similarity index 100%
rename from sources/academy/webscraping/scraping_basics_javascript/crawling/images/warehouse-store.png
rename to sources/academy/webscraping/scraping_basics_javascript/images/warehouse-store.png
diff --git a/sources/academy/webscraping/scraping_basics_javascript/index.md b/sources/academy/webscraping/scraping_basics_javascript/index.md
index 064723fc3..950858fd5 100644
--- a/sources/academy/webscraping/scraping_basics_javascript/index.md
+++ b/sources/academy/webscraping/scraping_basics_javascript/index.md
@@ -33,9 +33,9 @@ When we set out to create the Academy, we wanted to build a complete guide to we
This is what you'll learn in the **Web scraping basics for JavaScript devs** course:
* [Web scraping basics for JavaScript devs](./index.md)
- * [Basics of data extraction](./data_extraction/index.md)
- * [Basics of crawling](./crawling/index.md)
- * [Best practices](./best_practices.md)
+ * [Basics of data extraction](./02_data_extraction.md)
+ * [Basics of crawling](./11_crawling.md)
+ * [Best practices](./25_best_practices.md)
## Requirements {#requirements}
@@ -61,7 +61,7 @@ Throughout the next lessons, we will sometimes use certain technologies and term
* [HTML](https://developer.mozilla.org/en-US/docs/Web/HTML)
* [HTTP protocol](https://developer.mozilla.org/en-US/docs/Web/HTTP)
-* [DevTools](./data_extraction/browser_devtools.md)
+* [DevTools](./03_browser_devtools.md)
### jQuery or Cheerio {#jquery-or-cheerio}
@@ -69,6 +69,6 @@ We'll be using the [**Cheerio**](https://www.npmjs.com/package/cheerio) package
## Next up {#next}
-The course begins with a small bit of theory and moves into some realistic and practical examples of extracting data from the most popular websites on the internet using your browser console. [Let's get to it!](./introduction.md)
+The course begins with a small bit of theory and moves into some realistic and practical examples of extracting data from the most popular websites on the internet using your browser console. [Let's get to it!](./01_introduction.md)
-> If you already have experience with HTML, CSS, and browser DevTools, feel free to skip to the [Basics of crawling](./crawling/index.md) section.
+> If you already have experience with HTML, CSS, and browser DevTools, feel free to skip to the [Basics of crawling](./11_crawling.md) section.