Install

clone repo
cd scrapper
go build .
run exe file

config.json

It's important to first create config.json file. It should look like this:

{
  "name": "Example Shop",
  "url": "https://example.com",
  "output": "output.txt",
  "category_regex": "category_regex_pattern",
  "product_regex": "product_regex_pattern",
  "product": {
    "sku": "123456",
    "name": "Example Product",
    "price": "99.99",
    "old_price": "149.99",
    "currency": "USD",
    "url": "https://example.com/product",
    "availability": "In Stock",
    "description": "This is an example product description.",
    "image": "https://example.com/image.jpg",
    "category": "Example Category",
    "ean": "1234567890123",
    "brand": "Example Brand",
    "stock": "100"
  }
}

Best refer to config.go for more info.

** Info **

There is example_config.json file for easier start

If we just give query it will extract Text of page, but if we want to extract Attribute Value we need to say which attribute we want, by adding / after querty to element we want, and we need to provide name of attribute we want, eg. #my-element/att-value will extract value of att-value in element with id=my-element

ToDo ### Project status

Add next page xpath
Remove filtering urls
Save output to file (csv) more often
Extraction attributes
Concatenate/create array of values if more elements on same query exists
Multithreading
Fix bug with https and http in url. Desn's see http as https and vice versa

Name		Name	Last commit message	Last commit date
Latest commit History 30 Commits
examples		examples
src		src
.gitignore		.gitignore
readme.md		readme.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Install

config.json

ToDo ### Project status

About

Releases

Packages

Languages

DrEverr/Scrapper

Folders and files

Latest commit

History

Repository files navigation

Install

config.json

ToDo ### Project status

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages