Ark Wiki Scraper 🦖📜

A Python script that scrapes data from the Ark Wiki, focusing on creature and item data. It uses requests, BeautifulSoup, and Selenium to extract the data and organize it into a JSON file.

Features 🚀

Scrapes Creature Data: Collects creature IDs, names, entity IDs, and blueprints from Ark Wiki's creature pages.
Scrapes Item Data: Collects item IDs, names, class names, and blueprints from Ark Wiki's item pages.
Configurable Blacklists: Allows exclusion of specific entries through blacklist.json, which holds separate lists for creatures, items, engrams, and beacons.
Automatic Driver Management: Uses webdriver-manager to handle ChromeDriver setup.

Requirements 📋

Python 3.7 or later
Chrome browser
Required Python packages (listed in requirements.txt)

Installation 🔧

Clone the Repository:

git clone https://github.com/jonxmitchell/ark-wiki-scraper.git
cd ark-wiki-scraper

Install Required Packages:

pip install -r requirements.txt

Configuration ⚙️

Edit blacklist.json: This file contains blacklists for creatures and items. Modify as needed to exclude specific entries.

{
	"Creatures": ["ExampleCreature"],
	"Items": ["ExampleItem"],
	"Engrams": ["ExampleEngram"],
	"Beacons": ["ExampleBeacon"]
}
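
How a blacklist with this layout might be applied can be sketched as follows. This is a minimal illustration rather than the script's actual code: the function names load_blacklist and filter_entries are hypothetical, and the sketch assumes entries are matched by exact name.

```python
import json
from pathlib import Path


def load_blacklist(path="blacklist.json"):
    """Read the blacklist file and return a dict of category -> set of names."""
    raw = json.loads(Path(path).read_text(encoding="utf-8"))
    return {category: set(names) for category, names in raw.items()}


def filter_entries(names, blacklist, category):
    """Drop every name listed under `category` (exact match is an assumption)."""
    blocked = blacklist.get(category, set())
    return [name for name in names if name not in blocked]


if __name__ == "__main__":
    # Inline sample mirroring the file layout shown above, so no file is needed.
    sample = {"Creatures": {"ExampleCreature"}, "Items": {"ExampleItem"}}
    print(filter_entries(["ExampleCreature", "OtherCreature"], sample, "Creatures"))
```

A category that is missing from the file simply filters nothing, so removing a key from blacklist.json would not break this sketch.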

Adjust URLs (if necessary):

Ensure the URLs for creature and item data are correctly set in the script.

Usage 🛠️

Run the Script:

python main.py

Output:
- ark_data.json: Created in the project directory; contains the scraped data, structured as follows:

{
	"Dinos": {
		"Creature_Name": {
			"ID": 1,
			"Type": "creature",
			"Name": "CreatureName",
			"EntityID": "EntityID",
			"Blueprint": "BlueprintPath"
		}
	},
	"Items": {
		"Item_Name": {
			"ID": 1,
			"Type": "ItemType",
			"Name": "ItemName",
			"ClassName": "ClassName",
			"Blueprint": "BlueprintPath"
		}
	},
	"Engrams": {
		"Engram_Name": {
			"ID": 1,
			"Type": "engram",
			"Name": "EngramName",
			"Blueprint": "BlueprintPath"
		}
	},
	"Beacons": {
		"Beacon_Name": {
			"ID": 1,
			"Type": "beacon",
			"Name": "BeaconName",
			"ClassName": "ClassName"
		}
	}
}
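
For consumers of the output file, a lookup against the structure above might look like the sketch below. The helper names load_ark_data and find_blueprint are hypothetical and not part of the script; the sketch only assumes the section and field names shown in the sample.

```python
import json


def load_ark_data(path="ark_data.json"):
    """Load the scraped data from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def find_blueprint(data, section, name):
    """Return the Blueprint of the first entry in `section` whose Name matches.

    Returns None when the section or entry is missing, or when the entry
    has no Blueprint field (as with the Beacons sample above).
    """
    for entry in data.get(section, {}).values():
        if entry.get("Name") == name:
            return entry.get("Blueprint")
    return None


if __name__ == "__main__":
    # Inline sample using the placeholder values from the structure above.
    sample = {
        "Dinos": {
            "Creature_Name": {
                "ID": 1,
                "Type": "creature",
                "Name": "CreatureName",
                "EntityID": "EntityID",
                "Blueprint": "BlueprintPath",
            }
        }
    }
    print(find_blueprint(sample, "Dinos", "CreatureName"))
```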

Troubleshooting 🛠️

Driver Issues: If you encounter issues with ChromeDriver, ensure that your Chrome browser version matches the ChromeDriver version. The script uses webdriver-manager to handle this automatically.
Dependencies: If you face import errors, make sure all required packages are installed and that you are using the correct Python interpreter.
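
To narrow down import errors, a quick check like the one below can show which interpreter is running and which modules it cannot find. This is a sketch, not part of the script: the module list is an assumption based on the packages named in this README (requests, BeautifulSoup as bs4, Selenium, webdriver-manager as webdriver_manager), and requirements.txt remains the authoritative list.

```python
import importlib.util
import sys

# Import names assumed from the packages this README mentions;
# check requirements.txt for the authoritative list.
REQUIRED_MODULES = ["requests", "bs4", "selenium", "webdriver_manager"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names the current interpreter cannot locate."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Printing the interpreter path helps spot a wrong virtual environment.
    print("Interpreter:", sys.executable)
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```

If the interpreter path is not the one where the packages were installed, running pip through that interpreter (python -m pip install -r requirements.txt) keeps the two in sync.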

Contributing 🤝

Feel free to submit issues or pull requests. Contributions are welcome!

Acknowledgments 🙌

Ark Wiki for the data.
