Termux Web Scraper Example

This repository provides example projects demonstrating the usage of the termux-web-scraper framework, which enables running Selenium-based web scraping tasks on Android devices using Termux.

Prerequisites

Before you can use the Termux Web Scraper, you need to have the following installed on your Android device:

Termux: You can download Termux from the F-Droid or Google Play store.
Git: You can install Git in Termux by running pkg install git.

You also need to:

Disable Battery Optimization: Disable battery optimization for Termux to prevent it from being killed by the Android system.
Acquire a Wakelock: Acquire a wakelock in Termux to prevent the device from sleeping while your scraper is running.
Address Phantom Process Killing (Android 12+): On Android 12 and newer, you may need to disable phantom process killing to prevent Termux from being killed. You can do this by running the following command in an ADB shell:
```
./adb shell "settings put global settings_enable_monitor_phantom_procs false"
```

Project Structure

.
├── simple
│   ├── run.sh
│   └── ...
├── loop_error_ignore
│   ├── run.sh
│   └── ...
├── ...

This repository contains next example projects:

simple: A basic example that demonstrates how to perform a search on DuckDuckGo.
loop_error_ignore: An example that shows how to run the scraper in a loop and gracefully handle errors.

Getting Started

Get started by launching Termux on your Android device and cloning the repository:
```
git clone https://github.com/kpliuta/termux-web-scraper-example.git
```
Then, navigate to a project like simple and run the script:
```
cd termux-web-scraper-example/simple
./run.sh
```

Example Projects

Simple

The simple example demonstrates a basic web scraping scenario. It navigates to DuckDuckGo, searches for "Python Selenium Example", and saves a screenshot of the results.

Loop with Error Ignoring

The loop_error_ignore example showcases how to run the scraper in a continuous loop while ignoring any errors that may occur during the process. This is useful for long-running scraping tasks that need to be resilient to network issues or unexpected page changes. The script attempts to navigate to a non-existent URL, and the --loop-error-ignore flag ensures that the loop continues to run even when the navigation fails.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
loop_error_ignore		loop_error_ignore
simple		simple
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Termux Web Scraper Example

Prerequisites

Project Structure

Getting Started

Example Projects

Simple

Loop with Error Ignoring

License

About

Uh oh!

Releases

Packages

Languages

License

kpliuta/termux-web-scraper-example

Folders and files

Latest commit

History

Repository files navigation

Termux Web Scraper Example

Prerequisites

Project Structure

Getting Started

Example Projects

Simple

Loop with Error Ignoring

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages