The smartphone eye-tracking app is now available on the App Store.
Train three models (iTracker, AFFNet, and MGazeNet) for gaze estimation and deploy them on phones or PCs. The experiments were performed on a workstation equipped with dual NVIDIA RTX 3090 graphics cards, an Intel Xeon Silver 4210 processor, and 256 GB of RAM. The software environment consisted of Ubuntu 18.04, CUDA 11.0, and Python 3.9.12.
-
Dataset preparation
Download the preprocessed GazeCapture dataset using this link. Per the IRB protocol approved by the research ethics committee governing the present study, the authors are not allowed to share any of the face images contained in the ZJUGaze dataset.
Once downloaded, move to the dataset directory and unzip the dataset:
cd GazeEstimation/dataset
cp path/to/gazecapture.zip ./
unzip gazecapture.zip
-
Train model
Start by unzipping the dataset if not done already:
unzip gazecapture.zip
Install the Python dependencies:
python -m pip install -r requirements.txt
Create a configuration file for the gaze estimation experiment based on the template GazeEstimation/config/config_itracker.yaml, then run model training:
# if you train iTracker
python -m torch.distributed.launch --nproc_per_node=2 --master_port=29900 train.py --world_size 2 --config_file config/config_itracker.yaml
# if you train with your own config file
python -m torch.distributed.launch --nproc_per_node=2 --master_port=29900 train.py --world_size 2 --config_file {path to configuration file}
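For reference, a training configuration typically specifies the dataset location, model choice, and optimization settings. The sketch below is purely illustrative and shows how such a file could be parsed; every field name is an assumption, so consult config/config_itracker.yaml for the actual schema:

import yaml  # requires PyYAML

# Hypothetical configuration sketch; field names are assumptions,
# not the repository's actual keys
text = """
dataset:
  root: dataset/gazecapture
model:
  name: itracker   # or affnet / mgazenet
train:
  batch_size: 128
  epochs: 30
  lr: 0.001
"""
config = yaml.safe_load(text)
print(config["model"]["name"])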
Run model evaluation:
python test.py --config_file {path to the configuration file}
-
Convert model
Convert the PyTorch checkpoint model into an ONNX model, and then convert the ONNX model into an MNN model.
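A minimal conversion sketch, assuming a trained PyTorch model with a 224x224 RGB input; the tiny stand-in network, file names, and input shape are placeholders, and the MNNConvert invocation follows MNN's documented converter CLI:

import torch
import torch.nn as nn

# Stand-in for the trained gaze network; load your real checkpoint instead
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.AdaptiveAvgPool2d(1),
                      nn.Flatten(), nn.Linear(8, 2))
model.eval()

# Export to ONNX with a dummy input matching the model's expected shape
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy, "model.onnx",
                  input_names=["input"], output_names=["gaze"],
                  opset_version=11)

# Then convert ONNX to MNN with MNN's converter CLI, e.g.:
#   MNNConvert -f ONNX --modelFile model.onnx --MNNModel model.mnn --bizCode mnn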
-
Deploy
Deploy your MNN model to either a smartphone or a PC for real-time gaze estimation. Please refer to the website.
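For a quick sanity check on a PC, the converted model can be run with MNN's Python session API. The snippet below is a hedged sketch: the model path, input shape, and two-dimensional gaze output are assumptions, and exact API names may vary across MNN versions:

import numpy as np
import MNN  # pip install MNN

# Load the converted model and create an inference session
interpreter = MNN.Interpreter("model.mnn")  # hypothetical path
session = interpreter.createSession()
input_tensor = interpreter.getSessionInput(session)

# Feed a dummy frame; a real pipeline would supply the cropped face/eye images
data = np.random.rand(1, 3, 224, 224).astype(np.float32)
tmp_in = MNN.Tensor((1, 3, 224, 224), MNN.Halide_Type_Float,
                    data, MNN.Tensor_DimensionType_Caffe)
input_tensor.copyFrom(tmp_in)

interpreter.runSession(session)
out = interpreter.getSessionOutput(session)

# Copy to a host tensor before reading the predicted gaze coordinates
tmp_out = MNN.Tensor((1, 2), MNN.Halide_Type_Float,
                     np.zeros((1, 2), dtype=np.float32),
                     MNN.Tensor_DimensionType_Caffe)
out.copyToHostTensor(tmp_out)
print(tmp_out.getData())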
-
Finetuning
Run the following commands to launch the finetuning experiments on the ZJUGaze dataset:
cd GazeEstimation/finetuing
python finetuning_freezen.py --model_path {model_path} --data_source {data_source}
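The script name suggests that finetuning is performed with part of the network frozen. Below is a minimal sketch of that general idea in PyTorch, assuming a generic model; the stand-in network and the choice to train only the final layer are illustrative, not the repository's exact scheme:

import torch
import torch.nn as nn

# Stand-in for the pretrained gaze network; load the real checkpoint instead
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))

# Freeze every parameter, then unfreeze only the final layer for finetuning
for p in model.parameters():
    p.requires_grad = False
for p in model[-1].parameters():
    p.requires_grad = True

# The optimizer only sees the trainable (unfrozen) parameters
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)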
-
How to get features for calibration
cd GazeEstimation
python predict_features.py --model_path {model_path} --npy_save_dir {npy_save_dir}
Calibration is essential for mapping the relationship between ocular features and gaze coordinates in both appearance- and geometry-based video eye tracking. We utilize three swarm intelligence algorithms to optimize the hyperparameters of the support vector regressor. Below are the steps to run the experiments.
Before running the following code, please download the calibration feature dataset using this link. Then unzip the dataset and move all files and directories to SwarmIntelligentCalibration/calibration_data.
-
Hyperparameter search for support vector regressor
Navigate to the hyperparameter search directory and run the experiment:
cd SwarmIntelligentCalibration
python -m pip install -r requirements.txt
cd hyperparameter_search
python run.py
Swarm intelligence algorithms can be time-consuming, so please be patient during this process.
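To make the idea concrete, here is a minimal, self-contained sketch of swarm-based hyperparameter search for an SVR, using a basic particle swarm on synthetic data. It is illustrative only: the repository's three swarm algorithms (including the paper's multiverse optimization, MVO-SVR) differ, and the data, bounds, and coefficients below are all assumptions:

import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))                        # stand-in calibration features
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=200)  # stand-in gaze coordinate

# Search bounds for (C, gamma, epsilon); illustrative values
bounds = np.array([[0.1, 100.0], [1e-4, 1.0], [1e-3, 1.0]])

def fitness(p):
    # Higher is better: negative mean absolute error under 3-fold CV
    svr = SVR(C=p[0], gamma=p[1], epsilon=p[2])
    return cross_val_score(svr, X, y, cv=3,
                           scoring="neg_mean_absolute_error").mean()

# Basic particle swarm optimization over the three hyperparameters
n_particles, n_iters = 12, 20
pos = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_particles, 3))
vel = np.zeros_like(pos)
pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_fit.argmax()].copy()
for _ in range(n_iters):
    r1, r2 = rng.random((n_particles, 3)), rng.random((n_particles, 3))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, bounds[:, 0], bounds[:, 1])
    fit = np.array([fitness(p) for p in pos])
    better = fit > pbest_fit
    pbest[better], pbest_fit[better] = pos[better], fit[better]
    gbest = pbest[pbest_fit.argmax()].copy()
print("Best (C, gamma, epsilon):", gbest, "CV MAE:", -pbest_fit.max())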
-
Validate the searched hyperparameters
We assume that you have installed the Python packages in requirements.txt. If not, see above.
# For example, run smooth pursuit calibration
cd SwarmIntelligentCalibration/validation
python run.py
The heuristic filter is designed to reduce noise in gaze signals before detecting eye movements such as saccades and fixations. It relies on heuristics derived from an examination of noise patterns in raw gaze data, resembling expert systems' rules of thumb.
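As an illustration of the general idea (not the repository's exact implementation), a Stampe-style single-sample spike removal can be sketched as follows: a sample that jumps away from both of its neighbors is replaced by the nearer neighbor value.

import numpy as np

def heuristic_spike_filter(x):
    """Remove one-sample spikes: if x[i] lies outside both neighbors,
    replace it with the neighbor value closer to it."""
    y = np.asarray(x, dtype=float).copy()
    for i in range(1, len(y) - 1):
        is_spike = ((y[i] > y[i-1] and y[i] > y[i+1]) or
                    (y[i] < y[i-1] and y[i] < y[i+1]))
        if is_spike:
            y[i] = y[i-1] if abs(y[i] - y[i-1]) < abs(y[i] - y[i+1]) else y[i+1]
    return y

# Example: the single-sample spike at index 2 is suppressed
print(heuristic_spike_filter([0.0, 0.1, 5.0, 0.2, 0.3]))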
The One Euro filter is a low-latency filtering algorithm, functioning as a first-order low-pass filter with an adaptive cutoff frequency. It stabilizes the signal at low velocities by using a low cutoff and minimizes delay at higher velocities with a higher cutoff. This filter is computationally efficient and suitable for real-time applications.
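A minimal sketch of the One Euro filter (Casiez et al., 2012) is given below; the parameter values in the usage example are assumptions:

import math

class OneEuroFilter:
    """First-order low-pass filter whose cutoff adapts to signal speed:
    low cutoff (strong smoothing) when slow, high cutoff (low lag) when fast."""
    def __init__(self, min_cutoff=1.0, beta=0.01, d_cutoff=1.0):
        self.min_cutoff, self.beta, self.d_cutoff = min_cutoff, beta, d_cutoff
        self.x_prev = self.dx_prev = self.t_prev = None

    @staticmethod
    def _alpha(cutoff, dt):
        tau = 1.0 / (2.0 * math.pi * cutoff)
        return 1.0 / (1.0 + tau / dt)

    def __call__(self, x, t):
        if self.x_prev is None:
            self.x_prev, self.dx_prev, self.t_prev = x, 0.0, t
            return x
        dt = t - self.t_prev
        # Smooth the derivative, then adapt the cutoff to the estimated speed
        dx = (x - self.x_prev) / dt
        a_d = self._alpha(self.d_cutoff, dt)
        dx_hat = a_d * dx + (1.0 - a_d) * self.dx_prev
        cutoff = self.min_cutoff + self.beta * abs(dx_hat)
        a = self._alpha(cutoff, dt)
        x_hat = a * x + (1.0 - a) * self.x_prev
        self.x_prev, self.dx_prev, self.t_prev = x_hat, dx_hat, t
        return x_hat

# Example: filter a gaze x-coordinate stream sampled at 30 Hz
f = OneEuroFilter(min_cutoff=1.0, beta=0.007)
for i, x in enumerate([10.0, 10.2, 9.9, 25.0, 25.3]):
    print(f(x, t=i / 30.0))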
Download the feature files from this link. Then unzip the dataset and move all files and directories to Filter/feature.
cd Filter
python -m pip install -r requirements.txt
python filter_expriment.py
python filter_exp_plotting.py
@article{zhu2024neural,
author = {Zhu, Gancheng and Li, Yongkai and Zhang, Shuai and Duan, Xiaoting and Huang, Zehao and Yao, Zhaomin and Wang, Rong and Wang, Zhiguo},
title = {Neural Networks With Linear Adaptive Batch Normalization and Swarm Intelligence Calibration for Real-Time Gaze Estimation on Smartphones},
journal = {International Journal of Intelligent Systems},
volume = {2024},
number = {1},
pages = {2644725},
keywords = {convolutional neural networks, eye tracking, gaze estimation, smartphone, swarm intelligence},
doi = {10.1155/2024/2644725},
url = {https://onlinelibrary.wiley.com/doi/abs/10.1155/2024/2644725},
eprint = {https://onlinelibrary.wiley.com/doi/pdf/10.1155/2024/2644725},
abstract = {Eye tracking has emerged as a valuable tool for both research and clinical applications. However, traditional eye-tracking systems are often bulky and expensive, limiting their widespread adoption in various fields. Smartphone eye tracking has become feasible with advanced deep learning and edge computing technologies. However, the field still faces practical challenges related to large-scale datasets, model inference speed, and gaze estimation accuracy. The present study created a new dataset that contains over 3.2 million face images collected with recent phone models and presents a comprehensive smartphone eye-tracking pipeline comprising a deep neural network framework (MGazeNet), a personalized model calibration method, and a heuristic gaze signal filter. The MGazeNet model introduced a linear adaptive batch normalization module to efficiently combine eye and face features, achieving the state-of-the-art gaze estimation accuracy of 1.59 cm on the GazeCapture dataset and 1.48 cm on our custom dataset. In addition, an algorithm that utilizes multiverse optimization to optimize the hyperparameters of support vector regression (MVO–SVR) was proposed to improve eye-tracking calibration accuracy with 13 or fewer ground-truth gaze points, further improving gaze estimation accuracy to 0.89 cm. This integrated approach allows for eye tracking with accuracy comparable to that of research-grade eye trackers, offering new application possibilities for smartphone eye tracking.},
year = {2024}
}