Farthest Point Sampling in Chemical Feature Space

We introduce a farthest point sampling (FPS) strategy in targeted chemical feature spaces to generate well-distributed training datasets. By increasing the diversity of the training data in chemical feature space, this approach improves model performance. We evaluated the strategy across several ML models, including artificial neural networks (ANN), support vector machines (SVM), and random forests (RF), using datasets covering key physicochemical properties. Models trained on FPS-selected data markedly outperform those trained on randomly sampled (RS) data in predictive accuracy and robustness, and they overfit notably less, especially with small training sets.
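To make the sampling idea concrete, here is a minimal sketch of greedy farthest point sampling next to a random-sampling baseline. It is an illustration only, not the code in this repository's sampler.py: the function names, the Euclidean distance metric, the fixed starting index, and the toy feature vectors are all assumptions.

```python
import math
import random


def farthest_point_sampling(points, n_samples, start_index=0):
    """Greedy FPS sketch: repeatedly pick the point farthest from the selected set.

    Assumptions (not taken from the repository): Euclidean distance and a
    caller-chosen starting point.
    """
    if not 0 < n_samples <= len(points):
        raise ValueError("n_samples must be between 1 and len(points)")

    selected = [start_index]
    # min_dist[i] holds the distance from point i to its nearest selected point.
    min_dist = [math.dist(p, points[start_index]) for p in points]
    # Mark selected points with -inf so they can never be picked again.
    min_dist[start_index] = float("-inf")

    while len(selected) < n_samples:
        # The next sample is the point with the largest nearest-selected distance.
        next_index = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(next_index)
        min_dist[next_index] = float("-inf")
        # Update nearest-selected distances against the newly added point.
        for i, p in enumerate(points):
            d = math.dist(p, points[next_index])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


def random_sampling(points, n_samples, seed=0):
    """Random-sampling (RS) baseline: draw indices uniformly without replacement."""
    return random.Random(seed).sample(range(len(points)), n_samples)


# Toy 2-D "feature space": three points clustered near the origin, three spread out.
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (1.0, 0.0), (0.5, 0.5)]
fps_indices = farthest_point_sampling(features, 3)
rs_indices = random_sampling(features, 3)
print(fps_indices)  # [0, 3, 4]
```

On this toy set FPS skips the two near-duplicates of the starting point and picks the far corners instead, which is the kind of spread the strategy aims for. In practice the feature vectors would usually be scaled first so that no single descriptor dominates the distance.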

Graphical illustration of farthest point sampling in chemical space

MSE comparison between FPS and RS

MSE comparison for sampling in different chemical spaces

Heatmap of MSE for different machine learning models

MSE for different physicochemical datasets

t-SNE distributions for FPS and RS

Folders and files

figs
README.md
config.py
dataset.xlsx
load_dataset.py
main.py
main2.py
main_cv.py
main_svm.py
models.py
plots.py
results.py
sampler.py
scaler.py
test.py
train_data_critical_property.xlsx
train_data_hvap.xlsx

yuxi-TJU/Farthest-Point-Sampling-in-Chemical-Feature-Space