Merge pull request #70 from cesmix-mit/main
update branch using main
emmanuellujan authored Jun 11, 2024
2 parents 1a5e961 + 0200258 commit fc7393b
Showing 27 changed files with 82 additions and 87 deletions.
6 changes: 2 additions & 4 deletions .github/workflows/Documentation.yml
@@ -11,11 +11,9 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- run: sudo apt-get update && sudo apt-get install -y xorg-dev mesa-utils xvfb libgl1 freeglut3-dev libxrandr-dev libxinerama-dev libxcursor-dev libxi-dev libxext-dev
- uses: julia-actions/setup-julia@v1
with:
version: "1.9"
- uses: julia-actions/cache@v1
version: "1.10"
- name: add CESMIX registry
run: |
julia -e '
@@ -42,7 +40,7 @@ jobs:
doctest(PotentialLearning)
'
- name: generate docs
run: DISPLAY=:0 xvfb-run -s '-screen 0 1024x768x24' julia --project=docs docs/make.jl
run: julia --project=docs docs/make.jl
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
DOCUMENTER_KEY: ${{ secrets.DOCUMENTER_KEY }}
2 changes: 1 addition & 1 deletion README.md
@@ -1,6 +1,6 @@
## [WIP] PotentialLearning.jl

An open source Julia library for active learning of interatomic potentials in atomistic simulations of materials. It incorporates elements of bayesian inference, machine learning, differentiable programming, software composability, and high-performance computing. This package is part of a software suite developed for the [CESMIX](https://computing.mit.edu/cesmix/) project.
Developing Optimization Workflows for Fast and Accurate Interatomic Potentials. This package is part of a software suite developed for the [CESMIX](https://computing.mit.edu/cesmix/) project.

<!--<a href="https://cesmix-mit.github.io/PotentialLearning.jl/stable">
<img alt="Stable documentation" src="https://img.shields.io/badge/documentation-stable%20release-blue?style=flat-square">
2 changes: 1 addition & 1 deletion docs/Project.toml
@@ -1,11 +1,11 @@
[deps]
AtomsBase = "a963bdd2-2df7-4f54-a1ee-49d51e6be12a"
CairoMakie = "13f3f980-e62b-5c42-98c6-ff1f3baf88f0"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
DocumenterCitations = "daee34ce-89f3-4625-b898-19384cb65244"
InteratomicPotentials = "a9efe35a-c65d-452d-b8a8-82646cd5cb04"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
Literate = "98b081ad-f1c9-55d3-8b20-4c87d4299306"
Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
PotentialLearning = "82b0a93c-c2e3-44bc-a418-f0f89b0ae5c2"
Unitful = "1986cc42-f94f-5a68-af5c-568840ba703d"
UnitfulAtomic = "a7773ee8-282e-5fa2-be4e-bd808c38a91a"
3 changes: 2 additions & 1 deletion docs/make.jl
@@ -21,7 +21,8 @@ const EXAMPLES_DIR = joinpath(@__DIR__, "..", "examples")
const OUTPUT_DIR = joinpath(@__DIR__, "src/generated")

examples = [
"Compute ACE descriptors, subsample, and fit ACE" => "Na/fit-dpp-ace-na.jl"
"Subsample Na dataset with DPP and fit with ACE" => "DPP-ACE-Na/fit-dpp-ace-na.jl",
"Load Ar+Lennard-Jones dataset and postprocess" => "LJ-Ar/lennard-jones-ar.jl"
]

for (_, example_path) in examples
33 changes: 15 additions & 18 deletions docs/src/index.md
@@ -1,23 +1,20 @@
# [WIP] PotentialLearning.jl

An open source Julia library to enhance active learning of interatomic potentials in atomistic simulations of materials. It incorporates elements of bayesian inference, machine learning, differentiable programming, software composability, and high-performance computing. This package is part of a software suite developed for the [CESMIX](https://computing.mit.edu/cesmix/) project.

## Specific goals

- **Intelligent subsampling** of atomistic configurations via [DPP](https://github.com/dahtah/Determinantal.jl) and [clustering](https://docs.google.com/document/d/1SWAanEWQkpsbr2lqetMO3uvdX_QK-Z7dwrgPaM1Dl0o/edit)-based algorithms.
- **Optimization of interatomic potentials**
- Parallel optimization of hyperparameters and coefficients via [Hyperopt.jl](https://github.com/baggepinnen/Hyperopt.jl).
- Multi-objective optimization (Pareto fronts): force execution time vs fitting accuracy (e.g. MAE of energies and forces).
- **Neuralization of linear interatomic potentials**
- Neural version of [Julia-ACE](https://github.com/ACEsuit/ACE1.jl) and [LAMMPS-POD](https://docs.lammps.org/pair_pod.html).
- Integration with [Flux](https://fluxml.ai/Flux.jl/stable/) ecosystem.
- **Interatomic potential compression**
- Feature selection (e.g. [CUR](https://github.com/JuliaLinearAlgebra/LowRankApprox.jl)) and dimensionality reduction (e.g [PCA](https://juliastats.org/MultivariateStats.jl/dev/pca/)) of atomistic descriptors.
- **Interatomic potential fitting/training**
- Inference of parameter uncertainties in linear interatomic potentials.
- **Quantity of Interest (QoI) sensitivity** analysis of interatomic potential parameters.
- **Dimension reduction of QoI** through the theory of Active Subspaces.
- **Atomistic configuration and DFT data management and post-processing**
PotentialLearning.jl: **Developing optimization workflows for fast and accurate interatomic potentials**. This package is part of a software suite developed for the [CESMIX](https://computing.mit.edu/cesmix/) project.

## Goals

**Optimize your atomistic data: intelligent subsampling of large datasets to reduce DFT computations**
- Intelligent subsampling of atomistic configurations using algorithms based on [DPP](https://github.com/dahtah/Determinantal.jl), [DBSCAN](https://docs.google.com/document/d/1SWAanEWQkpsbr2lqetMO3uvdX_QK-Z7dwrgPaM1Dl0o/edit), [CUR](https://github.com/JuliaLinearAlgebra/LowRankApprox.jl), etc.
- Highly scalable parallel subsampling via hierarchical subsampling and distributed parallelism ([Dagger.jl](https://github.com/JuliaParallel/Dagger.jl)).
- Optimal subsampler choosing via [Hyperopt.jl](https://github.com/baggepinnen/Hyperopt.jl).

**Optimize your interatomic potential model: hyperparameters, coefficients, model compression, and model selection.**
- Parallel optimization of hyperparameters, coefficients, and model selection via [Hyperopt.jl](https://github.com/baggepinnen/Hyperopt.jl); multi-objective optimization (Pareto fronts): force execution time vs fitting accuracy (e.g. MAE of energies and forces).
- Model compression via feature selection (e.g. [CUR](https://github.com/JuliaLinearAlgebra/LowRankApprox.jl)) and dimensionality reduction (e.g [PCA](https://juliastats.org/MultivariateStats.jl/dev/pca/), Active Subspaces) of atomistic descriptors.
- Fitting of linear potentials and inference of parameter uncertainties. Training of neural versions of [Julia-ACE](https://github.com/ACEsuit/ACE1.jl) and [LAMMPS-POD](https://docs.lammps.org/pair_pod.html).

Additionally, this package provides utilities for atomistic configuration and DFT data management and post-processing.
- Process input data so that it is ready for training. E.g. read XYZ file with atomic configurations, linearize energies and forces, split dataset into training and testing, normalize data, transfer data to GPU, define iterators, etc.
- Post-processing: computation of different metrics (MAE, RSQ, COV, etc), saving results, and plotting.

32 changes: 0 additions & 32 deletions examples/Ar/plot-lennard-jones-ar.jl

This file was deleted.

@@ -1,8 +1,8 @@
[deps]
AtomsBase = "a963bdd2-2df7-4f54-a1ee-49d51e6be12a"
CairoMakie = "13f3f980-e62b-5c42-98c6-ff1f3baf88f0"
InteratomicPotentials = "a9efe35a-c65d-452d-b8a8-82646cd5cb04"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
PotentialLearning = "82b0a93c-c2e3-44bc-a418-f0f89b0ae5c2"
Unitful = "1986cc42-f94f-5a68-af5c-568840ba703d"
UnitfulAtomic = "a7773ee8-282e-5fa2-be4e-bd808c38a91a"
@@ -1,12 +1,10 @@
#push!(Base.LOAD_PATH, "../../")

using Unitful, UnitfulAtomic
using AtomsBase, InteratomicPotentials, PotentialLearning
using LinearAlgebra, CairoMakie
using LinearAlgebra, Plots

# Load dataset
path = joinpath(dirname(pathof(PotentialLearning)), "../examples/Na/")
confs, thermo = load_data("$path/data/liquify_sodium.yaml", YAML(:Na, u"eV", u""))
path = joinpath(dirname(pathof(PotentialLearning)), "../examples/DPP-ACE-Na")
confs, thermo = load_data("$path/../data/Na/liquify_sodium.yaml", YAML(:Na, u"eV", u""))
confs, thermo = confs[220:end], thermo[220:end]

# Split dataset
@@ -53,16 +51,13 @@ println("MAE: $e_mae, RMSE: $e_rmse, RSQ: $e_rsq")
# Plot energy error scatter
e_err_train, e_err_test = (e_train_pred - e_train), (e_test_pred - e_test)
dpp_inds2 = get_random_subset(dpp; batch_size = 20)
size_inches = (12, 8)
size_pt = 72 .* size_inches
fig = Figure(resolution = size_pt, fontsize = 16)
ax1 = Axis(fig[1, 1], xlabel = "Energy (eV/atom)", ylabel = "Error (eV/atom)")
scatter!(ax1, e_train, e_err_train, label = "Training", markersize = 5.0)
scatter!(ax1, e_test, e_err_test, label = "Test", markersize = 5.0)
scatter!(ax1, e_train[dpp_inds2], e_err_train[dpp_inds2], markersize = 5.0,
color = :darkred, label = "DPP Samples")
axislegend(ax1)
save("$path/figures/energy_error_training_test_scatter.pdf", fig)
display(fig)

scatter( e_train, e_err_train, label = "Training", color = :blue,
markersize = 1.5, markerstrokewidth=0)
scatter!(e_test, e_err_test, label = "Test", color = :yellow,
markersize = 1.5, markerstrokewidth=0)
scatter!(e_train[dpp_inds2], e_err_train[dpp_inds2],
color = :darkred, label = "DPP Samples",
markersize = 2.5, markerstrokewidth=0)
scatter!(xlabel = "Energy (eV/atom)", ylabel = "Error (eV/atom)",
dpi = 1000, fontsize = 16)
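
The quantities this example prints and plots (MAE, RMSE, RSQ, and the signed energy error) can be reproduced with the standard library alone. The sketch below is illustrative only: the helper name `error_metrics` and the sample energies are hypothetical and not part of PotentialLearning.jl, and it assumes the three metrics carry their usual definitions.

```julia
using Statistics

# Hypothetical helper (not part of PotentialLearning.jl): usual definitions of
# mean absolute error, root-mean-square error, and coefficient of determination.
function error_metrics(y_pred::AbstractVector, y_true::AbstractVector)
    err  = y_pred .- y_true                  # same sign convention as e_err_train
    mae  = mean(abs.(err))
    rmse = sqrt(mean(abs2, err))
    rsq  = 1 - sum(abs2, err) / sum(abs2, y_true .- mean(y_true))
    return (mae = mae, rmse = rmse, rsq = rsq)
end

# Made-up energies (eV/atom), for illustration only
e_true = [-1.00, -1.10, -0.95, -1.05]
e_pred = [-1.02, -1.08, -0.97, -1.03]

m = error_metrics(e_pred, e_true)
println("MAE: $(m.mae), RMSE: $(m.rmse), RSQ: $(m.rsq)")
```

Returning a named tuple keeps the call site close to the `println` in the example; the package's own metric functions may use different names or return types.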

File renamed without changes.
@@ -1,5 +1,3 @@
push!(Base.LOAD_PATH, "../../")

using LinearAlgebra, Random, InvertedIndices
using Statistics, StatsBase, Distributions, Determinantal
using Unitful, UnitfulAtomic
@@ -12,8 +10,9 @@ include("subsampling_utils.jl")
# Load dataset -----------------------------------------------------------------
elname = "Si"
elspec = [:Si]
inpath = "../Si-3Body-LAMMPS/"
outpath = "./output/$elname/"
path = joinpath(dirname(pathof(PotentialLearning)), "../examples/DPP-ACE-Si")
inpath = "$path/../data/Si-3Body-LAMMPS/"
outpath = "$path/output/$elname/"

# Read all data
file_arr = readext(inpath, "xyz")
@@ -47,8 +46,8 @@ ace = ACE(species = elspec, # species

# Update dataset by adding energy (local) descriptors --------------------------
println("Computing local descriptors")
@time e_descr = compute_local_descriptors(confs, ace)
@time f_descr = compute_force_descriptors(confs, ace)
e_descr = compute_local_descriptors(confs, ace)
f_descr = compute_force_descriptors(confs, ace)
JLD.save(outpath*"$(elname)_energy_descriptors.jld", "e_descr", e_descr)
JLD.save(outpath*"$(elname)_force_descriptors.jld", "f_descr", f_descr)

File renamed without changes.
File renamed without changes.
@@ -7,9 +7,12 @@ using AtomsBase, InteratomicPotentials, PotentialLearning
using JLD, CairoMakie

#################### Importing Data ###################

path = joinpath(dirname(pathof(PotentialLearning)), "../examples/DPP-ACE-aHfO2")

# Import Raw Data
energies, descriptors = JLD.load(
"examples/aHfO2/data/aHfO2_diverse_descriptors_3600.jld",
"$path/../data/aHfO2/aHfO2_diverse_descriptors_3600.jld",
"energies",
"descriptors",
)
2 changes: 1 addition & 1 deletion examples/Na/Project.toml → examples/LJ-Ar/Project.toml
@@ -1,8 +1,8 @@
[deps]
AtomsBase = "a963bdd2-2df7-4f54-a1ee-49d51e6be12a"
CairoMakie = "13f3f980-e62b-5c42-98c6-ff1f3baf88f0"
InteratomicPotentials = "a9efe35a-c65d-452d-b8a8-82646cd5cb04"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
PotentialLearning = "82b0a93c-c2e3-44bc-a418-f0f89b0ae5c2"
Unitful = "1986cc42-f94f-5a68-af5c-568840ba703d"
UnitfulAtomic = "a7773ee8-282e-5fa2-be4e-bd808c38a91a"
34 changes: 34 additions & 0 deletions examples/LJ-Ar/lennard-jones-ar.jl
@@ -0,0 +1,34 @@
using Unitful, UnitfulAtomic
using AtomsBase, InteratomicPotentials, PotentialLearning
using LinearAlgebra, Plots

# Load dataset: Lennard-Jones + Argon
path = joinpath(dirname(pathof(PotentialLearning)), "../examples/LJ-Ar")
ds, thermo = load_data("$path/../data/LJ-AR/lj-ar.yaml", YAML(:Ar, u"eV", u""))

# Filter first configuration (zero energy)
ds = ds[2:end]

# Compute distance from origin, LJ energies, and time range
systems = get_system.(ds)
n_atoms = length(first(systems)) # Note: in this dataset all systems contain the same no. of atoms
positions = position.(systems)
dists_origin = map(x->ustrip.(norm.(x)), positions)
energies = get_values.(get_energy.(ds))
time_range = 0.5:0.5:5000

# Plot distance from origin vs time
p = plot(xlabel = "τ | ps",
ylabel = "Distance from origin | Å",
dpi = 300, fontsize = 12)
for i = 1:n_atoms
plot!(time_range, map(x->x[i], dists_origin), label="")
end
p

# Plot LJ energies vs time
plot(time_range, energies,
xlabel = "τ | ps",
ylabel = "Lennard Jones energy | eV",
dpi = 300, fontsize = 12)
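
The per-atom series plotted by this script come from two mapping steps: one distance vector per configuration, then the i-th entry of every vector. A minimal sketch of that reshaping on a made-up, unit-free trajectory follows (so `ustrip` and the dataset accessors are not needed). The positions and the three-frame time axis are assumptions for illustration, not data from `lj-ar.yaml`.

```julia
using LinearAlgebra

# Made-up trajectory: 3 frames, 2 atoms, plain Float64 positions (no units)
positions = [
    [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]],
    [[3.0, 4.0, 0.0], [0.0, 0.0, 1.0]],
    [[0.0, 0.0, 2.0], [1.0, 2.0, 2.0]],
]
n_atoms = length(first(positions))

# One vector of distances from the origin per frame
dists_origin = map(x -> norm.(x), positions)

# One time series per atom: the i-th entry of every frame
series = [map(d -> d[i], dists_origin) for i in 1:n_atoms]

# The time axis needs exactly one sample per frame, otherwise plotting fails
time_range = 0.5:0.5:(0.5 * length(positions))
@assert length(time_range) == length(dists_origin)

println(series)
```

In the script itself, `0.5:0.5:5000` has 10000 samples, so it presumably matches the number of configurations left after the first one is dropped; deriving the range from `length(ds)` as above would make that assumption explicit.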

2 changes: 1 addition & 1 deletion examples/README.md
@@ -2,7 +2,7 @@

Change the directory to the desired example folder. E.g.
```bash
$ cd PotentialLearning.jl/examples/Na
$ cd PotentialLearning.jl/examples/DPP-ACE-Na
```

Open the Julia REPL, activate the ```Project.toml``` file in the ```examples``` folder, and choose the number of threads. E.g.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion test/io/extxyz_test.jl
@@ -3,7 +3,7 @@ using Unitful, UnitfulAtomic

energy_units = u"eV"
distance_units = u""
ds = load_data("../examples/Si-3Body-LAMMPS/data.xyz", ExtXYZ(energy_units, distance_units));
ds = load_data("../examples/data/Si-3Body-LAMMPS/data.xyz", ExtXYZ(energy_units, distance_units));

@test length(ds) == 201
@test typeof(ds) == DataSet
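
Most path edits in this commit follow one pattern: datasets now sit under a shared `examples/data` folder, and each example script reaches them by stepping out of its own folder. A minimal sketch of how such a path resolves, using only Base functions, is given below. `src_dir` is a hypothetical stand-in for `dirname(pathof(PotentialLearning))`, so no package needs to be installed.

```julia
# Hypothetical stand-in for dirname(pathof(PotentialLearning))
src_dir = joinpath("PotentialLearning.jl", "src")

# Example folder, built the same way as `path` in the example scripts
path = joinpath(src_dir, "../examples/DPP-ACE-Na")

# Shared data folder, reached by stepping out of the example folder;
# normpath collapses the ".." segments lexically, without touching the disk
data_file = normpath(joinpath(path, "../data/Na/liquify_sodium.yaml"))
println(data_file)
```

The example scripts interpolate the un-normalized string directly (`"$path/../data/..."`), which the file system resolves the same way; `normpath` is used here only to make the final location easy to read.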
2 changes: 1 addition & 1 deletion test/io/yaml_test.jl
@@ -4,7 +4,7 @@ using Unitful, UnitfulAtomic
energy_units = u"eV"
distance_units = u""
ds, t = load_data(
"../examples/Na/data/empirical_sodium_2d.yaml",
"../examples/data/Na/empirical_sodium_2d.yaml",
YAML(:Na; energy_units = energy_units, distance_units = distance_units),
);

2 changes: 1 addition & 1 deletion test/subset_selector/subset_selector_tests.jl
@@ -28,7 +28,7 @@ r = RandomSelector(num_configs; batch_size = batch_size)
# DBSCANSelector tests
energy_units = u"eV"
distance_units = u""
ds = load_data("../examples/Si-3Body-LAMMPS/data.xyz", ExtXYZ(energy_units, distance_units));
ds = load_data("../examples/data/Si-3Body-LAMMPS/data.xyz", ExtXYZ(energy_units, distance_units));
epsi, minpts, sample_size = 0.05, 5, batch_size
dbscans = DBSCANSelector( ds,
epsi,
