Releases: BerkeleyLab/fiats
Flexible tensor reads and optional double-precision inference
This release offers
- A new version of `demo/app/train-cloud-microphysics.f90` that reads tensor component names from `demo/training_configuration.json` and then reads the named variables from training data files written by Berkeley Lab's ICAR fork. ☁️
- A switch to LLVM `flang` as the supported compiler. (We have submitted bug reports to other compiler vendors.) 🪃
- Optional `double precision` inference as demonstrated in `demo/app/infer-aerosol.f90`. 🔢
- `Non_overridable` inference and training procedures (see the sketch after this list). We are collaborating with LLVM `flang` developers at AMD on leveraging this feature to automatically offload parallel inference and training to graphics processing units (GPUs). 📈
- A global renaming of this software from Inference-Engine to Fiats in all source code and documentation. 🌐
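The `non_overridable` attribute lets a compiler resolve type-bound calls statically, removing the dynamic dispatch that can block inlining inside offloadable regions. A minimal sketch of the pattern, with hypothetical type and procedure names rather than Fiats's actual API:

```fortran
module neural_network_m
  implicit none

  type neural_network_t
  contains
    ! Statically resolvable: no child type may override this binding,
    ! so a compiler can inline it inside do-concurrent regions that it
    ! may offload to a GPU.
    procedure, non_overridable :: infer
  end type

contains

  elemental function infer(self, x) result(y)
    class(neural_network_t), intent(in) :: self
    real, intent(in) :: x
    real :: y
    y = max(0., x)  ! placeholder for an actual network evaluation
  end function

end module
```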
What's Changed
- Replace gfortran in CI with flang-new by @ktras in #200
- Support building cloud-microphysics with LLVM Flang by @ktras in #203
- Update filenames being read in by `infer_aerosol` by @ktras in #204
- Remove unallowed whitespace from project name in `demo/fpm.toml` by @ktras in #209
- Fix GELU & sigmoid activation precision by @rouson in #214
- Make all procedures involved in inference and training `non_overridable` by @rouson in #215
- Simplify class relationships by @rouson in #217
- Rename Inference-Engine to Fiats by @rouson in #218
- Merge multi-precision support into main by @rouson in #213
- Generalize train cloud microphysics by @rouson in #220
- doc(README): "tensor names" in JSON configuration by @rouson in #221
Full Changelog: 0.14.0...0.15.0
Parallel training
What's Changed
- Update README with `-Ofast` flag for flang-new build notes by @ktras in #201
- Parallel training via `do concurrent` by @rouson in #202 (sketched below)
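For context, `do concurrent` asserts that loop iterations can execute in any order, which compilers such as LLVM Flang can exploit for parallel execution. A minimal sketch with hypothetical names, assuming per-mini-batch gradients are independent and averaged after the loop:

```fortran
program parallel_training_sketch
  implicit none
  integer, parameter :: batches = 10, num_parameters = 100
  real :: gradient(num_parameters, batches)
  integer :: b

  ! Each iteration handles one mini-batch; do concurrent asserts the
  ! iterations are independent, permitting parallel execution.
  do concurrent (b = 1:batches)
    gradient(:,b) = mini_batch_gradient(b)
  end do

  print *, "mean gradient norm:", norm2(sum(gradient, dim=2)/batches)

contains

  pure function mini_batch_gradient(b) result(g)
    integer, intent(in) :: b
    real :: g(num_parameters)
    g = real(b)  ! placeholder for back propagation over mini-batch b
  end function

end program
```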
Full Changelog: 0.13.0...0.14.0
New JSON file format
The new file format includes
- A file version indicator currently named `acceptable_engine_tag` to denote the `git` tag used to create the new format,
- Better nomenclature:
  - The `minima` and `maxima` fields are now `intercept` and `slope`, respectively, to better match the function's purpose: to define a linear map to and from the unit interval (see the sketch after this list).
  - The encompassing `inputs_range` and `outputs_range` objects are now `inputs_map` and `outputs_map`, 🗺️
- A fix for `cloud_microphysics/setup.sh`. ☁️
- Other minor bug fixes and enhancements.
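A minimal sketch of the renamed nomenclature, assuming the conventional affine form in which `slope` and `intercept` derive from the training minima and maxima (all variable names and bounds here are hypothetical):

```fortran
program linear_map_sketch
  implicit none
  real, parameter :: minimum = 250., maximum = 350.  ! hypothetical data bounds
  real, parameter :: slope = 1./(maximum - minimum)
  real, parameter :: intercept = -minimum/(maximum - minimum)
  real :: x, x_mapped

  x = 300.
  x_mapped = intercept + slope*x         ! maps [minimum, maximum] onto [0,1]
  print *, x_mapped                      ! 0.5
  print *, (x_mapped - intercept)/slope  ! inverse map recovers 300.
end program
```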
What's Changed
- doc(README): specify required gfortran versions by @rouson in #185
- Enhance saturated mixing ratio example by @rouson in #189
- Refactor tensor_map_m to improve nomenclature & move phase_space_bin_t to cloud-microphysics by @rouson in #192
- Add git tag to JSON file to denote inference-engine version that reads and writes the format by @rouson in #194
- Fixes #195 by @rouson in #196
Full Changelog: 0.12.0...0.13.0
Support LLVM Flang + add entropy-maximizing filter to speed convergence
This release adds a feature to speed convergence and support for a fourth compiler in addition to the GNU, NAG, & Intel compilers:
- All tests pass with the LLVM Flang (`flang-new`) compiler.
- The `cloud-microphysics/app/train-cloud-microphysics.f90` program includes new options:
  - `--bins` filters the training data to maximize the Shannon entropy by selecting only one data point per bin in a five-dimensional phase space, and
  - `--report` controls the frequency of writing JSON files to reduce file I/O costs.
- Eliminates several warning messages from the NAG compiler (`nagfor`).
- Switches a dependency from Sourcery to Julienne to eliminate the requirement for coarray feature support.
- Adds the GELU activation function (sketched after this list).
- Speeds up the calculation of the data needed to construct histograms.
- Adds a new `cloud-microphysics/train.sh` script to manage the training process.
- Adds the ability to terminate a training run based on a cost-function tolerance rather than a fixed number of epochs.
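For reference, GELU is commonly implemented via its tanh approximation. A minimal sketch of that approximation (illustrative only, not necessarily the exact form this release implements):

```fortran
module gelu_m
  implicit none
contains
  ! Tanh approximation of the Gaussian Error Linear Unit (GELU).
  elemental function gelu(x) result(y)
    real, intent(in) :: x
    real :: y
    real, parameter :: pi = acos(-1.)
    y = 0.5*x*(1. + tanh(sqrt(2./pi)*(x + 0.044715*x**3)))
  end function
end module
```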
What's Changed
- Remove second, unneeded and no longer supported build of gcc by @ktras in #150
- build: update to sourcery 4.8.1 by @rouson in #151
- doc(README): add instructions for auto offload by @rouson in #152
- Work around ifx automatic-offloading bugs by @rouson in #145
- Add bug workarounds for gfortran-14 associate-stmt bug by @ktras in #155
- Switching from Sourcery to Julienne by @rouson in #154
- Update fpm manifest with tag for v1.0 of dependency julienne by @ktras in #157
- Support compilation with LLVM Flang by @ktras in #159
- Update cloud-microphysics compiler and dependencies by @rouson in #160
- Add GELU activation function by @rouson in #161
- Feature: Faster histogram construction when the number of bins exceeds 80 by @rouson in #162
- Read & perform inference on networks for which the hidden-layer width varies across layers by @rouson in #166
- Fix/Feature(JSON): disambiguate tensor_range objects and allow flexible line-positioning of objects by @rouson in #165
- Feature: Support JSON output for networks with varying-width hidden layers by @rouson in #167
- Feature: filter training data for maximal information entropy via flat multidimensional output-tensor histograms by @rouson in #169
- Features: maximize information entropy and variable reporting interval. by @rouson in #170
- build: add single-file compile script by @rouson in #171
- Add ubuntu to CI by @ktras in #156
- Feature: add training script in cloud-microphysics/train.sh by @rouson in #172
- feat(train.sh): graceful exits by @rouson in #173
- refac(train): rm redundant array allocations by @rouson in #174
- feat(cloud-micro): write 1st/last cost, fewer JSON by @rouson in #175
- feat(train.sh): add outer loop for refinement by @rouson in #176
- feat(cloud-micro): terminate on cost-tolerance by @rouson in #177
- Concurrent loop through each mini-batch during training by @rouson in #178
- test(adam): reset iterations so all tests pass with flang-new by @rouson in #179
- doc(README): add flags to optimize builds by @rouson in #180
- fix(single-source): mv script outside fpm's purview by @rouson in #182
- doc(README): optimize ifx builds by @rouson in #181
- Eliminate compiler warnings by @rouson in #183
- fix(single-file-source): respect file extension case by @rouson in #184
Full Changelog: 0.11.1...0.12.0
Selective test execution and compiler workarounds
New Feature
This release enables selecting a subset of tests to run based on a search for substrings contained in the test output. All test output is of the form

    <Subject>
      passes on <description 1>.
      FAILS on <description 2>.

where the subject describes what is being tested (e.g., `A tensor_range_t object`) and the description details how the subject is being tested (e.g., `component-wise construction followed by conversion to and from JSON`). The subject typically contains a type name such as `tensor_range_t`. The description typically does not contain a type name. Therefore, running the command

    fpm test -- --contains tensor_range_t

will execute and report the outcomes of all tests of the given subject, `tensor_range_t`, and only those tests. For test output similar to that shown above, this would display two test outcomes: one passing and one failing.
By contrast, running the command

    fpm test -- --contains "component-wise construction"

would execute and report the outcomes of the tests with descriptions containing `component-wise construction` for any subject.
This release also works around a few compiler bugs and reorders tests so that the fastest and most stable run first.
What's Changed
- Work around ifx bug by @rouson in #142
- Fix filename extension for file that has directives by @ktras in #143
- feat(inference_engine_t): tensor_range_t getters (later removed) by @rouson in #147
- Cray bug workarounds for compile time bugs by @ktras in #146
- Feature: redesigned functionality for mapping input and output tensors to and from training ranges by @rouson in #148
- Test reordering and selective test execution by @rouson in #149
Full Changelog: 0.11.0...0.11.1
Batch normalization, more concurrency, & NAG compiler support
This release adds
- A `tensor_range_t` type (see the sketch after this list) that 🧑‍🌾 🌱
  - Encapsulates input and output tensor component minima and maxima,
  - Provides type-bound `map_to_training_range` and `map_from_training_range` procedures for mapping tensors to and from the unit interval `[0,1]`, and 🤺
  - Provides a type-bound `in_range` procedure that users can employ to check whether inference input or output data involve extrapolation beyond the respective ranges employed in training.
- BREAKING CHANGE: the network JSON file format has been updated to include `input_range` and `output_range` objects. The JSON file reader in this release may fail to read or write network files that are written or read by older versions of Inference-Engine. 🚒 🗄️ 📂
- Automatic use of the aforementioned mapping capability during inference. 🧭
- Enhanced concurrency to improve performance: 🐎
  - Additional use of `do concurrent` in the training algorithm and 🚄 🚋
  - Enabling building with OpenMP in the `setup.sh` script. 🏗️ 👷‍♀️
- Additional compiler support: this is the first release that builds with the NAG Fortran compiler, starting with Build 7202.
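A minimal sketch of how such a type could look; the component names and interfaces below are hypothetical, not Fiats's actual declarations:

```fortran
module tensor_range_sketch_m
  implicit none

  type tensor_range_t
    real, allocatable :: minima(:), maxima(:)  ! per-component training bounds
  contains
    procedure :: map_to_training_range
    procedure :: map_from_training_range
    procedure :: in_range
  end type

contains

  ! Rescale each tensor component into the unit interval [0,1].
  pure function map_to_training_range(self, tensor) result(normalized)
    class(tensor_range_t), intent(in) :: self
    real, intent(in) :: tensor(:)
    real :: normalized(size(tensor))
    normalized = (tensor - self%minima)/(self%maxima - self%minima)
  end function

  ! Invert the above map.
  pure function map_from_training_range(self, normalized) result(tensor)
    class(tensor_range_t), intent(in) :: self
    real, intent(in) :: normalized(:)
    real :: tensor(size(normalized))
    tensor = self%minima + normalized*(self%maxima - self%minima)
  end function

  ! Flag inference data that would extrapolate beyond the training ranges.
  pure function in_range(self, tensor) result(is_in_range)
    class(tensor_range_t), intent(in) :: self
    real, intent(in) :: tensor(:)
    logical :: is_in_range
    is_in_range = all(tensor >= self%minima .and. tensor <= self%maxima)
  end function

end module
```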
What's Changed
- Simplify app: rm redundant procedures by @rouson in #102
- Concurrent inference example by @rouson in #103
- Exploit additional concurrency in the training algorithm by @rouson in #105
- feat(example): add nested do-loop inferences by @rouson in #106
- chore(examples): match program names to file names by @rouson in #109
- feat(infer): allow non-type-bound invocations by @rouson in #110
- doc(README): minimum gfortran version 13 by @rouson in #111
- Add new fpm subproject `icar-trainer` by @ktras in #108
- Enable OpenMP in setup script & work around related compiler bug by @rouson in #114
- fix(run-fpm.sh): revert to copying header into build dir by @rouson in #115
- Remove module keyword from abstract interface by @ktras in #116
- Compute & output tensor histograms in columnar format & add gnuplot script by @rouson in #118
- Bugfixes for nag by @ktras in #119
- fix(examples): .f90->.F90 to preprocess files by @rouson in #121
- Get beyond one type of Intel bugs by @ktras in #120
- Nagfor workaround by @rouson in #122
- chore(test): rm nagfor compiler workaround by @rouson in #129
- Workaround intel bug by @ktras in #128
- doc(README): add compilers in testing instructions by @rouson in #130
- build(fpm.toml): increment dependency versions by @rouson in #131
- More robust Adam optimizer test by @rouson in #134
- Ifx workarounds + train longer in Adam test to pass with nagfor by @rouson in #135
- Store tensor ranges by @rouson in #137
- build(random_init): rm final nagfor workaround by @rouson in #136
- Feature: Add input/output tensor component ranges to network files by @rouson in #138
- Feature: map input to unit range & output tensors from unit range in inference_engine_t infer procedure by @rouson in #139
- Normalize in training by @rouson in #140
- Fix training restarts by @rouson in #141
New Contributors
Full Changelog: 0.10.0...0.11.0
Train cloud microphysics
This is the first release that contains an `app/train_cloud_microphysics.f90` program and a `training_configuration.json` file that exhibit convergent behavior of the (Adam) training algorithm on an ICAR-generated training data set, as demonstrated by a monotonically decreasing cost function:
    ./build/run-fpm.sh run train-cloud-microphysics -- --base training --epochs 10 --start 720
    ...
    Epoch  Cost (avg)
        1  0.121759593
        2  1.61784310E-02
        3  5.31613547E-03
        4  2.68347375E-03
        5  1.63242721E-03
        6  1.11283606E-03
        7  8.27088661E-04
        8  6.59517595E-04
        9  5.56710584E-04
       10  4.91619750E-04
    Training time: 39.319034000000002 for 10 epochs
    System clock time: 353.68379099999999
What's Changed
- fix(deploy-docs.yml) - use linuxbrew to install ford 7 by @rouson in #99
- Train Thompson microphysics proxy by @rouson in #98
- doc(README): clarify language by @rouson in #100
Full Changelog: 0.9.0...0.10.0
Training configuration JSON file I/O and new examples
New in this release
- The `app/train-cloud-microphysics.f90` program reads hyperparameters and network configuration from the new `training_configuration.json` input file and defines the corresponding variables in the program.
- The new `example/print-training-configuration.f90` program displays a sample input file as shown below.
- The new `example/learn-microphysics-procedures.f90` program learns to model two functions from ICAR's Thompson cloud microphysics model.
- Updated netcdf-interfaces dependency.
    ./build/run-fpm.sh run --example print-training-configuration
    Project is up to date
    {
        "hyperparameters": {
            "mini-batches" : 10,
            "learning rate" : 1.50000000,
            "optimizer" : "adam"
        }
        ,
        "network configuration": {
            "skip connections" : false,
            "nodes per layer" : [2,72,2],
            "activation function" : "sigmoid"
        }
    }
What's Changed
- Cleanup examples by @rouson in #91
- Train neural net proxy for two functions from ICAR's Thompson microphysics model by @rouson in #92
- JSON-formatted input for training configuration by @rouson in #94
- doc(README) add training configuration material by @rouson in #95
- App reads training configuration JSON file by @rouson in #96
- fix(example): work around associate issues by @rouson in #97
Full Changelog: 0.8.0...0.9.0
New training algorithms, examples, documentation, and bug fixes
New training algorithms:
- Stochastic gradient descent
- The Adam optimizer (see the sketch below)
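For reference, a minimal sketch of one Adam parameter update (Kingma & Ba); the variable names and hyperparameter values are illustrative, not Fiats's internals:

```fortran
module adam_sketch_m
  implicit none
contains
  ! One Adam step for parameters theta with gradient g; m and v are the
  ! running first- and second-moment estimates, t the 1-based step count.
  subroutine adam_step(theta, g, m, v, t)
    real, intent(inout) :: theta(:), m(:), v(:)
    real, intent(in) :: g(:)
    integer, intent(in) :: t
    real, parameter :: alpha = 1.5  ! learning rate, echoing the sample JSON above
    real, parameter :: beta1 = 0.9, beta2 = 0.999, eps = 1.e-8
    real :: m_hat(size(theta)), v_hat(size(theta))

    m = beta1*m + (1. - beta1)*g     ! update biased first moment
    v = beta2*v + (1. - beta2)*g**2  ! update biased second moment
    m_hat = m/(1. - beta1**t)        ! bias-correct both moments
    v_hat = v/(1. - beta2**t)
    theta = theta - alpha*m_hat/(sqrt(v_hat) + eps)
  end subroutine
end module
```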
New examples:
Training neural nets to learn basic mathematical operations
The `example/learn-*.f90` programs train neural networks to learn basic math functions. Given 8 input variables `x(1)`, … , `x(8)`, the training algorithm can now learn to produce the following 6 comma-separated outputs corresponding to
- Addition: `[x(1)+x(2), x(2)+x(3), x(3)+x(4), x(4)+x(5), x(5)+x(6), x(6)+x(8)]`
- Multiplication: `[x(1)*x(2), x(2)*x(3), x(3)*x(4), x(4)*x(5), x(5)*x(6), x(6)*x(8)]`
- Exponentiation: `[x(1)**2, x(2)**3, x(3)**4, x(4)**4, x(5)**3, x(6)**2]`
- Power series: `[1 + x(1) + (x(1)**2)/2 + (x(1)**3)/6, x(2), x(3), x(4), x(5), x(6)]`, whose first output is the degree-3 Taylor polynomial of `exp(x(1))`
Inference-Engine's first application-relevant training example
The `learn-saturated-mixing-ratio.f90` program trains a network with 2 inputs (normalized temperature and pressure), 1 output (saturated mixing ratio), and 1 hidden layer containing 72 nodes. The training inputs correspond to a uniform Cartesian grid (sketched below) laid over the 2D space of procedure-argument values bounded by the minimum and maximum values of the corresponding variables from an actual run of the ICAR regional climate model. The training-data outputs are the actual results of a refactored version of ICAR's `sat_mr` function, wherein a test verified that the refactored code gives the same answer to all significant digits across the entire grid of input values. The inputs are normalized so that the grid covers a unit square. The outputs are unnormalized.
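A minimal sketch of generating such training inputs, assuming a hypothetical grid resolution: a uniform Cartesian grid of normalized temperature/pressure pairs covering the unit square:

```fortran
program unit_square_grid
  implicit none
  integer, parameter :: n = 5  ! hypothetical number of grid points per dimension
  integer :: i, j
  real :: t_normalized, p_normalized

  ! Each point pairs a normalized temperature with a normalized pressure,
  ! spanning [0,1] x [0,1] uniformly.
  do j = 1, n
    do i = 1, n
      t_normalized = real(i - 1)/real(n - 1)
      p_normalized = real(j - 1)/real(n - 1)
      print *, t_normalized, p_normalized
    end do
  end do
end program
```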
What's Changed
Refactoring
- Breaking change: Adopt PyTorch nomenclature: replace input/output types with new `tensor_t` type by @rouson in #70
Features
- Add app that reads training data from ICAR output by @rouson in #73
- Train ICAR cloud microphysics by @rouson in #74
- feat(train): make cost calculation/return optional by @rouson in #75
- feat(app): rm zero-derivative points; report cost by @rouson in #76
- feat(app): allow initial parameters to vary by @rouson in #77
- Add Adam optimizer and stochastic gradient descent by @rouson in #78
- New app features & refactoring: checkpoint/restart and user-specified training time range by @rouson in #79
- feat(app): train in strided epochs by @rouson in #81
- Reshuffle training data for each epoch by @rouson in #82
- feat(app): 1st converging cloud microphysics model by @rouson in #83
- feat(example): train ICAR saturated mixing ratio by @rouson in #90
Documentation
- Add `inference_engine_t` class diagram by @kareem-weaver in #71
Bug fixes
- fix(setup.sh/run-fpm.sh): add -cpp flag by @rouson in #72
- fix(app): only read opened json file if 'old' by @rouson in #80
- chore(setup.sh): rm homebrew installation of cmake by @rouson in #84
Examples and Tests
- Train linear combinations of inputs by @rouson in #88
- Training examples: Learn math operations and a function from a cloud microphysics model by @rouson in #89
- Train identity network from identity with 10% maximum random perturbations by @rouson in #86
- test(train):test near radius of non-convergence by @rouson in #87
New Contributors
- @kareem-weaver made their first contribution in #71
Full Changelog: 0.7.0...0.8.0
0.7.0 Train deep networks
This is the first release containing passing unit tests that train deep neural networks. The release uses a new implementation of the mini-batch back propagation algorithm originally developed by @davytorres and refactored by @rouson. The new algorithm uses arrays structured differently from previous versions of Inference-Engine. In this release, every procedure has been refactored to eliminate all references to the previous array structures. This refactoring also results in considerable speedup of the test suite.
What's Changed
- Fix terminology by @rouson in #59
- Fix construction from json by @rouson in #60
- Add copyright statements by @rouson in #62
- ci: use newer version of gcc by @everythingfunctional in #64
- Train deep networks by @rouson in #63
- Make trainable_engine_t independent by @rouson in #65
- Refactor tests to use the new data structure by @rouson in #66
- Refactor: remove legacy component arrays in `inference_engine_t` by @rouson in #67
- Remove inference strategies and integrate netCDF file I/O into library & test suite by @rouson in #68
Full Changelog: 0.6.2...0.7.0