Release PyPWA 3.3.0, with GPU acceleration
This update is a series of changes that should have been their own
separate commits, but one thing led to another, and now they're a
single massive commit.

- 2D Gauss Tutorial: This is an introductory tutorial on how to use the
  components of PyPWA without getting too deep into complex amplitudes.
- Overhauled particles: Particles no longer infer their charge from the
  particle's ID, because particle IDs differ depending on whose format
  you use. Any particle that hasn't been encountered before will now be
  labeled as "Unknown" but its ID will still function normally.
- CUDA: I've added CUDA support by including CuPy. CuPy is an optional
  component, but it should be included automatically if you installed the
  package from Anaconda. This gives you all the benefits of computing
  on the GPU without the headache of tackling CUDA development.
  Likelihoods and Simulate have already been adjusted to handle
  CuPy-configured kernels, and the 2D Gauss tutorial contains a CuPy
  example amplitude to demonstrate how it functions (a sketch of such an
  amplitude appears after the commit details below).
markjonestx committed Jun 21, 2021
1 parent a6ed803 commit 5976ff7
Showing 20 changed files with 1,151 additions and 84 deletions.
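
Before the file-by-file diff, here is a minimal sketch of what a CuPy-backed
amplitude can look like, in the spirit of the 2D Gauss tutorial mentioned
above. The `setup`/`calculate` method names follow the `NestedFunction`
interface shown in this diff, but the parameter names, column names, and
Gaussian form are illustrative assumptions, not code from the tutorial.

```python
# A hypothetical CuPy-backed amplitude; field and parameter names are
# assumptions for illustration.
import cupy as cp
from PyPWA import NestedFunction


class Gauss2dGPU(NestedFunction):

    USE_GPU = True  # compute on the GPU; multiprocessing is bypassed

    def setup(self, data):
        # Copy the event columns onto the GPU once, before fitting begins
        self.__x = cp.asarray(data["x"])
        self.__y = cp.asarray(data["y"])

    def calculate(self, params):
        # Return a CuPy array; the likelihoods reduce it on the device
        return cp.exp(
            -((self.__x - params["x0"]) ** 2) / (2 * params["sx"] ** 2)
            - ((self.__y - params["y0"]) ** 2) / (2 * params["sy"] ** 2)
        )
```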
21 changes: 20 additions & 1 deletion CHANGELOG.md
@@ -5,6 +5,24 @@ All changes important to the user will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/)
and this project adheres to [Semantic Versioning](http://semver.org/)

## [3.3.0] - 2021-6-20
### Added
- 2D Gauss introductory tutorial to the documentation
- CuPy support for Likelihoods and Simulation. This means we now officially
  support NVIDIA GPU acceleration; for now it is limited to a single GPU.
  If there is enough demand, support for multiple GPUs will be added.
### Changed
- Particle now requires a charge to be supplied during the creation of the
  object. GAMP has also been modified so that the charge is passed through
  to the Particle (a construction sketch follows this file's diff).
- Deprecated internal options that were passed to Minuit have been
  replaced with their modern alternatives.
### Fixed
- Likelihoods were spawning multiple processes even when USE_MP was set
  to false. This has been corrected; extra processes are no longer spawned
  when multiprocessing is disabled.

## [3.2.3] - 2021-6-11
### Added
- Particle Pools can now compared against other Particle Pools to see if they
@@ -212,7 +230,8 @@ and this project adheres to [Semantic Versioning](http://semver.org/)
- PySim plugin
- Packaging

[Unreleased]: https://github.com/JeffersonLab/PyPWA/compare/v3.2.3...main
[Unreleased]: https://github.com/JeffersonLab/PyPWA/compare/v3.3.0...main
[3.3.0]: https://github.com/JeffersonLab/PyPWA/compare/v3.2.3...v3.3.0
[3.2.3]: https://github.com/JeffersonLab/PyPWA/compare/v3.2.2...v.3.2.3
[3.2.2]: https://github.com/JeffersonLab/PyPWA/compare/v3.2.1...3.2.2
[3.2.1]: https://github.com/JeffersonLab/PyPWA/compare/v3.2.0...v3.2.1
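The Particle change summarized in the changelog above amounts to the
constructor now taking the charge explicitly. Below is a minimal
construction sketch consistent with the `vectors.Particle(pid, charge, 1)`
calls in the _common.py hunk later on this page; the import path, the
GEANT-style ids, and the meaning of the final argument are assumptions.

```python
# Hypothetical construction sketch; the ids and the last argument are
# illustrative assumptions based on the _common.py hunk in this commit.
from PyPWA.libs import vectors

proton = vectors.Particle(14, 1, 1)    # id, explicit charge, event payload
pi_minus = vectors.Particle(9, -1, 1)  # charge is no longer inferred from id
```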
7 changes: 6 additions & 1 deletion PyPWA/__init__.py
@@ -59,6 +59,10 @@
- ProjectDatabase: A numerical database based off of HDF5 that allows for
working with data larger than memory. Only recommended if you have
to use it.
- to_contiguous: Converts a numpy array to contiguous arrays to be used
  with Cython modules.
- pandas_to_numpy: Converts a pandas dataframe to a contiguous numpy
structured array.
- cache.read: Reads the cache for a specific source file, or for an
intermediate step.
- cache.write: Writes the cache for a specific source file, or for an
@@ -84,6 +88,7 @@
from PyPWA import info as _info
from PyPWA.libs import simulate
from PyPWA.libs.binning import bin_by_range, bin_with_fixed_widths, bin_by_list
from PyPWA.libs.common import to_contiguous, pandas_to_numpy
from PyPWA.libs.file import (
get_reader, get_writer, read, write, ProjectDatabase, cache, DataType
)
@@ -102,7 +107,7 @@
"monte_carlo_simulation", "minuit", "ChiSquared", "LogLikelihood",
"EmptyLikelihood", "NestedFunction", "FunctionAmplitude", "cache",
"ResonanceData", "bin_by_range", "bin_with_fixed_widths", "make_lego",
"simulate", "DataType"
"simulate", "DataType", "to_contiguous", "pandas_to_numpy"
]

__author__ = _info.AUTHOR
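The two helpers exported above can be exercised roughly as follows;
`pandas_to_numpy` taking a dataframe is stated in the docstring, while the
column-list argument to `to_contiguous` is an assumption for illustration.

```python
# A hypothetical usage sketch; the to_contiguous column argument is an
# assumption, not a confirmed signature.
import pandas as pd
from PyPWA import pandas_to_numpy, to_contiguous

df = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": [4.0, 5.0, 6.0]})
structured = pandas_to_numpy(df)              # contiguous structured array
x, y = to_contiguous(structured, ["x", "y"])  # per-column contiguous arrays
```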
2 changes: 1 addition & 1 deletion PyPWA/info.py
@@ -91,7 +91,7 @@
__credits__ = ["Mark Jones"]

AUTHOR = "PyPWA Team and Contributors"
VERSION = "3.2.3"
VERSION = "3.3.0"
RELEASE = f"{VERSION}"
LICENSE = "GPLv3"
STATUS = "development"
11 changes: 8 additions & 3 deletions PyPWA/libs/file/project/_common.py
@@ -60,10 +60,14 @@ class ParticleLeaf:
def __init__(self, leaves: List[tables.Table]):
self.__leaves = leaves
self.__ids = [int(pid.name.split("_")[1]) for pid in leaves]
self.__charges = [int(pid.name.split("_")[2]) for pid in leaves]
# negative values can't be stored as metadata, so the charges were saved
# shifted by +1; shift them back into the correct range
self.__charges = [charge - 1 for charge in self.__charges]

particles = []
for pid in self.__ids:
particles.append(vectors.Particle(pid, 1))
for pid, charge in zip(self.__ids, self.__charges):
particles.append(vectors.Particle(pid, charge, 1))

self.__pool = vectors.ParticlePool(particles)

@@ -194,7 +198,8 @@ def __update_particle_pool(self, data: List[npy.ndarray]):
particle.e = array["e"]

def __replace_particle_pool(self, data: List[npy.ndarray]):
ps = [vectors.Particle(pid, d) for pid, d in zip(self.__ids, data)]
p_data = zip(self.__ids, self.__charges, data)
ps = [vectors.Particle(pid, charge, d) for pid, charge, d in p_data]
self.__pool = vectors.ParticlePool(ps)

@property
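The -1 shift applied on read above pairs with the +1 shift the writer
applies in main.py below. A self-contained sketch of that round trip, with
hypothetical helper names:

```python
# Hypothetical helpers mirroring the scheme in this commit: the charge is
# stored in the table name shifted by +1 so the name never contains a
# minus sign, then shifted back by -1 when the name is parsed.
def encode_name(index: int, particle_id: int, charge: int) -> str:
    return f"root{index}_{particle_id}_{charge + 1}"

def decode_charge(name: str) -> int:
    return int(name.split("_")[2]) - 1

name = encode_name(0, 9, -1)    # 'root0_9_0'
assert decode_charge(name) == -1
```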
10 changes: 6 additions & 4 deletions PyPWA/libs/file/project/main.py
@@ -103,7 +103,7 @@ def __init__(self, file: Union[Path, str], mode: str):
else:
self.__group = self.__file.create_group(
self.__file.root, "pypwa",
title=f"v:0; Created with PyPWA {_info.VERSION}"
title=f"v:1; Created with PyPWA {_info.VERSION}"
)

def __repr__(self):
@@ -221,10 +221,11 @@ def __reader_to_root_data(self, reader: ReaderBase, disable, desc):
def __parse_particle_pool(self, data: ReaderBase, disable: bool, desc):
# Initialize the tables for the particles
leaves = list()
for index, the_id in enumerate(data.fields):
# the charge from data.fields is already shifted into the 0-2 range
for index, (the_id, charge) in enumerate(data.fields):
leaves.append(
self.__file.create_table(
where=self.__folder, name=f"root{index}_{the_id}",
where=self.__folder, name=f"root{index}_{the_id}_{charge}",
description=self._PARTICLE, title=desc,
expectedrows=data.get_event_count()
)
@@ -266,7 +267,8 @@ def __parse_regular_data(self, data: ReaderBase, disable: bool, desc):
def __particle_pool_to_root(self, data: vectors.ParticlePool, desc):
for index, particle in enumerate(data.iter_particles()):
table = self.__file.create_table(
where=self.__folder, name=f"root{index}_{particle.id}",
where=self.__folder,
name=f"root{index}_{particle.id}_{particle.charge + 1}",
description=particle.data_frame, expectedrows=len(particle),
title=desc
)
95 changes: 70 additions & 25 deletions PyPWA/libs/fit/likelihoods.py
@@ -32,6 +32,15 @@
from PyPWA import info as _info
from PyPWA.libs import process

# Handle GPU calculation being available... or not.
GPU_AVAIL = True
try:
import cupy as cp
except ImportError:
GPU_AVAIL = False
cp = npy


__credits__ = ["Mark Jones"]
__author__ = _info.AUTHOR
__version__ = _info.VERSION
@@ -48,14 +57,20 @@ class NestedFunction(ABC):
likelihood.
Set USE_MP to false to execute on the main thread only; this is best
when using packages like numexpr.
Set USE_GPU to calculate the likelihood entirely on the GPU. This assumes
that all data returned from the NestedFunction is already in CuPy arrays.
When set to True, GPU acceleration is used instead of multiprocessing,
effectively setting USE_MP to false.
See Also
--------
FunctionAmplitude : For using the old amplitudes with PyPWA 3
"""

USE_MP = True
USE_GPU = False

def __call__(self, *args):
return self.calculate(*args)
@@ -157,11 +172,12 @@ class _GeneralLikelihood:
def __init__(self, amplitude: NestedFunction, num_of_process: int):
self._amplitude = amplitude
self._num_of_processes = num_of_process
self._single_process = amplitude.USE_GPU or not amplitude.USE_MP

def _setup_interface(
self, likelihood_data: Dict[str, Any], kernel: process.Kernel
):
if not self._amplitude.USE_MP or not self._num_of_processes:
if self._single_process or not self._num_of_processes:
[setattr(kernel, n, v) for n, v in likelihood_data.items()]
kernel.setup()
self._interface = kernel
@@ -305,19 +321,29 @@ def process(self, data: Any = False) -> float:
return self.__multiplier * self.__likelihood(intensity)

def __binned(self, results):
return ne.evaluate(
"sum(((results - binned)**2)/binned)", local_dict={
"results": results, "binned": self.__binned
}
)
if self.__amplitude.USE_GPU:
return cp.asnumpy(
cp.sum((results - self.binned)**2/self.binned)
)
else:
return ne.evaluate(
"sum(((results - binned)**2)/binned)", local_dict={
"results": results, "binned": self.binned
}
)

def __expected_errors(self, results):
return ne.evaluate(
"sum(((results - expected)**2)/errors)", local_dict={
"results": results, "expected": self.expected_values,
"errors": self.event_errors
}
)
if self.__amplitude.USE_GPU:
return cp.asnumpy(
cp.sum((results - self.expected_values)**2/self.event_errors)
)
else:
return ne.evaluate(
"sum(((results - expected)**2)/errors)", local_dict={
"results": results, "expected": self.expected_values,
"errors": self.event_errors
}
)


class LogLikelihood(_GeneralLikelihood):
@@ -462,20 +488,36 @@ def process(self, data: Any = False) -> float:
def __extended_likelihood(self, params):
data = self.__data_amplitude.calculate(params)
monte_carlo = self.__monte_carlo_amplitude.calculate(params)
likelihood = ne.evaluate(
"sum(qf * log(data))", local_dict={
"qf": self.quality_factor, "data": data
}
)
return likelihood - self.__generated * npy.sum(monte_carlo)

if self.__data_amplitude.USE_GPU:
likelihood = cp.asnumpy(
cp.sum(self.quality_factor * cp.log(data))
)
monte_carlo_sum = cp.asnumpy(cp.sum(monte_carlo))
else:
likelihood = ne.evaluate(
"sum(qf * log(data))", local_dict={
"qf": self.quality_factor, "data": data
}
)
monte_carlo_sum = npy.sum(monte_carlo)

return likelihood - self.__generated * monte_carlo_sum

def __log_likelihood(self, params):
data = self.__data_amplitude.calculate(params)
return ne.evaluate(
"sum(qf*binned*log(data))", local_dict={
"qf": self.quality_factor, "binned": self.binned, "data": data
}
)

if self.__data_amplitude.USE_GPU:
return cp.asnumpy(
cp.sum(self.quality_factor * self.binned * cp.log(data))
)

else:
return ne.evaluate(
"sum(qf*binned*log(data))", local_dict={
"qf": self.quality_factor, "binned": self.binned, "data": data
}
)


class EmptyLikelihood(_GeneralLikelihood):
Expand Down Expand Up @@ -536,4 +578,7 @@ def setup(self):
self.__amplitude.setup(self.data)

def process(self, data: Any = False) -> float:
return npy.sum(self.__amplitude.calculate(data))
if self.__amplitude.USE_GPU:
return cp.asnumpy(cp.sum(self.__amplitude.calculate(data)))
else:
return npy.sum(self.__amplitude.calculate(data))
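
The `cp = npy` alias at the top of this file is what keeps a CPU-only
install importable. The pattern in miniature, assuming nothing beyond
NumPy plus an optional CuPy:

```python
# The import fallback in miniature: when CuPy is absent, cp is aliased to
# numpy, so cp.sum and friends still resolve. cp.asnumpy exists only in
# real CuPy, which is why it is called only on USE_GPU code paths.
import numpy as npy

try:
    import cupy as cp
    GPU_AVAIL = True
except ImportError:
    cp = npy
    GPU_AVAIL = False

values = cp.arange(5.0)        # device array with CuPy, host array without
total = cp.sum(values)         # identical call either way
if GPU_AVAIL:
    total = cp.asnumpy(total)  # copy the device result back to the host
print(float(total))            # 10.0
```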
2 changes: 1 addition & 1 deletion PyPWA/libs/fit/minuit.py
@@ -83,7 +83,7 @@ def minuit(
that can be passed to iminuit, and how to use the resulting object
after a fit has been completed.
"""
settings["forced_parameters"] = parameters
settings["name"] = parameters
settings["errordef"] = set_up
translator = _Translator(parameters, likelihood)
optimizer = _iminuit.Minuit(translator, **settings)
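For context on the one-line change above: iminuit 2.x removed the
deprecated `forced_parameters` keyword in favor of `name`. A standalone
sketch of the modern call, with an illustrative cost function and
starting values:

```python
# Sketch of the iminuit 2.x style this hunk adopts; values illustrative.
from iminuit import Minuit

def chi2(par):
    a, b = par
    return (a - 1.0) ** 2 + (b - 2.0) ** 2

m = Minuit(chi2, (0.0, 0.0), name=("a", "b"))
m.errordef = Minuit.LEAST_SQUARES
m.migrad()
print(m.values["a"], m.values["b"])  # near 1.0 and 2.0
```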
2 changes: 1 addition & 1 deletion PyPWA/libs/process.py
@@ -44,7 +44,6 @@
"""

import copy
import time
from abc import ABC, abstractmethod
from enum import Enum
from multiprocessing import cpu_count, Pipe, Process
@@ -73,6 +72,7 @@
class Kernel(ABC):

PROCESS_ID: int = 0
USE_GPU: bool = False

"""Kernel that will be placed inside each spawned process
10 changes: 9 additions & 1 deletion PyPWA/libs/simulate.py
@@ -31,6 +31,12 @@
from PyPWA.libs.file import project
from PyPWA.libs.fit import likelihoods

try:
import cupy as cp
except ImportError:
cp = npy


__credits__ = ["Mark Jones"]
__author__ = _info.AUTHOR
__version__ = _info.VERSION
@@ -143,9 +149,11 @@ def _in_memory_intensities(
processes: int) -> npy.ndarray:

kernel = _Kernel(amplitude, params)
if not amplitude.USE_MP or not processes:
if not amplitude.USE_MP or not processes or amplitude.USE_GPU:
kernel.data = data
kernel.setup()
if amplitude.USE_GPU:
return cp.asnumpy(kernel.run()[1])
return kernel.run()[1]

interface = _Interface()
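Tying the pieces together: a hypothetical end-to-end simulation call using
the GPU amplitude sketched under the commit message. The
`monte_carlo_simulation` signature is an assumption based on the name
exported from `PyPWA/__init__.py`, not something this diff confirms.

```python
# Hypothetical end-to-end sketch; the monte_carlo_simulation signature is
# an assumption. Gauss2dGPU is the amplitude sketched earlier on this page.
import numpy as np
from PyPWA import monte_carlo_simulation

events = np.empty(1000, dtype=[("x", "f8"), ("y", "f8")])
events["x"] = np.random.uniform(-3, 3, 1000)
events["y"] = np.random.uniform(-3, 3, 1000)

params = {"x0": 0.0, "y0": 0.0, "sx": 1.0, "sy": 1.0}
rejection = monte_carlo_simulation(Gauss2dGPU(), events, params)
# With USE_GPU set, intensities are computed on the device and copied
# back to host memory, as _in_memory_intensities above shows.
```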