Commit
Use OpenMP with SIMD or LLAMA_INDEPENDENT_DATA
1 parent dba1158, commit 14080fc
Showing 3 changed files with 54 additions and 28 deletions.
@@ -0,0 +1,18 @@
# n-body simulation

This example shows an n-body simulation, comparing a LLAMA implementation with manually written versions.

## OpenMP

All kernels can be run either scalar or with multithreading and SIMD.
For the latter, enable `LLAMA_NBODY_OPENMP` in CMake
and specify the OpenMP thread affinity when executing:
`OMP_NUM_THREADS=x OMP_PROC_BIND=true OMP_PLACES=cores llama-nbody`,
where `x` is the number of cores on your system.

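As a rough illustration of what "multithreading+SIMD" can look like for an n-body update, the following sketch distributes the outer particle loop across threads and asks the compiler to vectorize the inner loop. The `Particles` struct, the constants, and the `update` function are assumptions made for this example and are not taken from the example's actual source or from LLAMA.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical struct-of-arrays storage, not LLAMA's actual data layout.
struct Particles
{
    std::vector<float> posX, posY, posZ;
    std::vector<float> velX, velY, velZ;
    std::vector<float> mass;
};

constexpr float timestep = 0.0001f; // assumed value
constexpr float eps2 = 0.01f; // softening term, avoids division by zero for i == j

// Adds to each particle's velocity the change caused by all particles.
// Without OpenMP enabled (e.g. no -fopenmp), the pragmas are ignored
// and the loops run scalar on a single thread.
inline void update(Particles& p)
{
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(p.mass.size());
#pragma omp parallel for
    for(std::ptrdiff_t i = 0; i < n; i++)
    {
        float dvx = 0, dvy = 0, dvz = 0;
#pragma omp simd reduction(+ : dvx, dvy, dvz)
        for(std::ptrdiff_t j = 0; j < n; j++)
        {
            const float dx = p.posX[j] - p.posX[i];
            const float dy = p.posY[j] - p.posY[i];
            const float dz = p.posZ[j] - p.posZ[i];
            const float distSqr = eps2 + dx * dx + dy * dy + dz * dz;
            const float invDist = 1.0f / std::sqrt(distSqr); // sqrt and division, no rsqrt
            const float s = p.mass[j] * invDist * invDist * invDist * timestep;
            dvx += dx * s;
            dvy += dy * s;
            dvz += dz * s;
        }
        // Each thread writes only to its own index i, so no synchronization is needed.
        p.velX[i] += dvx;
        p.velY[i] += dvy;
        p.velZ[i] += dvz;
    }
}
```

Each iteration of the outer loop only reads positions and masses and writes to its own velocity entry, which is what makes the loop safe to parallelize in this sketch.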
## rsqrt

The use of the `rsqrt` instruction is disabled by default,
which is required for comparable benchmarks (so all versions use sqrt and division).
You can enable `rsqrt` by setting the appropriate variable in the C++ source file
and compiling with `-ffast-math`.
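
To illustrate why the two code paths are not directly comparable, here is a minimal sketch of a compile-time switch between an exact and an approximate reciprocal square root. The names `allowRsqrt`, `approxRsqrt`, and `invSqrt` are hypothetical, and the approximation is a software stand-in (a bit-level initial guess plus one Newton-Raphson step), not the hardware `rsqrt` instruction the example itself would use.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical switch; the actual variable in the n-body source may be named differently.
constexpr bool allowRsqrt = false;

// Software stand-in for a hardware rsqrt instruction (an assumption for illustration):
// a bit-level initial guess refined by one Newton-Raphson step.
inline float approxRsqrt(float x)
{
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));
    bits = 0x5f3759dfu - (bits >> 1);
    float y;
    std::memcpy(&y, &bits, sizeof(y));
    return y * (1.5f - 0.5f * x * y * y);
}

// Returns 1 / sqrt(x), either via sqrt and division or via the approximation.
template<bool UseRsqrt = allowRsqrt>
inline float invSqrt(float x)
{
    return UseRsqrt ? approxRsqrt(x) : 1.0f / std::sqrt(x);
}
```

With the default setting, `invSqrt(x)` uses sqrt and division; `invSqrt<true>(x)` trades some accuracy for fewer expensive operations, so results and timings of the two variants should not be compared against each other.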