Comments

When I first looked at the source code, I thought it must be slow, so I planned to modify it myself to see whether I could make it much faster and give it a clearer structure. My first attempt, which used class objects, failed because I had not realized how slow handle objects are, especially in nested loops. Of course I could have tuned and improved that code, but in the end I decided to use the simpler but similar struct type instead.

Lesson from failure

Why was my first OOP version so slow? Checking the restrictions on class properties takes time, and accessing and overwriting properties takes significant time.
Using handle classes may not be a good idea if you want performance. I tried them because they seemed like a clever way of mimicking pointers for passing arguments, but they turned out to be extremely inefficient.
So, let's forget about OOP in MATLAB simulations... which is sad.
I thought it would be easy to convert this into GPU-enabled code, but obviously I was wrong. Because functions such as the velocity update loop over the particles one by one, turning the velocities and positions into gpuArray objects doesn't help much. It would only pay off if I were doing matrix operations!
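
To make the last point concrete, here is a minimal sketch of a velocity update written both ways: once as a per-particle loop and once as whole-array operations on a plain struct. All names (`particles`, `E`, `qm`, `dt`) and the linear-interpolation scheme are assumptions made for illustration; this is not the code from the repository.

```matlab
% Hypothetical setup: names and sizes are illustrative, not taken from the repository.
nGrid = 64;                              % number of grid points (periodic domain of length nGrid)
nPart = 1e4;                             % number of particles
rng(0);                                  % fixed seed so both versions see the same data
particles.x = nGrid * rand(nPart, 1);    % positions in [0, nGrid)
particles.v = randn(nPart, 1);           % velocities
E  = sin(2*pi*(0:nGrid-1)'/nGrid);       % electric field sampled at the grid points
qm = -1.0;                               % charge-to-mass ratio
dt = 0.01;                               % time step

% Loop version: one particle at a time.
vLoop = particles.v;
for k = 1:nPart
    i0 = floor(particles.x(k));          % 0-based index of the grid point to the left
    w  = particles.x(k) - i0;            % linear interpolation weight
    iL = i0 + 1;                         % 1-based index of the left grid point
    iR = mod(i0 + 1, nGrid) + 1;         % 1-based index of the right point (periodic wrap)
    vLoop(k) = vLoop(k) + qm * dt * ((1 - w) * E(iL) + w * E(iR));
end

% Vectorized version: the same arithmetic applied to whole arrays.
i0 = floor(particles.x);
w  = particles.x - i0;
iL = i0 + 1;
iR = mod(i0 + 1, nGrid) + 1;
vVec = particles.v + qm * dt * ((1 - w) .* E(iL) + w .* E(iR));

% Largest difference between the two versions.
maxDiff = max(abs(vLoop - vVec));
```

Only the second form consists of array expressions, which is the kind of code that gpuArray inputs can be expected to help with; the loop form touches one element at a time regardless of where the data lives.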

Improvements

Removed all the global variables and replaced them with structs;
Put all the functions into the private directory;
Improved the parameter reading function, using textscan instead of textread and strread, together with str2double and strtrim;
Clearer comment style;
Renamed almost all the variables and functions to follow naming standards;
Replaced the old GUI designed with GUIDE by a new one built with App Designer;
Rewrote the kernels of the velocity update and the current calculation, replacing the for loops with matrix operations. For the velocity update, the results are exactly the same. For the current calculation, I use the MATLAB built-in function accumarray, which gives currents that differ slightly, at the level of machine precision, at each timestep. I don't know the exact reason, but I think it is due to multithreading inside the built-in function, which changes the order of the reduction. Both new functions are about 5 to 10 times faster.
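
The accumarray approach from the last item can be sketched as follows. This is a simplified linear-weight deposition of q*v onto a periodic grid, written only to show the loop-to-accumarray transformation; the variable names and the weighting scheme are assumptions and do not reproduce the actual current calculation in KEMPO1.

```matlab
% Hypothetical setup: illustrative names and sizes only.
nGrid = 64;                         % number of grid points (periodic)
nPart = 1e4;                        % number of particles
rng(1);                             % fixed seed for reproducibility
x = nGrid * rand(nPart, 1);         % positions in [0, nGrid)
v = randn(nPart, 1);                % velocities
q = -1.0;                           % particle charge

i0 = floor(x);                      % 0-based index of the left grid point
w  = x - i0;                        % linear weight toward the right grid point
iL = i0 + 1;                        % 1-based left index
iR = mod(i0 + 1, nGrid) + 1;        % 1-based right index (periodic wrap)

% Loop version: scatter each particle's contribution one by one.
JLoop = zeros(nGrid, 1);
for k = 1:nPart
    JLoop(iL(k)) = JLoop(iL(k)) + q * v(k) * (1 - w(k));
    JLoop(iR(k)) = JLoop(iR(k)) + q * v(k) * w(k);
end

% accumarray version: sum all contributions that share a grid index.
JAcc = accumarray([iL; iR], [q * v .* (1 - w); q * v .* w], [nGrid, 1]);

% The two results agree up to rounding; the summation order is not the same.
relDiff = max(abs(JLoop - JAcc)) / max(abs(JLoop));
```

Because floating-point addition is not associative, any change in the order in which contributions are summed can change the last few bits of the result, which is consistent with the machine-precision differences described above.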

Side notes

The JIT is smart enough to handle ^2 as a simple multiplication. In fact, what really happens underneath is that ^ is converted into multiplications during the compilation phase, as is the case in Julia for powers less than or equal to 3.
rotate3d should be set only once, at the beginning; otherwise it significantly affects performance.
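
The note on ^2 can be checked on your own machine with a small timing sketch like the one below. It uses the built-in timeit function; the array size is an arbitrary choice, and the timings will depend on the MATLAB release and hardware, so no expected numbers are given.

```matlab
% Compare element-wise squaring written as a power and as a multiplication.
rng(2);                              % fixed seed for reproducibility
x = rand(1e6, 1);                    % arbitrary test vector

fPow = @() x.^2;                     % power form
fMul = @() x.*x;                     % explicit multiplication

tPow = timeit(fPow);                 % typical run time of each form, in seconds
tMul = timeit(fMul);
fprintf('x.^2: %.3g s, x.*x: %.3g s\n', tPow, tMul);

% Largest difference between the two results.
maxDiff = max(abs(x.^2 - x.*x));
```

If the two timings are close, that supports the observation that writing ^2 costs nothing extra compared with an explicit multiplication.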

Copyrights

The original KEMPO1 version can be downloaded from here.
