Problem
When testing LEM on a single RTX 4090, I cannot find a combination of num_cpus and orientation_batch_size that quite saturates the GPU, based on the GPU utilization reported by nvidia-smi.
Best config so far
- orientation_batch_size=48
- num_cpus=[ this doesn't seem to have an effect ]
I'm using the prebuilt version installed with pip in a Python 3.10 venv, as suggested on the homepage under the pre-packaged releases.
- As an aside, it would be cool if the example code blocks were copyable. There are many mechanisms to do this.
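For instance, if the docs happen to be built with Sphinx (an assumption on my part, not something I've verified), the sphinx-copybutton extension adds a copy button to every code block with a small change to conf.py:

```python
# conf.py -- sketch assuming a Sphinx-built docs site (hypothetical here).
# Requires: pip install sphinx-copybutton
extensions = [
    # ... existing extensions ...
    "sphinx_copybutton",
]
# Strip leading prompts (">>> " and "$ ") so only the command itself is copied.
copybutton_prompt_text = r">>> |\$ "
copybutton_prompt_is_regexp = True
```

Other doc generators (MkDocs Material, Docusaurus) ship equivalent copy buttons out of the box.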

Questions
- What is the orientation_batch_size you typically recommend?
- Is this just sending a stack of images off in a batch to PyTorch? If so, does that make the answer to the previous question effectively "as many as will fit in memory"?
- Is num_cpus actually doing anything right now? It doesn't seem to be, based on either my hardware resource monitors or what I can make out from the code.
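If the answer really is "as many as will fit in memory", one practical way to tune orientation_batch_size is a double-then-bisect search. Below is a minimal sketch; the fits() callback is a stand-in I made up, which in practice would run one forward pass at that batch size and catch torch.cuda.OutOfMemoryError:

```python
def find_max_batch_size(fits, start=1, limit=4096):
    """Find the largest batch size for which fits(n) is True.

    fits(n) is assumed to be monotone: once a batch size fails,
    all larger ones fail too. In a real setup it would attempt a
    forward pass with n images and return False on a CUDA OOM.
    """
    # Phase 1: exponential growth to bracket the failure point.
    lo, n = 0, start
    while n <= limit and fits(n):
        lo = n
        n *= 2
    hi = min(n, limit + 1)
    # Phase 2: binary search between last success and first failure.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fits(mid):
            lo = mid
        else:
            hi = mid
    return lo

# Stand-in memory model: pretend batches of up to 112 images fit.
print(find_max_batch_size(lambda n: n <= 112))  # → 112
```

Note that filling memory maximizes occupancy but not necessarily throughput; it's worth timing a few sizes below the limit as well.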
Any tips or tricks on how to get the most out of a given hardware setup would be very helpful. Thanks!