Could you add support for a configurable batch size, so that multiple frames are processed simultaneously instead of one at a time? GPU utilization and VRAM usage both stay quite low while this is running, so batching could speed it up significantly.
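
To illustrate the idea, here is a rough sketch of the kind of batching I have in mind. It is only an assumption about how this could be structured: `batched`, `process_in_batches`, and `process_batch` are hypothetical names, not taken from this project's code, and `process_batch` stands in for whatever call runs the model on a group of frames.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of up to batch_size items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def process_in_batches(
    frames: Iterable[T],
    process_batch: Callable[[List[T]], List[R]],
    batch_size: int = 1,
) -> List[R]:
    """Run process_batch on groups of frames and collect results in order.

    process_batch is an assumed stand-in for the model call; it should
    return one result per input frame.
    """
    results: List[R] = []
    for batch in batched(frames, batch_size):
        results.extend(process_batch(batch))
    return results
```

With `batch_size=1` this behaves like the current frame-by-frame loop, so it could stay the default and larger values would be opt-in for users with spare VRAM. The last batch may be smaller than `batch_size` when the frame count is not an exact multiple.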