Slow initial runs followed by blazing fast computation #8415

Closed
h-OUS-e opened this issue Oct 12, 2024 · 10 comments

@h-OUS-e

h-OUS-e commented Oct 12, 2024

Hello,

I am having this issue where I run an initial computation, basic multiplication of matrices using TFJS. On the first run, the computation is extremely slow, taking 800ms. On the second run, it takes 5ms to run. I mocked up a simpler version for debugging and I get ~90ms on the first run and 1ms on subsequent runs. The JS code is here:

document.getElementById("testOptim").addEventListener('click', function () {
  console.time('testOptim: ');
  let a = tf.randomNormal([15, 3]);
  let b = tf.randomNormal([15, 15]);
  let c = tf.matMul(b, a);
  console.timeEnd('testOptim: ');
});

I am using the latest tfjs build: <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@latest/dist/tf.min.js"></script>

Based on previous reported issues, I suspect the problem is with how initial caching of kernels is slow, but I couldn't find a good solution to fix it. Any directions or thoughts would be appreciated!
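
A note on the measurement itself: with a GPU backend, `tf.matMul` returns its tensor synchronously while the computation may still be pending, so `console.timeEnd` can fire before the result is actually ready. The following is a minimal sketch of a timing helper that waits for completion. The names `timeAsync` and `measureRuns` are hypothetical, and the TensorFlow.js usage in the trailing comment assumes `tf` is loaded globally, as in the script tag above.

```javascript
// Times one call of an async function; resolves after its promise settles.
async function timeAsync(fn) {
  const start = performance.now();
  await fn();
  return performance.now() - start;
}

// Runs fn `runs` times in sequence and returns the elapsed milliseconds of each run.
async function measureRuns(fn, runs) {
  const timings = [];
  for (let i = 0; i < runs; i++) {
    timings.push(await timeAsync(fn));
  }
  return timings;
}

// Usage with TensorFlow.js (assumes a global `tf`):
// const timings = await measureRuns(async () => {
//   const c = tf.tidy(() => tf.matMul(tf.randomNormal([15, 15]), tf.randomNormal([15, 3])));
//   await c.data();   // resolves once the values have been computed and read back
//   c.dispose();
// }, 3);
// console.log(timings); // the first entry includes any one-time setup cost
```

Awaiting `data()` inside the timed function means each entry covers the full round trip rather than only the time needed to enqueue the operation.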

@shmishra99
Contributor

Hi @h-OUS-e,

I was able to reproduce the behavior with the code you shared. The initial slowness is due to the loading and compiling of the kernel. Once the kernel is loaded and compiled, it is cached for later use.

Code:

console.time('testOptim1: ');
let a = tf.randomNormal([15, 3]);
let b = tf.randomNormal([15, 15]);
let c = tf.matMul(b, a);
console.timeEnd('testOptim1: ');

console.time('testOptim: 2 ');
a = tf.randomNormal([16, 4]);
b = tf.randomNormal([16, 16]);
c = tf.matMul(b, a);
console.timeEnd('testOptim: 2 ');

console.time('testOptim: 3 ');
a = tf.randomNormal([17, 4]);
b = tf.randomNormal([17, 17]);
c = tf.matMul(b, a);
console.timeEnd('testOptim: 3 ');

console.time('testOptim: 4 ');
a = tf.randomNormal([18, 5]);
b = tf.randomNormal([18, 18]);
c = tf.matMul(b, a);
console.timeEnd('testOptim: 4 ');

Results:

image

From the results, you can see that the first run (testOptim1:) takes longer than the subsequent runs (testOptim: 2, testOptim: 3, and testOptim: 4). This is due to the kernel loading and compiling; the later multiplications take less time because the compiled kernel is cached.
To improve performance, you could consider warming up the kernel during tfjs initialization by running the operation once on dummy tensors.

Let me know if this helps!

Thank you!

@h-OUS-e
Author

h-OUS-e commented Oct 17, 2024

Thank you @shmishra99. Is the loading and compiling of the kernel always slow? Preloading dummy tensors on initialization seemed to solve the issue. But for my main application, the range of possible tensors to be multiplied with each other is too large, and loading them all would make the initial loading of the page very slow. I basically need to load every possible dummy tensor, where the largest would be of a shape of [200,200] or more. Is there a better solution?

For now, I set the backend to the cpu using tf.setBackend('cpu');, and to my surprise it has been very fast, which appears to work for the needs of my application. But beyond that, I remain curious if there is a way to still utilize the GPU without the slow loading and compiling time of the kernels?

Thanks!
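
For reference, `tf.setBackend` returns a promise, so it is safest to await it (and `tf.ready()`) before running any ops. Below is a minimal sketch of a helper that tries backends in a preferred order. The name `selectBackend` is hypothetical, and `tf` is passed in as a parameter so the snippet stays self-contained.

```javascript
// Tries each backend name in order and returns the first one that initializes.
// `tf` is the TensorFlow.js namespace object; `preferred` is an array of names.
async function selectBackend(tf, preferred) {
  for (const name of preferred) {
    const ok = await tf.setBackend(name); // resolves to false if initialization fails
    if (ok) {
      await tf.ready();
      return tf.getBackend();
    }
  }
  throw new Error('No requested backend could be initialized');
}

// Usage (assumes a global `tf`): prefer the CPU backend for small matrices.
// const active = await selectBackend(tf, ['cpu', 'webgl']);
```

Whether CPU or GPU is faster depends on matrix size and hardware, so the order in the usage line is an assumption to verify by measurement.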

@shmishra99
Contributor

@h-OUS-e , As per my understanding, kernels will need to be loaded at some point, whether it's during initial page load or when tensors are executed. While the CPU backend might perform well in terms of kernel loading, GPU backends are generally faster for tensor operations.
Thank You!!

@h-OUS-e
Author

h-OUS-e commented Oct 18, 2024 via email

@shmishra99
Contributor

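@h-OUS-e , To warm up the kernel at startup, you can use code along the following lines (n is a placeholder; 200 is taken from the largest shape you mentioned):

tf.ready().then(async () => {
  const n = 200; // placeholder: largest matrix dimension you expect
  // Run the real operation once so its kernel is compiled and cached.
  const warmup = tf.tidy(() => tf.matMul(tf.randomNormal([n, n]), tf.randomNormal([n, n])));
  await warmup.data(); // wait until the computation has finished
  warmup.dispose();
});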

However, I will say that this approach is not highly recommended, as it may slow down the initial page loading time.

Please let me know if this helps. Thank you!
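
One possible way to soften the page-load cost is to defer the warm-up until the browser reports idle time. This is a sketch under assumptions, not a confirmed fix: `runWhenIdle` is a hypothetical helper, and the kernel compilation will likely still block the main thread for its duration once it starts. The aim is only to move it off the critical page-load path.

```javascript
// Schedules `task` for when the browser is idle, falling back to a timer
// in environments where requestIdleCallback is unavailable.
// Returns a promise that settles with the task's result.
function runWhenIdle(task, fallbackDelayMs = 0) {
  return new Promise((resolve, reject) => {
    const run = () => Promise.resolve().then(task).then(resolve, reject);
    if (typeof requestIdleCallback === 'function') {
      requestIdleCallback(run);
    } else {
      setTimeout(run, fallbackDelayMs);
    }
  });
}

// Usage (assumes a global `tf`):
// runWhenIdle(async () => {
//   await tf.ready();
//   const w = tf.tidy(() => tf.matMul(tf.randomNormal([200, 200]), tf.randomNormal([200, 200])));
//   await w.data();
//   w.dispose();
// });
```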

@h-OUS-e
Author

h-OUS-e commented Oct 19, 2024

Thank you @shmishra99. It helps but as you mentioned, produces slow initial page loading times. Are there other solutions? I checked tf.env().set('ENGINE_COMPILE_ONLY', true);, but I couldn't get it to produce meaningful results.

@shmishra99
Contributor

Hi @h-OUS-e ,

Sorry for the late response. As I understand it, loading and compiling the kernel will take some time in all cases, so the app will experience some latency during page load or afterward.

Thank You!!


github-actions bot commented Nov 5, 2024

This issue has been marked stale because it has no recent activity since 7 days. It will be closed if no further activity occurs. Thank you.

github-actions bot added the stale label Nov 5, 2024

This issue was closed due to lack of activity after being marked stale for past 7 days.
