calling iter twice messes up dataloaders with queues #19427
Comments
This condition here is meant to prevent the pytorch-lightning/src/lightning/pytorch/loops/training_epoch_loop.py Lines 169 to 171 in 47c8f4c
But it isn't. The problem is that the fit loop sets pytorch-lightning/src/lightning/pytorch/loops/fit_loop.py Lines 123 to 128 in 47c8f4c
This is tricky to solve @carmocca. The logic probably needs to be lifted up into the fit loop before |
I didn't look too deeply. Couldn't we check |
The problem in the |
Also, since this has appeared twice now, and it's the sort of bug that is hard to track down, could we add a test like my example?
Hey all, I went ahead and created the test for this in #20705. Would love to hear any feedback :)
I've been looking at this issue a bit, and I could use some help understanding what's needed. I think "calling iter twice" isn't necessarily a problem, since any number of functions call it and the underlying data from the queue looks correct. The actual issues seem to be: 1) recognizing the empty queue as the end of the epoch and 2) aligning the epochs with the queue data. I.e., I would say, "this actually looks okay, but we should end processing smoothly when the queue is exhausted." But I could be wrong?
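To make the discussion above concrete, here is a minimal sketch in plain Python. It is not taken from this issue and does not use Lightning: `QueueIterable`, `PrefetchingIterator`, and `make_queue` are hypothetical names, and the eager one-item prefetch is an assumption standing in for whatever fetching the real loops do. It shows how a second `iter()` call can silently drop an item when the underlying source is a shared queue, and how an empty queue can be treated as the end of an epoch.

```python
import queue

_SENTINEL = object()  # marks "no item available"


class QueueIterable:
    """Hypothetical iterable that drains a shared queue."""

    def __init__(self, q):
        self.q = q

    def __iter__(self):
        while True:
            try:
                yield self.q.get_nowait()
            except queue.Empty:
                return  # treat an empty queue as the end of the epoch


class PrefetchingIterator:
    """Hypothetical fetcher that eagerly pulls one item when created."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._next = next(self._it, _SENTINEL)  # prefetch consumes an item

    def __iter__(self):
        return self

    def __next__(self):
        if self._next is _SENTINEL:
            raise StopIteration
        item = self._next
        self._next = next(self._it, _SENTINEL)
        return item


def make_queue(n):
    """Build a queue holding the integers 0..n-1."""
    q = queue.Queue()
    for i in range(n):
        q.put(i)
    return q


# A single iterator sees every item.
once = list(PrefetchingIterator(QueueIterable(make_queue(5))))

# Creating a second iterator over the same queue-backed iterable:
# the first one has already prefetched item 0, which is then lost.
data = QueueIterable(make_queue(5))
discarded = PrefetchingIterator(data)
twice = list(PrefetchingIterator(data))

print(once)   # [0, 1, 2, 3, 4]
print(twice)  # [1, 2, 3, 4]
```

Under these assumptions the data that does arrive is in the right order, which matches the observation that the queue contents "look correct"; the loss only shows up as a missing item at the start.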
Bug description
This bug has reappeared #18414
We now call iter() twice in different places:
What version are you seeing the problem on?
v2.1
How to reproduce the bug
Error messages and logs
relevant logs are:
Environment
lightning==2.1.4
More info
No response
cc @justusschock @awaelchli @carmocca