Has anyone recorded the loss curve during ClipCap training? I used ClipCap as a baseline in my experiments and found that the loss curve was always far from ideal. I then measured the loss curve during ClipCap training itself and found that it was not ideal either.
