Add connectionist temporal classification (CTC) loss algorithm #11240
Hi there, I just wanted to thank the maintainers for their hard work on the project, and I wanted to let them know that I submitted a few pull requests in the last few days. I would really appreciate it if one of them could take a look at them and let me know if there are any issues or if there's anything that needs to be fixed before they can be merged. Thanks again for all your hard work! @cclauss
Calculate the connectionist temporal classification (CTC) loss between the given
log probabilities and targets.

CTC loss is used in speech recognition, handwriting recognition and other sequence
problems. It's used to get around not knowing the alignment between the input and
the output.

References:
- https://en.wikipedia.org/wiki/Connectionist_temporal_classification
- https://pytorch.org/docs/stable/generated/torch.nn.CTCLoss.html
A couple suggestions to clarify the documentation:

- PyTorch cites Graves et al for its implementation. Since your implementation is also based on this paper, please add it as a reference as well.

- Please add a short paragraph explaining how the loss is actually calculated. This should explain some general questions about your variables (What is `blank`? What is `alpha`, and why is it calculated using DP?) so that the reader understands what is being calculated. I ask for this because this repository is meant for educational purposes, so we want readers to understand how and why the implementation works.

  Also, in your implementation you use `np.logaddexp` for log-probabilities when calculating `alpha` rather than calculating probabilities directly. Since this differs from the definitions in the Graves et al paper, please be sure to note this implementation detail in your explanation.
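
To make the second point concrete, here is a rough sketch of the kind of explanation-plus-code being asked for. It is not the code from this PR: the function name `ctc_loss_sketch`, its signature, and the choice of `blank=0` are assumptions made for illustration. The idea is that `blank` is an extra "no label" symbol inserted before, between, and after the target labels, and `alpha[t, s]` is the log of the total probability of every alignment of the first `t + 1` frames that ends in position `s` of that blank-extended target. Each `alpha[t, s]` depends only on at most three entries from frame `t - 1`, which is why DP avoids enumerating all alignments. Sums of probabilities become `np.logaddexp` because everything is kept in log space to avoid underflow on long sequences.

```python
import numpy as np


def ctc_loss_sketch(log_probs: np.ndarray, targets: list[int], blank: int = 0) -> float:
    """Illustrative CTC negative log-likelihood for a single sequence.

    log_probs: array of shape (num_frames, num_classes) holding per-frame
    log-probabilities. targets: label indices, none equal to `blank`.
    """
    num_frames = log_probs.shape[0]

    # Extended target: a blank before, between, and after every label.
    extended = [blank]
    for label in targets:
        extended += [label, blank]
    num_states = len(extended)

    # alpha[t, s]: log-probability of all alignments of frames 0..t ending in state s.
    alpha = np.full((num_frames, num_states), -np.inf)
    alpha[0, 0] = log_probs[0, blank]
    if num_states > 1:
        alpha[0, 1] = log_probs[0, extended[1]]

    for t in range(1, num_frames):
        for s in range(num_states):
            total = alpha[t - 1, s]  # stay in the same state
            if s >= 1:
                # advance from the previous state
                total = np.logaddexp(total, alpha[t - 1, s - 1])
            if s >= 2 and extended[s] != blank and extended[s] != extended[s - 2]:
                # skip over a blank, allowed only between two different labels
                total = np.logaddexp(total, alpha[t - 1, s - 2])
            alpha[t, s] = total + log_probs[t, extended[s]]

    # A valid alignment ends on the last label or on the trailing blank.
    log_likelihood = alpha[-1, -1]
    if num_states > 1:
        log_likelihood = np.logaddexp(log_likelihood, alpha[-1, -2])
    return float(-log_likelihood)
```

In the Graves et al formulation the same recursion is written with products and sums of probabilities; replacing them with additions and `np.logaddexp` gives the same quantity in log space, which is the implementation detail worth calling out in the docstring.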