ADD RWKV7 #2421
Summary of Changes
Hello @pass-lin, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances the
Highlights
|
Code Review
This PR introduces the RWKV-7 model, a powerful RNN architecture, to keras_hub. The contribution is significant and includes the backbone, tokenizer, preprocessor, an incomplete task model, and a checkpoint conversion script. The implementation follows the modular structure of keras_hub.
However, there are several critical issues that must be addressed before this PR can be merged:
- Missing Tests: The PR lacks unit tests for all new components. According to the contribution guidelines, testing is a mandatory requirement.
- Incomplete CausalLM Task: The RWKV7CausalLM task model is a stub with TODOs, making it non-functional for generation.
- Critical Bugs: There are critical bugs in the tokenizer and preprocessor implementations that will cause runtime errors.
- Style Guide Violations: There are numerous style guide violations, including a filename typo, missing docstrings, and inconsistencies with the recommended model input structure.
I've left detailed comments on these issues. Once these are resolved, this will be a great addition to the library.
|
@divyashreepathihalli @mattdangerw Could anybody review my code?
Code Review
This pull request introduces a comprehensive implementation of the RWKV-7 model, a modern RNN architecture, into keras_hub. The contribution is well-structured, following the repository's modular design with a backbone, causal LM task, preprocessor, tokenizer, and a checkpoint conversion script. The code is generally of high quality.
My review highlights a few areas for improvement. There are critical bugs in the tokenizer's asset loading and saving logic that need to be addressed. The custom layers in rwkv7_layer.py have inconsistent return signatures, which could lead to runtime errors and makes the code harder to maintain. Additionally, the test files do not use the standardized helper methods from TestCase as required by the repository's style guide, which is a significant deviation. I have also pointed out some minor typos and opportunities for code clarification.
Overall, this is an excellent and valuable addition. Addressing the feedback will improve the robustness, correctness, and maintainability of the new model.
|
@mattdangerw @divyashreepathihalli @sachinprasadhs
|
Apologies for the delay in the review. I'm taking a look at this and will add my comments.
|
/gemini review
Code Review
This pull request introduces the RWKV-7 model, a modern RNN architecture, to keras_hub. The implementation is comprehensive, covering the backbone, causal LM task, preprocessor, tokenizer, and a checkpoint conversion script. The code is well-structured and follows the modular design principles of the repository.
My review focuses on ensuring adherence to the repository's style guide, particularly regarding testing practices and code style conventions. I've identified several areas for improvement:
- The testing for the new components should be updated to use the standardized helper methods from the base TestCase. Some tests also contain incorrect assertions.
- There are a few deviations from the coding style, such as the use of type hints in function signatures and a few hardcoded values that could be made more flexible.
- The backbone implementation should be updated to accept a padding_mask as input, aligning with the repository's conventions.
Addressing these points will improve the consistency, correctness, and maintainability of the new model. Overall, this is a great contribution, adding a powerful and interesting architecture to the library.
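To make the padding_mask point concrete, here is a minimal sketch of why a recurrent backbone benefits from receiving the mask alongside the token IDs: padded positions should not update the running state. The function `toy_recurrent_backbone`, the `decay` parameter, and the state update rule are all hypothetical stand-ins chosen for illustration; this is not the RWKV-7 recurrence and not the code in this PR. Only the dictionary keys `token_ids` and `padding_mask` mirror the input convention the review refers to.

```python
def toy_recurrent_backbone(inputs, decay=0.5):
    """Toy stand-in for a recurrent backbone (hypothetical, not RWKV-7).

    `inputs` is a dict with "token_ids" and "padding_mask", each a list of
    equal-length rows. Positions where the mask is 0 leave the running state
    untouched and produce a zero output.
    """
    outputs = []
    for ids, mask in zip(inputs["token_ids"], inputs["padding_mask"]):
        state, row = 0.0, []
        for token_id, keep in zip(ids, mask):
            if keep:
                # Real token: fold it into the running state.
                state = decay * state + float(token_id)
                row.append(state)
            else:
                # Padding: skip the update so it cannot leak into the state.
                row.append(0.0)
        outputs.append(row)
    return outputs


# Two sequences of different lengths, right-padded to length 5.
batch = {
    "token_ids": [[3, 1, 4, 0, 0], [2, 7, 1, 8, 2]],
    "padding_mask": [[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]],
}
hidden = toy_recurrent_backbone(batch)
```

Without the mask, the padding ID 0 in the first row would still decay the state at the last two positions, so the result would depend on how much padding the batch happened to contain.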
Finally, please allow me to add one more point. From February 26, 2025, to today, RWKV-LM has gained 1,000 stars (from 13.1k to 14.1k). It has also increased by 100 stars from October 16 to today. You can see this trend of star growth at the following link. I believe this demonstrates that RWKV is a very popular and highly active community.
|
BTW, RWKV is also mentioned here by the Linux Foundation.
|
Thank you @pass-lin!! |
I want to know if the Keras team thinks RWKV is suitable to be merged into Keras Hub. |
|
@pass-lin, we can add it, since a lot of effort has already gone into it.
There are still many unresolved comments. Please go through them carefully and let us know once this is ready for review again.
Also, please match the coding style to the standard Keras Hub implementation.
Please refer to our model and contribution guidelines.
|
Following your review, I readjusted rwkv7_tokenizer. Please note that I retained the recursion, because rashly modifying the recursive code written by the original RWKV author would be too bug-prone. Given that the trie's maximum depth for the RWKV vocabulary is 80, a stack overflow is unlikely, so I believe the recursion should be kept.
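The recursion-depth argument can be checked with a small sketch. The code below is a simplified, hypothetical byte-level trie with a recursive longest-match lookup. It is not the tokenizer in this PR or the original RWKV implementation, and the toy vocabulary is invented for illustration. It shows that the recursion depth of a lookup is bounded by the trie depth, which can then be compared against Python's recursion limit (1000 by default in CPython).

```python
import sys


class TrieNode:
    """One trie node; `token_id` is set on nodes that end a token."""

    def __init__(self):
        self.children = {}
        self.token_id = None


def insert(root, token, token_id):
    node = root
    for byte in token:
        node = node.children.setdefault(byte, TrieNode())
    node.token_id = token_id


def longest_match(node, data, pos, best=None):
    """Recursively walk the trie from `pos`; return (end, token_id) or None."""
    if node.token_id is not None:
        best = (pos, node.token_id)
    if pos < len(data) and data[pos] in node.children:
        # One recursive call per matched byte, so depth <= trie depth.
        return longest_match(node.children[data[pos]], data, pos + 1, best)
    return best


def max_depth(node):
    """Length of the longest root-to-leaf path, i.e. the longest token."""
    if not node.children:
        return 0
    return 1 + max(max_depth(child) for child in node.children.values())


def encode(root, data):
    """Greedy longest-match tokenization of a bytes object."""
    ids, pos = [], 0
    while pos < len(data):
        match = longest_match(root, data, pos)
        if match is None:
            raise ValueError(f"no token matches at byte offset {pos}")
        pos, token_id = match
        ids.append(token_id)
    return ids


# Toy vocabulary (hypothetical, not the RWKV vocabulary).
vocab = {b"a": 0, b"ab": 1, b"abc": 2, b"b": 3, b"c": 4}
root = TrieNode()
for token, token_id in vocab.items():
    insert(root, token, token_id)

ids = encode(root, b"abcab")
depth = max_depth(root)
recursion_is_safe = depth < sys.getrecursionlimit()
```

Running the same `max_depth` check against the real vocabulary would confirm the figure of 80 quoted above and how much headroom remains under the interpreter's limit.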
|
Okay. Also, once you address a comment, please mark it as resolved.
I think all the current issues have been resolved, and we can proceed to the next step. |
|
Each review comment has a "Resolve conversation" button. Could you please click it for every comment that has been resolved? There are 100-plus comments, and many are still showing as open.
OK, I have resolved all conversations.
|
@sachinprasadhs Can you review the current code?
RWKV7 is one of the strongest RNN models available today, and we now provide a full implementation for it in keras_hub.
📚 References
🔗 Pre-trained Checkpoints (ModelScope)
Numerical-verification and Inference Example notebook
This is the first modern RNN architecture in keras_hub. With the resurgence of recurrent models, more pre-trained RNN backbones will follow; hence this PR also serves as a reference implementation for future work.
Current progress