Hi LCM team,
I've been exploring the large_concept_model repository and noticed the implementations of the BaseLCM, Single Tower Diffusion, and Two Tower Diffusion models. Has the team experimented with or considered other Transformer variants, such as recursive transformers or their recent evolutions like Relaxed Recursive Transformers or Mixture-of-Recursions models?
Given the efficiency and parameter-sharing advantages of recursive transformers reported in recent papers, I wonder whether they could benefit this codebase or are already planned for it.
Would appreciate any insights or pointers regarding this!
Thank you for your work on this great project.