@wp4032 wp4032 commented Nov 21, 2025

RND1 now works in llama.cpp. For example, you can run:

llama-diffusion-cli -m RND1-Base-0910.gguf -p "write code to train MNIST in pytorch" -ub 256 --diffusion-algorithm 1 --diffusion-steps 256 --diffusion-visual --temp 0.5

Model Card

https://huggingface.co/radicalnumerics/RND1-Base-0910

Instructions

# Create conda env (all commands below run from inside the llama.cpp checkout)
cd llama.cpp && conda create --name rnd1 python=3.12
conda activate rnd1
pip install -r requirements.txt

# Convert to GGUF
huggingface-cli download radicalnumerics/RND1-Base-0910 --local-dir RND1-Base-0910
python convert_hf_to_gguf.py RND1-Base-0910/ --outfile RND1-Base-0910.gguf --outtype bf16

# Build and run the diffusion CLI
cmake -B build    # Builds with Metal automatically on macOS
cmake --build build --target llama-diffusion-cli -j
build/bin/llama-diffusion-cli -m RND1-Base-0910.gguf -p "What is a GPU?" -ub 32 --temp 0.01 -ngl 999 -fa on --seed 1234 --verbose
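
Before running the CLI, it can help to confirm that the conversion step produced a readable GGUF file. The following is a minimal sketch, not part of this PR: `read_gguf_header` is a hypothetical helper name, and it assumes a little-endian GGUF file of version 2 or later, where the 4-byte magic `GGUF` is followed by a 32-bit version and two 64-bit counts.

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF header.

    Assumes a little-endian file with GGUF version >= 2 (64-bit counts).
    """
    with open(path, "rb") as f:
        header = f.read(24)  # 4 (magic) + 4 (version) + 8 + 8 (counts)
    if len(header) < 24 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:24])
    return version, tensor_count, kv_count
```

For example, `read_gguf_header("RND1-Base-0910.gguf")` should return a small version number and non-zero counts if the conversion succeeded; a `ValueError` suggests the output file is truncated or is not GGUF at all.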

Results

Works on GB200, non-causal, BF16, 512 context len, 256 diffusion steps:

total time: 55793.55ms, time per step: 217.94ms, sampling time per step: 52.18ms

Works on H100, non-causal, BF16, 512 context len, 256 diffusion steps:

total time: 55602.47ms, time per step: 217.20ms, sampling time per step: 63.52ms

Works on Mac Studio M3 Ultra, non-causal, BF16, 512 context len, 256 diffusion steps:

total time: 65817.30ms, time per step: 257.10ms, sampling time per step: 16.65ms
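
To compare the three runs, the timing lines can be parsed and cross-checked. This is a rough sketch rather than anything shipped with the PR: the helper names are hypothetical, the three strings are copied verbatim from the results above, and it assumes that the reported sampling time is included in the per-step time.

```python
import re

# Timing lines copied verbatim from the results above.
RESULTS = {
    "GB200": "total time: 55793.55ms, time per step: 217.94ms, sampling time per step: 52.18ms",
    "H100": "total time: 55602.47ms, time per step: 217.20ms, sampling time per step: 63.52ms",
    "Mac Studio M3 Ultra": "total time: 65817.30ms, time per step: 257.10ms, sampling time per step: 16.65ms",
}


def parse_timing(line):
    """Return (total_ms, step_ms, sampling_ms) parsed from one timing line."""
    total, step, sampling = (float(x) for x in re.findall(r"([\d.]+)ms", line))
    return total, step, sampling


def summarize(line, steps=256):
    """Cross-check the total against steps * per-step time and compute the sampling share."""
    total, step, sampling = parse_timing(line)
    return {
        "reported_total_ms": total,
        "implied_total_ms": step * steps,
        "sampling_share": sampling / step,  # assumes sampling is part of each step
    }


for name, line in RESULTS.items():
    s = summarize(line)
    print(f"{name}: implied total {s['implied_total_ms']:.0f} ms, "
          f"sampling share {s['sampling_share']:.1%}")
```

Under that assumption, sampling accounts for a noticeably larger fraction of each step on the GB200 and H100 runs than on the M3 Ultra run, which is worth keeping in mind when reading the per-step numbers.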

github-actions bot added the labels model (Model specific), examples, and python (python script changes) on Nov 21, 2025
Co-authored-by: Sigbjørn Skjæret
Collaborator

@CISC CISC left a comment

Nice, thank you!

Collaborator

am17an commented Nov 22, 2025

Please update or close the issue with a comment #17291

Author

wp4032 commented Nov 22, 2025

> Please update or close the issue with a comment #17291

Regarding this: I am still looking for the cause of the issue, but I am quite confident it is not related to RND1's implementation. If you really want me to close issue #17291, I can.

Collaborator

am17an commented Nov 23, 2025

> Regarding this: I am still looking for the cause of the issue, but I am quite confident it is not related to RND1's implementation. If you really want me to close issue #17291, I can.

I don't really want you to close the issue. However, if it is a correctness bug, as you mention there, then the outputs for RND1 will suffer from the same issue. I'm not sure why you say it is unrelated to RND1's implementation, since that is the same implementation as Qwen2. I would rather see that resolved before merging.
