
Commit cba9d1a

localai-bot and mudler authored
chore(model gallery): 🤖 add 1 new models via gallery agent (#6919)
chore(model gallery): 🤖 add new models via gallery agent

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <[email protected]>
1 parent dd21a0d commit cba9d1a

File tree

1 file changed: +38 −0 lines changed


gallery/index.yaml

Lines changed: 38 additions & 0 deletions
@@ -23071,3 +23071,41 @@
     - filename: Gemma-3-The-Grand-Horror-27B-Q4_k_m.gguf
       sha256: 46f0b06b785d19804a1a796bec89a8eeba8a4e2ef21e2ab8dbb8fa2ff0d675b1
       uri: huggingface://DavidAU/Gemma-3-The-Grand-Horror-27B-GGUF/Gemma-3-The-Grand-Horror-27B-Q4_k_m.gguf
+- !!merge <<: *qwen3
+  name: "qwen3-nemotron-32b-rlbff-i1"
+  urls:
+    - https://huggingface.co/mradermacher/Qwen3-Nemotron-32B-RLBFF-i1-GGUF
+  description: |
+    **Model Name:** Qwen3-Nemotron-32B-RLBFF
+    **Base Model:** Qwen/Qwen3-32B
+    **Developer:** NVIDIA
+    **License:** NVIDIA Open Model License
+
+    **Description:**
+    Qwen3-Nemotron-32B-RLBFF is a high-performance, fine-tuned large language model built on the Qwen3-32B foundation. It is specifically optimized to generate high-quality, helpful responses in a default thinking mode through advanced reinforcement learning with binary flexible feedback (RLBFF). Trained on the HelpSteer3 dataset, this model excels in reasoning, planning, coding, and information-seeking tasks while maintaining strong safety and alignment with human preferences.
+
+    **Key Performance (as of Sep 2025):**
+    - **MT-Bench:** 9.50 (near GPT-4-Turbo level)
+    - **Arena Hard V2:** 55.6%
+    - **WildBench:** 70.33%
+
+    **Architecture & Efficiency:**
+    - 32 billion parameters, based on the Qwen3 Transformer architecture
+    - Designed for deployment on NVIDIA GPUs (Ampere, Hopper, Turing)
+    - Achieves performance comparable to DeepSeek R1 and O3-mini at less than 5% of the inference cost
+
+    **Use Case:**
+    Ideal for applications requiring reliable, thoughtful, and safe responses—such as advanced chatbots, research assistants, and enterprise AI systems.
+
+    **Access & Usage:**
+    Available on Hugging Face with support for Hugging Face Transformers and vLLM.
+    **Cite:** [Wang et al., 2025 — RLBFF: Binary Flexible Feedback](https://arxiv.org/abs/2509.21319)
+
+    👉 *Note: The GGUF version (mradermacher/Qwen3-Nemotron-32B-RLBFF-i1-GGUF) is a user-quantized variant. The original model is available at nvidia/Qwen3-Nemotron-32B-RLBFF.*
+  overrides:
+    parameters:
+      model: Qwen3-Nemotron-32B-RLBFF.i1-Q4_K_M.gguf
+  files:
+    - filename: Qwen3-Nemotron-32B-RLBFF.i1-Q4_K_M.gguf
+      sha256: 000e8c65299fc232d1a832f1cae831ceaa16425eccfb7d01702d73e8bd3eafee
+      uri: huggingface://mradermacher/Qwen3-Nemotron-32B-RLBFF-i1-GGUF/Qwen3-Nemotron-32B-RLBFF.i1-Q4_K_M.gguf
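
For context, a minimal sketch (not part of the commit) of how the new gallery entry could be exercised once the model is installed in a running LocalAI instance: it calls LocalAI's OpenAI-compatible chat completions endpoint with the `name` declared above. The address and port (localhost:8080), the prompt, and the sampling parameter are illustrative assumptions.

```python
# Minimal sketch: query the newly added gallery model through LocalAI's
# OpenAI-compatible /v1/chat/completions endpoint. Assumes a LocalAI instance
# is already running on localhost:8080 with this model installed.
import json
import urllib.request

# The model name matches the gallery entry's `name` field above.
payload = {
    "model": "qwen3-nemotron-32b-rlbff-i1",
    "messages": [
        {"role": "user", "content": "Summarize what RLBFF fine-tuning changes about a base model."}
    ],
    "temperature": 0.7,  # illustrative sampling parameter
}

req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",  # assumed default LocalAI address/port
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

# Print the assistant's reply from the OpenAI-style response structure.
print(body["choices"][0]["message"]["content"])
```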
