23071 | 23071 | - filename: Gemma-3-The-Grand-Horror-27B-Q4_k_m.gguf |
23072 | 23072 | sha256: 46f0b06b785d19804a1a796bec89a8eeba8a4e2ef21e2ab8dbb8fa2ff0d675b1 |
23073 | 23073 | uri: huggingface://DavidAU/Gemma-3-The-Grand-Horror-27B-GGUF/Gemma-3-The-Grand-Horror-27B-Q4_k_m.gguf |
| 23074 | +- !!merge <<: *qwen3 |
| 23075 | + name: "qwen3-nemotron-32b-rlbff-i1" |
| 23076 | + urls: |
| 23077 | + - https://huggingface.co/mradermacher/Qwen3-Nemotron-32B-RLBFF-i1-GGUF |
| 23078 | + description: | |
| 23079 | + **Model Name:** Qwen3-Nemotron-32B-RLBFF |
| 23080 | + **Base Model:** Qwen/Qwen3-32B |
| 23081 | + **Developer:** NVIDIA |
| 23082 | + **License:** NVIDIA Open Model License |
| 23083 | + |
| 23084 | + **Description:** |
| 23085 | + Qwen3-Nemotron-32B-RLBFF is a high-performance, fine-tuned large language model built on the Qwen3-32B foundation. It is optimized to generate high-quality, helpful responses in its default thinking mode through reinforcement learning with Binary Flexible Feedback (RLBFF). Trained on the HelpSteer3 dataset, this model excels in reasoning, planning, coding, and information-seeking tasks while maintaining strong safety and alignment with human preferences.
| 23086 | + |
| 23087 | + **Key Performance (as of Sep 2025):** |
| 23088 | + - **MT-Bench:** 9.50 (near GPT-4-Turbo level) |
| 23089 | + - **Arena Hard V2:** 55.6% |
| 23090 | + - **WildBench:** 70.33% |
| 23091 | + |
| 23092 | + **Architecture & Efficiency:** |
| 23093 | + - 32 billion parameters, based on the Qwen3 Transformer architecture |
| 23094 | + - Designed for deployment on NVIDIA GPUs (Ampere, Hopper, Turing) |
| 23095 | + - Achieves performance comparable to DeepSeek R1 and O3-mini at less than 5% of the inference cost |
| 23096 | + |
| 23097 | + **Use Case:** |
| 23098 | + Ideal for applications requiring reliable, thoughtful, and safe responses—such as advanced chatbots, research assistants, and enterprise AI systems. |
| 23099 | + |
| 23100 | + **Access & Usage:** |
| 23101 | + Available on Hugging Face with support for Hugging Face Transformers and vLLM. |
| 23102 | + **Cite:** [Wang et al., 2025 — RLBFF: Binary Flexible Feedback](https://arxiv.org/abs/2509.21319) |
| 23103 | + |
| 23104 | + 👉 *Note: The GGUF version (mradermacher/Qwen3-Nemotron-32B-RLBFF-i1-GGUF) is a user-quantized variant. The original model is available at nvidia/Qwen3-Nemotron-32B-RLBFF.* |
| 23105 | + overrides: |
| 23106 | + parameters: |
| 23107 | + model: Qwen3-Nemotron-32B-RLBFF.i1-Q4_K_M.gguf |
| 23108 | + files: |
| 23109 | + - filename: Qwen3-Nemotron-32B-RLBFF.i1-Q4_K_M.gguf |
| 23110 | + sha256: 000e8c65299fc232d1a832f1cae831ceaa16425eccfb7d01702d73e8bd3eafee |
| 23111 | + uri: huggingface://mradermacher/Qwen3-Nemotron-32B-RLBFF-i1-GGUF/Qwen3-Nemotron-32B-RLBFF.i1-Q4_K_M.gguf |
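
The entry's "Access & Usage" note points to the original checkpoint, nvidia/Qwen3-Nemotron-32B-RLBFF, as usable with Hugging Face Transformers and vLLM. Below is a minimal sketch of the standard Transformers chat-template flow for that original checkpoint, assuming a Qwen3-style chat template and enough GPU memory for the 32B weights; the prompt text is a placeholder, and nothing here is specific to the quantized GGUF file added by this diff.

```python
# Minimal sketch (assumption: standard Transformers chat-template usage for the
# original nvidia/Qwen3-Nemotron-32B-RLBFF checkpoint, not for the GGUF variant).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Qwen3-Nemotron-32B-RLBFF"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native dtype
    device_map="auto",    # requires `accelerate`; shards across available GPUs
)

# Build a chat prompt with the model's own template (placeholder question).
messages = [{"role": "user", "content": "Explain RLBFF in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Generate, then strip the prompt tokens before decoding.
output = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The Q4_K_M GGUF file referenced in `files` above is the mradermacher quantization and is intended for llama.cpp-compatible runtimes that consume this gallery entry, not for the Transformers flow sketched here.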