
QWEN3 - MOE structure - not working ? #582

@David-AU-github


Hi,

Tried to build with this config:

base_model: D:/Qwen3-0.6B
gate_mode: random
architecture: qwen
dtype: bfloat16
experts_per_token: 2
experts:
  - source_model: D:/qwen3-0.6b-creative-writing
  - source_model: D:/Qwen3-0.6B-dreamwriter-0.6b-beta
  - source_model: D:/Qwen3-0.6B-Konjac-0.6B
  - source_model: D:/Qwen3-0.6B-Josiefied-Qwen3-0.6B-abliterated-v1
  - source_model: D:/Qwen3-0.6B-Qwill-0.6B-IT-FULL
  - source_model: D:/qwen3-0.6b-writing
shared_experts:
  - source_model: D:/Qwen3-0.6B

Using:

mergekit-moe --clone-tensors f:/mergefiles/Qwen3-MOE-6X0.6B-Creative.txt e:/Qwen3-MOE-6X0.6B-Creative

ERROR:

ERROR:root:No output architecture found that is compatible with the given models.
ERROR:root:All supported output architectures:
ERROR:root: * Mixtral
ERROR:root: * DeepSeek MoE
ERROR:root: * Qwen MoE
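
This error means mergekit could not find a MoE output architecture compatible with the architecture the source models declare. A quick way to confirm what each checkpoint reports (a minimal sketch, assuming the transformers package is installed and the local paths above exist):

# Print the architecture each local checkpoint declares in its config.json;
# mergekit-moe matches its MoE output architectures against this value.
from transformers import AutoConfig

for path in ["D:/Qwen3-0.6B", "D:/qwen3-0.6b-creative-writing"]:
    cfg = AutoConfig.from_pretrained(path)
    print(path, cfg.architectures)  # Qwen3 checkpoints report ['Qwen3ForCausalLM']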

NOTE:
I'm not sure whether mergekit's MoE output supports a structure composed of Qwen3 0.6B models. (??)
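
For context: to my understanding, mergekit-moe's "Qwen MoE" output targets the Qwen2-style MoE architecture (Qwen2MoeForCausalLM), while Qwen3 checkpoints declare Qwen3ForCausalLM, so none of the three listed outputs match and the merge is rejected. A minimal sketch of a config that the Qwen MoE output should accept, assuming Qwen2-architecture source models (the model paths below are hypothetical placeholders, not tested):

base_model: D:/Qwen2.5-0.5B-Instruct
gate_mode: random
architecture: qwen
dtype: bfloat16
experts_per_token: 2
experts:
  - source_model: D:/qwen2.5-0.5b-finetune-a   # hypothetical path
  - source_model: D:/qwen2.5-0.5b-finetune-b   # hypothetical path
shared_experts:
  - source_model: D:/Qwen2.5-0.5B-Instruct

If a newer mergekit release adds a Qwen3 MoE output architecture, updating mergekit and re-running the original config may be enough; the error message prints the supported output list for whichever version is installed.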
