Conversation

@michaelfeil (Contributor) commented Nov 21, 2025

What does this PR do?

What happened:
Qwen2Flash was written to support only Alibaba-NLP/gte-Qwen2-1.5B-instruct, which was trained, and ships reference inference code, with causal=False (i.e. use_bidirectional_attention=True). I later asked for a flag in the config so this setting could actually be read:
https://huggingface.co/Alibaba-NLP/gte-Qwen2-1.5B-instruct/discussions/28

We now also have newer models, e.g. the Jina embedding models, which are built on Qwen2 but trained with causal attention (use_bidirectional_attention=False).

These new models currently load without error, but their output embeddings are incorrect, because attention is always forced to be bidirectional. A sketch of the kind of config-driven fix this implies follows below.
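
A minimal sketch in Rust (the repo's language) of reading the flag from the model config instead of hardcoding it. The `Qwen2Config` struct and field handling here are a hypothetical, trimmed-down stand-in, not the repo's actual types, and the default chosen for a missing flag is an assumption; the right backward-compatible default is a maintainer decision.

```rust
// Requires serde (with the "derive" feature) and serde_json.
use serde::Deserialize;

/// Hypothetical, trimmed-down model config carrying only the flag
/// discussed above; the real config struct has many more fields.
#[derive(Debug, Deserialize)]
struct Qwen2Config {
    #[serde(default)]
    use_bidirectional_attention: Option<bool>,
}

/// Whether the attention mask should be causal for this model.
/// gte-Qwen2-1.5B-instruct sets `use_bidirectional_attention: true`
/// (causal = false); a causal embedding model sets it to false.
/// Treating a missing flag as causal matches base Qwen2, but this
/// default is an assumption, not the repo's documented behavior.
fn is_causal(config: &Qwen2Config) -> bool {
    !config.use_bidirectional_attention.unwrap_or(false)
}

fn main() {
    let gte: Qwen2Config =
        serde_json::from_str(r#"{"use_bidirectional_attention": true}"#).unwrap();
    let causal_model: Qwen2Config = serde_json::from_str(r#"{}"#).unwrap();

    assert!(!is_causal(&gte)); // bidirectional attention, as gte expects
    assert!(is_causal(&causal_model)); // causal attention, as newer models expect
    println!(
        "gte causal: {}, causal_model causal: {}",
        is_causal(&gte),
        is_causal(&causal_model)
    );
}
```

With something along these lines, gte-Qwen2 keeps its bidirectional attention while Qwen2-based causal models get a causal mask, instead of every Qwen2Flash model being forced to causal=False.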

Fixes # (issue)

Before submitting

  • [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [x] Did you read the contributor guideline?
  • [ ] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines.
  • [ ] Did you write any new necessary tests? If applicable, did you include or update the insta snapshots?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@michaelfeil (Contributor Author) commented

@codex review
