[FEATURE] add Lxmert model #1292

zhtmike · 2025-09-17T08:32:49Z

What does this PR do?

Fixes # (issue)

Adds # (feature)
Add Lxmert model

>>> from transformers import AutoTokenizer
>>> from mindone.transformers import LxmertForQuestionAnswering
>>> import mindspore as ms

>>> tokenizer = AutoTokenizer.from_pretrained("unc-nlp/lxmert-base-uncased", revision="refs/pr/3")
>>> model = LxmertForQuestionAnswering.from_pretrained("unc-nlp/lxmert-base-uncased", revision="refs/pr/3")

>>> question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"

>>> inputs = tokenizer(question, text, return_tensors="np")
>>> for k, v in inputs.items():
...     inputs[k] = ms.tensor(v)

>>> outputs = model(**inputs)

>>> answer_start_index = outputs.start_logits.argmax()
>>> answer_end_index = outputs.end_logits.argmax()

>>> predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
>>> tokenizer.decode(predict_answer_tokens, skip_special_tokens=True)

>>> # target is "nice puppet"
>>> target_start_index = ms.tensor([14])
>>> target_end_index = ms.tensor([15])

>>> outputs = model(**inputs, start_positions=target_start_index, end_positions=target_end_index)
>>> loss = outputs.loss
>>> round(loss.item(), 2)

Note: the example above will encounter the same error as huggingface/transformers#7266 (comment)

TODO: the demo still needs to be migrated to validate end-to-end accuracy: https://github.com/huggingface/transformers-research-projects/tree/main/lxmert
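
For reference, the span-extraction step used in the snippet above (argmax over the start and end logits, then slicing the token ids) can be sketched without any model. This is a minimal pure-Python illustration, not code from this PR: the helper names `argmax` and `extract_answer_span` and all toy values are hypothetical, and it does not imply that `LxmertForQuestionAnswering` actually returns `start_logits`/`end_logits`.

```python
from typing import List, Sequence


def argmax(values: Sequence[float]) -> int:
    """Return the index of the largest value (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def extract_answer_span(
    input_ids: Sequence[int],
    start_logits: Sequence[float],
    end_logits: Sequence[float],
) -> List[int]:
    """Slice the token ids between the most likely start and end positions."""
    start = argmax(start_logits)
    end = argmax(end_logits)
    # If the predicted end precedes the start, the slice is simply empty.
    return list(input_ids[start : end + 1])


# Toy values invented for illustration only (not real tokenizer or model output).
toy_input_ids = [101, 2040, 2001, 3958, 27227, 102]
toy_start_logits = [0.1, 0.2, 0.1, 2.5, 0.3, 0.0]
toy_end_logits = [0.0, 0.1, 0.2, 0.4, 3.1, 0.1]

span = extract_answer_span(toy_input_ids, toy_start_logits, toy_end_logits)
print(span)
```

In the snippet above, the resulting ids would then be passed to `tokenizer.decode(..., skip_special_tokens=True)` to recover the answer text.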

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline?
Did you make sure to update the documentation with your changes? E.g. record bug fixes or new features in What's New. Here are the documentation guidelines
Did you build and run the code without any errors?
Did you report the running environment (NPU type/MS version) and performance in the doc? (better record it for data loading, model inference, or training tasks)
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@xxx

zhtmike · 2025-09-23T02:23:13Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces the Lxmert model, a multi-modal transformer for language and vision, ported from Hugging Face's PyTorch implementation to MindSpore. The core model logic and auto-model registration are included. However, the review identified several critical issues stemming from the direct porting of PyTorch-specific code, which will lead to runtime errors. These include incorrect weight initialization and tensor manipulation syntax. Additionally, the accompanying tests are incomplete, covering only the base model and not the heads for pre-training or question answering. I have provided detailed comments and suggestions to address these issues, fix the bugs, and enhance the overall code quality and test coverage.

mindone/transformers/models/lxmert/modeling_lxmert.py

tests/transformers_tests/models/lxmert/test_modeling_lxmert.py

zhtmike added 3 commits September 17, 2025 15:36

add lxmert

9103acc

add UT

f01e34a

fix typo

d8eae56

zhtmike added the "feature request" label Sep 17, 2025

add license

c8f34cb

zhtmike marked this pull request as ready for review September 18, 2025 09:04

zhtmike requested a review from vigo999 as a code owner September 18, 2025 09:04

Merge branch 'master' into lxmert

87a56df

zhtmike added the "new model" label and removed the "feature request" label Sep 23, 2025

zhtmike self-assigned this Sep 23, 2025

Merge branch 'master' into lxmert

750a12d

gemini-code-assist bot reviewed Sep 23, 2025

vigo999 added this to mindone Sep 29, 2025

vigo999 moved this to In Progress in mindone Sep 29, 2025
