Add XLMRoberta in Embedding Train #10074
base: develop
Thanks for your contribution!
LGTM
b = np.tril(np.ones([cur_len, cur_len]), 0)
input_mask_data[0, 0, offset : offset + cur_len, offset : offset + cur_len] = b
b = np.ones([cur_len])
input_mask_data[0, offset : offset + cur_len] = b
Please double-check that the data processing remains compatible.
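To make the compatibility question concrete, here is a minimal sketch of the two mask layouts that appear in the snippet above. It assumes the 4D lower-triangular form serves a causal decoder such as Qwen2 and the 2D form serves a bidirectional encoder such as XLMRoberta; the function names, the `seq_lens` loop, and the `float32` dtype are hypothetical and not taken from the PR.

```python
import numpy as np


def build_causal_mask(seq_lens, total_len):
    """Block-causal mask of shape [1, 1, total_len, total_len].

    Each packed sequence gets its own lower-triangular block, so a token
    attends only to earlier tokens of the same sequence.
    """
    mask = np.zeros([1, 1, total_len, total_len], dtype="float32")
    offset = 0
    for cur_len in seq_lens:
        block = np.tril(np.ones([cur_len, cur_len]), 0)
        mask[0, 0, offset : offset + cur_len, offset : offset + cur_len] = block
        offset += cur_len
    return mask


def build_padding_mask(seq_lens, total_len):
    """Padding mask of shape [1, total_len]: 1 for real tokens, 0 for padding."""
    mask = np.zeros([1, total_len], dtype="float32")
    offset = 0
    for cur_len in seq_lens:
        mask[0, offset : offset + cur_len] = np.ones([cur_len])
        offset += cur_len
    return mask


# Two sequences of length 2 and 3, padded to a total length of 6.
causal = build_causal_mask([2, 3], 6)
padding = build_padding_mask([2, 3], 6)
print(causal.shape, padding.shape)
```

The two results differ in rank, so any downstream code that indexes the mask with four dimensions would need a branch (or a shared helper) before both model families can use the same dataset path.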
llm/run_embedding.py
@@ -248,6 +253,7 @@ def main():
return_tensors="np",
return_attention_mask=not model_args.flash_mask,
pad_to_multiple_of=data_args.pad_to_multiple_of,
return_position_ids=False
Let's make this configurable through the arguments instead of hard-coding it.
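One possible shape for that change, as a sketch only: expose the flag as a dataclass field and pass it through. `DataArguments` here is a hypothetical stand-in that carries just the fields needed for the example, `build_collator_kwargs` is an invented helper, and the default of `False` simply mirrors the hard-coded value in the diff rather than a confirmed choice.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DataArguments:
    # Hypothetical stand-in for the script's data arguments.
    pad_to_multiple_of: Optional[int] = field(
        default=None,
        metadata={"help": "Pad sequences to a multiple of this value."},
    )
    return_position_ids: bool = field(
        default=False,  # assumption: mirrors the value hard-coded in the diff
        metadata={"help": "Whether the collator should return position ids."},
    )


def build_collator_kwargs(data_args, flash_mask):
    """Collect the collator keyword arguments shown in the diff."""
    return {
        "return_tensors": "np",
        "return_attention_mask": not flash_mask,
        "pad_to_multiple_of": data_args.pad_to_multiple_of,
        "return_position_ids": data_args.return_position_ids,
    }


kwargs = build_collator_kwargs(DataArguments(return_position_ids=True), flash_mask=False)
print(kwargs)
```

With a field like this, the value can be set per model from the JSON config or the command line, so the Qwen2 and XLMRoberta configs need not share one setting.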
if isinstance(model_config, XLMRobertaConfig):
    model_class = XLMRobertaSentenceEmbedding
elif isinstance(model_config, Qwen2Config):
    model_class = Qwen2SentenceEmbedding
Later on, we could consider adding an AutoModelForSentenceEmbedding.
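A rough sketch of what such a class might look like, using a small registry in place of the `isinstance` chain. The config and model classes below are empty stand-ins for the real ones, and the `register` / `from_config` methods are hypothetical, not an existing PaddleNLP API.

```python
class XLMRobertaConfig:
    """Stand-in for the real config class."""


class Qwen2Config:
    """Stand-in for the real config class."""


class XLMRobertaSentenceEmbedding:
    """Stand-in for the real model class."""

    def __init__(self, config):
        self.config = config


class Qwen2SentenceEmbedding:
    """Stand-in for the real model class."""

    def __init__(self, config):
        self.config = config


class AutoModelForSentenceEmbedding:
    """Hypothetical dispatcher from a config instance to a model class."""

    _registry = {}

    @classmethod
    def register(cls, config_class, model_class):
        cls._registry[config_class] = model_class

    @classmethod
    def from_config(cls, config):
        # Walk the registry and build the first model whose config type matches.
        for config_class, model_class in cls._registry.items():
            if isinstance(config, config_class):
                return model_class(config)
        raise ValueError(f"Unsupported config type: {type(config).__name__}")


AutoModelForSentenceEmbedding.register(XLMRobertaConfig, XLMRobertaSentenceEmbedding)
AutoModelForSentenceEmbedding.register(Qwen2Config, Qwen2SentenceEmbedding)

model = AutoModelForSentenceEmbedding.from_config(XLMRobertaConfig())
print(type(model).__name__)
```

Supporting another backbone would then take one `register` call instead of a new `elif` branch in the training script.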
Before submitting
Add test cases into the tests folder. If there are codecov issues, please add test cases first.
PR types
New features
PR changes
Models
Description
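I. Add support for the XLMRoberta model in Embedding training, which enables fine-tuning of bge-m3 and related models:
1. Add the relevant model classes to the XLMRoberta modeling file;
2. Adjust the model selection and initialization code in the training script;
3. Adjust the data construction code in the embedding dataset scripts;
4. Add support in the other parameter files, etc.
II. Fix an issue where recompute did not work correctly when enabled on the original XLMRoberta model.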