Workaround to get correct results from llama2 mxr #3602
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff            @@
##           develop   #3602   +/-   ##
========================================
  Coverage    92.13%   92.13%
========================================
  Files          512      512
  Lines        21424    21424
========================================
  Hits         19740    19740
  Misses        1684     1684
Check results before merge 🔆
🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
I get that there's an issue with the data pointers. I don't follow how this workaround fixes it. We should meet to discuss.
std::vector<instruction_ref> attn_probs_inputs{id, pres_k, pres_v, inputs.at(5)};
std::vector<instruction_ref> attn_probs_inputs{concat, pres_k, pres_v, inputs.at(5), rotary_qkv};
Why can't you do:
auto id = mpm.get_module().insert_instruction(ins, make_op("identity"), concat, rotary_qkv);
std::vector<instruction_ref> attn_probs_inputs{id, pres_k, pres_v, inputs.at(5)};
?
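To make the question concrete, here is a toy dependency graph, not MIGraphX's actual IR, comparing the two wirings discussed above. `toy_node`, `depends_on`, and `check_both_wirings` are hypothetical names, and the toy `concat` is given no inputs purely for illustration; this thread does not say what the real `concat` depends on. Under those assumptions, both wirings leave the consumer dependent on `rotary_qkv`:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a graph instruction; not a MIGraphX type.
struct toy_node
{
    std::string name;
    std::vector<const toy_node*> inputs;
};

// True if `target` is reachable through the inputs of `n`, i.e. `target`
// must be scheduled before `n`.
inline bool depends_on(const toy_node& n, const toy_node& target)
{
    return std::any_of(n.inputs.begin(), n.inputs.end(), [&](const toy_node* in) {
        return in == &target || depends_on(*in, target);
    });
}

// Builds both wirings and checks that each one orders the consumer
// after rotary_qkv.
inline void check_both_wirings()
{
    toy_node rotary{"rotary_qkv", {}};
    toy_node concat{"concat", {}}; // assumption: no inputs in this toy

    // Wiring A: wrap both values in an identity, then consume the identity.
    toy_node id{"identity", {&concat, &rotary}};
    toy_node via_identity{"attn_probs", {&id}};

    // Wiring B: pass rotary_qkv directly as an extra input.
    toy_node direct{"attn_probs", {&concat, &rotary}};

    assert(depends_on(via_identity, rotary));
    assert(depends_on(direct, rotary));
}
```

The sketch only covers ordering. It says nothing about buffer aliasing or the data-pointer issue raised earlier in the review.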
class Params>
__device__ void compute_attention_probabilities(Output output,
Query query,
Passthrough, // Used for shape info and graph ordering only
I don't really follow how these parameters match what is done in prefuse_ops. You don't really change the first parameter there, but somehow it is changed here. Is the query input supposed to be the concat or the rotary_qkv?
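For reference on the unnamed `Passthrough` argument: in C++, an unnamed parameter stays in the signature, so the caller must still pass it, but the body cannot read its value. A minimal host-side sketch with hypothetical names (`toy_tensor` and `scale_query` are illustrative only, not code from this PR):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor view; not a MIGraphX type.
struct toy_tensor
{
    std::vector<float> data;
};

// The third parameter is unnamed: it takes a slot in the argument list,
// but its value is never read inside the function.
template <class Output, class Query, class Passthrough>
void scale_query(Output& output, const Query& query, const Passthrough&, float scale)
{
    assert(output.data.size() == query.data.size());
    for(std::size_t i = 0; i < query.data.size(); ++i)
        output.data[i] = query.data[i] * scale;
}
```

Because argument positions still have to line up with the input list built on the graph side, adding or reordering an input there shifts which argument lands in `query`, which is the mismatch the question above is asking about.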
Fixes #3596