Commit: Add Project Pages
Showing 7 changed files with 435 additions and 2 deletions.
@@ -0,0 +1,136 @@
---
layout: default
sitemap: false
---
<div class="project-presentation">
  <div class="project-header">
    <div class="project-header-content">
      <h1 class="project-title">{{ page.paper_name }}</h1>
      {% assign numberAffiliations = page.affiliations | size %}
      {% assign showAffiliations = false %}
      {% if numberAffiliations > 1 %}
        {% assign showAffiliations = true %}
      {% endif %}
      <div class="project-authors">
        {% assign authors = "" %}
        {% assign correspondingAuthor = false %}
        {% assign numberOfFirstAuthor = 0 %}
        {% for author in page.authors %}
          {% if author.first == true %}
            {% assign numberOfFirstAuthor = numberOfFirstAuthor | plus: 1 %}
          {% endif %}
        {% endfor %}
        {% assign seenFirstFirstAuthor = false %}
        {% for author in page.authors %}
          {% assign compareAuthor = author.name | downcase %}
          {% assign authorFound = false %}
          {% for person in site.people %}
            {% assign personName = person.name | split: ' ' | first | downcase %}
            {% assign personLastName = person.name | split: ' ' | last | downcase %}
            {% if compareAuthor contains personName and compareAuthor contains personLastName %}
              {% assign authorFound = true %}
              {% assign memberurl = site.url | append: site.baseurl | append: person.url %}
              {% if person.position == 'Professor' %}
                {% assign memberurl = 'https://seungwonh.github.io/' %}
              {% endif %}
              {% break %}
            {% endif %}
          {% endfor %}
          {% if author.url %}
            {% assign memberurl = author.url %}
            {% assign authorFound = true %}
          {% endif %}
          {% if authorFound %}
            {% assign authorLink = "<a href='" | append: memberurl | append: "'>" | append: author.name | append: "</a>" %}
          {% else %}
            {% assign authorLink = author.name %}
          {% endif %}
          {% if showAffiliations %}
            {% assign authorLink = authorLink | append: "<sup>" | append: author.affiliation | append: "</sup>" %}
          {% endif %}
          {% if author.corresponding %}
            {% assign authorLink = authorLink | append: "*" %}
            {% assign correspondingAuthor = true %}
          {% endif %}
          {% if author.first %}
            {% if numberOfFirstAuthor > 1 %}
              {% if seenFirstFirstAuthor == false %}
                {% assign authorLink = "{" | append: authorLink %}
                {% assign seenFirstFirstAuthor = true %}
              {% endif %}
              {% assign numberOfFirstAuthor = numberOfFirstAuthor | minus: 1 %}
            {% elsif numberOfFirstAuthor == 1 and seenFirstFirstAuthor == true %}
              {% assign authorLink = authorLink | append: "}" %}
              {% assign numberOfFirstAuthor = 0 %}
            {% endif %}
          {% endif %}
          {% assign authors = authors | append: authorLink | append: ", " %}
        {% endfor %}
        {% assign authors = authors | append: "@" | replace: ", @", "." %}
        {{ authors }}
      </div>
      <div class="project-affiliations">
        {% assign affiliations = "" %}
        {% for affiliation in page.affiliations %}
          {% if showAffiliations %}
            {% assign affiliations = affiliations | append: "<sup>" | append: forloop.index | append: "</sup>" | append: affiliation | append: ", " %}
          {% else %}
            {% assign affiliations = affiliations | append: affiliation | append: ", " %}
          {% endif %}
        {% endfor %}
        {% assign affiliations = affiliations | append: "@" | replace: ", @", "." %}
        {{ affiliations }}
      </div>
      <div class="project-mails">
        {% for author in page.authors %}
          {% if author.mail %}
            <a href="mailto:{{ author.mail }}">{{ author.mail }}</a>{% if forloop.last %}{% else %}, {% endif %}
          {% endif %}
        {% endfor %}
      </div>
      <div class="project-buttons">
        {% for button in page.buttons %}
          <a href="{{ button.link }}" class="project-button" target="_blank">
            <i class="fa fa-{{ button.icon }}"></i> {{ button.name }}
          </a>
        {% endfor %}
      </div>
      {% if correspondingAuthor %}
        <div class="project-corresponding">
          * Corresponding author
        </div>
      {% endif %}
    </div>
  </div>
  <div class="project-container">
    <div class="project-content">
      {{ content }}
    </div>
  </div>
  <div class="project-bibtex">
    <h3 class="bibtex-title">Cite this project.</h3>
    <pre id="bibtex-pre" onclick="selectBibtex()"><code id="bibtex-code">{{ page.bibtex }}</code></pre>
  </div>
</div>
<script>
  // Select the full BibTeX entry on click so visitors can copy it easily.
  function selectBibtex() {
    var bibtexCode = document.getElementById("bibtex-code");
    if (document.body.createTextRange) {
      // Legacy IE path.
      var range = document.body.createTextRange();
      range.moveToElementText(bibtexCode);
      range.select();
    } else if (window.getSelection) {
      // Modern browsers: select the contents of the <code> node.
      var selection = window.getSelection();
      var range = document.createRange();
      range.selectNodeContents(bibtexCode);
      selection.removeAllRanges();
      selection.addRange(range);
    }
  }
</script>
<script src="https://cdn.jsdelivr.net/npm/mathjax@2/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"></script>
@@ -0,0 +1,88 @@
---
paper_name: "<span style='font-family: monospace'>HARP</span>: Hesitation-Aware Reframing in Transformer Inference Pass"
authors:
  - name: Romain Storaï
    affiliation: 1
    mail: [email protected]
    first: true
  - name: Seung-won Hwang
    mail: [email protected]
    affiliation: 1
    corresponding: true
affiliations:
  - Computer Science and Engineering, Seoul National University
bibtex: "@misc{storaï2024harphesitationawarereframingtransformer,\n
  title={HARP: Hesitation-Aware Reframing in Transformer Inference Pass},\n
  author={Romain Storaï and Seung-won Hwang},\n
  year={2024},\n
  eprint={2412.07282},\n
  archivePrefix={arXiv},\n
  primaryClass={cs.CL},\n
  url={https://arxiv.org/abs/2412.07282},\n
}"
buttons:
  - name: "arXiv"
    icon: "file-pdf-o"
    link: "https://arxiv.org/abs/2412.07282"
  - name: "Code"
    icon: "github"
    link: "https://github.com/romsto/HARP"
---
# TL;DR

Not all tokens are equally complex to predict: some require more computational resources than others.
HARP modifies the Transformer forward pass to introduce uncertainty-selective extra computation during inference.
By identifying "harder" tokens through hesitation detection, HARP performs additional computation on reframed inputs.
The method improves overall accuracy, requires no retraining, and is compatible with any Transformer model.

# Abstract

This paper aims to improve the performance of large language models by addressing the variable computational demands of inference steps, where some tokens require more computational resources than others. We present HARP, a simple modification to the "off-the-shelf" Transformer forward pass. Drawing from hesitation and the framing effect in decision-making, HARP selectively applies additional computation when the model encounters uncertainty during token generation. Our method mimics human cognitive processes by pausing at difficult decision points and reframing inputs for a different perspective. Unlike other approaches, HARP is model-agnostic, training-free, and easy to implement. We thoroughly evaluate our method across various downstream tasks and model sizes, demonstrating performance improvements of up to $$+5.16$$%. Notably, HARP achieves these gains while keeping inference twice as fast as beam search. Simple yet delivering significant gains, HARP offers a practical solution for enhancing the performance of Transformer-based language models with minimal computational impact.
# Our Breakthrough?
We made the Transformer think more like humans, introducing flexibility into its rigid architecture.

# How does it work?
By modifying the Transformer forward pass to incorporate:
- **Hesitation**: we detect when the model is uncertain using the Shannon entropy of the output logits.
- **Reframing**: when the model hesitates, we reframe the inputs (by applying dropout to the embeddings) and perform an additional forward pass.

![HARP Overview]({{ site.baseurl }}/assets/projects/harp/method_figure.png)
<div style="text-align: center">
  <em>
    <strong>An overview of the standard vs. HARP-modified Transformer forward pass. HARP detects hesitation and leverages dropout-based perturbation to improve token accuracy.</strong>
  </em>
</div>
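As a rough illustration of the hesitation step, the uncertainty check can be written as a Shannon-entropy test on the next-token logits. This is only a sketch, not the authors' implementation from the linked repository: the function names and the threshold value are made up for this example.

```python
import math

def shannon_entropy(logits):
    """Shannon entropy (in nats) of softmax(logits).

    High entropy means the model spreads probability over many tokens,
    i.e. it "hesitates" on this decoding step.
    """
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0)

def needs_reframing(logits, threshold=1.0):
    """Decide whether to run the extra, reframed forward pass.

    The threshold here is illustrative, not a value from the paper.
    """
    return shannon_entropy(logits) > threshold

# A peaked distribution (confident) vs. a flat one (hesitant):
confident = [10.0, 0.0, 0.0]   # entropy close to 0
hesitant = [0.0, 0.0, 0.0]     # entropy = ln(3) ≈ 1.10
```

When `needs_reframing` fires, HARP perturbs the input embeddings with dropout and runs one additional forward pass, combining the two predictions; the snippet above only sketches the trigger condition.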
# Results

HARP consistently delivers higher performance across models and tasks, with notable gains:
- **LAMBADA**: **+5.16%** accuracy gain (LLaMA 3.1 8B Instruct)
- **GSM8K**: **+4.79%** accuracy gain (Mistral v0.3 8B Instruct)

All of these gains are achieved with minimal impact:
- only **1.25×** the average inference time of the base model.
- **twice as fast** as beam search.

![Comparison of Accuracy gains and Relative Inference Time for LLaMA 3.1 8B Instruct.]({{ site.baseurl }}/assets/projects/harp/short_results.png)
<div style="display: flex; align-items: center; justify-content: center; text-align: center;">
  <div style="flex: 1; margin-right: 10px; text-align: center;">
    <em>
      <strong>Left:</strong> Relative accuracy compared to the vanilla model (higher is better). ROUGE-1 score is reported for CNN/DailyMail.
    </em>
  </div>
  <div style="flex: 1; margin-left: 10px; text-align: center;">
    <em>
      <strong>Right:</strong> Relative inference time compared to the vanilla model (lower is better).
    </em>
  </div>
</div>
<div style="text-align: center">
  <em>
    <strong>Comparison of Accuracy Gains and Relative Inference Time Across Tasks of LLaMA 3.1 8B Instruct.</strong>
  </em>
</div>
# Why is it amazing?
HARP is **plug-and-play**: it requires no retraining and is compatible with any "off-the-shelf" Transformer-based model!