
Commit 72f1c84

Update README.md
Authored Feb 22, 2021
1 parent ad7cfd4 commit 72f1c84

File tree

1 file changed: +4, -4 lines changed

README.md (+4, -4)

@@ -1,15 +1,15 @@
 # Fast Torch

-I wanted to explore different ways to optimize PyTorch models for inference, so I played a little bit with TorchScript, ONNX Runtime and classic PyTorch eager-mode to compare their performance. I use pre-trained RoBERTa model (trained for sentiment analysis from tweets) along with BERT tokenizer. Both models are hosted by HuggingFace.
+I wanted to explore different ways to optimize PyTorch models for inference, so I played a little bit with TorchScript, ONNX Runtime and classic PyTorch eager mode, and compared their performance. I use a pre-trained RoBERTa model (trained for sentiment analysis on tweets) along with a BERT tokenizer. Both models are [available here](https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment).
 
-I wrote 14 different text sequences (7 with positive and 7 with negative sentiments) with different lengths and I used them for model inference. To obtain more reliable results, I repeated that process 1000 times (1000 times x 14 sequences = 14K runs for single model configuration).
+I wrote 14 short-to-medium-length text sequences (7 with positive and 7 with negative sentiments) and used them for model prediction. To obtain more reliable results, I repeated that process 1000 times (1000 repetitions x 14 sequences = 14K runs for a single model configuration).
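The measurement procedure (every sequence timed once per repetition, many repetitions per configuration) can be sketched with a small timing harness; `predict` and the toy sequences below are placeholders, not the repository's actual model or data.

```python
import time
from statistics import mean

def benchmark(predict, sequences, repeats=1000):
    # Time `predict` on every sequence, `repeats` times each; return a
    # dict mapping sequence index -> list of latencies in seconds.
    timings = {i: [] for i in range(len(sequences))}
    for _ in range(repeats):
        for i, seq in enumerate(sequences):
            start = time.perf_counter()
            predict(seq)
            timings[i].append(time.perf_counter() - start)
    return timings

# Toy usage: 14 placeholder sequences, 5 repeats -> 70 runs in total
# (the README's experiments use repeats=1000 -> 14K runs per mode).
sequences = [f"sequence {i}" for i in range(14)]
runs = benchmark(lambda s: s.upper(), sequences, repeats=5)
mean_latency = mean(t for lats in runs.values() for t in lats)
```

Interleaving sequences inside each repetition, rather than running one sequence 1000 times in a row, spreads any transient system noise across all sequences instead of concentrating it on one.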
 
 ### Results
 
-Horizontal axis shows 14 sequences (numered from 0 to 13) that were used for prediction. In each column (for each sequence) there is n=1000 measurements for each mode: Eager, Script (JIT), ONNX. A total of 3000 values is present for each sequence ID. Eager (default) mode is always slightly worse than Script (TorchScript) mode inference. ONNX Runtime seems to outperform both Eager and Script predictions speed which can be observed in the image below.
+The horizontal axis shows the 14 sequences (numbered from 0 to 13) that were used for prediction. In each column (for each sequence) there are n=1000 measurements for each of the three modes: Eager, Script (JIT) and ONNX, so a total of 3000 values is plotted per sequence ID. Eager (default) mode is always slightly slower than Script (TorchScript) mode inference, and ONNX Runtime seems to outperform both Eager and Script prediction speed, as can be observed in the image below.
 
 ![](doc/scatter.png)
 
 When summing up all the results (from all experiments and sequences), grouping them by inference mode and calculating the average, it is once again clear that ONNX performs much better than the other two options.
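The group-by-mode averaging step can be sketched as below; the latency values are made up for illustration and are not the repository's measurements.

```python
from statistics import mean

# Made-up (mode, latency in ms) records standing in for the real measurements.
measurements = [
    ("eager", 12.1), ("script", 11.4), ("onnx", 7.9),
    ("eager", 12.5), ("script", 11.0), ("onnx", 8.2),
]

# Group the flat records by inference mode, then average each group.
by_mode = {}
for mode, latency in measurements:
    by_mode.setdefault(mode, []).append(latency)
averages = {mode: mean(latencies) for mode, latencies in by_mode.items()}
```

With real data this is the per-mode mean over all 14K runs, i.e. exactly what the bar chart below visualizes.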
 
 ![](doc/bar.png)
