# gpt-2

## Use Case and High-Level Description

The gpt-2 model is one of the Generative Pre-trained Transformer (GPT) family of models, pre-trained on a very large corpus of English data in a self-supervised fashion. The GPT architecture is a deep neural network, specifically a transformer model, which uses attention in place of earlier recurrence- and convolution-based architectures. Attention mechanisms allow the model to selectively focus on the segments of input text it predicts to be most relevant. GPT-2 is trained with a simple objective: predict the next word, given all of the previous words within some text.

More details are provided in the paper, repository, and model card.

## Specification

| Metric           | Value           |
|------------------|-----------------|
| Type             | Text Prediction |
| GFlops           | 293.0489        |
| MParams          | 175.6203        |
| Source framework | PyTorch\*       |

GFlops are calculated for the `1, 1024` input shape, which is suitable for long contexts.

## Accuracy

Perplexity obtained on the WikiText-2 raw character level data dataset for the converted model.

| Metric     | Value  |
|------------|--------|
| Perplexity | 29.00% |
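
For reference, perplexity is the exponential of the average per-token negative log-likelihood. Below is a minimal sketch of how it could be computed from raw model logits; the function name, shapes, and NumPy implementation are illustrative and are not the evaluation pipeline actually used to produce the number above.

```python
import numpy as np

def perplexity(logits, target_ids):
    """Perplexity = exp(mean negative log-likelihood of the target tokens).

    logits     -- float array, shape [L, V]: next-token scores at each position
    target_ids -- int array, shape [L]: the token that actually follows each position
    """
    shifted = logits - logits.max(axis=-1, keepdims=True)      # subtract max for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    nll = -log_probs[np.arange(len(target_ids)), target_ids]   # per-token negative log-likelihood
    return float(np.exp(nll.mean()))
```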

## Input

### Original model

Token ids, name: `input`, dynamic shape in the format `B, L`, where:

* `B` - batch size
* `L` - sequence length

### Converted model

Token ids, name: `input`, dynamic shape in the format `B, L`, where:

* `B` - batch size
* `L` - sequence length
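
The token ids are indices into GPT-2's 50257-token BPE vocabulary. A minimal sketch of preparing the `input` tensor, assuming the Hugging Face `transformers` GPT-2 tokenizer (an assumption; any tokenizer that produces the same vocabulary ids would work):

```python
import numpy as np
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
text = "OpenVINO makes deep learning inference"
token_ids = tokenizer.encode(text)                  # list of ints, length L
input_ids = np.array([token_ids], dtype=np.int64)   # shape [B, L], with B = 1
print(input_ids.shape)                              # (1, L)
```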

## Output

### Original model

Prediction scores of the language modeling head, name: `output`, dynamic shape `B, L, 50257` in the format `B, L, S`, where:

* `B` - batch size
* `L` - sequence length
* `S` - vocab size

### Converted model

Prediction scores of the language modeling head, name: `output`, dynamic shape `B, L, 50257` in the format `B, L, S`, where:

* `B` - batch size
* `L` - sequence length
* `S` - vocab size
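
A minimal sketch of running the converted model with the OpenVINO™ Runtime and reading the `output` tensor. The IR path and the token id values are illustrative placeholders, not values taken from this document:

```python
import numpy as np
from openvino.runtime import Core

core = Core()
model = core.read_model("public/gpt-2/FP32/gpt-2.xml")       # hypothetical path to the converted IR
compiled = core.compile_model(model, "CPU")

input_ids = np.array([[50256, 464, 2068]], dtype=np.int64)   # arbitrary example token ids, shape [B, L]
result = compiled({"input": input_ids})[compiled.output(0)]

print(result.shape)                       # (1, 3, 50257), i.e. [B, L, S]
next_token = int(result[0, -1].argmax())  # greedy choice for the next token id
```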

## Download a Model and Convert it into OpenVINO™ IR Format

You can download models and, if necessary, convert them into OpenVINO™ IR format using the Model Downloader and other automation tools, as shown in the examples below.

An example of using the Model Downloader:

```sh
omz_downloader --name <model_name>
```

An example of using the Model Converter:

```sh
omz_converter --name <model_name>
```
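
For this model, substituting its name from this document, the commands would presumably be:

```sh
omz_downloader --name gpt-2
omz_converter --name gpt-2
```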

## Demo usage

The model can be used in the following demos provided by the Open Model Zoo to show its capabilities:

## Legal Information

The original model is distributed under the MIT License.