packedBERT algorithms for efficient sequence packing

This folder contains the histogram data for the Wikipedia and SQuAD datasets, as well as the algorithms presented in "Packing: Towards 2x NLP BERT Acceleration".

Contents

  1. Sequence length histograms (histograms.py)
  2. Non-negative least squares histogram-packing (nnlshp.py)
  3. Shortest-pack-first histogram-packing (spfhp.py)
  4. Longest-pack-first histogram-packing (lpfhp.py)
  5. Extended non-negative least squares histogram-packing (ennlshp.py)

Example usage

All of the presented algorithms operate on histograms of sequence lengths rather than on the sequences themselves. The following snippets show how to run each one on the Wikipedia pre-training dataset histogram.
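To run the algorithms on a different dataset, a histogram of its sequence lengths is needed first. The following is a minimal sketch of how one could be built with NumPy. The function name `build_length_histogram` is hypothetical, and the layout (entry `i` holds the number of sequences of length `i + 1`) is an assumption that should be checked against `histograms.py` before use.

```python
import numpy as np


def build_length_histogram(sequence_lengths, max_sequence_length):
    """Count how many sequences have each length from 1 to max_sequence_length.

    Assumed layout (not confirmed by this README): histogram[i] holds the
    number of sequences of length i + 1.
    """
    lengths = np.asarray(sequence_lengths, dtype=np.int64)
    if lengths.min() < 1 or lengths.max() > max_sequence_length:
        raise ValueError("sequence lengths must lie in [1, max_sequence_length]")
    # bincount counts values 0..max; drop the unused bin for length 0.
    return np.bincount(lengths, minlength=max_sequence_length + 1)[1:]


# Toy data: seven sequences with a maximum length of 8 tokens.
toy_lengths = [3, 8, 8, 1, 5, 3, 8]
toy_histogram = build_length_histogram(toy_lengths, max_sequence_length=8)
print(toy_histogram)  # [1 0 2 0 1 0 0 3]
```

Working on the histogram keeps the input size bounded by the maximum sequence length, regardless of how many sequences the dataset contains.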

Non-negative least-squares histogram-packing (NNLSHP):

from histograms import wikipedia_histogram, wikipedia_max_sequence_length
from nnlshp import pack_using_nnlshp
max_sequences_per_pack = 3
strategy_set, strategy_repeat_count = pack_using_nnlshp(wikipedia_histogram, wikipedia_max_sequence_length, max_sequences_per_pack)

which is expected to print:

Packing efficiency (fraction of real tokens): 0.9975
 Speed-up theoretical limit: 2.0013
 Achieved speed-up over un-packed dataset: 1.99625
Runtime: Packed 16279552 sequences in 24.267 seconds.
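The three reported figures are related. As a sketch, assume that packing efficiency is real tokens divided by total token slots, that the theoretical limit is the maximum sequence length divided by the mean sequence length, and that the achieved speed-up is the number of sequences divided by the number of packs. Under those definitions the achieved speed-up equals efficiency times the theoretical limit; for the figures above, 0.9975 × 2.0013 ≈ 1.9963, which agrees with the reported 1.99625 up to rounding. The helper `packing_metrics` is hypothetical, and the assumed return format (each entry of `strategy_set` is a list of sequence lengths forming one pack, and `strategy_repeat_count` gives how often that pack is used) should be verified against the packing modules.

```python
import numpy as np


def packing_metrics(strategy_set, strategy_repeat_count, max_sequence_length):
    """Return (efficiency, speed-up limit, achieved speed-up) for a packing.

    Assumed format (not confirmed by this README): strategy_set[i] is a list
    of sequence lengths in one pack; strategy_repeat_count[i] is the number
    of times that pack is used.
    """
    repeats = np.asarray(strategy_repeat_count, dtype=np.int64)
    tokens_per_pack = np.array([sum(s) for s in strategy_set], dtype=np.int64)
    sequences_per_pack = np.array([len(s) for s in strategy_set], dtype=np.int64)

    n_packs = int(repeats.sum())
    n_sequences = int((sequences_per_pack * repeats).sum())
    real_tokens = int((tokens_per_pack * repeats).sum())

    efficiency = real_tokens / (n_packs * max_sequence_length)
    speedup_limit = n_sequences * max_sequence_length / real_tokens
    achieved_speedup = n_sequences / n_packs
    return efficiency, speedup_limit, achieved_speedup


# Toy packing with a maximum length of 8: three packs of [8], one [5, 3], one [3, 1].
toy_strategies = [[8], [5, 3], [3, 1]]
toy_repeats = [3, 1, 1]
efficiency, limit, speedup = packing_metrics(toy_strategies, toy_repeats, 8)
print(f"efficiency={efficiency:.4f} limit={limit:.4f} speedup={speedup:.4f}")
# efficiency=0.9000 limit=1.5556 speedup=1.4000
```

The identity also explains why the limit is the same for every algorithm: it depends only on the dataset, while the efficiency depends on the packing.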

Shortest-pack-first histogram-packing (SPFHP):

from histograms import wikipedia_histogram, wikipedia_max_sequence_length
from spfhp import pack_using_spfhp
max_sequences_per_pack = 12
strategy_set, strategy_repeat_count = pack_using_spfhp(wikipedia_histogram, wikipedia_max_sequence_length, max_sequences_per_pack)

which is expected to print:

Packing efficiency (fraction of real tokens): 99.6040
 Speed-up theoretical limit: 2.0013
 Achieved speed-up over un-packed dataset: 1.99340
 Runtime: Packed 16279552 sequences in 0.032 seconds.

Longest-pack-first histogram-packing (LPFHP):

from histograms import wikipedia_histogram, wikipedia_max_sequence_length
from lpfhp import pack_using_lpfhp
max_sequences_per_pack = 12
strategy_set, strategy_repeat_count = pack_using_lpfhp(wikipedia_histogram, wikipedia_max_sequence_length, max_sequences_per_pack)

which is expected to print:

Packing efficiency (fraction of real tokens): 99.8129
 Speed-up theoretical limit: 2.0013
 Achieved speed-up over un-packed dataset: 1.99758
 Runtime: Packed 16279552 sequences in 0.048 seconds.

Extended non-negative least squares histogram-packing (ENNLSHP):

from histograms import wikipedia_histogram, wikipedia_max_sequence_length
from ennlshp import pack_using_ennlshp
max_sequences_per_pack = 3
strategy_set, strategy_repeat_count = pack_using_ennlshp(wikipedia_histogram, wikipedia_max_sequence_length, max_sequences_per_pack)

which is expected to print:

Packing efficiency (fraction of real tokens): 0.9975
 Speed-up theoretical limit: 2.0013
 Achieved speed-up over un-packed dataset: 1.99625
Runtime: Packed 16279552 sequences in 283.997 seconds.