PROTOQUANT - Dynamic Quantization with Tensor Subclassing

The protoquant package provides dynamic vector-wise quantization and quantized arithmetic using torch.Tensor subclassing.

This dynamic quantization support is aimed at a broad range of applications. It is currently tested with the PyTorch Transformer API and the Better Transformer implementation, with a focus on GPU inference.

Although testing focuses on Transformer inference, protoquant is not limited to that use case and is broadly applicable to inference with dynamic quantization in PyTorch.
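To make "dynamic vector-wise quantization" concrete, here is a minimal sketch in plain PyTorch. It does not use the protoquant API: the function names (quantize_rowwise, dequantize_rowwise, quantized_matmul) are hypothetical, the symmetric int8 scheme with one scale per row is an assumption, and the tensor-subclass wrapper is left out. It runs on CPU and is only meant to illustrate the idea of computing scales at run time from the data.

```python
import torch


def quantize_rowwise(x: torch.Tensor):
    """Symmetric int8 quantization with one scale per row (hypothetical helper).

    "Dynamic" here means the scale is computed from the tensor at run time
    rather than calibrated ahead of time.
    """
    # Largest magnitude in each row; the clamp avoids dividing by zero
    # when a row is all zeros.
    max_abs = x.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8)
    scale = max_abs / 127.0
    q = torch.round(x / scale).clamp(-127, 127).to(torch.int8)
    return q, scale


def dequantize_rowwise(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Map int8 values back to floating point using the per-row scale."""
    return q.to(scale.dtype) * scale


def quantized_matmul(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Approximate a @ b using int8 operands and int32 accumulation.

    Rows of `a` and columns of `b` each get their own scale.
    """
    qa, sa = quantize_rowwise(a)            # sa has shape (M, 1)
    qb_t, sb_t = quantize_rowwise(b.t())    # one scale per column of b: (N, 1)
    # Integer matmul on CPU; widen to int32 so the sums do not overflow int8.
    acc = qa.to(torch.int32) @ qb_t.to(torch.int32).t()
    # Rescale: entry (i, j) is multiplied by sa[i] * sb[j].
    return acc.to(torch.float32) * sa * sb_t.t()


if __name__ == "__main__":
    torch.manual_seed(0)
    a = torch.randn(4, 16)
    b = torch.randn(16, 8)
    approx = quantized_matmul(a, b)
    exact = a @ b
    print("max abs error:", (approx - exact).abs().max().item())
```

Per-vector scales keep a single outlier row from degrading the precision of every other row, which is the usual motivation for vector-wise over per-tensor quantization.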

Installation

You need to clone the repo with recursive submodules.

git clone --recurse-submodules https://github.com/facebookexperimental/protoquant.git

If you forget to, you can always fix this using this trick

Once the repository is cloned, you must be on a GPU machine for pip install -e . to work.

If you really want to compile on a CPU machine, see here

License

MIT
