Releases: neuralmagic/compressed-tensors

Compressed Tensors v0.11.0

19 Aug 19:02
b78eb8f

What's Changed

  • Fix nightly issues with python 3.11 by @dhuangnm in #371
  • Clean up disk space and restore ubuntu 22.04 runner by @dhuangnm in #373
  • [Transform] Update tests to use conftest file by @kylesayrs in #367
  • [Transform] Hadamard Permutations by @kylesayrs in #329
  • [Transform] Construct on GPU, cache on CPU by @kylesayrs in #352
  • enable code coverage collection and reporting INFERENG-1049 by @derekk-nm in #382
  • Deprecate iter_named_leaf_modules and iter_named_quantizable_modules by @kylesayrs in #381
  • Added support for compression on meta device by @shanjiaz in #376
  • Add torch.float64 as a viable dtype for scales by @eldarkurtic in #379
  • [Transform] apply_transform_config by @kylesayrs in #348
  • [Compression] Fix compression device movement in cases of indexed devices by @kylesayrs in #384
  • Enable code coverage report for nightly tests by @dhuangnm in #388
  • [Bugfix] Only quant-compress modules with weight quantization by @kylesayrs in #387
  • [Transform] Fix config serialization by @kylesayrs in #396
  • [Transform] Do not fuse div operation into hadamard matrices by @kylesayrs in #395
  • [Transform] Implement multi-headed transforms by @kylesayrs in #383
  • Support DeepSeekV3-style block FP8 quantization by @mgoin in #372
  • [Transform] [Utils] Canonical matching utilities by @kylesayrs in #392
  • [Bugfix] Safeguard against submodule parameter deletion in decompress_model by @kylesayrs in #347
  • fix block quantization initialization by @shanjiaz in #403
  • [Utils] Skip internal modules when matching by @kylesayrs in #404
  • [Quantization][Decompression] Fix QDQ for dynamic quant; Update NVFP4 Compression Params by @dsikka in #407
  • [Utils] Support matching vLLM modules by @kylesayrs in #413
  • Fix block size inference logic by @shanjiaz in #411
  • [Transform] Serialize with tied weights by @kylesayrs in #370
  • [Transform] [Utils] Support precision, add torch dtype validation by @kylesayrs in #414
  • [Transform] Serialize transforms config by @kylesayrs in #412
  • Error when configs are created with unrecognized fields by @kylesayrs in #386
  • revert forbid constraint on QuantizationConfig by @brian-dellabetta in #418
  • Revert "[Transform] Serialize transforms config (#412)" by @dsikka in #419
  • added wrapper for execution device by @shanjiaz in #417
  • [Transform] Serialize config (include format) by @dsikka in #420
  • exclude transform_config from quantization_config parse by @brian-dellabetta in #421
  • [Quantization] Support more than one quant-compressor by @dsikka in #415
  • [QuantizationScheme] Validate format by @dsikka in #424
  • [Utils] Expand is_match by @kylesayrs in #416
  • fix match.py syntax by @shanjiaz in #426
  • [Offload] Fully remove dispatch by @kylesayrs in #427

Full Changelog: 0.10.2...0.11.0
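Among the v0.11.0 changes, #372 adds support for DeepSeekV3-style block FP8 quantization. The core idea can be sketched in plain numpy: each 128×128 weight tile gets one scale from its absolute maximum against the float8 e4m3 finite max of 448. This is an illustration of the scheme, not the library's API; all names below are made up for the example, and real code would round to fp8 where this sketch only rescales and clamps.

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite float8 e4m3 value
BLOCK = 128           # DeepSeek-V3 quantizes weights in 128x128 blocks

def block_quantize(weight: np.ndarray, block: int = BLOCK):
    """Per-block absmax scaling: each (block x block) tile gets one scale."""
    rows, cols = weight.shape
    assert rows % block == 0 and cols % block == 0
    scales = np.zeros((rows // block, cols // block))
    q = np.empty_like(weight)
    for i in range(0, rows, block):
        for j in range(0, cols, block):
            tile = weight[i:i + block, j:j + block]
            scale = np.abs(tile).max() / FP8_E4M3_MAX
            scales[i // block, j // block] = scale
            # Real code would round to fp8 here; we only clamp to the fp8 range.
            q[i:i + block, j:j + block] = np.clip(
                tile / scale, -FP8_E4M3_MAX, FP8_E4M3_MAX
            )
    return q, scales

def block_dequantize(q: np.ndarray, scales: np.ndarray, block: int = BLOCK):
    """Broadcast each block's scale back over its tile."""
    return q * np.repeat(np.repeat(scales, block, axis=0), block, axis=1)
```

Since this sketch skips the fp8 rounding step, dequantization reconstructs the original weight up to floating-point error; the storage win in the real scheme comes from keeping `q` in 8-bit float plus one scale per block.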

Compressed Tensors v0.10.2

23 Jun 13:20
38cbdd1

What's Changed

  • [Hotfix] Implement quantization compressor methods on dense compressor by @kylesayrs in #344
  • [Hotfix] Implement method on dense compressor by @kylesayrs in #345
  • [Transform] Factory classes with shared memory and offloading by @kylesayrs in #316
  • [Transform] [Bugfix] Fix enum value serialization in python>=3.11 by @kylesayrs in #350
  • Remove redundant call by @eldarkurtic in #349
  • [Accelerate] Rename and simplify force_cpu_offload by @kylesayrs in #354
  • [Transform] Extend set of known Hadamard matrices by @kylesayrs in #351
  • [Accelerate] Fix offloaded_dispatch, implement disable_offloading by @kylesayrs in #355
  • [Accelerate] Extend functionality of register_offload_parameter by @kylesayrs in #356
  • [Bugfix] Fix saving of models dispatched by offloaded_dispatch by @kylesayrs in #357
  • [Bugfix] Only update direct params in disable_offloading by @kylesayrs in #360
  • reference updated reportportal_submit_execution_results action by @derekk-nm in #362
  • [Accelerate] Expand get_execution_device to support models by @kylesayrs in #363
  • [Accelerate] Fix typos in get_execution_device by @kylesayrs in #365

Full Changelog: 0.10.1...0.10.2

Compressed Tensors v0.10.1

06 Jun 18:26
f5dbfc3

What's Changed

  • [Transform] Hadamard and Matrix Transform Utils by @kylesayrs in #330
  • Fix error on import whenever accelerate is absent by @maresb in #342

Full Changelog: 0.10.0...0.10.1
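The Hadamard utilities introduced in #330 (and extended in later releases) build on a simple piece of linear algebra: a Hadamard matrix H of size n satisfies H Hᵀ = nI, so H/√n is an orthogonal rotation that can be applied to weights and inverted exactly. A minimal sketch of the classic Sylvester construction in plain numpy, as an illustration of the math rather than the library's implementation:

```python
import numpy as np

def sylvester_hadamard(n: int) -> np.ndarray:
    """Build an n x n Hadamard matrix (n a power of two) by Sylvester's
    construction: H_{2k} = [[H_k, H_k], [H_k, -H_k]], starting from [[1]]."""
    assert n > 0 and (n & (n - 1)) == 0, "n must be a power of two"
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h
```

Because every entry is ±1 and H Hᵀ = nI, applying H/√n to a tensor preserves norms while spreading outlier values across coordinates, which is what makes Hadamard rotations attractive as quantization-friendly transforms.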

Compressed Tensors v0.10.0

05 Jun 17:51
d7ce8ec

What's Changed

  • Updates to build system by @dbarbuzzi in #304
  • [Utils] add align_modules by @kylesayrs in #282
  • Enable module state_dict compression, simplify compression logic by @kylesayrs in #302
  • Fix _initialize_scale_zero_point initializing on the wrong device by @mgoin in #295
  • Revert "Enable module state_dict compression, simplify compression lo… by @kylesayrs in #306
  • [Bugfix] Fix shape calculation for group quantization by @kylesayrs in #308
  • Enable module state_dict compression, simplify compression logic by @kylesayrs in #307
  • Clarify decompression return type by @kylesayrs in #310
  • Clarify match_param_name return type by @kylesayrs in #312
  • [Compressor][NVFP4] Support FP4 Compression by @dsikka in #311
  • [NVFP4] Update FloatArgs and NVFP4 by @dsikka in #313
  • fix signatures on model_validator functions by @brian-dellabetta in #314
  • [Performance] Add memory compression and decompression pathways by @kylesayrs in #301
  • Model Compression: Set compression status by @kylesayrs in #318
  • [NVFP4] Enable Fp4 Quantization; introduce / apply global_scales by @dsikka in #315
  • [NVFP4] Skip fused global scale calculation if already fused by @dsikka in #322
  • Update default observer to be MSE by @shanjiaz in #300
  • [Misc] Generics typehinting for RegistryMixin by @kylesayrs in #320
  • Revert "Update default observer to be MSE (#300)" by @dsikka in #323
  • [NVFP4] Add tensor_group strategy; enable NVFP4 Activations by @dsikka in #317
  • [Transforms] Transform Args, Scheme, and Config by @kylesayrs in #321
  • [NVFP4] Expand dynamic types, clean-up conditions by @dsikka in #325
  • Use different runner for UPLOAD job by @dbarbuzzi in #327
  • [NVFP4] Use torch.compile when rounding to NVFP4 by @dsikka in #331
  • [Tests] Update test_fp8_quant.py by @dsikka in #337
  • [Tests] Fix test scale init for group quant by @dsikka in #338
  • [Quantization] Update group quantization by @dsikka in #336
  • [NVFP4] update global scale generation by @dsikka in #339
  • [Transform] Accelerate Utilities by @kylesayrs in #328
  • Model Compression: Delete offload by @kylesayrs in #319
  • [Decompression] Keep unused parameters when decompressing from memory by @kylesayrs in #340
  • [NVFP4] Small Nits by @dsikka in #341

Full Changelog: 0.9.4...0.10.0
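Several v0.10.0 entries touch group quantization and its shape calculation (#308, #336, #338). The shape convention at stake can be sketched in a few lines of numpy: with groups taken along the input (column) dimension, a (rows, cols) weight yields one scale per group, i.e. scales of shape (rows, cols // group_size). This is a generic symmetric int8 illustration, not the library's code:

```python
import numpy as np

def group_quantize(weight: np.ndarray, group_size: int = 128):
    """Symmetric int8 group quantization sketch: per-group absmax scales
    along the input dimension. Returns (quantized weight, scales)."""
    rows, cols = weight.shape
    assert cols % group_size == 0, "columns must divide evenly into groups"
    grouped = weight.reshape(rows, cols // group_size, group_size)
    # One scale per (row, group): shape (rows, cols // group_size)
    scales = np.abs(grouped).max(axis=-1) / 127.0
    q = np.clip(np.round(grouped / scales[..., None]), -128, 127)
    return q.reshape(rows, cols), scales
```

Getting the scale shape right matters because serialization and dequantization both assume scales broadcast one-per-group against the weight, which is exactly the kind of invariant the shape-calculation fixes above guard.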

Compressed Tensors v0.9.4

24 Apr 19:21
8aa8b82

What's Changed

Full Changelog: 0.9.3...0.9.4

Compressed Tensors v0.9.3

02 Apr 17:13
4574747

What's Changed

Full Changelog: 0.9.2...0.9.3

Compressed Tensors v0.9.2

18 Feb 19:08
b8cf630

What's Changed

Full Changelog: 0.9.1...0.9.2

Compressed Tensors v0.9.1

23 Jan 18:54
fc711ca

What's Changed

Full Changelog: 0.9.0...0.9.1

Compressed Tensors v0.9.0

15 Jan 20:28
0ba31bf

What's Changed

Full Changelog: 0.8.1...0.9.0

Compressed Tensors v0.8.1

11 Dec 19:29
5771dec

What's Changed

  • Skip accelerate tests by @kylesayrs in #208
  • Remove QuantizationScheme.default_scheme by @kylesayrs in #202
  • Allow ModelCompressor.from_pretrained to load from quantization_config, not compression config by @horheynm in #207
  • Quantization Scheme Validation by @kylesayrs in #209
  • Fix uninitialized variable in quantized compressors by @markmc in #205
  • Implement aliasable mixin and alias activation ordering by @kylesayrs in #213
  • Revert "Implement aliasable mixin and alias activation ordering (#213)" by @dsikka in #217
  • Implement aliasable mixin and alias activation ordering (python3.9 fix) by @kylesayrs in #218
  • bump by @dsikka in #226

Full Changelog: 0.8.0...0.8.1
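The aliasable mixin from #213/#218 lets enum-valued config fields accept alternate spellings that resolve to a canonical member. The general pattern can be sketched with Python's `_missing_` hook; the member names and alias table below are illustrative stand-ins, not the library's actual values or implementation:

```python
from enum import Enum

class ActivationOrdering(str, Enum):
    """Sketch of an aliasable enum: constructing it from an alias string
    resolves to the canonical member via the _missing_ hook."""
    GROUP = "group"
    WEIGHT = "weight"

    @classmethod
    def _missing_(cls, value):
        # Illustrative alias table; real aliases live in compressed-tensors.
        aliases = {"act": "group", "static": "weight"}
        canonical = aliases.get(value)
        return cls(canonical) if canonical is not None else None
```

Returning None from `_missing_` preserves the normal ValueError for genuinely unknown values, so deserialized configs fail loudly rather than silently accepting typos. The python3.9 fix in #218 reflects that enum internals shifted across Python versions, which is also why the enum-serialization fix for python>=3.11 (#350) appears in v0.10.2.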