    Repositories list

    • EvoPress

      Public
      Python
      22010 Updated Mar 29, 2025
    • PanzaMail

      Public
      Python
      Apache License 2.0
      1728445 Updated Mar 28, 2025
    • torch_cgx

      Public
      PyTorch distributed backend extension with compression support
      C++
      GNU Affero General Public License v3.0
      01640 Updated Mar 24, 2025
    • QuEST

      Public
      Work in progress.
      Jupyter Notebook
      MIT License
      45020 Updated Mar 17, 2025
    • gemm-int8

      Public
      High-performance INT8 GEMM kernels for SM80 and later GPUs.
      Python
      MIT License
      0600 Updated Mar 11, 2025
    • DarwinLM

      Public
      Official PyTorch implementation of the paper "DarwinLM: Evolutionary Structured Pruning of Large Language Models"
      Python
      2900 Updated Feb 21, 2025
    • Python
      Apache License 2.0
      0800 Updated Feb 19, 2025
    • Official Repository for "Scalable Mechanistic Neural Networks" (ICLR 2025)
      Python
      MIT License
      0100 Updated Feb 19, 2025
    • SPADE

      Public
      Code for SPADE: Sparsity Guided Debugging for Deep Neural Networks
      Jupyter Notebook
      3110 Updated Feb 18, 2025
    • HALO

      Public
      HALO: Hadamard-Assisted Low-Precision Optimization and Training method for fine-tuning LLMs. The official implementation of https://arxiv.org/abs/2501.02625
      Python
      MIT License
      01210 Updated Feb 17, 2025
    • gemm-fp8

      Public
      High-performance FP8 GEMM kernels for SM89 and later GPUs.
      Cuda
      MIT License
      0700 Updated Jan 24, 2025
    • GridSearcher simplifies running grid searches for machine learning projects in Python, emphasizing parallel execution and GPU scheduling without dependencies on SLURM or other workload managers.
      Python
      Apache License 2.0
      0200 Updated Jan 23, 2025
    • MicroAdam

      Public
      This repository contains code for the MicroAdam paper.
      Python
      Apache License 2.0
      41710 Updated Dec 14, 2024
    • LLM training code for Databricks foundation models
      Python
      Apache License 2.0
      557001 Updated Nov 27, 2024
    • Python
      0100 Updated Nov 25, 2024
    • 0000 Updated Nov 20, 2024
    • LDAdam

      Public
      LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics
      Python
      Apache License 2.0
      0600 Updated Nov 6, 2024
    • Boosting 4-bit inference kernels with 2:4 Sparsity
      Cuda
      Apache License 2.0
      57111 Updated Sep 4, 2024
    • marlin

      Public
      FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
      Python
      Apache License 2.0
      63782275 Updated Sep 4, 2024
    • sparsegpt

      Public
      Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".
      Python
      Apache License 2.0
      102777151 Updated Aug 20, 2024
    • peft-rosa

      Public
      A fork of the PEFT library, supporting Robust Adaptation (RoSA)
      Python
      Apache License 2.0
      31310 Updated Aug 16, 2024
    • Python
      MIT License
      0000 Updated Jun 27, 2024
    • spops

      Public
      C++
      Apache License 2.0
      0720 Updated Jun 20, 2024
    • Code for the EMNLP 2024 paper "Mathador-LM: A Dynamic Benchmark for Mathematical Reasoning on LLMs".
      Python
      Apache License 2.0
      0810 Updated Jun 18, 2024
    • QUIK

      Public
      Repository for the QUIK project, enabling the use of 4-bit kernels for generative inference (EMNLP 2024)
      C++
      Apache License 2.0
      1417851 Updated Apr 16, 2024
    • FastOBQ-

      Public
      GPTQ with fine-tuning
      0000 Updated Mar 27, 2024
    • gptq

      Public
      Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
      Python
      Apache License 2.0
      1642.1k231 Updated Mar 27, 2024
    • RoSA

      Public
      Official implementation of the ICML 2024 paper on RoSA (Robust Adaptation)
      Python
      Apache License 2.0
      33910 Updated Feb 13, 2024
    • Repository for sparse fine-tuning of LLMs via a modified version of the MosaicML llmfoundry
      Python
      Apache License 2.0
      74030 Updated Jan 15, 2024
    • CAP

      Public
      Repository for Correlation Aware Prune (NeurIPS 2023): source and experimental code
      Python
      Apache License 2.0
      1510 Updated Nov 29, 2023