efrantar

Organizations

@IST-DASLab

Pinned

  1. IST-DASLab/gptq Public

    Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".

    Python · 1.9k stars · 154 forks

  2. IST-DASLab/sparsegpt Public

    Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".

    Python · 721 stars · 96 forks

  3. IST-DASLab/marlin Public

    FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batchsizes of 16-32 tokens.

    Python · 624 stars · 47 forks

  4. IST-DASLab/qmoe Public

    Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".

    Python · 262 stars · 22 forks

  5. IST-DASLab/OBC Public

    Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning".

    Python · 104 stars · 14 forks

  6. rob-twophase Public

    The ultimate Rubik's Cube solving algorithm for high-speed axial robots.

    C++ · 123 stars · 10 forks