Starred repositories
[ICML 2024] Official PyTorch implementation of "SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization"
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
Quantization library for PyTorch. Supports low-precision and mixed-precision quantization, with hardware implementation through TVM.
A high-throughput and memory-efficient inference and serving engine for LLMs
[TMLR 2024] Efficient Large Language Models: A Survey
Code repo for the paper "LLM-QAT: Data-Free Quantization Aware Training for Large Language Models"
[ICLR 2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
LightSeq: A High Performance Library for Sequence Processing and Generation
The PyTorch implementation of Learned Step Size Quantization (LSQ) from ICLR 2020 (unofficial)
Official inference library for Mistral models
The official repo of Qwen-VL (通义千问-VL), the chat and pretrained large vision-language model proposed by Alibaba Cloud.
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
The official implementation of the ICML 2023 paper OFQ-ViT
A fast and user-friendly runtime for Transformer inference (BERT, ALBERT, GPT-2, decoders, etc.) on CPU and GPU.
Transformer-related optimization, including BERT and GPT
Modeling, training, eval, and inference code for OLMo
Implementation of "DeepShift: Towards Multiplication-Less Neural Networks" https://arxiv.org/abs/1905.13298
Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models".
Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees"
[EMNLP 2022 main] Code for "Understanding and Improving Knowledge Distillation for Quantization-Aware Training of Large Transformer Encoders"
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
QLoRA: Efficient Finetuning of Quantized LLMs