Starred repositories

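A job-hunting guide for AI algorithm positions (covering preparation strategies, a coding-practice guide, referrals, a list of AI companies, and other materials)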

5,257 639 Updated Apr 24, 2024

A beginner's guide to big data ⭐

Java 15,943 4,248 Updated Jan 5, 2024

[ICML 2024] Official PyTorch implementation of "SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization"

Python 79 6 Updated Aug 23, 2024

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 10,816 2,133 Updated Nov 5, 2024

Quantization library for PyTorch. Supports low-precision and mixed-precision quantization, with hardware implementation through TVM.

Python 413 82 Updated May 15, 2023

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 30,360 4,596 Updated Nov 18, 2024

[TMLR 2024] Efficient Large Language Models: A Survey

1,025 86 Updated Nov 9, 2024

Code repo for the paper "LLM-QAT Data-Free Quantization Aware Training for Large Language Models"

Python 254 24 Updated Sep 3, 2024

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

Python 730 56 Updated Oct 8, 2024

LightSeq: A High Performance Library for Sequence Processing and Generation

C++ 3,210 329 Updated May 16, 2023

The PyTorch implementation of Learned Step size Quantization (LSQ) in ICLR2020 (unofficial)

Jupyter Notebook 124 21 Updated Nov 19, 2020
Python 15 6 Updated Oct 26, 2022

Large models from China

5,514 452 Updated Jun 7, 2024

Grok open release

Python 49,577 8,320 Updated Aug 30, 2024

Official inference library for Mistral models

Jupyter Notebook 9,723 863 Updated Nov 12, 2024

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 5,055 385 Updated Aug 7, 2024

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Python 4,487 484 Updated Sep 28, 2024

The official implementation of the ICML 2023 paper OFQ-ViT

Python 27 Updated Oct 3, 2023

Running BERT without Padding

C++ 460 52 Updated Mar 18, 2022

A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT2, decoders, etc.) on CPU and GPU.

C++ 1,484 198 Updated Jun 12, 2023

Transformer related optimization, including BERT, GPT

C++ 5,887 893 Updated Mar 27, 2024

Modeling, training, eval, and inference code for OLMo

Python 4,642 473 Updated Nov 18, 2024

Implementation of "DeepShift: Towards Multiplication-Less Neural Networks" https://arxiv.org/abs/1905.13298

Python 108 30 Updated Nov 22, 2021

Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models".

Python 53 5 Updated Mar 7, 2024

Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees"

Python 349 32 Updated Feb 24, 2024

[EMNLP 2022 main] Code for "Understanding and Improving Knowledge Distillation for Quantization-Aware-Training of Large Transformer Encoders"

Jupyter Notebook 7 Updated Feb 7, 2023

[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization

Python 650 43 Updated Aug 13, 2024

QLoRA: Efficient Finetuning of Quantized LLMs

Jupyter Notebook 10,055 823 Updated Jun 10, 2024

"Princeling Relationship Network", compiled by 编程随想, dedicated to exposing the power elite of the "State of Zhao"

Python 13,555 2,751 Updated Aug 1, 2021