🎯
Focusing

Highlights

  • Pro

Organizations

@apache @dmlc @uwsampl @octoml


Modeling, training, eval, and inference code for OLMo

Python 4,786 488 Updated Dec 1, 2024

Efficient, Flexible and Portable Structured Generation

C++ 375 18 Updated Nov 29, 2024

Structured Text Generation

Python 9,843 503 Updated Nov 29, 2024
Python 733 11 Updated Apr 17, 2024

CUDA Python Low-level Bindings

Python 984 77 Updated Nov 30, 2024

Code for Papeg.ai

JavaScript 200 21 Updated Nov 13, 2024

Run LLMs in the Browser with MLC / WebLLM ✨

TypeScript 91 12 Updated Oct 5, 2024

Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA

C++ 663 38 Updated Dec 1, 2024

Introduction to Machine Learning Systems

TeX 1,235 156 Updated Nov 27, 2024

Run PyTorch LLMs locally on servers, desktop and mobile

Python 3,405 225 Updated Nov 26, 2024

Chat with AI large language models running natively in your browser. Enjoy private, server-free, seamless AI conversations.

TypeScript 501 62 Updated Nov 13, 2024

Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 10,337 637 Updated Nov 29, 2024

An enterprise-grade AI retriever designed to streamline AI integration into your applications, ensuring cutting-edge accuracy.

Python 266 32 Updated Nov 3, 2024

🙌 OpenHands: Code Less, Make More

Python 37,692 4,259 Updated Dec 1, 2024

A cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Win / MacOS). Get your own cross-platform ChatGPT/Gemini app in one click.

TypeScript 77,126 59,415 Updated Nov 28, 2024

A @ClickHouse fork that supports high-performance vector search and full-text search.

C++ 873 45 Updated Nov 14, 2024

Grok open release

Python 49,644 8,330 Updated Aug 30, 2024

asyncio is a C++20 library to write concurrent code using the async/await syntax.

C++ 830 80 Updated Feb 3, 2024

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

Python 640 50 Updated Sep 4, 2024

SGLang is a fast serving framework for large language models and vision language models.

Python 6,286 549 Updated Dec 1, 2024

Social and customizable AI writing assistant! ✍️

TypeScript 192 26 Updated Jun 29, 2024

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

Svelte 49,158 6,030 Updated Dec 1, 2024

An extension of TVMScript to write simple and high-performance GPU kernels with tensorcore.

Python 49 2 Updated Jul 23, 2024

MLX: An array framework for Apple silicon

C++ 17,617 1,017 Updated Nov 29, 2024

FlashInfer: Kernel Library for LLM Serving

Cuda 1,490 147 Updated Nov 26, 2024

Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

Python 5,685 515 Updated Oct 18, 2024

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"

Python 767 122 Updated Oct 10, 2024

Vercel and web-llm template to run wasm models directly in the browser.

TypeScript 126 17 Updated Nov 21, 2023

Serving multiple LoRA-finetuned LLMs as one

Python 991 46 Updated May 8, 2024

Letta (formerly MemGPT) is a framework for creating LLM services with memory.

Python 13,068 1,432 Updated Nov 30, 2024