scissorstail
  • Seoul, South Korea


A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF into Markdown and JSON formats.

Python 17,343 1,251 Updated Nov 18, 2024

PathPiece tokenizer

Rust 5 Updated Nov 10, 2024
Python 53 2 Updated Oct 30, 2024
Python 150 9 Updated Nov 17, 2024

🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library

Python 1,293 44 Updated Nov 17, 2024

Laminar - open-source all-in-one platform for engineering AI products. Traces, Evals, Datasets, Labels. YC S24.

TypeScript 1,128 56 Updated Nov 17, 2024

1-Click is all you need.

Jupyter Notebook 59 8 Updated Apr 29, 2024

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Wo…

Python 4,615 319 Updated Nov 15, 2024

➖ Stripped down, stable version of firecrawl optimized for self-hosting and ease of contribution. Billing logic and AI features are completely removed. Crawl and convert any website into LLM-ready …

TypeScript 214 7 Updated Nov 7, 2024

Efficient optimizers

Python 75 3 Updated Nov 17, 2024

For optimization algorithm research and development.

Python 445 31 Updated Nov 18, 2024
HTML 58 2 Updated Nov 16, 2024

Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web, vision.

Python 2,608 175 Updated Nov 17, 2024

A curated list of Large Language Model resources, covering model training, serving, fine-tuning, and building LLM applications.

1,121 116 Updated Nov 16, 2024

Official repo for paper DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning.

Python 258 21 Updated Oct 8, 2024
Python 19 Updated Nov 9, 2024

Label Studio is a multi-type data labeling and annotation tool with standardized output format

JavaScript 19,368 2,403 Updated Nov 18, 2024

A curated reading list of research in Mixture-of-Experts(MoE).

538 41 Updated Oct 30, 2024

🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.

18,055 2,512 Updated Nov 14, 2024

Get your documents ready for gen AI

Python 9,720 460 Updated Nov 17, 2024

Implementation of "Attention Is Off By One" by Evan Miller

Python 183 10 Updated Aug 28, 2023

Official Implementation of TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters

Python 325 19 Updated Nov 12, 2024

⏬ Dumb downloader that scrapes the web

Python 53,886 9,638 Updated Oct 28, 2024

A simple screen parsing tool towards a pure vision-based GUI agent

Jupyter Notebook 4,732 357 Updated Nov 5, 2024

Code repository for the paper "MrT5: Dynamic Token Merging for Efficient Byte-level Language Models."

Jupyter Notebook 27 1 Updated Nov 14, 2024

Recipes for shrinking, optimizing, customizing cutting edge vision models. 💜

Jupyter Notebook 888 84 Updated Sep 11, 2024

Structured Text Generation

Python 9,479 483 Updated Nov 10, 2024

[EMNLP 2023] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning

Python 212 10 Updated Oct 31, 2023

Data and code for our paper "Why Does the Effective Context Length of LLMs Fall Short?"

Python 62 3 Updated Nov 12, 2024