Stars
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
Fast suffix arrays for Rust (with Unicode support).
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024
Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context"
Code and Checkpoints for "Generate rather than Retrieve: Large Language Models are Strong Context Generators" in ICLR 2023.
Official implementation of ICLR'24 paper, "Curiosity-driven Red Teaming for Large Language Models" (https://openreview.net/pdf?id=4KqkizXgXU)
List of papers on hallucination detection in LLMs.
A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
A Python VM injector with debug tools, based on gdb.
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Tools for merging pretrained large language models.
Official inference library for Mistral models
The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents
Medical NLP Competition, dataset, large models, paper
Supercharge Your LLM Application Evaluations 🚀
Codebase for Merging Language Models (ICML 2024)
RefChecker provides automatic checking pipeline and benchmark dataset for detecting fine-grained hallucinations generated by Large Language Models.
Implementation of Nougat: Neural Optical Understanding for Academic Documents
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning