Starred repositories
1st Solution For Conversational Multi-Doc QA Workshop & International Challenge @ WSDM'24 - Xiaohongshu.Inc
Qt based cross-platform GUI proxy configuration manager (backend: sing-box)
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
A Comprehensive Toolkit for High-Quality PDF Content Extraction
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
Machine learning software for extracting information from scholarly documents
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
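TextClf: a text classification framework based on PyTorch/Sklearn, including logistic regression, SVM, TextCNN, TextRNN, TextRCNN, DRNN, DPCNN, Bert, and other models. Data processing, model training, and testing can all be done through simple configuration.
Chinese named entity recognition (with concrete implementations of several models: HMM, CRF, BiLSTM, BiLSTM+CRF)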
Converts an image of a formula to LaTeX. This project is also the final project for the Full Stack Deep Learning course.
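小瓶RPA: a permanently free (personal edition) RPA software system. Lightweight, simple, all-purpose RPA software that significantly reduces costs and improves efficiency & works with 100% accuracy & integrates non-intrusively. Supports workflow automation for both browser web applications and client applications. Supports building workflows with both JS and Python scripts.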
[CVPR 2024 Highlight] Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
A PyTorch implementation of the Transformer model in "Attention is All You Need".
This repository contains a paper collection of the methods for document image processing, including appearance enhancement, deshadow, dewarping, deblur, and binarization.
[CVPR 2024] DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks
A Python wrapper for the Doc2X API that also includes local text processing (to improve PDF recall in RAG).
A modular graph-based Retrieval-Augmented Generation (RAG) system
Transformer-based OCR, extended to official seal recognition (stamp recognition)
DocBank document image enhancement dataset. This dataset is used for document image enhancement; the specific tasks include: seal detection & removal; watermark detection & removal; document deblurring; document shadow removal; document super-resoluti…
ShabbyPages is a state-of-the-art corpus of born-digital document images with both ground truth and distorted versions appropriate for use in training models to reverse distortions and recover to o…
To try textin document parsing, visit https://cc.co/16YSIy
List of references and online resources related to data science, machine learning and deep learning.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step