18th ICDAR 2024: Athens, Greece - Part I
- Elisa H. Barney Smith, Marcus Liwicki, Liangrui Peng:
Document Analysis and Recognition - ICDAR 2024 - 18th International Conference, Athens, Greece, August 30 - September 4, 2024, Proceedings, Part I. Lecture Notes in Computer Science 14804, Springer 2024, ISBN 978-3-031-70532-8
Business Documents
- Laura Jamieson, Carlos Francisco Moreno-García, Eyad Elyan:
A Multiclass Imbalanced Dataset Classification of Symbols from Piping and Instrumentation Diagrams. 3-16
- Glen Pouliquen, Guillaume Chiron, Joseph Chazalon, Thierry Géraud, Ahmad Montaser Awal:
Weakly Supervised Training for Hologram Verification in Identity Documents. 17-33
- Zhen-Lun Mo, Song-Lu Chen, Qi Liu, Feng Chen, Xu-Cheng Yin:
Multi-task Learning for License Plate Recognition in Unconstrained Scenarios. 34-50
- Maxime Talarmain, Carlos Boned Riera, Sanket Biswas, Oriol Ramos Terrades:
Recurrent Few-Shot Model for Document Verification. 51-62
- Xin Yang, Fei Yin, Yan-Ming Zhang, Xudong Yan, Tao Xue:
Document Specular Highlight Removal with Coarse-to-Fine Strategy. 63-78
- Travis Seng, Axel Carlier, Thomas Forgione, Vincent Charvillat, Wei Tsang Ooi:
SlideCraft: Synthetic Slides Generation for Robust Slide Analysis. 79-96
- Oshri Naparstek, Ophir Azulai, Inbar Shapira, Elad Amrani, Yevgeny Yaroker, Yevgeny Burshtein, Roi Pony, Nadav Rubinstein, Foad Abo Dahood, Orit Prince, Idan Friedman, Christoph Auer, Nikolaos Livathinos, Maksym Lysak, Ahmed Nassar, Peter W. J. Staar, Udi Barzelay:
KVP10k: A Comprehensive Dataset for Key-Value Pair Extraction in Business Documents. 97-116
Chinese Text Recognition
- Gang Yao, Ning Ding, Tianqi Zhao, Kemeng Zhao, Pei Tang, Yao Tao, Liangrui Peng:
Visual Prompt Learning for Chinese Handwriting Recognition. 119-133
- Yangyang Liu, Yi Chen, Fei Yin, Cheng-Lin Liu:
Context-Aware Confidence Estimation for Rejection in Handwritten Chinese Text Recognition. 134-151
- Zhongyuan Han, Jun Du, Mobai Xue, Jiefeng Ma, Pengfei Hu, Zhenrong Zhang:
Radical Similarity Based Model Optimization and Post-correction for Chinese Character Recognition. 152-168
- Pengjie Wang, Kaile Zhang, Xinyu Wang, Shengwei Han, Yongge Liu, Lianwen Jin, Xiang Bai, Yuliang Liu:
Puzzle Pieces Picker: Deciphering Ancient Chinese Characters with Radical Reconstruction. 169-187
Document Understanding and NLP
- Jeff Yang, Huynh The Vu, Hai Luu Tuan:
Light-Weight Multi-modality Feature Fusion Network for Visually-Rich Document Understanding. 191-207
- Akkshita Trivedi, Akarsh Upadhyay, Rudrabha Mukhopadhyay, Santanu Chaudhury:
GDP: Generic Document Pretraining to Improve Document Understanding. 208-226
- He-Sen Dai, Xiao-Hui Li, Fei Yin, Xudong Yan, Shuqi Mei, Cheng-Lin Liu:
GraphMLLM: A Graph-Based Multi-level Layout Language-Independent Model for Document Understanding. 227-243
- Huynh The Vu, Van Pham Hoai, Jeff Yang:
One-Shot Transformer-Based Framework for Visually-Rich Document Understanding. 244-261
- Chun-Bo Xu, Yi-Ming Chen, Cheng-Lin Liu:
EntityLayout: Entity-Level Pre-training Language Model for Semantic Entity Recognition and Relation Extraction. 262-279
- Mohammad Minouei, Mohammad Reza Soheili, Didier Stricker:
Embedding Layout in Text for Document Understanding Using Large Language Models. 280-293
- Nil Biescas, Carlos Boned, Josep Lladós, Sanket Biswas:
GeoContrastNet: Contrastive Key-Value Edge Learning for Language-Agnostic Document Understanding. 294-310
Transformers
- Jiawei Wang, Shunchi Zhang, Kai Hu, Chixiang Ma, Zhuoyao Zhong, Lei Sun, Qiang Huo:
Dynamic Relation Transformer for Contextual Text Block Detection. 313-330
- Yun Young Choi, Taehoon Kim, Namwook Kim, Taehee Lee, Seongho Joe:
End to End Table Transformer. 331-345
Charts and Tables
- Omar Moured, Jiaming Zhang, M. Saquib Sarfraz, Rainer Stiefelhagen:
AltChart: Enhancing VLM-Based Chart Summarization Through Multi-pretext Tasks. 349-366
- Qiyu Hou, Jun Wang, Meixuan Qiao, Lujun Tian:
Synthesizing Realistic Data for Table Recognition. 367-388
- Takaya Kawakatsu:
Multi-cell Decoder and Mutual Learning for Table Structure and Character Recognition. 389-405
- Hui Shi, Yusheng Xie, Luis Goncalves, Sicun Gao, Jishen Zhao:
WikiDT: Visual-Based Table Recognition and Question Answering Dataset. 406-437
- Xinhong Chen, Bangdong Chen, Chenfan Qu, Dezhi Peng, Chongyu Liu, Lianwen Jin:
DTSM: Toward Dense Table Structure Recognition with Text Query Encoder and Adjacent Feature Aggregator. 438-452
- Pengyu Yan, Mahesh Bhosale, Jay Lal, Bikhyat Adhikari, David S. Doermann:
ChartReformer: Natural Language-Driven Chart Image Editing. 453-469
- Haochen Wang, Kai Hu, Haoyu Dong, Liangcai Gao:
DocTabQA: Answering Questions from Long Documents Using Tables. 470-487