The 10th International Congress on Information and Communication Technology, concurrent with the ICT Excellence Awards (ICICT 2025), will be held in London, United Kingdom | February 18 - 21, 2025.
Authors - Ravichandran S, R Kasturi Rangan, Manjesh R, S Karthik, T N Hemanth
Abstract - Page-level recognition and reordering of handwritten documents is important in digitizing and archiving systems. These systems focus on two problems: converting the relevant parts of a handwritten document into machine-readable formats, and correctly sequencing pages in order to preserve context. Building upon the state of the art in Optical Character Recognition (OCR) and Document Layout Analysis (DLA), this paper suggests that these methods are effective for page-level text recognition when automatic reading-order detection is combined with advanced OCR modeling. The study evaluates the impact of a hybrid architecture that combines Vision Transformers (ViT) for powerful feature extraction with transformer-based Language Models (LMs) that provide context during text decoding. We then pose the reordering task as a sorting problem and use a pairwise order-relation operator, trained from annotated data, to generalize to various layouts of input documents. Results on standard datasets show trends toward state-of-the-art performance, with significant gains in recognition accuracy and reordering precision. This opens up efficient processing of handwritten documents in applications that range from preserving historical writing samples to handling today's scanned administrative handwritten documents.
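To illustrate the idea of treating reading-order detection as a sorting problem, the following is a minimal sketch, not the authors' implementation. The `Region` type, the `precedes` function, and the `reading_order` helper are hypothetical names introduced here. In particular, `precedes` is a simple geometric heuristic standing in for the pairwise order-relation operator that the abstract describes as trained from annotated data.

```python
from dataclasses import dataclass
from functools import cmp_to_key
from typing import Callable, List


@dataclass(frozen=True)
class Region:
    """A recognized text region (hypothetical structure, for illustration only)."""
    text: str
    x: float       # left edge of the bounding box
    y: float       # top edge of the bounding box
    height: float  # height of the bounding box


def precedes(a: Region, b: Region) -> bool:
    """Stand-in for a learned pairwise order-relation operator (assumption).

    Heuristic: regions whose tops differ by less than half the smaller
    height are treated as being on the same line and ordered left to
    right; otherwise the higher region comes first.
    """
    if abs(a.y - b.y) < 0.5 * min(a.height, b.height):
        return a.x < b.x
    return a.y < b.y


def reading_order(
    regions: List[Region],
    before: Callable[[Region, Region], bool] = precedes,
) -> List[Region]:
    """Sort regions into reading order using only pairwise decisions."""

    def compare(a: Region, b: Region) -> int:
        if before(a, b):
            return -1
        if before(b, a):
            return 1
        return 0

    return sorted(regions, key=cmp_to_key(compare))
```

Because the sort only ever asks whether one region comes before another, the heuristic could in principle be swapped for a trained model with the same two-argument signature. A comparison sort assumes the relation is consistent (transitive); a learned operator that violates this would need a more robust aggregation step, which this sketch does not cover.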