Document Analysis Systems

Name: Document Analysis Systems | 15th IAPR International Workshop, DAS 2022, La Rochelle, France, May 22-25, 2022, Proceedings
Brand: Springer
Price: 106.99 EUR
Availability: OnlineOnly

Seiichi Uchida, Elisa Barney, Véronique Eglin (Editors)

Springer (Publisher)

Published on 17 May 2022

XVIII, 789 pages

E-Book

PDF with digital watermarking

978-3-031-06555-2 (ISBN)

€106.99 incl. 7% VAT

E-Book Single Licence

Available for download

Content

Intro
Preface
Organization
Contents
Document Analysis Systems and Applications
Font Shape-to-Impression Translation
1 Introduction
2 Related Work
2.1 Subjective Impression Analysis of Fonts
2.2 Objective Impression Analysis of Fonts
2.3 Transformer
3 Dataset and Local Descriptor
4 Shape-Impression Relationship Analysis by Multi-label Classification Approach
4.1 Transformer as a Multi-label Classifier
4.2 Implementation Details
4.3 Classification Examples
4.4 Shape-Impression Relation Analysis by Group-Based Occlusion Sensitivity
5 Shape-Impression Relationship Analysis by Translation Approach
5.1 Transformer as a Shape-to-Impression Translator
5.2 Implementation Details
5.3 Translation Examples
5.4 Shape-Impression Relation Analysis Using Integrated Gradients
6 Experimental Results
6.1 Quantitative Evaluation of the Trained Transformer
6.2 Analysis Results of the Shape-Impression Relationship
7 Conclusion and Future Work
References
TrueType Transformer: Character and Font Style Recognition in Outline Format
1 Introduction
2 Related Work
2.1 Transformer
2.2 Font Analysis by Using Vector Format
3 TrueType Transformer (T3)
3.1 Representation of Outline
3.2 Transformer Model for T3
4 Experiment
4.1 Dataset
4.2 Implementation Details
4.3 Quantitative Comparison Between Outline-Based and Image-Based Recognition Methods
4.4 Qualitative Comparison Between Outline-Based and Image-Based Recognition Methods
4.5 Analysis of Learned Attention
5 Conclusion
References
Unified Line and Paragraph Detection by Graph Convolutional Networks
1 Introduction
2 Related Work
2.1 Text Line Detection
2.2 Paragraph Detection
3 Proposed Method
3.1 Pure Bounding Box Input
3.2 Problem Statement
3.3 Main Challenge
3.4 β-skeleton Graph with 2-Hop Connections
3.5 GCN Predictions
3.6 Forming Lines
3.7 Forming Paragraphs
3.8 Overall System Pipeline
4 Limitations
4.1 Single-Line Paragraphs
4.2 Document Rotations
5 Experiments
5.1 PubLayNet Results
5.2 Real-World Evaluation Results
6 Conclusions and Future Work
References
The Winner Takes It All: Choosing the "best" Binarization Algorithm for Photographed Documents
1 Introduction
2 Quality-Time Evaluation Methods
3 Test Set
4 Results
4.1 Motorola Moto G9
4.2 Samsung A10S
4.3 Samsung S20
4.4 Apple iPhone SE
5 Conclusions
References
A Multilingual Approach to Scene Text Visual Question Answering
1 Introduction
2 Related Work
2.1 Word Embeddings
2.2 Scene Text Visual Question Answering
3 Methodology
3.1 Word Embeddings
3.2 Visual Question Answering Architecture
4 Experiments
4.1 Datasets
4.2 Evaluation Metrics
4.3 Implementation Details
4.4 VQA Experiments
5 Conclusions
References
Information Extraction and Applications
Sequence-to-Sequence Models for Extracting Information from Registration and Legal Documents
1 Introduction
2 Related Work
3 Methodology
3.1 Questions and Answers
3.2 Compound QAs
3.3 Sentence IDs and Canonical Format
4 Experimental Setup
4.1 Models
4.2 Datasets
4.3 Training and Inference
5 Results
5.1 Experiments for Compound QAs
5.2 Experiments for Sentence IDs and Canonical Format
5.3 Comparison with BERT on a NER Task
6 Conclusion
References
Contrastive Graph Learning with Graph Convolutional Networks
1 Introduction
2 Related Work
3 Methodology
3.1 Graph Representation
3.2 Contrastive Graph Learning
4 Graph Convolution
4.1 Loss
5 Experiments
5.1 Datasets
5.2 Implementation Details
5.3 Baseline Methods
5.4 Comparison with Baseline Methods
5.5 Supervised vs. Semi-supervised Contrastive Graph Learning
5.6 Ablation Studies
6 Conclusion and Future Work
References
Improving Information Extraction on Business Documents with Specific Pre-training Tasks
1 Introduction
2 Related Work
2.1 Information Extraction
2.2 Pre-training
3 Models
3.1 Architecture
3.2 ConfOpt Post-processing
4 Pre-training
4.1 Numeric Ordering Task
4.2 Layout Inclusion Task
5 Datasets
5.1 Business Documents Collection
5.2 Business Documents Collection - Purchase Orders
5.3 ICDAR 2019 - Scanned Receipts
6 Experiments
6.1 Post-processing
6.2 Business Document-Specific Pre-training
7 Conclusion
References
How Confident Was Your Reviewer? Estimating Reviewer Confidence from Peer Review Texts
1 Introduction
2 Related Work
3 Methodology
3.1 Problem Statement
3.2 Framework
3.3 Baseline Models
4 Data Description and Experimental Setup
4.1 Implementation Details
5 Experimental Results and Discussion
5.1 Evaluation Criteria
5.2 Cross-Year Experiments
5.3 Ablation Study
5.4 Non Parametric (Levene's) Test
5.5 Observations
5.6 Error Analysis
6 Conclusion and Future Work
References
Historical Document Analysis + CSAWA
Recognition and Information Extraction in Historical Handwritten Tables: Toward Understanding Early 20th Century Paris Census
1 Introduction
2 Corpus and Ground-Truthed Datasets
2.1 Presentation of the Census
2.2 Annotation of Two Datasets
3 Processing Pipeline
4 Layout Analysis and Information Extraction
4.1 Segmentation and Dewarping of Tables
4.2 Page Classification
4.3 Segmentation of Tables into Rows
5 Handwriting Recognition
5.1 Architecture of the Optical Model
5.2 Results of the Optical Model
5.3 Self-training
6 Leveraging Domain Knowledge
6.1 Language Models
6.2 Normalization and Logical Deductions
7 Processing Time
8 Conclusion
References
Importance of Textlines in Historical Document Classification
1 Introduction
2 Related Work
3 Datasets
4 Document Classification Systems
4.1 Loss Functions
4.2 Patch System
4.3 Textline System
4.4 System Fusion
5 Experiments
5.1 Experimental Setup
5.2 Results
6 Conclusion
References
Historical Map Toponym Extraction for Efficient Information Retrieval
1 Introduction
2 Related Work
3 Dataset
4 Toponym Processing Approach
4.1 Toponym Detection Methods
4.2 Toponym Classification Method
4.3 OCR Models
5 Experiments
5.1 Toponym Detection
5.2 Toponym Classification
5.3 OCR Results
6 Conclusions and Future Work
References
Information Extraction from Handwritten Tables in Historical Documents
1 Introduction
2 Related Work
3 HisClima Dataset
4 Proposed Approaches
4.1 Heuristic Geometric Information
4.2 Log-Linear Model
4.3 Graph Neural Network
5 Evaluation Criteria and Metrics
6 Experimental Framework and Results
6.1 Experimental Settings
6.2 Text Recognition Results
6.3 Information Extraction Results
7 Discussion
8 Reproducibility
9 Conclusions
References
Named Entity Linking on Handwritten Document Images
1 Introduction
2 Named Entity Linking
3 Dataset
3.1 Synthetic HW-AIDA-CoNLL
3.2 IAM-DB
3.3 George Washington
4 Baseline Approach
4.1 Handwriting Text Recognizer
4.2 Named Entity Linking
5 Experiments
5.1 Evaluation Protocol
5.2 Results
6 Conclusion
References
Pattern Analysis Software Tools (PAST) for Written Artefacts
1 Introduction
2 The Handwriting Analysis Tool (HAT)
2.1 Basic Functionality
2.2 Use Case
3 The Visual-Pattern Detector (VPD)
3.1 Basic Functionality
3.2 Use Case
4 The Line Detection Tool (LDT)
4.1 Basic Functionality
4.2 Use Case
5 The XRF-Data Analysis Tool (XRF-DAT)
5.1 Basic Functionality
5.2 Use Case
6 The Artefact-Feature Analysis Tool (AFAT)
6.1 Basic Functionality
6.2 Use Case
7 Conclusion
References
TEI-Based Interactive Critical Editions
1 Introduction
2 Preliminaries
3 Related Work
4 Critical Edition - Critical Texts of Cankam Literature
5 Transforming Documents into TEI
6 Interactive Critical Editions
7 Databasing on Demand
8 An Annotation System for the Humanities
9 Application and Results
10 Conclusion and Future Work
References
Handwriting Text Recognition
Best Practices for a Handwritten Text Recognition System
1 Introduction
2 Related Work
3 Proposed HTR System
3.1 Preprocessing
3.2 Network Architecture
3.3 Training Scheme
4 Experimental Evaluation
4.1 Ablation Study
4.2 Comparison to State-of-the-Art Systems
5 Conclusions
References
Rescoring Sequence-to-Sequence Models for Text Line Recognition with CTC-Prefixes
1 Introduction
2 Related Work
3 Data
4 Methods
4.1 Preprocessing
4.2 Network-Architecture
4.3 Training
4.4 Inference
4.5 Language Model
4.6 Synthetic Line Generation
5 Results
5.1 Language Models
5.2 Pretraining on Synthetic Data
5.3 Training on Real Data
5.4 Enabling the Language Model
5.5 Varying the Beam Count
5.6 Ablation Study
5.7 Comparison with the State of the Art
6 Future Work
References
A Light Transformer-Based Architecture for Handwritten Text Recognition
1 Introduction
2 Related Works
2.1 Standard Approaches for HTR
2.2 Transformer-Based Architectures
3 Our Light Encoder-Decoder Transformer-Based Model
3.1 Summary of the Architecture
3.2 Network Encoder
3.3 Transformer Decoder
3.4 Hybrid Loss
4 Experiments and Results
4.1 Handwritten Text-line Data
4.2 Experimental Settings
4.3 Ablation Study of the Main Components of Our Network
4.4 Benefits of Using a Light Architecture
4.5 Interest of the Hybrid Loss
4.6 Comparison with the State of the Art
5 Conclusion and Future Works
References
Effective Crowdsourcing in the EDT Project with Probabilistic Indexes
1 Introduction
2 Preparing the PrIx System for the Crowdsourcing Platform
3 Description of the Collections
3.1 EDT Hungary
3.2 EDT Norway
3.3 EDT Portugal
3.4 EDT Spain
3.5 EDT Malta
4 HTR and PrIx Experiments and Results
5 Crowdsourcing Platform
6 Conclusions
References
Applications in Handwriting
Paired Image to Image Translation for Strikethrough Removal from Handwritten Words
1 Introduction
2 Related Works
2.1 Strikethrough Processing
2.2 Paired Image to Image Translation
2.3 Strikethrough Datasets
3 Image to Image Translation Models for Strikethrough Removal
4 Experiment Setup
4.1 Datasets
4.2 Neural Network Training Protocol
4.3 Evaluation Protocol
5 Results and Analysis
5.1 Models Trained on IAMsynth
5.2 Models Trained on Individual Partitions of Draculasynth
5.3 Models Trained on the Aggregation of Partitions from Draculasynth
5.4 Qualitative Results
6 Conclusions
A Dataset and Code Availability
References
Revealing Reliable Signatures by Learning Top-Rank Pairs
1 Introduction
2 Related Work
3 Learning Top-Rank Pairs
3.1 Feature Representation of Paired Samples
3.2 Optimization to Learn Top-Rank Pairs
3.3 Learning Top-Rank Pairs with Their Representation
3.4 Initial Features by SigNet
4 Experiments
4.1 Datasets
4.2 Experimental Settings
4.3 Evaluation Metrics
4.4 Quantitative and Qualitative Evaluations
5 Conclusion
References
On-the-Fly Deformations for Keyword Spotting
1 Introduction
2 Related Work
3 Reference KWS System
3.1 Preprocessing
3.2 Proposed Architecture
3.3 Training Process
3.4 Retrieval Application
4 On-the-Fly Deformation
4.1 Considered Deformations
4.2 Query-Based Deformation
4.3 Implementation Aspects
5 Experimental Evaluation
5.1 Ablation Study
5.2 Comparison to State-of-the-Art Systems
6 Conclusions and Future Work
References
Writer Identification and Writer Retrieval Using Vision Transformer for Forensic Documents
1 Introduction
2 Related Work
2.1 Methods with Enrollment
2.2 Methods Without Enrollment
3 WI/WR Using Vision Transformer
3.1 Preprocessing
3.2 ViT-Lite
3.3 Aggregation/Encoding
4 Experimental Setup
4.1 Datasets
4.2 Training Details
4.3 Evaluation
5 Results
5.1 CVL Dataset
5.2 ICDAR 2013 WI/WR Competition Dataset
5.3 WRITE Dataset
5.4 Comparison to State of the Art
6 Conclusion
References
Approximate Search for Keywords in Handwritten Text Images
1 Introduction
2 Probabilistic Indexing and Search
2.1 Multi-word Boolean Queries
3 Approximate-Spelling Queries
3.1 Algorithmics
4 Dataset, Assessment, Queries, and Empirical Settings
4.1 Dataset
4.2 Query Selection
4.3 Evaluation Protocols and Measures
4.4 Experimental Settings
5 Experiments and Results
5.1 Retrieval Performance
5.2 Computational Performance
5.3 Illustrative Retrieval Examples
6 Conclusion
References
Keyword Spotting with Quaternionic ResNet: Application to Spotting in Greek Manuscripts
1 Introduction
2 Quaternions in Neural Networks
2.1 Elementary Notions
2.2 Quaternionized Versions of Standard NN Layers
3 Why Are Quaternionic Layers Less Costly?
4 Proposed Model
5 Experiments
5.1 Datasets
5.2 Hyperparameters and Other Training Considerations
5.3 Results
6 Conclusion and Future Work
References
Open-Source Software and Benchmarking
A Comprehensive Comparison of Open-Source Libraries for Handwritten Text Recognition in Norwegian
1 Introduction
2 Related Work
3 The Hugin-Munin Dataset for HTR in Norwegian
3.1 Overview
3.2 Dataset
3.3 Transcription Process
3.4 Language
4 HTR Libraries and Models
4.1 Selection of the Libraries
4.2 Description of the Selected Libraries
4.3 Training of the Models
5 Results
5.1 Random Split
5.2 Random Split by Writer with Unseen Writers
6 Conclusion
References
Open Source Handwritten Text Recognition on Medieval Manuscripts Using Mixed Models and Document-Specific Finetuning
1 Introduction
2 Related Work
3 Data Sets
4 Methods
5 Experiments
5.1 Determining the Best Starting Model
5.2 Iterative Document-Specific Training
6 Discussion
7 Conclusion and Future Work
References
A Comprehensive Study of Open-Source Libraries for Named Entity Recognition on Handwritten Historical Documents
1 Introduction
2 Related Work
3 Handwritten Historical Document Corpora
3.1 Nested Entities in HOME Corpus
4 Named Entity Recognition Libraries
5 Experiments
5.1 Evaluation Metrics
5.2 Hyperparameters and Model Training
6 Results
7 Conclusions and Future Work
References
A Benchmark of Named Entity Recognition Approaches in Historical Documents Application to 19th Century French Directories
1 Introduction
2 OCR and NER on Historical Texts
2.1 Optical Character Recognition of Historical Texts
2.2 Named Entity Recognition in Historical Texts
2.3 Pipeline Summary
3 Dataset
3.1 A Selection of Paris Trade Directories from 1798 to 1854
3.2 A Dataset for OCR and NER Evaluation
3.3 Metrics for OCR and NER Quality Assessment
4 OCR Benchmark
5 NER Sensibility to the Number of Training Examples
5.1 Training and Evaluation Protocol
5.2 Results and Discussion
6 Impact of OCR Noise on Named Entity Recognition
6.1 Training and Evaluation Protocol
6.2 Results and Discussion
7 Conclusion and Future Works
References
NCERT5K-IITRPR: A Benchmark Dataset for Non-textual Component Detection in School Books
1 Introduction
2 Related Datasets
3 NCERT5K-IITRPR Dataset
3.1 Source and Statistics
3.2 Label Categories
3.3 Annotation Method
4 Benchmarking
4.1 Models
4.2 Experimental Setup
4.3 Results and Analysis
5 Conclusion
References
Poster Session 1
ReadOCR: A Novel Dataset and Readability Assessment of OCRed Texts
1 Introduction
2 Proposed Dataset
2.1 Document Collection
2.2 Proposed Text Corpus
2.3 Dataset Analysis
3 Readability Assessment
3.1 Methods
3.2 Experimental Results
4 Conclusions
References
Hard and Soft Labeling for Hebrew Paleography: A Case Study
1 Introduction
2 Related Work
3 Hebrew Paleography
4 VML-HP-ext Dataset Description
5 Case Study
5.1 Hard-Label Classification
5.2 Soft-Label Regression
5.3 Maximum Score Class Assignment
5.4 Nearest Neighbor Label Conversion
5.5 Comparison Between Soft and Hard-Label Classification
6 Conclusion and Further Research
References
AttentionHTR: Handwritten Text Recognition Based on Attention Encoder-Decoder Networks
1 Introduction
2 Related Work
3 Attention-Based Encoder-Decoder Network
4 Experimental Results
4.1 Datasets
4.2 Hyper-parameters
4.3 Results
4.4 Hyper-parameter Tuning
4.5 Ablation Study
4.6 Test Set Errors
5 Error Analysis
5.1 Character-Level Error Analysis
5.2 Bias-Variance Analysis
5.3 Visual Analysis of Images
6 Conclusion
References
HST-GAN: Historical Style Transfer GAN for Generating Historical Text Images
1 Introduction
2 Related Work
2.1 Datasets
2.2 Data Preparation
3 Method
3.1 Model Framework
3.2 Objective Function
4 Experiments
4.1 Style Transfer Evaluation
4.2 Data Augmentation Using Style Transfer
5 Conclusion
References
Challenging Children Handwriting Recognition Study Exploiting Synthetic, Mixed and Real Data
1 Introduction
1.1 Children Handwriting Recognition Context and Problematic
1.2 ScolEdit: A Small Real Children Handwriting Dataset
1.3 Investigating Variable Training Datasets Composition
2 Related Works for Latin Handwritten Text Recognition
3 Scoledit: A Real Children Handwriting Annotated Dataset
3.1 Line Cleaning
3.2 Words Detection
3.3 Words Annotation on IAM Format
4 HTR Architecture and Recognition Scenarios
4.1 Standard MDLSTM-RNN Word Transcriber
4.2 Mixing Real and Synthetic Data to Enhance the Recognition Rates
4.3 Scenarios for HTR Training and Data Preparation
5 Experiments and Results
5.1 First Scenario: Supervised Selection of Validation Datasets and Domain Transfer
5.2 Second Scenario: Training Focused on Dictation Words
5.3 Third Scenario: Large Lexicon Training with Transfer
6 Conclusions
References
Combining Image Processing Techniques, OCR, and OMR for the Digitization of Musical Books
1 Introduction
2 The Music in the Santo Domingo's Cathedral Book
3 Related Work
4 Methods
4.1 Block Detection
4.2 OCR
4.3 OMR
4.4 Data Storage
5 Discussion
6 Conclusions
References
Evaluation of Named Entity Recognition in Handwritten Documents
1 Introduction
2 Related Work
3 Framework
3.1 Characteristics of the Task
3.2 HTR and NER via a Coupled Model
3.3 Error Correction
4 Evaluation Metrics
4.1 Character and Word Error Rates
4.2 Precision, Recall and F1-Score
4.3 Entity CER and Entity WER
5 Experimental Method
5.1 Dataset
5.2 Implementation Details
5.3 Obtained Results
6 Conclusions
References
A Generic Image Retrieval Method for Date Estimation of Historical Document Collections
1 Introduction
2 Related Work
3 Datasets
4 Learning Objectives
5 Proposed Method
6 Application
6.1 Smooth-nDCG Human-in-the-Loop Architecture
6.2 Quantitative Evaluation
7 Conclusions
References
Combining Visual and Linguistic Models for a Robust Recipient Line Recognition in Historical Documents
1 Introduction
2 Related Work
3 Background: Handwritten Text Recognition with Transformers
4 Methodology
4.1 Semantic Segmentation of Recipient Lines
4.2 Handwritten Text Recognition and Recipient Line Classification with CNN
4.3 Joint Recipient Line Recognition and Transcription
4.4 Combination of Different Approaches
5 Experimental Evaluation
5.1 Nuremberg Letterbooks
5.2 Experiment Details
5.3 Metrics
5.4 Results
6 Discussion
7 Conclusion
References
Investigating the Effect of Using Synthetic and Semi-synthetic Images for Historical Document Font Classification
1 Introduction
2 Related Work
3 Dataset and Image Generation
3.1 Dataset
3.2 Semi-synthetic Image Generation Using DocCreator
3.3 Synthetic Image Generation Using Generative Adversarial Networks
4 Experiments
5 Results
6 Conclusion and Future Work
References
Poster Session 2
3D Modelling Approach for Ancient Floor Plans' Quick Browsing
1 Introduction
2 Related Work
2.1 Wall Detection
2.2 3D Modelling
3 Proposed Approach
3.1 Floor Plan Digitization
3.2 Wall Mask Generation
3.3 3D Modelling
4 Results and Evaluation
4.1 Dataset
4.2 Evaluation Protocol
4.3 Results
5 Perspectives and Conclusion
References
A Comparative Study of Information Extraction Strategies Using an Attention-Based Neural Network
1 Introduction
2 Related Works
2.1 Handwriting Recognition
2.2 Information Extraction
2.3 Our Statement
3 The Attention-Based Seq2seq Architecture
4 Strategies for Information Extraction
4.1 Comparing the Sequential and Joint Approaches
4.2 Exploring Additional Joint Learning Configurations
5 Experiments
5.1 Handwriting Recognition Using Seq2seq
5.2 Information Extraction Using Seq2seq
6 Conclusion
References
QAlayout: Question Answering Layout Based on Multimodal Attention for Visual Question Answering on Corporate Document
1 Introduction
2 Related Work
3 Problem Definition
4 Proposed Approach
4.1 Global Description
4.2 Encoder
4.3 Co-attention
5 Experiments
5.1 Dataset
5.2 Performance Evaluation
6 Conclusion and Future Work
References
Is Multitask Learning Always Better?
1 Introduction
2 Related Work
3 Datasets
4 Methodology
5 Evaluation
5.1 ResNet
5.2 Perceiver
6 Discussion
7 Conclusions
References
SciBERTSUM: Extractive Summarization for Scientific Documents
1 Introduction
2 Related Work
2.1 Summarization
2.2 Transformer Based Summarization
3 Method - SciBERTSUM
4 Language Model Architecture
4.1 Embedding Layer
4.2 Attention Mechanism
4.3 Transformer Layer
5 Sentence Extractor
5.1 Sentence Features
5.2 Document Embedding
5.3 Score Predictor
6 Reinforcement Learning
7 Experimental Results
7.1 Hardware
7.2 Experiments
8 Conclusions and Future Work
References
Using Multi-level Segmentation Features for Document Image Classification
1 Introduction
2 Related Work
3 Proposed Method
3.1 Integrated CNN Architecture
3.2 Implementation Details
4 Experiments
4.1 Datasets
4.2 Experimental Results
5 Conclusion
References
Eye Got It: A System for Automatic Calculation of the Eye-Voice Span
1 Introduction
2 Eye Got It
2.1 Eye Tracking
2.2 Speech Processing
2.3 EVS Computation
3 Experiment
3.1 Participants
3.2 Apparatus and Material
3.3 Procedure
3.4 Results
4 Discussion
4.1 Eye Tracking
4.2 Audio
5 Conclusion
References
Text Detection and Post-OCR Correction in Engineering Documents
1 Introduction
2 Related Work
2.1 Text Detection in Unconstrained Documents
2.2 Post-OCR Correction
3 Our Approach for Lexicon-Free Text Recognition
3.1 EAST-Based Text Detection
3.2 Open-Source Engine for Text Recognition
3.3 Post-OCR Correction of Tags and Lexicon-Free Words
4 Experimentation and Discussions
4.1 Dataset
4.2 Results
5 Conclusion
References
TraffSign: Multilingual Traffic Signboard Text Detection and Recognition for Urdu and English
1 Introduction
2 Related Work
2.1 Text Detection
2.2 Text Recognition
2.3 Standard Benchmark Datasets for Text Detection and Recognition
3 Dataset Preparation
3.1 Data Acquisition and Pre-processing
3.2 Multi-lingual Text Detection and Recognition
4 The Methodology
4.1 Multi-lingual Text Detection Architecture
4.2 Text Recognition Architecture
5 Experiments and Results
5.1 Evaluation of Multi-lingual Text Detection Methods
5.2 Evaluation of Multi-lingual Text Recognition Methods
5.3 Evaluation of Proposed Text-Detection and Recognition Models as an End-to-End Pipeline
6 Conclusions
References
Read While You Drive - Multilingual Text Tracking on the Road
1 Introduction
2 Related Work
2.1 Datasets for Text Spotting in Videos
2.2 Text Detection
2.3 Text Tracking
2.4 Multiple Object Tracking Metrics
3 RoadText-3K Dataset
3.1 Videos
3.2 Annotations
3.3 Analysis
4 Methodology
4.1 Text Detection
4.2 Text Tracking
4.3 CenterNet-Based Detection and Tracking
5 Results
5.1 Frame Level Text Detection
5.2 Tracking
5.3 Qualitative Analysis
6 Conclusions
References
A Fair Evaluation of Various Deep Learning-Based Document Image Binarization Approaches
1 Introduction
2 Overview of Evaluated Binarization Methods
2.1 Document Enhancement Generative Adversarial Network
2.2 SauvolaNet
2.3 Two-Stage GAN
2.4 Robin U-Net Model
2.5 DP-LinkNet
2.6 Selectional Auto-Encoder
2.7 DeepOtsu
3 Materials and Methods
3.1 Datasets
3.2 Metrics
3.3 Training
4 Evaluation
5 Conclusion
References
Correction to: How Confident Was Your Reviewer? Estimating Reviewer Confidence from Peer Review Texts
Correction to: Chapter "How Confident Was Your Reviewer? Estimating Reviewer Confidence from Peer Review Texts" in: S. Uchida et al. (Eds.): Document Analysis Systems, LNCS 13237, https://doi.org/10.1007/978-3-031-06555-2_9
Author Index