
Linking Theory and Practice of Digital Libraries
Description
This book constitutes the refereed proceedings of the 28th International Conference on Linking Theory and Practice of Digital Libraries, TPDL 2024, held in Ljubljana, Slovenia, during September 24-27, 2024.
The 13 full papers, 19 short papers and 11 papers of other types included in this book were carefully reviewed and selected from 83 submissions. Over the years, TPDL has established itself as an important international forum focused on digital libraries and associated technical, practical, and social issues. In 2024, TPDL expanded its scope to prominently include Document Analysis/Recognition and Information Retrieval, acknowledging the vital role of those research areas in the creation (by means of digitization and information extraction from heterogeneous sources), access, discovery, and dissemination of digital content.

Content
- Intro
- Preface
- Organization
- Keynotes
- Libraries, Digital Libraries, and Data: 30 years of Development in Central Europe
- Searching for Climate Impact
- Impresso Beyond Borders: Connecting Historical Newspapers and Radio
- Contents - Part I
- Contents - Part II
- Full Papers
- Learning Reading Order via Document Layout with Layout2Pos
- 1 Introduction
- 2 Related Work
- 2.1 Multimodal Pre-trained Language Models
- 2.2 Addressing Reading Order Issues
- 2.3 Generative Methods for Information Extraction
- 3 Preliminary Experiments: OCR Serialization Errors
- 4 Reconstructing Positional Information from 2D Positions
- 4.1 Encoding Layout Information
- 4.2 Learning 1D Position Embeddings from Layout Information
- 4.3 Integrating Layout2Pos Into a Sequence-to-Sequence Framework
- 5 Experiments
- 5.1 Data
- 5.2 Experimental Settings
- 5.3 Visual Information Extraction
- 6 Results and Discussion
- 6.1 Next Token Position Prediction
- 6.2 Visual Information Extraction
- 7 Conclusion
- 8 Limitations
- References
- SWARM-SLR - Streamlined Workflow Automation for Machine-Actionable Systematic Literature Reviews
- 1 Introduction
- 2 Related Work
- 3 Approach
- 3.1 Requirements
- 3.2 Tools
- 3.3 Workflow
- 4 Evaluation
- 4.1 Design
- 4.2 Results
- 5 Discussion and Future Work
- 6 Conclusion
- References
- A Reputation System for Scientific Contributions Based on a Token Economy
- 1 Introduction
- 2 Problem Statement
- 3 Approach
- 3.1 Stakeholders
- 3.2 Use Case
- 3.3 Requirements and Constraints
- 3.4 Reputation Token Economy
- 4 Implementation
- 4.1 Reputation Token
- 4.2 User Interface and Visualization
- 5 Related Work
- 6 Qualitative Evaluation
- 7 Discussion
- 8 Conclusion
- 9 Future Work
- References
- Bibliotheca Eugeniana Digital-Unveiling and Visualizing the Treasures of Prince Eugene of Savoy's Library
- 1 Introduction
- 2 From a Handwritten Catalog to a Digital Edition
- 2.1 Digitizing the Handwritten Catalog
- 2.2 Supralibros Analyses
- 3 Visualizing the Bibliotheca Eugeniana
- 4 Discussion
- References
- Comparative Analysis of Evaluation Measures for Scientific Text Simplification
- 1 Introduction
- 2 Measures
- 3 Corpora
- 4 Results
- 4.1 Simplicity
- 4.2 Meaning Preservation
- 5 Discussion
- 5.1 Interpretation of Results
- 5.2 Limitations
- 6 Conclusion
- References
- Promoting Interoperability on the Datasets of the Arrowheads Findings of the Chalcolithic and the Early/Middle Bronze Age
- 1 Introduction
- 2 Background
- 3 Methodology
- 3.1 Objectives
- 3.2 Process
- 4 Results
- 4.1 The Arrowhead Metadata Application Profile Domain Model
- 4.2 The Arrowhead Metadata Application Profile (Ah MAP)
- 5 Discussion
- 6 Conclusion and Future Work
- References
- Tracing the Retraction Cascade: Identifying Non-retracted but Potentially Retractable Articles
- 1 Introduction
- 2 Literature Review
- 2.1 Importance of Citations
- 2.2 Citations of Retracted Articles
- 3 Dataset Formulation
- 3.1 Dataset Design Decision
- 4 Retraction-Centric Approach
- 5 Experimental Setting and Evaluation
- 6 A Case in Point
- 7 Discussion and Conclusion
- 8 Limitation and Future Work
- References
- Mapping Techniques for an Automated Library Classification: The Case Study of Library Loans at Bibliotheca Hertziana
- 1 Introduction
- 2 Related Work
- 3 Methodology Overview
- 3.1 Data Visualisation
- 3.2 Generation of Subjects and Descriptions
- 3.3 Limitations
- 4 Case Study Investigation
- 5 Evaluation Insights
- 5.1 Expert Feedback
- 6 Conclusion
- References
- LIT: Label-Informed Transformers on Token-Based Classification
- 1 Introduction
- 2 Related Work
- 3 Model
- 4 Experiment
- 4.1 Datasets
- 4.2 Baseline
- 4.3 Settings
- 5 Results and Discussion
- 5.1 Historical NER
- 5.2 Term Extraction
- 6 Conclusion
- References
- Improving Retrieval and Expression of Iconographical and Iconological Semantic Statements: An Extension of the ICON Ontology
- 1 Introduction
- 2 The ICON Ontology
- 3 Extension Motivation
- 3.1 Iconology Dataset
- 3.2 IICONGRAPH
- 3.3 Content Motivation
- 3.4 Technical Motivation
- 4 Extension Description
- 5 Evaluation
- 5.1 Datasets
- 5.2 Competency Questions Evaluation
- 5.3 Efficiency Evaluation
- 6 Conclusion
- References
- Scholarly Quality Measurements: A Systematic Literature Review
- 1 Introduction
- 2 Background and Related Work
- 3 Methodology
- 3.1 Research Questions
- 3.2 Search Strategy
- 3.3 Selection Criteria
- 3.4 Data Extraction
- 3.5 Conduct
- 3.6 Results and Discussion
- 4 Conclusion
- References
- Assessing the Accessibility and Usability of Web Archives for Blind Users
- 1 Introduction
- 2 Related Work
- 2.1 Web Accessibility
- 2.2 Web Usability for Blind Users
- 3 Materials and Methods
- 3.1 Accessibility Analysis of Web Archives
- 3.2 Usability Analysis of Web Archives
- 4 Results
- 4.1 Accessibility of Web Archives
- 4.2 Usability Analysis of Web Archives
- 4.3 Qualitative Feedback
- 5 Discussion
- 5.1 Unique Issues of Web Archives
- 5.2 Limitations and Future Work
- 5.3 Suggestions for Web Archive Developers
- 5.4 Suggestions for Assistive Technologies Developers
- 6 Conclusion
- References
- Leveraging Transfer Learning for Article Segmentation in Historical Newspapers
- 1 Introduction
- 2 Preliminaries
- 2.1 Transfer Learning
- 2.2 Pre-trained Models
- 2.3 Concepts and Definitions
- 3 Related Work
- 4 Methodology
- 4.1 Network Architecture
- 4.2 Models Description
- 4.3 Model Training
- 4.4 Post-processing and Article Segmentation Module
- 5 Experimental Setup and Results
- 5.1 Datasets and Evaluation Metrics
- 5.2 Results and Discussion
- 6 Conclusions and Future Outlook
- References
- Findings Papers
- Multi-dimensional Edge-Embedded GCNs for Arabic Text Classification
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 4 Experiments
- 4.1 Datasets
- 4.2 Baselines
- 4.3 Experimental Results
- 4.4 Impact of Edge Dimensions
- 4.5 Impact of Varying Pooling Method
- 4.6 Complexity Analysis
- 4.7 Qualitative Analysis
- 5 Conclusion and Future Work
- References
- LIAS: Layout Information-Based Article Separation in Historical Newspapers
- 1 Introduction
- 2 Related Work
- 3 The LIAS Model
- 3.1 Separator Lines Detection and Region Segmentation
- 3.2 Paragraphs Grouping and Linking
- 3.3 Link Classification
- 4 Experiments
- 4.1 Dataset
- 4.2 Metrics
- 4.3 Benchmarks
- 4.4 Detailed Setup and Hyperparameters
- 5 Results and Discussion
- 5.1 Ablation Study
- 6 Conclusion and Limitations
- References
- CALM: Context Augmentation with Large Language Model for Named Entity Recognition
- 1 Introduction
- 2 Backbone Model
- 3 Framework
- 3.1 What to Ask?
- 3.2 How Does a LLM React to Different Formulations?
- 4 Protocol
- 4.1 Datasets
- 4.2 Baselines and Effectiveness Metrics
- 4.3 Qualitative Metrics
- 4.4 Training Details
- 5 Results
- 5.1 Measuring the Effectiveness of LLM-Based Context Augmentation
- 5.2 Analyzing the Quality of Generated Contexts
- 6 Related Work
- 7 Conclusion
- References
- Database Approaches to the Modelling and Querying of Musical Scores: A Survey
- 1 Introduction
- 2 The Content of a Musical Score
- 3 Document-Oriented Models
- 3.1 ASCII-Based Documents
- 3.2 Semi-structured Documents
- 4 Graph-Oriented Models
- 5 Time-Series-Oriented Models
- 5.1 Time-Series Algebra
- 5.2 Representation in an Euclidean Space
- 6 Discussion
- 7 Conclusion and Perspectives
- References
- Content-Based Dataset Retrieval Methods: Reproducibility of the ACORDAR Test Collection
- 1 Introduction
- 2 Original Contribution
- 3 Analysis of the ACORDAR Test Collection
- 4 Reproducibility Results
- 5 Further Analyses on the Impact of Data
- 6 Conclusions
- References
- Enhancing Identification of Scholarly Reference on YouTube
- 1 Introduction
- 2 Related Work
- 2.1 Investigation of Medical and Health-Related Information on YouTube
- 2.2 YouTube Videos Cited from Scholarly Articles
- 2.3 Bibliometric/Quantitative Analysis of Scholarly References on YouTube
- 3 Materials and Methods
- 3.1 Methodological Approaches by Altmetric and the Proposed Method
- 3.2 Dataset Construction
- 3.3 Structure of the Dataset
- 3.4 Analysis Methods
- 4 Results and Discussion
- 4.1 Comparison between the dataset constructed by the proposed method and the Altmetric dataset
- 4.2 Characteristics of external links as scholarly reference on YouTube
- 4.3 Contribution of each domain name to dataset coverage
- 5 Conclusion
- References
- Mining Literary Trends: A Tool for Digital Library Analysis
- 1 Introduction
- 2 Related Work
- 2.1 Topic Modeling
- 2.2 Large Language Models
- 2.3 Data Visualization
- 2.4 Tools for Digital Libraries
- 2.5 Our Proposal
- 3 Software Architecture and Functionality
- 3.1 Metadata Ingestion and Aggregation
- 3.2 PDF Knowledge Extraction
- 3.3 Metadata Aggregation and Normalization
- 3.4 Interactive Visualization
- 4 Use Case
- 4.1 Data Set Overview
- 4.2 Methodology and Analysis
- 5 Conclusion
- References
- PRET19: Automatic Recognition and Indexing of Handwritten Loan Registers from 19th Century Parisian Universities
- 1 Introduction
- 2 Loan Registers for Parisian Libraries in the 19th Century: Description of the Corpus
- 3 Building a Relational Database for Nineteenth-Century Library Loan Registers
- 3.1 Principles of the Data Modeling
- 3.2 Description of the Data Model
- 4 Semi-automatic Indexing of Handwritten Registers
- 4.1 Analysis of the Corpus Complexity
- 4.2 Description of the Workflow
- 5 Conclusion and Future Work
- 6 Appendices
- 6.1 Data model detailed
- 6.2 Structure of the website
- References
- Leveraging Open Large Language Models for Historical Named Entity Recognition
- 1 Introduction
- 2 Related Work
- 2.1 Named Entity Recognition in Historical Corpora
- 2.2 Large-Scale Language Models
- 3 Few-Shot Prompting for Historical NER
- 3.1 Datasets
- 3.2 Instruct Models
- 3.3 Prompt Designs
- 3.4 Post-processing Steps
- 3.5 Evaluation Metrics
- 4 Results
- 5 Conclusion
- References
- Enriching Archival Linked Data Descriptions with Information from Wikidata and DBpedia
- 1 Introduction
- 2 Background
- 2.1 ArchOnto
- 2.2 DBpedia
- 2.3 Wikidata
- 3 Methodology
- 4 Results and Discussion
- 4.1 Identification of Entities and Properties
- 4.2 Manual Representation of "Processo de João"
- 4.3 Properties to Extend the Model
- 5 Conclusions
- References
- OpenPSS: An Open Page Stream Segmentation Benchmark
- 1 Introduction
- 2 Related Work
- 2.1 Page Stream Segmentation
- 2.2 Other Datasets
- 3 Method
- 3.1 Datasets
- 3.2 PSS Variants
- 3.3 Models
- 3.4 Model Ensembling
- 3.5 Evaluation
- 4 Results
- 4.1 Standard Page Stream Segmentation Task
- 4.2 Robust Page Stream Segmentation Task
- 4.3 Model Ensembling
- 5 Relevance for Information Retrieval
- 6 Discussion and Future Work
- 7 Conclusion
- References
- Correction to: Comparative Analysis of Evaluation Measures for Scientific Text Simplification
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino, and many more (Kindle: limited support only).
The PDF file format always displays a book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, there is a personalised watermark embedded in the eBook that can be used to identify the purchaser of the eBook in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.