
Applying AI-Based Tools and Technologies Towards Revitalization of Indigenous and Endangered Languages
Description
This book emphasises the need for language resource development and its impact on society. It covers the latest AI-based tools and techniques used to preserve indigenous and endangered languages, and highlights how recent technologies such as the Generative Pre-trained Transformer (GPT) can support endangered language preservation. It discusses morphological analysis, translation support, and shallow parsing of various tribal languages of India and abroad. The book seeks to answer how digital technologies can make language revitalization accessible to future generations.
Reviews / Votes
"This book is a valuable contribution to the field of language preservation. It offers a detailed exploration of how AI and NLP can be leveraged to save endangered languages, backed by case studies and expert insights. . the book is a significant resource for researchers, developers, and policymakers interested in the intersection of technology and language preservation." (Mario Antoine Aoun, Computing Reviews, December 26, 2024)

Persons
Dr. Sushree Sangita Mohanty is currently working as an Assistant Professor in the Department of
Anthropology and leads the Mother Tongue based Multilingual Education project at the Kalinga
Institute of Social Sciences (KISS), which recently received the UNESCO International Literacy
Prize 2022. Her expertise in Multilingual Education helps build a strong educational foundation
among the indigenous children of KISS. Her research interests are multidisciplinary, centring on
the socio-cultural life, multilingualism, and livelihood vulnerability of indigenous and
low-income communities of Odisha, India. She has been listed as a UNESCO Inclusive Policy Lab
Expert.
Dr. Satya Ranjan Dash is currently working as an Associate Professor at KIIT University, India. His
current research includes epileptic seizure detection from EEG signals using spiking neural
networks (SNN), classification of schizophrenia patients from EEG and fMRI using SNN and SSN,
classification of fetal heart rate signals with extreme learning machines (ELM), mammogram
analysis with local binary patterns (LBP), generative adversarial network (GAN) models, machine
learning, medical image processing, machine translation, natural language processing, and fuzzy
mathematical models.
Dr. Shantipriya Parida is currently working as a Senior AI Scientist at Silo AI, Finland. Before
joining Silo AI, he worked as a Postdoctoral Researcher at Idiap Research Institute, Switzerland.
He completed a postdoc in Machine Translation at Charles University, Prague, Czech Republic, and
holds a Ph.D. in Computational Neuroscience from Utkal University, Odisha, India. Before his
postdoc at Charles University, he worked as a System Architect at Huawei Technologies India Pvt
Ltd, Bangalore, India. He has 15 years of experience in software development and architecture, as
well as expertise in machine learning, deep learning, and Natural Language Processing. He has four
years of research experience leading NLP tasks in EU H2020 and InnoSuisse projects, with
publications in top-tier conferences and journals. He serves on the program and organizing
committees of many top-tier NLP conferences and workshops. He recently published an edited book,
"Natural Language Processing in Healthcare: A Special Focus on Low Resource Language", in
collaboration with other NLP researchers.
Content
- Intro
- Preface
- Contents
- About the Editors
- Language Revitalization & Artificial Intelligence
- Kuvi Character Set: A Mobile Interface for the Revitalization of the Kuvi Language
- 1 Introduction: The Imperative Need for Developing a Kuvi Character Set for an Unwritten Endangered Language
- 1.1 About the Speaker
- 1.2 Historical Background of Kuvi Language
- 2 KISS Model of Character Set Development
- 2.1 Participants
- 2.2 Phases
- 2.3 Process
- 2.4 Principles
- 3 Summary
- References
- Reviving Endangered Languages: Exploring AI Technologies for the Preservation of Tanzania's Hehe Language
- 1 Introduction
- 2 Literature Review
- 3 Proposed Model
- 4 Conclusion and Future Prospect
- References
- Preservation of Vedda's Language in Sri Lanka
- 1 Introduction
- 1.1 About the Language
- 1.2 Challenges and Opportunities
- 2 Literature Survey
- 3 Propose Model
- 4 Preserve and Promote the Vedda Language
- 5 Conclusion and Future Work
- References
- Role of Digital Technology in the Education, Promotion, and Revitalization of "Ho" Languages
- 1 Introduction
- 2 Methodology
- 3 Role of Digital Technology
- 3.1 Indigenous Communities and Technology
- 4 Proposed Digital Technology for Ho Language
- 4.1 Different Factors for Promotion of Ho Language
- 5 Revitalization of Ho Language
- 5.1 Technology in Endangered Language Contexts
- 6 Ho Language Education
- 6.1 Documentation, Preservation, and Revitalization
- 6.2 Language Pedagogy
- 7 Conclusion
- References
- Changing the Trajectory: Preserving the Linguistic Diversity of Shi Language Using AI and NLP
- 1 Introduction
- 2 The Shi Language: Overview and Challenges
- 2.1 Linguistic Characteristics of Shi Language
- 2.2 Language Endangerment Factors
- 2.3 Sociocultural Implications
- 3 AI and NLP in Language Revitalization
- 3.1 Role of AI in Language Preservation
- 3.2 NLP Applications for Endangered Languages
- 4 Future Prospects
- 4.1 Data Collection and Analysis
- 4.2 Community Engagement and Collaboration
- 5 AI-Based Solutions for Shi Language Revitalization
- 5.1 Automatic Speech Recognition (ASR) Systems
- 5.2 Machine Translation and Language Generation
- 5.3 Language Learning Applications
- 5.4 Digital Archives and Preservation
- 6 Ethical Considerations and Cultural Sensitivity
- 6.1 Informed Consent and Community Involvement
- 6.2 Preserving Cultural Nuances and Context
- 6.3 Balancing Technological Advancements with Traditional Knowledge
- 7 Case Studies of AI Implementation in Language Revitalization
- 7.1 Impact Assessment and Evaluation
- 8 Future Directions and Recommendations
- 8.1 Long-Term Sustainability Strategies
- 8.2 Collaboration with Indigenous Communities
- 8.3 Policy and Funding Support
- 9 Conclusion
- References
- Kuvi Calendar: Harnessing Indigenous Calendar for Language Revitalization
- 1 Introduction: Understanding the Cultural and Practical Significance of Kuvi Calendar
- 1.1 The Cultural and Practical Importance of Indigenous Calendar
- 2 Process of Making Kuvi Calendar
- 2.1 The Multi-Step Process for the Development of Kuvi (Physical) Calendar First Phase
- 2.2 The Second Phase-Development of Parallel Corpus for Kuvi (Digital) Calendar
- 3 Embodying Kuvi Cultural Heritage: The Physical Kuvi Calendar
- 3.1 Week Structure
- 3.2 Month Structure
- 3.3 Day Structure in Each Month
- 4 Conclusion
- References
- Natural Language Process (NLP) for Language Analysis
- Contemplating Dialects When Building a Guarani Corpus for NLP
- 1 Introduction
- 2 Minority Languages in South America
- 2.1 Brief Socio-historical Background of Guarani in Paraguay
- 2.2 Guarani Features
- 3 Challenges Faced While Building a Guarani-Spanish Corpus
- 3.1 Challenge 1: Lack of Data to Build a Corpus
- 3.2 Challenge 2: Guarani and Spanish Meet in Jopara
- 3.3 Challenge 3: The Unbearable Lightness of Guarani Orthography
- 4 Conclusion and Future Prospect
- References
- The Role of NLP to Facilitate the Growth of Ge'ez Language
- 1 Introduction
- 1.1 About the Ge'ez Language
- 1.2 Number of Speakers
- 2 Literature Review
- 2.1 The Role of NLP to Facilitate the Growth of Ge'ez Language
- 2.2 Applications of NLP
- 3 Conclusion
- 4 Future Work
- References
- Developing Multilingual Glossaries for STEM Terminology Using AI-NLP
- 1 Introduction
- 2 Building the Glossary
- 3 AI-Mediated NLP-Based Word Creation
- 4 Conclusion and Future Perspectives
- References
- Development of Parallel Speech Data Repository for Ho Language
- 1 Introduction
- 2 Literature Survey
- 3 Proposed Model
- 3.1 Digital Resources
- 3.2 Data Scraping from Ho Wikipedia
- 3.3 Optical Character Recognition
- 3.4 Parallel Corpus
- 3.5 Manually Correction from Human Volunteers
- 3.6 Speech to Text
- 4 Conclusion
- References
- Challenges to Prepare the Parallel Corpus for Luganda Language
- 1 Introduction
- 2 Literature Survey
- 3 Proposed Model
- 3.1 Optical Character Recognition
- 3.2 Speech to Text
- 3.3 Web Scraping
- 3.4 Newspapers
- 4 Conclusion and Future Work
- References
- Proposed Model for Automatic Dialect Classification of Binjhal Language
- 1 Introduction
- 2 Binjhal Language
- 2.1 Language Identification
- 3 Literature Reviews
- 4 Data Collection and Preparation
- 5 Proposed Model
- 5.1 Preprocessing
- 5.2 Types of Preprocessing
- 6 Experiment Result and Evaluation
- 7 Conclusion
- References
- Twi Speech Processing: Techniques and Applications
- 1 Introduction
- 2 Literature Review
- 2.1 Twi Language and Linguistic Characteristics
- 2.2 Challenges in Twi Speech Processing
- 2.3 Techniques in Twi Speech Processing
- 2.4 Applications of Twi Speech Processing
- 2.5 Future Directions in Twi Speech Processing
- 3 Methodology
- 3.1 Data Collection and Preprocessing
- 3.2 Feature Extraction
- 3.3 Speech Processing Applications
- 3.4 Dialectal Variations Analysis
- 3.5 Speaker Identification and Verification
- 3.6 Evaluation and Validation
- 3.7 Future Directions
- 4 Techniques and Working Principle
- 5 Conclusion and Future Directions
- References
- Cultural Survival Heritage of Bambara Language by Using NLP
- 1 Introduction
- 2 Literature Survey
- 3 Cultural Significance of Bambara Language in Malian Literature and Music
- 4 Socio-Cultural Factors Impacting the Preservation of Bambara
- 5 Language Revitalization Efforts in Mali
- 6 AI-Based Language Documentation Projects for Endangered Languages
- 7 Government Policies and International Cooperation for Language Preservation
- 8 NLP-Based Language Revitalization Projects in Other Regions
- 9 Proposed Model
- 10 Data Collection of Bambara Texts
- 11 Machine Learning (Clustering the Collected Bambara Texts)
- 12 Data Processing (Tokenization and Stemming)
- 13 Sentiment Analysis (Optional)
- 14 Model Evaluation
- 15 Feature Extraction
- 16 Result and Application
- 17 Conclusion
- References
- Dialect Identification of Gondar, Gojjami, and Showa Language of Amharic Using AI and NLP
- 1 Introduction
- 2 Literature Survey
- 2.1 Literature Survey
- 2.2 Tigrinya Dialect Identification
- 2.3 Assamese Dialects
- 2.4 Santali Dialect Identification
- 2.5 Kamrupi Dialect Identification
- 2.6 Maghrebian Dialect Recognition
- 2.7 Algerian Dialect Recognition
- 2.8 Tunisian Dialect Recognition
- 2.9 Goalparia Dialect Identification
- 2.10 Ao Dialect Identification
- 3 Proposed Method
- 3.1 Data Collection and Preprocessing
- 3.2 Data Collection and Preprocessing
- 4 Results and Discussions
- 4.1 Model Performance
- 4.2 Challenges and Limitations
- 4.3 Future Directions
- 5 Conclusion
- References
- Creating a Parallel Corpus for Machine Translation: A Case Study of Kru and Krio
- 1 Introduction
- 1.1 Krio
- 1.2 Kru
- 1.3 Syntax and Alphabet
- 2 Related Works
- 2.1 Works Done on Similar Languages
- 3 Proposed Model
- 3.1 Optical Character Recognition
- 3.2 Books
- 3.3 Existing Database
- 3.4 Web Scraping
- 3.5 Speech-To-Text
- 4 Conclusion
- 5 Future Works
- References
- Developing Parallel Corpus for the Machine Translation System in Dzongkha Language
- 1 Introduction
- 2 Literature Review
- 3 Proposed Model
- 4 Conclusion and Future Prospects
- References
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format always displays a book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, a personalised watermark is embedded in the eBook; it can be used to identify the purchaser in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.