What is text mining, and how can it be used? What relevance do these methods have to everyday work in information science and the digital humanities? How does one develop competences in text mining? Working with Text provides a series of cross-disciplinary perspectives on text mining and its applications. As text mining raises legal and ethical issues, the legal background of text mining and the responsibilities of the engineer are discussed in this book. Chapters provide an introduction to the use of the popular GATE text mining package with data drawn from social media, the use of text mining to support semantic search, the development of an authority system to support content tagging, and recent techniques in automatic language evaluation. Focused studies describe text mining on historical texts, automated indexing using constrained vocabularies, and the use of natural language processing to explore the climate science literature. Interviews are included that offer a glimpse into the real-life experience of working within commercial and academic text mining.
Dimensions
Height: 229 mm
Width: 152 mm
ISBN-13
978-1-84334-749-1 (9781843347491)
Emma Tonkin is a Senior Research Associate in the Faculty of Engineering at the University of Bristol. She has held positions at several universities, having previously worked in digital library research at UKOLN, University of Bath, and in the Department of Digital Humanities, King's College London. She holds a PhD in Computer Science from the University of Bristol. Her primary research interests include text and data mining, human-computer interaction, and the development of hybrid systems that combine human and machine classification.
Gregory Tourte is a Senior Research Associate in the School of Geographical Sciences at the University of Bristol. He started there as a system administrator for the research group's supercomputer and used this opportunity to research data management, prompted by the large quantity of data being generated. He continues his work on deep-time climate modelling within the Bristol Research Initiative for the Dynamic Global Environment (BRIDGE).
Authors
Emma Tonkin: Senior Research Associate, Faculty of Engineering, University of Bristol, UK
Gregory Tourte: Senior Research Associate, School of Geographical Sciences, University of Bristol, UK
Chapter 1: Working with Text
Chapter 2: A Day at Work (with Text): A Brief Introduction
Chapter 3: If You Find Yourself in a Hole, Stop Digging: Legal and Ethical Issues of Text/Data Mining in Research
Chapter 4: Responsible Content Mining
Chapter 5: Text Mining for Semantic Search in Europe PubMed Central Labs
Chapter 6: Extracting Information from Social Media with GATE
Chapter 7: Newton: Building an Authority-Driven Company Tagging and Resolution System
Chapter 8: Automatic Language Identification
Chapter 9: User-Driven Text Mining of Historical Text
Chapter 10: Automatic Text Indexing with SKOS Vocabularies in HIVE
Chapter 11: The PIMMS Project and Natural Language Processing for Climate Science: Extending the ChemicalTagger Natural Language Processing Tool with Climate Science Controlled Vocabularies
Chapter 12: Building Better Mousetraps: A Linguist in NLP
Chapter 13: Raul Garreta, Co-founder of Tryolabs.com, Tells Emma Tonkin About the Journey from Software Engineering Graduate to Startup Entrepreneur