Practical Weak Supervision

Name: Practical Weak Supervision
Brand: O'Reilly
Price: 50.49 EUR
Availability: OnlineOnly

Wee Hyong Tok, Amit Bahree, Senja Filipi (Authors)

O'Reilly (Publisher)

Published on 30 September 2021

192 pages

E-Book

PDF with Adobe-DRM

978-1-4920-7703-9 (ISBN)

€50.49 incl. 7% VAT

E-Book Single Licence

Available for download

Content

Intro
Copyright
Table of Contents
Foreword by Xuedong Huang
Foreword by Alex Ratner
Preface
Who Should Read This Book
Navigating This Book
Conventions Used in This Book
Using Code Examples
O'Reilly Online Learning
How to Contact Us
Acknowledgments
Chapter 1. Introduction to Weak Supervision
What Is Weak Supervision?
Real-World Weak Supervision with Snorkel
Approaches to Weak Supervision
Incomplete Supervision
Inexact Supervision
Inaccurate Supervision
Data Programming
Getting Training Data
How Data Programming Is Helping Accelerate Software 2.0
Summary
Chapter 2. Diving into Data Programming with Snorkel
Snorkel, a Data Programming Framework
Getting Started with Labeling Functions
Applying the Labels to the Datasets
Analyzing the Labeling Performance
Using a Validation Set
Reaching Labeling Consensus with LabelModel
Intuition Behind LabelModel
LabelModel Parameter Estimation
Strategies to Improve the Labeling Functions
Data Augmentation with Snorkel Transformers
Data Augmentation Through Word Removal
Snorkel Preprocessors
Data Augmentation Through GPT-2 Prediction
Data Augmentation Through Translation
Applying the Transformation Functions to the Dataset
Summary
Chapter 3. Labeling in Action
Labeling a Text Dataset: Identifying Fake News
Exploring the Fake News Detection (FakeNewsNet) Dataset
Importing Snorkel and Setting Up Representative Constants
Fact-Checking Sites
Is the Speaker a "Liar"?
Twitter Profile and Botometer Score
Generating Agreements Between Weak Classifiers
Labeling an Images Dataset: Determining Indoor Versus Outdoor Images
Creating a Dataset of Images from Bing
Defining and Training Weak Classifiers in TensorFlow
Training the Various Classifiers
Weak Classifiers out of Image Tags
Deploying the Computer Vision Service
Interacting with the Computer Vision Service
Preparing the DataFrame
Learning a LabelModel
Summary
Chapter 4. Using the Snorkel-Labeled Dataset for Text Classification
Getting Started with Natural Language Processing (NLP)
Transformers
Hard Versus Probabilistic Labels
Using ktrain for Performing Text Classification
Data Preparation
Dealing with an Imbalanced Dataset
Training the Model
Using the Text Classification Model for Prediction
Finding a Good Learning Rate
Using Hugging Face and Transformers
Loading the Relevant Python Packages
Dataset Preparation
Checking Whether GPU Hardware Is Available
Performing Tokenization
Model Training
Testing the Fine-Tuned Model
Summary
Chapter 5. Using the Snorkel-Labeled Dataset for Image Classification
Visual Object Recognition Overview
Representing Image Features
Transfer Learning for Computer Vision
Using PyTorch for Image Classification
Loading the Indoor/Outdoor Dataset
Utility Functions
Visualizing the Training Data
Fine-Tuning the Pretrained Model
Summary
Chapter 6. Scalability and Distributed Training
The Need for Scalability
Distributed Training
Apache Spark: An Introduction
Spark Application Design
Using Azure Databricks to Scale
Cluster Setup for Weak Supervision
Fake News Detection Dataset on Databricks
Labeling Functions for Snorkel
Setting Up Dependencies
Loading the Data
Fact-Checking Sites
Transfer Learning Using the LIAR Dataset
Weak Classifiers: Generating Agreement
Type Conversions Needed for Spark Runtime
Summary
Index
About the Authors
Colophon

Schweitzer Fachinformationen
