
Anonymizing Health Data

Content
- Intro
- Copyright
- Table of Contents
- Preface
- Audience
- Conventions Used in this Book
- Safari® Books Online
- How to Contact Us
- Content Updates
- August 2014
- Acknowledgements
- Chapter 1. Introduction
- To Anonymize or Not to Anonymize
- Consent, or Anonymization?
- Penny Pinching
- People Are Private
- The Two Pillars of Anonymization
- Masking Standards
- De-Identification Standards
- Anonymization in the Wild
- Organizational Readiness
- Making It Practical
- Making It Automated
- Use Cases
- Stigmatizing Analytics
- Anonymization in Other Domains
- About This Book
- Chapter 2. A Risk-Based De-Identification Methodology
- Basic Principles
- Steps in the De-Identification Methodology
- Step 1: Selecting Direct and Indirect Identifiers
- Step 2: Setting the Threshold
- Step 3: Examining Plausible Attacks
- Step 4: De-Identifying the Data
- Step 5: Documenting the Process
- Measuring Risk Under Plausible Attacks
- T1: Deliberate Attempt at Re-Identification
- T2: Inadvertent Attempt at Re-Identification
- T3: Data Breach
- T4: Public Data
- Measuring Re-Identification Risk
- Probability Metrics
- Information Loss Metrics
- Risk Thresholds
- Choosing Thresholds
- Meeting Thresholds
- Risky Business
- Chapter 3. Cross-Sectional Data: Research Registries
- Process Overview
- Secondary Uses and Disclosures
- Getting the Data
- Formulating the Protocol
- Negotiating with the Data Access Committee
- BORN Ontario
- BORN Data Set
- Risk Assessment
- Threat Modeling
- Results
- Year on Year: Reusing Risk Analyses
- Final Thoughts
- Chapter 4. Longitudinal Discharge Abstract Data: State Inpatient Databases
- Longitudinal Data
- Don't Treat It Like Cross-Sectional Data
- De-Identifying Under Complete Knowledge
- Approximate Complete Knowledge
- Exact Complete Knowledge
- Implementation
- Generalization Under Complete Knowledge
- The State Inpatient Database (SID) of California
- The SID of California and Open Data
- Risk Assessment
- Threat Modeling
- Results
- Final Thoughts
- Chapter 5. Dates, Long Tails, and Correlation: Insurance Claims Data
- The Heritage Health Prize
- Date Generalization
- Randomizing Dates Independently of One Another
- Shifting the Sequence, Ignoring the Intervals
- Generalizing Intervals to Maintain Order
- Dates and Intervals and Back Again
- A Different Anchor
- Other Quasi-Identifiers
- Connected Dates
- Long Tails
- The Risk from Long Tails
- Threat Modeling
- Number of Claims to Truncate
- Which Claims to Truncate
- Correlation of Related Items
- Expert Opinions
- Predictive Models
- Implications for De-Identifying Data Sets
- Final Thoughts
- Chapter 6. Longitudinal Events Data: A Disaster Registry
- Adversary Power
- Keeping Power in Check
- Power in Practice
- A Sample of Power
- The WTC Disaster Registry
- Capturing Events
- The WTC Data Set
- The Power of Events
- Risk Assessment
- Threat Modeling
- Results
- Final Thoughts
- Chapter 7. Data Reduction: Research Registry Revisited
- The Subsampling Limbo
- How Low Can We Go?
- Not for All Types of Risk
- BORN to Limbo!
- Many Quasi-Identifiers
- Subsets of Quasi-Identifiers
- Covering Designs
- Covering BORN
- Final Thoughts
- Chapter 8. Free-Form Text: Electronic Medical Records
- Not So Regular Expressions
- General Approaches to Text Anonymization
- Ways to Mark the Text as Anonymized
- Evaluation Is Key
- Appropriate Metrics, Strict but Fair
- Standards for Recall, and a Risk-Based Approach
- Standards for Precision
- Anonymization Rules
- Informatics for Integrating Biology and the Bedside (i2b2)
- i2b2 Text Data Set
- Risk Assessment
- Threat Modeling
- A Rule-Based System
- Results
- Final Thoughts
- Chapter 9. Geospatial Aggregation: Dissemination Areas and ZIP Codes
- Where the Wild Things Are
- Being Good Neighbors
- Distance Between Neighbors
- Circle of Neighbors
- Round Earth
- Flat Earth
- Clustering Neighbors
- We All Have Boundaries
- Fast Nearest Neighbor
- Too Close to Home
- Levels of Geoproxy Attacks
- Measuring Geoproxy Risk
- Accounting for Geoproxy Risk
- Final Thoughts
- Chapter 10. Medical Codes: A Hackathon
- Codes in Practice
- Generalization
- The Digits of Diseases
- The Digits of Procedures
- The (Alpha)Digits of Drugs
- Suppression
- Shuffling
- Final Thoughts
- Chapter 11. Masking: Oncology Databases
- Schema Shmema
- Data in Disguise
- Field Suppression
- Randomization
- Pseudonymization
- Frequency of Pseudonyms
- Masking On the Fly
- Final Thoughts
- Chapter 12. Secure Linking
- Let's Link Up
- Doing It Securely
- Don't Try This at Home
- The Third-Party Problem
- Basic Layout for Linking Up
- The Nitty-Gritty Protocol for Linking Up
- Bringing Paillier to the Parties
- Matching on the Unknown
- Scaling Up
- Cuckoo Hashing
- How Fast Does a Cuckoo Run?
- Final Thoughts
- Chapter 13. De-Identification and Data Quality: A Clinical Data Warehouse
- Useful Data from Useful De-Identification
- Degrees of Loss
- Workload-Aware De-Identification
- Questions to Improve Data Utility
- A Clinical Data Warehouse
- GI Protocol
- Chlamydia Protocol
- Date Shifting
- Final Thoughts
- Index
- About the Authors
System requirements
File format: PDF
Copy-Protection: Adobe-DRM (Digital Rights Management)
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format displays every book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, you will not be able to open the eBook, so prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise any reading software with your personal Adobe ID after installing it.
For more information, see our eBook Help page.