
Microbiome Statistics Set
CRC Press
1st Edition
Will be published approximately on 7 January 2026
Book
1172 pages
978-1-041-22013-8 (ISBN)
Description
Microbiome Statistics Set addresses the statistical analysis of correlation, association, interaction, and composition in microbiome research and discusses the challenges of machine learning statistics, with an emphasis on the importance of performance evaluation using appropriate metrics and independent data.
The books define the study of the microbiome as a hypothesis-driven experimental science and investigate the challenges of analyzing microbiome data with standard statistical methods, while also providing step-by-step procedures for applying machine learning to microbiome data, including feature engineering, algorithm selection and optimization, performance evaluation, and model testing. They comment on the benefits and limitations of using machine learning for microbiome statistics and remark on the advantages and disadvantages of each machine learning algorithm.
This set consists of 15 chapters on applied microbiome statistics and 19 chapters on machine learning for microbiome statistics and is an excellent reference for researchers, students, academics and data analysts in the field.
Key Features:
• Discusses the issues of statistical analysis of microbiome data: high dimensionality, compositionality, sparsity, overdispersion, zero-inflation, and heterogeneity.
• Describes important concepts of machine learning, including the bias-variance tradeoff, accuracy and precision, overfitting and underfitting, model complexity and interpretability, and feature engineering.
• Investigates statistical methods for multiple comparisons and multiple hypothesis testing and their applications to microbiome data.
• Introduces a series of exploratory tools to visualize the composition and correlation of microbial taxa using barplots, heatmaps, and correlation plots.
• Introduces the confusion matrix and its derived measures. Comprehensively describes the properties of F1, Matthews correlation coefficient (MCC), area under the receiver operating characteristic curve (AUC-ROC), and area under the precision-recall curve (AUC-PR), and discusses their advantages and disadvantages when used for microbiome data.
• Employs the Kruskal-Wallis rank-sum test to perform model selection for further multi-omics data integration.
• Offers all related R code and the datasets from the authors' first-hand microbiome research and publicly available data.
More details
Language
English
Place of publication
London
United Kingdom
Publishing group
Taylor & Francis Ltd
Target group
College/higher education
Professional and scholarly
Academic, Postgraduate, Professional Practice & Development, and Professional Reference
Illustrations
57 Tables, black and white; 100 Line drawings, color; 40 Line drawings, black and white; 4 Halftones, color; 104 Illustrations, color; 40 Illustrations, black and white
Dimensions
Height: 234 mm
Width: 156 mm
Weight
2370 g
ISBN-13
978-1-041-22013-8 (9781041220138)
Copyright in bibliographic data and cover images is held by Nielsen Book Services Limited or by the publishers or by their respective licensors: all rights reserved.
Persons
Dr. Yinglin Xia is a Clinical Professor in the Department of Medicine at the University of Illinois Chicago. He has published six books on statistical analysis of microbiome and metabolomics data and more than 180 statistical methodology and research papers in peer-reviewed journals. He serves on the editorial boards of several scientific journals, including as an Associate Editor of Gut Microbes, and has served as a reviewer for over 100 scientific journals.
Dr. Jun Sun is a tenured Professor of Medicine at the University of Illinois Chicago and an internationally recognized expert on the microbiome and human diseases, e.g., vitamin D receptor in inflammation, dysbiosis, and intestinal dysfunction in amyotrophic lateral sclerosis (ALS). Her lab was the first to discover the chronic effects and molecular mechanisms of Salmonella infection and risk of colon cancer. Dr. Sun has published over 260 scientific articles in peer-reviewed journals and 10 books on the microbiome.
Content
Applied Microbiome Statistics: Correlation, Association, Interaction and Composition
Preface
Acknowledgements
About the Authors
1. Introduction to Microbiome Statistics
2. Classical Parametric Correlation
3. Classical Nonparametric Correlation
4. Composition Barplots
5. Composition Heatmaps
6. Correlation Heatmaps and Plots
7. Model Selection for Correlation and Association Analysis
8. Alpha Diversity-Based Association Analysis
9. Beta Diversity-Based Association Analysis
10. Multiple Comparisons and Multiple Hypothesis Testing
11. Multiple Comparisons and Multiple Hypothesis Testing in Microbiome Research
12. Linear Discriminant Analysis Effect Size (LEfSe)
13. Sparse and Compositional Methods for Inferencing Microbial Interactions
14. Network Construction and Comparison for Microbiome Data
15. Microbial Networks in Semi-Parametric Rank-Based Correlation and Partial Correlation Estimation
References
Machine Learning for Microbiome Statistics
Preface
Acknowledgements
About the Authors
Chapter 1 Introduction to Machine Learning
Chapter 2 Overview of Machine Learning in Microbiome Research
Chapter 3 Accessing Model Accuracy and Goodness of Fit Tests for Normality
Chapter 4 Overfitting and Underfitting
Chapter 5 Assessing Model Accuracy Using Cross-Validation
Chapter 6 Feature Engineering and Model Selection
Chapter 7 Logistic Regression
Chapter 8 Support Vector Machines
Chapter 9 Classification Trees
Chapter 10 Random Forest
Chapter 11 The Evolution of Tree-Based Algorithms
Chapter 12 Extreme Gradient Boosting (XGBoost)
Chapter 13 Artificial Neural Networks and Deep Learning
Chapter 14 Machine Learning Microbiome with SIAMCAT
Chapter 15 Basic Performance Metrics for Machine Learning Models
Chapter 16 Matthews Correlation Coefficient
Chapter 17 Area Under the Receiver Operating Characteristic Curve (AUC-ROC)
Chapter 18 Area Under the Precision-Recall Curve (AUC-PR)
Chapter 19 Comparisons of Machine Learning Classification Models with Tidymodels