Enhance your bioinformatics toolbox with this book of recipes, tips, and tricks, using Python to accomplish key tasks like aligning sequence data, calling variants, and building infrastructure as code.
Key Features
Perform primary, secondary, and tertiary sequence analysis with Python libraries.
Solve real-world problems in the fields of phylogenetics, protein design, and annotation.
Use language models and other AI techniques to work with multimodal bioinformatics data.
Book Description
Bioinformatics with Python Cookbook is a recipe-based guide that helps you choose from a wide range of Python packages and approaches for classic bioinformatics problems.
Starting with the fundamentals of key Python libraries used for data science and bioinformatics, you will progress through key tasks in sequencing analysis, quality control, alignment, and variant calling. Along the way, you will learn important tips about modern coding practices and recent advances in the field. You will dive deep into core bioinformatics tasks such as phylogenetic analysis and population genomics. You'll work with practical examples using libraries such as NumPy, pandas, and scikit-learn. The book also introduces the wealth of modern public bioinformatics databases. You'll gain hands-on knowledge of important cloud computing approaches and learn how to set up workflow orchestration systems for controlling bioinformatics pipelines. Finally, you will gain an understanding of how bioinformatics is evolving to include AI, and you'll try out new techniques such as large language models to design proteins and DNA.
By the end of this book, you will have a functional understanding of using Python for bioinformatics and know how to launch bioinformatics pipelines.
What you will learn
Process, analyze, and align sequencing data.
Call variants and interpret their biological meaning.
Use modern cloud infrastructure and launch bioinformatics workflows.
Ingest and transform data with ease.
See how AI is impacting bioinformatics.
Leverage imaging and single-cell sequencing data to explore biological effects and cluster gene expression data.
Who this book is for
This book is for early and mid-level practitioners in bioinformatics, data science, and software engineering who want to improve their skills and learn practical approaches to real problems. Readers will benefit from previous exposure to Python programming and software engineering techniques. Knowledge of basic biology, including DNA, proteins, and cell structure, is important. Knowledge of at least one cloud computing platform (AWS, GCP, or Azure) is helpful. Previous exposure to machine learning with Python is beneficial but not required.
Dimensions
Height: 235 mm
Width: 191 mm
ISBN-13
978-1-83664-275-6 (9781836642756)
Shane Brubaker is a Senior Bioinformatics Manager and researcher at Myriad Genetics, where he contributes to advancing genetic testing and cancer research through cutting-edge sequencing technologies.
He holds a B.S. in Computer Science from Purdue University and an M.S. in Biology from the University of Oregon.
With extensive experience in synthetic biology and human health, Shane has leveraged bioinformatics to drive scientific innovation, develop new products, and advocate for impactful solutions in both human and environmental health.
Table of Contents
Computer Specification and Python Setup
Basics of Data Manipulation
Modern Coding Practices and AI-Generated Coding
Data Science and Graphing
Alignment and Variant Calling
Annotation and Biological Interpretation
Genomes and Genome Assembly
Nucleic Acid Database
Protein Databases
Phylogenetics
Population Genomics
Metabolic Modeling and Other Applications
Genome Editing
Cloud Basics, Infrastructure as Code, and Containers
Workflow Systems
Machine Learning, Deep Learning, and LLMs for Nucleic Acid and Protein Design
Single Cell Technology and Imaging Data