
Fundamentals of Cost-Efficient AI
In Healthcare and Biomedicine
Rohit Kumar (Author)
Academic Press
Published on 9 December 2025
Book
Paperback/Softback
434 pages
978-0-443-33362-0 (ISBN)
Description
Fundamentals of Cost-Efficient AI: In Healthcare and Biomedicine provides a comprehensive yet accessible introduction to the principles of designing, training, and deploying efficient artificial intelligence systems. It explains the theory behind cost-aware machine learning and data mining and examines methods across deep learning, graph neural networks (GNNs), transformer architectures, diffusion models, reinforcement learning, and knowledge distillation.
The book covers fine-tuning and compression techniques such as low-rank adaptation (LoRA), parameter-efficient fine-tuning (PEFT), adapter-based tuning, pruning, and quantization. It also explores inference acceleration through FlashAttention, prefill optimization, and speculative decoding, and explains how mixture-of-experts (MoE) architectures can scale models efficiently across GPUs and edge devices.
To build a strong conceptual understanding, the text introduces fundamentals of GPU architecture, matrix multiplication, memory hierarchies, and parallelization strategies, helping readers develop an intuition for optimizing training and inference pipelines.
While applicable across domains, the book places special emphasis on healthcare and biomedicine, where efficient AI can reduce costs and improve diagnostics, precision medicine, and clinical decision support. Real-world case studies and interviews with experts from organizations such as Google and Microsoft provide practical insights into building scalable healthcare AI systems. Aimed at graduate students, researchers, clinicians, biomedical engineers, data scientists, and AI practitioners, this book bridges algorithmic principles with applied implementation.
More details
Language
English
Place of publication
San Diego
United States
Publishing group
Elsevier Science Publishing Co Inc
Target group
Professional and scholarly
Dimensions
Height: 235 mm
Width: 191 mm
Weight
450 g
ISBN-13
978-0-443-33362-0 (9780443333620)
Copyright in bibliographic data and cover images is held by Nielsen Book Services Limited or by the publishers or by their respective licensors: all rights reserved.
Other editions
E-Book
12/2025
Elsevier
€171.99
Available for download
Person
Rohit Kumar studied at Stanford, IIT Delhi, and RPI, specializing in machine learning. He is the Global Head of AI & Analytics at HCLTech (Digital Business), a visiting faculty member at Shiv Nadar University, and a PhD scholar at IIT researching AI hallucinations. With over 20 years of product development experience in Silicon Valley, he has served as Head of R&D at the Ministry of IT (Government of India), Senior Director at WalmartLabs, and CEO of a blockchain startup. He holds multiple patents and publications on generative AI, data mining, and large-scale distributed systems.
Author
HCLTech, Noida, Uttar Pradesh, India; IIT Delhi, New Delhi, India; Shiv Nadar University, Chennai, Tamil Nadu, India; 500 Startups, San Francisco, CA, United States
Content
Introduction
Efficient transformer architectures
Efficient model fine-tuning
Model compression techniques
Efficient reinforcement learning
Efficient graph algorithms
Training data augmentation
Training data generation
Cost-efficient mixture of experts
GPU fundamentals and model inference
Fast matrix multiplication algorithms
Running models locally
Expert interviews and use cases