
Building Machine Learning Systems with a Feature Store
Batch, Real-Time, and LLM Systems
Jim Dowling (Author)
O'Reilly (Publisher)
Expected publication date: approx. 30 November 2025
Book
Paperback/Softback
450 pages
978-1-0981-6523-9 (ISBN)
Description
Get up to speed on a new unified approach to building machine learning (ML) systems with batch data, real-time data, and large language models (LLMs) based on independent, modular ML pipelines and a shared data layer. With this practical book, data scientists and ML engineers will learn in detail how to develop, maintain, and operate modular ML systems.
Author Jim Dowling introduces fundamental MLOps principles and practices for developing and operating reliable ML systems and describes the key data platform that you'll use to build and operate your ML systems: the feature store. Through examples, you'll look at how the feature store helps solve the hardest problem in ML: the data. When building systems, you'll move seamlessly from managing incremental datasets for training and fine-tuning to real-time data access and retrieval-augmented generation for online ML systems.
With this book, you'll be able to:
Make the leap from training ML models to building ML systems
Develop an ML system as modular feature, training, and inference pipelines
Design, develop, and operate batch ML systems, real-time ML systems, and fine-tuned LLM systems with retrieval-augmented generation
Learn the problems a feature store for ML solves when building ML systems
Understand the principles of MLOps for developing and safely updating ML systems
Jim Dowling is CEO of Hopsworks and an associate professor at KTH Royal Institute of Technology in Stockholm, Sweden.
More details
Language
English
Place of publication
Sebastopol
United States
Product notice
Paperback (trade)
Unsewn / adhesive bound
Dimensions
Height: 234 mm
Width: 178 mm
Thickness: 27 mm
Weight
792 g
ISBN-13
978-1-0981-6523-9 (9781098165239)
Copyright in bibliographic data and cover images is held by Nielsen Book Services Limited or by the publishers or by their respective licensors: all rights reserved.
Person
Jim Dowling is CEO of Hopsworks and an associate professor at KTH Royal Institute of Technology. He has led the development of Hopsworks, which includes the first open-source feature store for machine learning. He has a unique background at the intersection of data and AI. On the data side, he worked at MySQL and later led the development of HopsFS, a distributed file system that won the IEEE Scale Prize in 2017. On the AI side, his PhD introduced Collaborative Reinforcement Learning, and in 2016 he developed and taught the first course on deep learning in Sweden. He also released a popular online course on serverless machine learning using Python at serverless-ml.org. This combined background in data and AI helped him realize the vision of a feature store for machine learning based on general-purpose programming languages rather than on DSLs, as in the earlier feature store work at Uber. He was the first evangelist for feature stores, helping to create the feature store product category through talks at industry conferences such as Data/AI Summit, PyData, and OSDC, and through educational articles on feature stores. He is the organizer of the annual feature store summit conference and the featurestore.org community, as well as co-organizer of PyData Stockholm.