Preface
We're living through a GenAI revolution, where AI is no longer just a backend component but a copilot, content creator, and decision-maker. And yet, many GenAI applications still struggle with hallucinations, lack of contextual understanding, and opaque reasoning. That's where this book comes in.
This book was born out of a core belief: knowledge graphs are the missing link between GenAI power and real-world intelligence. By combining the strengths of Large Language Models (LLMs) with the structured, connected data of Neo4j, and enhancing them with Retrieval-Augmented Generation (RAG) workflows, we can build systems that are not only smart but also grounded, contextual, and transparent.
We wrote this book because we've spent the last few years building and showcasing intelligent applications that go far beyond basic chatbot use cases. From developing AI-powered recommendation engines to integrating frameworks such as Haystack, LangChain4j, and Spring AI with Neo4j, we saw a growing need for a practical, hands-on guide that bridges GenAI concepts with production-ready knowledge graph architectures.
The vision for this book is to equip developers, architects, and AI enthusiasts with the tools, concepts, and real-world examples they need to design search and recommendation systems that are explainable, accurate, and scalable. You won't just learn about LLMs or graphs in isolation; you'll build end-to-end applications that bring these technologies together across cloud platforms, vector search, graph reasoning, and more.
As you journey through the chapters, you'll go from understanding foundational concepts to implementing advanced techniques such as embedding-powered retrieval, graph reasoning, and cloud-native GenAI deployments using Google Cloud, AuraDB, and open source tools.
Whether you're a data engineer, AI developer, or just someone curious about the future of intelligent systems, this book will help you build applications that are not only smarter but also produce better answers.
Who this book is for
This book is for database developers and data scientists who want to build knowledge graphs with Neo4j and use its vector search capabilities to create intelligent search and recommendation systems. To get started, working knowledge of Python and Java is essential. Familiarity with Neo4j, the Cypher query language, and fundamental database concepts will also come in handy.
What this book covers
Chapter 1, Introducing LLMs, RAGs, and Neo4j Knowledge Graphs, introduces the core concepts of LLMs, RAG, and how Neo4j knowledge graphs enhance LLM performance by adding structure and context.
Chapter 2, Demystifying RAG, breaks down the RAG architecture, explaining how it augments LLMs with external knowledge. It covers key components such as retrievers, indexes, and generators with real-world examples.
Chapter 3, Building a Foundational Understanding of Knowledge Graph for Intelligent Applications, explains the basics of knowledge graphs and how they model real-world relationships. It highlights Neo4j's property graph model and its role in powering intelligent, context-aware applications.
Chapter 4, Building Your Neo4j Graph with the Movies Dataset, walks through constructing a Neo4j knowledge graph using a real-world movies dataset. It covers data modeling, Cypher queries, and importing structured data for graph-based search and reasoning.
Chapter 5, Implementing Powerful Search Functionalities with Neo4j and Haystack, shows how to integrate Neo4j with Haystack to enable semantic and keyword-based search. It covers embedding generation, indexing, and retrieving relevant results using vector search.
Chapter 6, Exploring Advanced Knowledge Graph Capabilities, dives into multi-hop reasoning, context-aware search, and leveraging graph structure for deeper insights. It showcases how Neo4j enhances intelligent retrieval beyond basic keyword or vector search.
Chapter 7, Introducing the Neo4j Spring AI and LangChain4j Frameworks for Building Recommendation Systems, introduces the Spring AI and LangChain4j frameworks to build LLM applications with Neo4j.
Chapter 8, Constructing a Recommendation Graph with the H&M Personalization Dataset, builds on the data modeling approaches discussed in earlier chapters to load the H&M personalization dataset into a graph and build a better recommendation system.
Chapter 9, Integrating LangChain4j and Spring AI with Neo4j, provides a step-by-step guide to building Spring AI and LangChain4j applications to augment the graph by leveraging LLM chat APIs and the GraphRAG approach. It also covers embedding generation and adding these embeddings to a graph for machine learning purposes.
Chapter 10, Creating an Intelligent Recommendation System, explains how we can leverage Graph Data Science algorithms to further enhance the knowledge graph to provide better recommendations. It also discusses vector search and why it is not enough to provide good recommendations, as well as how leveraging KNN similarity and community detection gives us better results.
Chapter 11, Choosing the Right Cloud Platform for GenAI Application, compares major cloud platforms for deploying GenAI applications, focusing on scalability, cost, and AI/ML capabilities. It guides you in selecting the best-fit environment for your use case.
Chapter 12, Deploying Your Application on Google Cloud, provides a step-by-step guide to deploying your GenAI application on Google Cloud. It covers services such as Vertex AI, Cloud Functions, and Firebase for scalable and efficient deployment.
Chapter 13, Epilogue, reflects on the journey of building intelligent applications with GenAI and Neo4j. It summarizes key takeaways and highlights future opportunities in the evolving AI ecosystem.
To get the most out of this book
To fully benefit from this book, you should have a basic understanding of databases, familiarity with Neo4j and its Cypher query language, and a working knowledge of LLMs and GenAI concepts. Prior experience with Python and Java will also be helpful for implementing the code examples and working with frameworks such as Haystack, as well as LangChain4j and Spring AI for Java-based applications.
You'll be guided through building and deploying intelligent applications, so you may need to create free accounts on platforms such as Neo4j AuraDB, Google Cloud Platform (GCP), and OpenAI (or equivalent embedding providers). While no special hardware is required, a machine with at least 8 GB RAM and internet access is recommended for smooth development and testing.
Download the example code files and database dump
The code bundle for the book is hosted on GitHub at https://github.com/PacktPublishing/Building-Neo4j-Powered-Applications-with-LLMs. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing. Check them out!
You can download the database dump from this link: https://packt-neo4j-powered-applications.s3.us-east-1.amazonaws.com/Building+Neo4j-Powered+Applications+with+LLMs+Database+Dump+files.zip.
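If you prefer to restore the dump into a local Neo4j installation rather than AuraDB, a command along the following lines should work. This is a minimal sketch assuming Neo4j 5.x; the database name (neo4j) and the dump folder path are placeholders, so adjust them to match the files in the downloaded archive, and stop the database before running the command from the Neo4j home directory:
bin/neo4j-admin database load neo4j --from-path=/path/to/unzipped/dump --overwrite-destination=true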
Download the color images
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://packt.link/gbp/9781836206231.
Conventions used
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and X (Twitter) handles. For example: "Install the Hugging Face Transformers library for handling model-related functionalities: pip install transformers."
A block of code is set as follows:
documents = [
    "The IPL 2024 was a thrilling season with unexpected results.",
    ...
    "Dense Passage Retrieval (DPR) is a state-of-the-art technique for information retrieval."
]
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
tokenizer = T5Tokenizer.from_pretrained('t5-small', legacy=False)
model = T5ForConditionalGeneration.from_pretrained('t5-small')
Any command-line input or output is written as follows:
pip install numpy==1.26.4 neo4j transformers torch faiss-cpu datasets
Bold: Indicates a new term, an important word, or words that you see on the screen. For instance, words in menus or dialog boxes appear in the text like this. For example: "Artificial Intelligence (AI) is evolving beyond niche and specialized fields to become more accessible and able to assist with day-to-day tasks."
Warnings or important notes appear like this.
Tips and tricks appear like this.
Declaration
The authors acknowledge the use of cutting-edge AI, such as ChatGPT, with the sole aim of enhancing the language and clarity within the book, thereby ensuring a smooth reading experience for readers. It's important to note that the content itself has been crafted by the authors and edited by a professional publishing team.