Coming from a background of Knowledge Graph (KG) backed Medical Search, I don't need to be convinced of the importance of manually curated structured knowledge to the quality of search results. Traditional search is rapidly being replaced by Generative AI using a technique called Retrieval Augmented Generation (RAG), where the pipeline produces an answer summarizing the retrieved search results instead of the ten blue links that the searcher previously had to parse and extract an answer from. In any case, I had been experimenting with using KGs to enhance RAG to support this intuition, and when Microsoft announced their work on GraphRAG, it felt good to be vindicated. So when Manning reached out to ask if I would be interested in reviewing the book Essential GraphRAG by Tomaž Bratanič and Oskar Hane, I jumped at the chance.
Both authors are from Neo4j, so it is not surprising that the search component is Neo4j as well, even for vector search, and that hybrid search here means vector + graph search (rather than the more common vector + lexical search). Most people nowadays would prefer a multi-backend setup that includes graph search alongside vector and lexical search, but the examples will still teach you (a) how to use Neo4j for vector search and (b) how to implement graph search with Neo4j. Since Neo4j is a leading graph database provider, this is useful information to know if you decide to incorporate graph search into your repertoire of tools, as you very likely will if you are reading this book.
The book is available under the Manning Early Access Program (MEAP) and is expected to be published in August 2025. It is currently organized into 8 chapters as follows:
Improving LLM accuracy -- here the authors introduce what LLMs are, what they are capable of, and their limitations when used for question answering, i.e. not knowing about events that occurred after their training cutoff, their tendency to hallucinate when they cannot answer a question from the knowledge they were trained on, and their ignorance of company-confidential or otherwise private information, since they are trained on public data only. They cover solutions to mitigate these problems, i.e. fine-tuning and RAG, and why RAG is the better alternative in most cases. Finally, they cover why KGs are the best general-purpose datastore for RAG pipelines.
Vector Similarity Search and Hybrid Search -- here the authors cover the fundamentals of vector search, such as vector similarity functions, embedding models used to support vector search, and the reasoning behind chunking. They describe what a typical RAG pipeline looks like, although as mentioned earlier, they showcase Neo4j's vector search capabilities instead of relying on more popular vector search alternatives. I thought it was good information though, since I wasn't aware that Neo4j supported vector search. They also cover hybrid search, in this case vector + graph search (this is a book about GraphRAG after all), and I can definitely see graph search as one of the components of a hybrid search pipeline.
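To make the vector search fundamentals concrete, here is a minimal sketch of cosine similarity based ranking in plain Python. The embeddings and document IDs are made up for illustration; a real pipeline would get embeddings from a model and, in the book's setup, store them in Neo4j's vector index:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy query and document embeddings (invented values).
query = [0.2, 0.7, 0.1]
docs = {
    "doc-1": [0.2, 0.7, 0.1],   # points in the same direction as the query
    "doc-2": [0.9, 0.1, 0.0],
}

# Rank documents by similarity to the query, most similar first.
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
```

In a hybrid setup, a score like this would be combined with graph-derived signals before the final ranking.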
Advanced Vector Retrieval Strategies -- in this chapter, the authors introduce some interesting techniques to make your graph search produce more relevant context for your GraphRAG pipeline. Techniques on the query side include Step Back Prompting (SBP), which looks for more generic concepts and then drills down using graph search to improve recall, and the Parent Document Retriever pattern, which retrieves the parent documents of the chunks that matched rather than the chunks themselves. On the indexing side, they talk about creating additional synthetic chunks that summarize actual chunks and can be queried alongside them, and about representing a document chunk as pre-generated questions it can answer instead of its text content.
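The Parent Document Retriever pattern is easy to sketch with an in-memory toy index. The chunk and parent data below are invented for illustration, and the naive term-overlap matching stands in for real vector search:

```python
# Each chunk remembers which parent document it was split from.
chunks = [
    {"id": "c1", "parent": "p1", "text": "GraphRAG combines graphs with RAG."},
    {"id": "c2", "parent": "p1", "text": "Neo4j stores nodes and relationships."},
    {"id": "c3", "parent": "p2", "text": "Vector search ranks by embedding similarity."},
]
parents = {
    "p1": "Full section about GraphRAG and Neo4j ...",
    "p2": "Full section about vector search ...",
}

def parent_document_retrieve(query_terms):
    """Match on chunks (small, precise), but return their parent documents
    (large, context-rich) for the LLM to read."""
    hit_parents = {
        c["parent"] for c in chunks
        if any(t.lower() in c["text"].lower() for t in query_terms)
    }
    return [parents[p] for p in sorted(hit_parents)]

results = parent_document_retrieve(["GraphRAG"])
```

The point of the pattern is that small chunks match queries more precisely, while their larger parents give the LLM enough surrounding context to answer well.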
Text2Cypher -- in this chapter, the authors show how an LLM can be prompted using Few Shot Learning (FSL) to generate Cypher queries from natural language. Users type in a query in natural language, knowing nothing about the schema of the underlying graph database, and the LLM, through detailed prompts and examples, translates it into a Cypher query. The authors also reference pre-trained models from Neo4j that have been fine-tuned for this task. While these models are generally not as effective as prompting a large general-purpose LLM, they are more efficient on large volumes of data.
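A few-shot Text2Cypher prompt can be assembled as plain text. The schema string and the example question/Cypher pairs below are hypothetical, not taken from the book:

```python
# Hypothetical few-shot examples; in practice these reflect your own graph schema.
EXAMPLES = [
    ("Who directed The Matrix?",
     "MATCH (p:Person)-[:DIRECTED]->(m:Movie {title: 'The Matrix'}) RETURN p.name"),
    ("Which actors appeared in Inception?",
     "MATCH (p:Person)-[:ACTED_IN]->(m:Movie {title: 'Inception'}) RETURN p.name"),
]

SCHEMA = "(:Person)-[:DIRECTED|ACTED_IN]->(:Movie {title})"

def build_text2cypher_prompt(question):
    """Assemble a few-shot prompt that asks the LLM to emit only a Cypher query."""
    lines = [
        f"Graph schema: {SCHEMA}",
        "Translate the question into a Cypher query. Return only Cypher.",
        "",
    ]
    for q, c in EXAMPLES:
        lines += [f"Question: {q}", f"Cypher: {c}", ""]
    lines += [f"Question: {question}", "Cypher:"]
    return "\n".join(lines)

prompt = build_text2cypher_prompt("Who acted in The Matrix?")
```

The prompt ends with a dangling `Cypher:` so the LLM's completion is the query itself, ready to run against the database.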
Agentic RAG -- Agentic RAG allows autonomous or semi-autonomous LLM-backed software components, called Agents, to modify and enhance the standard RAG control flow. One change could be for an Agent (the Router) to determine query intent and call one or more retrievers from the available pool, or for another (the Critic) to determine whether the answer generated so far adequately addresses the user's query and, if not, to rerun the pipeline with a modified query until the query is fully answered. The authors go on to describe a system (with code) consisting of a Router, a Critic, and several Retrieval Agents.
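The Router and Critic roles can be sketched with placeholder logic; a real agentic system would make both decisions with LLM calls, and the keyword rules and retriever names below are invented for illustration:

```python
def route(question):
    """Toy Router: pick a retriever based on surface cues in the question.
    A production Router would ask an LLM to classify query intent instead."""
    q = question.lower()
    if any(w in q for w in ("how many", "count", "average")):
        return "text2cypher"        # aggregate questions -> structured graph query
    if any(w in q for w in ("related", "connected", "between")):
        return "graph_retriever"    # relationship questions -> graph traversal
    return "vector_retriever"       # default: semantic similarity search

def critic(answer):
    """Toy Critic: flag empty or evasive answers so the pipeline can
    rewrite the query and retry."""
    return bool(answer.strip()) and "i don't know" not in answer.lower()

choice = route("How many movies did Lana Wachowski direct?")
```

The control flow then becomes a loop: route, retrieve, generate, critique, and repeat with a modified query until the Critic accepts the answer (or a retry budget runs out).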
Constructing Knowledge Graphs with LLMs -- this chapter focuses on index creation. Search is traditionally done on unstructured data such as text documents. This chapter describes using an LLM to extract entities of known types (PERSON, ORGANIZATION, LOCATION, etc.), followed by a manual or semi-manual graph modeling step to set up relations between these extracted entities and build a schema. It then talks a little about converting specific query types into structured Cypher queries that leverage this schema.
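The extraction step typically yields entity-relation triples, which can then be turned into idempotent Cypher MERGE statements for loading into the graph. The triple format and labels below are assumptions for illustration, and production code should use query parameters rather than string interpolation:

```python
def triples_to_cypher(triples):
    """Turn (head, head_type, relation, tail, tail_type) triples, as an LLM
    might extract them, into idempotent Cypher MERGE statements.
    NOTE: real code should pass values as query parameters, not f-strings,
    to avoid Cypher injection and quoting bugs."""
    stmts = []
    for head, head_type, rel, tail, tail_type in triples:
        stmts.append(
            f"MERGE (a:{head_type} {{name: '{head}'}}) "
            f"MERGE (b:{tail_type} {{name: '{tail}'}}) "
            f"MERGE (a)-[:{rel}]->(b)"
        )
    return stmts

# Hypothetical LLM extraction output for one sentence.
extracted = [("Marie Curie", "PERSON", "WORKED_AT", "Sorbonne", "ORGANIZATION")]
stmts = triples_to_cypher(extracted)
```

Using MERGE rather than CREATE keeps the load idempotent: re-running extraction over the same text does not duplicate nodes or relationships.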
Microsoft GraphRAG Implementation -- this chapter deals specifically with Microsoft's GraphRAG implementation. While most people think of GraphRAG as any infrastructure that supports incorporating graph search into a RAG pipeline, Microsoft defines it as a multi-step recipe to build a KG from your data sources and use results from that KG to support a RAG pipeline. The steps involved are structured extraction and community detection, followed by summarization of community chunks into synthetic nodes. To some extent this is similar to Chonkie's Semantic Double Pass Merging (SDPM) chunker, except that the size of the skip window is unbounded. These synthetic chunks can be useful for answering global questions that span multiple ideas across the corpus. However, as the authors show, the approach can be effective for local queries as well.
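As a rough intuition for the community detection step, here is a plain connected-components grouping over extracted entity edges. This is only a stand-in (Microsoft's pipeline uses the Leiden community detection algorithm, which also splits densely connected regions), and the entity edges below are made up:

```python
from collections import defaultdict

def communities(edges):
    """Group entities into communities via connected components -- a crude
    stand-in for the Leiden algorithm used by Microsoft's GraphRAG pipeline."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, groups = set(), []
    for node in adj:
        if node in seen:
            continue
        stack, group = [node], set()
        while stack:
            n = stack.pop()
            if n in seen:
                continue
            seen.add(n)
            group.add(n)
            stack.extend(adj[n] - seen)
        groups.append(group)
    return groups

# Invented entity co-occurrence edges; yields two separate communities.
edges = [("Curie", "Sorbonne"), ("Sorbonne", "Paris"), ("Turing", "Bletchley")]
groups = communities(edges)
```

Each resulting community would then be summarized by an LLM into a synthetic node, which is what lets the pipeline answer global, corpus-spanning questions.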
RAG Application Evaluation -- because of the stochastic nature of LLMs, evaluating RAG pipelines presents some unique challenges. Here these challenges are investigated with particular reference to GraphRAG systems, i.e. where the retrieval context is provided by Knowledge Graphs. The authors describe some metrics from the RAGAS library, where LLMs are used to compute these metrics from the outputs at different stages of the RAG pipeline. The chapter also discusses ideas for setting up an evaluation dataset. The metrics covered in the examples are RAGAS context recall, faithfulness, and answer correctness.
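As a rough intuition for one of these metrics, here is a toy version of context recall that checks which reference-answer claims appear verbatim in the retrieved context. RAGAS itself uses an LLM judge to decide claim support rather than substring matching, and the claims and context below are invented:

```python
def context_recall(reference_claims, retrieved_context):
    """Toy context recall: fraction of reference-answer claims that are
    supported by the retrieved context. RAGAS delegates the 'is this claim
    supported?' judgment to an LLM instead of a substring check."""
    supported = sum(
        1 for claim in reference_claims
        if claim.lower() in retrieved_context.lower()
    )
    return supported / len(reference_claims)

context = "Marie Curie won two Nobel Prizes. She worked at the Sorbonne."
score = context_recall(
    ["won two Nobel Prizes", "worked at the Sorbonne", "born in Warsaw"],
    context,
)
# 2 of 3 claims are supported by the context, so the score is ~0.67
```

Faithfulness and answer correctness follow the same pattern, but compare the generated answer against the context and the reference answer respectively.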
Overall, the book takes a very practical, hands-on approach to the subject. It is filled with code examples and practical advice for leveraging KGs in RAG, using Large Language Models (LLMs) to build KGs, and evaluating such pipelines. If you are thinking of incorporating graph search into your search pipeline, be it traditional, hybrid, RAG, or agentic, you will find the information in this book useful and beneficial.