Springing into AI - Part 10: RAG
Welcome back! In Part 9 of the series we looked at tool-calling as a means to execute custom business logic and present the resulting business data to the LLM, giving it the context to respond to prompts aimed at a particular business use case. In this part we continue addressing the shortcomings of an LLM's pre-trained, cut-off knowledge and adapt our solution further: this time we tackle the problem of presenting our own documents to the model using Retrieval Augmented Generation (RAG), so that end users can prompt for information contained in those documents. Excited? Let's get into it.
Retrieval Augmented Generation (RAG) - Theory
So, my dear friend, breaking the above down:
- RAG - ETL: This step revolves around pre-loading content: you or the domain experts supply the relevant information in the form of documents, which are ingested ahead of time. The ETL process is typically composed of a Reader, a Transformer and a Writer. Spring AI offers a series of classes we can use for each stage, provides default implementations out of the box, and gives us the flexibility to create our own custom ETL workflow.
- Document Reader: This is the first step of the ETL process, where the document(s) are read and passed to the transformer for further processing.
- Document Transformer: In the second step of the ETL, the transformer is responsible for operations such as splitting the document into smaller chunks or modifying the content, depending on the use case.
- Document Writer: This is the last step of the ETL, where the transformed documents, be they chunked or modified, are finally stored in the persistence store. For persistence solutions that support a vector store, vector embeddings are the cornerstone of the entire process: it is here that text and images are stored as high-dimensional vectors, which are later analysed during semantic similarity search against user prompts. In that search, vectors are compared using one of several distance metrics such as Cosine similarity, Euclidean distance or Negative Inner Product (a minimal illustration of cosine similarity follows this list). To appreciate the dimensionality involved: as humans our brains cope with 2D or 3D at most, while semantic similarity search routinely happens in 1536 (a common default) dimensions or more. How amazing is that?
- Chat Client: When a user prompts the bot, their prompt is compared against the embedded vectors described above to find semantically similar text. The results of that search, combined with the user's prompt, are then presented to the LLM, giving it enough context to produce a meaningful response. Spring AI offers this augmentation through the QuestionAnswerAdvisor, which takes in our VectorStore and carries out the entire sequence of events just described.
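To make the distance-metric idea a little more concrete, here is a minimal, illustrative sketch of cosine similarity between two embedding vectors. It is not code from the demo project, and in practice the vector store (pgvector in our case) performs this comparison natively and at scale; it is shown only to demystify what "semantically similar" means numerically.

```java
// Illustrative only: cosine similarity between two embedding vectors.
// A value close to 1.0 means the vectors point in almost the same direction,
// i.e. the texts they represent are semantically similar.
public final class CosineSimilarityDemo {

    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];   // dot product
            normA += a[i] * a[i]; // squared magnitude of a
            normB += b[i] * b[i]; // squared magnitude of b
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Tiny 3-dimensional toy vectors; real embeddings have hundreds or
        // thousands of dimensions.
        float[] prompt = {0.9f, 0.1f, 0.0f};
        float[] chunk  = {0.8f, 0.2f, 0.1f};
        System.out.printf("similarity = %.3f%n", cosine(prompt, chunk));
    }
}
```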
Retrieval Augmented Generation (RAG) - Playground
For our understanding, we will use a sample PDF, a bank statement with fake data, and present it as a use case to our application so that, as an end user, we can prompt our awesome AI application for information in it; fingers crossed it does everything we discussed above. If not, we will use the force to sway the application to behave the way we want it to 😏. Before we dive into the code, let's get some admin out of the way with regards to our setup, and then we will do a code walkthrough.
- Source Code: can be found here
- Dependencies:
- spring-ai-starter-vector-store-pgvector : Required for supporting the pgvector store
- spring-ai-advisors-vector-store : Required for using the QuestionAnswerAdvisor
- spring-ai-tika-document-reader : Required for reading PDFs
- Container:
- pgvector : Our chosen vector store implementation, which is a Postgres extension
- Embedding Model:
- mxbai-embed-large : The embedding model, which can be pulled from Ollama. It is this model that is used under the hood by the PgVectorStore to embed our chunked documents into vectors when they are written (a small sanity-check sketch follows this list).
- API Endpoint:
- http://localhost:8080/chat/rag : Test endpoint where we will prompt for information against the sample bank statement PDF.
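As a quick aside, below is a hedged sketch of what the auto-configured EmbeddingModel bean does for us. This class is not part of the demo project (the name EmbeddingSmokeTest and the sample text are made up), and the embed method's exact signature can differ slightly between Spring AI versions, but it illustrates that every piece of text is turned into a fixed-length vector of floats.

```java
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.boot.CommandLineRunner;
import org.springframework.stereotype.Component;

// Hypothetical sanity check, not part of the demo project: embeds a sample
// sentence with the auto-configured embedding model (mxbai-embed-large via
// Ollama in our setup) and prints the resulting vector's dimensionality.
@Component
class EmbeddingSmokeTest implements CommandLineRunner {

    private final EmbeddingModel embeddingModel;

    EmbeddingSmokeTest(EmbeddingModel embeddingModel) {
        this.embeddingModel = embeddingModel;
    }

    @Override
    public void run(String... args) {
        float[] vector = embeddingModel.embed("Closing balance as of 31 March");
        System.out.println("Embedding dimensions: " + vector.length);
    }
}
```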
Code Walkthrough
As mentioned, before we can use RAG we have to feed our documents through the ETL process. In our case this is a mocked bank statement that resides in src/main/resources/pdfs/bank.pdf. To ingest this document at startup we have a class called IngestionService that does the ETL for us. Let's look at that below:

```java
 1  public class IngestionService implements CommandLineRunner {
 2
 3      @Value("classpath:/pdfs/bank.pdf")
 4      private Resource resource;
 5
 6      @Autowired
 7      private VectorStore vectorStore;
 8
 9      @Override
10      public void run(String... args) throws Exception {
11
12          log.info("Beginning to ETL custom document..");
13
14          // Read
15          final DocumentReader documentReader = new TikaDocumentReader(resource);
16          List<Document> documentList = documentReader.read();
17          log.info("Document read");
18
19          // Chunk
20          final DocumentTransformer documentTransformer = new TokenTextSplitter();
21          List<Document> chunkedDocuments = documentTransformer.apply(documentList);
22          log.info("Splitted document into {}", chunkedDocuments.size());
23
24          // Write
25          vectorStore.add(chunkedDocuments);
26          log.info("Saved documents...");
27      }
28  }
```
- Line 1: We make use of Spring's CommandLineRunner, which runs custom code once the application has started. It is this that causes the run method declared on Line 10 to be invoked, allowing us to perform the ETL on the supplied document.
- Line 4: We supply the document used for our demo as a Resource, which is later handed to the Document Reader.
- Line 15-16: We make use of a TikaDocumentReader, an Apache Tika based reader that can parse PDFs, to read our supplied document.
- Line 20-21: Using the TokenTextSplitter as our DocumentTransformer, we split the supplied document into smaller chunks, each represented as its own Document.
- Line 25: We use the Spring Boot auto-configured vector store from our dependencies, which in our case is the PgVectorStore. It acts as our DocumentWriter and stores the embedded chunks in the pgvector datasource. pgvector itself is an extension to Postgres; for our use case we run it as a Docker container.
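That covers the ingestion half. For completeness, here is a hedged sketch of what the chat side behind the /chat/rag endpoint could look like. This is not the project's actual controller (check the linked source code for that); the class name, request parameter and the QuestionAnswerAdvisor import are assumptions based on Spring AI 1.x, but it shows the key idea: register the advisor with our VectorStore and let it perform the similarity search and prompt augmentation described earlier.

```java
import org.springframework.ai.chat.client.ChatClient;
// Note: the QuestionAnswerAdvisor package may differ between Spring AI versions.
import org.springframework.ai.chat.client.advisor.vectorstore.QuestionAnswerAdvisor;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical controller, not the project's actual implementation.
@RestController
@RequestMapping("/chat")
class RagChatController {

    private final ChatClient chatClient;

    RagChatController(ChatClient.Builder chatClientBuilder, VectorStore vectorStore) {
        // The QuestionAnswerAdvisor searches the vector store for chunks similar
        // to the user's prompt and appends them to the request as context.
        this.chatClient = chatClientBuilder
                .defaultAdvisors(new QuestionAnswerAdvisor(vectorStore))
                .build();
    }

    @GetMapping("/rag")
    String rag(@RequestParam String message) {
        return chatClient.prompt()
                .user(message)
                .call()
                .content();
    }
}
```

With the PDF ingested at startup and the advisor registered, a question such as "What was the closing balance in March?" should be answered from the bank statement rather than from the model's pre-trained knowledge alone.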
