Springing into AI - Part 10: RAG
Problem
We have static resources such as company policies, employee guidelines, portfolios etc. that we would like to expose to an LLM, offering customers the ability to query that information. Since LLMs are pre-trained models, they lack knowledge of the content in our own resources. We want to empower our end users with the ability to query our resources for a wide variety of use cases.
Solution
So my dear friend, from above:
- RAG - ETL: As mentioned above, this step revolves around pre-loading content, where you or the domain experts provide the relevant information in the form of documents. The ETL process is typically composed of a Reader, a Transformer and a Writer. Spring AI offers a series of classes that we can use to adapt the ETL pipeline, or the flexibility to create our own custom ETL workflow, and it provides default implementations for each stage out of the box.
- Document Reader: This is the first step of the ETL process, where document(s) are read and passed to the transformer for further processing.
- Document Transformer: In the second step of the ETL, the transformer is responsible for operations such as splitting the document into smaller chunks, or modifying the content, depending on the use case.
- Document Writer: This is the last step of the ETL, where the transformed documents, be they chunked or modified, are finally stored in the persistence store. For persistence solutions that support a vector store, vector embeddings are the cornerstone of the entire process: it is here that text and images are stored as multi-dimensional vectors, which are later used for semantic similarity search against user prompts. During semantic similarity search, vectors are compared using one of several distance metrics such as Cosine, Euclidean or Negative Inner Product. To appreciate the dimensionality involved: as humans, our brains can at most visualize 2D or 3D spaces, while AI semantic similarity search happens in 1536 (default) dimensions or more. How amazing is that.
- Chat Client: When a user prompts the bot, their prompt is compared against the embedded vectors mentioned above to find semantically similar texts. The result of the search, in combination with the user prompt, is then presented to the LLM, giving it enough context to produce a meaningful response. Spring AI offers this augmentation functionality through the QuestionAnswerAdvisor, which takes in our VectorStore and carries out the entire sequence of events discussed.
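To make the distance metrics above concrete, here is a minimal, framework-free sketch of cosine similarity between tiny, made-up "embedding" vectors. Real embeddings from a model such as mxbai-embed-large have around a thousand dimensions, but the arithmetic is identical; the vector values below are purely illustrative:

```java
public class CosineSimilarityDemo {

    // Cosine similarity: dot(a, b) / (|a| * |b|). Ranges from -1 to 1,
    // where values closer to 1 mean the vectors point in the same direction.
    static double cosineSimilarity(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy 3-dimensional "embeddings" (hypothetical values for illustration)
        double[] prompt = {0.9, 0.1, 0.0};
        double[] chunkAboutFees = {0.8, 0.2, 0.1};
        double[] chunkAboutPets = {0.0, 0.1, 0.9};

        System.out.printf("fees chunk: %.3f%n", cosineSimilarity(prompt, chunkAboutFees));
        System.out.printf("pets chunk: %.3f%n", cosineSimilarity(prompt, chunkAboutPets));
        // The fees chunk scores much higher, so it would be retrieved first.
    }
}
```

The vector store runs exactly this kind of comparison (with the configured distance metric) between the embedded user prompt and every stored chunk, returning the top matches.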
Playground
For our understanding, we will use a sample PDF, a bank statement with fake data, and present it as a use case to our application so that, as end users, we can prompt our awesome AI application for information from it, and fingers crossed it does whatever we have discussed above. If not, we will use the force to sway the application to behave the way we want it to 😏. Before we dive into the code, let's get some admin out of the way with regards to our setup, and then we will do a code walkthrough.
- Source Code: can be found here
- Dependencies:
- spring-ai-starter-vector-store-pgvector : Required for supporting pgvector store
- spring-ai-advisors-vector-store : Required for using QuestionAnswerAdvisor
- spring-ai-tika-document-reader : Required for reading PDFs
- Container:
- pgvector : Our chosen vector store implementation which is a Postgres extension
- Embedding Model:
- mxbai-embed-large : Our embedding model, which can be pulled from Ollama. It is this model that is used under the hood by PgVectorStore to embed our chunked documents into vectors.
- API Endpoint:
- http://localhost:8080/chat/rag : Test endpoint where we will prompt for information against the sample bank statement PDF.
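The /chat/rag endpoint can be wired up roughly as follows. This is a minimal sketch assuming Spring AI 1.x: QuestionAnswerAdvisor and VectorStore are the real Spring AI types, but the controller class name, method name and request parameter are hypothetical placeholders, not taken from the article's repository:

```java
@RestController
@RequestMapping("/chat")
public class RagController {

    private final ChatClient chatClient;

    public RagController(ChatClient.Builder builder, VectorStore vectorStore) {
        // The advisor retrieves semantically similar chunks from the vector
        // store and augments the prompt with them before it reaches the LLM.
        this.chatClient = builder
                .defaultAdvisors(new QuestionAnswerAdvisor(vectorStore))
                .build();
    }

    @GetMapping("/rag")
    public String rag(@RequestParam String message) {
        return chatClient.prompt()
                .user(message)
                .call()
                .content();
    }
}
```

With this wiring, every call to the endpoint transparently performs the retrieve-then-generate sequence described in the Solution section.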
Code Walkthrough
As mentioned, before we can use RAG we have to provide our supporting documents via the ETL process. In our case this is a mocked bank statement, a static resource that resides at src/main/resources/pdfs/bank.pdf. To ingest this document at startup we have a class called IngestionService that does the ETL for us. Let's look at that below:

```java
1   public class IngestionService implements CommandLineRunner {
2
3       @Value("classpath:/pdfs/bank.pdf")
4       private Resource resource;
5
6       @Autowired
7       private VectorStore vectorStore;
8
9       @Override
10      public void run(String... args) throws Exception {
11
12          log.info("Beginning to ETL custom document..");
13
14          // Read
15          final DocumentReader documentReader = new TikaDocumentReader(resource);
16          List<Document> documentList = documentReader.read();
17          log.info("Document read");
18
19          // Chunk
20          final DocumentTransformer documentTransformer = new TokenTextSplitter();
21          List<Document> chunkedDocuments = documentTransformer.apply(documentList);
22          log.info("Split document into {} chunks", chunkedDocuments.size());
23
24          // Write
25          vectorStore.add(chunkedDocuments);
26          log.info("Saved documents...");
27      }
28  }
```
- Line 1: We make use of Spring's CommandLineRunner, which runs custom code once the application starts up. It is here that the method declared on Line 10 is invoked, allowing us to perform ETL on the supplied document.
- Line 4: We supply the demo document as a Resource, which is handed to the Document Reader.
- Line 15-16: We make use of a TikaDocumentReader, a reader capable of handling PDFs, to read our supplied document.
- Line 20-21: Using the TokenTextSplitter as our DocumentTransformer, we split the supplied document into chunks, each an individual Document.
- Line 25: We use the Spring Boot auto-configured vector store from our dependencies, in our case the PgVectorStore, which acts as a DocumentWriter and stores the vectors in the pgvector datasource. pgvector is an extension to Postgres; for our use case we run it as a Docker container.
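To build intuition for what the splitter on lines 20-21 does, here is a framework-free sketch of naive fixed-size chunking with overlap. The real TokenTextSplitter counts model tokens rather than words and has more nuanced defaults, so treat this purely as an illustration of the idea:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NaiveChunker {

    // Split text into chunks of `size` words, repeating `overlap` words
    // between neighbouring chunks so context is not lost at the boundaries.
    // Assumes size > overlap.
    static List<String> chunk(String text, int size, int overlap) {
        String[] words = text.split("\\s+");
        List<String> chunks = new ArrayList<>();
        for (int start = 0; start < words.length; start += size - overlap) {
            int end = Math.min(start + size, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            if (end == words.length) break;
        }
        return chunks;
    }

    public static void main(String[] args) {
        // A made-up fragment standing in for text extracted from bank.pdf
        String statement = "Opening balance 1000 Deposit 200 Withdrawal 50 "
                + "Interest 5 Fees 2 Closing balance 1153";
        chunk(statement, 6, 2).forEach(c -> System.out.println("chunk: " + c));
    }
}
```

Each chunk is then embedded and written to the vector store individually, which is why retrieval can later return just the relevant slice of the statement instead of the whole document.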