Posts

Showing posts from August, 2025

Springing into AI - Part 8: Chat Memory

Problem: By default, interactions with an LLM are stateless: the model has no record of what was asked previously, so it cannot hold a meaningful, engaged conversation with the end user. You want your awesome GenAI application to be an enriching experience for that user. If you are a superhero fan, you must have heard Batman reiterating "I am Batman" to villains every single time, as if he assumes they have no memory or recollection. Solution: SpringAI offers the mighty Advisor(s). Through this feature we can tap into the user's request before it reaches the LLM and enrich it with past chat history, giving the LLM more context and the end user a "conversational" experience. The figure below gives a bird's-eye view of the resulting application. In the figure below, we introduce a segment of Chat Memory. This segment can be a choice of either using "...

Springing into AI - Part 7: Observability

Problem: Tokens, as we know, are the currency of our GenAI applications. Running LLMs locally on our machine is fine for experimentation, but production-grade applications often rely on external providers, and costs start to accrue. Wouldn't it be cool if we had some insight into how our GenAI application is being used, in terms of tokens consumed and requests made, so we can take remedial action should we wish? Solution: Enter Observability, a fundamental utility to have in our arsenal that can be our eyes into the application. It cannot be stressed enough how powerful Observability can be for enterprise applications. Someone please picture Darth Vader going, "If only you knew the power of Observability". In the development community there exists a plethora of tools, some commercial and some open source. Prometheus and Grafana are the most commonly used tools ado...

Springing into AI - Part 6: Chat Client

Problem: We have chosen the LLM we want to use, and we even have the programming language and framework for creating our awesome GenAI application. How do we interact with the LLM through the SpringAI framework? Solution: SpringAI offers us a powerful interface called ChatClient. This is the core API that we as developers use to let our applications interact with the configured LLM. The heavy lifting of the underlying operations is all done by SpringAI. From an architecture perspective, at a high level, the fundamental design of our GenAI applications would look like the figure below. In the figure above, the user (that is, us or a front-end application) calls the API endpoints our application exposes. These APIs then internally use the ChatClient offered by SpringAI, configured to interact with a locally running LLM. Machine Setup: For our adventurous journey, we need ...

Springing into AI - Part 5: SpringAI

Problem: You have fueled your curiosity about GenAI and chosen the LLM you want to work with. Excitement has you on the edge of your seat, and you cannot wait to get your hands dirty integrating it with your applications. Being a Java developer, you wish there were a framework that meets enterprise standards, is actively maintained, keeps up with the latest design trends, and has a growing community. Something magical that is easy to use and takes the heavy lifting away so you can do what you do best. Solution: The Spring Framework has been the de facto standard for enterprise applications over the years, so it is no surprise that the same team brings us SpringAI, which has evolved rapidly into a powerful framework that lets us developers interact with different LLMs in the simplest way possible. The figure below shows an overview of the SpringAI ecosystem. From the figure above: LLM Models: Thro...