Posts

Springing into AI - Part 11: Model Context Protocol (MCP) - Theory

Problem: As we introduce more LLMs, we have to adapt our custom tools to each kind of model. This makes them tedious to develop and difficult to maintain, and it adds complexity. This is often referred to as the N x M problem, where N is the number of models and M is the number of tools. So, say we are using 5 different models and have 10 different tools we would like to expose. We would end up with 50 model-tool integrations to manage, ouch!

Solution: Anthropic designed an open-source standard known as the Model Context Protocol (MCP) that, in its simplest form, acts as a universal interface allowing LLMs to discover, communicate with and securely use our external tools, resources, etc. This reduces the N x M problem to an N + M problem. So, given the problem statement, instead of 50 integrations to manage, we are down to a mere 15 (5 models plus 10 tools). It achieves this by having a client-se...
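To make the arithmetic concrete, here is a minimal sketch in plain Java. The class and method names are made up for illustration and are not part of MCP or Spring AI; it only counts integrations under the two approaches.

```java
class IntegrationCount {

    // Without a shared protocol, every model needs its own adapter for every tool: N x M.
    static int pointToPoint(int models, int tools) {
        return models * tools;
    }

    // With a shared protocol, each model and each tool implements the protocol once: N + M.
    static int sharedProtocol(int models, int tools) {
        return models + tools;
    }

    public static void main(String[] args) {
        int models = 5;
        int tools = 10;
        System.out.println("Without a shared protocol: " + pointToPoint(models, tools));
        System.out.println("With a shared protocol: " + sharedProtocol(models, tools));
    }
}
```

The gap widens quickly: adding an eleventh tool costs 5 more integrations in the first case and only 1 in the second.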

Springing into AI - Part 10: RAG

Problem: We have static resources such as company policies, employee guidelines, portfolios, etc. that we would like to expose to an LLM so that customers can query that information. Since LLMs are pre-trained models, they lack information about the content of our own resources. We want to empower our end users to query our resources for a wide variety of use cases.

Solution: Retrieval Augmented Generation (RAG) is the solution we have been waiting for to enrich our application. It can be summarized as a two-step process. In the first step, we store our content through a custom ETL process in a special kind of persistence store named a Vector Store, where each chunk is stored as an N-dimensional vector through a process called Vector Embedding. In the second step, a semantic similarity search is performed between these vectors and the user's chat prompt so that the prompt may be augmented before it is presented to the LLM for...
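The second step can be sketched roughly as follows. This is a toy in plain Java, not the Spring AI vector store API: the class name, the two sample chunks and their 3-dimensional vectors are all invented for illustration, whereas a real application would get its vectors from an embedding model. Cosine similarity is used here as one common way to compare vectors.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SimilaritySearchSketch {

    // Cosine similarity: dot(a, b) / (|a| * |b|); values closer to 1.0 mean more similar.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the stored chunk whose vector is most similar to the query vector.
    static String mostSimilar(Map<String, double[]> store, double[] query) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, double[]> entry : store.entrySet()) {
            double score = cosine(entry.getValue(), query);
            if (score > bestScore) {
                bestScore = score;
                best = entry.getKey();
            }
        }
        return best;
    }

    // Toy "vector store": made-up chunks with hand-written vectors (assumption, not real embeddings).
    static Map<String, double[]> toyStore() {
        Map<String, double[]> store = new LinkedHashMap<>();
        store.put("Employees get 20 days of annual leave.", new double[] {0.9, 0.1, 0.0});
        store.put("Expenses must be filed within 30 days.", new double[] {0.1, 0.9, 0.1});
        return store;
    }

    public static void main(String[] args) {
        String question = "How much leave do I get?";
        double[] questionVector = {0.8, 0.2, 0.1}; // pretend embedding of the question
        String context = mostSimilar(toyStore(), questionVector);
        // Augment the prompt with the retrieved chunk before it goes to the LLM.
        System.out.println("Context: " + context + "\nQuestion: " + question);
    }
}
```

A real vector store would index many chunks and return the top few matches rather than a single best one, but the idea of ranking by vector similarity is the same.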

Springing into AI - Part 9: Tool Calling

Problem: In GenAI applications, LLMs are pre-trained on information up to a point in time. That works when a prompt asks about information they were trained on. However, they fail to provide an adequate response to requests for information outside their knowledge bank. The demand for such information varies with the business use case and the problem we are trying to solve. The information can be static, such as a resource, or dynamic, obtained from our own knowledge bank, ranging from business data to a wide variety of application use cases.

Solution: For static resources such as documents, we have a solution that we will discover in upcoming posts. For dynamic information, such as data retrieved from our database or produced by custom business logic and returned to the LLM for it to make sense of, we can use a concept termed Tool Calling. In its simplest form this is merely your custom business logic done as a f...

Springing into AI - Part 8: Chat Memory

Problem: By default, interactions with an LLM are stateless: it has no clue what it was asked previously, so it cannot hold a meaningful, engaged conversation with our end user. You want to make your awesome GenAI application an enriching experience for the end user. If you are a superhero fan, you must have heard Batman reiterating "I am Batman" to villains every single time, as if he assumes they have no memory or recollection.

Solution: Spring offers the benefit of using the mighty Advisor(s). Through this useful feature we can tap into the user's input before it reaches the LLM and enrich it with some past chat history, thereby giving the LLM more context and giving the end user a "conversational" experience. The figure below paints a bird's-eye view of what our application would look like. In the figure below, we introduce a segment of Chat Memory. This segment can be a choice of either using "...
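The idea of enriching the user input with past chat history can be sketched in plain Java as below. This is not the Spring AI Advisor or Chat Memory API: the class name, the fixed-size window and the "role: text" format are assumptions made purely for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ChatMemorySketch {

    private final int maxMessages;
    private final Deque<String> history = new ArrayDeque<>();

    ChatMemorySketch(int maxMessages) {
        this.maxMessages = maxMessages;
    }

    // Remember one message, evicting the oldest ones once the window is full.
    void remember(String role, String text) {
        history.addLast(role + ": " + text);
        while (history.size() > maxMessages) {
            history.removeFirst();
        }
    }

    // Build the text sent to the model: remembered messages first, then the new user input.
    String enrich(String userInput) {
        StringBuilder prompt = new StringBuilder();
        for (String message : history) {
            prompt.append(message).append('\n');
        }
        return prompt.append("user: ").append(userInput).toString();
    }

    public static void main(String[] args) {
        ChatMemorySketch memory = new ChatMemorySketch(4);
        memory.remember("user", "My name is Bruce.");
        memory.remember("assistant", "Nice to meet you, Bruce.");
        // The model now sees the earlier exchange along with the new question.
        System.out.println(memory.enrich("What is my name?"));
    }
}
```

Capping the window keeps the prompt from growing without bound, at the cost of forgetting the oldest messages first.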