[Voxxed Amsterdam 2025] From Zero to AI: Building Smart Java or Kotlin Applications with Spring AI

At Voxxed Days Amsterdam 2025, Christian Tzolov, a Spring AI team member at VMware and lead of the MCP Java SDK, delivered a session titled “From Zero to AI: Building Smart Java or Kotlin Applications with Spring AI.” Spanning nearly two hours, the session was a deep dive into integrating generative AI into Java and Kotlin applications using Spring AI, a framework designed to connect enterprise data and APIs with AI models. Through live coding demos, Tzolov showcased practical use cases, including conversation memory, tool/function calling, retrieval-augmented generation (RAG), and multi-agent systems, while addressing challenges like AI hallucinations and observability. Attendees left with actionable insights for building AI-driven applications on Spring AI’s portable abstractions and the Model Context Protocol (MCP).

Overcoming LLM Limitations with Spring AI

Tzolov began by outlining the challenges of large language models (LLMs): they are stateless, frozen in time, and lack domain-specific knowledge, requiring developers to provide context, manage state, and handle interactions with external systems. Spring AI addresses these issues with high-level abstractions like the ChatClient, similar to Spring’s RestClient or WebClient, enabling seamless integration with models like OpenAI’s GPT-4o, Anthropic’s Claude, or open-source alternatives like LLaMA. A live demo of a flight booking assistant illustrated these concepts. Tzolov started with a basic Spring Boot application connected to OpenAI, demonstrating a simple chat interface. To ground the model, he used system prompts to define its behavior as a customer support agent for “Fun Air,” ensuring contextually appropriate responses. He then introduced conversation memory using Spring AI’s ChatMemoryAdvisor, which retains a chronological list of messages to maintain state, addressing the stateless nature of LLMs. For long-term memory, Tzolov employed a vector store (Chroma) to store conversation history semantically, retrieving only relevant data for queries, thus overcoming context window limitations. This setup allowed the assistant to respond accurately to queries like “What is my flight status?” by fetching booking details (e.g., booking number 103) from a mock database.
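
To make the short-term memory idea concrete, here is a minimal plain-Java sketch of a sliding-window message list with a fixed system prompt. It is an illustration only: WindowedChatMemory, Message, and the window size are hypothetical names and values chosen for this example, not Spring AI classes or the code shown in the session.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Plain-Java illustration only; these types are hypothetical, NOT Spring AI classes.
final class WindowedChatMemory {
    record Message(String role, String text) {}

    private final Message systemPrompt;                 // always sent first
    private final Deque<Message> history = new ArrayDeque<>();
    private final int maxMessages;                      // size of the sliding window

    WindowedChatMemory(String systemText, int maxMessages) {
        this.systemPrompt = new Message("system", systemText);
        this.maxMessages = maxMessages;
    }

    // Record one turn, evicting the oldest messages once the window is full.
    void add(String role, String text) {
        history.addLast(new Message(role, text));
        while (history.size() > maxMessages) {
            history.removeFirst();
        }
    }

    // Messages to send with the next request: system prompt, then history, oldest first.
    List<Message> promptMessages() {
        List<Message> out = new ArrayList<>();
        out.add(systemPrompt);
        out.addAll(history);
        return out;
    }

    public static void main(String[] args) {
        WindowedChatMemory memory = new WindowedChatMemory(
                "You are a customer support agent for Fun Air.", 2);
        memory.add("user", "My booking number is 103.");
        memory.add("assistant", "Thanks, I found booking 103.");
        memory.add("user", "What is my flight status?");
        memory.promptMessages().forEach(m -> System.out.println(m.role() + ": " + m.text()));
    }
}
```

With a window of two messages, the turn that carried the booking number has already been evicted by the time the status question is asked. That loss is the kind of gap the vector-store-backed long-term memory described above is meant to close.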

Enhancing AI Applications with Tool Calling and RAG

To enable LLMs to interact with external systems, Tzolov demonstrated tool/function calling, where Spring AI wraps existing services (e.g., a flight booking service) as tools with metadata (name, description, JSON schema). In the demo, the assistant used a getBookingDetails tool to query a database, allowing it to provide accurate flight status updates. Tzolov emphasized the importance of descriptive tool metadata to guide the LLM in deciding when and how to invoke tools, reducing the risk of misinterpretation. For domain-specific knowledge, he introduced prompt stuffing (injecting additional context into prompts) and RAG for dynamic data retrieval. In a RAG demo, cancellation policies were loaded into a Chroma vector store, chunked into meaningful segments, and retrieved dynamically based on user queries. This approach mitigated hallucinations, as seen when the assistant correctly cited a 50% refund policy for premium economy bookings within 40 hours. Tzolov highlighted advanced RAG techniques, such as data compression and reranking, supported by Spring AI’s APIs, and stressed the importance of evaluating responses to ensure relevance, referencing frameworks like those from contributor Thomas Vitale.
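
The retrieve-then-stuff flow behind RAG can be sketched in plain Java. The example below is a toy under stated assumptions: word counts stand in for a real embedding model, an in-memory list stands in for the Chroma vector store, and TinyRag with all its methods is a hypothetical name, not a Spring AI API. The policy sentences are placeholder text written for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy retrieval-augmented prompt builder. Hypothetical names; not Spring AI or Chroma APIs.
final class TinyRag {
    // Stand-in for an embedding model: lower-cased word counts.
    static Map<String, Integer> embed(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase().split("[^a-z0-9%]+")) {
            if (!word.isEmpty()) counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // Cosine similarity between two sparse word-count vectors.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
            normA += e.getValue() * e.getValue();
        }
        for (int v : b.values()) normB += v * v;
        return (normA == 0 || normB == 0) ? 0 : dot / Math.sqrt(normA * normB);
    }

    // Return the k chunks most similar to the question, best match first.
    static List<String> retrieve(List<String> chunks, String question, int k) {
        Map<String, Integer> q = embed(question);
        List<String> sorted = new ArrayList<>(chunks);
        sorted.sort(Comparator.comparingDouble((String c) -> -cosine(embed(c), q)));
        return sorted.subList(0, Math.min(k, sorted.size()));
    }

    // "Stuff" the retrieved chunks into the prompt ahead of the question.
    static String buildPrompt(List<String> context, String question) {
        return "Answer using only this context:\n" + String.join("\n", context)
                + "\n\nQuestion: " + question;
    }

    public static void main(String[] args) {
        // Placeholder policy chunks, invented for illustration.
        List<String> chunks = List.of(
                "Premium economy bookings cancelled within 40 hours receive a 50% refund.",
                "Checked baggage allowance is one bag per passenger.",
                "Seat selection opens at check-in.");
        String question = "What refund do I get if I cancel my premium economy booking?";
        System.out.println(buildPrompt(retrieve(chunks, question, 1), question));
    }
}
```

Only the best-matching chunk reaches the prompt, which keeps the context window small and gives the model a concrete passage to cite. A production setup would replace the word-count embedding with a real embedding model and could add the reranking step mentioned above.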

Building Multi-Agent Systems with MCP

Tzolov explored the Model Context Protocol (MCP), initiated by Anthropic, as a standardized way to integrate AI applications with external tools and resources across platforms. Using Spring AI’s MCP Java SDK, he demonstrated how to build and consume MCP-compliant tools. In one demo, a Spring AI application connected to MCP servers for Brave Search (JavaScript-based) and file system access, enabling an agent to answer queries about Spring AI support for MCP and write summaries to a file. Another demo reversed the setup, exposing a Spring AI weather tool (using Open-Meteo) as an MCP server, accessible by third-party clients like Claude Desktop via standard I/O or HTTP/SSE transports. Tzolov explained MCP’s bidirectional architecture, where clients can act as servers, supporting features like sampling (allowing servers to request LLM processing from clients). He addressed security concerns, noting Spring AI’s integration with Spring Security (referencing a blog by Daniel Garnier-Moiroux) to secure MCP servers with OAuth 2.1. The session also introduced agentic systems, where LLMs act as a “brain” for planning and tools as a “body” for interaction, with an agentic loop evaluating and refining responses. A work-in-progress demo showcased an orchestration pattern, delegating tasks to searcher, fact-checker, and writer agents, to be published on the Spring AI Community Portal.
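
MCP exposes tools through two operations, tools/list for discovery and tools/call for invocation, with each tool described by a name, a description, and a JSON input schema. The plain-Java sketch below models just that shape in memory, with no JSON-RPC framing and no transport. ToyToolServer, Tool, and the get_temperature tool are hypothetical names for illustration, not the MCP Java SDK API, and the handler returns a stub string instead of calling Open-Meteo.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// In-memory model of tool discovery and invocation; no JSON-RPC or transport.
// Hypothetical types, NOT the MCP Java SDK API.
final class ToyToolServer {
    record Tool(String name, String description, String inputSchema,
                Function<Map<String, Object>, String> handler) {}

    private final Map<String, Tool> tools = new LinkedHashMap<>();

    void register(Tool tool) {
        tools.put(tool.name(), tool);
    }

    // Counterpart of "tools/list": the metadata a client passes on to the model.
    List<String> listTools() {
        return tools.values().stream()
                .map(t -> t.name() + ": " + t.description())
                .toList();
    }

    // Counterpart of "tools/call": look the tool up by name and run it with the arguments.
    String callTool(String name, Map<String, Object> arguments) {
        Tool tool = tools.get(name);
        if (tool == null) {
            throw new IllegalArgumentException("Unknown tool: " + name);
        }
        return tool.handler().apply(arguments);
    }

    public static void main(String[] args) {
        ToyToolServer server = new ToyToolServer();
        server.register(new Tool(
                "get_temperature",
                "Returns the current temperature for a city",
                "{\"type\":\"object\",\"properties\":{\"city\":{\"type\":\"string\"}},\"required\":[\"city\"]}",
                // Stub handler: a real tool would query a weather service here.
                arguments -> "stubbed temperature for " + arguments.get("city")));

        server.listTools().forEach(System.out::println);
        System.out.println(server.callTool("get_temperature",
                Map.<String, Object>of("city", "Amsterdam")));
    }
}
```

The description string matters as much as the handler: it is the only signal the model gets when deciding whether and how to call the tool.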

Observability and Multimodality for Robust AI Systems

Observability was a key focus, as Tzolov underscored its importance in debugging complex AI interactions. Spring AI integrates with Micrometer to provide metrics (e.g., token usage, model throughput, latency), tracing, and logging (via Loki). A dashboard demo displayed real-time metrics for the flight booking assistant, highlighting tool calls and errors, crucial for diagnosing issues in agentic systems. Tzolov also explored multimodality, demonstrating a voice assistant using OpenAI’s GPT-4o audio preview, which processes audio input and output. Configured as “Marvin the Paranoid Android,” the assistant responded to voice queries with humorous, contextually appropriate replies, showcasing Spring AI’s support for non-text modalities like images, PDFs, and videos (e.g., Gemini’s video support). Tzolov noted that multimodality enables richer interactions, such as analyzing images or converting PDFs to markdown, and Spring AI’s abstractions handle these seamlessly. He concluded by encouraging developers to explore Spring AI’s documentation, experiment with MCP, and contribute to the community, emphasizing its role in building robust, interoperable AI applications.
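
To show what metrics such as token usage and latency amount to at the call site, here is a small plain-Java sketch that wraps a model call and records counters. It is not Micrometer and not Spring AI's instrumentation: ChatMetrics, ModelReply, and the token numbers in main are hypothetical and exist only for this example.

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

// Hand-rolled counters for illustration; a real setup would use a metrics library.
// Hypothetical types, NOT Micrometer or Spring AI classes.
final class ChatMetrics {
    record ModelReply(String text, int promptTokens, int completionTokens) {}

    final LongAdder calls = new LongAdder();
    final LongAdder errors = new LongAdder();
    final LongAdder totalTokens = new LongAdder();
    final LongAdder totalNanos = new LongAdder();

    // Run one model call, recording latency, token usage, and failures.
    ModelReply observe(Supplier<ModelReply> call) {
        long start = System.nanoTime();
        calls.increment();
        try {
            ModelReply reply = call.get();
            totalTokens.add(reply.promptTokens() + reply.completionTokens());
            return reply;
        } catch (RuntimeException e) {
            errors.increment();
            throw e;
        } finally {
            totalNanos.add(System.nanoTime() - start);
        }
    }

    // Mean latency per call in milliseconds, counting failed calls too.
    double meanLatencyMillis() {
        long n = calls.sum();
        return n == 0 ? 0.0 : totalNanos.sum() / 1_000_000.0 / n;
    }

    public static void main(String[] args) {
        ChatMetrics metrics = new ChatMetrics();
        // Stubbed reply with made-up token counts standing in for a real model response.
        metrics.observe(() -> new ModelReply("Your flight is on time.", 12, 30));
        System.out.println("calls=" + metrics.calls.sum()
                + " tokens=" + metrics.totalTokens.sum()
                + " errors=" + metrics.errors.sum());
    }
}
```

Counting errors and latency around every model and tool call is what makes a dashboard like the one in the demo useful for diagnosing agentic loops, where a single user request can trigger many calls.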

Hashtags: #SpringAI #GenerativeAI #ModelContextProtocol #ChristianTzolov #VoxxedDaysAmsterdam2025 #AIAgents #RAG #Observability