In-house LLM-application by Spring AI + Ollama

8 min read · Apr 5, 2024

Introduction

Hello everyone, here is another LLM blog post from a product backend developer 😂. In this post, we are going to explore Spring AI + Ollama.

Spring Boot needs no introduction in the Java developer community; almost everyone knows this wonderful framework. Recently, the Spring team has been developing a way to integrate with LLMs. As developers, our time to expand our knowledge has come.

I also wrote another blog post about LangChain4j, which can be integrated with Spring Boot as well. Feel free to check it out if you are interested.

Java + Ollama — Unlock capability of Generative AI to Java developer with LangChain4j (Model on…

Example LangChain4j project with Ollama by design as exercise coach for office worker.

tpbabparn.medium.com

Okay, let's start our Spring AI + Ollama project.

Prerequisites

LLM (Large Language Models)

An LLM is an AI model trained on a large dataset so that it can reason about and generate ideas and content like a human. Each model has its own strengths, depending on the purpose of its training and the dataset used. Some are general-purpose, while others are specialized, e.g. for coding.

Well-known examples

GPT-3/4 (OpenAI)
Llama 2 (Meta)

Ollama

Ollama is one option for running open-source LLMs on your laptop or in a container. You don't need to connect to an AI provider directly (e.g. the GPT models from OpenAI); you can use an open-source model as an alternative.

As of now, Ollama offers many models, for example Mistral, Llama 2, and Gemma. We can interact with them through the CLI (Command Line Interface), the REST API, or an SDK (Software Development Kit).

Check out the sites below for more information.

RAG (Retrieval-Augmented Generation)

Retrieval-Augmented Generation (RAG) is a process for improving the output of an LLM by supplying a knowledge base on top of its training data before the model produces its response.

Generation means the LLM takes the user query and generates a new result based on the knowledge it already has.

But what if that knowledge is too general, out of date, or you need data that is specific to your business?
That is where Retrieval-Augmented comes in: we keep the data in a separate data source and call the items documents.
Then, before the LLM creates the response, we retrieve the documents most similar to the query, attach them to the prompt, and send everything to the LLM to consider when answering.

For more information about RAG, please check the article below. I think it is the best one.

What is Retrieval-Augmented Generation (RAG)?

Vector Database

A vector database is a database designed to store data as vectors (series of numbers).

Inserting => Transform a document or chunk of text into a vector and store it in the database
Retrieving => Search for data by similarity/context and return the closest matches (similarity search)
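
To make the two operations concrete, here is a minimal in-memory sketch in plain Java. It is not how Spring AI or any real vector database is implemented; the class name `TinyVectorStore` and its methods are hypothetical, and it assumes the embedding vectors were already computed by some model. It only illustrates storing vectors and ranking them by cosine similarity.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical, simplified stand-in for a vector database (illustration only).
class TinyVectorStore {

    /** One stored chunk of text together with its embedding vector. */
    public record Entry(String text, double[] vector) {}

    private final List<Entry> entries = new ArrayList<>();

    /** Inserting: keep the text next to its (already computed) vector. */
    public void add(String text, double[] vector) {
        entries.add(new Entry(text, vector));
    }

    /** Retrieving: return the topK texts whose vectors are most similar to the query. */
    public List<String> similaritySearch(double[] query, int topK) {
        return entries.stream()
                .sorted(Comparator.comparingDouble((Entry e) -> cosine(query, e.vector())).reversed())
                .limit(topK)
                .map(Entry::text)
                .toList();
    }

    /** Cosine similarity of two equal-length vectors; returns 0 if either vector is all zeros. */
    public static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

A real vector database adds persistence and an index so that it does not have to compare the query against every stored vector, which is what this sketch does.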

Spring AI Project

GitHub — spring-projects/spring-ai: An Application Framework for AI Engineering

An Application Framework for AI Engineering. Contribute to spring-projects/spring-ai development by creating an account…

github.com

Spring AI draws inspiration from LangChain (Python) and can integrate with many AI providers and LLMs. As of now, it offers several options for models and vector databases. You can find the Spring AI documentation directly on the site.

Spring AI

The Spring AI project aims to streamline the development of applications that incorporate artificial intelligence…

docs.spring.io

Vaadin

Vaadin | The Web App Platform for Java

Build modern Java web apps faster with Vaadin. Our framework includes a large library of UI components and is designed…

vaadin.com

Vaadin is a framework for building web applications with Java/Kotlin. There is no deep reason for using it here; I just wanted to try something with Vaadin, so I took the chance. If you are interested, you can start a project from the site below.

Create a new Vaadin app: configure views and customize the theme

A tool that allows you to visually create a custom Spring Boot based Vaadin Flow or Hilla app starter that you can…

start.vaadin.com

Note: Personally, I find it easier to create web applications in JavaScript/TypeScript, but compared with JSP (Java Server Pages), okay, Vaadin is better. 😅

Get started

GitHub Repository + Concept

GitHub — marttp/20240331-developer-assistant-vaadin: Developer Assistance created by Vaadin +…

Developer Assistance created by Vaadin + Spring Boot + Spring AI — marttp/20240331-developer-assistant-vaadin

github.com

Our concept

Create something that can answer questions from existing data and help me write code.

Here are example results from this project:

Discuss Tech Trend

Pair Programming (not very impressive for experienced developers)

Feasibility for integration

Let's test manually with Ollama running locally.
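
If you prefer to run the manual check from Java instead of the CLI, here is a minimal sketch that calls Ollama's REST endpoint `/api/generate` with the JDK's built-in `HttpClient`. The class name `OllamaSmokeTest` and the sample prompt are hypothetical. It assumes Ollama is listening on `http://localhost:11434` and that the `mistral` model has already been pulled. The JSON body is built by hand to stay dependency-free, so only quotes and backslashes in the prompt are escaped.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper for a quick manual check against a local Ollama instance.
class OllamaSmokeTest {

    /** Builds a non-streaming POST request for Ollama's /api/generate endpoint. */
    public static HttpRequest buildRequest(String baseUrl, String model, String prompt) {
        String body = "{\"model\":\"" + escape(model)
                + "\",\"prompt\":\"" + escape(prompt)
                + "\",\"stream\":false}";
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    /** Escapes backslashes and double quotes only; enough for simple one-line prompts. */
    public static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) throws InterruptedException {
        HttpRequest request = buildRequest("http://localhost:11434", "mistral", "Why is the sky blue?");
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            // The raw JSON is printed as-is; the generated text is in its "response" field.
            System.out.println(response.body());
        } catch (IOException e) {
            System.err.println("Could not reach Ollama: " + e.getMessage());
        }
    }
}
```

If this prints a JSON document, the model is reachable and the same base URL can be used in the Spring AI configuration.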

Okay, now let's integrate it into our Vaadin project with Spring AI. Below is the related configuration.

spring:
  ai:
    ollama:
      base-url: ${AI_OLLAMA_BASE_URL:http://localhost:11434}
      chat:
        options:
          model: mistral
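
The `${AI_OLLAMA_BASE_URL:http://localhost:11434}` placeholder means "use the `AI_OLLAMA_BASE_URL` property if it is set, otherwise fall back to the value after the colon". The plain-Java sketch below mimics only that fallback behavior. The class `OllamaBaseUrl` is hypothetical, and it is a simplification: Spring resolves placeholders against several property sources (environment variables, system properties, and others), not just one map.

```java
import java.util.Map;

// Hypothetical illustration of the "${NAME:default}" fallback, not Spring's actual resolver.
class OllamaBaseUrl {

    static final String DEFAULT_BASE_URL = "http://localhost:11434";

    /** Returns the configured base URL if present, otherwise the local default. */
    static String resolve(Map<String, String> environment) {
        return environment.getOrDefault("AI_OLLAMA_BASE_URL", DEFAULT_BASE_URL);
    }

    public static void main(String[] args) {
        // Prints the default unless AI_OLLAMA_BASE_URL is exported in the shell.
        System.out.println(resolve(System.getenv()));
    }
}
```

In practice this lets the same configuration run against a local Ollama during development and against another host when the environment variable is set.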

Let’s implement the idea!

It's time for my pet project 😅. I split it into the steps below:

Setup for RAG
UI by Vaadin
Create communication flow

Setup for RAG

Diagram of how we put data into the vector database

In this project, I used Apache Tika to read the PDF file because I had some problems with spring-ai-pdf-document-reader.

If you want the same data as my project, I used the Stack Overflow Developer Survey 2023. Please go to the site, save the page as a PDF yourself, and place it under the classpath resources directory (/resources).

Stack Overflow Developer Survey 2023

In May 2023 over 90,000 developers responded to our annual survey about how they learn and level up, which tools…

survey.stackoverflow.co

When the application starts, it reads the PDF and stores the data in the embedded vector database.

Note: this is very slow on my laptop. My application was ready after an hour.
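
Before text goes into a vector database, it is usually split into smaller chunks so that each vector represents a focused piece of content. The sketch below shows the idea with a simple word-based splitter with overlap. It is a stand-in for illustration, not the splitter used in this project or in Spring AI; the class name `SimpleChunker` and the `chunkSize`/`overlap` parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical word-based chunker; real pipelines often split by tokens instead.
class SimpleChunker {

    /**
     * Splits text into chunks of at most chunkSize words.
     * Consecutive chunks share `overlap` words so context is not cut off abruptly.
     */
    public static List<String> split(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("need 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = text.strip();
        if (trimmed.isEmpty()) {
            return chunks;
        }
        String[] words = trimmed.split("\\s+");
        int step = chunkSize - overlap;
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + chunkSize, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            if (end == words.length) {
                break; // the last chunk already reaches the end of the text
            }
        }
        return chunks;
    }
}
```

Each chunk then needs one embedding call to the model, so a long PDF means many calls. That is one plausible reason why loading the data at startup takes so long on a laptop.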

UI by Vaadin

It is a simple message input for receiving the prompt (I tried to make the UI look like ChatGPT, but with Vaadin).

Create communication flow

Last but not least, we create a service that communicates with the model. It attaches a system prompt loaded from system-qa.st, fills in the template's placeholders from a HashMap, and then uses SystemPromptTemplate to create the message before sending it to the LLM.
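
The project itself relies on Spring AI's SystemPromptTemplate for this step. To show what happens conceptually, here is a plain-Java stand-in that does not use Spring AI at all. The class `PromptAssembly`, the `Message` record, and the `{documents}` placeholder name are assumptions made for illustration; check system-qa.st in the repository for the real template.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in that mimics filling a system prompt template before calling the model.
class PromptAssembly {

    /** A chat message with a role ("system" or "user") and its text. */
    public record Message(String role, String content) {}

    /** Replaces every {key} placeholder in the template with the matching map value. */
    public static String render(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            result = result.replace("{" + entry.getKey() + "}", entry.getValue());
        }
        return result;
    }

    /** Builds the system message (template + retrieved documents) and the user message. */
    public static List<Message> buildMessages(String template, List<String> retrievedDocs, String question) {
        Map<String, String> values = new HashMap<>();
        values.put("documents", String.join("\n", retrievedDocs)); // assumed placeholder name
        return List.of(
                new Message("system", render(template, values)),
                new Message("user", question));
    }
}
```

The retrieved documents end up inside the system message, and the user's question is sent unchanged as the user message. That is the "attach the documents to the prompt" step of RAG.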

Enhancement suggestions

If you follow all the steps here, you will see a lot of things that need to be improved. Here are my suggestions.

Separate RAG worker and Frontend component

Currently, this is what my project looks like:

I believe it is better to separate the workloads into a distributed system.

Use a proper vector database instead of the simple embedded one

The embedded store consumes a massive amount of memory. If we need to promote this to a higher environment or open it to end users, I think a proper vector database is much better.

As of now, Spring AI has these options

Azure Vector Search — The Azure vector store.
ChromaVectorStore — The Chroma vector store.
MilvusVectorStore — The Milvus vector store.
Neo4jVectorStore — The Neo4j vector store.
PgVectorStore — The PostgreSQL/PGVector vector store.
PineconeVectorStore — PineCone vector store.
QdrantVectorStore — Qdrant vector store.
RedisVectorStore — The Redis vector store.
WeaviateVectorStore — The Weaviate vector store.

Vector Databases

A vector databases is a specialized type of database that plays an essential role in AI applications. In vector…

docs.spring.io

My issues

My laptop cannot keep up with this setup (GG me)

I have attached my laptop spec, which I used for running Spring Boot + Ollama + the embedded vector store. It consumes a lot of CPU once it starts communicating with Ollama. I believe you will get better results than mine if your laptop has more computing power. I don't have the budget to buy a new one, e.g. an M1, M2, or M3. 🥹

spring-ai-pdf-document-reader does not work well with some PDFs

I used spring-ai-pdf-document-reader and the process got OOM-killed because of some font settings in PDFBox.

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-pdf-document-reader</artifactId>
</dependency>

After spending around 4 hours on this, I switched to Apache Tika by using spring-ai-tika-document-reader instead.

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-tika-document-reader</artifactId>
</dependency>

My prompt is not efficient enough

The system message is a key factor in optimizing the result from an LLM. As you can see, my system message is not the best. For some sentences, I used another generative AI to help me come up with the system message.

Alternative model if needed

Currently I use Mistral. You can try an alternative model by exploring the Ollama library.

library

Get up and running with large language models.

ollama.com

Other references that helped me

Conclusion

That's it for my pet project. It is really small, but I got my hands dirty by exploring and building something with:

Spring AI
Integration between Spring AI + Ollama
RAG
Trying Vaadin

Thank you for reading through this blog post. I apologize if it did not meet your expectations. See you in the next blog post. 🙇‍♂️

Facebook: Thanaphoom Babparn
FB Page: TP Coder
LinkedIn: Thanaphoom Babparn
Website: TP Coder — Portfolio


Written by Thanaphoom Babparn
