Integrate AI with Spring Boot Now

Web developers are curious about how to integrate the power of Large Language Models (LLMs) into their projects. It's becoming increasingly clear that LLMs will play an important role in shaping user experiences across all kinds of apps. However, many engineers do not know where or how to start experimenting.

In this article we give you simple steps and code to start using open source LLMs in your app. Before you dive in, please watch our high level design video, which describes how AI can be used in your app.

Ollama

Ollama is an open source tool that makes it simple to run many open source models on your own server. Open source models such as Meta’s Llama and Google’s Gemma are essentially very large files that you load into memory and then query through their own APIs. Each model has a different interface and different capabilities.

As an application developer, it is best not to concern yourself with the low level details of the model APIs. However, you might want to experiment with different models to figure out which one best fits your needs.

Ollama is a wrapper around these open source models. It lets you pick and choose a model and gives you a uniform interface to it.

Model	Parameters	Size	Download
Llama 3.3	70B	43GB	`ollama run llama3.3`
Llama 3.2	3B	2.0GB	`ollama run llama3.2`
Llama 3.2	1B	1.3GB	`ollama run llama3.2:1b`
Llama 3.2 Vision	11B	7.9GB	`ollama run llama3.2-vision`
Llama 3.2 Vision	90B	55GB	`ollama run llama3.2-vision:90b`
Llama 3.1	8B	4.7GB	`ollama run llama3.1`
Llama 3.1	405B	231GB	`ollama run llama3.1:405b`
Phi 3 Mini	3.8B	2.3GB	`ollama run phi3`
Phi 3 Medium	14B	7.9GB	`ollama run phi3:medium`
Gemma 2	2B	1.6GB	`ollama run gemma2:2b`
Gemma 2	9B	5.5GB	`ollama run gemma2`
Gemma 2	27B	16GB	`ollama run gemma2:27b`
Mistral	7B	4.1GB	`ollama run mistral`
Moondream 2	1.4B	829MB	`ollama run moondream`
Neural Chat	7B	4.1GB	`ollama run neural-chat`
Starling	7B	4.1GB	`ollama run starling-lm`
Code Llama	7B	3.8GB	`ollama run codellama`
Llama 2 Uncensored	7B	3.8GB	`ollama run llama2-uncensored`
LLaVA	7B	4.5GB	`ollama run llava`
Solar	10.7B	6.1GB	`ollama run solar`

Every model in this table is started with the same `ollama run` command, which gives you an idea of how useful Ollama is.

Using Ollama with Spring Boot

Ollama is installed as a command line tool. Once it is installed, you can use it to start a server that exposes a REST interface.

Install ollama

curl -fsSL https://ollama.com/install.sh | sh

Start the ollama server

ollama serve

# In a separate shell

ollama run llama3.2

You can then query this server.

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt":"Why is the sky blue?"
}'
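
The same request can be sent from plain Java. The following is a minimal sketch that uses only the JDK's built-in HTTP client. The class and method names are made up for illustration, and the body adds "stream": false so that Ollama replies with a single JSON object instead of a stream of chunks. It assumes the server from the previous step is running on localhost:11434.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper class, not part of Ollama or Spring.
class OllamaGenerateSketch {

    // Escape the characters that would break a JSON string literal.
    static String jsonEscape(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '"' -> out.append("\\\"");
                case '\\' -> out.append("\\\\");
                case '\n' -> out.append("\\n");
                case '\r' -> out.append("\\r");
                case '\t' -> out.append("\\t");
                default -> out.append(c);
            }
        }
        return out.toString();
    }

    // Build the JSON body for /api/generate with streaming turned off.
    static String generateBody(String model, String prompt) {
        return "{\"model\":\"" + jsonEscape(model) + "\","
                + "\"prompt\":\"" + jsonEscape(prompt) + "\","
                + "\"stream\":false}";
    }

    // Build the POST request without sending it.
    static HttpRequest generateRequest(String baseUrl, String model, String prompt) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(generateBody(model, prompt)))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Requires a running Ollama server with the llama3.2 model available.
        HttpRequest request = generateRequest(
                "http://localhost:11434", "llama3.2", "Why is the sky blue?");
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```

Switching to another model from the table only means changing the model name passed to generateRequest, which is the uniform interface at work.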

Integrating with Spring Boot

To use Ollama with Spring Boot, run Ollama as a separate server and have your Spring Boot application talk to it over its REST interface.

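Spring AI provides a starter library that lets your Spring Boot application talk to an Ollama server. Add the following dependency to your build.gradle file. No version is given here, so this assumes the Spring AI BOM (or an equivalent version declaration) is already part of your build.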

dependencies {
    implementation 'org.springframework.ai:spring-ai-ollama-spring-boot-starter'
}

Once you add this dependency, you can configure the Ollama parameters directly in your Spring Boot application.

Configuring Ollama parameters

Property	Description	Default
spring.ai.ollama.base-url	Base URL where Ollama API server is running.	`localhost:11434`

Then you can define a custom controller that calls the Ollama chat API.

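import java.util.Map;

import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController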
public class ChatController {

    private final OllamaChatModel chatModel;

    @Autowired
    public ChatController(OllamaChatModel chatModel) {
        this.chatModel = chatModel;
    }

    @GetMapping("/ai/generate")
    public Map<String,String> generate(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        return Map.of("generation", this.chatModel.call(message));
    }

    @GetMapping("/ai/generateStream")
    public Flux<ChatResponse> generateStream(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        Prompt prompt = new Prompt(new UserMessage(message));
        return this.chatModel.stream(prompt);
    }

}

OllamaChatModel is the Spring AI class for chatting with a model served by Ollama. UserMessage represents the prompt that comes from the user. We will cover the actual usage of these APIs in future posts, but this gives you a good starting point.
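
To try the controller, you can call the /ai/generate endpoint from any HTTP client. Here is a minimal sketch in plain Java. The class name is made up for illustration, and the address http://localhost:8080 is an assumption based on Spring Boot's default port, so adjust it if your application listens elsewhere.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical client for the ChatController shown above.
class ChatEndpointClientSketch {

    // Build the endpoint URI, encoding the message as a query parameter.
    static URI generateUri(String appBaseUrl, String message) {
        String encoded = URLEncoder.encode(message, StandardCharsets.UTF_8);
        return URI.create(appBaseUrl + "/ai/generate?message=" + encoded);
    }

    public static void main(String[] args) throws Exception {
        // Requires the Spring Boot application and the Ollama server to be running.
        HttpRequest request = HttpRequest
                .newBuilder(generateUri("http://localhost:8080", "Why is the sky blue?"))
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // The controller returns a JSON object with a single "generation" key.
        System.out.println(response.body());
    }
}
```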

Function calling with Ollama

Models served through Ollama can also trigger methods in your application. For example, you might want to ask the LLM for the current temperature in a city and present the answer to the user. The LLM cannot know this on its own, so a method inside your app has to be called for it. One way to do this is to create a function and let the LLM know that it is available. The LLM decides when the function is needed, and Spring AI makes the call on its behalf.

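import java.util.function.Function;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Description;

@SpringBootApplication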
public class OllamaApplication {

    public static void main(String[] args) {
        SpringApplication.run(OllamaApplication.class, args);
    }

    @Bean
    CommandLineRunner runner(ChatClient.Builder chatClientBuilder) {
        return args -> {
            var chatClient = chatClientBuilder.build();

            var response = chatClient.prompt()
                .user("What is the weather in Amsterdam and Paris?")
                .functions("weatherFunction") // reference by bean name.
                .call()
                .content();

            System.out.println(response);
        };
    }

    @Bean
    @Description("Get the weather in location")
    public Function<WeatherRequest, WeatherResponse> weatherFunction() {
        return new MockWeatherService();
    }

    // Request and response types are declared at this level so that both the
    // bean method above and MockWeatherService below can refer to them.
    public record WeatherRequest(String location, String unit) {}
    public record WeatherResponse(double temp, String unit) {}

    public static class MockWeatherService implements Function<WeatherRequest, WeatherResponse> {

        @Override
        public WeatherResponse apply(WeatherRequest request) {
            // Mock data: 20 degrees for Amsterdam, 25 for any other location.
            double temperature = request.location().contains("Amsterdam") ? 20 : 25;
            // Read the unit through the record's accessor method.
            return new WeatherResponse(temperature, request.unit());
        }
    }
}
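
The division of labor in function calling can be sketched without any framework. The model only names a function and supplies an argument, and the application looks the function up and runs it. The code below is a simplified illustration of that idea, not how Spring AI or Ollama implement it. ToolCall, REGISTRY, and dispatch are hypothetical names, and the model's reply is simulated rather than fetched from a server.

```java
import java.util.Map;
import java.util.function.Function;

// Conceptual sketch only; all names here are made up for illustration.
class FunctionCallingSketch {

    // Stand-in for the structured "please call this function" reply from a model.
    record ToolCall(String functionName, String argument) {}

    // Functions the application exposes, keyed by the name the model is told about.
    static final Map<String, Function<String, String>> REGISTRY = Map.of(
            "weatherFunction", city -> city.contains("Amsterdam") ? "20 C" : "25 C");

    // The application, not the model, looks up and runs the requested function.
    static String dispatch(ToolCall call) {
        Function<String, String> fn = REGISTRY.get(call.functionName());
        if (fn == null) {
            return "Unknown function: " + call.functionName();
        }
        return fn.apply(call.argument());
    }

    public static void main(String[] args) {
        // Simulate the model asking for the weather in two cities.
        for (String city : new String[] {"Amsterdam", "Paris"}) {
            System.out.println(city + ": " + dispatch(new ToolCall("weatherFunction", city)));
        }
    }
}
```

In a real exchange the result of each call is sent back to the model, which then uses it to write the final answer for the user.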