I often need to develop simple Java tools that use AI (thanks to Ollama) to generate documents and source code. I love Langchain4J, but my needs are simple, and I only use OSS LLMs with Ollama. I essentially use chat completion, which is pretty straightforward:
curl http://localhost:11434/api/chat -d '{
  "model": "tinyllama",
  "messages": [
    {
      "role": "system",
      "content": "You are an expert on the Star Trek series"
    },
    {
      "role": "user",
      "content": "Who is Seven of Nine?"
    }
  ]
}'
All the magic depends only on how you build the list of messages. So, I created my own little Java library for that: Parakeet4J.
I made it to simplify the development of small generative AI applications with Ollama.
Let's do a first chat completion.
Simple chat completion
As you can see, the source code is really simple:
public class DemoChat
{
    public static void main( String[] args )
    {
        Options options = new Options()
            .setTemperature(0.0)
            .setRepeatLastN(2);

        var systemContent = "You are a useful AI agent, expert with the Star Trek franchise.";
        var userContent = "Who is James Tiberius Kirk?";

        List<Message> messages = List.of(
            new Message("system", systemContent),
            new Message("user", userContent)
        );

        Query queryChat = new Query("tinyllama", options)
            .setMessages(messages);

        Chat("http://0.0.0.0:11434", queryChat,
            answer -> {
                System.out.println("🙂: " + answer.getMessage().getContent());
            },
            err -> {
                System.out.println("😡: " + err.getMessage());
            });
    }
}
You should get an output like this one:
🙂: James Tibirius Kirk is a fictional character in the Star Trek franchise
created by Gene Roddenberry and played by William Shatner in the original
series. He was the captain of the USS Enterprise-A, one of the most
iconic ships in the Star Trek universe.
There is an alternative way to use chat completion (without lambda):
Query queryChat = new Query("tinyllama", options)
    .setMessages(messages);

var resultAnswer = Chat("http://0.0.0.0:11434", queryChat);

if (resultAnswer.exception().isEmpty()) {
    System.out.println("🙂: " +
        resultAnswer.getAnswer()
            .getMessage().getContent()
    );
} else {
    System.out.println("😡: " + resultAnswer.exception().toString());
}
Now, for a better user experience, it would be nice to generate a streamed completion.
Stream chat completion
Again, it's very straightforward: use the ChatStream method with a chunk lambda, which is triggered for every piece of the completion during the stream:
ChatStream("http://0.0.0.0:11434", queryChat,
    chunk -> {
        // Display the completion chunk by chunk
        System.out.print(chunk.getMessage().getContent());
        return null;
    },
    answer -> {
        // Display the complete completion
        System.out.println();
        System.out.println("🙂: " + answer.getMessage().getContent());
    },
    err -> {
        System.out.println("😡: " + err.getMessage());
    });
If you want to stop the stream before the end of the completion, the chunk lambda must return an Exception.
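That stop contract can be sketched without the library: below, a hypothetical local driver feeds chunks to a lambda with the same shape as ChatStream's chunk callback, and stops as soon as the lambda returns a non-null Exception. Everything here (the drive helper, the sample chunks) is illustrative, not part of Parakeet4J.

```java
import java.util.List;
import java.util.function.Function;

public class StopStreamSketch {
    // Illustrative stand-in for the streaming loop: the "stream" ends as
    // soon as the chunk lambda returns a non-null Exception.
    static String drive(List<String> chunks, Function<String, Exception> onChunk) {
        StringBuilder seen = new StringBuilder();
        for (String chunk : chunks) {
            seen.append(chunk);
            if (onChunk.apply(chunk) != null) break; // stop the stream early
        }
        return seen.toString();
    }

    public static void main(String[] args) {
        var counter = new int[]{0};
        String result = drive(List.of("James ", "T. ", "Kirk ", "was "),
            chunk -> {
                System.out.print(chunk);
                counter[0]++;
                // Stop after three chunks, as a chunk lambda would in ChatStream
                return counter[0] >= 3 ? new Exception("token budget reached") : null;
            });
        System.out.println();
        System.out.println(result); // prints "James T. Kirk "
    }
}
```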
You can use this alternative notation:
Query queryChat = new Query("tinyllama", options).setMessages(messages);

var resultAnswer = ChatStream("http://0.0.0.0:11434", queryChat,
    chunk -> {
        System.out.print(chunk.getMessage().getContent());
        return null;
    });

if (resultAnswer.exception().isEmpty()) {
    System.out.println();
    System.out.println("😛: " +
        resultAnswer.getAnswer()
            .getMessage().getContent());
} else {
    System.out.println("😡: " + resultAnswer.exception().toString());
}
As you can see, it's very simple to use. But a chat is nothing without conversational memory.
How to implement a conversational memory?
Well, it's simple. You only have to use the native abilities of Java and the list of messages: List<Message>
public class DemoChatStreamWithMemory
{
    public static void main( String[] args )
    {
        Options options = new Options()
            .setTemperature(0.0)
            .setRepeatLastN(2);

        var systemContent = "You are a useful AI agent, expert with the Star Trek franchise.";
        var userContent = "Who is James Tiberius Kirk?";

        List<Message> messages = new java.util.ArrayList<>(List.of(
            new Message("system", systemContent),
            new Message("user", userContent)
        ));

        Query queryChat = new Query("tinyllama", options, messages);

        var resultAnswer =
            ChatStream("http://0.0.0.0:11434", queryChat,
                chunk -> {
                    System.out.print(chunk.getMessage().getContent());
                    return null;
                });

        if (resultAnswer.exception().isPresent()) {
            System.out.println(
                "😡: " + resultAnswer.exception().toString()
            );
            System.exit(1);
        }

        // Add the answer to the list of the messages
        messages.add(
            new Message("ai",
                resultAnswer.getAnswer().getMessage()
                    .getContent()
            )
        );

        var nextUserContent = "Who is his best friend?";
        // Add the new question to the list of the messages
        messages.add(new Message("user", nextUserContent));

        Query nextQueryChat = new Query("tinyllama", options, messages);

        System.out.println();
        System.out.println("--------------------------------------");

        var nextResultAnswer =
            ChatStream("http://0.0.0.0:11434", nextQueryChat,
                chunk -> {
                    System.out.print(chunk.getMessage().getContent());
                    return null;
                });

        if (nextResultAnswer.exception().isPresent()) {
            System.out.println(
                "😡: " + nextResultAnswer.exception().toString()
            );
            System.exit(1);
        }
    }
}
You should get an output like this:
James Tibirius Kirk is a fictional character in the Star Trek franchise
created by Gene Roddenberry and played by William Shatner in the original
series. He was the captain of the USS Enterprise-A, one of the most
iconic ships in the Star Trek universe.
--------------------------------------
The best friend of the AI agent in the Star Trek universe is named Spock.
He is portrayed by Leonard Nimoy in the original series and later by
Zachary Quinto in the rebooted series. Spock is a Vulcan, a species that
shares many similarities with humans, including their intelligence,
empathy, and ability to communicate in multiple languages.
They are also known for their logical thinking and analytical skills,
which make them an excellent ally for the AI agent.
You can find the source code of these examples here: https://github.com/parakeet-nest/p4j-demo/tree/main/src/main/java/chat
That's all for today. But there are other things you can do with Parakeet4J.
Other features of Parakeet4J
I will write blog posts about all of this, but you can already do the following:
RAG (Retrieval Augmented Generation): use the Ollama Embedding API, the Cosine distance and an in-memory vector store: https://github.com/parakeet-nest/parakeet4j/blob/main/docs/03-embeddings-rag.md
Function calling with the Tools support of Ollama: https://github.com/parakeet-nest/parakeet4j/blob/main/docs/05-function-calling-with-tools.md
Function calling even with LLMs that do not support it: https://github.com/parakeet-nest/parakeet4j/blob/main/docs/04-function-calling.md
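The RAG doc linked above relies on the cosine distance to rank embeddings against a query. As a quick illustration of that math only, here is a minimal, self-contained sketch of cosine similarity; it is not Parakeet4J's own implementation, and the sample vectors are made up:

```java
public class CosineSketch {
    // Cosine similarity between two embedding vectors:
    // dot(a, b) / (|a| * |b|). Closer to 1.0 means more similar.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy 3-dimensional "embeddings" (real ones have hundreds of dimensions)
        double[] query = {0.2, 0.7, 0.1};
        double[] docA  = {0.2, 0.7, 0.1}; // same direction as the query
        double[] docB  = {0.9, 0.1, 0.0};

        System.out.println(cosine(query, docA) > 0.999);              // prints true
        System.out.println(cosine(query, docB) < cosine(query, docA)); // prints true
    }
}
```

An in-memory vector store is then just a list of (chunk, embedding) pairs sorted by this score against the query embedding.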
What next with Parakeet?
I plan to add more helpers to the RAG part: data persistence for the vector store, and other ways to find similarities (without using embeddings) with the Jaccard index and Levenshtein distance methods.
I will add WebAssembly support to allow the development of plugins, like splitters for chunking documents.
Right now, what am I using Parakeet4J for?
This week, I experimented with "AI prompting" to generate Dockerfiles and Compose files for Golang and Java web applications (with the Llama3.1 LLM):
Golang + Redis
NodeJS + Redis
You can find these examples here:
By the way, Parakeet4J has a sister project in Golang: Parakeet.
That's all for today! Feel free to ask questions and give your opinion. Have a nice Sunday. 🙂