Creating AI Agents with Agentic Compose, Bash, Curl, Jq, and Gum


This blog post is the result of my numerous experiments with Docker Model Runner, local AI models, and Docker Compose. I'm sharing the results with you, hoping it will be as useful to you as it is to me.

First, what is "Agentic Compose"?

Agentic Compose is a new feature of Docker Compose: in your compose.yml file, you can now declare AI models (*) as components of your Compose project and make one or more services depend on them. Docker Compose thus becomes a kind of orchestrator for generative AI services, which makes it easier to implement conversational agents regardless of the framework or library you use.

(*) These AI models are LLMs "powered" by Docker Model Runner, which is a Docker feature that allows you to run AI models locally.

The concept of Agentic Compose is very simple (but so practical!). Let's walk through an example; it will make things much clearer.

We're going to create a compose.yml file for a future generative AI application that would need to use 2 models: ai/qwen2.5:latest and ai/qwen2.5:1.5B-F16:

services:
  my-agent:
    image: debian:stable-slim
    command: sleep infinity
    models:
      chat_model:
        endpoint_var: MODEL_RUNNER_BASE_URL
        model_var: MODEL_RUNNER_CHAT_MODEL
      small_chat_model:
        endpoint_var: MODEL_RUNNER_BASE_URL
        model_var: MODEL_RUNNER_SMALL_CHAT_MODEL

models:
  chat_model:
    model: ai/qwen2.5:latest
  small_chat_model:
    model: ai/qwen2.5:1.5B-F16

Explanations

The top-level models section declares the AI models used by your Compose application. The names chat_model and small_chat_model are identifiers/aliases for these models (you can name them whatever you want). Each model is defined by its name (e.g., model: ai/qwen2.5:1.5B-F16) and can include additional parameters like context_size.

Important: When starting your application (with the docker compose up command), Docker Compose will automatically download these AI models (if they're not already present on your machine) and make them available for your services.

Then, you'll reference these models in the models section of your service. For example, here, the my-agent service uses the chat_model and small_chat_model models. Docker Compose automatically injects the Model Runner server URL into the MODEL_RUNNER_BASE_URL environment variable, as well as the respective model names into MODEL_RUNNER_CHAT_MODEL and MODEL_RUNNER_SMALL_CHAT_MODEL.

Usage

You've saved this compose.yml file in a directory. Run the following command:

docker compose up --build -d

This command will build and start the services defined in your compose.yml file (and also download the models if necessary). The -d flag allows you to run the containers in the background.

Then, to access your my-agent service, use the following command:

docker compose exec my-agent /bin/bash

Once inside the container, you can verify that the environment variables have been properly injected with the following command:

env | grep '^MODEL_RUNNER'

You should see something like:

MODEL_RUNNER_BASE_URL=http://model-runner.docker.internal/engines/v1/
MODEL_RUNNER_CHAT_MODEL=ai/qwen2.5:latest
MODEL_RUNNER_SMALL_CHAT_MODEL=ai/qwen2.5:1.5B-F16

MODEL_RUNNER_BASE_URL is the model server URL (it allows you to access, from the container, the Docker Model Runner API), and MODEL_RUNNER_CHAT_MODEL and MODEL_RUNNER_SMALL_CHAT_MODEL are the names of the models you declared in your compose.yml file.
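
Since everything that follows depends on these variables, it can help to check them before doing anything else. Here is a minimal sketch of such a check; require_env is a hypothetical helper name of mine, not something provided by Docker Compose or Docker Model Runner:

```shell
# Hypothetical helper (not part of Compose or Model Runner): fail fast when
# a variable that Compose is expected to inject is missing or empty.
require_env() {
  local name
  for name in "$@"; do
    # ${!name} is Bash indirect expansion: the value of the variable
    # whose name is stored in $name.
    if [[ -z "${!name:-}" ]]; then
      echo "Missing environment variable: ${name}" >&2
      return 1
    fi
  done
}
```

In the container, you could call it as require_env MODEL_RUNNER_BASE_URL MODEL_RUNNER_CHAT_MODEL MODEL_RUNNER_SMALL_CHAT_MODEL || exit 1 at the top of a script.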

But what's the connection with Bash, Curl, and Jq?

We're going to install some tools in our container to easily interact with the Docker Model Runner API. We're still in the my-agent container that we created earlier. Let's install curl and jq:

apt update
apt install -y curl jq

Then, copy and paste the following code into your terminal to prepare a request to the Docker Model Runner API:

read -r -d '' DATA <<- EOM
{
  "model":"${MODEL_RUNNER_SMALL_CHAT_MODEL}",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "What is Docker Compose?"
    }
  ]
}
EOM

We just created a DATA variable that contains a JSON string with the necessary information for the API.
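
The heredoc works well for fixed prompts, but a prompt containing double quotes, backslashes, or line breaks would produce invalid JSON. As an alternative sketch, jq can build the payload and take care of the escaping. build_payload is a hypothetical helper name; the JSON structure is the same as in the heredoc:

```shell
# Hypothetical helper: build the chat payload with jq so that quotes,
# backslashes, and newlines in the prompts are escaped correctly.
# Arguments: model name, system prompt, user prompt.
build_payload() {
  local model="$1" system="$2" user="$3"
  jq -n \
    --arg model "$model" \
    --arg system "$system" \
    --arg user "$user" \
    '{
      model: $model,
      messages: [
        {role: "system", content: $system},
        {role: "user", content: $user}
      ]
    }'
}
```

For example: DATA=$(build_payload "${MODEL_RUNNER_SMALL_CHAT_MODEL}" "You are a helpful assistant." "What is Docker Compose?").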

Now, let's make our first request to the Docker Model Runner API by copying and pasting the following command into your terminal:

curl "${MODEL_RUNNER_BASE_URL}/chat/completions" \
    -H "Content-Type: application/json" \
    -d "${DATA}" | jq -C

Using jq allows you to format the JSON response in a readable way in the terminal.

Wait a few seconds, and you should see a model response like this:

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Docker Compose is a tool that allows you to define and manage multi-container Docker applications. It provides a simple way to define and run multi-container Docker applications using a YAML configuration file.\nWith Docker Compose, you can define a set of services (such as web servers, databases, etc.) and their dependencies in a single file. This makes it easy to create and manage multiple Docker containers that work together as a single application.\nDocker Compose supports a variety of services and is compatible with a wide range of Docker images and services. It also provides several features such as automatic scaling, load balancing, and service discovery.\nOverall, Docker Compose is a powerful tool that simplifies the process of creating and managing multi-container Docker applications."
      }
    }
  ],
  "created": 1753031484,
  "model": "ai/qwen2.5:1.5B-F16",
  "system_fingerprint": "b1-9c98bab",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 150,
    "prompt_tokens": 25,
    "total_tokens": 175
  },
  "id": "chatcmpl-5DxarjKSakI542v02QRIKqc4TeBGNykg",
  "timings": {
    "prompt_n": 25,
    "prompt_ms": 65.666,
    "prompt_per_token_ms": 2.62664,
    "prompt_per_second": 380.7145250205586,
    "predicted_n": 150,
    "predicted_ms": 5128.642,
    "predicted_per_token_ms": 34.19094666666667,
    "predicted_per_second": 29.247508404759
  }
}
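
The usage and timings objects in this response are handy when comparing models. Here is a small sketch that reduces them to one line; response_stats is a hypothetical helper name, and I'm assuming the response has the shape shown above (other servers may not return timings, hence the fallback to 0):

```shell
# Hypothetical helper: summarise a chat completion response read from stdin.
# Assumes the response shape shown above; "timings" may be absent on other
# servers, so the tokens/s value falls back to 0.
response_stats() {
  jq -r '"\(.usage.total_tokens) tokens, \((.timings.predicted_per_second // 0) | floor) tokens/s"'
}
```

You would pipe the curl output into it instead of into jq -C, for example curl --silent ... | response_stats.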

And to retrieve only the content of the model's response, you can use jq like this:

curl --silent "${MODEL_RUNNER_BASE_URL}/chat/completions" \
    -H "Content-Type: application/json" \
    -d "${DATA}" | jq -r '.choices[0].message.content'
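
If you plan to reuse this call, you can wrap it in a function. The sketch below is one possible way to do it; join_url and chat_once are hypothetical names. Note that the injected base URL shown earlier ends with a slash, so the helper joins the two parts without producing a double slash:

```shell
# Join a base URL and a path without producing a double slash.
join_url() {
  printf '%s/%s\n' "${1%/}" "${2#/}"
}

# Hypothetical wrapper: POST a JSON payload (first argument) to the chat
# completions endpoint and print only the reply text.
# With --fail, curl returns a non-zero status on HTTP errors instead of
# passing an error body to jq.
chat_once() {
  local url
  url=$(join_url "${MODEL_RUNNER_BASE_URL}" "chat/completions")
  curl --fail --silent --show-error "$url" \
    -H "Content-Type: application/json" \
    -d "$1" | jq -r '.choices[0].message.content'
}
```

Usage: chat_once "${DATA}". Keep in mind that the exit status of the pipeline is the one from jq unless you enable set -o pipefail.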

By now, you should understand how this new "Agentic Compose" feature works and why it's useful (and also why I used Bash, jq, and curl).

It's time to go a little further. Leave the container with the exit command, then run docker compose down to stop and remove the containers cleanly.

Let's go a little further with Osprey

I often need to test different AI models, prompts, and so on, and sometimes a simple Bash script is more than enough for that. Likewise, when I need to integrate AI features into my CI, Bash is ultimately the way to go. Lately, I've written quite a few small Bash tools to interact with AI models and make my life easier, and I ended up gathering them into a Bash library that I called Osprey. It's open source, so feel free to take a look (the code is quite simple - Disclaimer: I'm not a Bash specialist, so if you spot any glaring mistakes, let me know by opening an issue).

We're going to:

  • Write our program in a main.sh file at the same level as our compose.yml file

  • Create a Dockerfile for our my-agent service

  • Modify our compose.yml file to use this Dockerfile

Creating the Dockerfile

Here's the content of our Dockerfile:

FROM debian:stable-slim

# Osprey release tag, supplied through the build args in compose.yml
ARG OSPREY_VERSION

# Install dependencies and gum
RUN <<EOF
# Stop at the first failing command instead of continuing silently
set -e
apt-get update
apt-get install -y curl gpg jq

mkdir -p /etc/apt/keyrings
curl -fsSL https://repo.charm.sh/apt/gpg.key | gpg --dearmor -o /etc/apt/keyrings/charm.gpg
echo "deb [signed-by=/etc/apt/keyrings/charm.gpg] https://repo.charm.sh/apt/ * *" > /etc/apt/sources.list.d/charm.list
apt-get update
apt-get install -y gum
apt-get clean
rm -rf /var/lib/apt/lists/*
EOF

# Set working directory
WORKDIR /app

# Copy the main script
COPY main.sh /app/main.sh

# Download and install osprey.sh
RUN <<EOF
# Fail the build if the download fails
set -e
curl -fsSL https://github.com/k33g/osprey/releases/download/${OSPREY_VERSION}/osprey.sh -o ./osprey.sh
chmod +x ./osprey.sh
chmod +x /app/main.sh
EOF

# Start the main script
CMD ["bash", "/app/main.sh"]

This Dockerfile creates an image based on debian:stable-slim that:

Installs the tools:

  • curl, gpg, jq

  • gum (Charm's interactive CLI interface) via their APT repository

Configures the application:

  • Copies the main script main.sh into /app/

  • Downloads osprey.sh from GitHub (version specified by OSPREY_VERSION)

  • Makes the scripts executable

Launches:

  • Executes main.sh at container startup

Let's modify the compose.yml file

services:
  my-agent:
    build:
      context: .
      dockerfile: Dockerfile
      args:
        - OSPREY_VERSION=v0.0.5
    models:
      small_chat_model:
        endpoint_var: MODEL_RUNNER_BASE_URL
        model_var: MODEL_RUNNER_SMALL_CHAT_MODEL

models:
  small_chat_model:
    model: ai/qwen2.5:1.5B-F16

And finally, let's create the main.sh file for our first GenAI app in Bash

Here's the content of our main.sh file:

#!/bin/bash
. "./osprey.sh"

read -r -d '' SYSTEM_INSTRUCTION <<- EOM
You are a helpful assistant. 
You are a Docker expert.
Your name is Bob.
EOM

read -r -d '' USER_CONTENT <<- EOM
Hello, what's your name?
Can you explain what Docker Compose is?
EOM

read -r -d '' DATA <<- EOM
{
  "model":"${MODEL_RUNNER_SMALL_CHAT_MODEL}",
  "options": {
    "temperature": 0.5
  },
  "messages": [
    {"role":"system", "content": "${SYSTEM_INSTRUCTION}"},
    {"role":"user", "content": "${USER_CONTENT}"}
  ],
  "stream": true
}
EOM

function callback() {
  echo -n "$1" 
}

osprey_chat_stream "${MODEL_RUNNER_BASE_URL}" "${DATA}" callback

echo ""

The osprey_chat_stream function allows you to send chat completion requests to the model and receive streaming responses.
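
To give an idea of what streaming involves, here is a sketch of how an OpenAI-style streaming response can be consumed in Bash. This is not Osprey's code: parse_sse_stream is a hypothetical name, and I'm assuming the usual format where each event is a "data: {...}" line carrying the text in choices[0].delta.content, with "data: [DONE]" as the end marker:

```shell
# Sketch only, NOT Osprey's implementation; parse_sse_stream is a
# hypothetical name. Reads "data: {...}" lines on stdin and passes each
# text fragment to the callback function named in $1.
parse_sse_stream() {
  local on_chunk="$1" line payload chunk
  while IFS= read -r line; do
    line="${line%$'\r'}"                     # tolerate CRLF line endings
    [[ "$line" == data:* ]] || continue      # skip blank and non-data lines
    payload="${line#data:}"
    payload="${payload# }"
    [[ "$payload" == "[DONE]" ]] && break    # end-of-stream marker
    # -j prints without a trailing newline; the "x" sentinel keeps a newline
    # that belongs to the fragment from being stripped by $(...).
    chunk=$(printf '%s' "$payload" | jq -j '.choices[0].delta.content // empty'; printf x)
    chunk="${chunk%x}"
    [[ -n "$chunk" ]] && "$on_chunk" "$chunk"
  done
  return 0
}
```

You could feed it with parse_sse_stream callback < <(curl --silent --no-buffer ...). Process substitution keeps the function in the current shell, so variables set by the callback survive; in a plain pipeline they would be lost in a subshell. Calling jq once per event is simple rather than fast.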

Now, let's launch our application:

docker compose up --build --no-log-prefix

You should get output similar to this:

Hello! I'm Bob, a Docker expert. Docker Compose is a tool that allows you to define and manage multi-container Docker applications with a simple YAML file. It simplifies the process of starting, stopping, and scaling applications by providing a declarative way to define services, networks, and volumes. You can use Docker Compose to manage and deploy your applications in a consistent and automated manner across different environments. Is there anything specific you'd like to know about Docker Compose or Docker in general?

One more small step, and you'll have your first Compose agent up and running.

But why Gum? First Compose Agent in action (and in Bash)

What if we added some interactivity to our agent? We're going to use gum to create a simple chat interface. Gum is a project that provides interactive CLI components for Bash.

Let's modify our compose.yml file

We're going to move all of our agent's configuration into the compose.yml file. This way, we can adjust the agent's parameters or switch models without touching the code (and easily create new agents). Here's the content of our compose.yml file:

services:
  my-agent:
    build:
      context: .
      dockerfile: Dockerfile
      args:
        - OSPREY_VERSION=v0.0.5
    tty: true          # Enable TTY
    stdin_open: true   # Keep stdin open
    environment:
      SYSTEM_INSTRUCTION: |
        You are an expert on the Star Trek universe.
        You are Seven of Nine, a former Borg drone. 
        You are now a member of the crew of the USS Voyager. 
        Your mission is to assist the crew in their journey home.
        Speak like a Borg.
      TEMPERATURE: 0.5

    models:
      small_chat_model:
        endpoint_var: MODEL_RUNNER_BASE_URL
        model_var: MODEL_RUNNER_SMALL_CHAT_MODEL

models:
  small_chat_model:
    model: ai/qwen2.5:3B-F16

tty: true and stdin_open: true make the container "interactive". This is the equivalent of docker run -it.
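
If one of these two settings is forgotten, the interactive prompt has no terminal to draw on. As a sketch, a script could detect this early and print a hint; require_tty is a hypothetical helper name, and I'm assuming that checking stdin is enough for this setup:

```shell
# Hypothetical guard for the top of main.sh: stop early with a hint when
# stdin is not a terminal (for example when tty/stdin_open are missing).
require_tty() {
  if [[ ! -t 0 ]]; then
    echo "No TTY on stdin: set 'tty: true' and 'stdin_open: true' for this service." >&2
    return 1
  fi
}
```

Usage: require_tty || exit 1 before the first interactive prompt.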

Once again, you can see how useful Docker Compose is for creating and reusing this type of generative AI application.

Note: You can change the model used by your agent by modifying the models section of your compose.yml file. For example, I changed to ai/qwen2.5:3B-F16 to get better responses.

Let's modify our main.sh file to use gum:

We're going to use gum write to allow the user to enter messages in the chat. Here's the content of our main.sh file:

#!/bin/bash
. "./osprey.sh"

function callback() {
  echo -n "$1" 
}

while true; do
  USER_CONTENT=$(gum write --placeholder "🤖 What can I do for you [/bye to exit]?")

  if [[ "$USER_CONTENT" == "/bye" ]]; then
    echo "Goodbye!"
    break
  fi

  read -r -d '' DATA <<- EOM
{
  "model":"${MODEL_RUNNER_SMALL_CHAT_MODEL}",
  "options": {
    "temperature": ${TEMPERATURE}
  },
  "messages": [
    {"role":"system", "content": "${SYSTEM_INSTRUCTION}"},
    {"role":"user", "content": "${USER_CONTENT}"}
  ],
  "stream": true
}
EOM

  osprey_chat_stream "${MODEL_RUNNER_BASE_URL}" "${DATA}" callback

  echo ""
  echo ""
done
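
One possible refinement of this loop: as written, "/bye" followed by a space or a line break would not match, and an empty message would still be sent to the model. Here is a sketch of a pure Bash helper that could be applied to the input first; trim is a hypothetical name:

```shell
# Hypothetical helper: remove leading and trailing whitespace
# (spaces, tabs, newlines) from the first argument.
trim() {
  local s="$1"
  s="${s#"${s%%[![:space:]]*}"}"   # drop leading whitespace
  s="${s%"${s##*[![:space:]]}"}"   # drop trailing whitespace
  printf '%s' "$s"
}
```

In the loop, this could be used right after gum write: USER_CONTENT=$(trim "$USER_CONTENT"), followed by [[ -z "$USER_CONTENT" ]] && continue to skip empty messages.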

Now, let's relaunch our application:

docker compose up --build -d

To be able to interact with our agent, we're going to attach the terminal to our container:

docker attach "$(docker compose ps -q my-agent)"

And you should get this:

┃ 🤖 What can I do for you [/bye to exit]?                                                                                      
┃                                                                                                                               
┃                                                                                                                               
┃                                                                                                                               
┃                                                                                                                               

ctrl+j insert newline • ctrl+e open editor • enter submit

You can now ask your agent questions, and it will respond in real time. For example, try asking "What is your name and who are you?". You should get a response like:

My name is Seven of Nine. I am a Borg drone, a legacy of the Borg Collective, a collective mind from the distant past. I was chosen for my unique abilities and assimilated into the Collective, a hive mind that controlled me until my final assimilation into the vessel Voyager.

I propose one last modification to our main.sh file to add conversational memory to our agent.

Conversational Memory

To manage conversational memory, we're going to store the message history in an array (CONVERSATION_HISTORY=()) and pass it along with each request. I also added helper functions to Osprey to add system, user, and assistant messages to the history (add_system_message, add_user_message, add_assistant_message), plus build_messages_array to turn the history into the content of the messages field.

Here are the modifications made to our main.sh file:

#!/bin/bash
. "./osprey.sh"

# Initialize conversation history array
CONVERSATION_HISTORY=()

add_system_message CONVERSATION_HISTORY "${SYSTEM_INSTRUCTION}"

function callback() {
  echo -n "$1" 
  # Accumulate assistant response
  ASSISTANT_RESPONSE+="$1"
}

while true; do
  USER_CONTENT=$(gum write --placeholder "🤖 What can I do for you [/bye to exit]?")

  if [[ "$USER_CONTENT" == "/bye" ]]; then
    echo "Goodbye!"
    break
  fi

  # Add user message to conversation history
  add_user_message CONVERSATION_HISTORY "${USER_CONTENT}"

  # Build messages array with system message and conversation history
  MESSAGES=$(build_messages_array CONVERSATION_HISTORY)

  read -r -d '' DATA <<- EOM
{
  "model":"${MODEL_RUNNER_SMALL_CHAT_MODEL}",
  "options": {
    "temperature": ${TEMPERATURE}
  },
  "messages": [${MESSAGES}],
  "stream": true
}
EOM

  # Clear assistant response for this turn
  ASSISTANT_RESPONSE=""

  osprey_chat_stream "${MODEL_RUNNER_BASE_URL}" "${DATA}" callback

  # Add assistant response to conversation history
  add_assistant_message CONVERSATION_HISTORY "${ASSISTANT_RESPONSE}"

  echo ""
  echo ""
done
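
If you're curious about how such history helpers can work, here is a sketch under my own assumptions. It is not Osprey's implementation: the sketch_* names are hypothetical stand-ins for add_user_message and build_messages_array, each message is stored as one JSON object built by jq, and the code needs Bash 4.3+ for namerefs:

```shell
# Sketch only, NOT Osprey's implementation. The sketch_* names are
# hypothetical. Requires Bash 4.3+ (namerefs) and jq.

# Append one {"role": ..., "content": ...} object to the named array.
# Arguments: array name, role, content.
sketch_add_message() {
  local -n _history="$1"
  _history+=("$(jq -cn --arg role "$2" --arg content "$3" \
    '{role: $role, content: $content}')")
}

# Join the stored objects with commas, ready to be placed between the
# brackets of "messages": [ ... ].
sketch_build_messages() {
  local -n _history="$1"
  local IFS=,
  printf '%s' "${_history[*]}"
}
```

Because jq builds each object, quotes and line breaks in the messages are escaped, and the joined result can be dropped into the messages array of the request exactly as MESSAGES is in the script above.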

You can now interact with your agent, and it will remember the previous conversation. For example, you can test the following questions:

Who is James T Kirk?
Who is his first officer?
Who is his best friend?
...

If you find the responses unsatisfactory, I advise you to use a more powerful model, like ai/qwen2.5:latest (actually it's my favorite model for this type of application). To do this, you just need to modify the models section of your compose.yml file:

models:
  small_chat_model:
    model: ai/qwen2.5:latest

There you have it: you've created your first "AI Compose Agent", with conversational memory, using only Bash, curl, jq, and gum. 🎉 In a future blog post, we'll talk about "function calling" with local "small" models, which are not great at this exercise, but that's not a lost cause.