Run Ollama on a Pi5

Host Ollama and TinyDolphin LLM on a Pi5 with Docker Compose

This Sunday morning, I decided to check whether Ollama could run on a Pi5.

My Pi5 has 8GB of RAM and uses a SanDisk Extreme PRO 256GB microSDXC card (up to 200MB/s). I plan to do more experiments with an NVMe Base extension board + SSD in the future, but the SD card is enough for now.

I used a small LLM, TinyDolphin.

By the way, I did everything in "headless" mode (no screen or keyboard, SSH only).

Prerequisites (OS install, etc.)

I used the Raspberry Pi Imager to flash the OS onto the SD card, and I chose Raspberry Pi OS Lite 64-bit (no desktop).

For the rest of the article, my user is k33g, and my Pi is reachable on the network as hal.local (you can configure the hostname with the Raspberry Pi Imager tool). Replace these values with your own.

I decided to use Docker Compose to run Ollama and load the LLM, so the first step was to install Docker on the Pi:

# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

Then, to manage Docker as a non-root user, use these commands:

sudo groupadd docker
sudo usermod -aG docker $USER
newgrp docker

# Verify that you can run docker commands without sudo.
docker run hello-world

This setup comes from the Docker documentation: https://docs.docker.com/engine/install/debian/

Install Ollama with Docker Compose

I wrote a Compose file with three services:

  • The first service runs the Ollama platform: ollama-service

  • The second service loads the LLM: download-model

  • The third service is a demo image with Python scripts that interact with Ollama, to test the setup: python-environment

services:
  ollama-service:
    container_name: ollama_pi_local
    image: ollama/ollama:latest
    volumes:
      - ./ollama:/root/.ollama
    ports:
      - 11434:11434

  download-model:
    image: curlimages/curl:8.6.0
    entrypoint: ["curl", "ollama-service:11434/api/pull", "-d", '{"name": "tinydolphin"}']
    # You can use other models
    #entrypoint: ["curl", "ollama-service:11434/api/pull", "-d", '{"name": "phi"}']
    #entrypoint: ["curl", "ollama-service:11434/api/pull", "-d", '{"name": "orca-mini:7b"}']
    depends_on:
      ollama-service:
        condition: service_started

  # 🚧 This is a work in progress
  # more samples to come
  python-environment:
    profiles: [demo]
    build:
      context: ./.docker/langchain-python
      dockerfile: Dockerfile
    container_name: python-demo
    depends_on:
      ollama-service:
        condition: service_started
    environment:
      - OLLAMA_BASE_URL=http://ollama-service:11434
    volumes:
      - ./python-demo:/python-demo
    ports:
      - 8000:8000
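
The download-model service is simply a one-shot curl container that posts to Ollama's /api/pull endpoint. Once the stack is running (see below), you can call the same endpoint from your workstation to pull other models without editing the Compose file; a minimal sketch, assuming your Pi answers at hal.local as configured earlier:

```bash
# Same call as the download-model service, but from your workstation,
# pulling another model (here: phi)
curl http://hal.local:11434/api/pull -d '{"name": "phi"}'
```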

And this is the Dockerfile of the python-environment:

FROM langchain/langchain

WORKDIR /app

RUN <<EOF
apt-get update
apt-get install -y build-essential curl software-properties-common
rm -rf /var/lib/apt/lists/*
EOF

COPY requirements.txt .
COPY index.html .

RUN pip install --upgrade -r requirements.txt

ENTRYPOINT [ "python3", "-m", "http.server", "8000" ]

requirements.txt contains the list of the Python dependencies:

openai
ollama
langchain
langchain-openai
langchain-community
chromadb

I think that openai and langchain-openai are not mandatory; this is something I need to check for the next version of this little GenAI stack.

When the container starts, it launches an HTTP server; this keeps the container alive, and you can put any content into the index.html file.

✋ I created a GitHub repository for this Docker Compose project, with all the required files: https://github.com/bots-garden/pi-genai-stack. This makes the installation on the Pi easier.

So, to install and start it on the Pi:

# Connect to the pi
ssh k33g@hal.local
# Install the GenAI stack
git clone https://github.com/bots-garden/pi-genai-stack.git
# Start the stack
cd pi-genai-stack
docker compose --profile demo up
# ⏳ wait for a moment...

To stop the stack, use: docker compose down

✋ It's important to use the --profile flag with the demo value to start the Python demo container; otherwise, you will only start Ollama.
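
Once the stack is up, a quick way to verify that TinyDolphin was downloaded is to query Ollama's /api/tags endpoint, which lists the locally available models; a minimal check, assuming the hal.local hostname from my setup:

```bash
# List the models available on the Pi;
# tinydolphin should appear in the JSON response
curl http://hal.local:11434/api/tags
```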

Start Ollama when the Pi boots

Use these commands to make Ollama start at every reboot:

Edit /etc/rc.local:

ssh k33g@hal.local
sudo nano /etc/rc.local

Add this content before the last line (the one with exit 0):

cd /home/k33g/pi-genai-stack
su k33g -c 'docker compose up'
# do not remove `exit 0` at the end

Then, save the file, make it executable and reboot the Pi:

sudo chmod +x /etc/rc.local
sudo reboot
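
After the reboot, you can check that the stack came back up. A quick look with docker ps should at least show the ollama_pi_local container defined in the Compose file:

```bash
# Connect to the Pi and list the running containers with their status
ssh k33g@hal.local
docker ps --format "table {{.Names}}\t{{.Status}}"
```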

Play with Ollama and Tinydolphin

Once the Pi has restarted, connect to it and open an interactive shell in the python-demo container (if the demo container is not running, start the stack with docker compose --profile demo up -d first):

# Connect to the Pi
ssh k33g@hal.local
cd pi-genai-stack
docker exec --workdir /python-demo -it python-demo /bin/bash
ls -lh
# you should get: 
total 8.0K
-rw-r--r-- 1 1000 1000 480 Feb 12 05:33 1-give-me-a-dockerfile.py
-rw-r--r-- 1 1000  992 483 Feb 11 08:50 2-tell-me-more-about-docker-and-wasm.py

And now, you can run the Python scripts:

python3 1-give-me-a-dockerfile.py

Wait a few seconds, and you should get an answer like this one:

Sure! Here's a basic example of how you could do this in Dockerfile:

```Dockerfile
# syntax: dockerfile:1.13

# Start with a base image, we will use Node.js with YARN installed for simplicity
FROM node:alpine

# Install additional tools needed by the app
RUN apk add --update yarn \
    && rm -r /var/cache/yarn/*

# Install required packages
COPY package.json ./
RUN npm install

# Copy your code
COPY . /app

# Start app
CMD [ "npm", "start" ]
```
This Dockerfile will:
- Build a new image using Node.js with YARN installed. This is the recommended way of doing this, as it's faster and more robust than using NPM directly.
- Install all required tools (Node, YARN) and packages used by your application.
- Copy your code from source to Docker image. This allows you to maintain your code in a single Docker container.
- Run your app with `npm start`. This will start your app server listening on port 3000.

To build the image, you can use the command:
```bash
docker build -t myapp .
```
This will create a new Docker image with Node.js and YARN installed and your app code. You can then tag this image as desired using `docker tag`.

🎉 it works! And if it runs on a Pi, it should run on your machine. 😉
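
The demo scripts rely on the Python libraries listed in requirements.txt, but you can reproduce the same kind of request with a raw call to the Ollama API. This is roughly what happens under the hood; the prompt below is my guess at what 1-give-me-a-dockerfile.py sends, so adjust it as you like:

```bash
# Ask tinydolphin for a Dockerfile through the Ollama REST API;
# "stream": false returns a single JSON response instead of a token stream
curl http://hal.local:11434/api/generate -d '{
  "model": "tinydolphin",
  "prompt": "Give me a Dockerfile for a Node.js application",
  "stream": false
}'
```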

In a future blog post, I will explain how to use the Ollama API with JavaScript, and we will also take the opportunity to take our first steps with LangChain.