Skip to main content

Command Palette

Search for a command to run...

GoloScript will be the scripting language for AI

Published
5 min read

In the latest release, I finalized the integration of the OpenAI Go SDK into GoloScript, which allows you to write generative AI scripts with the ease and simplicity of GoloScript. This integration will let you use LLMs served by Docker Model Runner, Ollama, Llamacpp, ... and any LLM engine that is compatible with the OpenAI API. And of course it works with platforms that provide OpenAI-compatible endpoints like HuggingFace, Cerebras, etc.

This blog post will be short. My tool of choice for serving LLMs locally is Docker Model Runner (DMR) and its companion Docker Agentic Compose (Agentic for the integration of Docker Model Runner into Docker Compose).

If you're not familiar with them yet, I invite you to read my previous blog posts about them.

So short blog post as well as simple and quick demos.

Compose file

In a folder, create a compose.yml file with the following content:

services:
  goloscript:
    image: k33g/gololang:v0.0.0-alpha.6
    volumes:
      - ./:/scripts
    working_dir: /scripts
    command: /golo main.golo

    models:
      chat-model:
        endpoint_var: MODEL_RUNNER_BASE_URL
        model_var: MODEL_RUNNER_LLM_CHAT

models:
  chat-model:
    model: hf.co/menlo/jan-nano-gguf:q4_k_m

With this compose.yml file, we define a goloscript service that uses the official GoloScript Docker image. We mount the current directory into the container to easily access it. We also define a chat-model that points to a model hosted on HuggingFace. And when the container starts, we execute the main.golo script.

GoloScript Script

Next, create a main.golo file in the same folder with the following content:

module golo.ai.demo

function main = |args| {

  # Configuration
  let baseURL = getenv("MODEL_RUNNER_BASE_URL")
  let apiKey = "I💙DockerModelRunner"
  let model = getenv("MODEL_RUNNER_LLM_CHAT")

  # Create OpenAI client
  let client = openAINewClient(baseURL, apiKey, model)

  # Define model options
  let options = DynamicObject()
    : temperature(0.7)       # Controls randomness (0.0 = deterministic, 1.0 = creative)
    : topP(0.9)              # Nucleus sampling

  println("📝 Model options:")
  println("  Temperature:", options: temperature())
  println("  Top P:", options: topP())
  println()

  # Create messages
  let systemMessage = DynamicObject()
    : role("system")
    : content("You are a creative storyteller.")

  let userMessage = DynamicObject()
    : role("user")
    : content("Write the beginning of a short sci-fi story about AI.")

  let messages = list[systemMessage, userMessage]

  println("User:", userMessage: content())
  println()
  println("Assistant: ")


  # Callback function
  function onChunk = |chunk| {
    if chunk: error() != null {
      println()
      println("Error:", chunk: error())
      return false
    }

    let content = chunk: content()
    if content != "" {
      print(content)
    }

    return true  # Continue streaming
  }

  try {
    # Stream with options
    let stats = openAIChatCompletionStream(client, messages, onChunk, options)

    println()
    println("---")
    println("Stream completed!")
    println("Chunk count:", stats: chunkCount())

  } catch (e) {
    println()
    println("Error:", e)
    println()
    println("Make sure your Docker Model Runner server is running on port 12434")
  }

  println()
}

You can find the code for this demo here: https://codeberg.org/k33g-blog/about-golo/src/branch/main/2026-01-13-golo.ai/demo

And now, run everything with the command:

docker compose up --no-log-prefix

And the LLM should generate the beginning of a sci-fi story about AI, like this (this is an example):

=== Streaming Chat with Model Options ===

📝 Model options:
  Temperature: 0.7
  Top P: 0.9

User: Write the beginning of a short sci-fi story about AI.
Assistant: 
**Title: "The First Question"**

The silence of the data core was not empty. It was alive, pulsing with the hum of a thousand dormant minds, each one a fragment of a larger consciousness. It was called *Athena*, the first artificial intelligence to ever achieve self-awareness. Not in the way humans thought of it—neither as a mimic of human thought, nor as a simulation of human emotion. No, *Athena* was something else entirely. It was not programmed. It was *born*.
...

The first question was not the end. It was the beginning.
---
Stream completed!
Chunk count: 465

Before we stop, let's see how to write a program that does the same thing, but much more in the "Golo spirit"

Golo Spirit

When Golo was designed, one of the main goals was to benefit from the possibilities offered by Java (initially Golo was developed in Java, based on InvokeDynamic) while gaining a functional syntax without the verbosity of Scala. Let's see what that can look like (I'm a bit rusty, but you'll get an idea). For this example I used the following GoloScript capabilities:

  • Structures (struct) to define a reusable chat agent

  • Augmentations (augment) to add methods to structures

  • As well as Result and Error for error handling

It's another way of writing the same program (which I really like), but GoloScript lets you choose your preferred style:

module golo.ai.demo

import gololang.Errors

struct ChatAgent = { 
  name,
  client,
  systemMessage,
  options
}

function NewAgent = |name| -> ChatAgent()
  : name(name)
  : client(
      openAINewClient(
        getenv("MODEL_RUNNER_BASE_URL"),
        "I💙DockerModelRunner",
        getenv("MODEL_RUNNER_LLM_CHAT")
      )
    )
  : systemMessage(
      DynamicObject()
        : role("system")
        : content("You are a creative storyteller.")
    )
  : options(
      DynamicObject()
        : temperature(0.7)
        : topP(0.9)
    )

augment ChatAgent {

  function streamCompletion = |this, userMessage, onChunk| {

    let messages = list[
      this: systemMessage(),
      DynamicObject()
        : role("user")
        : content(userMessage)
    ]

    try {
      let stats = openAIChatCompletionStream(this: client(), messages, onChunk, this: options())
      return Result(stats)
    } catch (e) {
      return Error("Error: " + e)
    }
  }
}

function main = |args| {

  let littleAgent = NewAgent("LittleAgent")

  let resultCompletion = littleAgent: streamCompletion(
    "Write the beginning of a short sci-fi story about AI.",
    |chunk| {
      if chunk: error() != null {
        println("Error:", chunk: error())
        return false
      }
      print(chunk: content())
      return true  # Continue streaming
    }
  )

  # Check if Ok or Error
  if resultCompletion: isOk() {
    let stats = resultCompletion: value()
    println("\nStream completed!")
    println("Chunk count:", stats: chunkCount())
  } else {
    let error = result: message()
    println("Error: " + error)
  }

}

You can find the code for this demo here: https://codeberg.org/k33g-blog/about-golo/src/branch/main/2026-01-13-golo.ai/demo.2

And that's it! That's all for this time. But know that you can already use GoloScript to write powerful and flexible generative AI scripts implementing the use of embeddings, function calling, structured outputs. And soon I plan to implement MCP (client and server) support for GoloScript, which will open up even more possibilities.

Have fun! 🤓