Generating JSON with an LLM: The Old Method and the New Method

How to get structured answers

Even with a baby LLM, it's possible to generate JSON. In this article, we'll explore how to generate JSON with an LLM using both the traditional method (with a detailed prompt) and the new method called "Structured Outputs".

Prerequisites

To understand the basics of using Ollama and baby LLMs, you should have read the previous article, Developing Generative AI Applications in Go with Ollama.

For this new article, we'll use a new LLM, granite3-moe:1b. It's slightly larger than "baby Qwen" (822 MB vs 398 MB) but remains a model that can run on a standard computer (without a GPU) and even on a Raspberry Pi 5 with 8 GB of RAM. So run the following command to download it:

ollama pull granite3-moe:1b
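
You can check that the model was downloaded correctly by listing the models installed locally:

ollama list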

"Baby Qwen" can be "capricious" when performing certain complex tasks, but "granite3-moe" is a bit more "disciplined."

Sometimes qwen2.5:0.5b "freezes" with the second JSON generation method. However, you can try its big brother, qwen2.5:1.5b, which behaves much better.

🤔 But what are these numbers followed by "b"?

LLM Parameters

The numbers followed by "b" (like 0.5b, 1b, 1.5b, 3b) represent the number of billions of parameters the model contains. Parameters are the numerical values the model has learned during its training, allowing it to generate responses.

For example:

  • qwen2.5:0.5b has 500 million parameters (0.5 billion)

  • qwen2.5:1.5b has 1.5 billion parameters

  • qwen2.5:3b has 3 billion parameters

  • granite3-moe:1b has 1 billion parameters

Generally, the more parameters a model has, the more capable it is of handling complex tasks and generating sophisticated responses. Depending on your needs, choose the best compromise between capability and the resources required.

We are now ready to generate JSON with "granite3-moe:1b".

First Method: The Old Way, with a Detailed Prompt

My use case is the following: when I give the model an animal name, I want it to return a JSON document with information about that animal.

For example, if I give the name "chicken" to the model, I want to get a JSON document that looks like this:

{
  "scientific_name": "Gallus gallus",
  "main_species": "Poultry",
  "average_length": "1.5 to 1.75 meters",
  "average_weight": "5 to 7 kilograms",
  "average_lifespan": "10 to 20 years",
  "countries": ["China", "Iran", "India", "Egypt", "Turkey"]
}

Note that the information generated by "granite3-moe:1b" generally doesn't correspond to reality. Remember, it's a baby LLM with a limited knowledge base. But that doesn't matter: it's the generation mechanism we're interested in here.
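
Looking ahead a little: to consume this answer in Go, a struct matching the expected JSON could look like the sketch below (the Animal type name is mine, added for illustration). Note that with this first method all the values come back as plain text, so every field is a string:

type Animal struct {
    ScientificName  string   `json:"scientific_name"`
    MainSpecies     string   `json:"main_species"`
    AverageLength   string   `json:"average_length"`
    AverageWeight   string   `json:"average_weight"`
    AverageLifespan string   `json:"average_lifespan"`
    Countries       []string `json:"countries"`
}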

The Detailed Prompt

When writing the prompt, imagine explaining to a 5-year-old all the steps/instructions to get information about their favourite animal.

Here's the prompt I used:

You are a helpful AI assistant. The user will enter the name of an animal.
The assistant will then return the following information about the animal:

- the scientific name of the animal (the name of json field is: scientific_name)
- the main species of the animal  (the name of json field is: main_species)
- the decimal average length of the animal (the name of json field is: average_length)
- the decimal average weight of the animal (the name of json field is: average_weight)
- the decimal average lifespan of the animal (the name of json field is: average_lifespan)
- the countries where the animal lives into json array of strings (the name of json field is: countries)

Output the results in JSON format and trim the spaces of the sentence.
Use the provided context to give the data

Complete Example Code

This code looks very similar to the previous article's code but with a more detailed prompt.

package main

import (
    "context"
    "encoding/json"
    "fmt"
    "log"
    "net/http"
    "net/url"
    "os"

    "github.com/ollama/ollama/api"
)

var (
    // api.ChatRequest.Stream expects a *bool, so we define addressable boolean values
    FALSE = false
    TRUE  = true
)

func main() {
    ctx := context.Background()

    var ollamaRawUrl string
    if ollamaRawUrl = os.Getenv("OLLAMA_HOST"); ollamaRawUrl == "" {
        ollamaRawUrl = "http://localhost:11434"
    }

    url, _ := url.Parse(ollamaRawUrl)

    client := api.NewClient(url, http.DefaultClient)

    systemInstructions := `You are a helpful AI assistant. The user will enter the name of an animal.
    The assistant will then return the following information about the animal:
    - the scientific name of the animal (the name of json field is: scientific_name)
    - the main species of the animal  (the name of json field is: main_species)
    - the decimal average length of the animal (the name of json field is: average_length)
    - the decimal average weight of the animal (the name of json field is: average_weight)
    - the decimal average lifespan of the animal (the name of json field is: average_lifespan)
    - the countries where the animal lives into json array of strings (the name of json field is: countries)
    Output the results in JSON format and trim the spaces of the sentence.
    Use the provided context to give the data`

    userContent := "chicken"

    // Prompt construction
    messages := []api.Message{
        {Role: "system", Content: systemInstructions},
        {Role: "user", Content: userContent},
    }

    req := &api.ChatRequest{
        Model:    "granite3-moe:1b",
        Messages: messages,
        Options: map[string]interface{}{
            "temperature":   0.0,
            "repeat_last_n": 2,
        },
        Stream: &FALSE,
        // "json" activates Ollama's JSON mode: the output will be valid JSON
        Format: json.RawMessage(`"json"`),
    }

    answer := ""
    err := client.Chat(ctx, req, func(resp api.ChatResponse) error {
        answer = resp.Message.Content
        return nil
    })

    if err != nil {
        log.Fatalln("😡", err)
    }
    fmt.Println(answer)
    fmt.Println()
}

What's important to note (in addition to the detailed instructions):

Message Construction:

messages := []api.Message{
    {Role: "system", Content: systemInstructions},
    {Role: "user", Content: userContent},
}

Two messages are created: a system message with instructions and a user message with the content "chicken".

Chat Request Configuration:

req := &api.ChatRequest{
    Model:    "granite3-moe:1b",
    Messages: messages,
    Options: map[string]interface{}{
        "temperature":   0.0,
        "repeat_last_n": 2,
    },
    Stream: &FALSE,
    Format: json.RawMessage(`"json"`),
}

This configuration is particularly important:

  • Temperature is set to 0.0 for deterministic responses

  • Streaming is disabled

  • Output format is specified as JSON

Of these three, the Format parameter is what pushes the model to produce JSON; the temperature and streaming settings make the response deterministic and easy to collect.

Running the Code

You can run the code with the following command:

go run main.go

And you should get a response similar to this:

{
  "scientific_name": "Gallus gallus",
  "main_species": "Poultry",
  "average_length": "1.5 to 1.75 meters",
  "average_weight": "5 to 7 kilograms",
  "average_lifespan": "10 to 20 years",
  "countries": ["China", "Iran", "India", "Egypt", "Turkey"]
}
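
A caveat: the "json" format pushes the model to produce syntactically valid JSON, but the structure of the answer (field names, types) is only suggested by the prompt, not enforced. Before using the answer in a program, it's therefore prudent to parse and check it. Here's a minimal sketch, reusing the answer variable from the listing above (the checks are mine):

var result map[string]any
if err := json.Unmarshal([]byte(answer), &result); err != nil {
    log.Fatalln("😡 the answer is not valid JSON:", err)
}
// the field names were only suggested by the prompt, so verify they exist
if _, ok := result["scientific_name"]; !ok {
    log.Fatalln("😡 missing field: scientific_name")
}
fmt.Println("scientific name:", result["scientific_name"])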

Let's now move on to the second method of generating JSON. Recently added to Ollama, it's more straightforward and more effective (in my opinion).

Second Method: Structured Outputs

On December 6, 2024, Ollama announced support for "Structured Outputs". This method lets you constrain the model's output to structured data that conforms to a JSON schema.

You can find the blog post here: Structured outputs

This new feature can be handy for extracting document data, structuring LLM responses, and more.

Let's first see how to define the expected JSON model for our animal use case.

The JSON Schema

Following the example from Ollama's blog post, here's how to define the JSON schema for our use case:

{
  "type": "object",
  "properties": {
    "scientific_name": {
      "type": "string"
    },
    "main_species": {
      "type": "string"
    },
    "average_length": {
      "type": "number"
    },
    "average_lifespan": {
      "type": "number"
    },
    "average_weight": {
      "type": "number"
    },
    "countries": {
      "type": "array",
      "items": {
        "type": "string"
      }
    }
  },
  "required": [
    "scientific_name",
    "main_species",
    "average_length",
    "average_lifespan",
    "average_weight",
    "countries"
  ]
}

We then pass this schema to the LLM through the Format property of the chat request.
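
By the way, since the Format field of the chat request is a json.RawMessage, you could also embed the schema as a raw JSON string literal instead of building it from Go maps as the example below does; both approaches produce the same bytes. A quick sketch of that alternative:

// the same schema, written as a raw JSON literal
rawSchema := json.RawMessage(`{
    "type": "object",
    "properties": {
        "scientific_name": {"type": "string"},
        "main_species": {"type": "string"},
        "average_length": {"type": "number"},
        "average_lifespan": {"type": "number"},
        "average_weight": {"type": "number"},
        "countries": {"type": "array", "items": {"type": "string"}}
    },
    "required": ["scientific_name", "main_species", "average_length", "average_lifespan", "average_weight", "countries"]
}`)
// then, in the request: Format: rawSchema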

Let's start with the complete example code.

Complete Example Code

package main

import (
    "context"
    "encoding/json"
    "fmt"
    "log"
    "net/http"
    "net/url"
    "os"

    "github.com/ollama/ollama/api"
)

var (
    FALSE = false
    TRUE  = true
)

func main() {
    ctx := context.Background()

    var ollamaRawUrl string
    if ollamaRawUrl = os.Getenv("OLLAMA_HOST"); ollamaRawUrl == "" {
        ollamaRawUrl = "http://localhost:11434"
    }

    url, _ := url.Parse(ollamaRawUrl)

    client := api.NewClient(url, http.DefaultClient)

    // define schema for a structured output
    // ref: https://ollama.com/blog/structured-outputs
    schema := map[string]any{
        "type": "object",
        "properties": map[string]any{
            "scientific_name": map[string]any{
                "type": "string",
            },
            "main_species": map[string]any{
                "type": "string",
            },
            "average_length": map[string]any{
                "type": "number",
            },
            "average_lifespan": map[string]any{
                "type": "number",
            },
            "average_weight": map[string]any{
                "type": "number",
            },
            "countries": map[string]any{
                "type": "array",
                "items": map[string]any{
                    "type": "string",
                },
            },
        },
        "required": []string{"scientific_name", "main_species", "average_length", "average_lifespan", "average_weight", "countries"},
    }

    jsonModel, err := json.Marshal(schema)
    if err != nil {
        log.Fatalln("😡", err)
    }

    userContent := "Tell me about chicken"

    // Prompt construction
    messages := []api.Message{
        {Role: "user", Content: userContent},
    }

    req := &api.ChatRequest{
        Model:    "granite3-moe:1b",
        Messages: messages,
        Options: map[string]interface{}{
            "temperature":   0.0,
            "repeat_last_n": 2,
        },
        Stream: &FALSE,
        // this time, Format carries the JSON schema that constrains the output
        Format: json.RawMessage(jsonModel),
    }

    answer := ""
    err = client.Chat(ctx, req, func(resp api.ChatResponse) error {
        answer = resp.Message.Content
        return nil
    })

    if err != nil {
        log.Fatalln("😡", err)
    }
    fmt.Println(answer)
    fmt.Println()
}

What's important to note:

JSON Schema Definition and Processing:

Schema Definition: The program defines a detailed schema that describes the expected output data structure. This schema specifies six required fields:

  • scientific_name (string): scientific name of the animal

  • main_species (string): main species

  • average_length (number): average length

  • average_lifespan (number): average lifespan

  • average_weight (number): average weight

  • countries (array of strings): countries where the animal lives

Schema Processing:

jsonModel, err := json.Marshal(schema)

The schema is marshaled to JSON so it can be passed in the request's Format field.

Chat Request Construction:

messages := []api.Message{
    {Role: "user", Content: userContent},
}

req := &api.ChatRequest{
    Model:    "granite3-moe:1b",
    Messages: messages,
    Options: map[string]interface{}{
        "temperature":   0.0,
        "repeat_last_n": 2,
    },
    Stream: &FALSE,
    Format: json.RawMessage(jsonModel),
}

The request is configured with:

  • Temperature of 0.0 for deterministic responses

  • Streaming disabled

  • JSON schema as output format

Running the Code

You can now run the code with the following command:

go run main.go

And you should get a response similar to this:

{
    "average_length": 1.5,
    "average_lifespan": 15,
    "average_weight": 10,
    "countries": ["China", "Russia", "United States", "Brazil", "Argentina"],
    "main_species": "Gallus gallus domesticus",
    "scientific_name": "Chicken"
}

You'll notice that the values obtained differ from those of the first method: we're never really sure what an LLM will generate. What matters here is that the response structure is respected.
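
Since the response now conforms to the schema, you can also unmarshal it directly into a typed struct; unlike with the first method, the numeric fields can be real float64 values. A minimal sketch (again, the Animal type name is mine):

type Animal struct {
    ScientificName  string   `json:"scientific_name"`
    MainSpecies     string   `json:"main_species"`
    AverageLength   float64  `json:"average_length"`
    AverageWeight   float64  `json:"average_weight"`
    AverageLifespan float64  `json:"average_lifespan"`
    Countries       []string `json:"countries"`
}

var animal Animal
if err := json.Unmarshal([]byte(answer), &animal); err != nil {
    log.Fatalln("😡", err)
}
fmt.Println("average lifespan:", animal.AverageLifespan, "years")

Nevertheless, let's see how we could help our "baby Granite3-moe" build its response from precise data (remember, I said that one use case was extracting data from a document).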

We'll give some context to our model (you can reread the last part of the previous article about context).

Let's Give Some Additional Information to Baby Granite3-moe

First, here's the complete example code:

Complete Example Code

It's very similar to the previous example, but with a few differences that I'll explain afterward:

package main

import (
    "context"
    "encoding/json"
    "fmt"
    "log"
    "net/http"
    "net/url"
    "os"

    "github.com/ollama/ollama/api"
)

var (
    FALSE = false
    TRUE  = true
)

func main() {
    ctx := context.Background()

    var ollamaRawUrl string
    if ollamaRawUrl = os.Getenv("OLLAMA_HOST"); ollamaRawUrl == "" {
        ollamaRawUrl = "http://localhost:11434"
    }

    url, _ := url.Parse(ollamaRawUrl)

    client := api.NewClient(url, http.DefaultClient)

    // define schema for a structured output
    // ref: https://ollama.com/blog/structured-outputs
    schema := map[string]any{
        "type": "object",
        "properties": map[string]any{
            "scientific_name": map[string]any{
                "type": "string",
            },
            "main_species": map[string]any{
                "type": "string",
            },
            "average_length": map[string]any{
                "type": "number",
            },
            "average_lifespan": map[string]any{
                "type": "number",
            },
            "average_weight": map[string]any{
                "type": "number",
            },
            "countries": map[string]any{
                "type": "array",
                "items": map[string]any{
                    "type": "string",
                },
            },
        },
        "required": []string{"scientific_name", "main_species", "average_length", "average_lifespan", "average_weight", "countries"},
    }

    jsonModel, err := json.Marshal(schema)
    if err != nil {
        log.Fatalln("😡", err)
    }

    data := `Information about the chicken:
    - scientific_name: Gallus gallus
    - main_species: Poultry
    - average_length: 1.5 to 1.75 meters
    - average_weight: 5 to 7 kilograms
    - average_lifespan: 10 to 20 years
    - countries: ["China", "Iran", "India", "Egypt", "Turkey"]
    `

    userContent := "Tell me about chicken"

    // Prompt construction
    messages := []api.Message{
        {Role: "system", Content: data},
        {Role: "user", Content: userContent},
    }

    req := &api.ChatRequest{
        Model:    "granite3-moe:1b",
        Messages: messages,
        Options: map[string]interface{}{
            "temperature":   0.0,
            "repeat_last_n": 2,
        },
        Stream: &FALSE,
        Format: json.RawMessage(jsonModel),
    }

    answer := ""
    err = client.Chat(ctx, req, func(resp api.ChatResponse) error {
        answer = resp.Message.Content
        return nil
    })

    if err != nil {
        log.Fatalln("😡", err)
    }
    fmt.Println(answer)
    fmt.Println()
}

What's important to note:

The main difference is the addition of contextual data to the prompt. This data is information about the animal that we provide to the model so it can generate an accurate response:

data := `Information about the chicken:
- scientific_name: Gallus gallus
- main_species: Poultry
- average_length: 1.5 to 1.75 meters
- average_weight: 5 to 7 kilograms
- average_lifespan: 10 to 20 years
- countries: ["China", "Iran", "India", "Egypt", "Turkey"]
`

userContent := "Tell me about chicken"

The messages will, therefore, be defined as follows:

messages := []api.Message{
    {Role: "system", Content: data},
    {Role: "user", Content: userContent},
}

And when you run the code, you should get a response similar to this:

{
    "average_length": 1.5,
    "average_lifespan": 10,
    "average_weight": 5,
    "countries": ["China", "Iran", "India", "Egypt", "Turkey"],
    "main_species": "Poultry",
    "scientific_name": "Gallus gallus"
}

You'll notice that the model respected the JSON schema we provided and converted the text data into numbers when necessary (for example, "average_length: 1.5 to 1.75 meters" became "average_length": 1.5).

Conclusion

And that's it for this article. I hope you enjoyed reading it and learned something new. This "Structured Outputs" feature is really powerful and can be used in many scenarios, such as extracting structured data from LLM responses to trigger actions, and more. My next articles will deal with RAG (Retrieval-Augmented Generation) and, of course, how to use Ollama with "tiny language models" to do RAG.

You can find the source code for this article on ollama-tlms-golang/01-json-output.

See you soon for the next posts.