Docker Model Runner
Recent versions of Docker Desktop ship with the Model Runner capability, which lets you run various LLMs locally. The full list of supported models is available on Docker Hub.
You can interact with models much like regular containers: use docker model pull to download them and docker model run to run them. By default, docker model run starts the model in interactive mode.
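For example, you can download the Gemma 3 model ahead of time and check what is stored locally. This is a minimal sketch: docker model pull is the documented download command, while docker model list is assumed to be available in your Docker Desktop version.

$ docker model pull ai/gemma3   # download the model from Docker Hub
$ docker model list             # assumed subcommand: show models available locally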
Interactive mode
$ docker model run ai/gemma3
Interactive chat mode started. Type '/bye' to exit.
> Who are you?
Hi there! I'm Gemma, a large language model created by the Gemma team at Google DeepMind.
I'm an open-weights model, which means I'm widely available for public use!
I can take text and images as inputs and respond with text.
You can learn more about me on the Gemma website:
[https://ai.google.dev/gemma](https://ai.google.dev/gemma)
It's nice to meet you!
> 1+1*5+5
Following the order of operations (PEMDAS/BODMAS), we first perform the
multiplication and then the addition:
1 + 1 * 5 + 5 = 1 + 5 + 5 = 11
So the answer is $\boxed{11}$
>
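You do not have to stay in the chat loop. Passing a prompt as an extra argument is expected to produce a single response and exit; this is a hedged example assuming your docker model version supports the one-shot form:

$ docker model run ai/gemma3 "Give me a fact about whales."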
API access
Once the model is pulled locally, you can also access it via an OpenAI-compatible API:
$ docker desktop enable model-runner --tcp 12434 # Enable the model runner on port 12434
$ curl http://localhost:12434/engines/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
      "model": "ai/gemma3",
      "messages": [
        {
          "role": "user",
          "content": "Give me a fact about whales."
        }
      ]
    }'
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Here’s a fascinating fact about whales:\n\n**Humpback whales have the largest vocalizations of any animal – their songs can last for up to 30 minutes and travel hundreds of miles!** \n\nThey use these complex songs primarily for communication, especially during mating season. \n\n---\n\nWould you like to hear another fact about whales, or perhaps a fact about a different animal?"
      }
    }
  ],
  "created": 1745234508,
  "model": "ai/gemma3",
  "system_fingerprint": "b1-be7c303",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 82,
    "prompt_tokens": 16,
    "total_tokens": 98
  },
  "id": "chatcmpl-0rvxMLpt46ByLS070DcyxnPK0MnhwbqA",
  "timings": {
    "prompt_n": 16,
    "prompt_ms": 74.979,
    "prompt_per_token_ms": 4.6861875,
    "prompt_per_second": 213.3930833966844,
    "predicted_n": 82,
    "predicted_ms": 1294.787,
    "predicted_per_token_ms": 15.79008536585366,
    "predicted_per_second": 63.3308799053435
  }
}
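Because the API follows OpenAI conventions, the other standard endpoints should work as well. As a hedged example, assuming the models endpoint is exposed under the same /engines/v1 prefix, you can list the models the runner currently serves:

$ curl http://localhost:12434/engines/v1/models   # list models served by the Model Runner

From inside a container, the same API is typically reachable at http://model-runner.docker.internal/engines/v1/ instead of localhost.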