Ollama - Package Updates
[1.4.0]
- Update ollama to 0.16.1
- Full Changelog
- Installing Ollama via the `curl` install script on macOS will now only prompt for your password if it is required
- Installing Ollama via the `iem` install script on Windows will now show progress
- Image generation models will now respect the `OLLAMA_LOAD_TIMEOUT` variable
- GLM-5: a strong reasoning and agentic model from Z.ai with 744B total parameters (40B active), built for complex systems engineering and long-horizon tasks.
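Applying the timeout is plain environment configuration; a minimal sketch for a manually started server, assuming a POSIX shell (the `15m` value is an arbitrary example; Ollama reads the variable as a Go-style duration, defaulting to 5 minutes):

```shell
# Config fragment: give slow-loading image generation models more time
# before the server gives up loading them (duration string, default 5m).
export OLLAMA_LOAD_TIMEOUT=15m
# then start the server in this environment, e.g.: ollama serve
```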
- MiniMax-M2.5: a new state-of-the-art large language model designed for real-world productivity and coding tasks.
- The new `ollama` command makes it easy to launch your favorite apps with models using Ollama
- Launch Pi with `ollama launch pi`
- Improvements to Ollama's MLX runner to support GLM-4.7-Flash
- Ctrl+G will now open the prompt in a text editor for editing when running a model
[1.4.1]
- Update ollama to 0.16.2
- Full Changelog
- `ollama launch claude` now supports searching the web when using `:cloud` models
- Fixed rendering issue when running `ollama` in PowerShell
- A new setting in Ollama's app makes it easier to disable cloud models for sensitive and private tasks where data cannot leave your computer. On Linux, or when running `ollama serve` manually, set `OLLAMA_NO_CLOUD=1`.
- Fixed issue where experimental image generation models would not run in 0.16.0 and 0.16.1
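On Linux the same kill switch is set in the server's environment before it starts; a minimal sketch assuming a POSIX shell:

```shell
# Config fragment: hard-disable cloud models so no request can leave
# the machine, for a manually started server.
export OLLAMA_NO_CLOUD=1
# then start the server in this environment, e.g.: ollama serve
```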
[1.4.2]
- Update ollama to 0.16.3
- Full Changelog
- New `ollama launch cline` added for the Cline CLI
- `ollama launch <integration>` will now always show the model picker
- Added Gemma 3, Llama and Qwen 3 architectures to the MLX runner
[1.5.0]
- Update ollama to 0.17.0
- Full Changelog
- OpenClaw can now be installed and configured automatically via Ollama, making it the easiest way to get up and running with OpenClaw using open models like Kimi-K2.5, GLM-5, and MiniMax-M2.5.
- When using cloud models, web search is enabled, allowing OpenClaw to search the internet.
- Improved tokenizer performance
- Ollama's macOS and Windows apps will now default to a context length based on available VRAM
[1.5.1]
- Update ollama to 0.17.4
- Full Changelog
- Tool call indices will now be included in parallel tool calls
- Fixed issue where tool calls in the Qwen 3 and Qwen 3.5 model families would not be parsed correctly if emitted during thinking
- Fixed issue where Ollama's app on Windows would crash after a new update had been downloaded
- Added Nemotron architecture support in Ollama's engine
- MLX engine now has improved memory usage
- Ollama's app will now allow models that support tools to use web search capabilities
- Improved LFM2 and LFM2.5 models in Ollama's engine
- `ollama create` will no longer default to affine quantization for unquantized models when using the MLX engine
- Added a configuration option for disabling automatic update downloads
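If you do want a quantized import under MLX, the type can be requested explicitly rather than relying on a default; a sketch using `ollama create`'s `--quantize` flag (the model name, Modelfile path, and `q4_K_M` type are placeholders):

```shell
# Request a quantization type explicitly when importing an unquantized model
# (run in a directory containing a Modelfile; names here are examples only).
ollama create my-model -f Modelfile --quantize q4_K_M
```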
[1.5.2]
- Update ollama to 0.17.5
- Full Changelog
- Qwen3.5: the small Qwen 3.5 model series is now available in 0.8B, 2B, 4B and 9B parameter sizes.
- Fixed crash in Qwen 3.5 models when split over GPU & CPU
- Fixed issue where Qwen 3.5 models would repeat themselves due to a missing presence penalty (note: you may have to re-download the `qwen3.5` models: `ollama pull qwen3.5:35b`, for example)
- `ollama run --verbose` will now show peak memory usage when using Ollama's MLX engine
- Fixed memory issues and crashes in the MLX runner
- Fixed issue where Ollama would not be able to run models imported from Qwen3.5 GGUF files
[1.5.3]
- Update ollama to 0.17.6
- Full Changelog
- Fixed issue where GLM-OCR would not work due to incorrect prompt rendering
- Fixed tool calling parsing and rendering for Qwen 3.5 models
[1.5.4]
- Update ollama to 0.17.7
- Full Changelog
- Allow thinking levels such as `"medium"` to be correctly interpreted in Ollama's API for all thinking models
- Add context length to support compaction when using `ollama launch`
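A thinking level is passed as a string in the request body; a minimal sketch of such a payload, assuming the `think` field of Ollama's `/api/generate` endpoint (the model name and prompt are placeholders):

```python
import json

# Hypothetical /api/generate request body: per the note above, "think"
# accepts a level string such as "medium" in addition to a boolean.
payload = {
    "model": "qwen3.5",                # placeholder model name
    "prompt": "Why is the sky blue?",
    "think": "medium",                 # thinking level, not just true/false
    "stream": False,
}
body = json.dumps(payload)
print(body)
```

To try it, POST the body to `http://localhost:11434/api/generate` against a running server.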
[1.6.1]
- Update ollama to 0.18.1
- Full Changelog
- Web Search and Fetch in OpenClaw
- Ollama now ships with a web search and web fetch plugin for OpenClaw. This allows Ollama's models (local or cloud) to search the web for the latest content and news, and lets OpenClaw with Ollama fetch web pages and extract readable content for processing. This feature does not execute JavaScript.
- When using local models with web search in OpenClaw, ensure you are signed in to Ollama with `ollama signin`
- You can install web search directly into OpenClaw as a plugin if you already have OpenClaw configured and working.
[1.6.2]
- Update ollama to 0.18.2
- Full Changelog
- Add extra check to ensure npm and git are installed before installing OpenClaw
- Claude Code will now be faster when run locally, as cache breakages are now prevented
- Fix to correctly support `ollama launch openclaw --model <model>`
- Register Ollama's `websearch` package correctly for OpenClaw