Ollama - Package Updates
[1.0.2]
- Update ollama to 0.12.11
- Full Changelog
- Ollama's API and the OpenAI-compatible API now support logprobs
- Ollama's new app now supports WebP images
- Improved rendering performance in Ollama's new app, especially when rendering code
- The "required" field in tool definitions will now be omitted if not specified
- Fixed issue where "tool_call_id" would be omitted when using the OpenAI-compatible API.
- Fixed issue where `ollama create` would import data from both `consolidated.safetensors` and other safetensor files
- Ollama will now prefer dedicated GPUs over iGPUs when scheduling models
- Vulkan can now be enabled by setting `OLLAMA_VULKAN=1`. For example: `OLLAMA_VULKAN=1 ollama serve`
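The logprobs entry above follows the OpenAI Chat Completions schema. As a minimal sketch (assuming a local server at the default `http://localhost:11434/v1/chat/completions`; the model name `llama3.2` is a placeholder, and whether Ollama honors `top_logprobs` in addition to `logprobs` is an assumption):

```python
import json

# Sketch of a Chat Completions request against Ollama's OpenAI-compatible
# endpoint. "llama3.2" stands in for any locally pulled model.
payload = {
    "model": "llama3.2",
    "messages": [{"role": "user", "content": "Say hi"}],
    "logprobs": True,    # request per-token log probabilities
    "top_logprobs": 3,   # most likely alternatives per token (assumed supported)
}

body = json.dumps(payload)
# With a running server, this body could be POSTed, e.g.:
#   curl http://localhost:11434/v1/chat/completions -d "$body"
print(body)
```

The response should then carry a `logprobs` object per choice, mirroring the OpenAI response shape.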
[1.1.0]
- Update ollama to 0.13.0
- Full Changelog
- DeepSeek-OCR is now supported
- DeepSeek-V3.1 architecture is now supported in Ollama's engine
- Fixed performance issues that arose in Ollama 0.12.11 on CUDA
- Fixed issue where Linux install packages were missing required Vulkan libraries
- Improved CPU and memory detection while in containers/cgroups
- Improved VRAM information detection for AMD GPUs
- Improved KV cache performance to no longer require defragmentation
[1.1.1]
- Update ollama to 0.13.1
- Full Changelog
- nomic-embed-text will now use Ollama's engine by default
- Tool calling support for `cogito-v2.1`
- Fixed issues with CUDA VRAM discovery
- Fixed link to docs in Ollama's app
- Fixed issue where models would be evicted on CPU-only systems
- Ollama will now render errors properly instead of showing `Unmarshal:` errors
- Fixed issue where CUDA detection would fail with older GPUs
- Added thinking and tool parsing for cogito-v2.1
[1.1.2]
- Increase the proxy read timeout to 1h
[1.1.3]
- Disable body size check within the app
[1.1.4]
- Update ollama to 0.13.3
[1.1.5]
- Update ollama to 0.13.4
- Full Changelog
- Nemotron 3 Nano: A new standard for efficient, open, and intelligent agentic models
- Olmo 3 and Olmo 3.1: A series of open language models designed to enable the science of language models. These models are pre-trained on the Dolma 3 dataset and post-trained on the Dolci datasets.
- Enable Flash Attention automatically for models by default
- Fixed handling of long contexts with Gemma 3 models
- Fixed issue that would occur with Gemma 3 QAT models or other models imported with the Gemma 3 architecture
[1.1.6]
- Update ollama to 0.13.5
- Full Changelog
- Google's FunctionGemma is now available on Ollama
- `bert` architecture models now run on Ollama's engine
- Added built-in renderer & tool parsing capabilities for DeepSeek-V3.1
- Fixed issue where nested properties in tools may not have been rendered properly
[1.2.0]
- Update ollama to 0.14.0
- Full Changelog
- `ollama run --experimental` will now open a new Ollama CLI that includes an agent loop and the bash tool
- Anthropic API compatibility: support for the `/v1/messages` API
- A new `REQUIRES` command for the Modelfile allows declaring which version of Ollama is required for the model
- For older models, Ollama will avoid an integer underflow on low VRAM systems during memory estimation
- More accurate VRAM measurements for AMD iGPUs
- Ollama's app will now highlight Swift source code
- An error will now return when embeddings return NaN or -Inf
- Ollama's Linux install bundles files now use zst compression
- New experimental support for image generation models, powered by MLX
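The `/v1/messages` support listed above accepts the standard Anthropic Messages API shape. As a minimal sketch (the model name `llama3.2` and the `max_tokens` value are placeholder assumptions; the endpoint would be the local server, e.g. `http://localhost:11434/v1/messages`):

```python
import json

# Sketch of an Anthropic-style Messages request against Ollama's
# compatibility endpoint. Model name and token limit are placeholders.
payload = {
    "model": "llama3.2",
    "max_tokens": 256,  # the Messages API requires an explicit token limit
    "messages": [{"role": "user", "content": "Hello"}],
}

body = json.dumps(payload)
# With a running server, this body could be POSTed, e.g.:
#   curl http://localhost:11434/v1/messages -d "$body"
print(body)
```

This lets clients built against Anthropic's SDKs point their base URL at a local Ollama instance instead.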
[1.2.1]
- Update ollama to 0.14.1
- Full Changelog
- Fixed macOS auto-update signature verification failure
[1.2.3]
- Update ollama to 0.14.3
- Full Changelog
- Z-Image Turbo: 6 billion parameter text-to-image model from Alibaba's Tongyi Lab. It generates high-quality photorealistic images.
- Flux.2 Klein: Black Forest Labs' fastest image-generation models to date.
- Fixed issue where Ollama's macOS app would interrupt system shutdown
- Fixed `ollama create` and `ollama show` commands for experimental models
- The `/api/generate` API can now be used for image generation
- Fixed minor issues in Nemotron-3-Nano tool parsing
- Fixed issue where removing an image generation model would cause it to first load
- Fixed issue where `ollama rm` would only stop the first model in the list if it were running
[1.3.0]
- Update ollama to 0.15.0
- Full Changelog
- A new `ollama launch` command to use Ollama's models with Claude Code, Codex, OpenCode, and Droid without separate configuration
- Fixed issue where creating multi-line strings with `"""` would not work when using `ollama run`
- <kbd>Ctrl</kbd>+<kbd>J</kbd> and <kbd>Shift</kbd>+<kbd>Enter</kbd> now work for inserting newlines in `ollama run`
- Reduced memory usage for GLM-4.7-Flash models
[1.3.1]
- Update ollama to 0.15.1
- Full Changelog
- GLM-4.7-Flash performance and correctness improvements, fixing repetitive answers and tool calling quality
- Fixed performance issues on macOS and arm64 Linux
- Fixed issue where `ollama launch` would not detect `claude` and would incorrectly update `opencode` configurations
[1.3.2]
- Update ollama to 0.15.2
- Full Changelog
- New `ollama launch clawdbot` command for launching Clawdbot using Ollama models
[1.3.3]
- Update ollama to 0.15.4
- Full Changelog
- `ollama launch openclaw` will now enter the standard OpenClaw onboarding flow if it has not yet been completed
- Renamed `ollama launch clawdbot` to `ollama launch openclaw` to reflect the project's new name
- Improved tool calling for Ministral models
- `ollama launch` will now use the value of `OLLAMA_HOST` when running
[1.3.4]
- Update ollama to 0.15.5
- Full Changelog
- Improvements to `ollama launch`
- Sub-agent support for `ollama launch` for planning, deep research, and similar tasks
- `ollama signin` will now open a browser window to make signing in easier
- Ollama will now default to the following context lengths based on VRAM:
- GLM-4.7-Flash support on Ollama's experimental MLX engine
- Fixed off-by-one error when using `num_predict` in the API
- Fixed issue where tokens from a previous sequence would be returned when hitting `num_predict`
[1.3.5]
- Update ollama to 0.15.6
- Full Changelog
- Fixed context limits when running `ollama launch droid`
- `ollama launch` will now download missing models instead of erroring
- Fixed bug where `ollama launch claude` would cause context compaction when providing images
[1.4.0]
- Update ollama to 0.16.1
- Full Changelog
- Installing Ollama via the `curl` install script on macOS will now only prompt for your password if it's required
- Installing Ollama via the `iex` install script on Windows will now show progress
- Image generation models will now respect the `OLLAMA_LOAD_TIMEOUT` variable
- GLM-5: A strong reasoning and agentic model from Z.ai with 744B total parameters (40B active), built for complex systems engineering and long-horizon tasks.
- MiniMax-M2.5: a new state-of-the-art large language model designed for real-world productivity and coding tasks.
- The new `ollama launch` command makes it easy to launch your favorite apps with models using Ollama
- Launch Pi with `ollama launch pi`
- Improvements to Ollama's MLX runner to support GLM-4.7-Flash
- <kbd>Ctrl</kbd>+<kbd>G</kbd> will now allow editing prompts in a text editor when running a model
[1.4.1]
- Update ollama to 0.16.2
- Full Changelog
- `ollama launch claude` now supports searching the web when using `:cloud` models
- Fixed rendering issue when running `ollama` in PowerShell
- New setting in Ollama's app makes it easier to disable cloud models for sensitive and private tasks where data cannot leave your computer. For Linux or when running `ollama serve` manually, set `OLLAMA_NO_CLOUD=1`.
[1.4.2]
- Update ollama to 0.16.3
- Full Changelog
- New `ollama launch cline` added for the Cline CLI
- `ollama launch <integration>` will now always show the model picker
- Added Gemma 3, Llama, and Qwen 3 architectures to the MLX runner