Ollama - Package Updates
-
[1.1.0]
- Update ollama to 0.13.0
- Full Changelog
- DeepSeek-OCR is now supported
- DeepSeek-V3.1 architecture is now supported in Ollama's engine
- Fixed performance issues that arose in Ollama 0.12.11 on CUDA
- Fixed issue where Linux install packages were missing required Vulkan libraries
- Improved CPU and memory detection while in containers/cgroups
- Improved VRAM information detection for AMD GPUs
- Improved KV cache performance to no longer require defragmentation
-
[1.1.1]
- Update ollama to 0.13.1
- Full Changelog
- nomic-embed-text will now use Ollama's engine by default
- Tool calling support for cogito-v2.1
- Fixed issues with CUDA VRAM discovery
- Fixed link to docs in Ollama's app
- Fixed issue where models would be evicted on CPU-only systems
- Ollama will now better render errors instead of showing Unmarshal: errors
- Fixed issue where older CUDA GPUs would fail to be detected
- Added thinking and tool parsing for cogito-v2.1
-
[1.1.2]
- Increase the proxy read timeout to 1h
-
[1.1.3]
- Disable body size check within the app
-
[1.1.4]
- Update ollama to 0.13.3
-
[1.1.5]
- Update ollama to 0.13.4
- Full Changelog
- Nemotron 3 Nano: A new Standard for Efficient, Open, and Intelligent Agentic Models
- Olmo 3 and Olmo 3.1: a series of open language models designed to enable the science of language models. These models are pre-trained on the Dolma 3 dataset and post-trained on the Dolci datasets.
- Enable Flash Attention automatically for models by default
- Fixed handling of long contexts with Gemma 3 models
- Fixed issue that would occur with Gemma 3 QAT models or other models imported with the Gemma 3 architecture
-
[1.1.6]
- Update ollama to 0.13.5
- Full Changelog
- Google's FunctionGemma is now available on Ollama
- bert architecture models now run on Ollama's engine
- Added built-in renderer & tool parsing capabilities for DeepSeek-V3.1
- Fixed issue where nested properties in tools may not have been rendered properly
-
[1.2.0]
- Update ollama to 0.14.0
- Full Changelog
- ollama run --experimental CLI will now open a new Ollama CLI that includes an agent loop and the bash tool
- Anthropic API compatibility: support for the /v1/messages API
- A new REQUIRES command for the Modelfile allows declaring which version of Ollama is required for the model
- For older models, Ollama will avoid an integer underflow on low VRAM systems during memory estimation
- More accurate VRAM measurements for AMD iGPUs
- Ollama's app will now highlight Swift source code
- An error will now be returned when embeddings contain NaN or -Inf
- Ollama's Linux install bundles now use zst compression
- New experimental support for image generation models, powered by MLX
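The Anthropic API compatibility follows the shape of Anthropic's Messages API. A minimal request-body sketch, sent as a POST to /v1/messages on the local Ollama server (the model name is an assumption; any locally pulled model should work):

```json
{
  "model": "llama3.2",
  "max_tokens": 256,
  "messages": [
    {"role": "user", "content": "Why is the sky blue?"}
  ]
}
```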
-
[1.2.1]
- Update ollama to 0.14.1
- Full Changelog
- Fixed macOS auto-update signature verification failure
-
[1.2.3]
- Update ollama to 0.14.3
- Full Changelog
- Z-Image Turbo: a 6 billion parameter text-to-image model from Alibaba's Tongyi Lab that generates high-quality photorealistic images.
- Flux.2 Klein: Black Forest Labs' fastest image-generation models to date.
- Fixed issue where Ollama's macOS app would interrupt system shutdown
- Fixed ollama create and ollama show commands for experimental models
- The /api/generate API can now be used for image generation
- Fixed minor issues in Nemotron-3-Nano tool parsing
- Fixed issue where removing an image generation model would cause it to first load
- Fixed issue where ollama rm would only stop the first model in the list if it was running
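Since /api/generate can now be used for image generation, a request can follow the endpoint's standard shape. A hedged sketch of a request body (the model tag is an assumption; substitute whichever image model is pulled locally):

```json
{
  "model": "z-image-turbo",
  "prompt": "a photorealistic red fox in snow"
}
```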