Ollama - Package Updates
[0.2.0]
- Update ollama to 0.12.6
 - Full Changelog
 - Ollama's app now supports searching when running DeepSeek-V3.1, Qwen3 and other models that support tool calling.
 - Flash attention is now enabled by default for Gemma 3, improving performance and memory utilization
 - Fixed issue where Ollama would hang while generating responses
 - Fixed issue where `qwen3-coder` would act in raw mode when using `/api/generate` or `ollama run qwen3-coder <prompt>`
 - Fixed `qwen3-embedding` providing invalid results
 - Ollama will now evict models correctly when `num_gpu` is set
 - Fixed issue where `tool_index` with a value of `0` would not be sent to the model
 - Thinking models now support structured outputs when using the `/api/chat` API
 - Ollama's app will now wait until Ollama is running to allow a conversation to be started
 - Fixed issue where `"think": false` would show an error instead of being silently ignored
[0.3.0]
- Fix wrong documentation URL in package info
 
[0.3.1]
- Update ollama to 0.12.7
 - Full Changelog
 - Qwen3-VL is now available in all parameter sizes ranging from 2B to 235B
 - MiniMax-M2: a 230-billion-parameter model built for coding and agentic workflows, available on Ollama's cloud
 - Ollama's new app now includes a way to add one or many files when prompting the model:
 - For better responses, thinking levels can now be adjusted for the gpt-oss models:
 - New API documentation is available for Ollama's API: https://docs.ollama.com/api
 - Model load failures now include more information on Windows
 - Fixed embedding results being incorrect when running `embeddinggemma`
 - Fixed gemma3n on Vulkan backend
 - Increased time allocated for ROCm to discover devices
 - Fixed truncation error when generating embeddings
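The two embedding fixes above relate to the `/api/embed` endpoint, where `"truncate"` controls whether over-length inputs are clipped to the model's context window instead of raising an error. A minimal request sketch, assuming a local Ollama and reusing `embeddinggemma` from the notes above (the payload is only built here, not sent):

```python
import json

# Request body for POST http://localhost:11434/api/embed
# "input" may be a single string or a list; "truncate": true asks the
# server to clip inputs that exceed the model's context length rather
# than fail (the path the 0.12.7 truncation fix touches)
payload = {
    "model": "embeddinggemma",
    "input": ["first sentence to embed", "second sentence to embed"],
    "truncate": True,
}

body = json.dumps(payload)
print(body)
```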
 
[0.4.0]
- Update ollama to 0.12.9
 - Full Changelog
 - Fix performance regression on CPU-only systems