Ollama - Package Updates
[0.2.0]
- Update ollama to 0.12.6
 - Full Changelog
 - Ollama's app now supports searching when running DeepSeek-V3.1, Qwen3 and other models that support tool calling.
 - Flash attention is now enabled by default for Gemma 3, improving performance and memory utilization
 - Fixed issue where Ollama would hang while generating responses
 - Fixed issue where `qwen3-coder` would act in raw mode when using `/api/generate` or `ollama run qwen3-coder <prompt>`
 - Fixed `qwen3-embedding` providing invalid results
 - Ollama will now evict models correctly when `num_gpu` is set
 - Fixed issue where `tool_index` with a value of `0` would not be sent to the model
 - Thinking models now support structured outputs when using the `/api/chat` API
 - Ollama's app will now wait until Ollama is running to allow a conversation to be started
 - Fixed issue where `"think": false` would show an error instead of being silently ignored
[0.3.0]
- Fix wrong documentation URL in package info
 
[0.3.1]
- Update ollama to 0.12.7
 - Full Changelog
 - Qwen3-VL is now available in all parameter sizes ranging from 2B to 235B
 - MiniMax-M2: a 230-billion-parameter model built for coding and agentic workflows, available on Ollama's cloud
 - Ollama's new app now includes a way to add one or many files when prompting the model:
 - For better responses, thinking levels can now be adjusted for the gpt-oss models:
 - New API documentation is available for Ollama's API: https://docs.ollama.com/api
 - Model load failures now include more information on Windows
 - Fixed embedding results being incorrect when running `embeddinggemma`
 - Fixed gemma3n on Vulkan backend
 - Increased time allocated for ROCm to discover devices
 - Fixed truncation error when generating embeddings
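The two embedding fixes above relate to the `/api/embed` endpoint, where `"truncate"` controls whether over-length inputs are clipped to the model's context window instead of raising an error. A minimal request sketch, assuming a local Ollama and reusing `embeddinggemma` from the notes above (the payload is only built here, not sent):

```python
import json

# Request body for POST http://localhost:11434/api/embed
# "input" may be a single string or a list; "truncate": true asks the
# server to clip inputs that exceed the model's context length rather
# than fail (the path the 0.12.7 truncation fix touches)
payload = {
    "model": "embeddinggemma",
    "input": ["first sentence to embed", "second sentence to embed"],
    "truncate": True,
}

body = json.dumps(payload)
print(body)
```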
 
[0.4.0]
- Update ollama to 0.12.9
 - Full Changelog
 - Fix performance regression on CPU-only systems