Should ollama be part of this app package?
-
Local ollama is now integrated. You have to reinstall the app though.
Keep your expectations in check. It probably won't work great if you don't have a good CPU and we have no GPU integration yet. It's very slow with low end CPUs. I am not an expert on the RAM/CPU/GPU requirements. Feel free to experiment.
-
@LoudLemur said in Should ollama be part of this app package?:
One further suggestion: I think it should ship with a small but functioning model and prompt.
Now with the local ollama integration, you can download whichever models you want from the UI itself. The models are quite big so pre-installing them is not an option.
-
@Kubernetes I am not sure if this is what you are asking, but I am currently running Ollama separately via Docker with a dedicated old GPU (8GB NVIDIA) on my NAS (working shockingly well on 7B/11B GGUFs and moderately well on 13B ones), with Cloudron on a VM on the same NAS. I use Ollama as an external instance (technically it's still local on the machine's hardware, but it is configured as though it is not) and have deactivated this app's localhost Ollama. This can be done by opening the Open-WebUI app's File Manager in Cloudron through the app settings and editing "env.sh":
# Change this to false to disable local ollama and use your own
export LOCAL_OLLAMA_ENABLED=false
# When using remote ollama, change this to the ollama's base url
export OLLAMA_API_BASE_URL="http://changethis:11434"
# When local ollama is enabled, this is location for the downloaded models.
# If the path is under /app/data, models will be backed up. Note that models
# can be very large. To skip backup of models, move the models to a volume (https://docs.cloudron.io/volumes/)
# export OLLAMA_MODELS=/app/data/ollama-home/models
Is this what you are referring to?
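Before pointing the app at a remote Ollama, it can help to confirm the server actually answers at the base URL you plan to put in env.sh. A minimal sketch, assuming curl is available and that the remote instance exposes Ollama's standard /api/tags endpoint (which lists installed models); the function names ollama_tags_url and check_ollama are made up for this example:

```shell
#!/bin/sh
# Hypothetical helpers, not part of the Cloudron package.

# Build the URL of Ollama's model-listing endpoint from a base URL.
# One trailing slash is stripped so "http://host:11434/" and
# "http://host:11434" give the same result.
ollama_tags_url() {
    base="${1%/}"
    printf '%s/api/tags\n' "$base"
}

# Return 0 if the server at the given base URL answers within 5 seconds,
# non-zero otherwise. The response body is discarded.
check_ollama() {
    curl -fsS --max-time 5 "$(ollama_tags_url "$1")" >/dev/null
}
```

For example, running check_ollama "http://changethis:11434" && echo reachable (with your real host in place of changethis) from a machine on the same network should print "reachable" if the remote Ollama is up and listening on that address.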
-
@coniunctio Yes, exactly this was what I was referring to. Thank you for bringing this example up.
-
@girish said in Should ollama be part of this app package?:
Local ollama is now integrated. You have to reinstall the app though.
Keep your expectations in check. It probably won't work great if you don't have a good CPU and we have no GPU integration yet. It's very slow with low end CPUs. I am not an expert on the RAM/CPU/GPU requirements. Feel free to experiment.
I have been using the workaround of disabling local Ollama in the Cloudron app, running a separate (external) Docker container installation of Ollama with a dedicated GPU on the same hardware, and then linking that instance of Ollama to the Cloudron instance of Open-WebUI. Somehow, this configuration is faster on a NAS purchased in 2018 with an add-on NVIDIA 8GB GPU than on my more recently purchased M1 MacBook Pro with 16GB RAM and integrated GPU. The additional bonus of running the Cloudron Open-WebUI instead of the localhost version on my Apple silicon MBP is that I can use my local LLMs on my mobile devices in transit when my laptop is shut down.
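For anyone wanting to try a similar setup, here is a rough sketch of how a GPU-backed Ollama container is commonly started. The exact command used in the post above is not given, so this follows the invocation from Ollama's own Docker instructions and assumes the NVIDIA Container Toolkit is already installed on the host. The helper name ollama_docker_cmd is hypothetical; it only prints the command so you can review it before running it:

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a docker command that starts
# Ollama with all GPUs attached. The optional first argument is the host
# port to publish; it defaults to Ollama's usual port 11434.
ollama_docker_cmd() {
    port="${1:-11434}"
    printf 'docker run -d --gpus=all -v ollama:/root/.ollama -p %s:11434 --name ollama ollama/ollama\n' "$port"
}
```

Calling ollama_docker_cmd prints the command for the default port. Once the container is running, the host's address and published port are what would go into OLLAMA_API_BASE_URL in env.sh.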
-
Personally, I disabled the local Ollama because my Cloudron doesn't have a GPU, and running on CPU is too painful.
Instead, I activated a bunch of providers with OpenAI-compatible APIs,
but in the end I realized that I just need OpenRouter to access all of them.
With OpenRouter, you can even block providers that log your queries,
which I will submit as a feature request for Open-WebUI.