ETA for GPU support? Could we contribute to help it along?
-
Hiya!
This is a question for the Cloudron team directly, and I'm happy to get a direct response if the forum isn't the best place for it, but I'll post it publicly in case anyone else may wish to contribute.
In this thread @girish mentioned that GPU support in Cloudron will take 'some time' - https://forum.cloudron.io/topic/11312/the-models-load-in-the-void/6.
I understand completely that such things are difficult to quantify, and I don't know your current development roadmap. For instance, I doubt GPU support is very useful for most of the apps on Cloudron, and I'm sure that supporting all the apps rather than just one is your core business.
My questions are:
- Approximately how long is 'some time'? (It's okay if you can't answer that accurately or at all, I understand.)
- Is there anything we can do to contribute to speeding it up? This has real commercial value to us; if $$ would help, that's one thing we may be able to contribute, once we understand how many $$ might help.
- For the community: would it be useful enough for anyone else to help crowd-fund?
Bringing OpenWebUI to Cloudron - and perhaps soon other AI tools - is very exciting for us and for several of our clients. AI tools are not easy to host privately in a cost-effective, secure and commercial way. (I mean real privacy, not the pseudo-privacy of a 'private' ChatGPT space where we send our most sensitive data outside our firewall - and worse, outside our own jurisdiction - and then rely on an international corporation's legal T&Cs; the alternative is investing in AWS or Azure infrastructure at at least twice the price of doing it ourselves.)
Cloudron gives us solutions and approaches to many of the moving parts needed to provide truly private AI access to different clients and business units, which we'd otherwise have to handle individually - e.g. easy deployment, built-in no-stress backups, simple multi-tenancy, point-and-click resource management, flexible DNS, mail, etc. (It would be even nicer if we got end-to-end encrypted CI/CD pipelines across multiple networks and other related fancy stuff out of the box, but that is a dream only imagined because Cloudron is already great.) But I don't need to sell Cloudron's benefits to this group, I'm sure - just noting they're great in relation to OpenWebUI. We are big fans of Cloudron, by the way!
So... Since GPU support is a very real dependency for a properly usable OpenWebUI, we'd like to help everyone get it ASAP.
-
@girish said in ETA for GPU support? Could we contribute to help it along?:
We haven't started working on this as such. I think the hard part is that GPU support in Docker is varied. Are you looking for support for NVIDIA GPUs?
NVIDIA GPU support would be the one to have at the moment, if only one brand could be supported. There are a lot of AI libraries targeting NVIDIA.
-
@girish said in ETA for GPU support? Could we contribute to help it along?:
I think the hard part is that GPU support in Docker is varied.
From Arya:
"As of 2023, GPU support in Docker, particularly for AI applications, has made significant strides but still faces challenges. The main issue is that Docker was originally designed for CPU-based applications and struggled to efficiently utilize GPU resources. This led to performance issues and difficulty in deploying GPU-accelerated applications within Docker containers. To address these challenges, several projects have emerged to make Docker better suited for GPU workloads:
NVIDIA Docker: NVIDIA, the leading GPU manufacturer, developed the NVIDIA Docker toolkit, which provides a Docker image and runtime that allows containers to leverage NVIDIA GPUs. It simplifies the deployment of GPU-accelerated applications within Docker.
ROCm Docker: AMD's ROCm (Radeon Open Compute) platform offers Docker images optimized for AMD GPUs, enabling developers to run GPU-accelerated applications in Docker containers on AMD hardware.
CUDA Docker: CUDA is NVIDIA's parallel computing platform and programming model for GPUs. CUDA Docker images are available, providing a pre-configured environment for running CUDA applications within Docker containers.
Despite these advancements, challenges remain, such as:
Performance overhead: Running GPU applications in Docker containers can introduce performance overhead compared to running them natively.
Resource isolation: Ensuring proper resource isolation between containers sharing the same GPU can be complex.
Compatibility: Ensuring compatibility between different GPU drivers, Docker versions, and application dependencies requires careful management.
To further improve GPU support in Docker, ongoing efforts focus on enhancing performance, simplifying deployment, and improving resource management. These projects aim to make Docker a more viable option for running GPU-accelerated AI applications, enabling easier deployment and scalability."
-
@LoudLemur said in ETA for GPU support? Could we contribute to help it along?:
@girish said in ETA for GPU support? Could we contribute to help it along?:
I think the hard part is that GPU support in Docker is varied.
From Arya:
"As of 2023, GPU support in Docker, particularly for AI applications, has made significant strides but still faces challenges. The main issue is that Docker was originally designed for CPU-based applications and struggled to efficiently utilize GPU resources. ...
Thanks for that info - I wasn't aware of that challenge, but it certainly makes sense.
The Open WebUI installation page talks about running the Docker image with GPU support but doesn't mention those problems: https://docs.openwebui.com/getting-started/
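For context, the usual smoke test that Docker can actually see the GPU (assuming the NVIDIA driver and the NVIDIA Container Toolkit are already installed on the host; the CUDA image tag is just an example) looks like this:

    # Should print the same GPU table as running nvidia-smi on the host
    docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi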
-
Thinking further on this... Perhaps it's not genuine Cloudron GPU support that we need.
Reading my original question about GPU support in a Cloudron context, I suppose it's easy to assume an expectation of containerised resource management just like we get in all our Cloudron apps - the ability to segregate and limit GPU and attached VRAM just as we can for CPU, RAM, etc. While that would certainly be wonderful, it's not actually what we need for our business case. We just need OpenWebUI to be able to draw on the hardware GPU resources we attach to its server (a virtual machine in our case), and to run separate OpenWebUI apps easily with separate logins and datastores (which you've already given us).
OpenWebUI already appears to do this, I suppose. If the container is started with the GPU switch, my understanding is that that's all that's needed in the basic case. Please correct me if I'm wrong:
E.g. docker run -d -p 3000:8080 --gpus all etc...
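For completeness, the full GPU variant of that command in the Open WebUI docs looks roughly like the below, as I read them; the :cuda image tag, volume name and --add-host flag are taken from their docs at the time of writing, so double-check before relying on them:

    # GPU-enabled Open WebUI container per the project's own getting-started docs
    docker run -d -p 3000:8080 --gpus all \
      --add-host=host.docker.internal:host-gateway \
      -v open-webui:/app/backend/data \
      --name open-webui --restart always \
      ghcr.io/open-webui/open-webui:cuda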
https://docs.openwebui.com/getting-started/
So my new questions are these (without having tried anything out yet):
- If we installed the NVIDIA Ubuntu GPU drivers in the Cloudron OS and started the OpenWebUI container with the GPU switch, would it just work? (A rough host-side sketch follows this list.)
- Does installing the GPU drivers interfere with Cloudron's upgrade processes? (In our case, we wouldn't mind having to manage the GPU drivers separately; we don't expect Cloudron to manage non-native additions.)
- If we got past those first two questions, could we get a simple switch on the OpenWebUI Cloudron container to enable GPU support, rather than having to set up our own custom container?
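For reference, my working assumption for the minimal host-side setup on Ubuntu is something like the below. It's a sketch, not a tested recipe: the commands are the standard ubuntu-drivers and NVIDIA Container Toolkit ones, but the toolkit package assumes NVIDIA's apt repository has already been added per their install guide.

    # Install the recommended proprietary NVIDIA driver for the detected GPU
    sudo ubuntu-drivers autoinstall
    # Install the NVIDIA Container Toolkit (assumes NVIDIA's apt repo is configured)
    sudo apt-get install -y nvidia-container-toolkit
    # Register the NVIDIA runtime with Docker and restart the daemon
    sudo nvidia-ctk runtime configure --runtime=docker
    sudo systemctl restart docker
    # Host-side check; if this works, the container smoke test earlier should too
    nvidia-smi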
Cloudron isn't our virtualisation layer; we use VMware in our case. So we can target our GPU usage at a Cloudron installation; we don't need to go further and manage it at the app level. (Actually making GPUs work in VMware is a lot harder than it sounds, we found, but that's a separate problem.)
Of course, I know nothing of how Cloudron manages and segregates resources across its Docker containers, so perhaps this is wishful thinking. But to be clear, we don't need fancy app-level virtual GPU handling; we just want to use our GPU in one (or any) of the running apps. So I'm crossing my fingers for good answers to the above questions.
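Incidentally, plain Docker can already target a specific GPU at a single container without any virtual GPU handling - the device syntax below is standard Docker CLI (the CUDA image tag is again just an example):

    # Expose only the first GPU to this one container; other containers see no GPU
    docker run --rm --gpus '"device=0"' nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi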
(On a separate but interesting note, we've been hoping to do some Cloudron experimentation on this front. These aren't even hard experiments - we could do them on a bare-metal box - but we've been trying to run them in our data centre hosting environment on our existing systems as a proper proof of concept... However, it's been a helluva job just getting a pilot project with some older Dell servers running VMware in a data centre to the point of being able to run GPUs at all, let alone getting as far as experimenting with Cloudron containers.
- We got some Tesla P40s running in some Dell R720xd servers for our pilot project. Just sourcing the right hardware, including cards and power cables, was hard enough.
- Then we didn't know what would fit where in which servers because none of it is officially supported or clearly documented.
- Then we found we had to upgrade our power supplies, even though on paper the standard ones should have worked.
- Then the hoops you have to jump through - server firmware and BIOS/boot settings, virtual machine BIOS and boot settings, then VMware updates and driver installations - are CRAZY...
- We've finally got as far as making these things operational, only to find that our VMware Essentials licence doesn't provide virtual GPU support. That fact isn't clearly written anywhere in the 1,000 documents we've read lately - you only find it once you know exactly what to look for - and since Broadcom's recent purchase of VMware, the affordable licences we need seem to have disappeared.
So we still haven't booted a VMware-based Cloudron virtual machine in a data centre with an enterprise GPU in it... Our theory - that low-cost, flexibly hosted, easily backed-up, privately hosted AI agents and learning data repositories should be available to smaller enterprises who care enough about privacy, security and cost to avoid the public clouds - has encountered many challenges so far. However, I think we're not far off (though for licensing reasons we might need to re-learn everything using a different hypervisor before the end), and I'd like to report back on how this all goes running on Cloudron in the very near future.)
-
If we installed the NVIDIA Ubuntu GPU drivers in the Cloudron OS and started the OpenWebUI container with the GPU switch, would it just work?
Not necessarily; it depends on the GPU. While it may seem like a good idea, results can be very unpredictable. Also, the nouveau drivers (or whatever they're called now) are the worst available out there; I've only had good results with the official NVIDIA drivers.

Does installing the GPU drivers interfere with Cloudron's upgrade processes? (In our case, we wouldn't mind having to manage the GPU drivers separately; we don't expect Cloudron to manage non-native additions.)

Yes. NVIDIA drivers are a pain to manage and often need debugging.
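For anyone hitting that, a few standard commands help when debugging, plus a way to stop unattended apt upgrades from replacing a known-good driver. The driver package name below is illustrative - match whatever ubuntu-drivers actually installed on your host:

    # Confirm the driver loaded and the GPU is visible
    nvidia-smi
    # Check whether the kernel module rebuilt cleanly after a kernel update
    dkms status
    # Pin the driver so apt upgrades don't swap out a working version
    # (package name is illustrative)
    sudo apt-mark hold nvidia-driver-535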