Cloudron Forum

robw

Thanks @joseph - I will try that!

Meanwhile, a small update: The disk filling process appears to have magically stopped. I don't know when. So our Rocket.Chat appears to be back to normal operation. I don't see any obvious clues in the package updates, although there are updates to the apps engine, maybe it was that. Our instance is backing up and updating happily again.

It wasn't me!

robw

Good question @joseph - Unfortunately I can't see the settings in Rocket.Chat right now, my settings screen is blank. (I double checked and I do have admin user access.) So I assume it is whatever is default. A suggestion above is to change the logs storage and see what happens. I will give that a go when I get some time to work out how to alter the invisible settings.

Meanwhile, I have observed that while the disk filling process does not kick off straight away, it always kicks off when the backup process starts. Taking a backup seems to trigger the process with 100% reliability. (In fact this might be why the disk fills up overnight, as I had auto backups enabled.) Then the disk fills and the backup fails. So I have no app level backups for Rocket.Chat at the moment.

Regarding the separate problem of blank settings, I found that the Admin group had checkboxes DISABLED for all the settings configurations. I don't know what caused that or whether it's important, we've never touched this screen. I have switched these on but my settings are still blank so far.

robw

Quick update: Upgrade to package 2.55.0 with Rocket.Chat server 7.3.0 did not fix the problem (though I didn't think it would based on the RC release notes).

Attempts to get help on the Rocket.Chat community forum appear to have failed because I encountered the rudest and most self centred and unhelpful "Community Liaison Officer" I've ever had the misfortune to meet, a genuinely toxic individual. However, my glass is half full: the silver lining is that it reinforced my sincere appreciation of the community in this forum and the Cloudron team.

robw

This would certainly be a very valuable addition to Cloudron. I have upvoted. That said, I guess that the configuration process could be challenging since there doesn't appear to be any point-and-click UI for that.

robw

I forgot an important advantage: You're supporting open source.

robw

Sorry, I didn't answer your original question directly...

Real world server specs for OpenWebUI itself are very low. My Cloudron OpenWebUI app instance fits into a few Gb of storage, barely uses any CPU on its own, and runs in well under 1 Gb of RAM.

But if you want to use the embedded ollama system to interact with locally hosted LLMs, your server needs to support the actual LLMs as well as OpenWebUI. So you need all of this:

- Enough disk storage for all the models you want to use.
  - Individual models you can typically run locally for a reasonable cost range from 2-3 Gb (e.g. for a 3B model) up to 40-50Gb (e.g. for a 70B model). You might want to store multiple models.
- Enough RAM (or VRAM) to fully load the model you want to use into memory, separately for each concurrent chat.
  - To roughly calculate, you need the size of the model file plus some room for chat context depending on how much you want it to know/remember during chats, e.g. 3-6Gb per chat for 3-8B models, more for the bigger ones.
- Enough CPU (or GPU) compute power to run the model fast.
  - For tiny (3-8B) models, expect 1-2 minutes per chat response on a typical CPU+RAM system and don't imagine you can use bigger models at all, or seconds per chat response using GPU+VRAM. (Note: You might do better than that on the very latest CPUs, but GPU+VRAM is still going to be hundreds of times faster.)
- If you're using CPU+RAM (as opposed to GPU+VRAM):
  - You'll find that your disk I/O will be hammered (particularly during model loading) too, so you'll want very fast SSDs.
  - Expect your CPU and your RAM to be fully consumed during inference (chats), so don't expect to be running other apps on your server at the same time.
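
As a rough sketch of the memory rule of thumb above (model file size plus context room, counted once per concurrent chat), the arithmetic looks something like this. The function name and the example numbers are illustrative assumptions, not measurements from a real server.

```python
def estimate_ram_gb(model_file_gb: float, context_gb: float, concurrent_chats: int = 1) -> float:
    """Rough RAM (or VRAM) estimate following the rule of thumb in this post.

    model_file_gb:    size of the model file on disk
    context_gb:       assumed extra room for chat context (a guess, tune to taste)
    concurrent_chats: how many chats are expected to run at the same time
    """
    if concurrent_chats < 1:
        raise ValueError("concurrent_chats must be at least 1")
    return (model_file_gb + context_gb) * concurrent_chats


# Illustrative only: a 2.5 Gb model file with 1.5 Gb of context room, two chats at once.
print(estimate_ram_gb(2.5, 1.5, concurrent_chats=2))
```

Treat the result as a lower bound for planning; the operating system and OpenWebUI itself need memory on top of it.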

In short, I'm not sure that a VPS hosted OpenWebUI instance running only on CPU+RAM is ever going to be useful for self hosted LLMs.

Unfortunately, even if you have a GPU on your virtual server, even if you get under the hood and install your GPU drivers on the Ubuntu operating system, currently Cloudron's OpenWebUI app installation won't use your GPU. So on Cloudron you're stuck with CPU+RAM.

But that is not as gloomy as it sounds... To answer your next question...

@timconsidine said in Real-world minimum server specs for OpenWebUI:

Using “out the box” with local default model.
Is there any point to the app to use with publicly hosted model ?

Yes, there is a point. Your use cases for handling data privately are more limited, certainly, but there are some outstanding advantages to doing this, particularly on Cloudron.

- You're storing your data (including chats and RAG data) on a system you control.
  - Although you're still sending your chats and data within them to the public model, you at least control what you can do with the storage of your chats and data.
- You can download, backup, and always access your chats, or move them to a different OpenWebUI server, even if your connection to the public model is severed.
- You can interact with multiple public and private models via a single interface, even within each chat. None of the public platforms let you talk to the others.
  - E.g. OpenWebUI has some pretty cool features to let you split chat threads among different models, and let models "compete" with each other using "arena" chats. We've found this to be invaluable in our business because a lot of optimizing AI usage is about experimentation and finding the best tool for the task at hand.
- You can install and manage your own prompt libraries, system prompts, workspaces (like "GPTs" in ChatGPT), coded tools and functions (OpenWebUI has some cool integrated Python coding capabilities in this area), in a standard way across every LLM that you interact with, and without storing your code and extended data in the public cloud.
- You can brand your chat UI according to your company or client, and modify/integrate it in other ways. OpenWebUI is flexible and open source.
- You can centrally connect to other apps that you self-host for various workloads including data access and agent/workflow automation without needing to upload and manage all that stuff in public systems.
  - E.g. some apps running on Cloudron that can give your AI interactions super powers include:
    - N8N for workflow automation
    - Nextcloud for data storage and management
    - Chat and notification apps
    - BI apps like Baserow and Grafana
- You can manage and segregate branded multi-user access to different chats and different AIs, either in a single OpenWebUI instance, or (since app management on Cloudron is so bloody easy), different instances on different URLs.
- In the future when you switch to self hosted LLMs or other integrations, there's little to no migration. You just switch off the public API connectors and redirect them to your own models and tools, because you managed your data and chats and code integrations locally from the outset.
And more. I'm sure I didn't think of everything.

By the way, plenty of these advantages are either because of or enhanced by running on Cloudron. Cloudron is great.

I haven’t tried DeepSeek locally but might be worth a shot for privacy. I wouldn’t use it otherwise.

I agree with that decision wholeheartedly. Well, unless you're talking with DeepSeek about stuff that you want the whole world to know and learn from. Then, go nuts.

robw

@timconsidine Are you trying to use locally hosted ollama models, or have you wired up API keys for the public cloud models like ChatGPT or DeepSeek(*) in your OpenWebUI instance?

If you're experiencing unusable slowness for locally hosted models, it might be because OpenWebUI on Cloudron out of the box is running with CPU+RAM only (not GPU+VRAM). Even for tiny models, that's going to be very slow even with very fast CPUs.

I'd be surprised if you're finding OpenWebUI to be slow with the public cloud models. There will be some latency through API calls between your Cloudron server and the online model, but I'd be surprised if you didn't find it to be nearly as fast as using the online hosted versions directly.

(*) By the way, if you're using DeepSeek online and not self-hosted, please assume every interaction is being read at the other end. There are no privacy controls. And even with ChatGPT and the others, I'd suggest reading the terms and conditions of your API usage carefully and considering which jurisdiction you're sending your data and chats to.

robw

Quick update: We now have Rocket.Chat on the 2.54.3 package (ran a manual update without a backup), i.e. the small update which sets the Deno cache directory. It has not fixed the problem. Nothing useful from the Rocket.Chat forums yet. So still working on it.

robw

@stoccafisso I can confirm that for me this is Rocket.Chat flakiness, not Cloudron. I fear your restoration process won't help if Rocket.Chat is just going to kick off its log filling process again.

@girish I haven't yet tried moving logs to the filesystem, because (very strangely) my Rocket.Chat settings page currently appears to be empty, just a search box:

That's a Rocket.Chat mystery I haven't figured out yet.

I did try the scheduled mongodb log clearing idea directly in the app container, but it didn't seem to help. Because I can't see what's actually happening in the database, I don't know if that's because the data is filling up faster than the query runs, or because of database locking because the database is so big (100s of Gb), or something to do with virtual container file system strangeness, or something else.

At this stage my server seems to have fallen into this pattern:

1. Rocket.Chat fills the disk
2. Rocket.Chat attempts an update (I have auto updates on)
3. Update fails because the backup fails (no disk space to prepare the backup, even though my backups go offsite to Backblaze)
4. Rocket.Chat restarts, at which point the "filled" disk space moves from within the Rocket.Chat container to the virtual file system (reported as "everything else" on the Cloudron System Info page)
5. A reboot releases the disk space from the virtual file system
6. Start over...

The disk filling doesn't seem to start immediately any more. I don't know what triggers it.

So that means with a reboot every day or so, the server is more or less operational for general usage, except for a while right at the end when the disk is full.

I'm hoping a Rocket.Chat update will arrive soon that makes this go away.

robw

Oh, the problem is not gone.

I'm not sure when that happened, Rocket.Chat had stopped working again this morning when I logged in so that's when I checked. So I assume it's Rocket.Chat again. Since the app wasn't working the logs are full of errors, so I couldn't easily see anything useful there. Since the Rocket.Chat container's file system changed in the last update, I can't seem to see the .bson files any more so I can't confirm that was the problem and I'm not sure what else to check.

There were some other automatic app updates over the last two nights (though not Rocket.Chat because it's up to date), so I suppose it's possibly not Rocket.Chat this time, I'm just assuming.

I can't see that data anywhere on the Cloudron disk from the server console. A du -h --max-depth=1 at the root level on the server shows Cloudron is using the expected amount of space, i.e. the space taken up by the apps in the coloured parts of the bar above, even though df says the disk is full. The 'everything else' data doesn't seem to exist. So I can't tell what the data is. It appears to be living in the overlay filesystem until a Cloudron reboot, at which time it disappears. (Sorry I didn't take a screenshot.) I know nothing of these virtual file systems, so apologies if I'm explaining that wrong, just reporting what I can see.
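
To put a number on the gap between what df reports and what du can find, a minimal sketch like the one below could be run on the server. It is only an approximation: it sums apparent file sizes rather than allocated blocks, counts hard-linked files more than once, and needs root to read everything. The function names are made up for illustration. In general, one common cause of such a gap is files that were deleted while a process still holds them open; whether that is what happened here is an assumption, not something I have confirmed.

```python
import os
import shutil


def du_bytes(root: str) -> int:
    """Sum apparent file sizes under root without crossing into other filesystems."""
    root_dev = os.lstat(root).st_dev
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune directories that live on a different filesystem (similar in spirit to `du -x`).
        kept = []
        for d in dirnames:
            try:
                if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev:
                    kept.append(d)
            except OSError:
                pass  # directory vanished or is unreadable; skip it
        dirnames[:] = kept
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def unaccounted_bytes(root: str = "/") -> int:
    """Space the filesystem reports as used, minus what a directory walk can find."""
    return shutil.disk_usage(root).used - du_bytes(root)


if __name__ == "__main__":
    gap_gb = unaccounted_bytes("/") / 1e9
    print(f"Used space not visible in the directory tree: {gap_gb:.1f} GB")
```

A large positive gap would be consistent with the "everything else" figure on the System Info page, although it does not say which process is responsible.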

So after a reboot it's all good again. After 30 minutes or so, there's no evidence so far that the disk is filling up or anything is changing on the disk yet:

robw

Um... The problem may have magically disappeared.

Yep, the disk filled up. The update failed. The app restarted. At which point, storage snapped back to its original state - 300Gb gone in the blink of an eye. I started the 2.54.2 package update again, skipping the backup just in case the snapshot process hung like earlier. It ran. Rocket.Chat is now on the latest version. The disk filling process does not appear to have re-commenced.

I've seen this before. Don't watch Highlander II, it was a horrible update that should never have been made. Just go directly to Highlander III.

robw

Geez... So starting the 2.54.1 package update (which I note upgrades to Rocket.Chat server 7.2.1) appeared to remove the rocketchat_apps_logs.bson file/collection and start over - which didn't happen on the previous update. (I wonder if this has something to do with the server upgrade?) However now I am watching that file grow rapidly during the update process and the update appears to be stuck on this step:

I guess this is going to fill up the disk and then fail. I'm not quite sure what to do next.

(By the way, I did try running the mongo data clearing process offered by @martinv earlier. However it went into a black hole, which I suppose is not surprising with a 200 Gb collection file, and meanwhile the disk was filling up rapidly so I stopped it. I suppose since that was running inside the container, everything zapped back to its original state when I restarted the app. Was it wrong to run that in the app console, should I access the database from outside instead?)

If the disk does fill up and the update stops, and if Rocket.Chat is still running at that point, I'll try the mongo data clearing process again. But I suspect the app won't be running at that point so I won't be able to do anything with the database.

This is not my favourite Rocket.Chat update.

robw

Ok, so that report that "App was updated to {previousVersion}" means the update didn't work, is that right? I'll know that from now on.

I was able to get Rocket.Chat updated to the latest 2.54.1 package. I had to skip the backup process, as the snapshot was not going to finish before the disk filled up. Unfortunately it has not fixed the problem. Disk space started growing right away.

It did make these errors go away though, which is what was happening with 2.54.0:

I note the rocketchat_apps_logs.bson collection did not get updated during this update. This file is now 201Gb.

robw

Oh... Does a local snapshot get taken before it's shipped off to Backblaze? That would make sense and explain the previous backup errors during the app update then; there wasn't enough local disk space.

Also I was wrong about rocketchat_apps_logs.bson not growing. It was (it's up to 158Gb again in under an hour), but perhaps I wasn't seeing the file size update in my remote console while the container was running? This is what's still filling up disk space.

robw

Firstly, thank you @girish and @martinv for being far more useful than a compass at the South Pole! Still nothing useful from the Rocket.Chat forum.

I have set up that cron task to run every 15 minutes, since our log file seemed to be growing very quickly. (After freeing up 40 Gb of disk space yesterday from other apps, our Cloudron server filled up again overnight.) However the cron task seems to be a workaround rather than a solution...

Also in very strange news, the rocketchat_apps_logs.bson file does not seem to be growing any more, even between cron task runs. I'm still not sure what's in that data, I haven't been able to look at it yet, but I'm guessing it's log data for third party Rocket.Chat apps.

Something else is now filling up the disk very quickly, a Gb every few minutes. I haven't figured out what yet. It does seem to correspond to running Rocket.Chat.
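
For a sense of how long the server has before the disk fills at a given growth rate, here is a small sketch of the arithmetic. The function name and the sample figures are assumptions for illustration; plug in two real readings of used space taken some time apart.

```python
from typing import Optional


def hours_until_full(free_gb: float, used_then_gb: float, used_now_gb: float,
                     elapsed_hours: float) -> Optional[float]:
    """Estimate hours until the disk is full, assuming growth stays linear.

    Returns None when usage is flat or shrinking, since no fill time can be estimated.
    """
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    growth_gb_per_hour = (used_now_gb - used_then_gb) / elapsed_hours
    if growth_gb_per_hour <= 0:
        return None
    return free_gb / growth_gb_per_hour


# Illustrative only: usage grew from 100 Gb to 120 Gb in one hour, with 40 Gb still free.
print(hours_until_full(free_gb=40, used_then_gb=100, used_now_gb=120, elapsed_hours=1.0))
```

The linear assumption is crude, since the growth here seems to start and stop unpredictably, but it helps decide how often a cleanup job or reboot would be needed.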

I noticed some other strange things and potentially useful notes while working through it:

Firstly, the disk space problem must be very new or we would've seen it ages ago.

I see there was an auto update to the 2.53.0 package last week.

Then the next day, an update to 2.54.0... Or was there?! (I think there was.)

I'm now seeing this behavior on each update, and it wasn't happening before. 2.54.1 landed a few days ago with the same backup error and strange event log effect. I think this is when the problem most likely started.

I still see the Update button after the update and clicking it offers to upgrade to the same version again. I notice this is not the latest package (which is 2.54.2).

I noticed the 2.54.x series of updates do contain updates to the apps management system. Could it be related?

I noticed that when I manually ran the update, that's when the rocketchat_apps_logs.bson file seemed to diminish in size: it was no longer hundreds of Gb after the update. But I also noticed this:

[Two screenshots, dated 2025-01-21]

Strangely, I couldn't find that data anywhere on the file system by digging through du reports; the disk usage across all directories did not add up to the disk usage I could see with df.

After a reboot it seemed to disappear. Is this something strange about the overlay file system that I don't understand?

Something that may or may not be relevant: Our app backups are going to a Backblaze data store, we don't keep them locally.

I'm going to try re-running the app update again to see if I can get to 2.54.2, and to see if the same thing happens with our rapidly growing data disappearing as part of the update.

robw

Hi all,

This is a Rocket.Chat question rather than a Cloudron question, so apologies for coming here, but the Rocket.Chat forum seems to be about as useful as a chocolate teapot.

Our rocketchat_apps_logs.bson file within our docker container is huge: ~266Gb. So it's hogging a lot of useful space on our Cloudron server. I've no idea what's stored in here, but it feels like it might not be critical data. Does anyone know if there's a way to reduce or truncate it, or if it's safe to remove?

Cheers,
Rob

robw

@Lanhild said in ETA for GPU support? Could we contribute to help it along?:

A lot of companies that might deploy Cloudron for its ease of life features don't necessarily have a VPS with a GPU.

Also, (might help you to deepen your Cloudron knowledge) Cloudron packages usually are only one component/application.

Moreover, OpenWebUI is "just" a UI that supports connections to Ollama and isn't affiliated with it. Meaning that Ollama isn't a dependency of it at all.

Excellent points @Lanhild - you've convinced me.

And there are benefits on the Ollama side too. I would appreciate the benefit in using Cloudron to keep our Ollama installation automatically up to date on its own, for instance.

In fact, given our remaining inability to modify the existing Cloudron OpenWebUI app to run with our GPUs, for our small clients we are now thinking this way - I.e. using Cloudron just for the OpenWebUI component and letting them connect to our separately hosted Ollama. It's a bit less convenient than we were hoping, but at least we'll still have segregated data and user management for each client in OpenWebUI.
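
Before pointing an OpenWebUI instance at a separately hosted Ollama server, it can help to confirm the server is reachable and see which models it offers. The sketch below uses Ollama's /api/tags endpoint, which lists locally available models. The hostname is a made-up placeholder and the helper names are hypothetical; 11434 is Ollama's default port, which may differ on your setup.

```python
import json
import urllib.request


def tags_url(base_url: str) -> str:
    """Build the URL of Ollama's model-listing endpoint from a base URL."""
    return base_url.rstrip("/") + "/api/tags"


def model_names(payload: dict) -> list[str]:
    """Extract model names from a parsed /api/tags response."""
    return [model["name"] for model in payload.get("models", [])]


def list_models(base_url: str, timeout: float = 5.0) -> list[str]:
    """Ask the Ollama server which models it has; raises on connection problems."""
    with urllib.request.urlopen(tags_url(base_url), timeout=timeout) as resp:
        return model_names(json.load(resp))


if __name__ == "__main__":
    # Placeholder address: replace with your own Ollama host.
    print(list_models("http://ollama.example.internal:11434"))
```

If this returns a list of models from the Cloudron host, the same base URL should be usable as the Ollama connection in OpenWebUI's settings.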

So now, I also want a Cloudron OpenWebUI app that does not come with bundled Ollama, so that I can be sure these customers don't hammer our CPUs and get frustrated by a slow user experience.

robw

Ok some further updates...

TL;DR - We still need help getting Cloudron's OpenWebUI container to start with GPU enabled. This is our bottleneck. Otherwise everything else works.

Having figured out how to make our Dell servers and VMWare hypervisor* reliably support GPU all the way through to virtual machines, but having failed to get Cloudron apps working with GPU, we have been looking for low cost 'hosting ready' virtual server alternatives for OpenWebUI. (Windows Server hosting that we like for a lot of other workloads is not a preferred option in this case.) At the VM level we've now got an Ubuntu/Caddy/Webmin (or Cockpit)/Docker/NVIDIA+CUDA stack fully operational with OpenWebUI/Ollama, and it's arguably a commercially viable solution.

And I must say, now that it's running in a 'hosting ready' environment with a software stack that's very similar to what Cloudron offers, even with our older-generation GPU test platform (Tesla P40s), the speed results from tests in OpenWebUI are extremely pleasing. I don't have tokens-per-second stats yet, but I can report one query that took 3.75 minutes using CPU only on the same host hardware, took 13 seconds with a single Tesla P40 GPU behind it, and left room on the GPU's VRAM for other concurrent queries.
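
For reference, the speed-up implied by that single query works out as follows. This is just arithmetic on the two timings quoted above, so it says nothing about other models, prompts, or hardware.

```python
cpu_seconds = 3.75 * 60   # 3.75 minutes on CPU only
gpu_seconds = 13.0        # same query with a single Tesla P40

speedup = cpu_seconds / gpu_seconds
print(f"CPU: {cpu_seconds:.0f} s, GPU: {gpu_seconds:.0f} s, speed-up: about {speedup:.1f}x")
```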

But our stack still doesn't do all the nice stuff that Cloudron does without a lot of extra work - mail, backups, user directory, easy multi-tenanting, app level resource limiting, automatic DNS, automatic updates, super easy installation (our current virtual server installation guide is still ~70 active configuration steps which can't be fully automated), and more.

Finally realising that we could run a vanilla Docker test on the Cloudron host without breaking Cloudron (duh!), we ran the Nvidia sample workload from our Cloudron Ubuntu host. It works. So we know our server is ready.

[Screenshot: nvidia-smi container test running on the Cloudron host]

After initially avoiding running standalone Docker containers on our Cloudron Ubuntu host (because we didn't want to upset Cloudron), running the sample app made us realise we could run a test of OpenWebUI using vanilla Docker to test our system too... It also works.

docker run -d -p 3000:8080 --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama

time=2024-11-17T11:40:46.328Z level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[cpu_avx cpu_avx2 cuda_v11 cuda_v12 rocm_v60102 cpu]"
time=2024-11-17T11:40:46.328Z level=INFO source=gpu.go:221 msg="looking for compatible GPUs"
time=2024-11-17T11:40:47.791Z level=INFO source=types.go:123 msg="inference compute" id=GPU-2f9f15c7-39ba-5118-38fa-07ec8a1fa088 library=cuda variant=v12 compute=6.1 driver=12.7 name="Tesla P40" total="23.9 GiB" available="23.7 GiB"
INFO :     Started server process [1]
INFO :     Waiting for application startup.
INFO :     Application startup complete.

So I'm now quite certain our only hurdle is figuring out how to make Cloudron's OpenWebUI start with GPU support. But for the life of me, as Cloudron+Docker learners, we can't figure it out, even for a non-persistent test run. Modifying run.sh didn't help, and even while running in recovery mode started from the Cloudron CLI we can't see any way to make it work with modifications to run-compose.sh or /app/pkg/start.sh or the Dockerfile or anything else.

What can we do?

Please, can I repeat my offer to provide support (if needed) to get this done? At the very least, we could offer some $$ (and I hope the community might pitch in per my original post if more was needed), testing, notes from our own installation/test challenges, and an Nvidia GPU enabled virtual dev/test machine if required.

*Side note: I think VMWare used to offer a great hypervisor even for small scale, but now it's terrible for smaller customers (in my opinion). There are real alternatives these days, but hardly any that offer point-and-click container management. So virtual server layer container management tools that are as nice as Cloudron still have relevance beyond single server and home lab use cases, we think. Have I mentioned that we love Cloudron?

robw

@Lanhild said in ETA for GPU support? Could we contribute to help it along?:

I feel like that if GPU support works, having OpenWebUI and Ollama as separate packages would make maintenance easier.

I note this is a bit off topic for the thread... But... It's an interesting idea. Perhaps it depends on your use case.

I can't speak to Cloudron product maintenance of course, only guess. As we're still learning about the relationships between all the components in the stack (Cloudron, Docker, Ubuntu, GPU drivers - just Nvidia data center GPUs in my case at the moment, CUDA, Ollama, OpenWebUI, and throw in server hardware and hypervisors if you're in a virtualized hosting scenario), I have come to understand that complexity of dependencies between components is a real challenge.

Perhaps there are different end-user level configurations that need to be performed between Ollama and OpenWebUI (e.g. stuff like GPU support, single sign-on, access permissions), so it could be a good idea for the product team to separate them from that point of view because one might need more updates and testing than the other, or it might provide more flexibility.

But I wonder, in terms of end (Cloudron) user needs, wouldn't you need to be a pretty advanced user to care? I mean, I can think of several end-user cases where separating Ollama from OpenWebUI gives technical or management or performance benefits, like:

If you're using your Ollama instance from different endpoints (possibly including outside Cloudron)
If you want to share Ollama access between apps for performance reasons but separate user and data management on different OpenWebUI instances (e.g. if you have multiple OpenWebUI instances running on one or more Cloudron installations)
If the cost of hosting resources like storage/compute/RAM is an optimisation concern (e.g. even installing multiple instances of a single app can eat up premium storage space, and sharing compute/RAM of Ollama transactions among multiple apps could have a measurable benefit with more than a few OpenWebUI front ends)
If you already have a centrally managed non-Cloudron Ollama server but you want Cloudron for OpenWebUI front ends
If you want to reduce risk of stuff breaking between updates
And plenty of other stuff along those flexibility lines...

... but otherwise if I'm a regular simple Cloudron user with a GPU installed, I think I just want to one-click-download and have it working without any fuss. I'm guessing (though I don't know) that most Cloudron customers are running at a relatively small scale where simplicity is more important than performance and flexibility. (Please do correct me on that if needed.)

To be clear, it's probably a really good thing to offer in my company's case. But I think we might be in the minority here.

robw

Per this thread I was able to modify /app/code/run.sh in the web terminal (and also pull/push it via the Cloudron CLI) after using cloudron debug --app {appurl} in the CLI, then run OpenWebUI during recovery mode by running /app/pkg/start.sh. The app started and my modifications to run.sh are intact. However the GPU still wasn't found. My problem is that I don't know if my changes to run.sh had any effect on the app at all. I can't see anything that I understand in start.sh or run.sh that convinces me that my changes were actually applied.
