To conclude, it was a memory issue. The instance as a whole was a bit overcommitted.
If the backup task is idle, it won't consume any memory. Also, Cloudron does not reserve memory based on the limits set, neither for the backup task nor for apps. The limit is just there to prevent rogue apps or the backup task from killing other apps.
@girish Increasing the swap seems to have cured the problem. There may also have been confusion over the amount of swap: Linode originally reported only 512MB, but Cloudron showed almost 1GB.
I have increased the swap from 512MB to 2048MB and haven't received any app out-of-memory notifications since, so that seems to have cured it.
@d19dotca Yes, the limits are there to protect against the noisy neighbor problem, which exists when many processes are competing for the same resources and one uses up more than its fair share.
Technically, we could set all 30 apps to 1+ GB limits on a 16GB RAM system and it would work fine until one app misbehaved. Then the system would be in trouble, as the OOM killer might select a critical service to kill.
With limits, the system is happy, and the killing happens in containers instead.
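The overcommit scenario above can be sketched with a bit of arithmetic (a toy illustration, not Cloudron code):

```python
# Toy arithmetic (not Cloudron code): memory limits are caps, not reservations,
# so the sum of all app limits may exceed physical RAM ("overcommit").
ram_gb = 16
app_limits_gb = [1.0] * 30           # 30 apps, each capped at 1 GB

total_limit_gb = sum(app_limits_gb)  # 30.0 GB of limits on a 16 GB host
overcommit_ratio = total_limit_gb / ram_gb

print(total_limit_gb)      # 30.0
print(overcommit_ratio)    # 1.875
```

This works fine as long as actual usage (not the limits) stays under physical RAM; the limits only decide which cgroup the OOM killer acts in when an app misbehaves.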
I'm with @nebulon; I see this as difficult to support, harder to code for, and more user friction. I never support more user friction unless it's dire. I think configuring after installation is more than acceptable. It's not like we can't write scripts using the API to macro your suggestion either, but I digress. 😂
Correct. We don't need to re-create the container to change the memory limit. But we still need to restart it after adjusting the memory limit because of limitations in our packaging.
I think over time we have learnt that it is not a good idea to set up Apache to auto-scale based on the container memory limit. Those things depend heavily on the app/plugin usage. I think Java apps require a restart at this point since the JVM gets the heap size passed in as a flag on startup; maybe there is a workaround for this as well, have to investigate.
But at that point, we can at least make the memory limit code not re-create the container, which I think is where the bulk of the slowness is.
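For reference, this is roughly what an in-place limit change looks like with the standard Docker CLI: `docker update` adjusts the cgroup limit without re-creating the container (the container name `my-app` below is hypothetical, and `--memory-swap` must be raised alongside `--memory` when a swap limit was previously set). A small sketch that only composes the command:

```python
# Sketch: build (but don't run) a `docker update` command that changes a
# container's memory cap in place. `build_update_cmd` is a hypothetical
# helper, not part of Cloudron.
import shlex

def build_update_cmd(container: str, mem: str, memswap: str) -> list[str]:
    # --memory-swap must be >= --memory and raised together with it
    return ["docker", "update", f"--memory={mem}", f"--memory-swap={memswap}", container]

cmd = build_update_cmd("my-app", "1g", "2g")
print(shlex.join(cmd))  # docker update --memory=1g --memory-swap=2g my-app
```

The app inside may still need a restart to notice the new limit (e.g. a JVM with a fixed `-Xmx`), but the container itself stays as-is.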
After thinking about it, the currently chosen graph is better for viewing memory usage (which fluctuates much more than disk usage).
I would then suggest separating the two pieces of information:
- keep the current graph for the app usage (the Y axis adapting to the app using the most memory)
- add a bar for total memory used, modelled on the total disk usage bar (could be a single colour for memory usage)
This would help maximize the amount of information you can visualize in one go and help detect spikes better.
@girish I don’t know if that’s really accurate. I say that only because the WordPress server status page shows my memory set at 256M, but I don’t have that set anywhere in any files. And when I assumed it was matching the memory assigned to the whole app deployment, it was not: when I set that to 1GB as a test, it still showed 256M.
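One likely explanation for the mismatch: the value the WordPress status page reports is PHP's `memory_limit` (a per-request cap), which is configured separately from the container's memory limit, so raising the app's limit to 1GB would not change it. A hypothetical helper to convert the two notations for comparison (`parse_limit` is my own name, not a Cloudron or WordPress function):

```python
# Hypothetical helper: convert PHP-style shorthand like "256M" or "1G"
# into bytes, so a PHP memory_limit can be compared with a container cap.
def parse_limit(value: str) -> int:
    units = {"K": 1024, "M": 1024**2, "G": 1024**3}
    value = value.strip()
    if value[-1].upper() in units:
        return int(value[:-1]) * units[value[-1].upper()]
    return int(value)

print(parse_limit("256M"))  # 268435456
print(parse_limit("1G"))    # 1073741824
```

So a 256M PHP limit sits well inside a 1GB container limit; the two are independent knobs.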
@necrevistonnezr indeed, previous versions of Cloudron would allocate more memory for redis based on the app's memory limit. This limit was not settable by the user. In 5.2, we have made the redis limits visible and configurable by the user. Our 5.2 migration script should ideally have been smarter and allocated more memory to redis.
@YurkshireLad The memory recommendation is the absolute minimum memory below which the app won't run reliably. Like even with 1 user, it won't run properly below that recommendation.
The amount of memory required to host 100 users... is hard to know. Usually, the answer depends on how all these users use Rocket.Chat: whether they use it simultaneously, whether they use it from multiple devices, etc. Cloudron will give you a notification (email) if an app is running out of memory. So, you can just take it slow and increase the app's memory limit as you go. You can also look into the upstream project's memory recommendation for 100 users.
I manually installed Mastodon on a DigitalOcean 1GB droplet (unrelated to Cloudron), and with only 1 user and 5 posts, it ran out of memory. I bumped the droplet up to the 2GB option and it ran smoothly.
@masonbee The memory limit setting is the "maximum memory" beyond which the app is killed for overuse. It is not "reserved memory", i.e. it is not pre-allocated to the app and made unavailable to other apps. The "maximum memory" simply ensures that a single web app cannot bring down the whole system.
Would be great if you could update this thread with your findings. Also please note that the memory settings for the addons are currently not preserved across app restores or even server reboots. We are working on this fix and you can see the status of it at https://git.cloudron.io/cloudron/box/issues/566