@girish I guess I'm wondering though why it'd say "Not available yet"... is that because I had restarted the server a few hours earlier? I don't normally notice that though when I restart, it usually still shows data. Is it possible there's a bug here?
If the restarts are losing that data, then I'd think that's a bug, right, if it shows for some services but not all? To me that makes it seem like it's either not completing properly when it runs, which could explain why it shows values for some but not all, or it's losing data when it should be remembering it. My gut tells me there's a bug here. Or am I way off?
I guess it's okay since we have a workaround to run that command when it happens. My brain is just wondering why it happened in the first place and how it could be prevented.
@d19dotca Yes, the limits are there to protect against the noisy neighbor problem which exists when many processes are competing for the same resources and ONE uses up more than their fair share.
Technically we could have all 30 Apps be set to 1+GB on a 16GB RAM system and it would work fine until one App behaved badly. Then the system would be in trouble as the OOM killer would select a potentially critical service to kill.
With limits, the system is happy, and the killing happens in containers instead.
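To put rough numbers on that, here is a minimal sketch in Python. The figures are the hypothetical ones from above (30 apps, 1 GB limit each, 16 GB of RAM), and `overcommit_ratio` is a made-up helper name for illustration, not anything Cloudron provides.

```python
def overcommit_ratio(app_limits_gb, total_ram_gb):
    """Return the sum of per-app memory limits divided by physical RAM."""
    return sum(app_limits_gb) / total_ram_gb


# Hypothetical figures from the post: 30 apps, 1 GB limit each, 16 GB RAM.
limits = [1.0] * 30
ratio = overcommit_ratio(limits, 16.0)

# A ratio above 1 means the limits add up to more than the machine has.
print(f"limits total {sum(limits):.0f} GB on 16 GB RAM, ratio {ratio:.3f}")
```

A ratio above 1 is workable as long as the apps do not all reach their limits at once. The point of the per-app limit is that a single runaway app is capped at its own limit rather than at whatever the host has left.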
@robi That's my understanding, yes. That it will persist to containers. The article I linked in the earlier comment has the sysctl command to set vm swappiness. But as mentioned earlier, fine-tuning these things should be done after investigating and understanding those settings because they usually end up having unexpected side effects.
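Before changing anything, it can help to check what the value currently is. A minimal sketch in Python, assuming a Linux host where the setting is exposed at `/proc/sys/vm/swappiness` (the same value `sysctl vm.swappiness` prints); `read_swappiness` is a hypothetical helper name used only for this example.

```python
from pathlib import Path


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the current vm.swappiness as an int, or None if the file is missing."""
    p = Path(path)
    if not p.exists():
        # Not a Linux host, or procfs is not mounted here.
        return None
    return int(p.read_text().strip())


print("vm.swappiness:", read_swappiness())
```

This only reads the value. Writing a new one needs root and is better done with sysctl once the side effects are understood.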
Just leaving a note for myself, but it's unclear why the issue got fixed after the update. After an app update, the container is recreated and the collectd config is regenerated. Maybe the collectd restart did the trick.
After thinking about it, the chosen graph is better suited to viewing memory usage (which fluctuates much more than disk usage).
I would then suggest separating the two pieces of information:
keep the current graph for the app usage (the Y axis adapting to the app using the most memory)
add a bar for total memory used, modelled on the total disk usage (could be a single colour for memory usage)
total disk usage
This would help maximize the amount of information you can visualize in one go and help detect spikes better.
@smilebasti The /home/yellowtent/appsdata is the location of apps. This size seems to roughly match the nextcloud size. As for docker, you should not use du tools inside docker's image directories since they are overlays and the du tool is not smart enough to figure out the size correctly. Try docker system df to get a better idea about the actual size docker uses (this is what is reported in the graph as ~5GB). The volumes also link into appsdata so they might be double counted by the du tools.
To take a wild guess, maybe you were backing up to the file system for some time before you moved to NAS via SMB? If this was the case, then you should remove the old backups manually from /var/backups. You can just safely nuke all the timestamped directories and the snapshot directory inside it.
@ruihildt The issue in your case was different. There were so many apps that the query parameter limit was getting exceeded. This is fixed in the next release. @necrevistonnezr this is most likely your issue as well!