High memory usage spikes
-
Looks like a cron or maybe the backup job? Do custom crons ring a bell? Take a look at the box logs and see what happens at the time of the spike. If the graph doesn't show an app, it's a process outside of a Docker app, like the system or Cloudron itself.
Did you have any recent app crashes? OOM related, perhaps?
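A couple of quick checks along those lines (just a sketch; adjust the window to whenever the graph shows the spike, and note that times given to journalctl without a date default to today):
# any kernel OOM kills?
dmesg -T | grep -iE "out of memory|oom-killer"
# everything journald logged around a spike window
journalctl --since "09:25" --until "09:45" --no-pager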
-
@subven Nope, backups are weekly, no custom crons. Box logs show nothing that correlates. No app crashes. No Cloudron OOMs.
It's only Cloudron that does it.
@robi said in High memory usage spikes:
Box logs show nothing that correlates
How about the other logs from your system? If you haven't done anything sketchy with your server outside of Cloudron, it should be Cloudron related, so maybe check all logs and/or let @staff take a look at the problem.
-
@subven nothing unusual on my end..
syslogs show lots of packets dropped, so maybe a type of DDoS.
-
@subven agreed, there is no easy way to move from one app to another while looking at each app's graphs to compare network traffic.
And the system info doesn't have a network graph at all.
It happened again this morning at 9:31:07 and 10:16:07 with CPU spikes too:
syslog has tons of messages from this IP blasting everyone:
Jan 14 17:31:23 em kernel: [35147503.981778] Packet dropped: IN=enp3s0 OUT= MAC= ff:ff:ff:ff:ff:ff:52:54:a9:60:b1:6f:08:00 SRC=172.93.48.214 DST=255.255.255.255 LEN=62 TOS=0x00 PREC=0x00 TTL=64 ID=34492 PROTO=UDP SPT=52322 DPT=8888 LEN=42
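If that broadcast traffic turns out to be the noise source, a rough sketch for quantifying and silencing it (assumes you are adding host-level iptables rules yourself):
# how often does that source show up in syslog?
grep -c 'SRC=172.93.48.214' /var/log/syslog
# drop the broadcast UDP chatter from that host without logging it
iptables -I INPUT -s 172.93.48.214 -p udp --dport 8888 -j DROP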
-
Started monitoring via top (with 'm' to toggle the memory summary) to see what is happening when it spikes..
this began taking 100% CPU:
NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 0       0      0      0 R 100.0  0.0   1:36.15 kworker/7:1+events_freezable
 0       0      0      0 S  55.8  0.0   9128:30 kswapd0
# free -h
              total        used        free      shared  buff/cache   available
Mem:           47Gi        44Gi       426Mi       228Mi       2.1Gi       1.7Gi
Swap:         4.0Gi       1.4Gi       2.6Gi

top - 22:11:56 up 406 days, 23:57,  3 users,  load average: 3.38, 2.69, 1.51
Tasks: 477 total,   4 running, 473 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.3 us, 14.1 sy,  0.0 ni, 84.4 id,  0.1 wa,  0.0 hi,  0.1 si,  0.0 st
MiB Mem : 96.4/48195.9   [|||||||||||||||||||||||||||||||||||||||||||||||||||  ]
MiB Swap: 35.0/4096.0    [|||||||||||||||||||                                  ]

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
2357779 root      20   0       0      0      0 R 100.0  0.0   5:17.08 kworker+
    157 root      20   0       0      0      0 R  59.5  0.0   9129:10 kswapd0
2449326 uuidd     20   0  393392  47296   5784 R  50.8  0.1 126:33.66 uwsgi
What are these uwsgi processes?
Seems like a Python-related memory leak needing a tweak to the uwsgi.ini?
https://levelup.gitconnected.com/the-experience-of-solving-memory-leak-of-django-uwsgi-nginx-aws-cdb998244cfb
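For reference, the standard uWSGI knobs for containing leaky workers look roughly like this (a sketch only, not taken from Cloudron's actual config; whether they belong in the graphite uwsgi.ini is a question for @staff):
[uwsgi]
; recycle a worker after this many requests to cap slow leaks
max-requests = 1000
; restart a worker once its resident memory passes ~256 MB
reload-on-rss = 256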
-
@robi said in High memory usage spikes:
kworker+
Guess you've now seen these?
https://www.linuxquestions.org/questions/linux-software-2/high-cpu-usage-by-kworker-4175563563/
https://askubuntu.com/questions/33640/kworker-what-is-it-and-why-is-it-hogging-so-much-cpu
Oh, and look, this person on SSDNodes also has a very similar issue:
I'm gonna take this as yet another reason to not touch them with a barge pole!
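For what it's worth, the usual way to see what a busy kworker thread is actually doing is a short perf capture (a sketch, assuming perf is installed, e.g. via the linux-tools packages on Ubuntu):
# sample all CPUs with call graphs for 10 seconds, then browse the report
perf record -a -g -- sleep 10
perf report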
-
@jdaviescoates that doesn't seem related..
# cat /proc/vmstat | grep -E "dirty|writeback"
nr_dirty 1166
nr_writeback 0
nr_writeback_temp 0
nr_dirty_threshold 2192736
nr_dirty_background_threshold 1095029
I'll set
echo 10000 > /proc/sys/vm/dirty_writeback_centisecs
and see what happens.
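If that setting does help, a sketch for making it survive a reboot (the file name is arbitrary):
echo "vm.dirty_writeback_centisecs = 10000" > /etc/sysctl.d/99-writeback.conf
sysctl -p /etc/sysctl.d/99-writeback.conf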
-
@robi in case you missed it, I added a few other links, including this more similar one from someone also on SSDNodes (I think that's who you're using too, right?)
-
@jdaviescoates Yep, nice find.. it might be something they're doing. Will send them a note.
-
Haven't been able to find where this lives:
2451531 ? Sl 127:28 5 0 393392 47284 0.0 uwsgi --master --workers 2 --buffer-size 16384 --no-orphans --ini /etc/uwsgi/apps-enabled/graphite-uwsgi.ini
and since when do we have such lengthy PIDs?
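In case it helps, a quick way to map a host PID back to the container it runs in (a sketch; that PID will have changed by now, and the seven-digit PIDs just mean kernel.pid_max is set high on this box, check with sysctl kernel.pid_max):
# the cgroup path of a containerised process names its Docker container ID
cat /proc/2451531/cgroup
# or list every container together with its host-side PID and match it up
docker ps -q | xargs docker inspect --format '{{.Name}} {{.State.Pid}}'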
@robi that looks like the graphite container. Does that service have enough memory (under Services)? You can also nuke
/home/yellowtent/platformdata/graphite/whisper/collectd/localhost/*
entirely and restart graphite. Maybe some corrupt whisper db file is causing issues.
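A sketch of that, assuming the graphite service runs as a Docker container (confirm the actual name first, and consider moving the files aside instead of deleting them outright):
# find the graphite container's name
docker ps --format '{{.Names}}' | grep -i graphite
# move the collectd whisper files out of the way rather than deleting them
mv /home/yellowtent/platformdata/graphite/whisper/collectd/localhost /root/whisper-localhost.bak
# restart the container, using whatever name the first command printed
docker restart graphite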