Potential Issue - "exe" and/or netns-create stuck?
-
This morning my CR ground to a halt. I thought it was hitting VPS limits since I've been "popular" the past few days, but the provider said all was good, so I kept digging. In the VM, there's a binary running REALLY hot. In top it shows up as "exe", which made me worry a bit:
1563 root 20 0 4926308 2.9g 28356 R 102.3 6.2 1:07.44 exe
So I ps-ax'd the PID and it came back as
1563 ? Rl 0:26 set-ipv6 /var/run/docker/netns/3939c4953582 all false
After a bit, that PID goes away and is replaced by another PID just like it. VM is up but CR is down.
Did a reboot, still persists. Any ideas?
-
@doodlemania2 You should look into where that exe binary is coming from; I've never seen anything like it. I think you can also look at
/proc/1563/cmdline
(from memory)
-
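A minimal sketch of that check, assuming a Linux host with /proc mounted. The helper name show_proc is made up for illustration. /proc/<pid>/exe is a symlink to the binary behind the process, and /proc/<pid>/cmdline holds the arguments separated by NUL bytes, which is why they can look glued together when printed raw.

```shell
# Print which binary a PID is running and its full argument list.
# show_proc is a hypothetical helper name, not an existing tool.
show_proc() {
    pid="$1"
    # Symlink to the executable (needs root for other users' processes)
    readlink "/proc/$pid/exe"
    # Arguments are NUL-separated; turn the NULs into spaces
    tr '\0' ' ' < "/proc/$pid/cmdline"
    echo
}

# Example: inspect the current shell rather than the PID from the thread
show_proc "$$"
```

For the process in this thread that would be show_proc 1563, run as root. If the link points at the Docker daemon binary, the "exe" name may simply come from the daemon re-executing itself via /proc/self/exe; that is an assumption, not something confirmed here.
-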
@girish yeah, it's just running this type of command really hot, over and over:
netns-create/var/run/docker/netns/f4e6ea807391
htop gives a slightly better visual:
Something about set-ipv6 maybe? (Not using IPv6 on this CR I don't think)
-
@doodlemania2 how is docker configured? is ipv6 enabled?
https://docs.docker.com/config/daemon/ipv6/
If not, perhaps enable it and see if the process's CPU usage calms down.
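One way to check this, sketched under the assumption that the daemon is configured through /etc/docker/daemon.json (it may instead be configured through flags in a systemd unit, in which case this file will not tell the whole story). The function name docker_ipv6_setting is hypothetical.

```shell
# Report whether "ipv6": true appears in a Docker daemon config file.
# Assumption: the config lives in daemon.json; pass another path to override.
docker_ipv6_setting() {
    conf="${1:-/etc/docker/daemon.json}"
    if [ -f "$conf" ] && grep -q '"ipv6"[[:space:]]*:[[:space:]]*true' "$conf"; then
        echo "ipv6 enabled in $conf"
    else
        echo "ipv6 not enabled in $conf"
    fi
}

docker_ipv6_setting

# With the daemon running, the default bridge network can also be queried:
# docker network inspect bridge --format '{{.EnableIPv6}}'
```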
-
@doodlemania2 where is your server hosted? (don't tell me netcup...)
-
@doodlemania2 can I take a look at the server? Can you write to support@ ? I want to see what the app is that is making it get stuck. In the other thread, wp developer app was suspected.
-
@doodlemania2 @doodlenode:~$ sudo cloudron-support --enable-ssh
[sudo] password for derek:
Enabling ssh access for the Cloudron support team...Done
-
@doodlemania2 wow, that server is pretty clogged up. I keep getting kicked out of ssh. But I did see a lot of "exe". Maybe we should disable docker and reboot to check if it helps?
systemctl disable docker
if you can
-
@girish okay - I think I fixed it - the /etc/hosts file had grown to almost 40GB of garbage (shrug)
I emptied it out and replaced it with just default localhost items and restarted everything. MUCH happier it is.
Thank you so much for looking at it with me - it was in a very bad way!
Now to see if I can figure out why hosts file got gummed up
-
@doodlemania2 that sounds... worrying. How did the hosts file get corrupted? How did you figure out that you should look into that file?
-
@girish well, the thing that kept getting overwhelmed was netns ip6, and I also noticed that my tailscale agent was using something like 4.6GB of memory, so I thought, okay, let me ping something. It took a solid 40 seconds to even return a ping, but the result was correct. So that's when I looked at hosts.
I emptied it by redirecting /dev/null into it and boom, all back to good. I am thinking that some DNS thing went haywire, but I don't think it's CR's fault because I've never seen CR try to write to hosts. Will dig a bit more, but I also have an LTS upgrade due, so perhaps there's a bug someplace in Ubuntu with bind.
-
@doodlemania2 would have been good to get a sample/clue of what was in the hosts file with head or tail.
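Something along these lines would keep a sample before emptying the file. This is only a sketch: the function names are made up, and the "default" entries are typical localhost lines rather than the exact ones restored in this thread. It samples by bytes rather than lines, since a file full of garbage may not contain many newlines.

```shell
# Keep the first and last 1 KiB of a runaway file, then truncate it.
# sample_and_reset and write_default_hosts are hypothetical helper names.
sample_and_reset() {
    file="$1"
    sample="$2"
    ls -lh "$file"                 # show how large the file has grown
    {
        head -c 1024 "$file"
        printf '\n...\n'
        tail -c 1024 "$file"
    } > "$sample"
    : > "$file"                    # empty the file in place
}

# Write typical localhost entries (an assumption, adjust to your system)
write_default_hosts() {
    printf '%s\n' '127.0.0.1 localhost' \
        '::1 localhost ip6-localhost ip6-loopback' > "$1"
}

# Demo on a scratch file; on the server the target would be /etc/hosts (as root)
demo="$(mktemp)"
seq 1 1000 > "$demo"
sample_and_reset "$demo" "$demo.sample"
write_default_hosts "$demo"
cat "$demo"
```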