Move to object storage failed - think app was crashing but no notifications
-
I think the app was actually crashing, but I'm not sure why.
A couple of times I saw the "This Cloudron app is not responding" screen when visiting the app, but in My Apps it always said it was running, and there were no notifications about it crashing or anything. There is a load of this in the logs, though:
2023-01-23T20:57:43.000Z [bridport.tv:443] 2023-01-23 20:57:43.351 info: Creating external storage move job for video 936efa69-b0c3-41d9-b217-11d33e94568f.
2023-01-23T20:57:43.000Z [bridport.tv:443] 2023-01-23 20:57:43.410 info: Moving video 936efa69-b0c3-41d9-b217-11d33e94568f in job 22.
2023-01-23T20:57:46.000Z Killed
2023-01-23T20:57:48.000Z ==> Ensure and updating configs
2023-01-23T20:57:48.000Z => Run migration script for v5.0
=> Healtheck error: Error: connect ECONNREFUSED 172.18.20.18:9000
2023-01-23T20:57:53.000Z Moving private video files in dedicated folders.
2023-01-23T20:57:53.000Z (node:132) ExperimentalWarning: stream/web is an experimental feature. This feature could change at any time
2023-01-23T20:57:53.000Z (Use `node --trace-warnings ...` to show where the warning was created)
2023-01-23T20:57:53.000Z ==> Starting PeerTube
Those "Killed" and "Starting PeerTube" lines would seem to suggest the app kept crashing, right? So then why did it always say Running in My Apps, and why were there no notifications?
-
@jdaviescoates said in Move to object storage failed - think app was crashing but no notifications:
2023-01-23T20:57:43.000Z [bridport.tv:443] 2023-01-23 20:57:43.410 info: Moving video 936efa69-b0c3-41d9-b217-11d33e94568f in job 22.
2023-01-23T20:57:46.000Z Killed
Usually "Killed" means it ran out of memory. You can check dmesg output.
-
@girish said in Move to object storage failed - think app was crashing but no notifications:
Usually "Killed" means it ran out of memory.
Likely, I guess, but then why did it keep saying Running in My Apps and no OOM notifications?
@girish said in Move to object storage failed - think app was crashing but no notifications:
You can check dmesg output.
Sorry, where do I do this? In the web terminal for the app? or directly on the server?
-
@jdaviescoates said in Move to object storage failed - think app was crashing but no notifications:
Likely, I guess, but then why did it keep saying Running in My Apps and no OOM notifications?
Ideally, we should get an OOM notification in the notification view.
Sorry, where do I do this? In the web terminal for the app? or directly on the server?
Yes, directly on the server. Just scroll down to the end and see if you see any obvious oom messages.
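Rather than scrolling through the whole output by eye, the search can be scripted. The following is only a minimal sketch: the function names and the list of phrases are assumptions, based on messages the Linux kernel commonly logs when the OOM killer runs, and the second sample line is made up for illustration rather than taken from this server.

```python
import re
import subprocess

# Phrases the Linux kernel commonly logs when the OOM killer acts
# (assumed list; extend it if your kernel words things differently).
OOM_PATTERN = re.compile(
    r"invoked oom-killer|out of memory|oom-kill|killed process",
    re.IGNORECASE,
)


def find_oom_lines(dmesg_text):
    """Return the kernel log lines that look like OOM-killer activity."""
    return [line for line in dmesg_text.splitlines() if OOM_PATTERN.search(line)]


def read_dmesg():
    """Run `dmesg` directly on the server (usually needs root)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return result.stdout


# Sample input: the first line is copied from this thread, the second is a
# hypothetical example of what an OOM kill looks like.
sample = "\n".join([
    "[855094.132516] device veth2607764 left promiscuous mode",
    "[855100.000000] Out of memory: Killed process 1234 (node)",
])
hits = find_oom_lines(sample)
print(hits)
```

On the server itself, `find_oom_lines(read_dmesg())` would return the matching lines from the live kernel log. An empty list means none of these phrases appeared, which does not by itself rule out memory pressure.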
-
@girish said in Move to object storage failed - think app was crashing but no notifications:
Yes, directly on the server. Just scroll down to the end and see if you see any obvious oom messages.
No obvious (to me) oom messages, but:
Loads of:
Packet dropped
And stuff like this:
[855094.077040] br-4e591afd08de: port 69(veth2607764) entered disabled state
[855094.129062] br-4e591afd08de: port 69(veth2607764) entered disabled state
[855094.132516] device veth2607764 left promiscuous mode
[855094.132520] br-4e591afd08de: port 69(veth2607764) entered disabled state
[855099.003288] br-4e591afd08de: port 69(vethb6eee79) entered blocking state
[855099.003290] br-4e591afd08de: port 69(vethb6eee79) entered disabled state
[855099.003445] device vethb6eee79 entered promiscuous mode
[855099.384650] eth0: renamed from veth515d16c
[855099.450669] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[855099.450690] IPv6: ADDRCONF(NETDEV_CHANGE): vethb6eee79: link becomes ready
[855099.450860] br-4e591afd08de: port 69(vethb6eee79) entered blocking state
[855099.450863] br-4e591afd08de: port 69(vethb6eee79) entered forwarding state
And this:
[852923.301732] warn_alloc: 1 callbacks suppressed
[852923.301735] kworker/u20:4: page allocation failure: order:4, mode:0x40cc0(GFP_KERNEL|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
[852923.301741] CPU: 0 PID: 3143384 Comm: kworker/u20:4 Not tainted 5.4.0-137-generic #154-Ubuntu
[852923.301741] Hardware name: netcup KVM Server, BIOS VPS 3000 G10 11/07/2022
[852923.301747] Workqueue: writeback wb_workfn (flush-cifs-2)
[852923.301748] Call Trace:
[852923.301755] dump_stack+0x6d/0x8b
[852923.301757] warn_alloc.cold+0x7b/0xdf
[852923.301760] __alloc_pages_slowpath+0xe6e/0xeb0
[852923.301763] ? get_page_from_freelist+0x24a/0x3f0
[852923.301764] __alloc_pages_nodemask+0x2d0/0x320
[852923.301766] alloc_pages_current+0x87/0xe0
[852923.301768] kmalloc_order+0x1f/0x80
[852923.301769] kmalloc_order_trace+0x24/0xa0
[852923.301770] __kmalloc+0x22b/0x290
[852923.301792] crypt_message+0x378/0x870 [cifs]
[852923.301795] ? _get_random_bytes.part.0+0x52/0x190
[852923.301811] smb3_init_transform_rq+0x277/0x300 [cifs]
[852923.301827] smb_send_rqst+0xe9/0x1a0 [cifs]
[852923.301852] cifs_call_async+0x11e/0x270 [cifs]
[852923.301889] ? smb311_decode_neg_context+0x460/0x460 [cifs]
[852923.301916] smb2_async_writev+0x25e/0x500 [cifs]
[852923.301919] ? __mod_lruvec_state+0x44/0xf0
[852923.301932] ? cifs_echo_callback+0x70/0x70 [cifs]
[852923.301947] cifs_writepages+0x8d8/0xd80 [cifs]
[852923.301961] ? cifs_writepages+0x8d8/0xd80 [cifs]
[852923.301964] do_writepages+0x43/0xd0
[852923.301965] __writeback_single_inode+0x40/0x300
[852923.301967] writeback_sb_inodes+0x233/0x4d0
[852923.301968] __writeback_inodes_wb+0x56/0xf0
[852923.301970] wb_writeback+0x20c/0x2f0
[852923.301971] wb_workfn+0x36e/0x4f0
[852923.301973] ? ttwu_do_activate+0x5b/0x70
[852923.301976] process_one_work+0x1eb/0x3b0
[852923.301977] worker_thread+0x4d/0x400
[852923.301978] kthread+0x104/0x140
[852923.301980] ? process_one_work+0x3b0/0x3b0
[852923.301980] ? kthread_park+0x90/0x90
[852923.301982] ret_from_fork+0x35/0x40
[852923.301983] Mem-Info:
[852923.301986] active_anon:1987482 inactive_anon:1546882 isolated_anon:0 active_file:272022 inactive_file:169992 isolated_file:0 unevictable:4633 dirty:1652 writeback:975 unstable:0 slab_reclaimable:239084 slab_unreclaimable:199572 mapped:247621 shmem:178933 pagetables:40272 bounce:0 free:89613 free_pcp:0 free_cma:0
[852923.301988] Node 0 active_anon:7949928kB inactive_anon:6187528kB active_file:1088088kB inactive_file:679968kB unevictable:18532kB isolated(anon):0kB isolated(file):0kB mapped:990484kB dirty:6608kB writeback:3900kB shmem:715732kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[852923.301989] Node 0 DMA free:15908kB min:56kB low:68kB high:80kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[852923.301991] lowmem_reserve[]: 0 2911 17895 17895 17895
[852923.301992] Node 0 DMA32 free:134232kB min:10980kB low:13960kB high:16940kB active_anon:1620548kB inactive_anon:855000kB active_file:99068kB inactive_file:92088kB unevictable:0kB writepending:28kB present:3129196kB managed:3063660kB mlocked:0kB kernel_stack:4152kB pagetables:22864kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[852923.301994] lowmem_reserve[]: 0 0 14984 14984 14984
[852923.301995] Node 0 Normal free:208312kB min:56540kB low:71884kB high:87228kB active_anon:6329956kB inactive_anon:5332252kB active_file:989620kB inactive_file:587976kB unevictable:18532kB writepending:10596kB present:15728640kB managed:15352200kB mlocked:18532kB kernel_stack:55432kB pagetables:138224kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[852923.301997] lowmem_reserve[]: 0 0 0 0 0
[852923.301998] Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15908kB
[852923.302002] Node 0 DMA32: 13503*4kB (UMEH) 5953*8kB (UMEH) 2033*16kB (UMEH) 20*32kB (UMEH) 2*64kB (H) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 134932kB
[852923.302006] Node 0 Normal: 26299*4kB (UMEH) 7726*8kB (UMEH) 2192*16kB (UMEH) 203*32kB (UEH) 1*64kB (H) 0*128kB 2*256kB (H) 0*512kB 1*1024kB (H) 0*2048kB 0*4096kB = 210172kB
[852923.302010] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[852923.302011] 686530 total pagecache pages
[852923.302013] 63582 pages in swap cache
[852923.302013] Swap cache stats: add 1882218, delete 1818633, find 294651424/294922016
[852923.302014] Free swap = 0kB
[852923.302014] Total swap = 4194300kB
[852923.302014] 4718457 pages RAM
[852923.302015] 0 pages HighMem/MovableOnly
[852923.302015] 110515 pages reserved
[852923.302015] 0 pages cma reserved
[852923.302015] 0 pages hwpoisoned
[852923.302018] CIFS VFS: \\username.your-storagebox.de crypt_message: Failed to init sg
[852923.393347] CIFS VFS: \\username.your-storagebox.de crypt_message: Failed to init sg
Ah, and I guess lowmem_reserve[] does sound a lot like a low memory warning.
Looks like quite a few possible problems, but perhaps it was the swap running out?
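To put numbers on the swap guess, here is a minimal sketch that pulls the swap and RAM figures out of the Mem-Info dump. The helper name is hypothetical, and the 4 kB page size is an assumption, though it is consistent with the page counts and kB figures in the dump (for example active_anon:1987482 pages against 7949928kB).

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages, consistent with the figures in the dump


def summarize_mem_info(text):
    """Pull swap and RAM figures out of a kernel Mem-Info dump."""
    def grab(pattern):
        return int(re.search(pattern, text).group(1))

    free_swap_kb = grab(r"Free swap\s*=\s*(\d+)kB")
    total_swap_kb = grab(r"Total swap\s*=\s*(\d+)kB")
    ram_pages = grab(r"(\d+) pages RAM")
    free_pages = grab(r"\bfree:(\d+)")  # first match is the global free-page count

    return {
        "swap_used_pct": 100.0 * (total_swap_kb - free_swap_kb) / total_swap_kb,
        "ram_gib": ram_pages * PAGE_KB / 1024 ** 2,
        "free_ram_mib": free_pages * PAGE_KB / 1024,
    }


# Lines copied from the dmesg output in this thread (first line shortened).
sample = """\
[852923.301986] bounce:0 free:89613 free_pcp:0 free_cma:0
[852923.302014] Free swap = 0kB
[852923.302014] Total swap = 4194300kB
[852923.302014] 4718457 pages RAM
"""
summary = summarize_mem_info(sample)
print(summary)
```

For these figures the sketch reports swap as 100% used, about 18 GiB of RAM in total and roughly 350 MiB of it free at the time of the allocation failure.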
-
@jdaviescoates not an expert but the second one even looks like some cifs crash. Is PeerTube data stored on cifs?
-
@girish said in Move to object storage failed - think app was crashing but no notifications:
second one even looks like some cifs crash. Is PeerTube data stored on cifs?
No. S3 object storage.
The CIFS mounts are Hetzner Storage Boxes and are only used for backups and my music storage volume.
-
Just to say, I think the issue here was that the server just didn't have enough RAM or CPU to process the large video someone tried to upload. Things have been working nicely since moving my Cloudron to a dedicated server.