Restart loop with 100% CPU since Cloudron app-version v1.37.0

msbt

My Immich is caught in a restart loop while doing this:

Sep 01 08:31:36 2023-09-01 06:31:36,514 INFO success: machine-learning entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
Sep 01 08:31:39 Traceback (most recent call last):
Sep 01 08:31:39   File "<frozen runpy>", line 198, in _run_module_as_main
Sep 01 08:31:39   File "<frozen runpy>", line 88, in _run_code
Sep 01 08:31:39   File "/app/code/machine-learning/app/main.py", line 13, in <module>
Sep 01 08:31:39     from app.models.base import InferenceModel
Sep 01 08:31:39   File "/app/code/machine-learning/app/models/__init__.py", line 1, in <module>
Sep 01 08:31:39     from .clip import CLIPEncoder
Sep 01 08:31:39   File "/app/code/machine-learning/app/models/clip.py", line 8, in <module>
Sep 01 08:31:39     from clip_server.model.clip import BICUBIC, _convert_image_to_rgb
Sep 01 08:31:39 ModuleNotFoundError: No module named 'clip_server'
Sep 01 08:31:40 172.18.0.1 - - [01/Sep/2023:06:31:40 +0000] "GET / HTTP/1.1" 302 5 "-" "Mozilla (CloudronHealth)"
Sep 01 08:31:40 2023-09-01 06:31:40,088 INFO exited: machine-learning (exit status 1; not expected)
Sep 01 08:31:41 2023-09-01 06:31:41,090 INFO spawned: 'machine-learning' with pid 77776
Sep 01 08:31:42 2023-09-01 06:31:42,092 INFO success: machine-learning entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
Sep 01 08:31:42 I20230901 06:31:42.149549 224 batched_indexer.cpp:279] Running GC for aborted requests, req map size: 0
Sep 01 08:31:43 I20230901 06:31:43.207621 223 raft_server.cpp:545] Term: 59, last_index index: 4264, committed_index: 4264, known_applied_index: 4264, applying_index: 0, queued_writes: 0, pending_queue_size: 0, local_sequence: 16159
Sep 01 08:31:43 I20230901 06:31:43.207686 265 raft_server.h:60] Peer refresh succeeded!
Sep 01 08:31:45 Traceback (most recent call last):
Sep 01 08:31:45   File "<frozen runpy>", line 198, in _run_module_as_main
Sep 01 08:31:45   File "<frozen runpy>", line 88, in _run_code
Sep 01 08:31:45   File "/app/code/machine-learning/app/main.py", line 13, in <module>
Sep 01 08:31:45     from app.models.base import InferenceModel
Sep 01 08:31:45   File "/app/code/machine-learning/app/models/__init__.py", line 1, in <module>
Sep 01 08:31:45     from .clip import CLIPEncoder
Sep 01 08:31:45   File "/app/code/machine-learning/app/models/clip.py", line 8, in <module>
Sep 01 08:31:45     from clip_server.model.clip import BICUBIC, _convert_image_to_rgb
Sep 01 08:31:45 ModuleNotFoundError: No module named 'clip_server'
Sep 01 08:31:45 2023-09-01 06:31:45,357 INFO exited: machine-learning (exit status 1; not expected)
Sep 01 08:31:46 2023-09-01 06:31:46,360 INFO spawned: 'machine-learning' with pid 77780
Sep 01 08:31:47 2023-09-01 06:31:47,361 INFO success: machine-learning entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
Sep 01 08:31:50 172.18.0.1 - - [01/Sep/2023:06:31:50 +0000] "GET / HTTP/1.1" 302 5 "-" "Mozilla (CloudronHealth)"
Sep 01 08:31:50 Traceback (most recent call last):

I restored each update in turn to see when it started. The problem seems to have been introduced with v1.37.0: v1.36.0 works fine, and the restarts begin after that.

I also installed a fresh instance to check whether it happens there as well, and it does, so something needs fixing.

Edit: The corresponding update is this one: https://github.com/immich-app/immich/releases/tag/v1.75.0
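
For anyone who wants to confirm the same cause on their own install, here is a minimal sketch that checks whether the module named in the traceback can be found at all. It assumes you run it with the same Python interpreter the machine-learning service uses; the helper name `missing_modules` is made up for illustration and is not part of Immich or Cloudron.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on the import path.

    Hypothetical helper for illustration only. It looks modules up without
    importing them, so a broken package does not crash the check itself.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # 'clip_server' is the module the traceback reports as missing.
    missing = missing_modules(["clip_server"])
    print("missing:", ", ".join(missing) if missing else "none")
```

If `clip_server` shows up as missing, the service will fail at import time on every start, which would match the exit/spawn cycle in the log.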

nebulon

I can reproduce this, looking into it.

nebulon

Just released a new package version which should fix this.
