Backups keep failing
-
My Immich backups have been failing for some time now.
I am now using the rsync backup instead of tar and I have increased the memory of the machine.Here are the logs: https://upload.disroot.org/r/ndomGdSa#bbIlMwTZpfswm53IwhZ66We9lvxSqjGdHYVMuWzts+I=
Thanks!
-
Too many empty directories? If so search the forum, others have hit the same thing.
-
Please have a quick look at the logs, I see no "too many" directories or executables in the logs.
I have ran thefind -type d -empty
and found 4000 results though. Could that be the cause still? -
Oh sorry I was mislead by the comment of @jdaviescoates . So looking at the logs it seems the upload to S3 fails with some
EPIPE
. This indicates that the remote S3 storage terminates the upload connection prematurely or maybe also your system ran out of memory and that killed the pipeline. We have had various issue with backblaze in the past. Can you maybe tweak the backup task memory limit as well as the part-size to see if that makes a difference? -
I am indeed using backblaze.
I have increased the memory limit from the default 1G to 5G.
I have also increased the part-size from the default 10MB to 128MB (is that what you mean by tweak? Or should I reduce it instead?).
I will let you know how it goes. As it's a big library of photos (100+GB?), it's taking a while. -
I am getting these errors now:
Jan 27 20:47:44 descriptor Jan 27 20:47:44 [01/27/25 19:47:44] INFO Error while closing socket [Errno 9] Bad file Jan 27 20:47:44 [01/27/25 19:47:44] INFO Shutting down Jan 27 20:47:44 [01/27/25 19:47:44] INFO Shutting down due to inactivity. Jan 27 20:47:44 [01/27/25 19:47:44] INFO Waiting for application shutdown. Jan 27 20:47:48 [01/27/25 19:47:48] ERROR Worker (pid:772) was sent SIGINT! Jan 27 20:47:48 [01/27/25 19:47:48] INFO Application shutdown complete. Jan 27 20:47:48 [01/27/25 19:47:48] INFO Booting worker with pid: 907 Jan 27 20:47:48 [01/27/25 19:47:48] INFO Finished server process [772] Jan 27 20:47:57 of inactivity. Jan 27 20:47:57 [01/27/25 19:47:57] INFO Application startup complete. Jan 27 20:47:57 [01/27/25 19:47:57] INFO Created in-memory cache with unloading after 300s Jan 27 20:47:57 [01/27/25 19:47:57] INFO Initialized request thread pool with 4 threads. Jan 27 20:47:57 [01/27/25 19:47:57] INFO Started server process [907] Jan 27 20:47:57 [01/27/25 19:47:57] INFO Waiting for application startup. Jan 28 01:00:00 2025-01-28T00:00:00Z Jan 28 01:00:00 2025-01-28T00:00:00Z Jan 28 01:00:00 2025-01-28T00:00:00Z Jan 28 01:00:00 [Nest] 32 - 01/28/2025, 12:00:00 AM ERROR [Microservices:JobService] Error: Input file is missing: upload/upload/4404f1b3-8534-432f-8c2e-ba205381eb5c/66/b4/66b49043-0cc2-48a9-b163-c138a82362b8.HEIC Jan 28 01:00:00 [Nest] 32 - 01/28/2025, 12:00:00 AM ERROR [Microservices:JobService] Error: Input file is missing: upload/upload/4404f1b3-8534-432f-8c2e-ba205381eb5c/e4/e6/e4e69bc7-abab-4fb2-9b16-d01f28d3819d.jpg Jan 28 01:00:00 [Nest] 32 - 01/28/2025, 12:00:00 AM ERROR [Microservices:JobService] Error: VipsJpeg: Premature end of input file Jan 28 01:00:00 [Nest] 32 - 01/28/2025, 12:00:00 AM ERROR [Microservices:JobService] Object: Jan 28 01:00:00 [Nest] 32 - 01/28/2025, 12:00:00 AM ERROR [Microservices:JobService] Object: Jan 28 01:00:00 [Nest] 32 - 01/28/2025, 12:00:00 AM ERROR [Microservices:JobService] Object: Jan 28 01:00:00 [Nest] 32 - 01/28/2025, 12:00:00 AM ERROR [Microservices:JobService] Unable to run job handler (thumbnailGeneration/generate-thumbnails): Error: Input file is missing: upload/upload/4404f1b3-8534-432f-8c2e-ba205381eb5c/66/b4/66b49043-0cc2-48a9-b163-c138a82362b8.HEIC Jan 28 01:00:00 [Nest] 32 - 01/28/2025, 12:00:00 AM ERROR [Microservices:JobService] Unable to run job handler (thumbnailGeneration/generate-thumbnails): Error: Input file is missing: upload/upload/4404f1b3-8534-432f-8c2e-ba205381eb5c/e4/e6/e4e69bc7-abab-4fb2-9b16-d01f28d3819d.jpg Jan 28 01:00:00 [Nest] 32 - 01/28/2025, 12:00:00 AM ERROR [Microservices:JobService] Unable to run job handler (thumbnailGeneration/generate-thumbnails): Error: VipsJpeg: Premature end of input file Jan 28 01:00:00 "id": "60e8f369-c07d-4e55-b54a-904f455f8366" Jan 28 01:00:00 "id": "78bd4060-73b1-4687-ad99-719856cfed7f"
and
Jan 28 01:00:00 at async Worker.retryIfFailed (/usr/src/app/node_modules/bullmq/dist/cjs/classes/worker.js:581:24) Jan 28 01:00:00 at async Worker.retryIfFailed (/usr/src/app/node_modules/bullmq/dist/cjs/classes/worker.js:581:24) Jan 28 01:00:00 at process.processTicksAndRejections (node:internal/process/task_queues:95:5) Jan 28 01:00:00 at process.processTicksAndRejections (node:internal/process/task_queues:95:5) Jan 28 01:00:00 at process.processTicksAndRejections (node:internal/process/task_queues:95:5) Jan 28 01:00:00 { Jan 28 01:00:00 { Jan 28 01:00:00 { Jan 28 01:00:00 } Jan 28 01:00:00 } Jan 28 01:00:00 } Jan 28 03:00:00 [Nest] 32 - 01/28/2025, 2:00:00 AM ERROR [Microservices:Backup failed with code 1] Database Backup Failure Jan 28 03:00:00 R [Microservices:BackupService] Backup failed with code 1 Jan 28 03:00:00 [Nest] 32 - 01/28/2025, 2:00:00 AM ERROR [Microservices:BackupService] Gzip exited with code 0 but pgdump exited with 1 Jan 28 03:00:00 [Nest] 32 - 01/28/2025, 2:00:00 AM ERROR [Microservices:BackupService] pg_dumpall: error: query failed: ERROR: permission denied for table pg_authid Jan 28 03:00:00 [Nest] 32 - 01/28/2025, 2:00:00 AM LOG [Microservices:BackupService] Database Backup Starting. Database Version: 14 Jan 28 03:00:00 Jan 28 03:00:00 pg_dumpall: error: query was: SELECT rolname FROM pg_authid WHERE rolname !~ '^pg_' ORDER BY 1 Jan 28 16:22:36 [Nest] 47 - 01/28/2025, 3:22:36 PM LOG [Api:EventRepository] Websocket Connect: LBAslGNTdfqXxBm4AAAL Jan 28 16:23:39 [Nest] 47 - 01/28/2025, 3:23:39 PM LOG [Api:EventRepository] Websocket Disconnect: LBAslGNTdfqXxBm4AAAL Jan 28 18:31:37 ['CPUExecutionProvider'], in descending order of Jan 28 18:31:37 preference Jan 28 18:31:37 [01/28/25 17:31:37] INFO Loading visual model 'ViT-B-32__openai' to memory Jan 28 18:31:37 [01/28/25 17:31:37] INFO Setting execution providers to Jan 28 18:31:39 ['CPUExecutionProvider'], in descending order of Jan 28 18:31:39 preference Jan 28 18:31:39 [01/28/25 17:31:39] INFO Loading detection model 'buffalo_l' to memory Jan 28 18:31:39 [01/28/25 17:31:39] INFO Setting execution providers to Jan 28 18:31:40 ['CPUExecutionProvider'], in descending order of Jan 28 18:31:40 preference Jan 28 18:31:40 [01/28/25 17:31:40] INFO Loading recognition model 'buffalo_l' to memory Jan 28 18:31:40 [01/28/25 17:31:40] INFO Setting execution providers to Jan 28 18:36:39 descriptor Jan 28 18:36:39 [01/28/25 17:36:39] INFO Error while closing socket [Errno 9] Bad file Jan 28 18:36:39 [01/28/25 17:36:39] INFO Shutting down Jan 28 18:36:39 [01/28/25 17:36:39] INFO Shutting down due to inactivity. Jan 28 18:36:39 [01/28/25 17:36:39] INFO Waiting for application shutdown. Jan 28 18:36:40 [01/28/25 17:36:40] ERROR Worker (pid:907) was sent SIGINT! Jan 28 18:36:40 [01/28/25 17:36:40] INFO Application shutdown complete. Jan 28 18:36:40 [01/28/25 17:36:40] INFO Booting worker with pid: 946 Jan 28 18:36:40 [01/28/25 17:36:40] INFO Finished server process [907] Jan 28 18:36:47 of inactivity.
I am not sure how to handle that...