Backup failing with unspecified error
-
@girish I forgot one more line:
Jan 04 06:16:13 box:shell backup-snapshot/app_dd36cce2-7674-4985-b671-a7dcd2d96a6c code: null, signal: SIGKILL
The rest of the log is just the normal uploading messages. I'd post the whole thing, but there are a lot of file names I'd rather not share.
-
@zjuhasz1 said in Backup failing with unspecified error:
SIGKILL
Usually, this indicates that it got killed because it ran out of memory. Under Backups -> Configure, there is a memory limit slider. Can you give it a lot more memory and try the backup again?
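
For context on the "code: null, signal: SIGKILL" pair in that log line: this is how Node.js reports a child process that was terminated by a signal rather than exiting on its own. The snippet below is a minimal sketch, not Cloudron code. The runAndKill helper is hypothetical, and it sends the SIGKILL itself to stand in for whatever killed the real backup process. It assumes a Linux host.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: start an idle child process, kill it with SIGKILL,
// and resolve with what the 'exit' event reports.
function runAndKill() {
  return new Promise((resolve, reject) => {
    // The child would idle for a minute if left alone.
    const child = spawn(process.execPath, ['-e', 'setTimeout(() => {}, 60000)']);
    child.on('error', reject);
    // For a signal-terminated child, `code` is null and `signal` names the signal.
    child.on('exit', (code, signal) => resolve({ code, signal }));
    child.kill('SIGKILL');
  });
}

runAndKill().then(({ code, signal }) => {
  console.log(`code: ${code}, signal: ${signal}`);
});
```

The exit status alone does not say who sent the signal. An out-of-memory kill is the usual suspect, and the kernel log is the place to confirm it.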
-
@girish It was already at 40GB, but I increased it to 64GB and reduced the concurrent uploads. It does seem to be crashing when uploading very large files. Does it need to store the entire file it's uploading in memory, or does it break large files into parts? I don't understand whether the multipart upload size also applies to rsync backups or only to tgz backups.
-
@zjuhasz1 and @girish I think this looks like the same issue as mine!
https://forum.cloudron.io/topic/4129/update-to-4-1-0-1-23-0-failed
-
@zjuhasz1 I think we (@imc67 and I) found the reason. Can you confirm that the backups are failing when you take an app backup from the Apps view (as opposed to the Backups view)? If so, the fix is this:
- Open /home/yellowtent/box/src/apptaskmanager.js
- Line 76 or so is like this:
tasks.startTask(taskId, { logFile, timeout: 20 * 60 * 60 * 1000 /* 20 hours */, nice: 15 }, function (error, result) {
Change it to (just adds memoryLimit):
tasks.startTask(taskId, { logFile, timeout: 20 * 60 * 60 * 1000 /* 20 hours */, nice: 15, memoryLimit: 4 * 1024 * 1024 * 1024 }, function (error, result) {
- Run systemctl restart box
- Now take a backup.
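
As a quick check of the two numeric options in that startTask call, the arithmetic works out as follows. This is only a sketch: the variable names are made up for illustration, and reading memoryLimit as a byte count is an assumption based on the 1024 factors, not something stated in the post.

```javascript
// Values copied from the startTask options quoted above.
const timeoutMs = 20 * 60 * 60 * 1000;           // hours * minutes * seconds * milliseconds
const memoryLimitBytes = 4 * 1024 * 1024 * 1024; // assumed to be bytes: 4 GiB

console.log(timeoutMs);                    // 72000000
console.log(timeoutMs / (60 * 60 * 1000)); // 20 (hours)
console.log(memoryLimitBytes);             // 4294967296
console.log(memoryLimitBytes / 2 ** 30);   // 4 (GiB)
```

If that assumption holds, raising the limit is a matter of changing the leading 4.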
-
@zjuhasz1 What kind of timeout are you hitting? Do you have any logs/error message?
There are two high-level timeouts. The first covers individual operations during the backup process, such as upload and copy. I can tell you how to tweak those, if that is where it's failing. The second applies to the backup as a whole: it can take a day at most. After a day, the backup is killed (because it's assumed we probably hit some bug).
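
To picture how two layers of timeouts like these interact, here is a rough sketch. It is not how Cloudron implements them: withTimeout, fakeStep and runBackup are hypothetical names, the durations are milliseconds chosen to keep the demo short, and a timeout here only rejects a promise, whereas the real task is killed.

```javascript
// Hypothetical helper: reject if `promise` does not settle within `ms`.
function withTimeout(promise, ms, label) {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

// Stand-in for an upload or copy operation that takes `ms` milliseconds.
const fakeStep = (ms) => new Promise((resolve) => setTimeout(() => resolve('done'), ms));

// Each step gets its own timeout, and the whole run gets an overall one.
function runBackup(stepDurations, stepTimeoutMs, totalTimeoutMs) {
  const allSteps = (async () => {
    for (const duration of stepDurations) {
      await withTimeout(fakeStep(duration), stepTimeoutMs, 'step');
    }
    return 'backup finished';
  })();
  return withTimeout(allSteps, totalTimeoutMs, 'backup');
}

runBackup([10, 10], 200, 1000).then(console.log, (e) => console.log(e.message));
```

A single slow operation trips the per-step limit, while many operations that are each fast enough can still trip the overall limit. Which of the two is being hit determines which setting is worth tweaking.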