Backups blocked by dodgy characters
-
I don't mean 80's TV presenters either.
Here's an example:
Error uploading snapshot/path/~�filename.docx. Message: Couldn't parse the specified URI. HTTP Code: InvalidURI
Basically, the � seems to be causing the issue. Is there any way that Cloudron can skip or rename those files so that the entire backup doesn't fail?
The law of luck states that you are much more likely to need the backup you don't have than the ones you do.
-
@marcusquinn do you happen to know which character this is? The one with the question mark is just a placeholder that the font shows for a character it can't display.
@nebulon It's in Nextcloud but all I see is �
It looks like the temp file that's created when you have a .docx file open on Windows. The character sits after the ~ and before the first letter.
I don't have MS Word, so couldn't say what it is.
I've deleted it as I need to get on and have backups running - but maybe there's a way to skip files where the URL can't be parsed and make sure all else is backed up, with just these files copied to a quarantine area or something?
-
I guess you are using rsync? With rsync, you are at the mercy of the characters supported by the backend. S3 has a set of safe characters - https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html . Most printable characters are accepted, but the '?' indicates that it is some fancy non-printable character. Best bet is to either rename it or switch to tgz format...
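To find such files before a backup run, something like the following might help. It is only a rough sketch and not part of Cloudron: the function names are made up, and the checks (invalid UTF-8 on disk, a U+FFFD replacement character, or any Unicode control/format character) are an assumption about what trips up the backend, not the exact rule S3 applies.

```python
import os
import unicodedata


def suspicious_reason(name):
    """Return a short reason if `name` looks risky as an object key, else None."""
    try:
        # os.walk() yields undecodable bytes as lone surrogates,
        # which cannot be encoded back to strict UTF-8.
        name.encode("utf-8")
    except UnicodeEncodeError:
        return "not valid UTF-8 on disk"
    for ch in name:
        if ch == "\ufffd":
            return "contains U+FFFD replacement character"
        # Unicode categories starting with "C" are control, format,
        # private-use and unassigned characters.
        if unicodedata.category(ch).startswith("C"):
            return f"contains non-printable U+{ord(ch):04X}"
    return None


def scan(root):
    """Walk `root` and return (path, reason) for every suspicious name."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            reason = suspicious_reason(name)
            if reason:
                hits.append((os.path.join(dirpath, name), reason))
    return hits
```

Calling `scan()` on the app's data directory and printing each hit with `repr()` avoids the terminal choking on the same character.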
-
@girish Yes, rsync - unfortunately, tarballs aren't an entirely free choice.
I understand that one file might not be uploaded - but why does the whole process have to crap out?
A 99.9999999999% backup is still significantly better than a 0% backup
-
@marcusquinn well, we can't really decide which file is crucial for an app to work and which one can be lost after a restore. So we can't mark a backup as valid if anything goes wrong during the backup, as issues might then go unnoticed and give a false sense of safety, which only hurts once the backup is really needed.
It would be good to understand which character caused this though.
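If the file is still around, one way to pin down the character is to read the filename as raw bytes and print each code point. This is a minimal sketch, assuming the files sit on a Linux filesystem where names are arbitrary bytes; `describe` is a made-up helper name.

```python
import unicodedata


def describe(raw: bytes):
    """List every character of a raw filename with its code point and name."""
    # surrogateescape maps each undecodable byte 0x80..0xFF to U+DC80..U+DCFF,
    # so invalid UTF-8 can be reported instead of raising.
    text = raw.decode("utf-8", errors="surrogateescape")
    out = []
    for ch in text:
        cp = ord(ch)
        if 0xDC80 <= cp <= 0xDCFF:
            out.append(f"raw byte 0x{cp - 0xDC00:02X} (not valid UTF-8)")
        else:
            out.append(f"U+{cp:04X} {unicodedata.name(ch, '<unnamed>')}")
    return out


# Usage idea: passing bytes to os.listdir() returns names as bytes.
#   import os
#   for raw in os.listdir(b"."):
#       print(raw, describe(raw))
```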
-
I guess a way to fix this is to encode local filenames accordingly and decode them again on restore, i.e. some filename mapping scheme. This will require some effort and we have to see if it's worth it. The same thing could be used for long file names with the encryption backend.
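As a rough illustration of what such a mapping could look like (not how Cloudron does or will do it), percent-encoding the raw filename bytes is one reversible option. The names `encode_key` and `decode_key` are hypothetical, and keeping `/` unescaped as the path separator is an assumption.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes


def encode_key(raw_path: bytes) -> str:
    """Map a raw local path to an ASCII-only object key."""
    # Letters, digits and "_.-~" pass through; "/" is kept as the separator.
    # Everything else, including "%" itself, becomes %XX, so the mapping
    # can always be reversed.
    return quote_from_bytes(raw_path, safe="/")


def decode_key(key: str) -> bytes:
    """Recover the original raw path from an object key on restore."""
    return unquote_to_bytes(key)
```

One trade-off: every escaped byte takes three characters, so names with many escapes get longer and could run into the backend's key length limit.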
-
@nebulon If you have access to Windows & Office, then create a .docx file and show hidden files in Windows Explorer to see the temp file it creates while editing. It is whatever character comes after the ~ and before the first letter. I can't tell from Cloudron and don't have MS Office. Anyone else here know? Sorry, that's all I have.
-
@girish Maybe - or just skip the app and back up all other apps. I just think all-or-nothing failures carry more risk than log-and-continue errors.
-
@marcusquinn Right, I think the concept of "partial backups" was suggested elsewhere as well. Currently, Cloudron only has full backups.
@girish *If* there's not a single dodgy character in any app.
So, are you saying it is not possible to back up the main server and all other apps if just one app has a dodgy character in it?
It seems to me that if the system can write the error, then it is still functioning, but chooses to fail on that error rather than log and continue.
Not sure what to say, doesn't seem like a technical impossibility to me, more one of strategy and persuasion.
If there's a way to vote on it, I vote for 99.999999999% backups being preferred to 0% backups. What you can do with that I really don't know.
I can contribute in many ways but this might be a gatekeeper issue, so I'll leave that with you all to debate, and have to set myself a calendar reminder to check that page every time for errors to then hunt and delete the offending MS Office files.
-
@marcusquinn looks like those are two issues then. One is a full backup containing all apps and the platform and the other is the failing app backup alone.
I guess the second one is clear, that we can't keep that 99% backup since we don't know what failed and if that is relevant to the app.
The first one could have an option to exclude such an app from the full backup. I'm just not sure whether that has unnoticed side effects if the whole server has to be restored, but I guess we could include the last working backup of the app in such a case?
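To make that idea concrete, here is a small sketch of how a full backup run could record a failed app with its last working backup instead of aborting. It is purely illustrative: `AppEntry`, `run_full_backup`, the `backup_app` callback and the `last_good` mapping are all hypothetical and do not reflect Cloudron's actual code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class AppEntry:
    app: str
    backup_id: Optional[str]  # None if the app has no usable backup at all
    fresh: bool               # True if taken in this run
    error: Optional[str] = None


def run_full_backup(
    apps: List[str],
    backup_app: Callable[[str], str],
    last_good: Dict[str, str],
) -> Tuple[str, List[AppEntry]]:
    """Back up each app; on failure, reference its last working backup."""
    entries = []
    for app in apps:
        try:
            entries.append(AppEntry(app, backup_app(app), fresh=True))
        except Exception as exc:
            # The failed app backup itself is never marked valid; the error
            # is kept so the run is visibly "partial" rather than silent.
            entries.append(
                AppEntry(app, last_good.get(app), fresh=False, error=str(exc))
            )
    status = "complete" if all(e.fresh for e in entries) else "partial"
    return status, entries
```

The run would still need to surface the "partial" status loudly, otherwise the false sense of safety mentioned above comes back.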