My Mastodon backups are failing now, nevermind not even pretending to be successful!
-
It failed again.
Feb 05 18:09:52 box:tasks update 14258: {"percent":30,"message":"Retrying (10) multipart copy of snapshot/app_2ce44b61-0307-4b0b-96a9-31996d95fd42.tar.gz. Error: XMLParserError: Unexpected close tag\nLine: 5\nColumn: 7\nChar: > 504"}
Feb 05 18:09:52 box:tasks update 14258: {"percent":30,"message":"Retrying (10) multipart copy of snapshot/app_2ce44b61-0307-4b0b-96a9-31996d95fd42.tar.gz. Error: XMLParserError: Unexpected close tag\nLine: 5\nColumn: 7\nChar: > 504"}
Feb 05 18:09:52 box:tasks update 14258: {"percent":30,"message":"Retrying (10) multipart copy of snapshot/app_2ce44b61-0307-4b0b-96a9-31996d95fd42.tar.gz. Error: XMLParserError: Unexpected close tag\nLine: 5\nColumn: 7\nChar: > 504"}
Feb 05 18:09:55 2023-02-05T17:09:55.161Z pid=18 tid=gifwu class=ActivityPub::ProcessingWorker jid=f274ffd9d9ef46c2f2e26e1f INFO: start
Feb 05 18:09:55 [b7b5820d-067c-47d5-a414-6e0885a71f70] method=POST path=/inbox format=html controller=ActivityPub::InboxesController action=create status=202 duration=35.62 view=0.00 db=4.33 key=https://mstdn.social/users/autumnrainn#main-key
Feb 05 18:09:55 172.18.0.1 - - [05/Feb/2023:17:09:55 +0000] "POST /inbox HTTP/1.1" 202 5 "-" "http.rb/5.1.0 (Mastodon/4.0.2; +https://mstdn.social/)"
Feb 05 18:09:55 2023-02-05T17:09:55.521Z pid=18 tid=gid06 class=DistributionWorker jid=1aa8ac9daad3d92b7ad5e327 INFO: start
Feb 05 18:09:55 2023-02-05T17:09:55.522Z pid=18 tid=gifwu class=ActivityPub::ProcessingWorker jid=f274ffd9d9ef46c2f2e26e1f elapsed=0.36 INFO: done
Feb 05 18:09:55 2023-02-05T17:09:55.588Z pid=18 tid=gid06 class=DistributionWorker jid=1aa8ac9daad3d92b7ad5e327 elapsed=0.067 INFO: done
Feb 05 18:09:59 [7740f3ed-18ca-4ea8-90e4-c8a6449c2d28] method=POST path=/inbox format=html controller=ActivityPub::InboxesController action=create status=202 duration=19.21 view=0.00 db=0.88 key=https://med-mastodon.com/users/laci#main-key
Feb 05 18:09:59 172.18.0.1 - - [05/Feb/2023:17:09:59 +0000] "POST /inbox HTTP/1.1" 202 5 "-" "http.rb/5.1.0 (Mastodon/4.0.2; +https://med-mastodon.com/)"
Feb 05 18:10:11 2023-02-05T17:10:11.642Z pid=18 tid=gifwu class=ActivityPub::ProcessingWorker jid=2c73ab52c6cf572e4ab7dfb0 INFO: start
Feb 05 18:10:11 [ad319702-8819-4e3c-94ab-0d5d9dd72cea] method=POST path=/inbox format=html controller=ActivityPub::InboxesController action=create status=202 duration=34.05 view=0.00 db=4.14 key=https://mstdn.social/users/stux#main-key
Feb 05 18:10:11 172.18.0.1 - - [05/Feb/2023:17:10:11 +0000] "POST /inbox HTTP/1.1" 202 5 "-" "http.rb/5.1.0 (Mastodon/4.0.2; +https://mstdn.social/)"
Feb 05 18:10:12 2023-02-05T17:10:12.101Z pid=18 tid=gid06 class=DistributionWorker jid=f2e8d2096a03e5a0a22b44f3 INFO: start
Feb 05 18:10:12 2023-02-05T17:10:12.109Z pid=18 tid=gifwu class=ActivityPub::ProcessingWorker jid=2c73ab52c6cf572e4ab7dfb0 elapsed=0.467 INFO: done
Feb 05 18:10:12 2023-02-05T17:10:12.199Z pid=18 tid=gid06 class=DistributionWorker jid=f2e8d2096a03e5a0a22b44f3 elapsed=0.097 INFO: done
Feb 05 18:10:12 2023-02-05T17:10:12.337Z pid=18 tid=gifwu class=ActivityPub::ProcessingWorker jid=79f4cf2ddd7425b5ba3179f7 INFO: start
Feb 05 18:10:12 [7117e894-9a36-47ae-99ef-e6bfed76b484] method=POST path=/inbox format=html controller=ActivityPub::InboxesController action=create status=202 duration=30.65 view=0.00 db=4.97 key=https://mastodon.social/users/Gargron#main-key
Feb 05 18:10:12 172.18.0.1 - - [05/Feb/2023:17:10:12 +0000] "POST /inbox HTTP/1.1" 202 5 "-" "http.rb/5.1.0 (Mastodon/4.0.2; +https://mastodon.social/)"
Feb 05 18:10:12 2023-02-05T17:10:12.763Z pid=18 tid=gifwu class=ActivityPub::ProcessingWorker jid=79f4cf2ddd7425b5ba3179f7 elapsed=0.425 INFO: done
Feb 05 18:10:17 2023-02-05T17:10:17.127Z pid=18 tid=gid06 class=LinkCrawlWorker jid=6d2fc9d59d0ba9218f8f8875 INFO: start
Feb 05 18:10:17 2023-02-05T17:10:17.135Z pid=18 tid=gid06 class=LinkCrawlWorker jid=6d2fc9d59d0ba9218f8f8875 elapsed=0.008 INFO: done
Feb 05 18:10:35 [8164575c-f883-4294-8afd-245393eaed54] method=GET path=/api/v2/instance format=*/* controller=Api::V2::InstancesController action=show status=200 duration=52.18 view=41.19 db=5.35
Feb 05 18:10:35 172.18.0.1 - - [05/Feb/2023:17:10:35 +0000] "GET /api/v2/instance HTTP/1.1" 200 1799 "-" "Python/3.11 aiohttp/3.8.2"
Feb 05 18:10:38 2023-02-05T17:10:38.419Z pid=18 tid=2mya INFO: queueing Scheduler::SuspendedUserCleanupScheduler (suspended_user_cleanup_scheduler)
Feb 05 18:10:38 2023-02-05T17:10:38.421Z pid=18 tid=gifwu class=Scheduler::SuspendedUserCleanupScheduler jid=0d4d923fe63743640bf49bbf INFO: start
Feb 05 18:10:38 2023-02-05T17:10:38.433Z pid=18 tid=gifwu class=Scheduler::SuspendedUserCleanupScheduler jid=0d4d923fe63743640bf49bbf elapsed=0.012 INFO: done
Feb 05 18:10:41 verb f4de37ea-b20c-4f33-babc-d3f1a55324ad Starting stream from timeline:public:local for 127.0.0.1
Feb 05 18:10:46 2023-02-05T17:10:46.844Z pid=18 tid=gifwu class=LinkCrawlWorker jid=d66cb87b3323c02055a565d1 INFO: start
Feb 05 18:10:46 2023-02-05T17:10:46.851Z pid=18 tid=gifwu class=LinkCrawlWorker jid=d66cb87b3323c02055a565d1 elapsed=0.006 INFO: done
Feb 05 18:10:47 2023-02-05T17:10:47.760Z pid=18 tid=2mxq INFO: queueing Scheduler::AccountsStatusesCleanupScheduler (accounts_statuses_cleanup_scheduler)
Feb 05 18:10:47 2023-02-05T17:10:47.768Z pid=18 tid=gid06 class=Scheduler::AccountsStatusesCleanupScheduler jid=24ffa8c0ad05ae3d3006f58d INFO: start
Feb 05 18:10:47 2023-02-05T17:10:47.783Z pid=18 tid=gid06 class=Scheduler::AccountsStatusesCleanupScheduler jid=24ffa8c0ad05ae3d3006f58d elapsed=0.015 INFO: done
Feb 05 18:10:47 [c4707371-b235-4c96-b0d0-c0a0e66ed4a6] method=GET path=/api/v1/timelines/public format=*/* controller=Api::V1::Timelines::PublicController action=show status=200 duration=563.49 view=368.27 db=26.91
Feb 05 18:10:47 172.18.0.1 - - [05/Feb/2023:17:10:47 +0000] "GET /api/v1/timelines/public?local=true HTTP/1.1" 200 7479 "-" "Python/3.11 aiohttp/3.8.2"
Feb 05 18:10:53 172.18.0.1 - - [05/Feb/2023:17:10:53 +0000] "GET /api/v1/instance/peers HTTP/1.1" 200 56281 "-" "Python/3.11 aiohttp/3.8.2"
Feb 05 18:11:03 [35c162c1-8573-4d10-a52b-2c6798d0020e] method=GET path=/api/v1/instance/domain_blocks format=*/* controller=Api::V1::Instances::DomainBlocksController action=index status=404 duration=19.39 view=0.00 db=2.74
Feb 05 18:11:03 172.18.0.1 - - [05/Feb/2023:17:11:03 +0000] "GET /api/v1/instance/domain_blocks HTTP/1.1" 404 31 "-" "Python/3.11 aiohttp/3.8.2"
Feb 05 18:11:12 box:tasks update 14258: {"percent":30,"message":"Retrying (11) multipart copy of snapshot/app_2ce44b61-0307-4b0b-96a9-31996d95fd42.tar.gz. Error: XMLParserError: Unexpected close tag\nLine: 5\nColumn: 7\nChar: > 504"}
Feb 05 18:11:12 box:tasks update 14258: {"percent":30,"message":"Aborting multipart copy of snapshot/app_2ce44b61-0307-4b0b-96a9-31996d95fd42.tar.gz"}
Feb 05 18:11:12 box:tasks update 14258: {"percent":30,"message":"Retrying (11) multipart copy of snapshot/app_2ce44b61-0307-4b0b-96a9-31996d95fd42.tar.gz. Error: XMLParserError: Unexpected close tag\nLine: 5\nColumn: 7\nChar: > 504"}
Feb 05 18:11:12 box:storage/s3 copy: s3 copy error when copying snapshot/app_2ce44b61-0307-4b0b-96a9-31996d95fd42.tar.gz: XMLParserError: Unexpected close tag 5 7 >
Feb 05 18:11:12 box:tasks update 14258: {"percent":30,"message":"Copied 1 files with error: BoxError: Error copying snapshot/app_2ce44b61-0307-4b0b-96a9-31996d95fd42.tar.gz (26077596173 bytes): XMLParserError XMLParserError: Unexpected close tag\nLine: 5\nColumn: 7\nChar: >"}
Feb 05 18:11:12 box:backuptask copy: copied to 2023-02-05-152523-550/app_social.futurnumerique.com_v1.10.0 errored. error: Error copying snapshot/app_2ce44b61-0307-4b0b-96a9-31996d95fd42.tar.gz (26077596173 bytes): XMLParserError XMLParserError: Unexpected close tag 5 7 >
Feb 05 18:11:12 box:apptask run: app error for state pending_backup: BoxError: Error copying snapshot/app_2ce44b61-0307-4b0b-96a9-31996d95fd42.tar.gz (26077596173 bytes): XMLParserError XMLParserError: Unexpected close tag 5 7 >
    at done (/home/yellowtent/box/src/storage/s3.js:348:48)
    at Response.<anonymous> (/home/yellowtent/box/src/storage/s3.js:425:71)
    at Request.<anonymous> (/home/yellowtent/box/node_modules/aws-sdk/lib/request.js:367:18)
    at Request.callListeners (/home/yellowtent/box/node_modules/aws-sdk/lib/sequential_executor.js:106:20)
    at Request.emit (/home/yellowtent/box/node_modules/aws-sdk/lib/sequential_executor.js:78:10)
    at Request.emit (/home/yellowtent/box/node_modules/aws-sdk/lib/request.js:686:14)
    at Request.transition (/home/yellowtent/box/node_modules/aws-sdk/lib/request.js:22:10)
    at AcceptorStateMachine.runTo (/home/yellowtent/box/node_modules/aws-sdk/lib/state_machine.js:14:12)
    at /home/yellowtent/box/node_modules/aws-sdk/lib/state_machine.js:26:10
    at Request.<anonymous> (/home/yellowtent/box/node_modules/aws-sdk/lib/request.js:38:9) { reason: 'External Error', details: {}
Feb 05 18:11:12 box:taskworker Task took 6349.098 seconds
Feb 05 18:11:12 box:tasks setCompleted - 14258: {"result":null,"error":{"stack":"BoxError: Error copying snapshot/app_2ce44b61-0307-4b0b-96a9-31996d95fd42.tar.gz (26077596173 bytes): XMLParserError XMLParserError: Unexpected close tag\nLine: 5\nColumn: 7\nChar: >\n at done (/home/yellowtent/box/src/storage/s3.js:348:48)\n at Response.<anonymous> (/home/yellowtent/box/src/storage/s3.js:425:71)\n at Request.<anonymous> (/home/yellowtent/box/node_modules/aws-sdk/lib/request.js:367:18)\n at Request.callListeners (/home/yellowtent/box/node_modules/aws-sdk/lib/sequential_executor.js:106:20)\n at Request.emit (/home/yellowtent/box/node_modules/aws-sdk/lib/sequential_executor.js:78:10)\n at Request.emit (/home/yellowtent/box/node_modules/aws-sdk/lib/request.js:686:14)\n at Request.transition (/home/yellowtent/box/node_modules/aws-sdk/lib/request.js:22:10)\n at AcceptorStateMachine.runTo (/home/yellowtent/box/node_modules/aws-sdk/lib/state_machine.js:14:12)\n at /home/yellowtent/box/node_modules/aws-sdk/lib/state_machine.js:26:10\n at Request.<anonymous> (/home/yellowtent/box/node_modules/aws-sdk/lib/request.js:38:9)","name":"BoxError","reason":"External Error","details":{},"message":"Error copying snapshot/app_2ce44b61-0307-4b0b-96a9-31996d95fd42.tar.gz (26077596173 bytes): XMLParserError XMLParserError: Unexpected close tag\nLine: 5\nColumn: 7\nChar: >"}}
Feb 05 18:11:12 box:tasks update 14258: {"percent":100,"result":null,"error":{"stack":"BoxError: Error copying snapshot/app_2ce44b61-0307-4b0b-96a9-31996d95fd42.tar.gz (26077596173 bytes): XMLParserError XMLParserError: Unexpected close tag\nLine: 5\nColumn: 7\nChar: >\n at done (/home/yellowtent/box/src/storage/s3.js:348:48)\n at Response.<anonymous> (/home/yellowtent/box/src/storage/s3.js:425:71)\n at Request.<anonymous> (/home/yellowtent/box/node_modules/aws-sdk/lib/request.js:367:18)\n at Request.callListeners (/home/yellowtent/box/node_modules/aws-sdk/lib/sequential_executor.js:106:20)\n at Request.emit (/home/yellowtent/box/node_modules/aws-sdk/lib/sequential_executor.js:78:10)\n at Request.emit (/home/yellowtent/box/node_modules/aws-sdk/lib/request.js:686:14)\n at Request.transition (/home/yellowtent/box/node_modules/aws-sdk/lib/request.js:22:10)\n at AcceptorStateMachine.runTo (/home/yellowtent/box/node_modules/aws-sdk/lib/state_machine.js:14:12)\n at /home/yellowtent/box/node_modules/aws-sdk/lib/state_machine.js:26:10\n at Request.<anonymous> (/home/yellowtent/box/node_modules/aws-sdk/lib/request.js:38:9)","name":"BoxError","reason":"External Error","details":{},"message":"Error copying snapshot/app_2ce44b61-0307-4b0b-96a9-31996d95fd42.tar.gz (26077596173 bytes): XMLParserError XMLParserError: Unexpected close tag\nLine: 5\nColumn: 7\nChar: >"}}
Error copying snapshot/app_2ce44b61-0307-4b0b-96a9-31996d95fd42.tar.gz (26077596173 bytes): XMLParserError XMLParserError: Unexpected close tag 5 7 >
    at done (/home/yellowtent/box/src/storage/s3.js:348:48)
    at Response.<anonymous> (/home/yellowtent/box/src/storage/s3.js:425:71)
    at Request.<anonymous> (/home/yellowtent/box/node_modules/aws-sdk/lib/request.js:367:18)
    at Request.callListeners (/home/yellowtent/box/node_modules/aws-sdk/lib/sequential_executor.js:106:20)
    at Request.emit (/home/yellowtent/box/node_modules/aws-sdk/lib/sequential_executor.js:78:10)
    at Request.emit (/home/yellowtent/box/node_modules/aws-sdk/lib/request.js:686:14)
    at Request.transition (/home/yellowtent/box/node_modules/aws-sdk/lib/request.js:22:10)
    at AcceptorStateMachine.runTo (/home/yellowtent/box/node_modules/aws-sdk/lib/state_machine.js:14:12)
    at /home/yellowtent/box/node_modules/aws-sdk/lib/state_machine.js:26:10
    at Request.<anonymous> (/home/yellowtent/box/node_modules/aws-sdk/lib/request.js:38:9)
Feb 05 18:11:20 172.18.0.1 - - [05/Feb/2023:17:11:20 +0000] "GET /about HTTP/1.1" 301 5 "-" "Mozilla (CloudronHealth)"
What is going on???
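The only thing I can read out of that trace is that the S3 client got back something that isn't XML - the trailing 504 in that "Unexpected close tag ... Char: > 504" message looks like an HTTP 504 HTML error page that the SDK then fails to parse - while retrying the multipart copy of that ~26 GB snapshot. If I wanted to check whether the endpoint answers cleanly at all, something like this minimal copy test should do it (endpoint, credentials, bucket and keys below are placeholders, not my real config):

```js
// Placeholders only: endpoint, credentials, bucket and keys are assumptions, not my real config.
const AWS = require('aws-sdk');

const s3 = new AWS.S3({
  endpoint: 'https://minio-api.example.com',   // assumed Minio S3 endpoint
  accessKeyId: process.env.S3_ACCESS_KEY,
  secretAccessKey: process.env.S3_SECRET_KEY,
  s3ForcePathStyle: true,                      // Minio generally expects path-style requests
  signatureVersion: 'v4',
});

// Server-side copy of a small object that already exists in the bucket.
// A healthy endpoint returns XML the SDK can parse; if whatever sits in front
// of it times out and sends back an HTML 504 page instead, the SDK throws the
// same "XMLParserError: Unexpected close tag" that shows up in the backup logs.
s3.copyObject({
  Bucket: 'cloudron-backups',                          // assumed bucket name
  CopySource: '/cloudron-backups/snapshot/small-test-object',
  Key: 'snapshot/small-test-object-copy',
}, (err, data) => {
  if (err) return console.error('copy failed:', err.code, err.message);
  console.log('copy ok:', data.CopyObjectResult);
});
```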
-
@scooke said in My Mastodon backups are failing now, nevermind not even pretending to be successful!:
What is going on???
I don't know, but perhaps share more info about your set-up.
What backup format? Backed up to where? With what sort of mount? etc.
-
@girish This is really starting to get frustrating. I removed Mastodon from the auto backups because it seemed to be the app causing the problem, likely some weird bit of media in it (seeing as it's Mastodon). Then Peertube started acting up - I had changed buckets and changed the mode from tgz to rsync - and it failed on a video file. Ok, fine, rsync can't do videos.
I reverted to the original bucket, which had been working fine up until 2 weeks ago, changed back to tgz, and removed Peertube too. Tried manually backing up the rest - result: Error uploading snapshot/app_040b01cf-5c14-4db5-ba54-83f3356cb466.tar.gz. Message: Unexpected close tag Line: 5 Column: 7 Char: > HTTP Code: XMLParserError
It's the tail end that keeps popping up: "Message: Unexpected close tag Line: 5 Column: 7 Char: > HTTP Code: XMLParserError" This time it was an unmanaged WP site that's never given me problems.
The entire time I've been using my self-hosted Minio instance on a CapRover installation. It had been working fine up until 2 weeks ago. Something changed. I don't know what.
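One more thought: the object that keeps failing is the ~26 GB Mastodon tarball, and it only dies on the multipart copy step. If CapRover's reverse proxy (nginx, as far as I know) sits in front of Minio with its default 60-second proxy timeouts, a single large server-side part copy could exceed that and come back as an HTML 504 page, which would explain the XML parser error. A rough sketch of how to reproduce just that step, with placeholder names rather than my real config:

```js
// Rough reproduction of just the failing step: one large UploadPartCopy.
// All names below are placeholders; point it at an existing multi-GB object.
const AWS = require('aws-sdk');

const s3 = new AWS.S3({
  endpoint: 'https://minio-api.example.com',
  accessKeyId: process.env.S3_ACCESS_KEY,
  secretAccessKey: process.env.S3_SECRET_KEY,
  s3ForcePathStyle: true,
  signatureVersion: 'v4',
});

async function tryPartCopy() {
  const Bucket = 'cloudron-backups';                       // assumed bucket
  const Key = 'snapshot/part-copy-test';
  const source = '/cloudron-backups/snapshot/some-large-object.tar.gz'; // assumed source object

  const { UploadId } = await s3.createMultipartUpload({ Bucket, Key }).promise();
  try {
    // Copy a single ~1 GiB range server-side. If the proxy in front of Minio
    // cuts the request off before the copy finishes, the SDK sees a 504 or an
    // HTML error body instead of the expected <CopyPartResult> XML.
    const res = await s3.uploadPartCopy({
      Bucket, Key, UploadId,
      PartNumber: 1,
      CopySource: source,
      CopySourceRange: 'bytes=0-1073741823',
    }).promise();
    console.log('part copy ok:', res.CopyPartResult);
  } finally {
    // Always clean up so no half-finished multipart upload is left behind.
    await s3.abortMultipartUpload({ Bucket, Key, UploadId }).promise();
  }
}

tryPartCopy().catch(err => console.error('part copy failed:', err.code, err.message));
```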
-
I've rechecked the bucket, made sure it was public (it had been public but had changed to Custom), confirmed nothing else is turned on (like versioning), and that it has a specific Identity (User). I'm retrying (without Mastodon or Peertube) using rsync again, with the following:
-
@scooke said in My Mastodon backups are failing now, nevermind not even pretending to be successful!:
CapRover
As you yourself have frequently pointed out on this very forum, CapRover isn't really suitable for production as it doesn't Just Work like Cloudron does.
Given you've said this, I'm surprised to learn you are entrusting your backups to it.
Edit: but looking here it looks like the API endpoint got tweaked 4 days ago, could that be it?
https://github.com/caprover/one-click-apps/blob/master/public/v4/apps/minio.yml
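If it helps to narrow it down, a quick way to see which hostname actually answers the S3 API would be something like this (the hostnames below just follow the old "-s3" and new "-api" one-click app patterns, they're not your real ones):

```js
// Check which hostname actually answers the S3 API after the one-click app change.
// The hostnames below are placeholders following the old (-s3) and new (-api) patterns.
const AWS = require('aws-sdk');

const candidates = [
  'https://minio-s3.example.com',
  'https://minio-api.example.com',
];

for (const endpoint of candidates) {
  const s3 = new AWS.S3({
    endpoint,
    accessKeyId: process.env.S3_ACCESS_KEY,
    secretAccessKey: process.env.S3_SECRET_KEY,
    s3ForcePathStyle: true,
    signatureVersion: 'v4',
  });
  // listBuckets is a cheap call that only succeeds against the real S3 API endpoint.
  s3.listBuckets((err, data) => {
    if (err) return console.log(endpoint, 'failed:', err.code);
    console.log(endpoint, 'ok:', data.Buckets.map(b => b.Name).join(', '));
  });
}
```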
-
@jdaviescoates Could be. I will look that up. Suffice to say, it seems the federated apps using Minio as a backend are the ones that have been choking. The backup is running fine thus far (about 25%).
Looks like this is the big difference:
**Dashboard**: https://$$cap_appname.$$cap_root_domain
**S3 API Endpoint**: https://$$cap_appname-api.$$cap_root_domain
where the S3 API Endpoint was
https://$$cap_appname-s3.$$cap_root_domain
before.
-
@girish Not sure where to look. But I removed both Peertube and Mastodon from the Cloudron backup and the most recent backup succeeded. I'll try to locate where the logs are and see what's up.
My guess is that since both of those were ALSO using Minio as their own storage, there was some kind of conflict or overload (even though their respective destinations were different).
@jdaviescoates I do see that the Minio version has been updated on CapRover, but my instance is still running the previous one, so there isn't any problem with the URL change.
-
@girish I found where the logs are, but the backup process for the Mastodon instance took a very long time, and I went to bed. I checked this morning: the bucket which Mastodon uses for its own media is working fine, but there was initially no sign of the app backup. In the app's Cloudron dashboard there was no notice that it had failed, but there was no record of the backup either. I eventually found it in the /snapshots folder in the bucket for the main Cloudron backup. I downloaded the backup configuration for the previous Mastodon app backup, from 2 weeks ago, and everything matches up EXCEPT that the region in that file differs from the region set in Minio. Could this be messing things up?
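I'm not sure a region mismatch alone would produce that XML error, but as I understand it the client signs every request (sigv4) with whatever region it is configured with, so if the backup config and Minio disagree, requests can start being rejected. A rough way to test both values against the bucket (endpoint and the two region values are placeholders; the key is the one from the failing backup):

```js
// Placeholders: endpoint and the two candidate regions are assumptions; the key
// is the snapshot from the failing backup. Tests which region the server accepts.
const AWS = require('aws-sdk');

function makeClient(region) {
  return new AWS.S3({
    endpoint: 'https://minio-api.example.com',
    region,                                   // region used for sigv4 signing
    accessKeyId: process.env.S3_ACCESS_KEY,
    secretAccessKey: process.env.S3_SECRET_KEY,
    s3ForcePathStyle: true,
    signatureVersion: 'v4',
  });
}

// One region from the old backup config, one from the Minio server settings.
for (const region of ['us-east-1', 'eu-central-1']) {
  makeClient(region).headObject({
    Bucket: 'cloudron-backups',               // assumed bucket
    Key: 'snapshot/app_2ce44b61-0307-4b0b-96a9-31996d95fd42.tar.gz',
  }, (err) => {
    console.log(region, err ? `failed: ${err.code}` : 'ok');
  });
}
```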
-
Since you also indicated that at some point you saw the same error with a WordPress app instance, I guess this is not really tied to a specific app but more likely something choking, maybe on low resources somewhere. For example, can you check whether that Minio instance restarts due to running out of memory and thus temporarily responds with broken XML? Also, since those backups now take so long, is there a cron job on the Minio server which may restart related services? I don't know CapRover, but just to rule those things out.
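Something along these lines, left running while a backup is in progress, would at least show whether the endpoint intermittently answers with something that is not valid S3 XML (just a sketch with a placeholder endpoint and bucket, not something we ship):

```js
// Sketch: poll the S3 endpoint every 30s while a backup runs and log anything
// that does not come back as clean S3 XML. Endpoint and bucket are placeholders.
const AWS = require('aws-sdk');

const s3 = new AWS.S3({
  endpoint: 'https://minio-api.example.com',
  accessKeyId: process.env.S3_ACCESS_KEY,
  secretAccessKey: process.env.S3_SECRET_KEY,
  s3ForcePathStyle: true,
  signatureVersion: 'v4',
});

setInterval(() => {
  const started = Date.now();
  s3.listObjectsV2({ Bucket: 'cloudron-backups', MaxKeys: 1 }, (err) => {
    const ms = Date.now() - started;
    if (err) {
      // An XMLParserError here means the server (or whatever proxies it)
      // answered with something that is not S3 XML, e.g. an HTML 502/504 page.
      console.error(new Date().toISOString(), `list failed after ${ms}ms:`, err.code, err.message);
    } else {
      console.log(new Date().toISOString(), `list ok in ${ms}ms`);
    }
  });
}, 30000);
```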
-
@nebulon As far as I can tell, there is no restart or low-resource issue.
This time the backup of the Mastodon app seems to have completed, using rsync, though it took about 5 hours. I say seems because it at least appears in the destination bucket BUT not in the list of backups in the app's Cloudron dashboard.
This might be good enough for me... assuming that I could, for example, download one of the previously listed Backup Configurations, such as
app-backup-config-2023-02-08-xx-xx-xx (my.mastodonapp.com).json
and then tweak it so that when I then Import a Backup, I could choose this edited one pointing to the correct place in my Minio instance... right? Why would it not be listed as complete though??
-
I still have the same problem. I tried creating a new object storage bucket. That didn't work. I tried starting all of the inactive applications and manually updating them until they were all current. Then I restarted the server and tried to back up again. That didn't work, either.
I might try deleting some applications to see if I can isolate the problem.
That didn't work either. I have noticed that the backup process indicates the number of applications to back up; however, it counts 24 instead of 25. I deleted a few more applications and restarted the server. When I tried the backup again, the number of applications did correctly tally with the number Cloudron thought it was going to back up; however, the backup still failed.
The logs seemed to indicate that the problem was with Immich. I decided to back up the data and deleted Immich. The backup still didn't work!
-