Certificate rate limit hit during restore process?
-
Hi all,
this is a follow-up to a post I created earlier.
I restored our Cloudron 6.2.1 instance as described. During the restore, every app except Mattermost fails to restore. According to the logs, nginx crashes or fails to start due to a certificate error. After looking at the event log, I found that we hit the rate limit.
{ "domain": "subdomain.example.tld", "errorMessage": "Failed to send new order. Expecting 201, got 429 \"{\\n \\\"type\\\": \\\"urn:ietf:params:acme:error:rateLimited\\\",\\n \\\"detail\\\": \\\"Error creating new order :: too many certificates already issued for exact set of domains: subdomain.example.tld: see https://letsencrypt.org/docs/rate-limits/\\\",\\n \\\"status\\\": 429\\n}\"" }
I really can't tell how this happened. Is there anything I can do right now in order to get everything up and running shortly?
Regards
-
@ctrl Did I understand correctly that the dashboard itself is running (since you are able to view the eventlog) ? Are you using normal certs or wild card certs?
Can you also check /home/yellowtent/boxdata/certs? This directory contains the certs. You can inspect one with:
openssl x509 -text -in <certfilename>.cert
The output will contain:
Validity
Not Before: Mar 3 21:55:58 2021 GMT
Not After : Jun 1 21:55:58 2021 GMT
Can you check the validity of the certs there? Files whose names begin with an underscore are wildcard certs.
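Checking each file by hand gets tedious when the directory holds many certs. A minimal sketch that loops over the directory and prints each cert's validity window; the function name check_certs is made up for illustration, the *.cert glob is assumed from the <certfilename>.cert pattern above, and only the standard openssl x509 options -noout, -dates and -checkend are used:

```shell
# check_certs DIR
# Print the validity window of every *.cert file in DIR.
# DIR defaults to the certs directory named in this thread.
# (check_certs is a hypothetical helper name, not a Cloudron command.)
check_certs() {
    cc_dir="${1:-/home/yellowtent/boxdata/certs}"
    for cc_file in "$cc_dir"/*.cert; do
        # When nothing matches, the glob stays unexpanded; skip it.
        [ -e "$cc_file" ] || continue
        echo "== $cc_file"
        # -dates prints the notBefore= and notAfter= lines.
        openssl x509 -noout -dates -in "$cc_file"
        # -checkend 0 exits non-zero if the cert has already expired.
        if openssl x509 -noout -checkend 0 -in "$cc_file" >/dev/null 2>&1; then
            echo "status: still valid"
        else
            echo "status: expired or unreadable"
        fi
    done
}
```

Calling check_certs without an argument scans the default directory; pass another path to scan a copy of the certs elsewhere.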
-
@girish said in Certificate rate limit hit during restore process?:
@ctrl Did I understand correctly that the dashboard itself is running (since you are able to view the eventlog) ?
Yes. After restoring our instance, the dashboard is running on my.example.tld. After logging in, I can see that all apps are being restored in parallel. Mattermost finishes first and is up and running, while all remaining apps bail out with an error pointing to nginx:
Error reloading nginx: reload exited with code 1 signal null
After a reboot, however, nginx won't come up at all, so the dashboard is no longer reachable.
Are you using normal certs or wild card certs?
We use normal certs since wildcard certs don't seem to be configurable. We use inwx.de as a domain name registrar.
Can you check the validity of the certs there? The files with underscore in the beginning means wildcard cert.
Since we can't use wildcard certs, I checked the cert for the my dashboard domain:
Validity
Not Before: Feb 26 10:00:16 2021 GMT
Not After : May 27 10:00:16 2021 GMT
I found that the directory contains a lot of cruft: old certs for subdomains we no longer use. Is it safe to delete them?
-
@ctrl Yes, correct, you cannot use wildcard certs if you do not use DNS automation (this is how let's encrypt works).
Since we can't use wildcard certs I checked the cert for the my dashboard domain.
Not After : May 27 10:00:16 2021 GMT
Oh, this is good. This means that the certs are valid. We just need to bring up nginx.
I found out the directory contains a lot of cruft of old certs where we don't use the matching subdomains anymore. Is it safe to delete them?
Yes, it's safe to delete them. Cloudron keeps them around so that, if you install an app on that subdomain again, it can reuse the cert. It reuses certs because Let's Encrypt limits how many certs you can get in a week.
Back to nginx: you can see why it's failing using
journalctl -u nginx -a
You can also check /var/log/nginx/error.log.
-
If I recall correctly, it's possible to get a single certificate from letsencrypt that covers several domains. This is promoted by letsencrypt as a way to "work around" the rate limit -- which is enforced per-certificate, not per-domain. Adding this feature to cloudron might be a nice way to prevent hitting the rate limit, especially during a restore operation where many domains are added at once.
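A certificate that covers several domains lists each of them as a DNS entry in its Subject Alternative Name extension. A minimal sketch for seeing which hostnames an existing cert file covers; cert_hostnames is a made-up helper name, and the filter simply collects every DNS: entry from the text dump, which for ordinary server certs should be the SAN list:

```shell
# cert_hostnames FILE
# Print, one per line, the DNS names listed in a certificate.
# (cert_hostnames is a hypothetical helper name.)
cert_hostnames() {
    # -text dumps the certificate in readable form; SAN entries show up
    # as "DNS:name" items separated by commas.
    openssl x509 -noout -text -in "$1" \
        | grep -oE 'DNS:[^,[:space:]]+' \
        | cut -d: -f2 \
        | sort -u
}
```

A single-domain cert prints one line; a multi-domain cert prints one line per covered hostname.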
-
@infogulch Cloudron does not need new certs during a restore operation because it uses the certs from the backup (i.e. of the previous install).
The difficulty with getting multiple domains per cert is that the list of domains is not known beforehand. Usually people install apps one by one... And once the list of apps is "stable", it's not a problem.
-
@girish I logged all steps trying to reproduce the results:
- Install OS: Ubuntu 20.04@Hetzner Cloud instance
- Update OS: 20.04.2
- Restore Cloudron:
cd ~/; mkdir tmp; cd $_
wget https://cloudron.io/cloudron-setup
chmod +x cloudron-setup
./cloudron-setup --provider generic --version 6.1.2
reboot
- Open https://IP_ADDR and accept the self-signed certificate to finish setup.
- Restore via backup config file:
Getting certificate of my.example.tld ...
- Remove old certs from
/home/yellowtent/boxdata/certs
- Tell Firefox to forget all related sites
- Login to my.example.tld ... (Browser still uses self-signed cert)
- Notification center:
Failed to new certs of my.example.tld: Failed to send new order. Expecting 201, got 429 "{\n \"type\": \"urn:ietf:params:acme:error:rateLimited\",\n \"detail\": \"Error creating new order :: too many certificates already issued for exact set of domains: my.example.tld: see https://letsencrypt.org/docs/rate-limits/\",\n \"status\": 429\n}". Renewal will be retried in 12 hours
- Status of Apps: running||restoring (status seems to vary between restore attempts)
9.1 If status is restoring: Restore process fails due to cert/nginx error
9.2 If status is running: Opening an app redirects to https://my.example.tld (Browser still uses self-signed cert)
- Reboot from within UI
- Can't reconnect to UI:
Cloudron is offline. Reconnecting.
journalctl -u nginx -a
Mar 08 15:17:24 hostname systemd[1]: Starting nginx - high performance web server...
Mar 08 15:17:24 hostname nginx[503]: nginx: [emerg] cannot load certificate "/home/yellowtent/boxdata/certs/example.tld.host.cert": BIO_new_file() failed (SSL: error:02001002:s>
Mar 08 15:17:24 hostname systemd[1]: nginx.service: Control process exited, code=exited, status=1/FAILURE
Mar 08 15:17:24 hostname systemd[1]: nginx.service: Failed with result 'exit-code'.
Mar 08 15:17:24 hostname systemd[1]: Failed to start nginx - high performance web server.
Mar 08 15:17:25 hostname systemd[1]: nginx.service: Scheduled restart job, restart counter is at 1.
Mar 08 15:17:25 hostname systemd[1]: Stopped nginx - high performance web server.
Mar 08 15:17:25 hostname systemd[1]: Starting nginx - high performance web server...
cat /var/log/nginx/error.log
2021/03/08 14:45:54 [notice] 25650#25650: signal process started
2021/03/08 14:48:33 [notice] 996#996: signal process started
2021/03/08 14:52:30 [emerg] 1174#1174: cannot load certificate "/home/yellowtent/boxdata/certs/example.tld.host.cert": BIO_new_file() failed (SSL: error:02001002:system library:fopen:No such file or directory:fopen('/home/yellowtent/boxdata/certs/example.tld.host.cert','r') error:2006D080:BIO routines:BIO_new_file:no such file)
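The [emerg] line reports that nginx is configured with a certificate path that does not exist on disk. A minimal sketch for listing every such dangling reference at once; missing_nginx_certs is a made-up helper name, and the config directory has to be passed in because its location on a Cloudron server is not stated in this thread. The filter is deliberately simple and does not handle quoted paths or nginx variables:

```shell
# missing_nginx_certs CONF_DIR
# Scan nginx config files under CONF_DIR for ssl_certificate and
# ssl_certificate_key directives and report paths that do not exist.
# (missing_nginx_certs is a hypothetical helper name.)
missing_nginx_certs() {
    # grep -o keeps only "<directive> <path>" (everything before the ';'),
    # awk picks the path, sort -u removes duplicates.
    grep -rhoE '^[[:space:]]*ssl_certificate(_key)?[[:space:]]+[^;]+' "$1" \
        | awk '{print $2}' \
        | sort -u \
        | while read -r mc_path; do
            [ -e "$mc_path" ] || printf 'missing: %s\n' "$mc_path"
        done
}
```

Once the reported files are back in place, nginx -t checks the configuration without restarting the service.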
-
Thanks for the detailed steps.
@ctrl said in Certificate rate limit hit during restore process?:
Open https://IP_ADDR and accept the self-signed certificate to finish setup.
Restore via backup config file: Getting certificate of my.example.tld ...
Remove old certs from /home/yellowtent/boxdata/certs
Can you clarify this part? Where do you see 'Getting certificate of my.xxx'? I assume you went through the Restore UI, correct? Meaning: on the Setup page there is a Restore link at the bottom, and you uploaded the backup config there. Did you see the message during the restore process? It should only say "Downloading backup..". https://docs.cloudron.io/backups/#restore-cloudron has some screenshots of that restore link.
Finally, why did you remove old certs from /home/yellowtent/boxdata/certs? Or are you referring to removing the old certs of obsolete domains?
-
@girish said in Certificate rate limit hit during restore process?:
Thanks for the detailed steps.
@ctrl said in Certificate rate limit hit during restore process?:
Open https://IP_ADDR and accept the self-signed certificate to finish setup.
Restore via backup config file: Getting certificate of my.example.tld ...
Remove old certs from /home/yellowtent/boxdata/certs
Can you clarify this part? Where do you see 'Getting certificate of my.xxx'? I assume you went through the Restore UI, correct? Meaning: on the Setup page there is a Restore link at the bottom, and you uploaded the backup config there. Did you see the message during the restore process? It should only say "Downloading backup..". https://docs.cloudron.io/backups/#restore-cloudron has some screenshots of that restore link.
The message appears right after I start the restore process and Cloudron reports that it is downloading the backup. To create a screenshot, I repeated the restore just now. This time nginx crashed during the restore, right after the aforementioned message about the certs. I attached some screenshots in order. Nevertheless, I am able to load the login page by force-reloading the my.xxx domain, which still uses the self-signed cert.
Finally, why did you remove old certs from /home/yellowtent/boxdata/certs? Or are you referring to removing the old certs of obsolete domains?
I just removed orphaned certs belonging to apps I tested in the past. I left in place all certs that are still associated with subdomains/apps we use.
-
@ctrl mm, hard to make out why this happens. Are you able to send us the IP address of your server to support@, so we can try to see why it does this? You can run "cloudron-support --enable-ssh" on the server, so we can look into it. Thanks!
-
I will mark this as solved since it got auto-magically solved.
-
@girish Yes, thank you very much, I was just about to write exactly the same thing.
Maybe related: since we use inwx.de as our domain registrar, we have to use the Manual option as the DNS setting. The disadvantage is that we can't use a wildcard certificate in that scenario. I assume that's largely why we ran into the rate limit issue in the first place. Maybe I'm wrong, but I guess at a certain point there were just too many requests at once, since many orphaned certs from apps installed since 2018 had accumulated in ~/yellowtent/boxdata/certs.
- Wouldn't it be an improvement for Manual setups to delete the corresponding certs when an app is uninstalled?
- Would it be possible to include inwx.de as an automated DNS provider? They offer an API.
- For the time being, which registrar would be your personal recommendation for an automated DNS provider?