out of space error leading to missing certs
-
@roofboard So, you just have to delete the app config files in /etc/nginx/applications and then run systemctl restart nginx and systemctl restart box. When you restart box, it will re-generate the nginx config for the dashboard alone. Once you have access to the dashboard, you can go to the Location section of each app and click save. This will regenerate the nginx config of the app.
/etc/nginx/nginx.conf should be:

    user www-data;

    # detect based on available CPU cores
    worker_processes auto;

    # this is 4096 by default. See /proc/<PID>/limits and /etc/security/limits.conf
    # usually twice the worker_connections (one for upstream, one for downstream)
    # see also LimitNOFILE=16384 in systemd drop-in
    worker_rlimit_nofile 8192;

    pid /run/nginx.pid;

    events {
        # a single worker has these many simultaneous connections max
        worker_connections 4096;
    }

    http {
        include mime.types;
        default_type application/octet-stream;

        # required for long host names
        server_names_hash_bucket_size 128;

        access_log /var/log/nginx/access.log combined;

        sendfile on;

        # timeout for client to finish sending headers
        client_header_timeout 30s;

        # timeout for reading client request body (successive read timeout and not whole body!)
        client_body_timeout 60s;

        # keep-alive connections timeout in 65s. this is because many browsers timeout in 60 seconds
        keepalive_timeout 65s;

        # zones for rate limiting
        limit_req_zone $binary_remote_addr zone=admin_login:10m rate=10r/s; # 10 request a second

        include applications/*.conf;
    }
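The recovery steps described in this post (clear out /etc/nginx/applications, then restart nginx and box) could be scripted roughly as below. This is only a sketch: it moves the files to a backup directory instead of deleting them, and the function names and the backup location are hypothetical, not part of Cloudron.

```python
import shutil
import subprocess
from pathlib import Path

# The two commands quoted in the post. They need root and are only
# executed when restart_services() is called explicitly.
RESTART_COMMANDS = (
    ["systemctl", "restart", "nginx"],
    ["systemctl", "restart", "box"],
)


def move_app_configs(app_dir, backup_dir):
    """Move every *.conf out of app_dir into backup_dir; return the moved names."""
    app_dir, backup_dir = Path(app_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for conf in sorted(app_dir.glob("*.conf")):
        shutil.move(str(conf), str(backup_dir / conf.name))
        moved.append(conf.name)
    return moved


def restart_services():
    """Restart nginx, then box, raising on the first command that fails."""
    for command in RESTART_COMMANDS:
        subprocess.run(command, check=True)
```

On a real server this would be called as move_app_configs("/etc/nginx/applications", "/root/nginx-conf-backup") followed by restart_services(), where the backup path is an arbitrary choice.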
-
@girish
Ok, I restored the nginx.conf file in the yellowtent, then moved everything in /etc/nginx/applications into a new folder called old. Did a restart and it is still not getting there...

    root@my:/etc/nginx/applications# ls
    0fa72b5f-441d-4bef-bee3-665f4d85dc3e.conf  4b5dbf96-42b4-4a13-9b9f-15d5228dce9c.conf  a1c46e70-b09e-419f-8461-3e8e40da3870.conf  b3cbed12-eecc-42f2-93ba-b0834a3b3f5b.conf  default.conf
    1a907fb3-616a-4b71-930d-c132adc14357.conf  4eaa7fe2-9c72-46c7-946e-f7ed41891a72.conf  a9948920-c8d0-4e14-9139-45ce8a78b549.conf  b892da04-793f-4449-a6d4-ed8564455d46.conf  e67529c6-edb3-47a5-890f-580adc2d7c61.conf
    3d520625-8452-4e93-87c7-e03f89e4286b.conf  9cbc7dcd-5202-4e5f-9730-9491d8dc4077.conf  abfd70d6-750a-4621-9072-82da26e9df8f.conf  bdfaef04-4f9d-433e-aaf7-44e6146acb01.conf  my.draglabs.com.conf
    root@my:/etc/nginx/applications# sudo mv *.conf old/
    root@my:/etc/nginx/applications# ls
    old
@roofboard
When I try to start nginx in one tab, with journalctl -u nginx -fa running in another tab, this is the error that I am getting:

    Jun 03 20:58:21 my.draglabs.com systemd[1]: Failed to start nginx - high performance web server.
    Jun 03 20:58:21 my.draglabs.com systemd[1]: nginx.service: Scheduled restart job, restart counter is at 4.
    Jun 03 20:58:21 my.draglabs.com systemd[1]: Stopped nginx - high performance web server.
    Jun 03 20:58:21 my.draglabs.com systemd[1]: Starting nginx - high performance web server...
    Jun 03 20:58:21 my.draglabs.com nginx[22106]: nginx: [emerg] cannot load certificate key "/home/yellowtent/platformdata/nginx/cert/_.draglabs.com.key": PEM_read_bio_PrivateKey() failed (SSL: error:0909006C:PEM routines:get_name:no start line:Expecting: ANY PRIVATE KEY)
    Jun 03 20:58:21 my.draglabs.com systemd[1]: nginx.service: Control process exited, code=exited, status=1/FAILURE
    Jun 03 20:58:21 my.draglabs.com systemd[1]: nginx.service: Failed with result 'exit-code'.
    Jun 03 20:58:21 my.draglabs.com systemd[1]: Failed to start nginx - high performance web server.
    Jun 03 20:58:21 my.draglabs.com systemd[1]: nginx.service: Scheduled restart job, restart counter is at 5.
    Jun 03 20:58:21 my.draglabs.com systemd[1]: Stopped nginx - high performance web server.
    Jun 03 20:58:21 my.draglabs.com systemd[1]: nginx.service: Start request repeated too quickly.
    Jun 03 20:58:21 my.draglabs.com systemd[1]: nginx.service: Failed with result 'exit-code'.
    Jun 03 20:58:21 my.draglabs.com systemd[1]: Failed to start nginx - high performance web server.
-
@roofboard said in out of space error leading to missing certs:
Jun 03 20:58:21 my.draglabs.com nginx[22106]: nginx: [emerg] cannot load certificate key "/home/yellowtent/platformdata/nginx/cert/_.draglabs.com.key": PEM_read_bio_PrivateKey() failed (SSL: error:0909006C:PEM routines:get_name:no start line:Expecting: ANY PRIVATE KEY)
Some nginx config file is loading this file (it's under /etc/nginx/applications/*; you can temporarily move all the files there somewhere else). Can you please check which one? That conf needs to be deleted and then nginx has to be restarted. The reason nginx is not starting is most likely that the key is a 0 byte file.
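A rough way to answer the "which conf loads it?" question is to list the zero-byte files in the cert directory and search the app configs for their paths. The sketch below assumes the configs reference the key by the same absolute path that is passed in as cert_dir; the function names are made up for illustration.

```python
from pathlib import Path


def find_empty_files(cert_dir):
    """Return the zero-byte regular files directly inside cert_dir, sorted."""
    return sorted(
        p for p in Path(cert_dir).iterdir()
        if p.is_file() and p.stat().st_size == 0
    )


def confs_referencing(conf_dir, target):
    """Return the *.conf files in conf_dir whose text mentions the path `target`."""
    needle = str(target)
    return sorted(
        conf for conf in Path(conf_dir).glob("*.conf")
        if needle in conf.read_text(errors="replace")
    )


def report(conf_dir, cert_dir):
    """Map each zero-byte file in cert_dir to the conf files that load it."""
    return {
        str(empty): [str(conf) for conf in confs_referencing(conf_dir, empty)]
        for empty in find_empty_files(cert_dir)
    }
```

With the paths from this thread, the call would be report("/etc/nginx/applications", "/home/yellowtent/platformdata/nginx/cert"). An empty list for a key means no app config mentions it, so the reference would have to come from somewhere else.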
-
@girish hmmmm
When I moved the conf files all the way out of the nginx folder into /old, deleted the app config files in /etc/nginx/applications, and ran systemctl restart nginx and systemctl restart box, it momentarily started but Cloudron would not load. I rebooted and tried to start nginx using the command systemctl restart nginx
and below is the output from journalctl -u nginx -fa
    Jun 03 21:15:12 my.draglabs.com nginx[12053]: nginx: [emerg] cannot load certificate key "/home/yellowtent/platformdata/nginx/cert/_.draglabs.com.key": PEM_read_bio_PrivateKey() failed (SSL: error:0909006C:PEM routines:get_name:no start line:Expecting: ANY PRIVATE KEY)
    Jun 03 21:15:12 my.draglabs.com systemd[1]: nginx.service: Control process exited, code=exited, status=1/FAILURE
    Jun 03 21:15:12 my.draglabs.com systemd[1]: nginx.service: Failed with result 'exit-code'.
    Jun 03 21:15:12 my.draglabs.com systemd[1]: Failed to start nginx - high performance web server.
    Jun 03 21:15:12 my.draglabs.com systemd[1]: nginx.service: Scheduled restart job, restart counter is at 1.
    Jun 03 21:15:12 my.draglabs.com systemd[1]: Stopped nginx - high performance web server.
    Jun 03 21:15:12 my.draglabs.com systemd[1]: Starting nginx - high performance web server...
    Jun 03 21:15:12 my.draglabs.com nginx[12062]: nginx: [emerg] cannot load certificate key "/home/yellowtent/platformdata/nginx/cert/_.draglabs.com.key": PEM_read_bio_PrivateKey() failed (SSL: error:0909006C:PEM routines:get_name:no start line:Expecting: ANY PRIVATE KEY)
    Jun 03 21:15:12 my.draglabs.com systemd[1]: nginx.service: Control process exited, code=exited, status=1/FAILURE
    Jun 03 21:15:12 my.draglabs.com systemd[1]: nginx.service: Failed with result 'exit-code'.
    Jun 03 21:15:12 my.draglabs.com systemd[1]: Failed to start nginx - high performance web server.
    Jun 03 21:15:13 my.draglabs.com systemd[1]: nginx.service: Scheduled restart job, restart counter is at 2.
    Jun 03 21:15:13 my.draglabs.com systemd[1]: Stopped nginx - high performance web server.
    Jun 03 21:15:13 my.draglabs.com systemd[1]: Starting nginx - high performance web server...
    Jun 03 21:15:13 my.draglabs.com nginx[12068]: nginx: [emerg] cannot load certificate key "/home/yellowtent/platformdata/nginx/cert/_.draglabs.com.key": PEM_read_bio_PrivateKey() failed (SSL: error:0909006C:PEM routines:get_name:no start line:Expecting: ANY PRIVATE KEY)
    Jun 03 21:15:13 my.draglabs.com systemd[1]: nginx.service: Control process exited, code=exited, status=1/FAILURE
    Jun 03 21:15:13 my.draglabs.com systemd[1]: nginx.service: Failed with result 'exit-code'.
    Jun 03 21:15:13 my.draglabs.com systemd[1]: Failed to start nginx - high performance web server.
    Jun 03 21:15:13 my.draglabs.com systemd[1]: nginx.service: Scheduled restart job, restart counter is at 3.
    Jun 03 21:15:13 my.draglabs.com systemd[1]: Stopped nginx - high performance web server.
    Jun 03 21:15:13 my.draglabs.com systemd[1]: Starting nginx - high performance web server...
    Jun 03 21:15:13 my.draglabs.com nginx[12070]: nginx: [emerg] cannot load certificate key "/home/yellowtent/platformdata/nginx/cert/_.draglabs.com.key": PEM_read_bio_PrivateKey() failed (SSL: error:0909006C:PEM routines:get_name:no start line:Expecting: ANY PRIVATE KEY)
    Jun 03 21:15:13 my.draglabs.com systemd[1]: nginx.service: Control process exited, code=exited, status=1/FAILURE
    Jun 03 21:15:13 my.draglabs.com systemd[1]: nginx.service: Failed with result 'exit-code'.
    Jun 03 21:15:13 my.draglabs.com systemd[1]: Failed to start nginx - high performance web server.
    Jun 03 21:15:13 my.draglabs.com systemd[1]: nginx.service: Scheduled restart job, restart counter is at 4.
    Jun 03 21:15:13 my.draglabs.com systemd[1]: Stopped nginx - high performance web server.
    Jun 03 21:15:13 my.draglabs.com systemd[1]: Starting nginx - high performance web server...
    Jun 03 21:15:13 my.draglabs.com nginx[12072]: nginx: [emerg] cannot load certificate key "/home/yellowtent/platformdata/nginx/cert/_.draglabs.com.key": PEM_read_bio_PrivateKey() failed (SSL: error:0909006C:PEM routines:get_name:no start line:Expecting: ANY PRIVATE KEY)
    Jun 03 21:15:13 my.draglabs.com systemd[1]: nginx.service: Control process exited, code=exited, status=1/FAILURE
    Jun 03 21:15:13 my.draglabs.com systemd[1]: nginx.service: Failed with result 'exit-code'.
    Jun 03 21:15:13 my.draglabs.com systemd[1]: Failed to start nginx - high performance web server.
    Jun 03 21:15:13 my.draglabs.com systemd[1]: nginx.service: Scheduled restart job, restart counter is at 5.
    Jun 03 21:15:13 my.draglabs.com systemd[1]: Stopped nginx - high performance web server.
    Jun 03 21:15:13 my.draglabs.com systemd[1]: nginx.service: Start request repeated too quickly.
    Jun 03 21:15:13 my.draglabs.com systemd[1]: nginx.service: Failed with result 'exit-code'.
    Jun 03 21:15:13 my.draglabs.com systemd[1]: Failed to start nginx - high performance web server.
-
@girish said in out of space error leading to missing certs:
/etc/nginx/applications/
The key being referred to is definitely a zero byte file. Also, draglabs.com is the main domain to which I log in. Is it possible that the conf is regenerating pointers to /home/yellowtent/platformdata/nginx/cert/_.draglabs.com.key?
-
@girish said in out of space error leading to missing certs:
re-generate the nginx config for the dashboard alone. Once you have access to the dashboard, you can go to Location section of each app and click save. This will regenerate nginx config of the app
FIXED!!!
It is difficult to tell whether deleting the conf files from the folder /etc/nginx/applications, restarting unbound (following the instructions), and then running systemctl restart nginx and systemctl restart box contributed to the fix.
I say that because unbound definitely was not working at one point.
And as I remember, nginx did start momentarily. However, the solution came when I deleted the corrupted zero byte private key from the folder /home/yellowtent/platformdata/nginx/cert/
When that file was deleted, I was able to log in without SSL using Firefox. Once in, under the Domains and Certs section of Cloudron, I was able to click on Renew All Certs. That fixed SSL, and I was able to go into each app and reassign the DNS settings by clicking save.
-
@girish there is no way to trigger certificate renewal over the (SSH) console?
I had a bug (a couple of months ago) that I never reported, where stopped apps did not get a new cert and nginx failed to launch because of outdated/invalid certs, making Cloudron break (no nginx --> no dashboard) on system reboot. I fixed it by just copying over current cert files from working (non-stopped) apps. They were obviously not valid for those stopped apps, but I was able to start nginx, start the stopped apps and renew their certs.
So in short: it would be nice to have a way to trigger cert renewal via a console command and/or to extend the troubleshooting guide with cert related stuff.
-
Also this whole issue was caused by running out of space - I took a look at some of the other posts on out of space crashes and can tell it is a difficult problem to solve.
Supposedly there is a running out of space warning, but I never got that warning.
I was thinking that a good solution for the running out of space error would involve taking the cron job which calculates remaining space every 'n' minutes and tracking how that value changes over 'x' hours to arrive at a time to disk full.
This could relatively accurately predict if an out of space crash is pending or imminent and, if so, do things like stop processes, prevent backups (if backing up to the local filesystem), etc.
Essentially:
- predict the crash with a pinch of calculus.
- send a warning to the administrator.
- follow a contingency to protect the server.
I can imagine many ways this could happen, and my example is ONLY one of them. Any program can crash Cloudron this way: I could have been copying video files, it could have been NextCloud, or it could have been a spam attack on a mail server.
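The "pinch of calculus" idea can be illustrated with a least-squares fit over recent free-space samples: the slope is the fill rate, and dividing the latest free space by it gives a time-to-full estimate. This is only a sketch of the suggestion, not how Cloudron works; the function names and the warn/act thresholds are invented for the example.

```python
def estimate_seconds_until_full(samples):
    """Estimate seconds until free space reaches zero.

    `samples` is a list of (unix_time_seconds, free_bytes) pairs, oldest first.
    Returns None if there are too few samples or free space is not shrinking.
    """
    if len(samples) < 2:
        return None
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_f = sum(f for _, f in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None
    # Least-squares slope: change in free bytes per second.
    slope = sum((t - mean_t) * (f - mean_f) for t, f in samples) / var_t
    if slope >= 0:
        return None
    latest_free = samples[-1][1]
    return latest_free / -slope


def plan_action(seconds_left, warn_below=24 * 3600, act_below=3600):
    """Pick 'ok', 'warn' (notify the admin) or 'contingency' (protect the server)."""
    if seconds_left is None or seconds_left >= warn_below:
        return "ok"
    if seconds_left < act_below:
        return "contingency"
    return "warn"
```

For example, three samples taken a minute apart that each lose 100 bytes of free space give a fill rate of 100 bytes per minute, so 800 remaining bytes translate to roughly 480 seconds until the disk is full.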
-
@roofboard yes, agreed. I don't like the way it currently works, where filling up disk space brings everything down. Currently, we have a simple cron checker which gives alerts when free disk space nears some threshold, but this fails in many cases because it runs only every 6 hours or so (it's not run too often to prevent disk churn).
I think a good long term solution is to figure out how to limit disk usage of apps. I think in another thread there is an idea that maybe all appdata can be stored in an XFS partition. We can then enforce quotas on apps.
-
@subven said in out of space error leading to missing certs:
nginx failed to launch because of outdated/invalid certs, making Cloudron break (no nginx --> no dashboard) on system reboot.
Yes, indeed, this is a bug. As @roofboard also found out, the code checks whether a cert file exists but not whether it's corrupt. I will get this fixed, so at the very least, restarting the box code will get the dashboard back up.
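To illustrate the difference between "exists" and "is usable", here is a minimal sketch of a shallow sanity check. It only looks for a PEM start line, which is what the "no start line" error in the logs above complains about; it does not parse or validate the key, and it is not the actual box code.

```python
from pathlib import Path

PEM_BEGIN = b"-----BEGIN "


def looks_like_pem(path):
    """Return True if `path` is a non-empty file containing a PEM start line."""
    path = Path(path)
    if not path.is_file():
        return False
    data = path.read_bytes()
    # An empty file yields no lines, so any() is False for zero-byte files.
    return any(line.startswith(PEM_BEGIN) for line in data.splitlines())
```

A zero-byte key like the one in this thread fails this check, so a caller could regenerate or skip the cert instead of handing it to nginx.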
-
@girish said in out of space error leading to missing certs:
only every 6 hours or so
The predictive aspect of @roofboard's suggestion is also a good one: track the rate of change a bit, perhaps checking more frequently as we approach higher thresholds (>80%) and less frequently when out of the danger zone (<80%).
Combining this with an email to the admin, which is more likely to be seen than a UI notification, would be great until we add mobile notifications via external messaging services, which is in the pipeline.
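One way to picture the "check more often in the danger zone" idea is a function that maps current disk usage to the delay before the next check. The sketch below keeps the 6 hour interval mentioned earlier while usage is under 80% and shrinks it linearly towards 5 minutes as the disk fills; the 5 minute floor and the linear ramp are assumptions made for the example.

```python
def next_check_interval(used_fraction, threshold=0.80,
                        slow_seconds=6 * 3600, fast_seconds=5 * 60):
    """Return seconds to wait before the next disk check.

    Below `threshold` the slow interval is used. Above it, the interval
    shrinks linearly from slow_seconds down to fast_seconds at 100% usage.
    """
    if not 0.0 <= used_fraction <= 1.0:
        raise ValueError("used_fraction must be between 0 and 1")
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    if used_fraction < threshold:
        return slow_seconds
    # Position inside the danger zone: 0.0 at the threshold, 1.0 when full.
    danger = (used_fraction - threshold) / (1.0 - threshold)
    return round(slow_seconds - danger * (slow_seconds - fast_seconds))
```

A fill-rate estimate could be layered on top of this, for instance by also shortening the interval when free space is dropping quickly.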
-
@girish said in out of space error leading to missing certs:
@roofboard yes, agreed. I don't like the way it currently works, where filling up disk space brings everything down. Currently, we have a simple cron checker which gives alerts when free disk space nears some threshold, but this fails in many cases because it runs only every 6 hours or so (it's not run too often to prevent disk churn).
I think a good long term solution is to figure out how to limit disk usage of apps. I think in another thread there is an idea that maybe all appdata can be stored in an XFS partition. We can then enforce quotas on apps.
A good shorter term solution would be to make the level below which the alert is sent configurable. Depending on whether you use your server for storing text files or for downloading video, your "low disk" tolerance will be wildly different.
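As a rough illustration of a configurable alert level, the check below takes the minimum free space as a parameter instead of hard-coding it. It uses Python's standard shutil.disk_usage; the function name and message format are invented, and nothing here reflects an existing Cloudron setting.

```python
import shutil


def low_disk_alert(path, min_free_bytes):
    """Return a warning string if free space on `path` is below min_free_bytes, else None."""
    usage = shutil.disk_usage(path)
    if usage.free < min_free_bytes:
        return (
            f"Low disk space on {path}: {usage.free} bytes free, "
            f"threshold is {min_free_bytes} bytes"
        )
    return None
```

Someone storing text files might call low_disk_alert("/", 2 * 1024**3), while a server that downloads video might want a threshold of 50 * 1024**3 or more.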
-
@subven said in out of space error leading to missing certs:
@girish there is no way to trigger certificate renewal over the (SSH) console?
I'd like an answer to this question, as I just ran into the missing cert problem too.
Having deleted all the conf/cert files, and gotten nginx started, the UI is still not accessible after box restart. All apps are inaccessible too.
A box restart seems to recreate /etc/nginx/applications/my.domain.conf BUT doesn't check whether /home/yellowtent/platformdata/nginx/certs/my.domain.host.cert is there. How are they regenerated from the CLI?
-