It looks like logrotate was configured to time out after 90 seconds, so the large Etherpad log files failed to rotate and the disk filled. It's probably time to extend the timeouts and review the logrotate config more generally.
sparkwise
Posts
-
App logs filling disk -
App logs filling disk
Thank you!
-
App logs filling disk
Last Friday, we found that our Cloudron apps stopped working due to one app (Etherpad-Lite) filling 100% of /dev/root with >100GB of new app logs. We deleted the logs, rebooted, and chalked it up as a one-time issue. Earlier today, the same thing happened a second time: the disk filled with another 100GB of extra platformdata. Running cloudron-support --troubleshoot and rebooting Cloudron resolved the issue, but without a clear understanding of why this happened, I'm concerned that it could happen again.
This leaves me with a few follow-up questions:
- After having already deleted the excess log files, is there anything we can look at to track down the root cause, or is the "evidence" now all gone?
- Do we need to be actively managing log rotation for apps? Should I be setting up app-specific logrotate config files? Recommendations on how to do this?
- Any recommendations on how to monitor or configure alerts on file size or disk usage on Cloudron? (Server is running on AWS EC2, so perhaps I just do this with AWS Cloudwatch tools.)
- I'm seeing a number of recent disk-full threads on Cloudron and they didn't seem to be localized on Etherpad-Lite, so wondering if there are any platform issues that might be related.
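On the monitoring question, one low-tech option is a small script run from cron that checks how full the root filesystem is. The sketch below is only an illustration, not a Cloudron feature: the function names and the 80% threshold are assumptions, and it only builds a warning string, so delivery (email, CloudWatch, etc.) is left to whatever you already use.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return the used share of the filesystem containing `path`, in percent."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def check_disk(path="/", threshold=80.0):
    """Return a warning string when usage is at or above `threshold`, else None.

    The 80% default is an arbitrary assumption; pick what suits your disk size.
    """
    percent = disk_usage_percent(path)
    if percent >= threshold:
        return f"WARNING: {path} is {percent:.1f}% full (threshold {threshold:.0f}%)"
    return None


if __name__ == "__main__":
    message = check_disk("/", threshold=80.0)
    print(message or "disk usage OK")
```

The percentage can differ slightly from what df reports, since df accounts for blocks reserved for root.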
Some specifics on our situation:
- saw that our externally-hosted uptime server reported that multiple Cloudron-hosted apps were unresponsive.
- visited Cloudron dashboard and found /dev/root at 100% capacity, with "platformdata" filling up the disk usage chart (160GB disk)
- connected via SSH (AWS EC2 "Session Manager" option failed, presumably due to the filled disk)
- found that Etherpad-Lite app had created >100GB of log files in the past day
- tracked down the troubleshooting instructions (idea: link to Troubleshooting page from the "/dev/root at 100% capacity" error dialog)
- ran sudo cloudron-support --troubleshoot and still saw DNS failures, but after rebooting everything started normally again.
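For the "which files filled the disk" step, a short script can list the largest files under a directory before anything gets deleted, so some evidence is kept next time. This is a minimal sketch: largest_files is a hypothetical helper, and /var/log is just a placeholder starting point, not the location where Cloudron stores app logs.

```python
import heapq
import os


def largest_files(root, top_n=10):
    """Return up to `top_n` (size_bytes, path) tuples under `root`, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                # lstat reports the link itself, so symlinks are not followed.
                sizes.append((os.lstat(full_path).st_size, full_path))
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
    return heapq.nlargest(top_n, sizes)


if __name__ == "__main__":
    # Placeholder directory (assumption); substitute the path you suspect.
    for size, path in largest_files("/var/log", top_n=10):
        print(f"{size / 1024**2:10.1f} MiB  {path}")
```

Run it with sudo if the directory contains files your user cannot read; unreadable directories are otherwise silently skipped.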
-
Migrating Cloudron to a new instance type on AWS EC2
We're running our Cloudron on AWS EC2, initially created from an AWS Marketplace AMI. I just attempted to switch to an instance type with a newer Intel-based processor (r5.large => r7i.large), and saw this error message:
Failed to modify instance type for instance. The instance configuration for this AWS Marketplace product is not supported. Please see the AWS Marketplace site for more information about supported instance types, regions, and operating systems.
The Cloudron AMI that we started from is no longer available in the Marketplace. What would you recommend in order to move to a newer processor? Perhaps a fresh Cloudron install on a new instance that is not connected to the Marketplace? Or launch a new Cloudron Marketplace AMI and restore a backup from the current instance? Appreciate any advice on this.
-
Removing Python 2.7
Removing 2.7 seems to have been seamless. Thank you!
-
Removing Python 2.7
Thanks for the pointer! It doesn't look like Inspector is reporting spurious findings, though: Python 2.7 appears to be fully installed and active on the system, and is not just remnant configuration files picked up by SSM.
I don't see any indication that Python 2.7 is used by any Cloudron containers, so I think that removing the python2.7 and python2.7-minimal packages is the best path forward. My hesitation is the presence of the python-is-python2 package:
$ sudo apt-get --simulate remove python2.7 python2.7-minimal
The following packages will be REMOVED:
  python-is-python2 python2 python2-minimal python2.7 python2.7-minimal
$ dpkg -l | grep python-is-python2
ii  python-is-python2  2.7.17-4  all  symlinks /usr/bin/python to the DEPRECATED python2
Wondering if that changes the recommended path forward.
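To make the concern concrete, the simulated output can be checked for packages that would be removed beyond the ones explicitly requested. The sketch below is illustrative only: both function names are hypothetical, and it assumes apt-get lists the affected packages on indented lines directly under the "will be REMOVED" header.

```python
def packages_to_remove(simulated_output):
    """Extract package names listed under apt-get's "will be REMOVED" header."""
    packages = []
    in_section = False
    for line in simulated_output.splitlines():
        if line.startswith("The following packages will be REMOVED"):
            in_section = True
            continue
        if in_section:
            if line.startswith(" "):
                # Assumption: package names sit on indented continuation lines.
                packages.extend(line.split())
            else:
                break
    return packages


def unexpected_removals(simulated_output, requested):
    """Return packages that would be removed but were not explicitly requested."""
    return [pkg for pkg in packages_to_remove(simulated_output) if pkg not in requested]


if __name__ == "__main__":
    # Sample text taken from the post above, with the indentation assumed.
    sample = (
        "The following packages will be REMOVED:\n"
        "  python-is-python2 python2 python2-minimal python2.7 python2.7-minimal\n"
    )
    print(unexpected_removals(sample, {"python2.7", "python2.7-minimal"}))
```

For the output quoted above, this flags python-is-python2, python2 and python2-minimal as the extra packages to think about before removing anything.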
-
Removing Python 2.7
We run Cloudron on an AWS EC2 instance (Ubuntu 22.04 LTS), and Amazon Inspector is flagging security issues with Python 2.7 on this box that are now only being fixed in Ubuntu's "Extended Security Maintenance" (esm3). As best I can tell, we're not actually using Python 2.7 for installed apps, so I'm wondering about uninstalling Python 2.7 entirely from our Cloudron. What would you recommend?
-
Chatwoot 4.0.2 is a beta release candidate
I wanted to flag that the latest Cloudron build is the first beta release candidate for Chatwoot v4. (Chatwoot posted a migration guide to accompany the upgrade, but it looks like pgvector may already be installed and available.)
I spotted this before our Cloudron app auto-updated, so I was able to turn off auto-updates for this app, but I'd recommend marking this beta of a major version so that Cloudron doesn't install it automatically.
-
setting NODE_ENV for production
@nebulon I see. Thank you!
-
setting NODE_ENV for production
It looks like Cloudron is trying to set NODE_ENV=production in start.sh:57, but this clearly isn't working based on what is in the logs.
@robi Adding a .env file with NODE_ENV=production unfortunately didn't change this.
@luckow Warning message in the logs says "This mode is slower for users and less secure than production mode."
Wanted to flag that I see an example in the Etherpad documentation that sets this via Docker Compose: https://docs.etherpad.org/docker.html#:~:text=NODE_ENV%3A production
-
setting NODE_ENV for production
I noticed while restarting the Etherpad app that it is currently loading in development mode.
[WARN] settings - Etherpad is running in Development mode. This mode is slower for users and less secure than production mode. You should set the NODE_ENV environment variable to production by using: export NODE_ENV=production
In which file should I add NODE_ENV=production? I'm not seeing anything in the File Manager for environment variables.
-
How do you delete users?
@bmann Not sure if this answers your question, but our process is this: Initially, I enabled "teams" and disabled "signups" at /settings/admin/flags, then configured some teams and Google Login. When we have new team members join, I invite them to the appropriate team (e.g. /settings/teams/2/members > Add). When they accept the invite, I believe this creates the user account automatically. The issue that we ran into is that there was (and still is) no way to subsequently delete that user account from the admin GUI.
-
Admin Login won't work
@marylou @nebulon Currently, accessing Etherpad's admin interface is a two-step process (after the config mentioned above):
- First, visit /admin-auth/ to authenticate with your Cloudron user. If successful, you'll see the word "Authorized" appear.
- Next, visit /admin. This should no longer redirect you to /admin/login, and should instead show you the Etherpad admin console.
-
Custom skins not loading
@nebulon Thank you! Great to hear. Happy to say that our custom skin loads properly.
-
Custom skins not loading
@nebulon We're circling back to custom skins as being our primary blocker to updating Etherpad. We believe that the Cloudron packaging change that prevents custom skins from working was when these lines were commented out: https://git.cloudron.io/cloudron/etherpad-lite-app/-/blob/master/start.sh?ref_type=heads&blame=0#L29-33
Would it be possible to reinstate symlink creation for custom skins located in app/data/skins? I believe the link needs to be added into the /src/static/skins directory to work.
-
Google Login challenges
@marylou Where we ended up landing: both buttons are visible, but only the Google Login button gets used. Not perfect, but the app is fantastic.
-
Admin Login won't work
I think access to the admin UI will still be needed in order to access Etherpad's Plugin Manager. (I don't believe there's a way to specify or update plugins from the json file.)
-
Custom skins not loading
@nebulon Is there a way that I can add my custom skin files to /run/etherpad-lite/src/static/skins/? I can't access that directory from Cloudron's File Manager.
-
App state incorrectly reported as "Running" immediately after security update reboot
Thanks for clarifying!
-
App state incorrectly reported as "Running" immediately after security update reboot
I've noticed that after I apply security updates from the Cloudron Notifications page, all apps show as "Running" immediately after the Cloudron has finished rebooting, even before apps are passing their own health checks. I suspect all apps should initially be in the "Stopped" state after the reboot.