Cloudron makes it easy to run web apps like WordPress, Nextcloud, GitLab on your server. Find out more or install now.


Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • Bookmarks
  • Search
Skins
  • Light
  • Brite
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Brand Logo

Cloudron Forum

Apps | Demo | Docs | Install
  1. Cloudron Forum
  2. Support
  3. Recurrent Cloudron Downtime - Request for Support

Recurrent Cloudron Downtime - Request for Support

Scheduled Pinned Locked Moved Solved Support
awsec2unresponsive
9 Posts 6 Posters 205 Views 7 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • F Offline
    F Offline
    Felipe.rubilar
    wrote last edited by joseph
    #1

    Dear Support Team,

    For the past three weeks, we have been experiencing recurrent issues with our Cloudron instance. Specifically, the system goes down and all the applications hosted on it become unavailable. This has been happening 1-2 times per week during this period.

    Each time this issue occurs, we are forced to restart the server on AWS to restore the services and make the applications accessible again. This situation is significantly impacting our organization, as our team members are unable to work with the tools hosted within the Cloudron container during these downtimes.

    We have monitored the server resources (CPU, RAM, storage, etc.), and everything appears to be within normal ranges. Unfortunately, we do not have any logs or additional details to identify the root cause of these failures.

    We kindly request your assistance in investigating and resolving this issue.

    We appreciate your prompt support and look forward to your guidance to prevent further disruptions.

    Best regards,
    Felipe Rubilar
    Cloud Engineer

    1 Reply Last reply
    1
    • jamesJ Offline
      jamesJ Offline
      james
      Staff
      wrote last edited by
      #2

      Hello @Felipe.rubilar
      If you feel comfortable doing so, please provide the logs of your system, so people here can take a look.
      Also, since you are monitoring CPU, RAM, storage, etc please provide some Screenshots of the weeks where this issue arrises.

      1 Reply Last reply
      0
      • J Offline
        J Offline
        joseph
        Staff
        wrote last edited by
        #3

        @Felipe.rubilar When the apps become unresponsive, is the server up? i.e can you ping the server? Can you ssh into the server?

        If you are unable to SSH into the server, this indicates a problem on the server side. Which EC2 instance are you using? If you use lower end EC2 servers, they have a concept of CPU credits. This is usually the root cause of unresponsiveness.

        1 Reply Last reply
        1
        • F Offline
          F Offline
          Felipe.rubilar
          wrote last edited by Felipe.rubilar
          #4

          Hi @james and @joseph,

          Cloudron Performance Issues Analysis

          Thank you for your responses and guidance. I've gathered additional information and performed a detailed analysis of the system resources, as you suggested. Here's what I've found:

          1. Logs and Observations

          • When the apps become unresponsive, the server itself is also unreachable (cannot ping or SSH into it)
          • This confirms that the issue is likely on the server side
          • I've attached screenshots of the CPU and memory usage graphs for the last 6 hours, which include the period leading up to the most recent downtime

          2. Resource Analysis

          CPU Usage:

          • The CPU usage shows frequent spikes, with a peak of 169.2% (see attached graph)
          • Most of the CPU usage does not seem to come from the applications themselves (e.g., GitLab and Mesa de Ayuda have low CPU usage)
          • Since the instance is a t3.large, which relies on CPU credits, I suspect that these spikes might be depleting the credits, causing the instance to slow down or become unresponsive

          Memory Usage:

          • The memory usage is consistently high, with 7.18 GiB of RAM used out of 8 GiB
          • GitLab alone consumes 4.63 GiB, and the system is actively using swap (4.12 GiB)
          • The reliance on swap could be degrading performance, especially during high memory demand

          Nginx Warnings:

          • I noticed several ssl_stapling warnings in the logs for the SSL certificates of my apps
          • While these are just warnings, I'm unsure if they could be contributing to the issue

          3. Possible Root Cause

          Based on the resource analysis, I believe the main issue might be resource limitations:

          CPU Credits:

          • The high CPU usage might be depleting the CPU credits of the t3.large instance, leading to performance degradation

          Memory Constraints:

          • The high memory usage and reliance on swap suggest that the instance might not have enough RAM for the current workload

          4. Next Steps and Questions

          I'm considering upgrading the instance to address these limitations, but I'd like to confirm if this is the root cause of the downtime. Specifically:

          • Does this analysis align with the symptoms of the instance becoming unresponsive?
          • Would upgrading to a more powerful instance (e.g., t3.xlarge, m5.large, or m5.xlarge) resolve the issue?
          • Are there additional steps I should take to optimize the current setup or gather more diagnostic information?

          Attachments

          • CPU and Memory Usage Graphs
          • System Info Screenshot

          Thank you again for your help! I look forward to your insights.

          Best regards,
          Felipe Rubilar

          image.png

          image.png

          image.png

          1 Reply Last reply
          0
          • J Offline
            J Offline
            joseph
            Staff
            wrote last edited by
            #5

            I guess the next step is to figure what causes the CPU spikes. Does AWS collect any metrics here to help us out? (Maybe some CI setup in GitLab?) What app is Mesa de ayuda?

            1 Reply Last reply
            0
            • U Online
              U Online
              umnz
              wrote last edited by
              #6

              This may come out of left field, but are you running the app n8n on your Cloudron server?

              1 Reply Last reply
              0
              • K Offline
                K Offline
                krumel
                wrote last edited by
                #7

                Just as a sidenote: Had similar issues recently - Nextcloud appeared to be the issue, more specifically Nextcloud brought the backup process into some strange shenanigans which crashed the whole VM.
                I had no time to properly investigate, though, and we do not use nextcloud really, so I simply uninstalled it. Fixed it.

                1 Reply Last reply
                1
                • J joseph has marked this topic as solved
                • F Offline
                  F Offline
                  Felipe.rubilar
                  wrote last edited by
                  #8

                  Hello everyone,

                  Thank you all for your support and suggestions throughout this process. I’m happy to report that the issue has been resolved.

                  After analyzing the resources and following the recommendations provided, we decided to upgrade the AWS instance from t3.large to t3.xlarge. This upgrade provided additional CPU and RAM, which addressed the resource limitations that were causing the server to become unresponsive.

                  Since the upgrade, we have not experienced any further downtime or performance issues. We can confirm that the problem was related to resource constraints, specifically the depletion of CPU credits and the high reliance on swap memory.

                  Thank you again for your help and guidance.

                  Best regards,
                  Felipe Rubilar

                  1 Reply Last reply
                  1
                  • scookeS Offline
                    scookeS Offline
                    scooke
                    wrote last edited by scooke
                    #9

                    Disk space. Ram. Probably the two* main components of a server that will make running Cloudron (or any software, really) a pleasure or a pain. But I think the easy answer isn't necessarily the best - buy a server with lots of disk space and lots of RAM - because use-case factors in alot. Perhaps some sort of pop-up questionnaire at some point in the purchase stream or set up process asking which apps are likely to be used would be one way to give a user a heads-up that their 4GB RAM, 60GB SSD server isn't going to cut it (or in this case an 8GB running Gitlab! Plus, it seems they are running additional software on that server, this "Mesa de Ayuda" which seems to be some kind of helpdesk software... Cloudron clearly advises running only Cloudron on a server and nothing else, so that was utilizing the same minimal 8GB of RAM).

                    That said, I moved my Mastodon app off of my main server which has 64GB RAM and successfully run it on another server with just 4GB RAM plus 4GB swap, plus a Wordpress app, with stable results. I've also run a Presearch node on a 3.5GB RAM server along with Cloudron, and it all worked.

                    *A third would be using Cloudflare. I know why ppl use it, I just don't know why ppl use it.

                    A life lived in fear is a life half-lived

                    1 Reply Last reply
                    0
                    Reply
                    • Reply as topic
                    Log in to reply
                    • Oldest to Newest
                    • Newest to Oldest
                    • Most Votes


                    • Login

                    • Don't have an account? Register

                    • Login or register to search.
                    • First post
                      Last post
                    0
                    • Categories
                    • Recent
                    • Tags
                    • Popular
                    • Bookmarks
                    • Search