Cloudron instance scaling issues after a few hours / couple of days, apps responsive but showing a permanent "Starting..." status

Solved · Support
Tags: performance, stability, box, platform
29 Posts 6 Posters 1.3k Views 6 Watching
  • nebulon (Staff) · #6

    @uwcrbc please send a mail with your Cloudron ID / dashboard domain to support@cloudron.io and enable remote SSH support for us, so we can take a closer look and get this resolved.

    • uwcrbc · #7

      @nebulon Hi, thanks for this. Email sent.

      • nebulon (Staff) · #8

        We got the info and are looking into this. So far we have found that the calcom app is constantly restarting, which makes the server very slow. We are not sure why this has cascading effects, since resources are plentiful, but depending on the server provider, virtual CPUs may get throttled if one core runs at 100% for an extended period. We will post updates here as we learn more.

        • uwcrbc · #9

          @nebulon Interesting.
          The server provider is ssdnodes. So far I have had nothing to complain about (quite the opposite). Happy to act as liaison should we need to relay info to them or ask them questions.
          Many thanks,

          • uwcrbc · #10

            Also, interestingly, the server started playing up at roughly the time of our latest cal.com deployment.
            I say latest because this Cloudron server previously hosted another cal.com app instance that we used as a test a while back. Not sure if that is relevant, but it may help inform the situation.

            • nebulon (Staff) · #11

              I've updated the calcom instance on that server to the latest package version, which includes some fixes. So far things seem fine.

              • joseph has marked this topic as solved.
              • uwcrbc · #12

                Happy to report that the updated cal.com package seems to have fixed the underlying server issue.
                Many thanks,

                • uwcrbc · #13

                  Alas, I think I was too eager to see this resolved.
                  The problem is still there with the exact same symptoms.
                  How can we proceed further? @nebulon @james Would another remote troubleshooting/debugging be helpful?

                  • nebulon (Staff) · #14

                    Do you see any process in htop (or a similar tool) that constantly hogs a core, the way the calcom restart loop did?

                    • uwcrbc · #15

                      @nebulon Thanks. To the untrained eye, nothing in htop stands out as flagrantly hogging resources.
                      Swap usage is high, but I am not sure whether that is a relevant indicator.

                      • nebulon (Staff) · #16

                        And does systemctl restart box still fix the issue temporarily?

                        • uwcrbc · #17

                          @nebulon Yes - this fixes the issue temporarily at least

                          • girish (Staff) · #18

                            This is probably happening because the apphealthmonitor is getting stuck (for some reason) and is thus unable to update the health of apps. @uwcrbc, in box.log, do you see a line like box:apphealthmonitor app health: xx printed every 10 seconds?

                            • uwcrbc · #19

                              @girish Thanks for this. After looking into box.log:

                              • After systemctl restart box, I do indeed see a box:apphealthmonitor app health: xx running / 0 stopped / 0 unresponsive entry every 10 seconds or so.
                              • Before systemctl restart box (while we experience the issue), I see very few box:apphealthmonitor app health: xx entries. Instead, there are a few rare entries of the form box:apphealthmonitor setHealth: <<CONTAINER_UID>> (<<URL>>) waiting for 1192.461 to update health

                              Hopefully it helps?
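
The check described above (is the app health summary line still being printed roughly every 10 seconds?) could be automated along the following lines. This is only a minimal sketch: it assumes each box.log line starts with an ISO-8601 timestamp followed by the box:apphealthmonitor app health: text quoted in this thread, and the function names, the sample lines and the 30 second threshold are hypothetical choices made for illustration, not part of Cloudron.

```python
import re
from datetime import datetime, timedelta

# Assumed line shape (not confirmed by the thread):
# "<ISO-8601 timestamp> box:apphealthmonitor app health: ..."
HEALTH_RE = re.compile(r"^(\S+)\s+box:apphealthmonitor app health:")


def parse_ts(raw):
    # Map a trailing "Z" to an explicit UTC offset so that older
    # Python versions can parse it with fromisoformat().
    return datetime.fromisoformat(raw.replace("Z", "+00:00"))


def find_health_gaps(lines, max_gap=timedelta(seconds=30)):
    """Return (previous, current) timestamp pairs where two consecutive
    'app health' summary lines are more than max_gap apart."""
    gaps = []
    previous = None
    for line in lines:
        match = HEALTH_RE.match(line)
        if not match:
            continue  # not a health summary line
        try:
            current = parse_ts(match.group(1))
        except ValueError:
            continue  # timestamp is not in the assumed format
        if previous is not None and current - previous > max_gap:
            gaps.append((previous, current))
        previous = current
    return gaps


# Hypothetical sample lines; real input would be the lines of box.log.
sample = [
    "2024-05-01T10:00:00.000Z box:apphealthmonitor app health: 12 running / 0 stopped / 0 unresponsive",
    "2024-05-01T10:00:10.000Z box:apphealthmonitor app health: 12 running / 0 stopped / 0 unresponsive",
    "2024-05-01T10:20:02.000Z box:apphealthmonitor app health: 12 running / 0 stopped / 0 unresponsive",
]
gaps = find_health_gaps(sample)
for start, end in gaps:
    print(f"no health summary between {start.isoformat()} and {end.isoformat()}")
```

A long gap between two summary lines would be consistent with the monitor being stuck, although it does not by itself explain why.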

                              • girish (Staff) · #20

                                @uwcrbc I think there is a bug that causes the apphealthmonitor to get stuck (for some reason). We have seen this happen, but only very rarely and not reproducibly. If you can give us access, can you send me a mail at support@cloudron.io? I can then debug this further.

                                  • uwcrbc · #22

                                    @girish Thanks for this. Access given / mail sent. Appreciate the help.

                                    • robi · #23

                                      I think I have another instance of this bug with the health monitor.

                                      I have installed an app that does not have health checks and it shows as Not Responding in the dashboard even though it works fine.

                                      However, it seems that because of that app and the health monitor getting stuck, any new apps that get installed or upgraded also fail their health checks and remain in Starting... mode in the dashboard.

                                      I have rebooted the server and all the apps come up, other than the one mentioned above; after that, any updated apps show the Starting... message.

                                      P.S.
                                      It would be really nice to add the 🔁 button for the 'cloudron' service, like we have for all other services.

                                      Conscious tech

                                      • girish (Staff) · #24

                                        My home server got into a situation like this just now. It seems it's because the eventlog got flooded with many entries.

                                        +----------+
                                        | count(*) |
                                        +----------+
                                        |   563547 |
                                        +----------+
                                        
                                        mysql> SELECT action, COUNT(*) AS count
                                            -> FROM eventlog
                                            -> GROUP BY action
                                            -> ORDER BY count DESC;
                                        +---------------------------+--------+
                                        | action                    | count  |
                                        +---------------------------+--------+
                                        | app.up                    | 446592 |
                                        | app.down                  | 106588 |
                                        | backup.cleanup.finish     |   3664 |
                                        | app.update.finish         |   1354 |
                                        | app.update                |   1354 |
                                        | backup.finish             |    920 |
                                        | backup.start              |    920 |
                                        | cloudron.update.finish    |    847 |
                                        | cloudron.update           |    833 |
                                        | cloudron.start            |    110 |
                                        | dyndns.update             |     78 |
                                        

                                        Those app up/down eventlogs are out of hand!

                                        I nuked them manually:

                                        mysql> DELETE from eventlog WHERE action='app.up';
                                        Query OK, 446782 rows affected (8 min 1,59 sec)
                                        
                                        mysql> DELETE from eventlog WHERE action='app.down';
                                        Query OK, 106588 rows affected (9 min 16,27 sec)
                                        
                                        
                                        

                                        That took a whopping 17min just to delete entries!
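
For anyone wanting to try the same cleanup idea on a copy first, here is a rough sketch of the count-then-delete steps. It makes several assumptions: it runs against an in-memory SQLite database as a stand-in for the real MySQL database, only the eventlog table name and its action column come from the output above, and the id column, the function names and the row counts are made up. It also deletes in small batches, which is a swapped-in technique and not what was run above (single DELETE statements); whether batching is actually faster on a real instance is not something this thread establishes.

```python
import sqlite3


def count_by_action(conn):
    """Rows per action, largest first (mirrors the GROUP BY query above)."""
    return conn.execute(
        "SELECT action, COUNT(*) AS n FROM eventlog "
        "GROUP BY action ORDER BY n DESC"
    ).fetchall()


def delete_in_batches(conn, action, batch_size=1000):
    """Delete all rows with the given action, committing after each batch
    so that no single transaction has to cover every row."""
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM eventlog WHERE rowid IN "
            "(SELECT rowid FROM eventlog WHERE action = ? LIMIT ?)",
            (action, batch_size),
        )
        conn.commit()
        if cur.rowcount == 0:
            break  # nothing left to delete for this action
        total += cur.rowcount
    return total


# Stand-in data: a hypothetical, much smaller eventlog table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eventlog (id INTEGER PRIMARY KEY, action TEXT)")
rows = [("app.up",)] * 2500 + [("app.down",)] * 700 + [("backup.finish",)] * 3
conn.executemany("INSERT INTO eventlog (action) VALUES (?)", rows)
conn.commit()

print(count_by_action(conn))
deleted_up = delete_in_batches(conn, "app.up")
deleted_down = delete_in_batches(conn, "app.down")
print(deleted_up, deleted_down, count_by_action(conn))
```

The subquery form is used here because it works in SQLite; MySQL also accepts a LIMIT clause directly on a single-table DELETE.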

                                        • robi · #25

                                          Oh wow, thanks for finding that Girish!

                                          Are you pushing fixes?
                                          Should we do the same manually?
                                          Does box need a restart after or will apphealthmonitor just work better w/o half a million app.up entries?

                                          Conscious tech
