    Solved Nextcloud in Error state even though it's running (after Cloudron 5.5 update)

    Nextcloud
    • M
      msbt App Dev last edited by msbt

      I'm experiencing odd behaviour on one of my servers with a 700GB Nextcloud instance. The dashboard/app info says "Error : - Error restoring postgresql. Status code: 500 message: Failed to import database. Code: 3"

      Restarting the app didn't change anything, and stopping it doesn't work because it's in an erroneous state.

      Error logs show this:

      Aug 13 01:12:04 box:tasks setCompleted - 4453: {"result":null,"error":{"stack":"BoxError: Unknown install command in apptask:error\n at /home/yellowtent/box/src/apptask.js:1070:29\n at /home/yellowtent/box/src/apps.js:520:13\n at Query.<anonymous> (/home/yellowtent/box/src/appdb.js:147:13)\n at Query.<anonymous> (/home/yellowtent/box/node_modules/mysql/lib/Connection.js:526:10)\n at Query._callback (/home/yellowtent/box/node_modules/mysql/lib/Connection.js:488:16)\n at Query.Sequence.end (/home/yellowtent/box/node_modules/mysql/lib/protocol/sequences/Sequence.js:83:24)\n at Query._handleFinalResultPacket (/home/yellowtent/box/node_modules/mysql/lib/protocol/sequences/Query.js:149:8)\n at Query.EofPacket (/home/yellowtent/box/node_modules/mysql/lib/protocol/sequences/Query.js:133:8)\n at Protocol._parsePacket (/home/yellowtent/box/node_modules/mysql/lib/protocol/Protocol.js:291:23)\n at Parser._parsePacket (/home/yellowtent/box/node_modules/mysql/lib/protocol/Parser.js:433:10)","name":"BoxError","reason":"Internal Error","details":{},"message":"Unknown install command in apptask:error"}}
      
      Aug 13 01:12:04 box:tasks 4453: {"percent":100,"result":null,"error":{"stack":"BoxError: Unknown install command in apptask:error\n at /home/yellowtent/box/src/apptask.js:1070:29\n at /home/yellowtent/box/src/apps.js:520:13\n at Query.<anonymous> (/home/yellowtent/box/src/appdb.js:147:13)\n at Query.<anonymous> (/home/yellowtent/box/node_modules/mysql/lib/Connection.js:526:10)\n at Query._callback (/home/yellowtent/box/node_modules/mysql/lib/Connection.js:488:16)\n at Query.Sequence.end (/home/yellowtent/box/node_modules/mysql/lib/protocol/sequences/Sequence.js:83:24)\n at Query._handleFinalResultPacket (/home/yellowtent/box/node_modules/mysql/lib/protocol/sequences/Query.js:149:8)\n at Query.EofPacket (/home/yellowtent/box/node_modules/mysql/lib/protocol/sequences/Query.js:133:8)\n at Protocol._parsePacket (/home/yellowtent/box/node_modules/mysql/lib/protocol/Protocol.js:291:23)\n at Parser._parsePacket (/home/yellowtent/box/node_modules/mysql/lib/protocol/Parser.js:433:10)","name":"BoxError","reason":"Internal Error","details":{},"message":"Unknown install command in apptask:error"}}
      

      I'm not sure if those are related, but the app is still up and running. Any suggestions on what to do? 😉

      • nebulon
        nebulon Staff last edited by

        Did you attempt to retry the restore in the repair section of the app configure view?

        • M
          msbt App Dev last edited by

          Just had a quick restore session with @girish. He suggested that even though postgres had 3.5 GB of RAM available, this still wasn't enough to import/migrate a 400MB+ dump of the database. We raised the limit to 4 GB and did another restore, which fixed the app status. I have rescanned the files and am now waiting for feedback on whether anything else is missing.

          • girish
            girish Staff last edited by

            What happened was that the db migration failed because postgres wanted more memory. What I did was to give it more memory and trigger an in-place import. That did the trick.

            • necrevistonnezr
              necrevistonnezr last edited by necrevistonnezr

              Something is not right with my Nextcloud instance, either, after the Cloudron 5.5 update. I had to increase the memory to 8 GB and CPU to 50 %, otherwise the app was in a "not responding" state.

              All clients (mac, PC, iOS) are in an endless loop to sync but never actually do. My Nextcloud website takes forever to load (all other Cloudron services like FreshRSS are fine). I re-setup the iOS client which takes forever. After entering credentials on the login dialog, I'm not being redirected to the app but I see a webview of Nextcloud.

              The Nextcloud logs don't show anything odd at first glance except this:

              "Aug 18 09:23:29 [Tue Aug 18 07:23:29.826254 2020] [rewrite:error] [pid 8495] [client 172.18.0.1:50318] AH00670: Options FollowSymLinks and SymLinksIfOwnerMatch are both off, so the RewriteRule directive is also forbidden due to its similar ability to circumvent directory restrictions : /app/code/config"
              

              and (?)

              Aug 18 09:27:12 58:C 18 Aug 07:27:12.149 * DB saved on disk
              Aug 18 09:27:12 58:C 18 Aug 07:27:12.159 * RDB: 0 MB of memory used by copy-on-write
              Aug 18 09:27:12 15:M 18 Aug 07:27:12.220 * Background saving terminated with success
              
              • necrevistonnezr
                necrevistonnezr last edited by

                Also: Suddenly there's a new folder "uploads" that wasn't there before and that I didn't create.

                • necrevistonnezr
                  necrevistonnezr last edited by

                  I think the culprit is PostgreSQL 11 - was that recently changed in the Nextcloud Docker? My CPU runs at 100 % the whole time....

                  Anmerkung 2020-08-18 153717.png

                  • girish
                    girish Staff last edited by

                    Yes, Cloudron moved to Postgres 11 in the previous release (Cloudron 5.5). Can you just try restarting Postgres under services?

                    Another thing: in /home/yellowtent/platformdata/logs/box.log, do you see an error like "Error importing postgresql"?
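
One way to run that check from a shell on the server is sketched below. The log path and the error text are taken from the post above; the helper name `find_import_errors` is made up for illustration, and since the exact wording in box.log may differ slightly, the match is case-insensitive.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cloudron): print matching lines, with line
# numbers, from a box log. The default path is the one quoted in the thread.
# Usage: find_import_errors [LOGFILE]
find_import_errors() {
    log="${1:-/home/yellowtent/platformdata/logs/box.log}"
    # -i: ignore case, -n: prefix each match with its line number.
    # grep exits with status 1 when nothing matches.
    grep -in 'error importing postgresql' "$log"
}
```

Calling `find_import_errors || echo "no import error logged"` on the server prints either the matching log lines or the fallback message.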

                    • girish
                      girish Staff @necrevistonnezr last edited by

                      @necrevistonnezr said in Nextcloud in Error state even though it's running (after Cloudron 5.5 update):

                      Aug 18 09:27:12 58:C 18 Aug 07:27:12.149 * DB saved on disk
                      Aug 18 09:27:12 58:C 18 Aug 07:27:12.159 * RDB: 0 MB of memory used by copy-on-write
                      Aug 18 09:27:12 15:M 18 Aug 07:27:12.220 * Background saving terminated with success

                      This one is from redis, you can ignore it.

                      • necrevistonnezr
                        necrevistonnezr @girish last edited by

                        @girish said in Nextcloud in Error state even though it's running (after Cloudron 5.5 update):

                        Yes, Cloudron moved to Postgres 11 in the previous release (Cloudron 5.5). Can you just try restarting Postgres under services?

                        Another thing is in /home/yellowtent/platformdata/logs/box.log do you see some error like Error importing postgresql ?

                        No error in box.log
                        After restarting Postgres, it immediately goes back to 100 % CPU.

                        Anmerkung 2020-08-18 162033.png

                        • girish
                          girish Staff last edited by

                          Just to narrow the issue down, if you stop the nextcloud app, does the postgresql cpu usage go back to normal? From the screenshot it seems it's busy in some SELECT command.

                          • necrevistonnezr
                            necrevistonnezr @girish last edited by

                            @girish said in Nextcloud in Error state even though it's running (after Cloudron 5.5 update):

                            Just to narrow the issue down, if you stop the nextcloud app, does the postgresql cpu usage go back to normal? From the screenshot it seems it's busy in some SELECT command.

                            After stopping the app, CPU cores go down to the usual 5-15 %

                            Anmerkung 2020-08-18 163519.png

                            • girish
                              girish Staff last edited by girish

                              @necrevistonnezr Do you think you can stop the existing nextcloud and then maybe clone from the latest backup and check if postgres is still using a lot of CPU? If it works out, maybe you can then just move stopped nextcloud into another domain and then put the cloned one there.

                              • necrevistonnezr
                                necrevistonnezr @girish last edited by necrevistonnezr

                                @girish said in Nextcloud in Error state even though it's running (after Cloudron 5.5 update):

                                @necrevistonnezr Do you think you can stop the existing nextcloud and then maybe clone from the latest backup and check if postgres is still using a lot of CPU? If it works out, maybe you can then just move stopped nextcloud into another domain and then put the cloned one there.

                                Clone Nextcloud into another subdomain you mean? How do I do that?

                                EDIT: Found it.

                                • nebulon
                                  nebulon Staff last edited by

                                  What was the root cause if you found it?
                                  On a side note postgres really gets hammered with SELECTs during for example a rescan of nextcloud files.

                                  • necrevistonnezr
                                    necrevistonnezr @nebulon last edited by

                                    @nebulon said in Nextcloud in Error state even though it's running (after Cloudron 5.5 update):

                                    What was the root cause if you found it?
                                    On a side note postgres really gets hammered with SELECTs during for example a rescan of nextcloud files.

                                    I meant I found the cloning process, I haven't found the cause for the CPU spikes.
                                    I'm trying to clone a backup to a new subdomain but I don't have enough free space to clone a 300 GB Nextcloud instance... 😞

                                    • necrevistonnezr
                                      necrevistonnezr @girish last edited by necrevistonnezr

                                      @girish said in Nextcloud in Error state even though it's running (after Cloudron 5.5 update):

                                      @necrevistonnezr Do you think you can stop the existing nextcloud and then maybe clone from the latest backup and check if postgres is still using a lot of CPU? If it works out, maybe you can then just move stopped nextcloud into another domain and then put the cloned one there.

                                      I did that now. Took 10 hours. Result is the same. 100 % CPU on Postgres on Nextcloud (app id 410c...). This is HUGELY frustrating. And I can't even login, it takes forever.

                                      Clipboard01.jpg

                                      • nebulon
                                        nebulon Staff last edited by

                                        For a start, do you have a Nextcloud client running on your laptop or similar? Maybe it fires requests like crazy, which hammers postgres as a result?

                                        • necrevistonnezr
                                          necrevistonnezr last edited by

                                          No, I switched off all clients on purpose - and after cloning to a new subdomain, there would be no connection, anyway.

                                          • nebulon
                                            nebulon Staff last edited by

                                            Maybe some plugin causes this? Can you use the occ tool via the app's terminal to disable some?

                                            • necrevistonnezr
                                              necrevistonnezr @nebulon last edited by necrevistonnezr

                                              @nebulon said in Nextcloud in Error state even though it's running (after Cloudron 5.5 update):

                                              maybe some plugin causes this? Can you use the occ tool via terminal into the app to disable some?

                                              I think only the bare minimum is enabled....

                                              occ app:list
                                              Enabled:
                                                - accessibility: 1.5.0
                                                - activity: 2.12.0
                                                - admin_audit: 1.9.0
                                                - calendar: 2.0.3
                                                - cloud_federation_api: 1.2.0
                                                - comments: 1.9.0
                                                - contacts: 3.3.0
                                                - contactsinteraction: 1.0.0
                                                - dav: 1.15.0
                                                - encryption: 2.7.0
                                                - federatedfilesharing: 1.9.0
                                                - files: 1.14.0
                                                - files_external: 1.10.0
                                                - files_pdfviewer: 1.8.0
                                                - files_rightclick: 0.16.0
                                                - files_sharing: 1.11.0
                                                - files_trashbin: 1.9.0
                                                - files_versions: 1.12.0
                                                - files_videoplayer: 1.8.0
                                                - firstrunwizard: 2.8.0
                                                - logreader: 2.4.0
                                                - lookup_server_connector: 1.7.0
                                                - nextcloud_announcements: 1.8.0
                                                - notifications: 2.7.0
                                                - oauth2: 1.7.0
                                                - password_policy: 1.9.1
                                                - photos: 1.1.0
                                                - privacy: 1.3.0
                                                - provisioning_api: 1.9.0
                                                - recommendations: 0.7.0
                                                - serverinfo: 1.9.0
                                                - settings: 1.1.0
                                                - sharebymail: 1.9.0
                                                - spreed: 9.0.3
                                                - support: 1.2.1
                                                - systemtags: 1.9.0
                                                - text: 3.0.1
                                                - theming: 1.10.0
                                                - twofactor_backupcodes: 1.8.0
                                                - twofactor_totp: 4.1.3
                                                - updatenotification: 1.9.0
                                                - user_ldap: 1.9.0
                                                - viewer: 1.3.0
                                                - workflowengine: 2.1.0
                                              Disabled:
                                                - bookmarks
                                                - bruteforcesettings
                                                - documentserver_community
                                                - federation
                                                - mail
                                                - maps
                                                - ransomware_detection
                                                - survey_client
                                                - tasks
                                                - twofactor_admin
                                              
                                              • nebulon
                                                nebulon Staff last edited by

                                                This is very strange, since if no one accesses nextcloud there shouldn't be long-running processes accessing the database either.

                                                • necrevistonnezr
                                                  necrevistonnezr last edited by

                                                  To be clear: Postgres goes crazy if I try to login from a browser or a client...

                                                  • necrevistonnezr
                                                    necrevistonnezr @necrevistonnezr last edited by

                                                    Can I somehow go back to an earlier Postgres version? This right now is killing my server and my workflow.

                                                    • nebulon
                                                      nebulon Staff last edited by

                                                      You would have to reinstall Cloudron for that old version altogether 😕 maybe if you enable remote SSH support we could take a more direct look, if so please mail your domain to support@cloudron.io

                                                      • girish
                                                        girish Staff last edited by girish

                                                        This is not related to the thread directly (but I was wondering whether we could make db rollbacks possible at all). Do you use other apps that use Postgres? I realize this is not immediately obvious and hard to tell 😕 . For example, GitLab is now incompatible with the older postgres, and some of the newer apps like loomio require some of the Postgres extensions we have enabled (maybe one of these extensions is causing the CPU use). If it's possible, as @nebulon said, we can take a look.

                                                        • girish
                                                          girish Staff last edited by girish

                                                          @necrevistonnezr Ah ok, I guess all of them are mysql. That does make it easier to debug. Please write when possible, we can look into it asap.

                                                          • girish
                                                            girish Staff last edited by girish

                                                            We hit this with another user now and I think the root cause is that the migration only partially imported the database. This is causing nextcloud to do a lot of queries (maybe some internal loop).

                                                            To fix this (please do all this carefully; if you are not confident, just reach out to support@cloudron.io and we can do it for you):

                                                            1. Give the postgresql service a lot more memory (Services -> PostgreSQL). There is no good number for this, just give it as much as you can. It's harmless since you can always scale it down later after the import.

                                                            2. Identify the backup of the app that was created before Cloudron was updated to 5.5.0. From this backup, copy over the postgresqldump file. Assuming f6e87030-2102-4c6c-b8eb-b2d86a268917 is the id of the nextcloud app:

                                                            # cp /home/yellowtent/appsdata/f6e87030-2102-4c6c-b8eb-b2d86a268917/postgresqldump /root/postgresqldump.current
                                                            
                                                            # cp /from/the/app/backups/postgresqldump /home/yellowtent/appsdata/f6e87030-2102-4c6c-b8eb-b2d86a268917/postgresqldump
                                                            

                                                            Then, on your PC/Mac (not on the Cloudron server!), use the CLI tool to import the data in-place. This command simply re-imports the database that we just copied above.

                                                            $ cloudron import --in-place --app nextcloud.domain.com
                                                            

                                                            If you had generated some files in the past few days, you should run the occ scan again - https://cloudron.io/documentation/apps/nextcloud/#rescan-files after nextcloud is running again.
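
The file swap in step 2 can be sketched as a small helper that refuses to overwrite anything when the backup dump is missing or empty. This is only an illustration under stated assumptions: the function name `swap_in_backup_dump` and its argument order are made up here, and the paths you pass in are the ones from the two `cp` commands above.

```shell
#!/bin/sh
# Hypothetical helper mirroring the two cp commands from step 2: keep a copy
# of the current dump, then put the pre-5.5.0 dump in its place.
# Usage: swap_in_backup_dump APP_DATA_DIR BACKUP_DUMP SAVED_COPY
swap_in_backup_dump() {
    app_data_dir="$1"
    backup_dump="$2"
    saved_copy="$3"
    # Stop early if the backup dump is missing or empty.
    if [ ! -s "$backup_dump" ]; then
        echo "backup dump missing or empty: $backup_dump" >&2
        return 1
    fi
    # Keep the current (partially imported) dump so the change can be undone.
    cp "$app_data_dir/postgresqldump" "$saved_copy" || return 1
    # Replace it with the dump taken before the update.
    cp "$backup_dump" "$app_data_dir/postgresqldump"
}
```

With the example id from the post, that would be `swap_in_backup_dump /home/yellowtent/appsdata/f6e87030-2102-4c6c-b8eb-b2d86a268917 /from/the/app/backups/postgresqldump /root/postgresqldump.current`, run as root on the server, followed by the `cloudron import --in-place` command from your own machine.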

                                                            • girish
                                                              girish Staff last edited by

                                                              CPU usage after the re-import:

                                                              9b41601f-bea0-4c01-b522-666937967af4-image.png

                                                              • necrevistonnezr
                                                                necrevistonnezr last edited by

                                                                Since I was pressured for time, I re-setup Nextcloud from scratch, imported the backup and went that route. Really stressful and I hope I don't have to do that again. Makes you realize why you pay for certain cloud services...

                                                                • girish
                                                                  girish Staff last edited by

                                                                  @necrevistonnezr thanks for the update. We have fixed the code in the meantime that causes this.

                                                                  • necrevistonnezr
                                                                    necrevistonnezr @girish last edited by necrevistonnezr

                                                                    @girish Now I know why support didn't work out: Cloudron blocked my answer from my Cloudron mail account to you guys - as mail relay via Sendgrid - as spam.... (!)
                                                                    FYI: the shown IP 167.89.12.138 does indeed belong to Sendgrid.

                                                                    Screenshot of SendGrid.jpg

                                                                    So mail relay via Sendgrid from the Cloudron mail server is not reliable, I guess....

                                                                    • girish
                                                                      girish Staff last edited by

                                                                      Ah, looks like the sendgrid IP is blacklisted by zen spamhaus (which cloudron uses by default).

                                                                      $ host -t TXT 138.12.89.167.zen.spamhaus.org
                                                                      138.12.89.167.zen.spamhaus.org descriptive text "https://www.spamhaus.org/sbl/query/SBL491387"
                                                                      

                                                                      https://www.spamhaus.org/sbl/query/SBL491387 says phishing mails are originating from that IP. Can you tell sendgrid about this (the link says you as customer can do nothing about it)?
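
For anyone wanting to repeat that lookup for a different address: a DNS blocklist is queried by reversing the IPv4 octets and appending the list's zone, which is why 167.89.12.138 becomes 138.12.89.167.zen.spamhaus.org above. A minimal sketch of building that name follows; the helper name `dnsbl_name` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build the DNSBL query name for an IPv4 address
# (octets reversed, zone appended). Prints nothing for non-IPv4 input.
# Usage: dnsbl_name IPV4 [ZONE]
dnsbl_name() {
    zone="${2:-zen.spamhaus.org}"
    # Split on dots and emit the four octets in reverse order.
    echo "$1" | awk -F. -v zone="$zone" \
        'NF == 4 { print $4 "." $3 "." $2 "." $1 "." zone }'
}
```

The result can be passed straight to the same command, e.g. `host -t TXT "$(dnsbl_name 167.89.12.138)"`. A "not found" answer generally means the address is not listed.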

                                                                      • necrevistonnezr
                                                                        necrevistonnezr last edited by

                                                                        I reported it to Sendgrid, this was their answer:

                                                                        When sending email through an account that is under the Free or Essentials pricing package, your account will be utilizing a shared IP pool. Being grouped with others in a pool of shared IP's can offer several benefits, especially if you are only sending a moderate amount of email.
                                                                        Although there are benefits to sending on shared IP's, there are also risks which can sometimes produce unintended negative consequences. If some of these users display poor sending habits or behavior, it can negatively affect others (you) within the group.

                                                                        Essentially, you need to be on a paid plan, otherwise you end up in spam lists. The thing is, no one tells you that. And you only find out that your mail was blocked when you login to the account and go to the "Blocked" subsite. In my case, I found out that a job application didn't go through 14 days ago. I get that they wanna sell you something but at least tell me about it. I learned the hard way, this is the end for me for self hosted mail. Imapsynced my Cloudron mail back to my old provider and that's that.

                                                                        • M
                                                                          msbt App Dev last edited by

                                                                          You can use their web api and catch events like those through webhooks. Not sure if I set it up at sendgrid, because I left after a few days of testing (and for that exact reason, getting randomly blocked because of bad IPs) and went to Mailjet.

                                                                          There I've got a webhook which is polled by a zapier task a few times a day and notifies me when an email got blocked/bounced. Maybe that's something to consider setting up.

                                                                          • M
                                                                            msbt App Dev last edited by

                                                                            Correction: it wasn't zapier, but integromat

                                                                            • necrevistonnezr
                                                                              necrevistonnezr last edited by

                                                                              Related to Sendgrid and why their IPs are identified as spam sources: https://krebsonsecurity.com/2020/08/sendgrid-under-siege-from-hacked-accounts/

                                                                              • M
                                                                                msbt App Dev last edited by

                                                                                well that was months ago, so they had that problem before that as well
