Scaling / High Availability Cloudron Setup

  • N NCKNE

    I would also love to see an HA setup for larger installations (which, in my opinion, in many cases also need some kind of identity provider solution such as Shibboleth or FreeIPA for external apps). The Nomad solution looks very promising and could possibly be implemented as a paid premium add-on for larger installations.

    I was personally thinking about a very simple solution for an active-passive setup with just two instances using the snapshotted backups. The backups could be (incremental rsync) replicated to a passive instance that would store them locally for a very quick restore. Incremental syncs would not require much bandwidth or downtime and the restore of locally stored backups would be fairly quick.
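To make the incremental replication idea concrete, here is a minimal Python sketch that only builds an rsync command line for pushing a backup directory to a standby host. The host name, the paths, and the helper name `build_rsync_command` are hypothetical and not part of Cloudron; `--archive`, `--delete`, `--compress` and `--link-dest` are standard rsync options. With `--link-dest`, files unchanged since the previous snapshot are hard-linked on the receiving side instead of being transferred again, which is what keeps each sync small.

```python
import shlex


def build_rsync_command(source_dir, standby_host, standby_dir, previous_snapshot=None):
    """Return an rsync argv that copies source_dir to standby_dir on standby_host.

    All arguments are illustrative; nothing here is a Cloudron API.
    """
    cmd = ["rsync", "--archive", "--delete", "--compress"]
    if previous_snapshot is not None:
        # Unchanged files are hard-linked against the previous snapshot
        # on the receiver instead of being copied again.
        cmd.append(f"--link-dest={previous_snapshot}")
    # A trailing slash makes rsync copy the directory contents, not the directory itself.
    cmd.append(source_dir.rstrip("/") + "/")
    cmd.append(f"{standby_host}:{standby_dir}")
    return cmd


# Hypothetical paths and host, for illustration only.
command = build_rsync_command(
    "/var/backups/snapshot",
    "standby.example.com",
    "/srv/backups/2024-01-02",
    previous_snapshot="/srv/backups/2024-01-01",
)
print(shlex.join(command))
# To actually run it: subprocess.run(command, check=True)
```

This only covers moving the backup data; restoring it on the passive instance and switching DNS or a floating IP would still be separate steps.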

    Switching back from the formerly passive instance to the previously failed or newly set up instance would most likely have to be done manually. A fully automated cluster with recovery would require at least three hosts (quorum) and might be too much overhead for smaller instances.
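The "at least three hosts" point follows from simple majority arithmetic. As a small sketch (assuming a plain majority quorum, which is the common scheme but not something this thread specifies), a cluster of n nodes needs n // 2 + 1 votes to make a decision, so it can lose n minus that many nodes and keep working:

```python
def quorum_size(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can fail while a majority is still reachable."""
    return nodes - quorum_size(nodes)


for n in (2, 3, 5):
    print(f"{n} nodes: quorum {quorum_size(n)}, tolerates {tolerated_failures(n)} failure(s)")
# 2 nodes: quorum 2, tolerates 0 failure(s)
# 3 nodes: quorum 2, tolerates 1 failure(s)
# 5 nodes: quorum 3, tolerates 2 failure(s)
```

With two nodes, neither can tell on its own whether the other has failed or the network has split, which is why an automated two-node setup is risky and a manual switch is the safer choice there.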

    Would this be something that could be considered in future developments?

    jimcavoli
    App Dev
    #4

    @NCKNE Pieces of that definitely could be - I wonder about the appetite for a hot/cold HA standby setup in the community versus an active-active clustered sort of approach. I know for myself, I'm not a big fan of paying for servers to sit there "just in case" as much as I prefer to utilize a little less across more machines and have normal operating headroom with some ability to absorb failures. That's just me though, so the more input on this topic we can get to inform what everyone values, the better!

    Insofar as external app SSO goes, I very much agree that it is an important addition for the future, and I have a somewhat simplified solution that I'm working on (as opposed to the beast that is Shibboleth, since I've looked at packaging it for Cloudron and been...put off by the effort). The drive to do so has also been the thought in my mind that a big part of an app that would allow you to leverage Cloudron as an IdP would be a similar sort of flexibility that makes the rest of the system so strong. I'm aiming for a multi-system IdP app, in essence, which would allow for SAML, OAuth2, and potentially CAS exchanges for a start. I think it would be great to get RADIUS into the mix as well, though that may be better served as its own app. There are some outstanding challenges with the way the Cloudron LDAP system is set up presently, especially with respect to groups, as well as some profile fields, that will need to be sorted out before that's at its full potential, but hopefully we can get a proof of concept available at some point in the near future.

      NCKNE
      #5

      @jimcavoli said in Scaling / High Availability Cloudron Setup:

      > @NCKNE Pieces of that definitely could be - I wonder about the appetite for a hot/cold HA standby setup in the community versus an active-active clustered sort of approach. I know for myself, I'm not a big fan of paying for servers to sit there "just in case" as much as I prefer to utilize a little less across more machines and have normal operating headroom with some ability to absorb failures. That's just me though, so the more input on this topic we can get to inform what everyone values, the better!

      I am absolutely with you here; having passive servers just sitting there being bored and wasting energy is nothing to aim for. I was just spinning ideas in my head to allow for a quick restore in case of a failure, and a simple solution could be to back up to a remote server's disk. Having lots of data (TBs) in apps like Nextcloud and only having full backups made the wish for an incremental backup to a standby location come up.

      > Insofar as external app SSO goes, I very much agree that it is an important addition for the future, and I have a somewhat simplified solution that I'm working on (as opposed to the beast that is Shibboleth, since I've looked at packaging it for Cloudron and been...put off by the effort). The drive to do so has also been the thought in my mind that a big part of an app that would allow you to leverage Cloudron as an IdP would be a similar sort of flexibility that makes the rest of the system so strong. I'm aiming for a multi-system IdP app, in essence, which would allow for SAML, OAuth2, and potentially CAS exchanges for a start. I think it would be great to get RADIUS into the mix as well, though that may be better served as its own app. There are some outstanding challenges with the way the Cloudron LDAP system is set up presently, especially with respect to groups, as well as some profile fields, that will need to be sorted out before that's at its full potential, but hopefully we can get a proof of concept available at some point in the near future.

      Wow! That would be awesome and, in my opinion, a HUGE step for Cloudron to become enterprise ready. Together with high availability / load balancing clustering, Cloudron could easily be used in larger environments as well.

        girish
        Staff
        #6

        @tkd said in Scaling / High Availability Cloudron Setup:

        > Ability to use floating IPs

        Note that this is possible already. Get a floating IP, then go to the Network view and enter the IP there. Cloudron will then use that IP for DNS. Many users already use it this way with Elastic IP as well.

        > Ability to scale based on the number of applications running / resources needed - adding additional Cloudron nodes?

        This is on our radar and definitely doable, but the biggest challenge for us has been to justify implementing these features, as we haven't found customers who would be willing to pay for complex features like these. If you are in the enterprise/medium business bracket and willing to work with us here, please contact us at support@cloudron.io.

          marcusquinn
          #7

          Certainly a fan of Cloudron and subscriber.

          Also a fan of (the perhaps lesser known) D2C.io, which does make clustered HA setups pretty easy but doesn't have the same App ecosystem or community yet.

          Maybe you guys could collaborate?

          Web Design https://www.evergreen.je
          Development https://brandlight.org
          Life https://marcusquinn.com

            jdaviescoates
            #8

            I also note that Hetzner Cloud have just added a load balancer feature, which I think could be used to scale Cloudron too; see

            https://www.hetzner.com/news/07-2020-load-balancer/

            I use Cloudron with Gandi & Hetzner

              marcusquinn
              #9

              @jdaviescoates I think most hosts offer load balancers - but keeping the LB within the app would keep it more portable.

              I respect that D2C has overlap, but I still see them as distinct, with potential for cross-pollination. An example of the clustered app setups and minimum containers can be seen here: https://docs.d2c.io/getting-started/stack-hub/

              D2C is still a hosted and proprietary solution including support services, whereas Cloudron is open-source with less GUI to tinker with or be intimidated by, so the target audiences are different. It might be that D2C, not being open-source, is not compatible, and that Kubernetes (a la https://kubeapps.com) would be a more compatible approach?

                robi
                #10

                I recently had another look at Portainer; they seem to offer Edge management capability, and it is OSS.

                Would it make sense to use some of their code/concepts for a Multi-Cloudron use case?

                Conscious tech

                  jimcavoli
                  App Dev
                  #11

                  @robi From what I can tell, Portainer seems like a management interface for many orchestrators, which seems a level removed from the actual cluster scheduler itself, and therefore a bit higher-level than the next step we'd need in the cloudron journey to clustered operation. Frankly, even though I've advocated (and will likely continue to 🙂 ) for a largely HashiCorp-based approach, the first/easiest thing might be to experiment with Docker Swarm. I'm not a particular fan of Swarm, but to be fair it's been a while since I seriously evaluated it. Still a big fan of Nomad specifically for this particular use case, and I think it is the best fit for the problem. I do have a bit of work done on the "HashiStack" approach already, but it's going to be a pretty seismic change if I get it finished, and I've not yet explored all the tendrils from the management/box side that will need to be updated. I can try to get some more serious progress laid down on that around the Christmas holidays, I hope.

                    marcusquinn
                    #12

                    I love the idea of HashiCorp Nomad but never got around to testing it.

                    Terraform, Vagrant and Vault are their better-known products, which became standards for those interested.

                    The other option is a home-cooked Cloudron solution with HAProxy, Nginx-Cluster, Unison, and clustered versions of whatever DBs are needed.

                    One caution: clustered DBs have performance trade-offs, more so for mirrored multi-master and less so for master-slave failover.

                      robi
                      #13

                      Starting from the data perspective:

                      • DRBD - https://www.linbit.com/drbd/

                      Distributed Replicated Storage System

                      DRBD® software is a distributed replicated storage system for the Linux platform. It is implemented as a kernel driver, several userspace management applications, and some shell scripts.

                      DRBD is traditionally used in high availability (HA) computer clusters, but beginning with DRBD version 9, it can also be used to create larger software defined storage pools with a focus on cloud integration.


                      • LinStor - https://github.com/LINBIT/linstor-server

                      LINSTOR® is open-source software designed to manage block storage devices for large Linux server clusters. It’s used to provide persistent Linux block storage for cloudnative and hypervisor environments.


                      • OpenEBS - https://openebs.io/

                      OpenEBS enables Stateful applications to easily access Dynamic Local Persistent Volumes (PVs) or Replicated PVs. By using the Container Attached Storage pattern users report lower costs, easier management, and more control for their teams.


                      • Yugabyte DB - https://github.com/yugabyte/yugabyte-db

                      Developers get low latency reads, ACID transactions and globally consistent secondary indexes and full SQL. Develop both scale-out RDBMS and internet-scale OLTP apps with ease.

                      DBAs & Operations simplify operations with a single database that delivers linear scalability, automatic global data distribution, multi-TB density per node and rebalancing without performance bottlenecks or down time.

                      CEOs & Line of Business Owners rein in database sprawl and benefit from reduced infrastructure and software licensing costs. Deliver new features and enter new markets with more speed and agility.

                      Core Features

                      Global Resilience

                      • Geo-replicated
                      • Strongly consistent across regions
                      • Extreme resilience to failures

                      High Performance

                      • Single-digit millisecond latency
                      • High throughput
                      • Written in C/C++

                      Internet Scale

                      • Massive write scalability
                      • App agility with flexible schemas
                      • Multi-TB data density per node

                      Cloud Native

                      • AWS, GCP, Azure, Pivotal
                      • Docker, Kubernetes
                      • Private data centers

                      Open Source

                      • 100% Apache 2.0 license
                      • PostgreSQL compatible
                      • Built-in enterprise features

                      Integrations

                      • Spring microservices
                      • Apache Kafka & KSQL
                      • Apache Spark

                        infogulch
                        #14

                        I'm not sure I'd want fancy, distributed filesystems on by default for most apps. I feel like most apps would need custom changes to explicitly support distributed storage, and I'm skeptical that a blanket drop-in distributed-fs solution could meet the performance and reliability needs of the diversity of cloudron users.

                        I'd rather have multi-node app management than distributed app runtime. Manage all your cloudron nodes and assign apps between them, migrate them etc, but most apps can still only be deployed to one cloudron instance at a time. At least I think this would be a better scaling/ha goal for a v1 implementation.

                          mehdi
                          App Dev
                          #15

                          @infogulch said in Scaling / High Availability Cloudron Setup:

                          > I'd rather have multi-node app management than distributed app runtime. Manage all your cloudron nodes and assign apps between them, migrate them etc, but most apps can still only be deployed to one cloudron instance at a time. At least I think this would be a better scaling/ha goal for a v1 implementation.

                          I totally agree, and I think it's the way the Cloudron team is headed for the V1 🙂

                            girish
                            Staff
                            #16

                            @infogulch Yes, I recall overthinking it that way (i.e. trying to scale and distribute etc.) but @mehdi corrected my thoughts a while ago about this and mentioned focusing on just managing nodes. I remember writing this somewhere, but I cannot find my notes.

                              marcusquinn
                              #17

                              Having been down the high-availability setup path with K8S, it isn't a small ask, and it isn't without compromises. I prefer to think of HA on the server level - so good servers with RAID10, or a VPS that does all that for you. Couple that with a solid backup and restore setup and you can get as close to HA as those more complex solutions.

                              I'd rather see focus on the multi-cloud control panel and granular backup policies first.

                              It's the same as encryption - everyone thinks they want it, until they realise how many people and policies there need to be for key holders, because the vulnerability to loss moves from the technology to the people.

                                plusone-nick
                                #18

                                @marcusquinn More like common hypervisor HA features instead of full-blown K8s HA? Mainly the ability to migrate an app to a different node and further move/manage its backup and DNS.

                                ✌💙+1

                                  jimcavoli
                                  App Dev
                                  #19

                                  k8s is not a great fit imo for cloudron without introducing much bigger changes...there are roads to that runtime with some intermediary schedulers as well though, which is why I like Nomad in this space the most. I've actually been working up a prototype using the HashiStack Consul/Nomad (plus or minus vault) to provide a distributed runtime, but that's a reasonably long way off seeing any sort of integration into the core of things. It's a big shift on its own, and needs a lot of refinement. Obviously so would a k8s approach. In the immediate term, managing across multiple full-on cloudron instances is fairly clean, and if implemented correctly, could actually still be useful in that world as well. It's the first, easiest, smallest thing to do and therefore in my opinion is valuable, regardless of where the higher-powered distributed runtime ideas go.

                                    marcusquinn
                                    #20

                                    @plusone-nick I mean as in disk hardware redundancy. Most racks have 2 of everything else. In my experience a simple server setup on a good hardware rack will outperform K8S for uptime. I lost count of the times we were restarting one thing or another with Rancher to get something working that had no reason to fail other than K8S getting its knickers in a twist.

                                    The biggest risk to data loss is always the simple minds of the users!

                                    The biggest risk to availability is always the complex minds of the tools!

                                    No-one really needs high-availability, online banking goes offline frequently for maintenance. If Google has a bad day, people make a beverage and talk to each other.

                                    HA is snake oil in my experience.

                                      robi
                                      #21

                                      @marcusquinn HA, no oil.
                                      https://neverfail.com/solution/continuous-application-availability/

                                        MooCloud_Matt
                                        #22

                                        @marcusquinn
                                        You can just have a soft form of HA instead of live HA.
                                        If a container is on a node that is not responding, you can start it on a new node.
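As a rough illustration of that "soft" failover idea, the sketch below moves apps off nodes marked unhealthy onto the healthy node that currently runs the fewest apps. It is only the placement decision under simplifying assumptions: the `Node` class, the `failover` function and the node and app names are all hypothetical, and a real setup would also have to restore app data from a backup, update DNS, and make sure the unresponsive node is really down before starting a second copy.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    name: str
    healthy: bool = True
    apps: List[str] = field(default_factory=list)


def failover(nodes: List[Node]) -> List[Tuple[str, str, str]]:
    """Reassign apps from unhealthy nodes; return (app, from_node, to_node) moves."""
    moves = []
    healthy = [n for n in nodes if n.healthy]
    if not healthy:
        return moves  # nowhere to move anything
    for node in nodes:
        if node.healthy:
            continue
        while node.apps:
            app = node.apps.pop(0)
            # Pick the healthy node with the fewest apps (first one wins ties).
            target = min(healthy, key=lambda n: len(n.apps))
            target.apps.append(app)
            moves.append((app, node.name, target.name))
    return moves


cluster = [
    Node("node-a", healthy=True, apps=["wiki"]),
    Node("node-b", healthy=False, apps=["blog", "chat"]),
    Node("node-c", healthy=True),
]
print(failover(cluster))
# [('blog', 'node-b', 'node-c'), ('chat', 'node-b', 'node-a')]
```

The decision logic is the easy part; detecting failure reliably and moving state is where most of the effort goes.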

                                        Matteo. R.
                                        Founder and Tech-Support Manager.
                                        MooCloud MSP
                                        Swiss Managed Service Provider

                                          marcusquinn
                                          #23

                                          @moocloud_matt Thanks, I know what HA is, and how it works, and far too many options for it - it's still the wrong approach for almost all online services.

                                          One thing I've learned in all my years is to always discount, ignore and do the opposite of anyone that says "just" in any comments, because they always represent the vast difference in time and cost between saying and doing.

                                          I'm well aware of the vast industry of people peddling HA pipe-dreams - I'm pretty sure I could beat all of them for uptime, by specifically avoiding doing every single thing they recommend, and just having the tried and tested strategy of keeping it simple.

                                          If you can't take a month off and then another month doing different things without having to do any maintenance or explain anything to anyone, your stack is too complicated.

                                          All HA ever did for me was cost me an additional couple of employees just to continually maintain it, and generally take away resources and attention from the actual things users wanted.

                                          No K8S, no excessive expertise costs, and no uptime problems since, because there's just less to go wrong, and less opinion to distract from the actual usage of services that funds them.
