    Graphite keeps crashing OOM

    Support
    graphs oom
    • rmdes
      rmdes last edited by girish

      Every few hours Graphite crashes, even with a 2GB memory limit. This is what shows up in the log: https://paste.armada.digital/todawelexi.sql

      There seems to be a Python Twisted error that repeats over and over. My log file is 500+MB and consists of the error in the link above in a loop.
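With a log that large, it can help to count the repeated errors instead of scrolling through them. Below is a minimal sketch, assuming the log lines look like the excerpt posted later in this thread (a timestamp prefix followed by `builtins.<ExceptionName>:`). The function name `count_exception_lines` is hypothetical and the sample lines are illustrative, not the contents of the linked paste.

```python
import io
from collections import Counter


def count_exception_lines(lines):
    """Count lines that end a traceback, e.g. 'builtins.StopIteration:'."""
    counts = Counter()
    for line in lines:
        text = line.strip()
        if text.endswith(":") and "builtins." in text:
            # Keep only the exception name, which is the last token on the line.
            counts[text.split()[-1].rstrip(":")] += 1
    return counts


# Illustrative sample; a real log would be passed as an open file object,
# which is also iterated line by line without loading it all into memory.
sample = io.StringIO(
    "Apr 07 09:15:39 return next(self.queue)\n"
    "Apr 07 09:15:39 builtins.StopIteration:\n"
    "Apr 07 09:15:40 builtins.StopIteration:\n"
)
counts = count_exception_lines(sample)
```

Because the function only iterates over lines, it should cope with a multi-hundred-megabyte file when given something like `open(path)`.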

      • nebulon
        nebulon Staff @rmdes last edited by

        @rmdes can you make sure you are running the latest Cloudron version, 6.2.7? It included fixes for graphite.

        • robi
          robi @nebulon last edited by

          @nebulon I've had it happen on the latest v6.2.7 as well. Something keeps spiking its memory usage.

          • girish
            girish Staff last edited by

            Updating to 6.2.7 should definitely sort out the Twisted errors. There was an error in the graphite-web configuration in previous releases.

            • robi
              robi @girish last edited by

              @girish It happened twice today on 6.2.7.
              The memory limit is now 640MB, down from 1250MB, where it showed the same behavior.

              • robi
                robi @robi last edited by

                @robi OOM twice more yesterday.

                It would be great if the notification and the email included the limit that was reached and a timestamp.

                The email does carry its own send time, though.
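To illustrate the request, here is a rough sketch of a notification line that carries both the limit and the time of the event. `format_oom_message` and its parameters are hypothetical and are not part of Cloudron. The 640 MB limit comes from the post above, and the timestamp is just an example value treated as UTC.

```python
from datetime import datetime, timezone


def format_oom_message(service, limit_bytes, when):
    """Build a one-line OOM notice with the memory limit and a UTC timestamp."""
    limit_mb = limit_bytes // (1024 * 1024)
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
    return f"{service} ran out of memory (limit: {limit_mb} MB) at {stamp}"


message = format_oom_message(
    "graphite",
    640 * 1024 * 1024,
    datetime(2021, 4, 7, 16, 15, 39, tzinfo=timezone.utc),
)
# message == "graphite ran out of memory (limit: 640 MB) at 2021-04-07 16:15:39 UTC"
```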

                • robi
                  robi last edited by

                  Apr 07 09:15:39 builtins.StopIteration:
                  Apr 07 09:15:39 07/04/2021 16:15:39 :: [console] Unhandled Error
                  Apr 07 09:15:39 Traceback (most recent call last):
                  Apr 07 09:15:39 File "/usr/lib/python3/dist-packages/twisted/python/threadpool.py", line 250, in inContext
                  Apr 07 09:15:39 result = inContext.theWork()
                  Apr 07 09:15:39 File "/usr/lib/python3/dist-packages/twisted/python/threadpool.py", line 266, in <lambda>
                  Apr 07 09:15:39 inContext.theWork = lambda: context.call(ctx, func, *args, **kw)
                  Apr 07 09:15:39 File "/usr/lib/python3/dist-packages/twisted/python/context.py", line 122, in callWithContext
                  Apr 07 09:15:39 return self.currentContext().callWithContext(ctx, func, *args, **kw)
                  Apr 07 09:15:39 File "/usr/lib/python3/dist-packages/twisted/python/context.py", line 85, in callWithContext
                  Apr 07 09:15:39 return func(*args,**kw)
                  Apr 07 09:15:39 --- <exception caught here> ---
                  Apr 07 09:15:39 File "/usr/lib/python3/dist-packages/carbon/writer.py", line 189, in writeForever
                  Apr 07 09:15:39 writeCachedDataPoints()
                  Apr 07 09:15:39 File "/usr/lib/python3/dist-packages/carbon/writer.py", line 98, in writeCachedDataPoints
                  Apr 07 09:15:39 (metric, datapoints) = cache.drain_metric()
                  Apr 07 09:15:39 File "/usr/lib/python3/dist-packages/carbon/cache.py", line 187, in drain_metric
                  Apr 07 09:15:39 metric = self.strategy.choose_item()
                  Apr 07 09:15:39 File "/usr/lib/python3/dist-packages/carbon/cache.py", line 116, in choose_item
                  Apr 07 09:15:39 return next(self.queue)
                  Apr 07 09:15:39 builtins.StopIteration:
                  Apr 07 09:15:40 07/04/2021 16:15:40 :: [console] Unhandled Error
                  Apr 07 09:15:40 Traceback (most recent call last):
                  Apr 07 09:15:40 File "/usr/lib/python3/dist-packages/twisted/python/threadpool.py", line 250, in inContext
                  Apr 07 09:15:40 result = inContext.theWork()
                  Apr 07 09:15:40 File "/usr/lib/python3/dist-packages/twisted/python/threadpool.py", line 266, in <lambda>
                  Apr 07 09:15:40 inContext.theWork = lambda: context.call(ctx, func, *args, **kw)
                  Apr 07 09:15:40 File "/usr/lib/python3/dist-packages/twisted/python/context.py", line 122, in callWithContext
                  Apr 07 09:15:40 return self.currentContext().callWithContext(ctx, func, *args, **kw)
                  Apr 07 09:15:40 File "/usr/lib/python3/dist-packages/twisted/python/context.py", line 85, in callWithContext
                  Apr 07 09:15:40 return func(*args,**kw)
                  Apr 07 09:15:40 --- <exception caught here> ---
                  Apr 07 09:15:40 File "/usr/lib/python3/dist-packages/carbon/writer.py", line 189, in writeForever
                  Apr 07 09:15:40 writeCachedDataPoints()
                  Apr 07 09:15:40 File "/usr/lib/python3/dist-packages/carbon/writer.py", line 98, in writeCachedDataPoints
                  Apr 07 09:15:40 (metric, datapoints) = cache.drain_metric()
                  Apr 07 09:15:40 File "/usr/lib/python3/dist-packages/carbon/cache.py", line 187, in drain_metric
                  Apr 07 09:15:40 metric = self.strategy.choose_item()
                  Apr 07 09:15:40 File "/usr/lib/python3/dist-packages/carbon/cache.py", line 116, in choose_item
                  Apr 07 09:15:40 return next(self.queue)
                  Apr 07 09:15:40 builtins.StopIteration:
                  Apr 07 09:15:40 07/04/2021 16:15:40 :: [console] Unhandled Error
                  Apr 07 09:15:40 Traceback (most recent call last):
                  Apr 07 09:15:40 File "/usr/lib/python3/dist-packages/twisted/python/threadpool.py", line 250, in inContext
                  Apr 07 09:15:40 result = inContext.theWork()
                  Apr 07 09:15:40 File "/usr/lib/python3/dist-packages/twisted/python/threadpool.py", line 266, in <lambda>
                  Apr 07 09:15:40 inContext.theWork = lambda: context.call(ctx, func, *args, **kw)
                  Apr 07 09:15:40 File "/usr/lib/python3/dist-packages/twisted/python/context.py", line 122, in callWithContext
                  Apr 07 09:15:40 return self.currentContext().callWithContext(ctx, func, *args, **kw)
                  Apr 07 09:15:40 File "/usr/lib/python3/dist-packages/twisted/python/context.py", line 85, in callWithContext
                  Apr 07 09:15:40 return func(*args,**kw)
                  Apr 07 09:15:40 --- <exception caught here> ---
                  Apr 07 09:15:40 File "/usr/lib/python3/dist-packages/carbon/writer.py", line 189, in writeForever
                  Apr 07 09:15:40 writeCachedDataPoints()
                  Apr 07 09:15:40 File "/usr/lib/python3/dist-packages/carbon/writer.py", line 98, in writeCachedDataPoints
                  Apr 07 09:15:40 (metric, datapoints) = cache.drain_metric()
                  Apr 07 09:15:40 File "/usr/lib/python3/dist-packages/carbon/cache.py", line 187, in drain_metric
                  Apr 07 09:15:40 metric = self.strategy.choose_item()
                  Apr 07 09:15:40 File "/usr/lib/python3/dist-packages/carbon/cache.py", line 116, in choose_item
                  Apr 07 09:15:40 return next(self.queue)
                  Apr 07 09:15:40 builtins.StopIteration:
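For anyone reading the last frame of that traceback: in Python, `next()` on an exhausted iterator raises `StopIteration` unless a default is passed. The sketch below reproduces only that language behavior. The `make_queue` and `choose_item_*` functions are hypothetical stand-ins, not carbon's code, and nothing here shows that this exception is what drives the memory growth.

```python
def make_queue(items):
    """Yield items one by one; afterwards the generator is exhausted."""
    for item in items:
        yield item


def choose_item_unguarded(queue):
    # Same shape as `return next(self.queue)` in the traceback:
    # raises StopIteration once the generator has nothing left.
    return next(queue)


def choose_item_guarded(queue, default=None):
    # With a default, next() returns it instead of raising.
    return next(queue, default)


queue = make_queue(["metric.a"])
first = choose_item_unguarded(queue)

try:
    choose_item_unguarded(queue)
    raised = False
except StopIteration:
    raised = True

fallback = choose_item_guarded(queue)
```

Here the second unguarded call raises, while the guarded call returns the default `None`.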
                  

                  It still runs out of memory with a 2.5GB limit.

                  Also, my browser tab for the logs starts timing out as if it can't keep up. Maybe it's in a continuous crash loop.

                  • rmdes
                    rmdes @robi last edited by

                    @robi Same here: a continuous crash loop, and the log file is really huge.

                    • rmdes
                      rmdes @nebulon last edited by

                      @nebulon Confirmed, I'm running the latest Cloudron version.

                      • rmdes
                        rmdes @rmdes last edited by

                        I no longer see any Python Twisted errors, but the last two OOM events output this error: https://paste.armada.digital/enijufixep.coffeescript

                        • jdaviescoates
                          jdaviescoates last edited by

                          I'm also on 6.2.7 but graphite has just started crashing with OOM in the last couple of days (it's never crashed for me before then).

                          Also agree with @robi that the notifications within Cloudron really ought to include the time it happened (which I can only tell by seeing what time the email notification arrived).

                          • robi
                            robi last edited by

                            A manual restart, using the button in Services, seems to have calmed down the crashing for now.

                            I hope the log rotator does its job; we don't need so many huge logs lying around.
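As a general illustration of size-based rotation (this is not how Cloudron rotates its service logs, which the thread does not describe), Python's standard `logging.handlers.RotatingFileHandler` caps each file at `maxBytes` and keeps at most `backupCount` old files. The tiny limits, the temporary directory and the logger name below are assumptions chosen for the demo.

```python
import logging
import logging.handlers
import os
import tempfile

# Write into a throwaway directory so the demo leaves no files behind elsewhere.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "service.log")

# Roll over before a file would reach 1024 bytes; keep two rotated backups.
handler = logging.handlers.RotatingFileHandler(
    log_path, maxBytes=1024, backupCount=2
)
logger = logging.getLogger("rotation_demo")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger
logger.addHandler(handler)

# Simulate an error loop that would otherwise grow one file without bound.
for i in range(200):
    logger.info("Unhandled Error %d", i)

handler.close()
files = sorted(os.listdir(log_dir))
sizes = [os.path.getsize(os.path.join(log_dir, name)) for name in files]
```

However many messages the loop writes, the directory ends up with only the current file plus two backups, so total disk use stays bounded.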

                            • jdaviescoates
                              jdaviescoates last edited by jdaviescoates

                              Just had another Graphite OOM crash.

                              Seems strange, seeing as it's mostly just me using my Cloudron atm and I'm not really doing anything on it.

                              What does Graphite actually do?

                              • nebulon
                                nebulon Staff @jdaviescoates last edited by

                                @jdaviescoates That service (graphite + collectd) collects the data used in the graphs, such as memory usage over time. Given that it causes issues from time to time, and that we don't really make good use of it, we are thinking of collecting the data on our own and ditching graphite.
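Purely as an illustration of what "memory usage over time" means as data, and not a description of anything Cloudron has planned, a bounded in-memory series could look like the sketch below. The class `MemorySeries`, its methods and the sample values are all hypothetical.

```python
from collections import deque


class MemorySeries:
    """Keep only the most recent `capacity` (timestamp, used_bytes) samples."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest sample when full.
        self.samples = deque(maxlen=capacity)

    def record(self, timestamp, used_bytes):
        self.samples.append((timestamp, used_bytes))

    def peak(self):
        """Return the highest usage among the retained samples."""
        return max(used for _, used in self.samples)


series = MemorySeries(capacity=3)
for t, used in enumerate([100, 400, 250, 300]):
    series.record(t, used)

retained = list(series.samples)
highest = series.peak()
```

The fixed capacity means the collector's own memory use cannot grow without bound, which is the property this thread is missing.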

                                • jdaviescoates
                                  jdaviescoates @nebulon last edited by

                                  Thanks

                                  @nebulon said in Graphite keeps crashing OOM:

                                  collecting the data on our own

                                  What would that look like?

                                  • nebulon
                                    nebulon Staff @jdaviescoates last edited by

                                    @jdaviescoates we don't know yet 😉

                                    • scooke
                                      scooke @nebulon last edited by

                                      @nebulon CapRover uses Netdata... would that be possible?

                                      • robi
                                        robi last edited by

                                        [Attached image: 0bcb80f1-c3a8-4e0d-af61-6a02f89d7332-image.png]
                                        After a server restart, graphite won't start. Reconfiguring doesn't help.

                                        • robi
                                          robi @robi last edited by

                                          I decided to reboot the box for security upgrades (from notifications) and it came up without errors this time.

                                          • jdaviescoates
                                            jdaviescoates last edited by

                                            Graphite OOM, again.
