Cloudron Forum

Backup task crashes when a ClickHouse app deletes a temp merge dir mid-snapshot

    LoudLemur
    wrote last edited by LoudLemur
    #1

    Backup task crashes when a ClickHouse app deletes a temp merge dir mid-snapshot

    (Cannot read properties of null (reading 'sort'))

    Summary

    A full backup task aborts entirely when it reaches a ClickHouse-backed app (in my case Langfuse). The rsync syncer walks the app's data tree, but ClickHouse deletes a background-merge temp directory (data/clickhouse/store/*/tmp_merge_*) between enumeration and read. readTree gets null for that directory's contents and throws on .sort(), which propagates up and kills the whole task — not just the one app.

    Impact

    This is the important part: one racy app aborts the entire backup run. In the run below the task died at app 31 of 72, so all 41 remaining apps were silently left on the previous night's snapshot. The failure is timing-dependent (it only triggers if ClickHouse happens to be merging during the snapshot window), so backups appear intermittently broken with no config change on the user's side.

    Environment

    Cloudron version: 9.2
    Server OS: 9.2.0
    Ubuntu version: Ubuntu 24.04.2 LTS, Linux 6.8.0-124-generic
    Backup format: rsync, encrypted (encrypted: true)
    Backup storage: Hetzner Storage Box over SSH (port 23)
    Affected app: Langfuse (uses ClickHouse)

    What happens

    The same vanishing-temp-dir condition shows up twice in one snapshot. First it's caught harmlessly by the precondition du, which just warns and continues:

    du: cannot access '/home/yellowtent/appsdata/5edfb175-…/data/clickhouse/store/f0a/f0a84db1-…/tmp_merge_202606_23202_23297_19': No such file or directory
    

    Then it hits the rsync syncer, which does not tolerate it and crashes the task:

    backupupload: upload completed. error: TypeError: Cannot read properties of null (reading 'sort')
        at readTree (file:///home/yellowtent/box/src/syncer.js:31:47)
        at traverse (file:///home/yellowtent/box/src/syncer.js:130:30)
        at traverse (file:///home/yellowtent/box/src/syncer.js:136:17)
        … (recursion)
        at Object.sync (file:///home/yellowtent/box/src/syncer.js:159:5)
        at sync (file:///home/yellowtent/box/src/backupformat/rsync.js:166:63)
        at Object.upload (file:///home/yellowtent/box/src/backupformat/rsync.js:336:18)
        at async Object.upload (file:///home/yellowtent/box/src/backuptask.js:101:37)
    
    tasks: setCompleted - 4798: {"result":null,"error":{"message":"Cannot read properties of null (reading 'sort')","reason":"External Error"},"percent":100}
    

    Root cause

    readTree (box/src/syncer.js:31) reads a directory's entries and sorts them. When a subdirectory is removed between the initial find/enumeration and the per-directory read, the read returns null and null.sort() throws. The exception unwinds through traverse → sync → rsync.js → backuptask.js, so the whole task is marked failed rather than the single directory being skipped.

    This is expected, normal ClickHouse behaviour, not app misbehaviour: ClickHouse continuously creates and renames/deletes tmp_merge_* (and tmp_insert_*, tmp_fetch_*) directories under store/<uuid>/ as background merges complete. Any tool that enumerates the store and then reads it will occasionally find a directory gone. ClickHouse upstream has hit and patched the equivalent race in their own tooling, e.g. https://github.com/ClickHouse/ClickHouse/pull/44874 — so accommodating it on the reader side is the standard approach.
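
    The enumerate-then-read race described above can be reproduced deterministically without ClickHouse. The following is a minimal sketch using only Node's built-in fs module; the function name reproduceVanishingDir and the directory name tmp_merge_demo are made up for illustration and are not part of Cloudron or ClickHouse. It shows only the underlying filesystem behaviour (the per-directory read fails with ENOENT), not how Cloudron's wrapper reports it.

```javascript
// Deterministic reproduction of the enumerate-then-read race with plain Node fs.
// All names here are hypothetical; the temp directory is created and removed by the demo.
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

function reproduceVanishingDir() {
    const root = fs.mkdtempSync(path.join(os.tmpdir(), 'race-demo-'));
    const child = path.join(root, 'tmp_merge_demo');
    fs.mkdirSync(child);

    // Step 1: enumeration still sees the directory.
    const seen = fs.readdirSync(root, { withFileTypes: true })
        .filter((entry) => entry.isDirectory())
        .map((entry) => entry.name);

    // Step 2: the "merge" finishes and the directory disappears.
    fs.rmSync(child, { recursive: true });

    // Step 3: the per-directory read of the enumerated name now fails.
    let code = null;
    try {
        fs.readdirSync(path.join(root, seen[0]));
    } catch (err) {
        code = err.code;
    }

    fs.rmSync(root, { recursive: true, force: true });
    return { seen, code };
}

const result = reproduceVanishingDir();
console.log(result);
```

    In the real case the removal happens in another process at an unpredictable moment, which is why the failure looks intermittent.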

    Suggested fix

    Make readTree resilient to a directory disappearing mid-traversal instead of propagating null:

    • Guard against a null/undefined entry list before .sort().
    • On ENOENT (or a null read) for a directory that vanished after enumeration, treat it as empty / skip it and continue the walk, rather than aborting.

    This matches how du already behaves in the same task (warn and continue) and would make every ClickHouse-based package back up reliably.
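
    As a rough illustration of the "skip and continue" behaviour suggested above, here is a sketch of a tolerant directory walk. It is not Cloudron's readTree: listTreeTolerant, its injectable readdir parameter, and the skipped list are all hypothetical, and it uses plain Node fs rather than the platform's own helpers. It assumes that only ENOENT and ENOTDIR should be tolerated, and that any other error should still fail the walk.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: lists files under `root`, treating a directory that
// vanished after enumeration as empty instead of aborting the whole walk.
// `readdir` is injectable so the vanishing case can be simulated.
function listTreeTolerant(root, readdir = fs.readdirSync) {
    const files = [];
    const skipped = [];

    function walk(dir) {
        let entries;
        try {
            entries = readdir(dir, { withFileTypes: true });
        } catch (err) {
            // ENOENT: directory was removed; ENOTDIR: it was replaced by a file.
            if (err.code === 'ENOENT' || err.code === 'ENOTDIR') {
                skipped.push(dir); // record it so the caller can log a warning
                return;
            }
            throw err; // other errors (EACCES, EIO, ...) remain real failures
        }

        // Sort by name for a stable traversal order.
        entries.sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));

        for (const entry of entries) {
            const full = path.join(dir, entry.name);
            if (entry.isDirectory()) walk(full);
            else files.push(full);
        }
    }

    walk(root);
    return { files, skipped };
}
```

    Returning the skipped paths alongside the file list keeps the failure visible, in the same spirit as du's warning, without letting one vanished directory end the run. Note that in this sketch a missing root is also reported as skipped rather than thrown.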

    Why this matters beyond Langfuse

    Any package built on ClickHouse is affected the same way — Plausible, PostHog, SigNoz, Langfuse, etc. As more analytics apps adopt ClickHouse, more users will see intermittent, hard-to-diagnose full-backup aborts where the visible symptom (a stale backup on an unrelated app) is far from the actual cause.

    EDIT:

    Workarounds in the meantime

    • Exclude the ClickHouse app from automatic backups so it can't abort the whole run, and back it up separately.
    • Note: stopping the app and then running cloudron backup create --app <fqdn> does not work, because Cloudron does not back up a stopped app. A quiesced backup therefore has to be a filesystem-level snapshot taken outside the platform.
    • Otherwise, because the failure is a timing collision, simply re-running the scheduled backup usually clears it.
      LoudLemur
      wrote last edited by
      #2

      Following up on the report above: this still reproduces on Cloudron 9.2.0. The exact spot in the source is below in case it helps.

      We hit it on a real app backup just now:

      App backup error: Backup failed: Cannot read properties of null (reading 'sort')
      

      It reproduced on the second backup attempt under active ClickHouse merge load, so the race window isn't narrow.

      9.2.0 does add a guard, but it sits after the .sort(). In box/src/syncer.js, readTree is:

      const names = safe.fs.readdirSync(dirPath).sort();   // line 31
      if (!names) return [];                                // line 32
      

      @cloudron/safetydance's readdirSync returns null on a vanished directory (the ENOENT when a ClickHouse tmp_merge_* part is renamed/removed mid-snapshot). So line 31 evaluates null.sort() and throws before the line-32 if (!names) guard can run — the guard is dead code for this crash, and the whole-server backup run still aborts.

      The fix is to null-check before .sort():

      const names = safe.fs.readdirSync(dirPath);
      if (!names) return [];
      names.sort();
      

      (or (safe.fs.readdirSync(dirPath) || []).sort()). The same file's readCache (around lines 22–23) already does it in this order — assign, null-check, then use — so this just makes readTree consistent with the existing pattern in the file.
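
      For anyone who wants to check the ordering issue in isolation, here is a small self-contained sketch. readdirOrNull is a hypothetical stand-in that only mimics the "return null instead of throwing" behaviour described above; it is not the real @cloudron/safetydance API. readNamesBuggy and readNamesFixed are likewise illustrative names, not functions from syncer.js.

```javascript
const fs = require('node:fs');

// Hypothetical stand-in for a wrapper that returns null instead of throwing.
function readdirOrNull(dirPath) {
    try {
        return fs.readdirSync(dirPath);
    } catch (err) {
        return null;
    }
}

// Same order as the two quoted lines: .sort() is called before the guard.
function readNamesBuggy(dirPath) {
    const names = readdirOrNull(dirPath).sort();
    if (!names) return []; // never reached when the read returned null
    return names;
}

// Guard first, then sort.
function readNamesFixed(dirPath) {
    const names = readdirOrNull(dirPath);
    if (!names) return [];
    return names.sort();
}

// A path that is assumed not to exist on the machine running the demo.
const missing = '/nonexistent-dir-for-demo/tmp_merge_demo';

let buggyError = null;
try {
    readNamesBuggy(missing);
} catch (err) {
    buggyError = err;
}

console.log(buggyError instanceof TypeError); // true
console.log(readNamesFixed(missing));         // []
```

      The buggy variant throws a TypeError on the missing path, while the fixed variant returns an empty list, which is the behaviour the reordering is meant to restore.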

      Happy to test a patch against a live ClickHouse-bundling app under merge load.

        girish
        Staff
        wrote last edited by
        #3

        @loudlemur thanks for the report. Fixed in https://git.cloudron.io/platform/box/-/commit/33f3ca39996ad5b3af2b5dd2320d1552dd8952c0
