Cloudron Forum


ArchiveBox default installation exposes private data and uploads to archive.org without user consent

  • go-run-jump wrote (#1):

    Bug Report: ArchiveBox on Cloudron

    ArchiveBox default installation exposes private data and uploads to archive.org without user consent

    Description:

    In the standard installation of ArchiveBox on Cloudron, all content is publicly accessible by default, and archived content is automatically uploaded to archive.org. This behavior is unexpected and potentially harmful for users of a platform like Cloudron, which is often used for personal or sensitive data storage.

    Steps to reproduce:

    1. Install ArchiveBox on Cloudron using the standard installation process.
    2. Add content to be archived.
    3. Observe that the content is publicly accessible and being uploaded to archive.org.

    Expected behavior:

    The default installation should prioritize user privacy and data protection. The following settings should be set as standard in ArchiveBox.conf:

    [PRIVACY]
    SAVE_ARCHIVE_DOT_ORG = False
    PUBLIC_INDEX = False
    PUBLIC_SNAPSHOTS = False
    PUBLIC_ADD_VIEW = False
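
A quick way to check an existing installation against the settings listed above is to parse the config file and report which of the four keys are missing or not set to False. This is only a minimal sketch: it assumes ArchiveBox.conf is an INI-style file that Python's configparser can read, and it scans every section rather than relying on the [PRIVACY] section name used in this post, since the real file may group these keys differently. The function name is made up for illustration.

```python
import configparser

# The four settings proposed in this post, with the values it asks for.
PROPOSED = {
    "SAVE_ARCHIVE_DOT_ORG": False,
    "PUBLIC_INDEX": False,
    "PUBLIC_SNAPSHOTS": False,
    "PUBLIC_ADD_VIEW": False,
}


def find_non_private_settings(conf_text):
    """Return the proposed keys that are missing or not set to False.

    Hypothetical helper: conf_text is the content of an INI-style
    ArchiveBox.conf (an assumption about the file format).
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep keys upper-case instead of lower-casing them
    parser.read_string(conf_text)

    found = {}
    for section in parser.sections():
        for key in parser[section]:
            if key in PROPOSED:
                found[key] = parser.getboolean(section, key)

    # A key that never appears is reported too, because the default then applies.
    return sorted(k for k, want in PROPOSED.items() if found.get(k) != want)


if __name__ == "__main__":
    sample = "[PRIVACY]\nSAVE_ARCHIVE_DOT_ORG = False\nPUBLIC_INDEX = True\n"
    print(find_non_private_settings(sample))
```

An empty list means all four keys are present and set to False; it says nothing about the other hardening steps discussed later in this thread.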
    

    Actual behavior:

    • All archived content is publicly accessible.
    • Content is automatically uploaded to archive.org without user consent.
    • Users must manually change privacy settings after installation.

    Impact:

    • Potential exposure of sensitive or private information.
    • Unauthorized distribution of copyrighted or confidential material.
    • Users may face difficulties in removing unintentionally uploaded content from archive.org.

    Suggested fix:

    Update the default installation configuration to include the privacy settings mentioned above. This will ensure that user data remains private by default, and no automatic uploads to archive.org occur without explicit user consent.

    • girish (Staff) wrote (#2):

      @go-run-jump Thanks for reporting; we will investigate.

      • pirate wrote (#3):

        To clarify, no archived data is "uploaded" to Archive.org; only URLs are sent to them, and they only archive things that are publicly accessible on the web (which they could arguably find through other means). If the URL requires cookies or a login of any kind, they do not archive it or store the URL.
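
To make the distinction concrete, here is a rough sketch of what "sending a URL" amounts to. It assumes the submission goes to the Wayback Machine's public save endpoint at https://web.archive.org/save/ followed by the page URL; the function name is hypothetical and this is not ArchiveBox's actual code. The sketch only builds the request URL and does not contact archive.org.

```python
# Assumed endpoint: the Wayback Machine's public "save" URL prefix.
SAVE_ENDPOINT = "https://web.archive.org/save/"


def build_submission_url(page_url):
    """Return the URL that would be requested to ask archive.org to crawl page_url.

    Hypothetical helper for illustration: the only input is the page's
    address, so no locally archived file content is part of the request.
    """
    return SAVE_ENDPOINT + page_url


if __name__ == "__main__":
    print(build_submission_url("https://example.com/article"))
```

Archive.org then fetches the page itself from the public web, which is why a page behind a login or cookies would not be captured this way.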

        Here are more in-depth explanations for the reasoning behind this default:

        • https://news.ycombinator.com/item?id=26866689
        • https://github.com/ArchiveBox/ArchiveBox#archiving-private-content
        • https://github.com/ArchiveBox/ArchiveBox/wiki/Security-Overview

        In summary, ArchiveBox in its default mode should not be used on anything other than public URLs. We cannot make private mode the default (which would imply it's safe for novice users) because of the major security risk it incurs: https://github.com/ArchiveBox/ArchiveBox#security-risks-of-viewing-archived-js (as explained in several places in the docs).

        If you are bold enough to attempt archiving private content, there is a detailed process involving reading about the security risks, setting up HTTPS ingress / separate domains for content and the admin UI, changing UI permissions, setting up an admin user, changing several defaults, etc. SAVE_ARCHIVE_DOT_ORG=False is just a small piece of the threat model.

        I understand it's a controversial decision that alienates some users, but making it wide-open by default was an intentional choice so users are more immediately aware that it's designed for low-security public archival out-of-the-box, with involved configuration needed to change that.

        If you insist on changing the default Cloudron config to be closer to private mode, I ask that you at least force users to read the two docs links I shared above at some point in the setup process, so they understand that more hardening is needed for safe archival of private data.

        • jdaviescoates wrote (#4):

          Personally, I think the defaults should be kept as is.


          • nebulon (Staff) wrote (#5):

            @pirate Thanks for the detailed explanation! This makes sense. We try to stay as close as possible to upstream's default configs for app packages anyway, since upstream usually knows best and often has deeper reasoning, as this example shows again.
