    Cloudron Forum

    ArchiveBox -- Personal Internet Archive

    App Wishlist
    • heliostatic

      https://archivebox.io
      "ArchiveBox takes a list of website URLs you want to archive, and creates a local, static, browsable HTML clone of the content from those websites (it saves HTML, JS, media files, PDFs, images and more)."

      Can import links from:

      • Pocket, Pinboard, Instapaper
      • RSS, XML, JSON, or plain text lists
      • Browser history or bookmarks (Chrome, Firefox, Safari, IE, Opera, and more)
      • Shaarli, Delicious, Reddit Saved Posts, Wallabag, Unmark.it, and any other text with links in it!
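
      To illustrate the "any other text with links in it" case, here is a minimal sketch of pulling links out of free text. It is not ArchiveBox's own parser; the helper name `extract_urls` and the regular expression are assumptions made for illustration only.

```python
import re

# Hypothetical helper, not part of ArchiveBox: a rough pattern for http(s) links.
URL_PATTERN = re.compile(r"https?://[^\s<>]+")


def extract_urls(text: str) -> list[str]:
    """Return the unique http(s) URLs found in text, in order of first appearance."""
    seen = set()
    urls = []
    for match in URL_PATTERN.finditer(text):
        # Drop punctuation that usually belongs to the sentence, not the link.
        url = match.group(0).rstrip(".,;:!?)\"'")
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

      The resulting list could then be written to a plain text file, one URL per line, which is one of the import formats listed above.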

      Can save these things for each site:

      • favicon.ico: favicon of the site
      • example.com/page-name.html: wget clone of the site, with .html appended if not present
      • output.pdf: printed PDF of the site, using headless Chrome
      • screenshot.png: 1440x900 screenshot of the site, using headless Chrome
      • output.html: DOM dump of the HTML after rendering, using headless Chrome
      • archive.org.txt: a link to the saved site on archive.org
      • warc/: the HTML plus a gzipped WARC file (.gz)
      • media/: any mp4, mp3, subtitles, and metadata found using youtube-dl
      • git/: clone of any repository for GitHub, Bitbucket, or GitLab links
      • index.html & index.json: HTML and JSON index files containing metadata and details
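
      As a rough sketch of how one might check a single archived site's folder against the list above: the file and folder names come from that list, but the assumption that they all sit directly inside one folder per site, and the helper name `missing_outputs`, are mine and not confirmed by the ArchiveBox docs. The wget clone is left out because its name depends on the page.

```python
from pathlib import Path

# Names taken from the list above; the flat per-site layout is an assumption.
EXPECTED_OUTPUTS = [
    "favicon.ico",
    "output.pdf",
    "screenshot.png",
    "output.html",
    "archive.org.txt",
    "warc",
    "media",
    "git",
    "index.html",
    "index.json",
]


def missing_outputs(snapshot_dir):
    """Return the expected output names that do not exist in snapshot_dir."""
    folder = Path(snapshot_dir)
    return [name for name in EXPECTED_OUTPUTS if not (folder / name).exists()]
```

      Not every link will produce every output (git/ only applies to repository links, for example), so a non-empty result is not necessarily an error.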

      There's a Docker image, as well: https://github.com/pirate/ArchiveBox

      • robi

        Came across this today. It looks like it's just a Python script.
