Cloudron Forum

Installing Gotenberg and Tika was simpler than I expected (if you have spare VPSes)!

Paperless-ngx
4 Posts 2 Posters 2.5k Views 2 Watching
    scooke wrote (#1):

    Someone HAS shared this info already, but I wanted to share it again.

    I really wanted to upload Word files, but they won't get processed unless you have Gotenberg and Tika installed. Thanks to the post above from @ChristopherMag, I thought I'd try to get those installed.

    My challenges: I didn't want to install them on the VPS that runs Cloudron because I don't know if that would mess it up. Conceptually, I also wasn't sure what address to use for a same-VPS install, since they wouldn't be in the same Docker world (I don't know what it's called) as Cloudron. So, I looked at my other VPS.

    VPS 2 runs CapRover, and CapRover offers Gotenberg. My first attempt failed with a 504 (Gateway Timeout) error. So I lengthened the access times and wait times for Gotenberg in the CapRover install form, and the second time it worked. I then copied and pasted the four lines that @ChristopherMag had in his post and edited them for my Gotenberg url*.

    Then, I needed another VPS since the RAM on VPS 2 was close to max with that latest app. Fortunately, I have other VPSes! One of them is running a LAMP setup plus a Presearch install. I figured that adding Tika to that should be fine, and if something borked up, it wouldn't be so tragic a loss. I followed https://github.com/apache/tika-docker, running docker run -d -p 127.0.0.1:9998:9998 apache/tika:<tag> (replacing localhost with my VPS' IP). It worked. Then I entered that url in the paperless-ngx paperless.conf. Voila! It all works.
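For anyone following along, here is a minimal sketch of what the resulting paperless.conf entries could look like. The three setting names are the ones Paperless-ngx documents for Tika support. The hostnames and IP are placeholders, not the real ones from this setup, and the exact four lines from @ChristopherMag's post are not reproduced here. The helper function name is made up for illustration.

```python
def paperless_tika_settings(tika_url: str, gotenberg_url: str) -> list[str]:
    """Build paperless.conf lines pointing Paperless-ngx at remote Tika and Gotenberg.

    Both URLs are supplied by the caller; trailing slashes are stripped so the
    endpoints are written in a consistent form.
    """
    return [
        "PAPERLESS_TIKA_ENABLED=1",
        f"PAPERLESS_TIKA_ENDPOINT={tika_url.rstrip('/')}",
        f"PAPERLESS_TIKA_GOTENBERG_ENDPOINT={gotenberg_url.rstrip('/')}",
    ]


if __name__ == "__main__":
    # Placeholder endpoints (assumptions): Tika reached by IP and port 9998,
    # Gotenberg reached through a proxied domain without an explicit port.
    for line in paperless_tika_settings(
        "http://203.0.113.10:9998", "https://gotenberg.example.com/"
    ):
        print(line)
```

The printed lines can be pasted into paperless.conf and adjusted to your own hosts.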

    * However, I had first tried to install Gotenberg using Easypanel (on yet another VPS - yeah, I'm a LEB fan). The Easypanel dashboard made it seem like I needed port 3000 as part of the url for paperless-ngx, so I entered the url with :3000, but it never worked.

    So when I found I could use CapRover for Gotenberg, I saw that its dashboard just gave the domain as the url, minus the port. So, I thought, "OK, I will try that." Now, in my paperless.conf, the Tika url includes the port 9998, but the Gotenberg url doesn't, and it works. I wonder if I needed to have had the port when using Easypanel, but I am not going to try because right now it's all working.
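One plausible explanation for the port difference (an assumption, not something confirmed in this thread): a panel like CapRover puts a reverse proxy in front of the app on the standard web ports and forwards to Gotenberg's internal port 3000, whereas the Tika container is published directly on 9998. A url with no explicit port simply means the scheme's default port. The small sketch below shows which port a client would actually connect to. The example hosts are placeholders.

```python
from urllib.parse import urlsplit

# Default ports implied by the URL scheme when no port is written out.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str) -> int:
    """Return the TCP port a client connects to for the given http(s) url."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port  # an explicit port such as :9998 or :3000
    return DEFAULT_PORTS[parts.scheme]  # raises KeyError for other schemes


if __name__ == "__main__":
    # Placeholder hosts (assumptions), mirroring the two url styles in the post.
    print(effective_port("https://gotenberg.example.com"))
    print(effective_port("http://203.0.113.10:9998"))
```

If the proxy explanation holds, adding :3000 to a proxied domain would point the client at a port that is not exposed to the outside, which would match the Easypanel failure described above.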

    I guess I'm surprised because in the past when I've tried to install more than one thing using Docker, by hand, it never worked. I always had to connect them somehow (I think it was more docker-compose at the time), and I could never figure it out. But maybe I understand it a bit better now, and I was pretty sure that plopping in Tika beside Presearch shouldn't mess anything up. I'm glad I could use CapRover though, since Gotenberg has more working parts than Tika to function. My only concern is a warning from the Tika page: "In the example above, we recommend binding the server to localhost because Docker alters iptables and may expose your tika-server to the internet. If you are confident that your tika-server is on an isolated network you can simply run:" I need to do some reading to see if using the IP of the VPS might weaken that server somehow.
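To make the concern in that warning concrete, here is a rough sketch that inspects the value passed to docker run -p and reports whether the service would be bound to loopback only. It relies on documented Docker behaviour: a value with no host IP is published on all interfaces by default. The function names are hypothetical, and the parsing handles IPv4-style values only.

```python
import ipaddress


def host_bind_address(publish_spec: str) -> str:
    """Return the host IP from a -p value such as '127.0.0.1:9998:9998'.

    A value with only 'hostPort:containerPort' carries no host IP, in which
    case Docker publishes on all interfaces (0.0.0.0) by default.
    IPv6 values in brackets are not handled by this sketch.
    """
    parts = publish_spec.split(":")
    return parts[0] if len(parts) == 3 else "0.0.0.0"


def is_loopback_only(publish_spec: str) -> bool:
    """True when the published port is reachable only from the host itself."""
    return ipaddress.ip_address(host_bind_address(publish_spec)).is_loopback


if __name__ == "__main__":
    # 203.0.113.10 is a placeholder for a VPS public IP (assumption).
    for spec in ("127.0.0.1:9998:9998", "203.0.113.10:9998:9998", "9998:9998"):
        print(spec, "loopback only:", is_loopback_only(spec))
```

Binding to the VPS IP, as described above, falls in the "not loopback only" case, so the port is reachable by anyone who can reach that IP unless a firewall or the hosting provider's network rules restrict it.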

    A life lived in fear is a life half-lived

      timconsidine (App Dev) wrote (#2):

      @scooke interesting, thank you for the feedback.

        timconsidine (App Dev) wrote (#3):

        @scooke maybe you can clarify one thing which is not clear in my mind

        I put PDF and JPG/PNG into Paperless because these formats are usually not edited, they're semi-frozen.

        XLS(X) and DOC(X) are often more living documents with edits, especially XLS(X). Does that mean you re-upload into Paperless whenever you make a local edit, and delete the old one? Or do you only upload MS documents which are "finished" and won't change?

        I think Paperless is great and use it for "documents of record": invoices, agreements, etc. I tend to think Nextcloud (or Seafile in my case) is more appropriate for living documents.

        Interested in your views and those of others.

          scooke wrote (#4):

          @timconsidine Yes, definitely for "finished" documents.
