<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Add Apache Tika to Paperless-ngx package.]]></title><description><![CDATA[<p dir="auto">To process office and eml files, Paperless-ngx needs an Apache Tika server. It can be hosted with Docker and needs to be added to the Paperless-ngx config. That's it.</p>
<p dir="auto">Could you please do that?</p>
<p dir="auto"><a href="https://docs.paperless-ngx.com/configuration/#tika" target="_blank" rel="noopener noreferrer nofollow ugc">https://docs.paperless-ngx.com/configuration/#tika</a></p>
]]></description><link>https://forum.cloudron.io/topic/12151/add-apache-tika-to-paperless-ngx-package</link><generator>RSS for Node</generator><lastBuildDate>Mon, 08 Jun 2026 20:14:13 GMT</lastBuildDate><atom:link href="https://forum.cloudron.io/topic/12151.rss" rel="self" type="application/rss+xml"/><pubDate>Tue, 23 Jul 2024 18:58:02 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to Add Apache Tika to Paperless-ngx package. on Wed, 20 Nov 2024 06:02:18 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/nebulon" aria-label="Profile: nebulon">@<bdi>nebulon</bdi></a> This is super, Nebulon. Cloudron has made so many apps available, but fulfilling "small" requests like this for existing apps makes Cloudron all the more appealing. You've just saved me the cost of a separate VPS on which I had been running Tika and Gotenberg! Has there been a better improvement of any app than this?! Thank you!</p>
<p dir="auto">PAPERLESS_TIKA_ENABLED=true<br />
PAPERLESS_TIKA_ENDPOINT=<a href="http://localhost:9998" target="_blank" rel="noopener noreferrer nofollow ugc">http://localhost:9998</a><br />
PAPERLESS_TIKA_GOTENBERG_ENDPOINT=<a href="http://localhost:3000" target="_blank" rel="noopener noreferrer nofollow ugc">http://localhost:3000</a></p>
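<p dir="auto">For anyone who wants to double-check these values: a minimal sketch, assuming the settings above are exported as environment variables, that splits each endpoint URL into the host and port Paperless-ngx would contact. The helper name <code>endpoint_host_port</code> is hypothetical and not part of Paperless-ngx; the defaults are simply the values quoted above.</p>

```python
import os
from urllib.parse import urlsplit


def endpoint_host_port(url: str) -> tuple[str, int]:
    """Return (host, port) for an http(s) endpoint URL (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not an http(s) endpoint: {url!r}")
    # Fall back to the scheme's default port when the URL omits one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


if __name__ == "__main__":
    # Defaults mirror the values quoted in this post; adjust for your setup.
    for name, default in (
        ("PAPERLESS_TIKA_ENDPOINT", "http://localhost:9998"),
        ("PAPERLESS_TIKA_GOTENBERG_ENDPOINT", "http://localhost:3000"),
    ):
        print(name, endpoint_host_port(os.environ.get(name, default)))
```

<p dir="auto">A malformed value (for example a missing scheme) raises <code>ValueError</code> here instead of surfacing later as a connection error.</p>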
]]></description><link>https://forum.cloudron.io/post/97101</link><guid isPermaLink="true">https://forum.cloudron.io/post/97101</guid><dc:creator><![CDATA[scooke]]></dc:creator><pubDate>Wed, 20 Nov 2024 06:02:18 GMT</pubDate></item><item><title><![CDATA[Reply to Add Apache Tika to Paperless-ngx package. on Sun, 03 Nov 2024 19:12:51 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/jarod" aria-label="Profile: Jarod">@<bdi>Jarod</bdi></a> my best guess: we need a lot more samples to get to the bottom of this. My “problem”: I use Thunderbird and don't normally deal with eml files. But I will try it out next week. Let's use this thread to record our findings.</p>
]]></description><link>https://forum.cloudron.io/post/96344</link><guid isPermaLink="true">https://forum.cloudron.io/post/96344</guid><dc:creator><![CDATA[luckow]]></dc:creator><pubDate>Sun, 03 Nov 2024 19:12:51 GMT</pubDate></item><item><title><![CDATA[Reply to Add Apache Tika to Paperless-ngx package. on Sun, 03 Nov 2024 18:38:42 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/luckow" aria-label="Profile: luckow">@<bdi>luckow</bdi></a> said in <a href="/post/96342">Add Apache Tika to Paperless-ngx package.</a>:</p>
<blockquote>
<p dir="auto">Testcase: Save eml file to desktop. Open paperless ngx. Upload eml file.</p>
</blockquote>
<p dir="auto">That's how I did it.</p>
]]></description><link>https://forum.cloudron.io/post/96343</link><guid isPermaLink="true">https://forum.cloudron.io/post/96343</guid><dc:creator><![CDATA[Jarod]]></dc:creator><pubDate>Sun, 03 Nov 2024 18:38:42 GMT</pubDate></item><item><title><![CDATA[Reply to Add Apache Tika to Paperless-ngx package. on Sun, 03 Nov 2024 18:15:54 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/jarod" aria-label="Profile: Jarod">@<bdi>Jarod</bdi></a> my test worked<br />
<img src="/assets/uploads/files/1730657483733-92dd6d05-44c5-4d4b-a351-0256529d6577-image.png" alt="92dd6d05-44c5-4d4b-a351-0256529d6577-image.png" class=" img-fluid img-markdown" /></p>
<p dir="auto">Testcase: Save eml file to desktop. Open paperless ngx. Upload eml file.</p>
<p dir="auto">To be fair: the result does not meet my expectations. <img src="https://forum.cloudron.io/assets/plugins/nodebb-plugin-emoji/emoji/android/1f642.png?v=74f512c8ff7" class="not-responsive emoji emoji-android emoji--slightly_smiling_face" style="height:23px;width:auto;vertical-align:middle" title=":)" alt="🙂" /><br />
<img src="/assets/uploads/files/1730657635315-47c257e3-8b36-4742-b51e-76ac54920b3d-image-resized.png" alt="47c257e3-8b36-4742-b51e-76ac54920b3d-image.png" class=" img-fluid img-markdown" /></p>
<p dir="auto">Without reading the documentation, my expectation was a PDF.</p>
]]></description><link>https://forum.cloudron.io/post/96342</link><guid isPermaLink="true">https://forum.cloudron.io/post/96342</guid><dc:creator><![CDATA[luckow]]></dc:creator><pubDate>Sun, 03 Nov 2024 18:15:54 GMT</pubDate></item><item><title><![CDATA[Reply to Add Apache Tika to Paperless-ngx package. on Sun, 03 Nov 2024 17:55:11 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/nebulon" aria-label="Profile: nebulon">@<bdi>nebulon</bdi></a><br />
I get the following error when I try to add an eml file</p>
<pre><code>[2024-11-03 17:53:03,119] [ERROR] [paperless.tasks] ConsumeTaskPlugin failed: WG_ aktuelle Zahlen aus dem Fachdienst 35.eml: Error occurred while consuming document WG_ aktuelle Zahlen aus dem Fachdienst 35.eml: Error while converting email to PDF: [Errno 111] Connection refused
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py", line 72, in map_httpcore_exceptions
    yield
  File "/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py", line 236, in handle_request
    resp = self._pool.handle_request(req)
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_sync/connection_pool.py", line 216, in handle_request
    raise exc from None
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_sync/connection_pool.py", line 196, in handle_request
    response = connection.handle_request(
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_sync/connection.py", line 99, in handle_request
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_sync/connection.py", line 76, in handle_request
    stream = self._connect(request)
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_sync/connection.py", line 122, in _connect
    stream = self._network_backend.connect_tcp(**kwargs)
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_backends/sync.py", line 205, in connect_tcp
    with map_exceptions(exc_map):
  File "/usr/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/usr/local/lib/python3.10/dist-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectError: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/app/code/src/paperless_mail/parsers.py", line 354, in generate_pdf_from_mail
    .run()
  File "/usr/local/lib/python3.10/dist-packages/gotenberg_client/_base.py", line 113, in run
    resp = self._client.post(url=self._route, headers=self._headers, data=self._form_data, files=self._get_files())
  File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 1157, in post
    return self.request(
  File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 837, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 926, in send
    response = self._send_handling_auth(
  File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 954, in _send_handling_auth
    response = self._send_handling_redirects(
  File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 991, in _send_handling_redirects
    response = self._send_single_request(request)
  File "/usr/local/lib/python3.10/dist-packages/httpx/_client.py", line 1027, in _send_single_request
    response = transport.handle_request(request)
  File "/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py", line 235, in handle_request
    with map_httpcore_exceptions():
  File "/usr/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py", line 89, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectError: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/asgiref/sync.py", line 327, in main_wrap
    raise exc_info[1]
  File "/app/code/src/documents/consumer.py", line 476, in run
    document_parser.parse(self.working_copy, mime_type, self.filename)
  File "/app/code/src/paperless_mail/parsers.py", line 183, in parse
    self.archive_path = self.generate_pdf(mail)
  File "/app/code/src/paperless_mail/parsers.py", line 223, in generate_pdf
    mail_pdf_file = self.generate_pdf_from_mail(mail_message)
  File "/app/code/src/paperless_mail/parsers.py", line 357, in generate_pdf_from_mail
    raise ParseError(
documents.parsers.ParseError: Error while converting email to PDF: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/app/code/src/documents/tasks.py", line 148, in consume_file
    msg = plugin.run()
  File "/app/code/src/documents/consumer.py", line 508, in run
    self._fail(
  File "/app/code/src/documents/consumer.py", line 151, in _fail
    raise ConsumerError(f"{self.filename}: {log_message or message}") from exception
documents.consumer.ConsumerError: WG_ aktuelle Zahlen aus dem Fachdienst 35.eml: Error occurred while consuming document WG_ aktuelle Zahlen aus dem Fachdienst 35.eml: Error while converting email to PDF: [Errno 111] Connection refused
</code></pre>
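<p dir="auto">The traceback ends in <code>[Errno 111] Connection refused</code> raised from the <code>gotenberg_client</code> request, which usually means nothing accepted the TCP connection at the configured address. A minimal sketch to check that from inside the app, assuming Tika and Gotenberg are expected on localhost ports 9998 and 3000 (an assumption, replace with the endpoints from your own settings); <code>is_listening</code> is a hypothetical helper name:</p>

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError (errno 111 on Linux) and timeouts.
        return False


if __name__ == "__main__":
    # Ports are assumptions; use the ones from your Paperless-ngx settings.
    for name, port in (("Tika", 9998), ("Gotenberg", 3000)):
        state = "reachable" if is_listening("localhost", port) else "not reachable"
        print(f"{name} on localhost:{port}: {state}")
```

<p dir="auto">This only tests whether a port accepts connections; it does not verify that the service behind it is actually Tika or Gotenberg.</p>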
]]></description><link>https://forum.cloudron.io/post/96341</link><guid isPermaLink="true">https://forum.cloudron.io/post/96341</guid><dc:creator><![CDATA[Jarod]]></dc:creator><pubDate>Sun, 03 Nov 2024 17:55:11 GMT</pubDate></item><item><title><![CDATA[Reply to Add Apache Tika to Paperless-ngx package. on Wed, 30 Oct 2024 15:13:54 GMT]]></title><description><![CDATA[<p dir="auto">The new app package now contains Tika and Gotenberg.</p>
]]></description><link>https://forum.cloudron.io/post/96232</link><guid isPermaLink="true">https://forum.cloudron.io/post/96232</guid><dc:creator><![CDATA[nebulon]]></dc:creator><pubDate>Wed, 30 Oct 2024 15:13:54 GMT</pubDate></item><item><title><![CDATA[Reply to Add Apache Tika to Paperless-ngx package. on Tue, 01 Oct 2024 13:59:59 GMT]]></title><description><![CDATA[<p dir="auto">We have created an internal task to look into this.</p>
]]></description><link>https://forum.cloudron.io/post/94979</link><guid isPermaLink="true">https://forum.cloudron.io/post/94979</guid><dc:creator><![CDATA[nebulon]]></dc:creator><pubDate>Tue, 01 Oct 2024 13:59:59 GMT</pubDate></item><item><title><![CDATA[Reply to Add Apache Tika to Paperless-ngx package. on Fri, 27 Sep 2024 14:42:10 GMT]]></title><description><![CDATA[<p dir="auto">And Gotenberg too. <img src="https://forum.cloudron.io/assets/plugins/nodebb-plugin-emoji/emoji/android/1f642.png?v=74f512c8ff7" class="not-responsive emoji emoji-android emoji--slightly_smiling_face" style="height:23px;width:auto;vertical-align:middle" title=":)" alt="🙂" /><br />
<a href="https://docs.paperless-ngx.com/configuration/#optional-services" target="_blank" rel="noopener noreferrer nofollow ugc">https://docs.paperless-ngx.com/configuration/#optional-services</a></p>
]]></description><link>https://forum.cloudron.io/post/94818</link><guid isPermaLink="true">https://forum.cloudron.io/post/94818</guid><dc:creator><![CDATA[luckow]]></dc:creator><pubDate>Fri, 27 Sep 2024 14:42:10 GMT</pubDate></item></channel></rss>