Cloudron Forum


PDF to Text/RTF fails

Stirling-PDF
30 Posts 10 Posters 1.8k Views 10 Watching
  • In reply to girish:

    @jdaviescoates the javaldx issue is sorted out in the latest update, at least (I think).

    archos wrote (#9):

    @girish There is also a problem when converting any file to PDF.
    Snímek obrazovky_2023-07-01_12-58-05.png

     
    java.io.IOException: Command process failed with exit code 1. Error message: /usr/local/bin/unoconv:19: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
      from distutils.version import LooseVersion
    unoconv: Cannot find a suitable office installation on your system.
    ERROR: Please locate your office installation and send your feedback to:
           http://github.com/dagwieers/unoconv/issues
    	at stirling.software.SPDF.utils.ProcessExecutor.runCommandWithOutputHandling(ProcessExecutor.java:93)
    	at stirling.software.SPDF.controller.api.converters.ConvertOfficeController.convertToPdf(ConvertOfficeController.java:42)
    	at stirling.software.SPDF.controller.api.converters.ConvertOfficeController.processPdfWithOCR(ConvertOfficeController.java:74)
    	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    	at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:207)
    	at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:152)
    	at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:118)
    	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:884)
    	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:797)
    	at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
    	at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1081)
    	at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:974)
    	at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1011)
    	at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:914)
    	at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:590)
    	at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:885)
    	at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:658)
    	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:205)
    	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149)
    	at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:51)
    	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174)
    	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149)
    	at stirling.software.SPDF.config.MetricsFilter.doFilterInternal(MetricsFilter.java:41)
    	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116)
    	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174)
    	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149)
    	at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100)
    	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116)
    	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174)
    	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149)
    	at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93)
    	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116)
    	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174)
    	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149)
    	at org.springframework.web.filter.ServerHttpObservationFilter.doFilterInternal(ServerHttpObservationFilter.java:109)
    	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116)
    	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174)
    	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149)
    	at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201)
    	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116)
    	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174)
    	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149)
    	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:166)
    	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:90)
    	at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:482)
    	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:115)
    	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:93)
    	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74)
    	at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:738)
    	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:341)
    	at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:390)
    	at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:63)
    	at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:894)
    	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1741)
    	at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:52)
    	at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191)
    	at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
    	at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    	at java.base/java.lang.Thread.run(Thread.java:833)
    
    • girish (Staff) wrote (#10):

      @archos I think this one is fixed with the latest update.

      • Berliner wrote (#11):

        @girish this is right. Updated the app and now it works.

        • girish has marked this topic as solved.
        • girish (Staff) wrote (#12):

          OCR should also work better now. It was missing a bunch of packages.

          • archos wrote (#13):

            @girish But converting to PDF doesn't work for me. I tried reinstalling the app, but it's the same.
            Snímek obrazovky_2023-07-01_17-34-26.png
            I also tried different formats: DOC, XLS, XLSX, ODT.

            • necrevistonnezr wrote (#14):

              For me, it’s doing something but doesn’t let me access the file after conversion - nothing happens:

              Jul 01 23:05:45 Running command: soffice --infilter=writer_pdf_import --convert-to doc --outdir /tmp/output_12731918207241714492 /tmp/input_12820457046940439487.pdfCommand output:
              Jul 01 23:05:45 convert /tmp/input_12820457046940439487.pdf -> /tmp/output_12731918207241714492/input_12820457046940439487.doc using filter : MS Word 97
              Jul 01 23:06:04 Running command: soffice --infilter=writer_pdf_import --convert-to docx --outdir /tmp/output_13110194667904771198 /tmp/input_2013883566298576050.pdfCommand output:
              Jul 01 23:06:04 convert /tmp/input_2013883566298576050.pdf -> /tmp/output_13110194667904771198/input_2013883566298576050.docx using filter : MS Word 2007 XML
              Jul 01 23:06:35 Running command: soffice --infilter=writer_pdf_import --convert-to rtf --outdir /tmp/output_12759007429764915141 /tmp/input_5102818123428729636.pdfCommand output:
              Jul 01 23:06:35 convert /tmp/input_5102818123428729636.pdf -> /tmp/output_12759007429764915141/input_5102818123428729636.rtf using filter : Rich Text Format
              
              • girish (Staff) wrote (#15):

                @necrevistonnezr I noticed that with some of the conversions: the processing button stays disabled, but the browser has already downloaded the file to Downloads.

                • girish (Staff) wrote (#16):

                  @archos Please try the v1.1.0 package. I think I have fixed it for real now.

                  • archos wrote (#17):

                    @girish Great, thanks so much for the update. Conversion to PDF is working now.

                    • LoudLemur wrote (#18):

                      I tried this yesterday, pdf-to-rtf, pdf-to-txt. Both failed, and created empty files instead. Might it have been due to a lack of memory being allocated? App Version: 0.10.3

                      • nebulon (Staff) wrote (#19):

                        Were there any errors shown? If the app crashed from running out of memory, you should see a notification about that in Cloudron. And just to be sure, have you tried other pdf documents?

                        • jdaviescoates wrote (#20):

                          @LoudLemur said in PDF to Text/RTF fails:

                          I tried this yesterday, pdf-to-rtf, pdf-to-txt. Both failed, and created empty files instead.

                          Same.

                          @nebulon said in PDF to Text/RTF fails:

                          Were there any errors shown

                          No errors were shown on screen nor in the app logs.

                          @nebulon said in PDF to Text/RTF fails:

                          And just to be sure, have you tried other pdf documents?

                          Yes, I've tried numerous PDFs all with the same empty .txt file as the result.

                          I use Cloudron with Gandi & Hetzner

                          • girish (Staff) wrote (#21):

                            Yeah, I have never got the conversion to work. I think it's probably an upstream bug.

                            • jdaviescoates wrote (#22):

                              @girish said in PDF to Text/RTF fails:

                              Yeah, I have never got the conversion to work.

                              So I guess this shouldn't be marked as solved.

                              @girish said in PDF to Text/RTF fails:

                              I think it's probably an upstream bug.

                              If that's the case perhaps @froodle can help?

                              • girish (Staff) wrote (#23):

                                @jdaviescoates You can also check whether running soffice --infilter=writer_pdf_import --convert-to txt:Text --outdir /tmp/invoice Invoice.pdf works. For me, it doesn't produce anything, even on my laptop.
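
A minimal sketch of that check in Python, assuming soffice is on the PATH. The helper names are hypothetical and are not part of Stirling-PDF or LibreOffice, and the 120-second timeout is an arbitrary choice. It builds the same command, runs it, and then looks for a non-empty result file, because the failure described in this thread is silent: no error, just a missing or empty file.

```python
import subprocess
from pathlib import Path
from typing import List


def build_soffice_command(input_pdf: str, outdir: str, target: str = "txt:Text") -> List[str]:
    # Same flags as the command quoted in this thread.
    return [
        "soffice",
        "--infilter=writer_pdf_import",
        "--convert-to", target,
        "--outdir", outdir,
        input_pdf,
    ]


def expected_output(input_pdf: str, outdir: str, target: str = "txt:Text") -> Path:
    # Per the logs earlier in the thread, the result keeps the input's base
    # name and takes the target extension (the part before any ":Filter").
    extension = target.split(":", 1)[0]
    return Path(outdir) / f"{Path(input_pdf).stem}.{extension}"


def is_missing_or_empty(path: Path) -> bool:
    # Treat "no file at all" and "zero-byte file" as the same failure.
    return not path.is_file() or path.stat().st_size == 0


def convert_and_check(input_pdf: str, outdir: str, target: str = "txt:Text") -> bool:
    # True only if soffice exits cleanly AND leaves a non-empty file behind.
    result = subprocess.run(
        build_soffice_command(input_pdf, outdir, target),
        capture_output=True,
        text=True,
        timeout=120,  # arbitrary limit chosen for this sketch
    )
    if result.returncode != 0:
        return False
    return not is_missing_or_empty(expected_output(input_pdf, outdir, target))
```

Calling convert_and_check("Invoice.pdf", "/tmp/invoice") would return False for the behaviour described above, whether soffice wrote nothing or wrote an empty file.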

                                • girish has marked this topic as unsolved.
                                • jdaviescoates wrote (#24):

                                  @girish said in PDF to Text/RTF fails:

                                  @jdaviescoates You can also try if running soffice --infilter=writer_pdf_import --convert-to txt:Text --outdir /tmp/invoice Invoice.pdf works . For me, it doesn't produce anything even on my laptop.

                                  When I try that on my laptop I just get:

                                  Warning: failed to launch javaldx - java may not function correctly
                                  

                                  • girish (Staff) wrote (#25):

                                    @jdaviescoates said in PDF to Text/RTF fails:

                                    Warning: failed to launch javaldx - java may not function correctly

                                    I made this error go away by installing a whole bunch of packages (some OpenOffice Java support, etc.). I don't recall which ones now.

                                    • froodle wrote (#26):

                                      @girish This step does no OCR, if that's what you want; you would need to run the OCR step first.
                                      In that use case, the conversion would only carry over the image, which txt doesn't support.
                                      However, if you mean a PDF containing actual text, I can try to debug this and see whether I can reproduce it.
                                      It could very likely be an issue on the Stirling-PDF side.

                                      • jdaviescoates wrote (#27):

                                        @froodle said in PDF to Text/RTF fails:

                                        However I can try debug this to see if I can reproduce if you are on about a pdf containing actual text
                                        Could very likely be issue on stirling pdf side

                                        I've tried numerous PDFs with actual text and all of them failed just resulting in a blank text file.

                                        • froodle wrote (#28):

                                          @jdaviescoates Confirmed, txt has an issue. I'll be disabling that feature and adding a fix to the backlog.

                                          But the RTF option on that page works fine for me (not on Cloudron).
