Tesseract-OCR (Optical Character Recognition) on Cloudron
-
wrote on Jun 22, 2022, 4:12 PM last edited by
Tesseract is an OCR (text recognition) engine: it recognizes text embedded in images and extracts it as plain text.
OCR can be useful, for example, when editing memes, or in computer gaming when you want to take data out of a game and process it in another application.
There is a Docker image.
https://github.com/tesseract-ocr/tesseract
Tesseract might be of use with paperless-ng, which Cloudron already supports. There is a thread mentioning this here:
https://forum.cloudron.io/topic/6346/multi-language-ocr-support/12?_=1655907503717
Ubuntu PPA:
https://launchpad.net/~alex-p/+archive/ubuntu/tesseract-ocr-devel
Docker (Tesseract 5.0 is out now; I think these are only 4.0):
https://tesseract-ocr.github.io/tessdoc/Docker-Containers.html
Documentation:
https://tesseract-ocr.github.io/tessdoc/Home.html
-
A quick reading suggests that this is a CLI tool (and not an app). This is also installed in paperless already btw. @LoudLemur Are you having trouble with tesseract and paperless?
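For anyone who wants to try the CLI directly, here is a minimal Python sketch of calling it, assuming the `tesseract` binary is on the PATH. The helper names (`build_tesseract_command`, `extract_text`) and the file name `scan.png` are made up for illustration. The `stdout` output base and the `-l` language option are regular Tesseract CLI usage, with several languages joined by `+`.

```python
import shutil
import subprocess
from typing import List, Optional


def build_tesseract_command(image_path: str,
                            languages: Optional[List[str]] = None) -> List[str]:
    """Build the argument list for a Tesseract CLI call (hypothetical helper).

    Using "stdout" as the output base makes Tesseract print the recognized
    text instead of writing it to a .txt file.
    """
    cmd = ["tesseract", image_path, "stdout"]
    if languages:
        # Several languages are passed as one value joined by "+", e.g. eng+deu.
        cmd += ["-l", "+".join(languages)]
    return cmd


def extract_text(image_path: str,
                 languages: Optional[List[str]] = None) -> Optional[str]:
    """Run Tesseract on an image and return the recognized text.

    Returns None when the tesseract binary is not installed, so the caller
    can decide how to handle that case.
    """
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        build_tesseract_command(image_path, languages),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Tesseract reports a failure
    )
    return result.stdout


if __name__ == "__main__":
    # "scan.png" is a placeholder file name for illustration only.
    print(build_tesseract_command("scan.png", ["eng", "deu"]))
```

The language data for each code passed to `-l` has to be installed for the call to succeed, which is likely the relevant part for the multi-language paperless thread linked above.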
-
@girish I'm wondering how we can get the Nextcloud app to leverage Tesseract for the Nextcloud OCR full-text search plugin.
wrote on Mar 20, 2023, 11:05 AM last edited by
@rmdes Maybe the easiest way would be https://forum.cloudron.io/topic/8383/nextcloud-all-in-one-aio/ ?
-
wrote on Mar 22, 2023, 9:47 PM last edited by
@Dolgoipa said in Tesseract-OCR (Optical Character Recognition) on Cloudron:
I have heard of Tesseract before and have used it in a couple of projects. It's a great open-source OCR engine that is easy to use and can be integrated with other applications.
You said exactly that before.
I'm inclined to think you are only here to post the link you just shared.
-
@jdaviescoates Good catch, I think it is a bot.