Process Image from URL #25
Comments
I don't believe that is supported by tesseract itself. Your best bet is to have your script retrieve the file, write it to disk, and then pass it to the tesseract process. If this module supported streams, you could retrieve the file and pipe it straight to the tesseract process, but unfortunately I don't think it does.
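The "retrieve the file, write it to disk, then give it to tesseract" approach could be sketched roughly as follows. This is only a minimal illustration, not this module's API: the helper names (`tempPathFor`, `download`, `runTesseract`, `ocrFromUrl`) are made up for the example, it calls the `tesseract` CLI directly and assumes it is on the `PATH`, and it does not follow HTTP redirects.

```javascript
const { execFile } = require("child_process");
const fs = require("fs");
const http = require("http");
const https = require("https");
const os = require("os");
const path = require("path");

// Hypothetical helper: build a temp file path that keeps the URL's extension.
function tempPathFor(imageUrl) {
  const ext = path.extname(new URL(imageUrl).pathname) || ".img";
  return path.join(os.tmpdir(), `ocr-${process.pid}-${Date.now()}${ext}`);
}

// Hypothetical helper: download imageUrl to destPath.
// Resolves with destPath once the file has been fully written.
function download(imageUrl, destPath) {
  const client = imageUrl.startsWith("https:") ? https : http;
  return new Promise((resolve, reject) => {
    client
      .get(imageUrl, (res) => {
        if (res.statusCode !== 200) {
          res.resume(); // discard the body so the socket is released
          reject(new Error(`Unexpected HTTP status ${res.statusCode}`));
          return;
        }
        const file = fs.createWriteStream(destPath);
        res.pipe(file);
        file.on("finish", () => resolve(destPath));
        file.on("error", reject);
      })
      .on("error", reject);
  });
}

// Run `tesseract <image> stdout` and resolve with the recognized text.
// Assumes the tesseract binary is installed and on the PATH.
function runTesseract(imagePath) {
  return new Promise((resolve, reject) => {
    execFile("tesseract", [imagePath, "stdout"], (err, stdout) => {
      if (err) reject(err);
      else resolve(stdout);
    });
  });
}

// Download, OCR, then remove the temporary file whether or not OCR succeeded.
async function ocrFromUrl(imageUrl) {
  const tmp = await download(imageUrl, tempPathFor(imageUrl));
  try {
    return await runTesseract(tmp);
  } finally {
    fs.unlink(tmp, () => {});
  }
}
```

Usage would look something like `ocrFromUrl("http://example.com/scan.png").then(console.log)`. The same temporary path could be handed to this module instead of the raw CLI call.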
Ah, that makes sense. Thanks for the response! I'm trying to build something that works on both mobile and desktop, where I can submit an image for processing through a standard HTML file upload input. If I could access the filesystem directly, I could process the image from there, but I don't think that's actually possible. I guess the only way to make it work would be to upload the image to the directory where the app itself is stored. I was originally trying to host the app on Heroku and upload to AWS, but it doesn't seem like that's possible here.
Here is a gist that is quick and messy but should work: https://gist.github.com/reecefenwick/96fc9c229ca2d21a633c This is one of my very old projects, a somewhat more advanced implementation that uses RabbitMQ for a client-worker architecture: https://github.com/reecefenwick/tesseract-service If you are looking to use it from a native app, you will find you can actually embed tesseract in your application. I think Uber (the ride share app) has done something similar, since it can scan your credit card; don't quote me on that, it's probably something else. But yes, unfortunately the only way you will be able to expose tesseract through an API is to store the file locally first :(
@reecefenwick great feedback. Re streaming with tesseract in general: it looks like you can at least stream filenames in via stdin: https://github.com/tesseract-ocr/tesseract/wiki/FAQ#how-to-do-streaming So for example, if you had an incoming file upload in the form of a Skipper upstream, you could fork the adapter you're using for persistence and build a Transform stream that receives a few MB of the file, writes that temporarily to disk, and sends the path of that temporary file to tesseract. The trickiest part is figuring out how to split up a large file upload, and I'd say that depends on the file format (e.g. whether you're dealing with video or a PDF).
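Feeding filenames to a long-running tesseract process over stdin might look something like the sketch below. It assumes the `stream_filelist` option and the `- -` (stdin/stdout) arguments behave as described in the linked FAQ for your tesseract version, so check that before relying on it. The helper names (`streamingArgs`, `toStdinLine`, `startStreamingOcr`) are hypothetical, not part of this module.

```javascript
const { spawn } = require("child_process");

// ASSUMPTION: these flags follow the streaming section of the tesseract FAQ;
// verify them against the tesseract version you have installed.
function streamingArgs() {
  return ["-c", "stream_filelist=true", "-", "-"];
}

// One path per line; reject paths that would be read as two entries.
function toStdinLine(imagePath) {
  if (/[\r\n]/.test(imagePath)) {
    throw new Error("image path must not contain line breaks");
  }
  return imagePath + "\n";
}

// Start tesseract once, then submit temp file paths as they become available.
// onText receives chunks of recognized text; onError receives spawn failures
// (for example, when the tesseract binary is not installed).
function startStreamingOcr(onText, onError) {
  const child = spawn("tesseract", streamingArgs());
  child.on("error", onError);
  child.stdout.setEncoding("utf8");
  child.stdout.on("data", onText);
  return {
    submit: (imagePath) => child.stdin.write(toStdinLine(imagePath)),
    close: () => child.stdin.end(),
  };
}
```

A Transform stream like the one described above would write each chunk to a temporary file and then call `submit(tempPath)`; the files still have to exist on disk, only their names are streamed.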
Maybe a silly question, but is there a way I can process an image from an http:// URL? It only seems to work with locally-stored files.