Recognise numbers and replace them with text #35

Open
tomerb15 opened this issue Aug 1, 2016 · 2 comments

Comments

tomerb15 commented Aug 1, 2016

Is there a way to use Tesseract to find the numbers in an image and get their x,y coordinates?

I have numbers which act as placeholders in the image (they must be numbers), and they can appear in various rotations within the image (sometimes the numbers are upside down or sideways).

I need to replace the numbers with some text.

@reecefenwick

Tesseract can output an hOCR file, which is basically an HTML representation of the OCR results.

It gives you the words along with their coordinates, confidence values and much more.

I'm unsure whether node-tesseract allows for this output, but you should be able to get it easily by running tesseract on the CLI.

http://stackoverflow.com/q/8268928

You can then use something like https://github.com/gierschv/node-hocr to parse the HTML into nice objects.

I use Tesseract with my Java app and follow a similar process: get the hOCR, parse it, analyse the coordinates of the words, etc.

If I can find some time I'll see if I can get something working in node.
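
A minimal sketch of the parsing step in Node, assuming the hOCR was already produced on the CLI (for example with `tesseract input.png output hocr`) and that words appear as `ocrx_word` spans with a `bbox x0 y0 x1 y1` entry in their `title` attribute, with `class` written before `title`. The function name `extractNumberBoxes` and the sample markup are hypothetical, and the regex is a stand-in for a real HTML parser such as the node-hocr module linked above. It does not address rotated numbers.

```javascript
// Sketch only: pulls digit-only words and their bounding boxes out of hOCR text.
// Assumes each word is a <span class='ocrx_word' ... title='bbox x0 y0 x1 y1; ...'>.
function extractNumberBoxes(hocr) {
  const wordPattern =
    /<span[^>]*class=['"]ocrx_word['"][^>]*title=['"]([^'"]*)['"][^>]*>([\s\S]*?)<\/span>/g;
  const results = [];
  let match;
  while ((match = wordPattern.exec(hocr)) !== null) {
    const title = match[1];
    // Strip inline markup such as <strong> or <em> around the recognised text.
    const text = match[2].replace(/<[^>]+>/g, '').trim();
    if (!/^\d+$/.test(text)) continue; // keep only words made entirely of digits
    const bbox = /bbox (\d+) (\d+) (\d+) (\d+)/.exec(title);
    if (!bbox) continue; // skip words without a usable bounding box
    const [x0, y0, x1, y1] = bbox.slice(1).map(Number);
    results.push({ text, x: x0, y: y0, width: x1 - x0, height: y1 - y0 });
  }
  return results;
}

// Hypothetical hOCR fragment, shaped like typical Tesseract output.
const sampleHocr = `
<span class='ocr_line' id='line_1_1' title="bbox 36 92 580 122">
  <span class='ocrx_word' id='word_1_1' title='bbox 36 92 96 116; x_wconf 95'>Room</span>
  <span class='ocrx_word' id='word_1_2' title='bbox 109 92 140 116; x_wconf 91'>42</span>
</span>`;

console.log(extractNumberBoxes(sampleHocr));
```

For the sample fragment this yields a single entry for "42" with its top-left corner at x=109, y=92. Those coordinates are what you would use to paint over the placeholder and draw the replacement text.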

tomerb15 commented Aug 3, 2016

Thank you very much for your kind help :)
