Commit

Merge desmondmorris#70 pull request
patrykwlazlowicz committed Nov 12, 2020
1 parent 3024978 commit ac28d97
Showing 7 changed files with 10 additions and 117 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -1 +1,2 @@
 node_modules
+.idea
67 changes: 1 addition & 66 deletions README.md
@@ -1,68 +1,3 @@
 # Tesseract for node.js

-[![NPM](https://nodei.co/npm/node-tesseract.png)](https://nodei.co/npm/node-tesseract/)
-
-A simple wrapper for the Tesseract OCR package for node.js
-
-## Requirements
-
-* Tesseract 3.01 or higher is needed for this to work
-
-## Installation
-There is a hard dependency on the [Tesseract project](https://github.com/tesseract-ocr/tesseract). You can find installation instructions for various platforms on the project site. For Homebrew users, the installation is quick and easy.
-
-brew install tesseract --with-all-languages
-
-The above will install all of the language packages available, if you don't need them all you can remove the `--all-languages` flag and install them manually, by downloading them to your local machine and then exposing the `TESSDATA_PREFIX` variable into your path:
-
-export TESSDATA_PREFIX=~/Downloads/
-
-You can then go about installing the node-module to expose the JavaScript API:
-
-npm install node-tesseract
-
-## Usage
-
-```JavaScript
-var tesseract = require('node-tesseract');
-
-// Recognize text of any language in any format
-tesseract.process(__dirname + '/path/to/image.jpg',function(err, text) {
-if(err) {
-console.error(err);
-} else {
-console.log(text);
-}
-});
-
-// Recognize German text in a single uniform block of text and set the binary path
-
-var options = {
-l: 'deu',
-psm: 6,
-binary: '/usr/local/bin/tesseract'
-};
-
-tesseract.process(__dirname + '/path/to/image.jpg', options, function(err, text) {
-if(err) {
-console.error(err);
-} else {
-console.log(text);
-}
-});
-```
-
-## Changelog
-* **0.2.7**: Adds output file extension detection
-* **0.2.6**: Catches exception when deleting tmp files that do not exist
-* **0.2.5**: Preserves whitespace and replaces tmp module
-* **0.2.4**: Removes console logging for messaging
-* **0.2.3**: The ability to set the binary path via the config object. Better installation documentation.
-* **0.2.2**: Adds test converage to utils module
-* **0.2.1**: Strips leading & trailing whitespace from output by default
-* **0.2.0**: Adds ability to pass options via a configuration object.
-* **0.1.1**: Updates tmp module.
-* **0.1.0**: Removes preprocessing functionatlity. See #3.
-* **0.0.3**: Adds basic test coverage for process method
-* **0.0.2**: Pulls in changes by [joscha](https://github.com/joscha) including: refactored to support tesseract 3.01, added language parameter, config parameter, documentation, Added support for custom preprocessors, OTB Preprocessor using ImageMagick 'convert'
-* **0.0.1**: Initial version
+This project is fork from [node-tesseract](https://github.com/desmondmorris/node-tesseract)
11 changes: 8 additions & 3 deletions lib/tesseract.js
@@ -63,7 +63,7 @@ var Tesseract = {
 }

 if (options.psm !== null) {
-command.push('-psm ' + options.psm);
+command.push('--psm ' + options.psm);
 }

 if (options.config !== null) {
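
The hunk above swaps the single-dash `-psm` flag for `--psm`, the double-dash spelling that newer Tesseract releases expect; older 3.x builds used the single-dash form. If both generations need to be supported, the flag could be chosen by version. A minimal sketch, assuming a hypothetical `psmFlag` helper and a caller-supplied major version number (neither is part of this commit):

```javascript
// Hypothetical helper, not part of this commit: pick the page segmentation
// mode flag for a given Tesseract major version.
// Assumption: 4.x and later want '--psm', 3.x used '-psm'.
function psmFlag(majorVersion) {
  return majorVersion >= 4 ? '--psm' : '-psm';
}

// Hypothetical helper: build the psm fragment of the command the same way
// lib/tesseract.js does, returning an empty list when no psm is set.
function buildPsmArg(options, majorVersion) {
  if (options.psm === null || options.psm === undefined) {
    return [];
  }
  return [psmFlag(majorVersion) + ' ' + options.psm];
}
```

How the major version would be obtained (for example by parsing the output of `tesseract --version`) is left open here.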
@@ -83,7 +83,7 @@ var Tesseract = {
 }

 // Find one of the three possible extension
-glob(output + '.+(html|hocr|txt)', function(err, files){
+glob(output + '.+(html|hocr|txt)', function(err, files) {
 if (err) {
 callback(err, null);
 return;
@@ -97,7 +97,12 @@ var Tesseract = {
 var index = Tesseract.tmpFiles.indexOf(output);
 if (~index) Tesseract.tmpFiles.splice(index, 1);

-fs.unlinkSync(files[0]);
+fs.unlink(files[0], (err) => {
+if (err) {
+callback(err, null);
+return;
+}
+});

 callback(null, data)
 });
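
One thing to watch in the hunk above: `fs.unlink` is asynchronous, so `callback(null, data)` runs before the deletion finishes, and if the deletion fails the callback fires a second time with the error. A minimal sketch of one way to guarantee a single call, assuming a hypothetical `removeThenCallback` helper (not in this commit) built on Node's `fs` module:

```javascript
var fs = require('fs');

// Hypothetical helper, not part of this commit: delete the temporary file,
// then invoke the callback exactly once, with either the unlink error or
// the recognized text.
function removeThenCallback(file, data, callback) {
  fs.unlink(file, function (err) {
    if (err) {
      callback(err, null);
      return;
    }
    callback(null, data);
  });
}
```

With a helper like this, the `fs.unlink(...)` block and the trailing `callback(null, data)` would collapse into a single `removeThenCallback(files[0], data, callback)` call. Whether a failed cleanup should surface as an error at all, given that the text was already read, is a separate design choice.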
19 changes: 0 additions & 19 deletions test/tesseract.js

This file was deleted.

Binary file removed test/test.png
Binary file not shown.
17 changes: 0 additions & 17 deletions test/utils.js

This file was deleted.

12 changes: 0 additions & 12 deletions wercker.yml

This file was deleted.
