Showing 5 changed files with 312 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
--- | ||
weight: 70 | ||
date: "2023-11-02" | ||
author: "Vladimir Lapin" | ||
type: docs | ||
url: /recognize-parse-invoice/ | ||
feedback: OCRCLOUD | ||
title: Extracting information from scanned invoices | ||
description: Extract information such as numbers, dates, items, and totals from scanned invoices using the Aspose.OCR Cloud API. | ||
keywords: | ||
- OCR | ||
- recognize | ||
- invoice | ||
- parse | ||
- details | ||
- statement | ||
--- | ||
|
||
Invoices are commonly exchanged as scanned documents in many business and financial transactions. Aspose.OCR Cloud extends beyond traditional optical character recognition by employing natural language processing to extract specific information from invoices. The results can be used to generate summary reports, stored in a database, or integrated into accounting, financial, and banking software. | ||
|
||
The processing is performed in 3 API calls: | ||
|
||
1. [Get access token](/ocr/authorization/) | ||
2. [Send invoice for recognition](/ocr/send-invoice-for-recognition/) | ||
3. [Fetch machine-readable invoice data](/ocr/fetch-invoice-recognition-result/) | ||
|
||
Because Aspose.OCR Cloud is provided as a REST API, invoice processing can be performed from any platform with Internet access. | ||
|
||
Aspose also provides open-source [SDKs](/ocr/invoice-recognition-sdk/) for all popular programming languages that wrap all routine invoice processing into a few native methods. This makes interaction with Aspose.OCR Cloud services much easier, allowing you to focus on the task at hand rather than technical details. | ||
|
||
{{% alert color="primary" %}} | ||
Make sure the application has access to the **api.aspose.cloud** domain. | ||
{{% /alert %}} |
123 changes: 123 additions & 0 deletions
ocr/developer-reference/recognize-parse-invoice/fetch-recognition-result/_index.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,123 @@ | ||
--- | ||
weight: 20 | ||
date: "2023-11-02" | ||
author: "Vladimir Lapin" | ||
type: docs | ||
url: /fetch-invoice-recognition-result/ | ||
feedback: OCRCLOUD | ||
title: Fetching invoice processing result | ||
description: How to get the parsed invoice data from the Aspose.OCR Cloud queue. | ||
keywords: | ||
- OCR | ||
- recognize | ||
- queue | ||
- get | ||
- obtain | ||
- fetch | ||
- result | ||
- invoice | ||
--- | ||
|
||
When an invoice is [submitted](/ocr/send-invoice-for-recognition/) for processing, it is [queued](/ocr/recognition-workflow/) to ensure a stable response even under high load. To obtain the result, send a **GET** request to the `https://api.aspose.cloud/v5.0/ocr/RecognizeAndParseInvoice` Aspose.OCR Cloud REST API endpoint. To authorize the request, pass the [access token](/ocr/authorization/) in the **Authorization** header (_Bearer authentication_). | ||
|
||
Provide the [unique identifier](/ocr/send-invoice-for-recognition/#return-value) of the invoice processing task in the `id` parameter: | ||
|
||
```bash | ||
curl --request GET --location 'https://api.aspose.cloud/v5.0/ocr/RecognizeAndParseInvoice?id=39b37b24-86e8-4e91-9a99-6c2574853eb5' \ | ||
--header 'Authorization: Bearer eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...HaRYOxBcCRCPLnrFCVXpw7UA' | ||
``` | ||
|
||
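The same request can be assembled in Python. The sketch below uses only the standard library and builds the request without sending it; `build_fetch_request` is an illustrative helper name (not part of any Aspose SDK) and the token is a placeholder.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://api.aspose.cloud/v5.0/ocr/RecognizeAndParseInvoice"


def build_fetch_request(task_id: str, access_token: str) -> Request:
    """Build (but do not send) the GET request for a queued invoice task."""
    # The task ID goes into the `id` query parameter.
    url = f"{ENDPOINT}?{urlencode({'id': task_id})}"
    # The access token is passed as a Bearer token in the Authorization header.
    headers = {"Authorization": f"Bearer {access_token}"}
    return Request(url, headers=headers, method="GET")


request = build_fetch_request("39b37b24-86e8-4e91-9a99-6c2574853eb5", "<access token>")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; the response body is the JSON document described in the next section.
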
## Processing results | ||
|
||
The processing result is returned in JSON format in the response body. | ||
|
||
```json | ||
{ | ||
"id": "39b37b24-86e8-4e91-9a99-6c2574853eb5", | ||
"responseStatusCode": "Ok", | ||
"taskStatus": "Completed", | ||
"results": [ | ||
{ | ||
"type": "Text", | ||
"data": "eyJpc3N1ZV9kYXRlIjogIjIwMTctMTEtMjciL...sICJhY2NvdW50IjogIjJ4eHh4MG9veGsifQ==" | ||
} | ||
], | ||
"error": null | ||
} | ||
``` | ||
|
||
{{% alert color="primary" %}} | ||
Processing results are stored in the Aspose cloud and can be obtained by the task ID within **24 hours** after the invoice was sent to Aspose.OCR Cloud. | ||
{{% /alert %}} | ||
|
||
Property | Type | Description | ||
--------- | ---- | ----------- | ||
`id` | string | Unique identifier of the invoice processing task. Equal to the value of the `id` request parameter. | ||
`taskStatus` | string | [Current state](#task-statuses) of the invoice processing task in the queue. | ||
`responseStatusCode` | string | Processing response status. | ||
`results` | Base64 encoded JSON | [Invoice details](#invoice-details) in JSON format.<br />The data is returned as a Base64-encoded string. You must decode it before deserializing it into an object, displaying it on the screen, or saving it to a file. | ||
`error/messages` | string[] | Processing error messages, if any.<br />Even if the invoice was processed, you can still get notifications and warnings about non-fatal processing errors. | ||
|
||
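Decoding the `data` value takes two steps: Base64 to bytes, then bytes to JSON. The following Python sketch assumes the payload is UTF-8 text (as the .NET SDK sample does) and uses a made-up response for illustration; `decode_invoice_results` is a hypothetical helper name.

```python
import base64
import json


def decode_invoice_results(response: dict) -> list:
    """Decode every Base64 `data` entry in `results` into a Python dict."""
    decoded = []
    for item in response.get("results") or []:
        raw = base64.b64decode(item["data"])             # Base64 -> bytes
        decoded.append(json.loads(raw.decode("utf-8")))  # bytes -> JSON object
    return decoded


# A made-up response for illustration; real `data` values come from the API.
sample_data = base64.b64encode(json.dumps({"currency": "usd"}).encode("utf-8")).decode("ascii")
sample_response = {
    "id": "39b37b24-86e8-4e91-9a99-6c2574853eb5",
    "taskStatus": "Completed",
    "results": [{"type": "Text", "data": sample_data}],
    "error": None,
}
print(decode_invoice_results(sample_response))
```
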
## Invoice details | ||
|
||
The parsed invoice contents are returned in JSON format: | ||
|
||
```json | ||
{ | ||
"issue_date": "2017-11-27", | ||
"due_date": "", | ||
"supplier_name": "abc exports", | ||
"supplier_address": "4300 longbeach blvd longbeach california 90807 united states", | ||
"supplier_email": "", | ||
"supplier_phone": "15627349957", | ||
"supplier_tax_id": "", | ||
"receiver_name": "abc imports", | ||
"receiver_address": "140 wecker road manstield brisbane queensland 4122 australia", | ||
"receiver_tax_id": "", | ||
"currency": "usd", | ||
"total_amount": 43550.0, | ||
"vat": -1, | ||
"net_amount": 43550.0, | ||
"bank_name": "bank of america", | ||
"bic": "", | ||
"account": "2xxxx0ooxk" | ||
} | ||
``` | ||
|
||
At the moment, the Aspose.OCR Cloud API recognizes the following invoice data: | ||
|
||
Property | Format | Description | ||
-------- | ------ | ----------- | ||
`issue_date` | string | Invoice issue date in _YYYY-MM-DD_ format. | ||
`due_date` | string | Invoice due date in _YYYY-MM-DD_ format. | ||
`supplier_name` | string | Supplier or service provider name. | ||
`supplier_address` | string | Supplier or service provider address (as one string). | ||
`supplier_email` | string | Supplier or service provider email address. | ||
`supplier_phone` | string | Supplier or service provider phone number (as provided in the invoice, without conversion to international format). | ||
`supplier_tax_id` | string | Supplier or service provider TIN or similar ID. | ||
`receiver_name` | string | Receiver name. | ||
`receiver_address` | string | Receiver address (as one string). | ||
`receiver_tax_id` | string | Receiver TIN or similar ID. | ||
`currency` | string | Invoice currency. | ||
`total_amount` | number | Total (raw) amount due. | ||
`vat` | number | VAT, percent. `-1` if the value is missing in the invoice. | ||
`net_amount` | number | Net amount due. | ||
`bank_name` | string | Supplier's bank name. | ||
`bic` | string | Supplier's SWIFT or similar code. | ||
`account` | string | Supplier's account number. | ||
|
||
{{% alert color="primary" %}} | ||
The availability of the properties above depends on the invoice text and structure. | ||
{{% /alert %}} | ||
|
||
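Because some properties may be absent, client code usually needs to tell "missing" apart from real values. The Python sketch below maps the `-1` VAT marker to `None`; treating empty strings as missing is an assumption based on the sample output above, and `normalize_invoice` is a hypothetical helper name.

```python
def normalize_invoice(invoice: dict) -> dict:
    """Replace 'missing' markers with None so callers can test for absence."""
    cleaned = {}
    for key, value in invoice.items():
        if value == "":
            # Assumption: an empty string means the field was not found.
            cleaned[key] = None
        elif key == "vat" and value == -1:
            # Documented marker: VAT is -1 when missing in the invoice.
            cleaned[key] = None
        else:
            cleaned[key] = value
    return cleaned


parsed = {"issue_date": "2017-11-27", "due_date": "", "vat": -1, "total_amount": 43550.0}
print(normalize_invoice(parsed))
```
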
## Task statuses | ||
|
||
Processing may take up to several seconds depending on the Aspose.OCR Cloud load and the size of the original scan or photo. The status of the processing task is indicated in the `taskStatus` property of the processing result. | ||
|
||
Status code | Description | To do | ||
----------- | ----------- | ------ | ||
Pending | The invoice is queued for processing, but not yet processed. | Try fetching the result in a couple of seconds using the same ID. | ||
Processing | The invoice is currently being processed. | Fetch the result again using the same ID. | ||
Completed | The invoice is processed. | Read the result from the `results` property. | ||
Error | An error occurred during processing. | Check messages in the `error` property for more information. | ||
NotExist | The request with the specified ID does not exist, or the result has already been deleted from the cloud storage. | Check the ID or [send the invoice for processing](/ocr/send-invoice-for-recognition/) again with the same parameters. |
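
The status table above translates into a simple polling loop. The Python sketch below is one possible shape, not an official client: `wait_for_invoice` and its `fetch` callback are hypothetical names, the retry count and delay are arbitrary, and simulated responses stand in for real API calls.

```python
import time
from typing import Callable


def wait_for_invoice(fetch: Callable[[], dict], attempts: int = 10, delay: float = 2.0) -> dict:
    """Call `fetch` until the task is completed, then return the response."""
    for _ in range(attempts):
        response = fetch()
        status = response.get("taskStatus")
        if status == "Completed":
            return response
        if status == "Error":
            raise RuntimeError(f"Processing failed: {response.get('error')}")
        if status == "NotExist":
            raise LookupError("Unknown task ID or the result has been deleted")
        # Pending, Processing, or any other status: wait and retry with the same ID.
        time.sleep(delay)
    raise TimeoutError("The task did not complete in time")


# Simulated responses stand in for real API calls.
fake = iter([
    {"taskStatus": "Pending"},
    {"taskStatus": "Processing"},
    {"taskStatus": "Completed", "results": []},
])
print(wait_for_invoice(lambda: next(fake), delay=0)["taskStatus"])
```

In real use, `fetch` would send the GET request with the task ID and return the parsed JSON response.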
58 changes: 58 additions & 0 deletions
ocr/developer-reference/recognize-parse-invoice/recognition-sdk/_index.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
--- | ||
weight: 30 | ||
date: "2023-11-02" | ||
author: "Vladimir Lapin" | ||
type: docs | ||
url: /invoice-recognition-sdk/ | ||
feedback: OCRCLOUD | ||
title: Invoice processing with Aspose.OCR Cloud SDK | ||
description: How to use Aspose.OCR Cloud SDK for parsing scanned or photographed invoices. | ||
keywords: | ||
- OCR | ||
- process | ||
- parse | ||
- programming | ||
- development | ||
- SDK | ||
- invoice | ||
--- | ||
|
||
Although you can directly call the Aspose.OCR Cloud REST API to [send invoices for processing](/ocr/send-invoice-for-recognition/) and [fetch parsed data](/ocr/fetch-invoice-recognition-result/), there is a much easier way to implement OCR functionality in your applications. We provide software development kits (SDKs) for all popular programming languages. They wrap up all routine operations such as establishing connections, sending API requests, and parsing responses into a few simple methods. This makes interaction with Aspose.OCR Cloud services much easier, allowing you to focus on business logic rather than technical details. | ||
|
||
{{< tabs tabID="1" tabTotal="1" tabName1=".NET" >}} | ||
|
||
{{< tab tabNum="1" >}} | ||
```csharp | ||
using System; | ||
using System.IO; | ||
using System.Text; | ||
using Aspose.OCR.Cloud.SDK.Api; | ||
using Aspose.OCR.Cloud.SDK.Model; | ||
|
||
namespace Example | ||
{ | ||
internal class Program | ||
{ | ||
static void Main(string[] args) | ||
{ | ||
/** Authorize your requests to Aspose.OCR Cloud API */ | ||
RecognizeAndParseInvoiceApi api = new RecognizeAndParseInvoiceApi("<Client Id>", "<Client Secret>"); | ||
/** Read invoice image to array of bytes */ | ||
byte[] invoice = File.ReadAllBytes("invoice.png"); | ||
/** Specify recognition language */ | ||
OCRSettingsRecognizeAndParseInvoice recognitionSettings = new OCRSettingsRecognizeAndParseInvoice { | ||
Language = Language.English | ||
}; | ||
/** Send invoice for processing */ | ||
OCRRecognizeAndParseInvoiceBody source = new OCRRecognizeAndParseInvoiceBody(invoice, recognitionSettings); | ||
string taskID = api.PostRecognizeAndParseInvoice(source); | ||
/** Fetch recognition result */ | ||
OCRResponse result = api.GetRecognizeAndParseInvoice(taskID); | ||
Console.WriteLine(Encoding.UTF8.GetString(result.Results[0].Data)); | ||
} | ||
} | ||
} | ||
``` | ||
|
||
Visit our GitHub repository for working code and sample files: https://github.com/aspose-ocr-cloud/aspose-ocr-cloud-dotnet | ||
{{< /tab >}} | ||
|
||
{{< /tabs >}} |
97 changes: 97 additions & 0 deletions
ocr/developer-reference/recognize-parse-invoice/send-for-recognition/_index.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,97 @@ | ||
--- | ||
weight: 10 | ||
date: "2023-11-02" | ||
author: "Vladimir Lapin" | ||
type: docs | ||
url: /send-invoice-for-recognition/ | ||
feedback: OCRCLOUD | ||
title: Sending invoice for recognition | ||
description: How to send a photo or scan of the invoice for processing to the Aspose.OCR Cloud API. | ||
keywords: | ||
- OCR | ||
- recognize | ||
- queue | ||
- send | ||
- invoice | ||
--- | ||
|
||
To extract information from a scanned or photographed invoice, send a **POST** request to the `https://api.aspose.cloud/v5.0/ocr/RecognizeAndParseInvoice` Aspose.OCR Cloud REST API endpoint. To authorize the request, pass the [access token](/ocr/authorization/) in the **Authorization** header (_Bearer authentication_). | ||
|
||
The invoice and recognition parameters are provided in JSON format in the request body. | ||
|
||
```json | ||
{ | ||
"image": "Base64 string", | ||
"settings": { | ||
"language": "English", | ||
"makeSkewCorrect": true, | ||
"rotate": 0, | ||
"makeBinarization": false, | ||
"makeUpsampling": false, | ||
"makeSpellCheck": false | ||
} | ||
} | ||
``` | ||
|
||
## Providing invoice image | ||
|
||
A photo or scan of the invoice is provided in the `image` property as a Base64-encoded string. | ||
|
||
{{% alert color="caution" %}} | ||
A Base64-encoded file can be very long, especially when recognizing scans and high-resolution photos. As a result, you may encounter an error when calling recognition via cURL in a shell command. Use the `getconf ARG_MAX` command to check the maximum length of the command arguments (in bytes). | ||
{{% /alert %}} | ||
|
||
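As an illustration, the request body can be built in Python from raw image bytes. This is a minimal sketch: `build_invoice_body` is a hypothetical helper name, and the four bytes below are a placeholder for a real file's contents.

```python
import base64
import json


def build_invoice_body(image_bytes: bytes, language: str = "English", **settings) -> str:
    """Serialize the request body: Base64 image plus recognition settings."""
    body = {
        # The image must be sent as a Base64-encoded string.
        "image": base64.b64encode(image_bytes).decode("ascii"),
        # Extra keyword arguments become recognition settings, e.g. makeSpellCheck=True.
        "settings": {"language": language, **settings},
    }
    return json.dumps(body)


# Placeholder bytes; in practice read them from a file opened in binary mode.
payload = build_invoice_body(b"\xff\xd8\xff\xe0", makeSpellCheck=True)
print(payload)
```

Building the body in a script rather than on the command line also sidesteps the shell argument length limit mentioned above.
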
## Recognition settings | ||
|
||
Property | Type | Default value | Description | ||
------- | ---- | ------------- | ----------- | ||
`language` | string | `English` | Specify a [language](/ocr/supported-languages/) for recognition. | ||
`makeSkewCorrect` | boolean | `true` | Automatically correct invoice image tilt (deskew) before proceeding to recognition.<br />Automatic deskew works for images rotated 15 degrees or less. If the scan or photo is rotated by a larger degree or upside down, you must manually specify the rotation angle. | ||
`rotate` | integer | `0` | Rotate an invoice image by the specified degree.<br />Should be used when the image is rotated by a significant angle or turned upside down. | ||
`makeBinarization` | boolean | `false` | Automatically convert an invoice to black and white before proceeding to recognition. | ||
`makeUpsampling` | boolean | `false` | Intelligently upscale an invoice image to improve small font recognition and detection of dense lines. | ||
`makeSpellCheck` | boolean | `false` | Automatically replace commonly misspelled words in recognition results with the correct ones. The dictionary is based on the [selected recognition language](/ocr/supported-languages/). | ||
|
||
## Image preprocessing order | ||
|
||
If image preprocessing filters are enabled, they are applied one after the other in the following order: | ||
|
||
1. [Upsampling](/ocr/upsample-image/#using-the-recognition-setting) (`"makeUpsampling": true`) | ||
2. [Skew correction](/ocr/deskew-image/#using-the-recognition-setting) (`"makeSkewCorrect": true`) | ||
|
||
If you want to apply preprocessing filters in another order, disable the corresponding recognition settings and use [self-managed preprocessing](/ocr/preprocess-image/). | ||
|
||
## Return value | ||
|
||
If successful, this method returns a string with a unique identifier (GUID) of the invoice recognition request in the [queue](/ocr/recognition-workflow/). | ||
|
||
Otherwise, it returns an HTTP status code corresponding to the error. | ||
|
||
## What's next | ||
|
||
Recognition and processing will take a few seconds, depending on the size of the source file and the current Aspose.OCR Cloud load. See the article [Fetching invoice processing result](/ocr/fetch-invoice-recognition-result/) for information on how to get a JSON with parsed invoice data from the server. | ||
|
||
## cURL example | ||
|
||
{{< tabs tabID="1" tabTotal="2" tabName1="Request" tabName2="Response" >}} | ||
{{< tab tabNum="1" >}} | ||
```bash | ||
curl --location --request POST 'https://api.aspose.cloud/v5.0/ocr/RecognizeAndParseInvoice' \ | ||
--header 'Accept: text/plain' \ | ||
--header 'Content-Type: application/json' \ | ||
--header 'Authorization: Bearer eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...HaRYOxBcCRCPLnrFCVXpw7UA' \ | ||
--data-raw '{ | ||
"image": "/9j/4AAQSkZJRgABAQEBLAEsAAD...8AkTf/2Q==", | ||
"settings": { | ||
"language": "English", | ||
"makeSpellCheck": true | ||
} | ||
}' | ||
``` | ||
{{< /tab >}} | ||
{{< tab tabNum="2" >}} | ||
``` | ||
39b37b24-86e8-4e91-9a99-6c2574853eb5 | ||
``` | ||
{{< /tab >}} | ||
{{< /tabs >}} |