docs/post-consumption/content-matching.md
Lines changed: 9 additions & 14 deletions
Original file line number
Diff line number
Diff line change
@@ -8,26 +8,28 @@ Paperless-ngx does a great job matching documents with correct correspondents, s
8
8
However, there are documents for which the automatic matching doesn't work or a single regular expression match isn't sufficient.
9
9
For such cases, further examining the document's content after consumption is necessary.
10
10
11
-
## Update document details via organize
11
+
## Update document details via organize and the Paperless-ngx CLI
12
12
13
13
[organize](https://github.com/tfeldmann/organize) is an open-source, command-line file management automation tool.
14
14
It lets you execute actions based on custom filters, which can be easily defined in YAML.
15
15
16
-
Probably the most helpful filter in this context is the `filecontent` filter. The document's content can be matched with regular expressions
17
-
which allows to dynamically re-use (parts of) the matched content in subsequent actions.
16
+
Probably the most helpful filter in this context is the `filecontent` filter. It matches the document's content against regular expressions, which lets you dynamically reuse (parts of) the matched content in subsequent actions.
18
17
19
18
The following script
20
19
21
20
1. ensures that a newly-consumed document gets assigned a proper title based on the document's content.
22
21
This helps to stick to a consistent naming pattern for documents that you receive regularly, e.g. invoices.
23
22
2. extracts a value from the document content and stores it in a given custom field.
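
The two steps above can be illustrated independently of organize with a minimal Python sketch. It only shows the underlying idea: match the document content with a regular expression that uses named groups, then reuse the captured parts for a title and a custom field value. The pattern, the sample text, the `derive_updates` helper, and the keys of the returned dictionary are all hypothetical; neither organize nor the Paperless-ngx CLI is called here.

```python
import re
from typing import Optional

# Hypothetical invoice pattern. The named groups stand in for the values
# that a content filter would make available to subsequent actions.
INVOICE_PATTERN = re.compile(
    r"Invoice\s+(?:No\.?|Number)\s*:?\s*(?P<number>[A-Z0-9-]+)"
    r".*?Total\s*:?\s*(?P<total>\d+(?:\.\d{2})?)",
    re.DOTALL | re.IGNORECASE,
)


def derive_updates(content: str) -> Optional[dict]:
    """Return a title and a custom field value derived from the content."""
    match = INVOICE_PATTERN.search(content)
    if match is None:
        # Leave the document untouched when the pattern does not match.
        return None
    return {
        # Step 1: consistent title built from the matched invoice number.
        "title": f"Invoice {match['number']}",
        # Step 2: value intended for a (hypothetical) custom field.
        "custom_field_value": match["total"],
    }


if __name__ == "__main__":
    sample = "ACME Corp\nInvoice No: INV-2024-001\nDate: 2024-01-31\nTotal: 119.00 EUR\n"
    print(derive_updates(sample))
```

In the actual setup, the matching is expressed in the organize YAML configuration and the update is performed by the Paperless-ngx CLI; the sketch only mirrors the matching logic.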
24
23
24
+
The Paperless-ngx CLI can be used to update other fields as well. Check the CLI's help or [GitHub repository](https://github.com/marcelbrueckner/paperless-ngx-cli) for more information.
25
+
25
26
### Prerequisites
26
27
27
28
For this solution to work, you will need to install the following packages:
[^1]: Poppler is required for organize's `filecontent` filter to work, see [https://github.com/tfeldmann/organize/issues/322](https://github.com/tfeldmann/organize/issues/322).
33
35
@@ -41,8 +43,7 @@ Sticking to the general idea of our scripts folder layout, we will end up with f
41
43
paperless-ngx/
42
44
├─ my-post-consumption-scripts/
43
45
│ ├─ organize/
44
-
│ │ ├─ organize.config.yml.tpl
45
-
│ │ └─ pngx-update-document.py
46
+
│ │ └─ organize.config.yml.tpl
46
47
│ └─ post-consumption-wrapper.sh
47
48
│ # Obviously the below file only exists
48
49
│ # if you're running Paperless-ngx via Docker Compose
@@ -57,9 +58,10 @@ paperless-ngx/
57
58
58
59
```bash
59
60
# Token to access the REST API
60
-
PAPERLESS_TOKEN=
61
+
PNGX_TOKEN=
61
62
# Your Paperless-ngx URL, without trailing slash
62
-
PAPERLESS_URL=
63
+
# If running your post-consumption script within Docker, it's likely to be http://localhost:8000