Commit 6032def

Address feedback
1 parent 07c172d commit 6032def

File tree

2 files changed

+32
-3
lines changed
  • docs/pipeline/enrichments/available-enrichments
    • custom-javascript-enrichment/writing
    • pii-pseudonymization-enrichment

docs/pipeline/enrichments/available-enrichments/custom-javascript-enrichment/writing/index.md

+26-1
Original file line number | Diff line number | Diff line change
@@ -243,7 +243,9 @@ You might be tempted to update derived entities in a similar way by using `event
243243
244244
## Discarding the event
245245
246-
Sometimes you don’t want the event to appear in your data warehouse or lake, e.g. because you suspect it comes from a bot and not a real user. Starting with Enrich 5.3.0, it is possible to drop an event by calling `event.drop()` in JavaScript enrichment code:
246+
Sometimes you don’t want the event to appear in your data warehouse or lake, e.g. because you suspect it comes from a bot and not a real user.
247+
248+
Starting with Enrich 5.3.0, it is possible to drop an event by calling `event.drop()` in JavaScript enrichment code:
247249
248250
```js
249251
const botPattern = /.*Googlebot.*/;
@@ -257,12 +259,35 @@ function process(event) {
257259
}
258260
```
259261
262+
This mechanism can be used to drop invalid events as well as good ones. Dropped events are not sent to any stream or destination, which lowers infrastructure costs.
263+
260264
:::caution
261265
262266
There is no way to recover dropped events, so use this feature with caution.
263267
264268
:::
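
Because dropped events cannot be recovered, it may be worth exercising the filter locally before deploying it. The following is a minimal sketch, assuming a hand-written stub in place of the real Enrich event object: `makeStubEvent` and its `dropped` flag are hypothetical and exist only for this illustration. The function is named `processEvent` here so that it does not shadow Node's global `process`; in the enrichment itself the entry point is `process(event)`, as in the snippet above.

```javascript
const botPattern = /.*Googlebot.*/;

// Same filtering logic as the enrichment; renamed from `process` for local runs.
function processEvent(event) {
  const useragent = event.getUseragent();

  if (useragent !== null && botPattern.test(useragent)) {
    event.drop();
  }
}

// Hypothetical stand-in for the Enrich event object (illustration only).
// It records whether drop() was called instead of discarding anything.
function makeStubEvent(useragent) {
  return {
    dropped: false,
    getUseragent() {
      return useragent;
    },
    drop() {
      this.dropped = true;
    },
  };
}

const botEvent = makeStubEvent("Mozilla/5.0 (compatible; Googlebot/2.1)");
const userEvent = makeStubEvent("Mozilla/5.0 (X11; Linux x86_64)");

processEvent(botEvent);
processEvent(userEvent);

console.log(botEvent.dropped, userEvent.dropped);
```

Only the stubbed bot event ends up flagged as dropped; the stub does not reproduce any other behavior of Enrich.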
265269
270+
Another way to discard events is to throw an exception in your JavaScript code, which sends the event to [failed events](/docs/fundamentals/failed-events/index.md):
271+
272+
```js
273+
const botPattern = /.*Googlebot.*/;
274+
275+
function process(event) {
276+
const useragent = event.getUseragent();
277+
278+
if (useragent !== null && botPattern.test(useragent)) {
279+
throw "Filtered event produced by Googlebot";
280+
}
281+
}
282+
```
283+
284+
:::caution
285+
286+
Throwing an exception creates an “enrichment failure” failed event, which can be hard to distinguish from genuine failures in your enrichment code (e.g. those caused by a bug).
287+
288+
:::
289+
290+
266291
## Accessing Java methods
267292
268293
Because the JavaScript enrichment runs inside the Enrich application, it has access to the Java standard library, as well as _some_ Java libraries (the ones used by Enrich). You can call Java methods via their fully qualified path, for example:

docs/pipeline/enrichments/available-enrichments/pii-pseudonymization-enrichment/index.md

+6-2
Original file line number | Diff line number | Diff line change
@@ -42,8 +42,12 @@ It's **important** to keep these things in mind when using this enrichment:
4242
- Hashing a field can change its format (e.g. email) and its length, thus making a whole valid original event invalid if its schema is not compatible with the hashing.
4343
- When updating the `salt` after it has already been used, the same original values hashed with the previous and the new salt will have different hashes, thus making a join impossible and/or creating duplicate values.
4444

45-
### Anonymous mode
46-
Enrich 5.3.0 introduced anonymous mode. When [anonymousOnly](https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow.enrichments/pii_enrichment_config/jsonschema/2-0-1#L155) is set to true, PII fields are masked only if the `SP-Anonymous` header is present.
45+
### `anonymousOnly` mode
46+
Enrich 5.3.0 introduced the `anonymousOnly` mode. When [anonymousOnly](https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow.enrichments/pii_enrichment_config/jsonschema/2-0-1#L155) is set to true, PII fields are masked only in events tracked in anonymous mode (i.e. the `SP-Anonymous` header is present).
47+
48+
This is useful for compliance with regulations such as GDPR, where you would start with [anonymous tracking](/docs/sources/trackers/javascript-trackers/web-tracker/anonymous-tracking/index.md) by default (all identifiers are masked) and switch to non-anonymous tracking when the user consents to data collection (all identifiers are kept).
49+
50+
By default, `anonymousOnly` is `false`, i.e. PII fields are always masked.
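
The rule described above can be summarized as a small truth table. The sketch below is purely illustrative: `shouldMaskPii` is a hypothetical helper written for this explanation, not a function provided by Enrich, and it only restates the behavior described in the text.

```javascript
// Hypothetical helper (not part of Enrich): restates the masking rule.
// anonymousOnly = false (the default): PII fields are always masked.
// anonymousOnly = true: PII fields are masked only when the event was
// tracked in anonymous mode, i.e. the SP-Anonymous header is present.
function shouldMaskPii(anonymousOnly, hasSpAnonymousHeader) {
  return !anonymousOnly || hasSpAnonymousHeader;
}

// Enumerate all four combinations.
const combinations = [];
for (const anonymousOnly of [false, true]) {
  for (const hasSpAnonymousHeader of [false, true]) {
    combinations.push({
      anonymousOnly,
      hasSpAnonymousHeader,
      masked: shouldMaskPii(anonymousOnly, hasSpAnonymousHeader),
    });
  }
}

console.table(combinations);
```

The only combination in which identifiers are kept is `anonymousOnly` set to `true` with no `SP-Anonymous` header, which corresponds to the consented, non-anonymous tracking case.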
4751

4852
## Input
4953
