Add more details about cross-domain tracking #1210

Open
wants to merge 5 commits into base: main

Conversation

mscwilson
Collaborator

I took bits out of the reusable partial to use directly in the JS tracker page, because that made it easier to edit and to break up with subheadings.

@mscwilson mscwilson requested a review from jethron April 10, 2025 16:18

netlify bot commented Apr 10, 2025

Deploy Preview for snowplow-docs ready!

Name Link
🔨 Latest commit 281110a
🔍 Latest deploy log https://app.netlify.com/sites/snowplow-docs/deploys/67f7ef53459f480008a40df8
😎 Deploy Preview https://deploy-preview-1210--snowplow-docs.netlify.app

Contributor

@jethron jethron left a comment

Great start, thanks for picking this up!

Comment on lines +11 to +13
:::note Base64 encoding
This enrichment expects the events to be base64-encoded. Configure this in the trackers.
:::
Contributor

What is the source for this?

The tracker base64-encodes the user_id, source app_id, and reason fields to make them URL-safe and to (slightly) obfuscate them in case they contain personal data (which could be unintentionally leaked to the destination site). This is distinct from the normal base64 encoding config that trackers have for SDJ payloads. No enrichment should need to be aware of the base64 encoding setting in trackers: the payload is already decoded by the pipeline when the enrichment runs.
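
To make the distinction concrete, here is a minimal sketch of what URL-safe base64 encoding of a single `_sp` field could look like. The function names are hypothetical, and the exact alphabet and padding rules the trackers use are an assumption here, not something confirmed in this thread.

```javascript
// Sketch only: URL-safe base64 for one `_sp` field (for example user_id).
// Assumption: "+" and "/" are swapped for "-" and "_", and "=" padding is dropped.
function urlSafeBase64Encode(value) {
  const bytes = new TextEncoder().encode(value); // UTF-8 bytes of the field
  let binary = '';
  for (const byte of bytes) binary += String.fromCharCode(byte);
  return btoa(binary)
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=+$/, '');
}

// Reverse of the above: restore the standard alphabet and padding, then decode.
function urlSafeBase64Decode(encoded) {
  const standard = encoded.replace(/-/g, '+').replace(/_/g, '/');
  const padded = standard + '='.repeat((4 - (standard.length % 4)) % 4);
  const bytes = Uint8Array.from(atob(padded), (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}
```

The point of the sketch is that this encoding applies to individual querystring fields, independently of whatever payload encoding the tracker is configured with.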


The `_sp` parameter can be attached by our Web ([see cross-domain tracking](/docs/sources/trackers/javascript-trackers/web-tracker/cross-domain-tracking/index.md)) and [mobile trackers](/docs/sources/trackers/mobile-trackers/tracking-events/session-tracking/index.md#decorating-outgoing-links-using-cross-navigation-tracking) and contains user, session and app identifiers (e.g., domain user and session IDs, business user ID, source app ID). The information to include in the parameters is configurable in the trackers. This is useful for tracking the movement of users across different apps and platforms.
To add the `_sp` querystring, configure cross-domain tracking in the [web](/docs/sources/trackers/javascript-trackers/web-tracker/cross-domain-tracking/index.md) or [mobile trackers](/docs/sources/trackers/mobile-trackers/tracking-events/session-tracking/index.md#decorating-outgoing-links-using-cross-navigation-tracking). The querystring contains user, session, and app identifiers, for example domain user and session IDs, business user ID, or source application ID. This is useful for tracking the movement of users across different apps and platforms. The information to include in the parameters is configurable in the trackers.
Contributor

I think by the end of this paragraph the reader should kind of know if they need this or not, but it's very "what"/"how" rather than "why" at the moment, so that's not clear?

The link to cross-domain-tracking is also doing a lot of work here; this text is ambiguous between the default cross-domain tracking and the extended version. Maybe it needs a refresher on the normal behaviour and some explanation of the actual differences?

  • They both use _sp and include domain_userid + timestamp
  • The default doesn't require any enrichment to be enabled
  • Both default and extended will populate the atomic refr_domain_userid and refr_dvce_tstamp fields
  • This enrichment adds the information in an entity as well
  • Extended lets you include the domain_sessionid, user_id, source app_id and a custom reason, which are all configurable, in addition to the default domain_userid + timestamp (which cannot be disabled)
  • If enabled, this enrichment will still parse the non-extended format correctly, so you do not need to co-ordinate enabling the configuration and updating tracking


If this enrichment isn't enabled, Enrich parses the `_sp` querystring parameter according to the old format, `_sp={domainUserId}.{timestamp}`.
The extended cross-navigation format is `_sp={domainUserId}.{timestamp}.{sessionId}.{subjectUserId}.{sourceId}.{platform}.{reason}`.
Contributor

I think sourceAppId is a bit clearer than sourceId -- and then probably sourcePlatform just for consistency; these are the app_id/platform values of the tracker that generates the parameter.
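
For illustration, a rough sketch of how both formats could be read with the suggested field names. `parseSpParameter` and `decodeUrlSafeBase64` are hypothetical helpers, not Enrich's implementation, and the set of base64-encoded fields is an assumption based on the earlier comment about user_id, source app_id, and reason.

```javascript
// Field order follows the extended format quoted above, using the suggested
// sourceAppId / sourcePlatform names.
const SP_FIELDS = [
  'domainUserId', 'timestamp', 'sessionId', 'subjectUserId',
  'sourceAppId', 'sourcePlatform', 'reason',
];

// Assumption: only these fields are URL-safe base64 encoded by the tracker.
const ENCODED_FIELDS = new Set(['subjectUserId', 'sourceAppId', 'reason']);

function decodeUrlSafeBase64(encoded) {
  const standard = encoded.replace(/-/g, '+').replace(/_/g, '/');
  const padded = standard + '='.repeat((4 - (standard.length % 4)) % 4);
  const bytes = Uint8Array.from(atob(padded), (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

// Returns an object with all seven fields (null when absent or empty),
// or null if the value does not start with `{domainUserId}.{timestamp}`.
function parseSpParameter(value) {
  const parts = value.split('.');
  if (parts.length < 2 || parts[0] === '' || !/^\d+$/.test(parts[1])) {
    return null;
  }
  const result = {};
  SP_FIELDS.forEach((name, index) => {
    const raw = parts[index];
    if (raw === undefined || raw === '') {
      result[name] = null; // short format, or a field the tracker left out
    } else if (name === 'timestamp') {
      result[name] = Number(raw);
    } else if (ENCODED_FIELDS.has(name)) {
      result[name] = decodeUrlSafeBase64(raw);
    } else {
      result[name] = raw;
    }
  });
  return result;
}
```

Because the short format is a prefix of the extended one, the same function handles both, which matches the point that enabling the enrichment does not need to be co-ordinated with tracking changes.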


## Configuration

- [Schema](https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow.enrichments/cross_navigation_config/jsonschema/1-0-0)
- [Example](https://github.com/snowplow/enrich/blob/master/config/enrichments/cross_navigation_config.json)

```json reference
Contributor

TIL, cool!

This kind of makes the Schema link above redundant. Maybe add title="Schema" and swap them?

I'd say embed the example as well, but this is about as boring as enrichment configs get so I'm not sure it matters. 😅 I guess it makes it easy to copy/paste?

Contributor

Oh, it's not documented but apparently we can change the "See full example on GitHub" text too if that doesn't make sense in this context.

```json reference title="Schema" referenceLinkText="See schema on Github"


## Output

This enrichment adds a new derived context to the enriched event with [this schema](https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow/cross_navigation/jsonschema/1-0-0).
This enrichment adds a new derived entity to the enriched event based on [this schema](https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow/cross_navigation/jsonschema/1-0-0).
Contributor

Should we embed this schema in the page as well? (You've created a monster)

Since we can specify that it's JSON here, the syntax highlighting is actually a nicer experience than you get on GH.

}
```

You can also call the `crossDomainLinker` function directly:
Contributor

This probably needs an explainer. You would want to do this because it follows the old link tracking behaviour, where the event listener that decorates a link is only added to links that already exist on the page. If you're in an SPA or have some other dynamic content, links added later won't be decorated even if they would satisfy your callback.

So it's probably something you'd want to do after each trackPageView call to make everything up to date.

Oh, saw the below section, this should just be removed and left to the "Update event listeners" section.


### Decorate all links

To decorate every link, regardless of its destination:
Contributor

We shouldn't suggest doing this, or at least warn against it.

</TabItem>
</Tabs>

Alternatively, only count links that parse as web URLs by checking the link's [`hostname`](https://developer.mozilla.org/en-US/docs/Web/API/HTMLAnchorElement/hostname). This should automatically exclude links that don't lead simply to other web pages.
Contributor

@jethron jethron Apr 11, 2025

This should automatically exclude links that don't lead simply to other web pages.

I'm not sure what this means?

Probably more useful to check the hostname against an array of domains or something. Oh, we do that below.
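
As a sketch of both ideas together: the quoted sentence presumably refers to links such as `mailto:` or `tel:`, whose `hostname` is an empty string, and the allowlist check then narrows decoration to specific domains. The host names and function names below are hypothetical. The `URL` class is used here only so the snippet runs outside a browser; as far as I know it exposes the same `hostname` property as an anchor element.

```javascript
// Hypothetical list of destination sites that should receive the `_sp` parameter.
const CROSS_DOMAIN_HOSTS = ['shop.example.com', 'checkout.example.com'];

// Links such as mailto: or tel: parse with an empty hostname.
function isWebLink(link) {
  return link.hostname !== '';
}

// Callback suitable for passing as the linker function: decorate only
// links that point at one of the listed hosts.
function shouldDecorate(link) {
  return isWebLink(link) && CROSS_DOMAIN_HOSTS.includes(link.hostname);
}

const examples = [
  'https://shop.example.com/cart',
  'https://other.example.org/',
  'mailto:someone@example.com',
  'tel:+15550100',
];
for (const href of examples) {
  console.log(href, shouldDecorate(new URL(href)));
}
```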


When the tracker loads it doesn't immediately decorate links. Instead, it adds event listeners to links which decorate them when a user clicks on them or navigates to them using the keyboard. This ensures that the timestamp added to the querystring is fresh.
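
To illustrate why decorating at click time keeps the timestamp fresh, here is a minimal sketch using the short `{domainUserId}.{timestamp}` format. `decorateHref` and `attachDecorator` are hypothetical names, and the tracker's real listeners may be wired differently.

```javascript
// Builds the decorated URL. The timestamp is taken when this function runs,
// not when the page or the tracker loaded.
function decorateHref(href, domainUserId, now = Date.now()) {
  const url = new URL(href);
  url.searchParams.set('_sp', `${domainUserId}.${now}`);
  return url.toString();
}

// Registers a listener instead of rewriting the link up front, so a link
// clicked ten minutes after page load still carries a current timestamp.
function attachDecorator(link, getDomainUserId) {
  link.addEventListener('click', () => {
    link.href = decorateHref(link.href, getDomainUserId());
  });
}
```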

If further links get added to the page after the tracker has loaded, you can use the tracker's `crossDomainLinker` method to add listeners again. Listeners won't be added to links which already have them.
Contributor

... so it's good practice to make sure you use the same linker function on every call; it won't be updated on links that already have listeners.
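
A sketch of what that could look like in practice: define the linker callback once and pass that same function on every refresh, for example after each page view in an SPA. The `api` object is a hypothetical stand-in for however the page exposes the tracker's `trackPageView` and `crossDomainLinker` calls, and the host list is made up.

```javascript
// Hypothetical destination hosts.
const LINKED_HOSTS = ['shop.example.com'];

// Defined once at module scope, so every refresh passes the same function
// and links decorated earlier stay consistent with links added later.
function linkFilter(linkElement) {
  return LINKED_HOSTS.includes(linkElement.hostname);
}

// `api` is assumed to expose trackPageView() and crossDomainLinker(callback).
function trackPageAndRefreshLinks(api) {
  api.trackPageView();
  api.crossDomainLinker(linkFilter); // picks up links added since the last call
}
```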

| Property | Description | Extended | Short |
| --------------- | ---------------------------------------------- | -------- | ----- |
| `domainUserId` | Current tracker-generated UUID user identifier | ✅ | ✅ |
| `timestamp` | Current epoch timestamp, ms precision | ✅ | ✅ |
Contributor

Nit: do these columns need width adjustment?

2 participants