Commit a542443

FMEPRD-306 Add Warehouse Native Experimentation Beta Documentation (#11720)
* FMEPRD-306
* Fix Links
* More Links
* Wording Nit
* Add Beta Badge + Create Metrics Page
* Replace Experiment Results + Add Integrations
* Add Formatting
* Update Tabs
* Fix Link
* Add Bullet
* Remove BigQuery and Trino
* Add Redshift and Snowflake Content (#11757)
* Bump Release Date
* Replace Experiment Results Screenshot + Remove CUPED
* Clarify FME Settings (or Admin?)
* Update Navigation
* Replace Analyze Results Screenshot
* Swap Images
* SME Review
* Add Content for View Experiment Results
* Add Schema + SQL Commands
* Add Create Experiment Content
* More Changes
* Remove More Cloud Experimentation Content
1 parent 8d72ff7 commit a542443

File tree

35 files changed: +1477 −13 lines changed


docs/feature-management-experimentation/60-experimentation/experiment-results/index.md

Lines changed: 4 additions & 4 deletions
@@ -9,22 +9,22 @@ Understanding how your experiment is performing, and whether it's driving meanin

Review your experiment's metrics and overall status. Explore metric-level details, explore trends, and learn how your results are calculated.

-For more information, see [Viewing experiment results](./viewing-experiment-results).
+For more information, see [Viewing experiment results](/docs/feature-management-experimentation/experimentation/experiment-results/viewing-experiment-results).

## Analyze experiment results

Drill down into experiment details to validate setup, explore user behavior, and identify potential issues.

-For more information, see [Analyzing experiment results](./analyzing-experiment-results).
+For more information, see [Analyzing experiment results](/docs/feature-management-experimentation/experimentation/experiment-results/analyzing-experiment-results).

## Reallocate traffic

Once you've analyzed your results, you can adjust your rollout strategy by shifting users between treatments or rolling out to 100% of your users.

-For more information, see [Reallocating traffic](./reallocate-traffic).
+For more information, see [Reallocating traffic](/docs/feature-management-experimentation/experimentation/experiment-results/reallocate-traffic).

## Share results

You can download key metrics, trends, and impact summaries in CSV or JSON format for offline analysis or sharing with teammates.

-For more information, see [Sharing experiment results](./sharing-experiment-results/).
+For more information, see [Sharing experiment results](/docs/feature-management-experimentation/experimentation/experiment-results/sharing-experiment-results/).

docs/feature-management-experimentation/60-experimentation/experiment-results/viewing-experiment-results/metric-details-and-trends.md

Lines changed: 1 addition & 1 deletion
@@ -69,7 +69,7 @@ On the impact snapshot chart, you can analyze data for ___key metrics___ using [

* **Run more data-driven experiments.** Iterate on your next hypotheses or run follow-up experiments using the insights gained on what worked or didn’t in past experiments.

:::info
-[Multiple comparison correction](../../key-concepts/multiple-comparison-correction) is not applied to dimensional analysis.
+[Multiple comparison correction](/docs/feature-management-experimentation/experimentation/key-concepts/multiple-comparison-correction) is not applied to dimensional analysis.
:::

Before you can select a _dimension_ to analyze on the metric Impact snapshot, you need to send a corresponding _[event property](/docs/feature-management-experimentation/experimentation/events/#event-properties)_ for the event measured by the metric. (You can set event properties in code when you call the FME SDK's `track` method.) An Admin also needs to [configure dimensions and values](/docs/feature-management-experimentation/experimentation/experiment-results/analyzing-experiment-results/dimensional-analysis/#configuring-dimensions-and-values) to show them in the Select a dimension dropdown.
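As a rough illustration, here is a minimal sketch of sending an event with a property using the JavaScript SDK (`@splitsoftware/splitio`). The SDK key, user key, traffic type, event name, value, and property key below are placeholders, not values from this documentation:

```typescript
import { SplitFactory } from '@splitsoftware/splitio';

// Initialize the browser SDK client (keys are placeholders).
const factory = SplitFactory({
  core: { authorizationKey: 'YOUR_CLIENT_SIDE_SDK_KEY', key: 'user_123' },
});
const client = factory.client();

client.on(client.Event.SDK_READY, () => {
  // Send an event with a property that can later be used as a dimension.
  // Traffic type, event name, value, and property key are illustrative.
  const queued: boolean = client.track(
    'user',           // traffic type shared with the metric
    'page_load_time', // event type the metric measures
    83.3,             // optional numeric event value
    { plan: 'gold' }  // event property (hypothetical key) for dimensional analysis
  );

  if (!queued) {
    console.warn('Event was not queued; check the SDK logs.');
  }
});
```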

docs/feature-management-experimentation/shared/metrics/index.md

Lines changed: 4 additions & 4 deletions
@@ -1,6 +1,6 @@
-A metric measures [events](https://help.split.io/hc/en-us/articles/360020585772) that are sent to Harness FME. Metrics can be defined to count the occurrence of events, measure event values, or measure event properties.
+A metric measures [events](/docs/feature-management-experimentation/experimentation/events/) that are sent to Harness FME. Metrics can be defined to count the occurrence of events, measure event values, or measure event properties.

-Metric results are calculated for each treatment of a feature flag that shares the same traffic type as the metric and has a percentage targeting rule applied. Impact can be calculated between a selected comparison treatment and baseline treatment within a feature flag. Results are displayed on the [Metrics impact tab](https://help.split.io/hc/en-us/articles/360020844451) of the feature flag.
+Metric results are calculated for each treatment of a feature flag that shares the same traffic type as the metric and has a percentage targeting rule applied. Impact can be calculated between a selected comparison treatment and baseline treatment within a feature flag.

### Common metrics

@@ -37,15 +37,15 @@ In the table below, we assume the traffic type selected for the metric is `user`

## Metric categories

-For more information about metric categories, see [Metric categorization](./categories/).
+For more information about metric categories, see [Metric categorization](/docs/feature-management-experimentation/experimentation/metrics/categories/).

## Configure an alert policy

You can set an alert policy for a metric and Harness FME will notify you if a feature flag impacts the metric beyond a threshold you define. For more information, review the [Configuring metric alerting guide](/docs/feature-management-experimentation/release-monitoring/metrics/setup/metric-alert-policy/).

## Audit logs

-Audit logs are captured every time the metric's definition or alert policy is changed. For more information, review the [Audit logs](https://help.split.io/hc/en-us/articles/360020579472-Audit-logs) guide.
+Audit logs are captured every time the metric's definition or alert policy is changed. For more information, review the [Audit logs](/docs/feature-management-experimentation/management-and-administration/account-settings/audit-logs/) guide.

## Metric list

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
---
title: Health Check
sidebar_position: 40
---

import HealthCheck from '/docs/feature-management-experimentation/60-experimentation/experiment-results/analyzing-experiment-results/health-check.md';

<HealthCheck />
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
---
title: Analyze Experiment Results
sidebar_position: 20
---

import AnalyzeResults from '/docs/feature-management-experimentation/60-experimentation/experiment-results/analyzing-experiment-results/index.md';

<AnalyzeResults />
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
---
title: Warehouse Native Experimentation Results
sidebar_label: Warehouse Native Experiment Results
description: Analyze your experiment results in Harness FME.
sidebar_position: 5
---

<CTABanner
  buttonText="Request Access"
  title="Warehouse Native is in beta!"
  tagline="Get early access to run Harness FME experiments directly in your data warehouse."
  link="https://developer.harness.io/docs/feature-management-experimentation/fme-support"
  closable={true}
  target="_self"
/>

## Overview

Understanding how your experiment is performing, and whether it's driving meaningful impact, is key to making confident, data-informed product decisions. Warehouse Native experiment results help you interpret metrics derived directly from your <Tooltip id="fme.warehouse-native.data-warehouse">data warehouse</Tooltip>, assess experiment health, and share validated outcomes with stakeholders.

## View experiment results

Review key experiment metrics and overall significance in Harness FME.

![](../static/view-results.png)

Explore [how each metric performs](/docs/feature-management-experimentation/warehouse-native/experiment-results/view-experiment-results/) across treatments, inspect query-based data directly from your warehouse, and understand how results are calculated based on your metric definitions.

## Analyze experiment results

Drill down into experiment details to validate setup, confirm metric source alignment, and investigate user or account-level behavior.

![](../static/view-metrics.png)

Use [detailed metric breakdowns](/docs/feature-management-experimentation/warehouse-native/experiment-results/analyze-experiment-results/) to identify anomalies or confirm expected outcomes.

## Share results

Download experiment metrics, statistical summaries, and warehouse query outputs in CSV or JSON format for further analysis or collaboration with your team.

![](../static/share-results.png)

You can also share experiment results directly within Harness FME to maintain visibility across product, data, and engineering teams.
Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
---
title: View Experiment Results
sidebar_position: 10
---

## Overview

You can view your experiment results from the **Experiments** page. This page provides a centralized view of all experiments and allows you to quickly access performance metrics, significance levels, and summary details for each treatment group.

Click into any experiment to view detailed results, including the following:

* Experiment metadata, such as:

  - Experiment name, owners, and tags
  - Start and end dates
  - Active targeting rule
  - Total number of exposures
  - Treatment group assignment counts and percentages

* Treatment comparison, including:

  - The baseline treatment (e.g. `off`)
  - One or more comparison treatments (e.g. `low`)

## Use AI Summarize

For faster interpretation of experiment outcomes, the Experiments page includes an **AI Summarize** button. This analyzes key and guardrail metric results to generate a summary of your experiment, making it easier to share results and next steps with your team.

![Experiment Summary](../../static/summarize.png)

The summary is broken into three sections:

* **Winner Analysis**: Highlights whether a clear winner emerged across key metrics and guardrails.
* **Overall Impact Summary**: Summarizes how the treatment impacted user behavior or business outcomes.
* **Next Steps Suggestion**: Recommends what to do next, whether to iterate, roll out, or revisit your setup.

## Manually recalculating metrics

You can manually run calculations on demand by clicking the **Recalculate** button. Recalculations can be run for key metrics only, or for all metrics (key, guardrail, and supporting). **Most recalculations take up to five minutes, but can take longer depending on the size of your data and the length of your experiment.**

Reasons you may choose to recalculate metrics:

* If you create or modify a metric after the last updated metric impact calculation, recalculate to get the latest results.
* If you assign a metric to the Key metrics or Supporting metrics groups, recalculate to populate results for those metrics.

The **Recalculate** button will be disabled when:

* **A forced recalculation is already scheduled or a calculation is in progress.** You can click the Recalculate button again as soon as the currently running calculation finishes.

## Concluding on interim data

Although we show the statistical results for multiple interim points, we caution against drawing conclusions from interim data. Each interim point at which the data is analyzed has its own chance of bringing a false positive result, so looking at more points brings more chance of a false positive. For more information about statistical significance and false positives, see [Statistical significance](/docs/feature-management-experimentation/release-monitoring/metrics/statistical-significance/).

If you were to look at all the p-values from the interim analysis points and claim a significant result if any of those were below your significance threshold, then you would have a substantially higher false positive rate than expected based on the threshold alone. For example, you would have far more than a 5% chance of seeing a falsely significant result when using a significance threshold of 0.05, if you concluded on any significant p-value shown in the metric details and trends view. This is because there are multiple chances for you to happen upon a time when the natural noise in the data happened to look like a real impact.

For this reason, it is good practice to only draw conclusions from your experiment at the predetermined conclusion point(s), such as at the end of the review period.
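To make the peeking risk concrete, here is a minimal sketch of how the chance of at least one false positive grows with the number of looks. It assumes, for simplicity, that each interim look is an independent test at the significance threshold, which is not strictly true for cumulative data, but it shows the direction of the effect:

```typescript
// Rough illustration only: assumes each interim look is an independent test
// at threshold `alpha`, which overstates independence for cumulative data.
const alpha = 0.05;

for (const looks of [1, 5, 10, 20]) {
  // Chance of at least one false positive across `looks` independent looks.
  const familywiseRate = 1 - Math.pow(1 - alpha, looks);
  console.log(`${looks} looks -> ~${(familywiseRate * 100).toFixed(1)}% chance of a false positive`);
}
// 1 look   -> ~5.0%
// 5 looks  -> ~22.6%
// 10 looks -> ~40.1%
// 20 looks -> ~64.2%
```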
### Interpreting the line chart and trends

The line chart provides a visualization of how the measured impact has changed since the beginning of the feature flag. This may be useful for gaining insights on any seasonality or for identifying any unexpected sudden changes in the performance of the treatments.

However, it is important to remember that there will naturally be noise and variation in the data, especially when the sample size is low at the beginning of a feature flag, so some differences in the measured impact over time are to be expected.

Additionally, since the data is cumulative, it may be expected that the impact changes as the run time of your feature flag increases. For example, the fraction of users who have done an event may be expected to increase over time simply because the users have had more time to do the action.

### Example interpretation

The image below shows the impact over time line chart for an example A/A test, a feature flag where there is no true difference between the performance of the treatments. Despite there being no difference between the treatments, and hence a constant true impact of zero, the line chart shows a large measured difference at the beginning, and an apparent trend upwards over time.

This is due only to noise in the data at the early stages of the feature flag when the sample size is low, and the measured impact moving towards the true value as more data arrives.

![Line Chart](../../static/line-chart.png)

Note also that in the chart above there are 3 calculation buckets for which the error margin is entirely below zero, and hence the p-values at those points in time would imply a statistically significant impact. This is again due to noise and the unavoidable chance of false positive results.

If you weren't aware of the risk of peeking at the data, or of considering multiple evaluations of your feature flag at different points in time, then you may have concluded that a meaningful impact had been detected. However, by following the recommended practice of concluding only at the predetermined end time of your feature flag, you would eventually have seen a statistically inconclusive result, as expected for an A/A test.

If you have questions or need help troubleshooting, contact [[email protected]](mailto:[email protected]).
Lines changed: 114 additions & 0 deletions
@@ -0,0 +1,114 @@
---
title: Warehouse Native Experimentation
id: index
slug: /feature-management-experimentation/warehouse-native
sidebar_label: Overview
sidebar_position: 1
description: Learn how to run experiments in your data warehouse using Harness Feature Management & Experimentation (FME).
---

<CTABanner
  buttonText="Request Access"
  title="Warehouse Native is in beta!"
  tagline="Get early access to run Harness FME experiments directly in your data warehouse."
  link="https://developer.harness.io/docs/feature-management-experimentation/fme-support"
  closable={true}
  target="_self"
/>

## Overview

Warehouse Native enables [experimentation](/docs/feature-management-experimentation/experimentation/setup/) workflows, from targeting and assignment to analysis, and provides a statistical engine for analyzing existing experiments with measurement tools in Harness Feature Management & Experimentation (FME).

## How Warehouse Native works

Warehouse Native runs experimentation jobs directly in your <Tooltip id="fme.warehouse-native.data-warehouse">data warehouse</Tooltip> by using your existing data to calculate metrics and enrich experiment analyses.

![](./static/data-flow.png)

The data model is designed around two primary types of data: **assignment data** and **performance/behavioral data**, which power the FME statistical engine in your warehouse.

Key components include:

- **Assignment data**: Tracks user or entity assignments to experiments. This includes metadata about the experiment.
- **Performance and behavioral data**: Captures metrics, events, and user behavior relevant to the experiment.
- **Experiment metadata**: Contains definitions for experiments, including the experiment ID, name, start/end dates, traffic allocation, and grouping logic.
- **Metric definitions**: Defines how metrics are computed in the warehouse, including aggregation logic and denominators. These definitions ensure analyses are standardized across experiments.
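To make these components more concrete, here is a minimal sketch of what an assignment record and a metric definition might contain. All field names are hypothetical illustrations, not the product's actual warehouse schema:

```typescript
// Hypothetical shapes only, for illustration; the actual tables and column
// names are defined when you set up assignment and metric sources.
interface AssignmentRecord {
  experimentId: string;  // which experiment the assignment belongs to
  entityId: string;      // user or account that was assigned
  treatment: string;     // e.g. "on" or "off"
  assignedAt: string;    // timestamp of first exposure
}

interface MetricDefinition {
  name: string;                          // e.g. "checkout_conversion"
  aggregation: 'count' | 'sum' | 'avg';  // how event values are rolled up
  eventTable: string;                    // performance/behavioral data source
  denominator: 'assigned_entities';      // what the aggregate is divided by
}
```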
### Cloud Experimentation

<Tooltip id="fme.warehouse-native.cloud-experimentation">Cloud Experiments</Tooltip> are executed and analyzed within Harness FME, which collects feature flag impressions and performance data from your application and integrations. For more information, see the [Cloud Experimentation documentation](/docs/feature-management-experimentation/experimentation).

```mermaid
flowchart LR
    %% Customer infrastructure
    subgraph CI["Customer Infrastructure"]
        direction TB
        subgraph APP["Your Application"]
            FME["FME SDK"]
            style FME fill:#9b5de5,stroke:#9b5de5,color:#fff
        end

        integrations["Integrations including Google Analytics, Segment, Sentry, mParticle, Amplitude, and Amazon S3"]
        style integrations fill:none,stroke:none,color:#fff
    end
    style CI fill:#8110B5,stroke:#8110B5,color:#fff

    %% Harness FME System
    subgraph HFM["Harness FME"]
        direction TB

        %% Horizontal input boxes without a subgraph
        FF["FME Feature Flags"]
        PD["Performance and behavioral data"]
        style FF fill:#9b5de5,stroke:#9b5de5,color:#fff
        style PD fill:#9b5de5,stroke:#9b5de5,color:#fff

        AE["FME Attribution Engine"]
        style AE fill:#9b5de5,stroke:#9b5de5,color:#fff

        %% Connect inputs to Attribution Engine
        FF --> AE
        PD --> AE
    end
    style HFM fill:#8110B5,stroke:#8110B5,color:#fff

    %% Arrows from Customer Infra to input boxes
    CI -- "Feature flag impression data" --> FF
    CI -- "Performance and additional event data" --> PD
```
### Warehouse Native

<Tooltip id="fme.warehouse-native.warehouse-native">Warehouse Native Experiments</Tooltip> are executed directly in your data warehouse, leveraging assignment and behavioral data from Harness FME to calculate metrics and run statistical analyses at scale.

```mermaid
flowchart LR
    subgraph DW["Data Warehouse"]
        style DW fill:#8110B5,stroke:#8110B5,color:#fff
        direction TB
        AF["Assignment and FME feature flag data"]
        PB["Performance and behavioral data"]
        AE["FME Attribution Engine"]
        style AF fill:#9b5de5,stroke:#9b5de5,color:#fff
        style PB fill:#9b5de5,stroke:#9b5de5,color:#fff
        style AE fill:#9b5de5,stroke:#9b5de5,color:#fff
    end

    subgraph HFME[" "]
        direction TB
        HFM["Harness FME"]
        PAD1[" "]:::invisible
        PAD2[" "]:::invisible
    end

    classDef invisible fill:none,stroke:none;
    style HFM fill:#8110B5,stroke:#8110B5,color:#fff

    DW --> HFM
```

## Get started

To get started, [connect a data warehouse](/docs/feature-management-experimentation/warehouse-native/integrations/) and set up [assignment and metric sources](/docs/feature-management-experimentation/warehouse-native/setup/) to enable Warehouse Native Experimentation in Harness FME.
