[DOCS-12615] Data Observability reorg #33006
base: master
Changes from all commits
a3ae6f9
52ccc7e
19f9f6a
6530a24
bcd778b
6af1b8b
a8fa4f1
befa985
a52c96d
be0b024
5aa4c85
fdf8020
5c48b4c
7abfc3e
9905234
2a78cf6
42282db
5a4987d
75acb38
56dfc52
1d554e5
c155880
b5522bb
c733103
34563a6
cc637d7
36aea7d
14dc8e8
362bcc8
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -21,24 +21,27 @@ menu: | |||||
| - name: Application Performance | ||||||
| identifier: apm_heading | ||||||
| weight: 7000000 | ||||||
| - name: Data Observability | ||||||
| identifier: data_observability_heading | ||||||
| weight: 8000000 | ||||||
| - name: Digital Experience | ||||||
| identifier: digital_experience_heading | ||||||
| weight: 8000000 | ||||||
| weight: 9000000 | ||||||
| - name: Software Delivery | ||||||
| identifier: software_delivery_heading | ||||||
| weight: 9000000 | ||||||
| weight: 10000000 | ||||||
| - name: Security | ||||||
| identifier: security_platform_heading | ||||||
| weight: 10000000 | ||||||
| weight: 11000000 | ||||||
| - name: AI Observability | ||||||
| identifier: ai_observability_heading | ||||||
| weight: 11000000 | ||||||
| weight: 12000000 | ||||||
| - name: Log Management | ||||||
| identifier: log_management_heading | ||||||
| weight: 12000000 | ||||||
| weight: 13000000 | ||||||
| - name: Administration | ||||||
| identifier: administration_heading | ||||||
| weight: 13000000 | ||||||
| weight: 14000000 | ||||||
| - name: Getting Started | ||||||
| identifier: getting_started | ||||||
| url: getting_started/ | ||||||
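Note on the hunk above: inserting the new `data_observability_heading` at weight 8000000 shifts every later heading up by 1000000. Hugo orders menu entries by ascending `weight`, so two headings sharing a weight would leave their relative order to a tie-break. The following is a minimal sketch of a collision check, assuming the headings are loaded into a list of dicts; the helper name `find_weight_collisions` is hypothetical and not part of the docs repository.

```python
# Heading weights as they stand after this change (values copied from the hunk above).
headings = [
    {"identifier": "apm_heading", "weight": 7000000},
    {"identifier": "data_observability_heading", "weight": 8000000},
    {"identifier": "digital_experience_heading", "weight": 9000000},
    {"identifier": "software_delivery_heading", "weight": 10000000},
    {"identifier": "security_platform_heading", "weight": 11000000},
    {"identifier": "ai_observability_heading", "weight": 12000000},
    {"identifier": "log_management_heading", "weight": 13000000},
    {"identifier": "administration_heading", "weight": 14000000},
]


def find_weight_collisions(entries):
    """Return {weight: [identifiers]} for every weight used by more than one entry."""
    by_weight = {}
    for entry in entries:
        by_weight.setdefault(entry["weight"], []).append(entry["identifier"])
    return {weight: ids for weight, ids in by_weight.items() if len(ids) > 1}


if __name__ == "__main__":
    # An empty dict means every heading has a distinct weight.
    print(find_weight_collisions(headings))
```

Running the same check against a version where only the new heading was added, without renumbering the others, would report 8000000 as shared by `data_observability_heading` and `digital_experience_heading`.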
|
|
@@ -5067,23 +5070,114 @@ menu: | |||||
| identifier: data_streams_metrics_and_tags | ||||||
| parent: data_streams | ||||||
| weight: 5 | ||||||
| - name: Data Jobs Monitoring | ||||||
| url: data_jobs/ | ||||||
| pre: data-jobs-monitoring | ||||||
| identifier: data_jobs | ||||||
| parent: apm_heading | ||||||
| weight: 60000 | ||||||
| - name: Data Observability | ||||||
| - name: Data Observability Overview | ||||||
| url: data_observability/ | ||||||
| pre: inventories | ||||||
| identifier: data_observability | ||||||
| parent: apm_heading | ||||||
| parent: data_observability_heading | ||||||
| weight: 60000 | ||||||
| - name: Quality Monitoring | ||||||
| url: data_observability/quality_monitoring/ | ||||||
| pre: check-light-wui | ||||||
| identifier: quality_monitoring | ||||||
| parent: data_observability_heading | ||||||
| weight: 70000 | ||||||
| - name: Datasets | ||||||
| url: data_observability/datasets | ||||||
| identifier: datasets | ||||||
| parent: data_observability | ||||||
| - name: Data Warehouses | ||||||
| url: data_observability/quality_monitoring/data_warehouses | ||||||
| identifier: data_warehouses | ||||||
| parent: quality_monitoring | ||||||
| weight: 200000 | ||||||
| - name: Snowflake | ||||||
| url: data_observability/quality_monitoring/data_warehouses/snowflake | ||||||
| identifier: snowflake | ||||||
| parent: data_warehouses | ||||||
| weight: 1000000 | ||||||
| - name: Databricks | ||||||
| url: data_observability/quality_monitoring/data_warehouses/databricks | ||||||
| identifier: databricks | ||||||
| parent: data_warehouses | ||||||
| weight: 2000000 | ||||||
| - name: BigQuery | ||||||
| url: data_observability/quality_monitoring/data_warehouses/bigquery | ||||||
| identifier: bigquery | ||||||
| parent: data_warehouses | ||||||
| weight: 3000000 | ||||||
| - name: Business Intelligence Integrations | ||||||
| url: data_observability/quality_monitoring/business_intelligence | ||||||
| identifier: business_intelligence_integrations | ||||||
| parent: quality_monitoring | ||||||
| weight: 300000 | ||||||
| - name: Tableau | ||||||
| url: data_observability/quality_monitoring/business_intelligence/tableau | ||||||
| identifier: business_intelligence_integrations_tableau | ||||||
| parent: business_intelligence_integrations | ||||||
| weight: 1000000 | ||||||
| - name: Sigma | ||||||
| url: data_observability/quality_monitoring/business_intelligence/sigma | ||||||
| identifier: business_intelligence_integrations_sigma | ||||||
| parent: business_intelligence_integrations | ||||||
| weight: 2000000 | ||||||
| - name: Metabase | ||||||
| url: data_observability/quality_monitoring/business_intelligence/metabase | ||||||
| identifier: business_intelligence_integrations_metabase | ||||||
| parent: business_intelligence_integrations | ||||||
| weight: 3000000 | ||||||
| - name: Power BI | ||||||
| url: data_observability/quality_monitoring/integrations/business_intelligence/powerbi | ||||||
| identifier: business_intelligence_integrations_powerbi | ||||||
| parent: business_intelligence_integrations | ||||||
| weight: 4000000 | ||||||
| - name: Jobs Monitoring | ||||||
| url: data_observability/jobs_monitoring/ | ||||||
| pre: data-jobs-monitoring | ||||||
| identifier: data_jobs | ||||||
| parent: data_observability_heading | ||||||
| weight: 80000 | ||||||
| - name: Databricks | ||||||
| url: data_observability/jobs_monitoring/databricks | ||||||
| identifier: jobs_monitoring_databricks | ||||||
| parent: data_jobs | ||||||
| weight: 100000 | ||||||
| - name: Airflow | ||||||
| url: data_observability/jobs_monitoring/airflow | ||||||
| identifier: jobs_monitoring_airflow | ||||||
| parent: data_jobs | ||||||
| weight: 200000 | ||||||
| - name: dbt Core | ||||||
|
Contributor
Let's just make this a single dbt page with setup for both core and cloud (see other comment with more details) |
||||||
| url: data_observability/jobs_monitoring/dbtcore | ||||||
| identifier: jobs_monitoring_dbtcore | ||||||
| parent: data_jobs | ||||||
| weight: 300000 | ||||||
| - name: dbt Cloud | ||||||
| url: data_observability/jobs_monitoring/dbtcloud | ||||||
| identifier: jobs_monitoring_dbtcloud | ||||||
| parent: data_jobs | ||||||
| weight: 400000 | ||||||
| - name: Spark on Kubernetes | ||||||
| url: data_observability/jobs_monitoring/kubernetes | ||||||
| identifier: jobs_monitoring_kubernetes | ||||||
| parent: data_jobs | ||||||
| weight: 500000 | ||||||
| - name: Spark on Amazon EMR | ||||||
| url: data_observability/jobs_monitoring/emr | ||||||
| identifier: jobs_monitoring_emr | ||||||
| parent: transformation_integrations | ||||||
|
Contributor
Suggested change
|
||||||
| weight: 600000 | ||||||
| - name: Spark on Google Dataproc | ||||||
| url: data_observability/jobs_monitoring/dataproc | ||||||
| identifier: jobs_monitoring_dataproc | ||||||
| parent: data_jobs | ||||||
| weight: 700000 | ||||||
| - name: Custom Jobs | ||||||
|
Contributor
Suggested change
|
||||||
| url: data_observability/jobs_monitoring/openlineage | ||||||
| identifier: openlineage_integrations | ||||||
| parent: data_jobs | ||||||
| weight: 800000 | ||||||
| - name: Datadog Agent for OpenLineage Proxy | ||||||
| url: data_observability/jobs_monitoring/openlineage/datadog_agent_for_openlineage | ||||||
| identifier: openlineage_datadog_agent_for_openlineage | ||||||
| parent: openlineage_integrations | ||||||
| weight: 1000000 | ||||||
| - name: LLM Observability | ||||||
| url: llm_observability/ | ||||||
| pre: llm-observability | ||||||
|
|
||||||
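Note on the hunk above: most new Jobs Monitoring entries use `parent: data_jobs`, while the Spark on Amazon EMR entry uses `parent: transformation_integrations`, an identifier that is not defined anywhere in the visible hunks (it may exist elsewhere in the file). The following is a minimal sketch of a check for parents that point at no known identifier, assuming the entries are loaded into a list of dicts; `find_dangling_parents` and the `known_identifiers` parameter are hypothetical names used for illustration.

```python
def find_dangling_parents(entries, known_identifiers=()):
    """Return (identifier, parent) pairs whose parent is not a defined identifier.

    known_identifiers lists identifiers defined outside the entries being checked,
    such as the top-level headings.
    """
    defined = {entry["identifier"] for entry in entries} | set(known_identifiers)
    return [
        (entry["identifier"], entry["parent"])
        for entry in entries
        if "parent" in entry and entry["parent"] not in defined
    ]


# A subset of the entries from the hunk above.
entries = [
    {"identifier": "data_jobs", "parent": "data_observability_heading"},
    {"identifier": "jobs_monitoring_emr", "parent": "transformation_integrations"},
    {"identifier": "jobs_monitoring_dataproc", "parent": "data_jobs"},
]

if __name__ == "__main__":
    for identifier, parent in find_dangling_parents(
        entries, known_identifiers=("data_observability_heading",)
    ):
        print(f"{identifier}: parent '{parent}' is not defined")
```

A result for `jobs_monitoring_emr` here only means the parent is missing from the entries that were checked, so the full menu file should be passed in before treating it as an error.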
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2,57 +2,48 @@ | |
| title: Data Observability | ||
|
Contributor
@kevinzenghu this page needs to be updated to explain suite level overview + what you get in quality and jobs. Right now it mostly indexes only on quality
Contributor
Author
check out this commit: 36aea7d#diff-962ee3dc7e8fcec564f2759e0b618688f539ccacafd059f1a0d1cb1a9eff1c35 |
||
| description: "Monitor data quality, performance, and cost with Data Observability to detect anomalies, analyze data lineage, and prevent issues affecting downstream systems." | ||
| further_reading: | ||
| - link: '/data_observability/datasets' | ||
| - link: '/data_observability/quality_monitoring/' | ||
| tag: 'Documentation' | ||
| text: 'Datasets' | ||
| - link: '/data_jobs' | ||
| text: 'Quality Monitoring' | ||
| - link: '/data_observability/jobs_monitoring' | ||
| tag: 'Documentation' | ||
| text: 'Data Jobs Monitoring' | ||
| - link: '/data_streams' | ||
| tag: 'Documentation' | ||
| text: 'Data Streams Monitoring' | ||
| - link: '/database_monitoring' | ||
| tag: 'Documentation' | ||
| text: 'Database Monitoring' | ||
| text: 'Jobs Monitoring' | ||
| - link: 'https://www.datadoghq.com/about/latest-news/press-releases/datadog-metaplane-aquistion/' | ||
| tag: 'Blog' | ||
| text: 'Datadog Brings Observability to Data Teams by Acquiring Metaplane' | ||
| text: 'Datadog Brings Observability to Data teams by Acquiring Metaplane' | ||
| --- | ||
|
|
||
| <div class="alert alert-info">Data Observability is in Preview.</div> | ||
|
|
||
| Data Observability helps data teams detect, resolve, and prevent issues that impact data quality, performance, and cost. It enables teams to monitor anomalies, troubleshoot faster, and maintain trust in the data powering downstream systems. | ||
| Data Observability helps data teams detect, resolve, and prevent issues that affect data quality, performance, and cost. It enables teams to monitor anomalies, troubleshoot faster, and maintain trust in the data powering downstream systems. | ||
|
|
||
| {{< img src="data_observability/data_observability_overview.png" alt="Lineage graph showing a failed Spark job upstream of a Snowflake table with an alert and four downstream nodes labeled Upstream issue." style="width:100%;" >}} | ||
| {{< img src="data_observability/data-obs-overview-1.png" alt="Lineage graph showing a failed application upstream." style="width:100%;" >}} | ||
|
|
||
| Datadog makes this possible by monitoring key signals across your data stack, including metrics, metadata, lineage, and logs. These signals help detect issues early and support reliable, high-quality data. | ||
| Data Observability consists of two products: | ||
|
|
||
| ## Key capabilities | ||
| - **[Quality Monitoring][3]**: Detect anomalies in your tables, including freshness delays, volume changes, and unexpected column-level metric shifts. | ||
| - **[Jobs Monitoring][4]**: Track the performance, reliability, and cost of data processing jobs across platforms like Spark, Databricks, and Airflow. | ||
|
|
||
| With Data Observability, you can: | ||
| Both products share end-to-end lineage, letting you trace data dependencies and correlate issues across your stack. | ||
|
|
||
| - Detect anomalies in volume, freshness, null rates, and distributions | ||
| - Analyze lineage to trace data dependencies from source to dashboard | ||
| - Integrate with pipelines to correlate issues with job runs, data streams, and infrastructure events | ||
| ## Quality Monitoring | ||
|
|
||
| ## Monitor data quality | ||
| Quality Monitoring tracks metrics and metadata across your tables to detect issues before they impact downstream systems: | ||
|
|
||
| {{< img src="data_observability/data_observability_lineage_quality.png" alt="Lineage graph centered on the quoted_pricing Snowflake table with an alert on a pricing metric and sidebar charts for freshness, row count, and size." style="width:100%;" >}} | ||
| - **Data metrics**: Null count, null percentage, uniqueness, mean, and standard deviation | ||
| - **Metadata**: Schema, row count, and freshness | ||
|
|
||
| Datadog continuously tracks metrics and metadata, including: | ||
| Configure static thresholds or use automatic anomaly detection to catch missing updates, unexpected row count changes, and metric outliers. | ||
|
|
||
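Note on the Quality Monitoring text above: the listed data metrics can be illustrated on a single column of values. This is a rough sketch only, not how Datadog computes them. It assumes uniqueness means distinct non-null values divided by non-null values and uses the population standard deviation; the names `column_metrics` and `breaches_threshold` are hypothetical.

```python
from statistics import mean, pstdev


def column_metrics(values):
    """Compute illustrative quality metrics for one numeric column (None = null)."""
    present = [v for v in values if v is not None]
    total = len(values)
    null_count = total - len(present)
    return {
        "null_count": null_count,
        "null_percentage": 100.0 * null_count / total if total else 0.0,
        # Assumption: uniqueness = distinct non-null values / non-null values.
        "uniqueness": len(set(present)) / len(present) if present else 0.0,
        "mean": mean(present) if present else None,
        # Assumption: population standard deviation.
        "stddev": pstdev(present) if present else None,
    }


def breaches_threshold(metrics, name, maximum):
    """Static threshold check: True when the named metric exceeds the maximum."""
    return metrics[name] > maximum


if __name__ == "__main__":
    metrics = column_metrics([1, 2, 2, None])
    print(metrics)
    print(breaches_threshold(metrics, "null_percentage", 10.0))
```

A static threshold like the one in `breaches_threshold` needs a hand-picked limit per metric, which is the trade-off against the automatic anomaly detection the page mentions.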
| - Data metrics such as null count, null percentage, uniqueness, mean, and standard deviation | ||
| - Metadata such as schema, row count, and freshness | ||
| ## Jobs Monitoring | ||
|
|
||
| You can configure static thresholds or rely on automatic anomaly detection to identify unexpected changes, including: | ||
| Jobs Monitoring provides visibility into data processing jobs across your accounts and workspaces: | ||
|
|
||
| - Missing or delayed updates | ||
| - Unexpected row count changes | ||
| - Outliers in key metrics | ||
| - **Performance**: Track job duration, resource utilization, and identify inefficiencies like high idle CPU | ||
| - **Reliability**: Receive alerts when jobs fail or exceed expected completion times | ||
| - **Troubleshooting**: Analyze execution details, stack traces, and compare runs to identify issues | ||
|
|
||
| ## Trace lineage and understand impact | ||
|
|
||
| {{< img src="data_observability/data_observability_lineage_trace.png" alt="Lineage graph tracing data flow from Kafka through a failed Spark job to a Snowflake table with an alert and four downstream nodes labeled Upstream issue." style="width:100%;" >}} | ||
| {{< img src="data_observability/data-obs-lineage-blurred.png" alt="Lineage graph tracing data flow from Kafka through a failed Spark job to a Snowflake table with an alert and four downstream nodes labeled Upstream issue." style="width:100%;" >}} | ||
|
|
||
| Data Observability provides end-to-end lineage, helping you: | ||
|
|
||
|
|
@@ -62,7 +53,7 @@ Data Observability provides end-to-end lineage, helping you: | |
|
|
||
| ## Correlate with pipeline and infrastructure activity | ||
|
|
||
| {{< img src="data_observability/data_observability_pipeline_infra_correlation.png" alt="Lineage graph showing a failed Spark job with a missing S3 path error, plus a side panel with job run stats and duration trends." style="width:100%;" >}} | ||
| {{< img src="data_observability/data-obs-correlate-trace.png" alt="Lineage graph showing a failed Spark job with a missing S3 path error, plus a side panel with job run stats and duration trends." style="width:100%;" >}} | ||
|
|
||
| Understand how pipeline activity and infrastructure events impact your data. Datadog ingests logs and metadata from pipeline tools and user interactions to provide context for data quality issues, including: | ||
|
|
||
|
|
@@ -71,6 +62,21 @@ Understand how pipeline activity and infrastructure events impact your data. Dat | |
|
|
||
| This operational context helps you trace the source of data incidents and respond faster. | ||
|
|
||
| ## Required permissions | ||
|
|
||
| Data Observability requires the `integrations_read` permission to read integrations in your account and dynamically render content. Without this permission, you see a permissions screen instead of the app. | ||
|
|
||
| This permission is included in the [Datadog Standard Role][1]. If your current role doesn't include it, add `integrations_read` to your role, then refresh the page. | ||
|
|
||
| ## IP allowlists | ||
|
|
||
| If your organization enforces IP allowlists, add the IPs listed under the `webhooks` section of this [webhooks.json][2] file to your allowlist. | ||
|
|
||
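Note on the IP allowlists text above: the prefixes can be pulled from the `webhooks` section of the linked file with a short script. This is a minimal sketch that assumes the section holds `prefixes_ipv4` and `prefixes_ipv6` lists; those key names are an assumption to verify against the live file, and the function names are hypothetical.

```python
import json
from urllib.request import urlopen

WEBHOOKS_URL = "https://ip-ranges.datadoghq.com/webhooks.json"


def webhook_prefixes(document):
    """Collect CIDR prefixes from the `webhooks` section of a parsed document.

    Assumption: the section contains `prefixes_ipv4` and `prefixes_ipv6` lists.
    Missing keys are treated as empty lists.
    """
    section = document.get("webhooks", {})
    return section.get("prefixes_ipv4", []) + section.get("prefixes_ipv6", [])


def fetch_webhook_prefixes(url=WEBHOOKS_URL, timeout=10):
    """Download the file and return its webhook prefixes (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return webhook_prefixes(json.load(response))


if __name__ == "__main__":
    for prefix in fetch_webhook_prefixes():
        print(prefix)
```

Keeping the parsing in `webhook_prefixes` separate from the download makes it possible to check the logic against a saved copy of the file without network access.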
| ## Further reading | ||
|
|
||
| {{< partial name="whats-next/whats-next.html" >}} | ||
| {{< partial name="whats-next/whats-next.html" >}} | ||
|
|
||
| [1]: /account_management/rbac/?tab=datadogapplication#datadog-default-roles | ||
| [2]: https://ip-ranges.datadoghq.com/webhooks.json | ||
| [3]: /data_observability/quality_monitoring/ | ||
| [4]: /data_observability/jobs_monitoring/ | ||