Contributor

@rtrieu rtrieu commented Nov 21, 2025

What does this PR do? What is the motivation?

Reorg of existing Data Jobs docs:

  • Updates Data Jobs Monitoring on landing page to Data Observability
  • Creates new section called Data Observability
  • Moves Data Observability, renamed to Quality Monitoring
    • Datasets renamed to Data Quality
  • Moves Data Jobs Monitoring, renamed to Jobs Monitoring
  • Adds new integrations pages
  • Adds aliases for existing pages for redirects
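
The last bullet relies on front-matter aliases so that old URLs keep resolving after the move. A rough sketch of the idea, assuming each page's front matter is available as a dictionary. The `build_redirects` helper and the new Databricks URL are hypothetical; only the `/data_jobs/databricks` alias appears in this PR.

```python
def build_redirects(pages):
    """Map each alias (old URL) to the page that now serves it.

    Raises ValueError if two pages claim the same alias, because the
    redirect target would then be ambiguous.
    """
    redirects = {}
    for new_url, front_matter in pages.items():
        for alias in front_matter.get("aliases", []):
            if alias in redirects:
                raise ValueError(f"alias {alias} is claimed by multiple pages")
            redirects[alias] = new_url
    return redirects


# The new URL below is an assumed location, used only for illustration.
pages = {
    "/data_observability/jobs_monitoring/databricks": {
        "aliases": ["/data_jobs/databricks"],
    },
}
print(build_redirects(pages))
```

A check like this could flag a renamed page that lost its alias, or two pages competing for the same old URL, before the preview build.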

See also: #33006 (comment)

Merge instructions

Merge readiness:

  • Ready for merge

For Datadog employees:

Your branch name MUST follow the <name>/<description> convention and include the forward slash (/). Without this format, your pull request will not pass CI, the GitLab pipeline will not run, and you won't get a branch preview. Getting a branch preview makes it easier for us to check any issues with your PR, such as broken links.

If your branch doesn't follow this format, rename it or create a new branch and PR.

[6/5/2025] Merge queue has been disabled on the documentation repo. If you have write access to the repo, the PR has been reviewed by a Documentation team member, and all of the required checks have passed, you can use the Squash and Merge button to merge the PR. If you don't have write access, or you need help, reach out in the #documentation channel in Slack.

Additional notes

@rtrieu rtrieu added the WORK IN PROGRESS No review needed, it's a wip ;) label Nov 21, 2025
@rtrieu rtrieu requested review from a team as code owners November 21, 2025 20:48
@github-actions github-actions bot added the Architecture Everything related to the Doc backend label Nov 21, 2025
@rtrieu rtrieu changed the title Rtrieu/docs 12615 update data obs [DOCS-12615] Data Observability reorg Nov 21, 2025
Contributor

github-actions bot commented Nov 21, 2025

Preview links (active after the build_preview check completes)

New or renamed files

Removed or renamed files (these should redirect)

Renamed files

Modified Files

@github-actions github-actions bot added the Guide Content impacting a guide label Nov 21, 2025
@github-actions github-actions bot added the Images Images are added/removed with this PR label Dec 12, 2025
@rtrieu rtrieu added editorial review Waiting on a more in-depth review and removed WORK IN PROGRESS No review needed, it's a wip ;) labels Dec 15, 2025
Contributor Author

rtrieu commented Dec 15, 2025

Created DOCS-12912 for docs review.

@rtrieu rtrieu requested a review from kevinzenghu December 15, 2025 21:06
Contributor

@janine-c janine-c left a comment

This was an ASTOUNDING amount of work, Rosa, well done!! Thanks for your patience for the review. All of my comments are really tiny and could easily be punted to a fast-follow, since I know your PMs want to get this released soon 🙂

@@ -1,44 +1,36 @@
---
title: Data Observability
Contributor

@kevinzenghu this page needs to be updated to explain the suite-level overview and what you get in Quality and Jobs. Right now it mostly indexes on Quality only.

Contributor Author

@kevinzenghu

@warrierr and I just caught up on organization, specifically on whether to have a separate Integrations section. Originally I imagined a separate Integrations section would reduce redundancy across SKUs (especially as we introduce more) and encourage the cross-product suite message. But I'm convinced now that it's secondary to having a simple and clear onboarding experience for specific products.

We will probably have to change things in the future, but for now (and we're sorry for the late structure change @Rosa Trieu 😬) how about we go with this change?

I'll also leave this as a comment in the PR and change the doc

CleanShot 2025-12-18 at 17 40 16@2x

@rtrieu rtrieu requested a review from warrierr December 23, 2025 20:13
</div>

<div class="col">
<a class="card h-100" href="/data_observability/jobs_monitoring/dbtcore">
Contributor

I see we don't have two different logos for dbt Core and dbt Cloud, which is confusing. Can we combine these into one dbt page, instead of two, with different tabs for the setup steps per platform (like we have for Airflow)? So there would be a "dbt Core" setup tab and one for "dbt Cloud". The overview/next sections are similar between the two anyway.

Screenshot 2025-12-24 at 1 14 09 PM Screenshot 2025-12-24 at 1 07 15 PM

identifier: jobs_monitoring_airflow
parent: data_jobs
weight: 200000
- name: dbt Core
Contributor

Let's just make this a single dbt page with setup for both core and cloud (see other comment with more details)

- name: Spark on Amazon EMR
url: data_observability/jobs_monitoring/emr
identifier: jobs_monitoring_emr
parent: transformation_integrations
Contributor

Suggested change
parent: transformation_integrations
parent: data_jobs
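
The suggestion above keeps the EMR entry under the same `data_jobs` parent as the other Jobs Monitoring entries quoted in this review. A minimal sketch of how a stray parent could be spotted, assuming the menu entries are available as plain dictionaries. The `find_parent_mismatches` helper and the data layout are hypothetical and not part of the docs tooling.

```python
from collections import Counter

# Sample entries modeled on the menu items quoted in this review.
# The dictionary layout is an assumption for illustration only.
MENU_ENTRIES = [
    {"identifier": "jobs_monitoring_airflow", "parent": "data_jobs"},
    {"identifier": "jobs_monitoring_emr", "parent": "transformation_integrations"},
    {"identifier": "jobs_monitoring_dataproc", "parent": "data_jobs"},
]


def find_parent_mismatches(entries, prefix="jobs_monitoring_"):
    """Return identifiers whose parent differs from the most common parent
    among entries that share the given identifier prefix."""
    group = [e for e in entries if e["identifier"].startswith(prefix)]
    if not group:
        return []
    expected, _ = Counter(e["parent"] for e in group).most_common(1)[0]
    return [e["identifier"] for e in group if e["parent"] != expected]


print(find_parent_mismatches(MENU_ENTRIES))
```

The majority-vote heuristic is only one possible design; an explicit allowlist of valid parents would be stricter.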

identifier: jobs_monitoring_dataproc
parent: data_jobs
weight: 700000
- name: Custom Jobs
Contributor

Suggested change
- name: Custom Jobs
- name: Custom Jobs (OpenLineage)


## Further reading

{{< partial name="whats-next/whats-next.html" >}}
Contributor

aliases:
- /data_jobs/databricks
further_reading:
- link: '/data_jobs'
Contributor

Let's remove the preview callout for Databricks serverless on this page

Contributor

We can also make the first sentence on this page:

Data Jobs Monitoring gives visibility into the performance and reliability of your Databricks jobs and workflows running on clusters or serverless compute.

---

{{< callout url="#" btn_hidden="true" header="Data Jobs Monitoring for Apache Airflow is in Preview" >}}
{{< callout url="#" btn_hidden="true" header="Data Jobs Monitoring for Apache Airflow is in preview" >}}
Contributor

Let's remove the preview callout from this page as part of the GA

</div>
<div class="row row-cols-1 row-cols-md-4 g-2 g-xl-3 justify-content-sm-center">

<div class="col">
Contributor

@rtrieu can we reorganize this setup section now that we have more technologies that aren't Spark and the logos don't make that clear?

Specifically, I'm thinking:

Data Jobs Monitoring supports multiple job technologies. To get started, select your technology and follow the installation instructions:

- Logo list of Databricks, Airflow, dbt

Apache Spark jobs on the following platforms:
- Logo list of K8s, EMR, Google Dataproc

Screenshot 2025-12-24 at 1 33 43 PM

- [OpenLineage Python client (HTTP transport)](#option-2-openlineage-python-client-http-transport)
- [OpenLineage Python client (Datadog transport)](#option-3-openlineage-python-client-datadog-transport)

## Option 1: Direct HTTP with curl
Contributor

Can we clean this up with the tab-based approach for showing each install option, similar to what we have for Airflow? (A user only needs to choose one.) It also means we don't have to explicitly say "Option X".

Screenshot 2025-12-24 at 1 07 15 PM

Labels

Architecture Everything related to the Doc backend editorial review Waiting on a more in-depth review Guide Content impacting a guide Images Images are added/removed with this PR

6 participants