chore: datazone example documentation #790

Merged: 3 commits, Nov 15, 2024
1 change: 1 addition & 0 deletions .projenrc.ts
@@ -527,6 +527,7 @@ const datazoneMskGovernance = new awscdk.AwsCdkPythonApp({

datazoneMskGovernance.addGitIgnore('cdk.context.json');
datazoneMskGovernance.addGitIgnore('resources/flink/?');
datazoneMskGovernance.addGitIgnore('resources/flink/dependency-reduced-pom.xml');
datazoneMskGovernance.removeTask('deploy');
datazoneMskGovernance.removeTask('destroy');
datazoneMskGovernance.removeTask('diff');
1 change: 1 addition & 0 deletions examples/datazone-msk-governance/.gitignore

This file was deleted.

@@ -1,3 +1,6 @@
// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
// SPDX-License-Identifier: Apache-2.0

package com.amazonaws.services.msf;

import com.amazonaws.services.kinesisanalytics.runtime.KinesisAnalyticsRuntime;
@@ -1,3 +1,6 @@
// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
// SPDX-License-Identifier: Apache-2.0

package com.amazonaws.services.msf.openlineage;


@@ -1,3 +1,6 @@
// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
// SPDX-License-Identifier: Apache-2.0

package com.amazonaws.services.msf.openlineage;

import io.openlineage.client.OpenLineage;
@@ -1,3 +1,6 @@
// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
// SPDX-License-Identifier: Apache-2.0

package com.amazonaws.services.msf.openlineage;

import io.openlineage.client.transports.Transport;
@@ -1,3 +1,6 @@
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0

import logging
from kafka import KafkaConsumer
from aws_schema_registry import SchemaRegistryClient
@@ -1,3 +1,6 @@
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0

from consumer_factory import ConsumerFactory
import logging
from common import load_config
@@ -1,3 +1,6 @@
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0

from producer.producer_factory import ProducerFactory
import os

@@ -1,3 +1,6 @@
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0

import yaml
from kafka import KafkaProducer
from aws_msk_iam_sasl_signer import MSKAuthTokenProvider
@@ -1,3 +1,6 @@
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0

from openlineage.client import OpenLineageClient
from openlineage.client.run import RunEvent, RunState, Run, Job, Dataset
from openlineage.client.facet import SchemaDatasetFacet
@@ -1,3 +1,6 @@
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0

import logging
from typing import Dict, Any
from dataclasses import dataclass
@@ -1,3 +1,6 @@
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0

from aws_schema_registry import SchemaRegistryClient, Schema
from aws_schema_registry.adapter.kafka import KafkaSerializer
from aws_schema_registry.avro import AvroSchema
3 changes: 3 additions & 0 deletions examples/datazone-msk-governance/stacks/main.py
@@ -1,3 +1,6 @@
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0

from aws_cdk import (
BundlingOptions,
CfnParameterProps,
4 changes: 2 additions & 2 deletions website/docs/examples/streaming-governance.md
@@ -5,9 +5,9 @@ sidebar_label: Streaming governance with DataZone example

# Streaming governance with Amazon DataZone

-Govern Kafka topics from Amazon MSK in DataZone. Store metadata like lineage, schema and data format in DataZone business catalog. Subscribe to MSK topics via the DataZone subcription grant process.
+Govern Kafka topics from Amazon MSK in DataZone. Store metadata including data lineage, schema and data format in DataZone's business catalog. Subscribe to MSK topics using the DataZone subscription and fulfillment workflows.

-In this example, we will be using DSF on AWS to quickly build an end-to-end data governance for Kafka topic. The example provision resources to illustrate a scenario where a Lambda based Kafka producer publish a Kafka topic into DataZone to allow for discovery via its business catalog. The producer is using DSF on AWS to create the custom asset type in DataZone for MSK topics. DSF also provides a custom DataZone data source that automatically creates the asset in DataZone based on the producer Glue schema Registry. After the asset is created, a consumer can browse the DataZone catalog and request access to the MSK topic. When approved by the producer team, DSF custom authorizer automatically grants the consumer access to the MSK topic from a Flink application running on Amazon Managed Streaming for Apache Flink. Both producer and consumer are registering lineage information into DataZone.
+In this example, we will be using DSF on AWS to quickly build an end-to-end data governance solution for Kafka topics. The example provisions resources to illustrate a scenario where a Lambda based Kafka producer publishes a Kafka topic into Amazon DataZone to allow for discovery via its business catalog. The producer is using DSF on AWS to create the custom asset type in DataZone for MSK topics. DSF also provides a custom DataZone data source that automatically creates the asset in DataZone based on the producer's Glue schema Registry. After the asset is created, a consumer can browse the DataZone catalog and request access to the MSK topic. When approved by the producer team, DSF custom authorizer automatically grants the consumer access to the MSK topic from a Flink application running on Amazon Managed Streaming for Apache Flink. Both producer and consumer are registering lineage information into DataZone.

The AWS CDK application using the DSF on AWS contains a single stack which provisions the following DSF constructs:
* [`DataZoneMskAssetType`](../constructs/library/04-Governance/03-datazone-msk-asset-type.mdx)