4 changes: 4 additions & 0 deletions datahub-web-react/src/app/ingest/source/builder/constants.ts
@@ -1,5 +1,6 @@
import athenaLogo from '@images/awsathenalogo.png';
import azureLogo from '@images/azure-ad.png';
import azureDataFactoryLogo from '@images/azuredatafactorylogo.svg';
import bigqueryLogo from '@images/bigquerylogo.png';
import cassandraLogo from '@images/cassandralogo.png';
import clickhouseLogo from '@images/clickhouselogo.png';
@@ -50,6 +51,8 @@ export const ATHENA = 'athena';
export const ATHENA_URN = `urn:li:dataPlatform:${ATHENA}`;
export const AZURE = 'azure-ad';
export const AZURE_URN = `urn:li:dataPlatform:${AZURE}`;
export const AZURE_DATA_FACTORY = 'azure-data-factory';
export const AZURE_DATA_FACTORY_URN = `urn:li:dataPlatform:${AZURE_DATA_FACTORY}`;
export const BIGQUERY = 'bigquery';
export const BIGQUERY_USAGE = 'bigquery-usage';
export const BIGQUERY_BETA = 'bigquery-beta';
@@ -162,6 +165,7 @@ export const STREAMLIT_URN = `urn:li:dataPlatform:${STREAMLIT}`;
export const PLATFORM_URN_TO_LOGO = {
[ATHENA_URN]: athenaLogo,
[AZURE_URN]: azureLogo,
[AZURE_DATA_FACTORY_URN]: azureDataFactoryLogo,
[BIGQUERY_URN]: bigqueryLogo,
[CLICKHOUSE_URN]: clickhouseLogo,
[COCKROACHDB_URN]: cockroachdbLogo,
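The hunks above register the new platform in three places: a logo import, an ID/URN constant pair, and an entry in `PLATFORM_URN_TO_LOGO`. A minimal sketch of how those pieces fit together at lookup time follows. The asset paths are stand-in strings rather than bundler-resolved imports, and `logoForPlatform` is a hypothetical helper for illustration, not something this PR adds.

```typescript
// Stand-in asset paths; in constants.ts these are image imports resolved by the bundler.
const azureLogo = '/images/azure-ad.png';
const azureDataFactoryLogo = '/images/azuredatafactorylogo.svg';

// Platform ID and URN pairs, following the convention used in the diff.
const AZURE = 'azure-ad';
const AZURE_URN = `urn:li:dataPlatform:${AZURE}`;
const AZURE_DATA_FACTORY = 'azure-data-factory';
const AZURE_DATA_FACTORY_URN = `urn:li:dataPlatform:${AZURE_DATA_FACTORY}`;

// Trimmed-down copy of the URN-to-logo map with only two entries.
const PLATFORM_URN_TO_LOGO: Record<string, string> = {
    [AZURE_URN]: azureLogo,
    [AZURE_DATA_FACTORY_URN]: azureDataFactoryLogo,
};

// Hypothetical helper: returns undefined for platforms with no registered logo.
function logoForPlatform(urn: string): string | undefined {
    return PLATFORM_URN_TO_LOGO[urn];
}
```

If the map entry were omitted, the lookup for the new URN would return `undefined`, which is why the constant and the map entry are added in the same change.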
@@ -1,5 +1,6 @@
import athenaLogo from '@images/awsathenalogo.png';
import azureLogo from '@images/azure-ad.png';
import azureDataFactoryLogo from '@images/azuredatafactorylogo.svg';
import bigqueryLogo from '@images/bigquerylogo.png';
import cassandraLogo from '@images/cassandralogo.png';
import clickhouseLogo from '@images/clickhouselogo.png';
@@ -48,6 +49,8 @@ export const ATHENA = 'athena';
export const ATHENA_URN = `urn:li:dataPlatform:${ATHENA}`;
export const AZURE = 'azure-ad';
export const AZURE_URN = `urn:li:dataPlatform:${AZURE}`;
export const AZURE_DATA_FACTORY = 'azure-data-factory';
export const AZURE_DATA_FACTORY_URN = `urn:li:dataPlatform:${AZURE_DATA_FACTORY}`;
export const BIGQUERY = 'bigquery';
export const BIGQUERY_BETA = 'bigquery-beta';
export const BIGQUERY_URN = `urn:li:dataPlatform:${BIGQUERY}`;
@@ -155,6 +158,7 @@ export const SNAPLOGIC_URN = `urn:li:dataPlatform:${SNAPLOGIC}`;
export const PLATFORM_URN_TO_LOGO = {
[ATHENA_URN]: athenaLogo,
[AZURE_URN]: azureLogo,
[AZURE_DATA_FACTORY_URN]: azureDataFactoryLogo,
[BIGQUERY_URN]: bigqueryLogo,
[CLICKHOUSE_URN]: clickhouseLogo,
[COCKROACHDB_URN]: cockroachdbLogo,
1 change: 1 addition & 0 deletions datahub-web-react/src/images/azuredatafactorylogo.svg
80 changes: 80 additions & 0 deletions metadata-ingestion/docs/sources/azure-data-factory/README.md
@@ -0,0 +1,80 @@
# Azure Data Factory

For context on getting started with ingestion, check out our [metadata ingestion guide](../../../../metadata-ingestion/README.md).

## Setup

To install this plugin, run `pip install 'acryl-datahub[azure-data-factory]'`.

## Quickstart Recipe

```yaml
source:
  type: azure-data-factory
  config:
    # Required
    subscription_id: ${AZURE_SUBSCRIPTION_ID}

    # Authentication (service principal)
    credential:
      authentication_method: service_principal
      client_id: ${AZURE_CLIENT_ID}
      client_secret: ${AZURE_CLIENT_SECRET}
      tenant_id: ${AZURE_TENANT_ID}

    # Optional filters
    factory_pattern:
      allow: ["prod-.*"]

    # Features
    include_lineage: true
    include_execution_history: false

    env: PROD

sink:
  type: datahub-rest
  config:
    server: "http://localhost:8080"
```

## Authentication Methods

| Method | Config Value | Use Case |
| ----------------- | ------------------- | ----------------- |
| Service Principal | `service_principal` | Production |
| Managed Identity | `managed_identity` | Azure-hosted |
| Azure CLI | `cli` | Local development |
| Auto-detect | `default` | Flexible |
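
The four config values in the table can be modeled as a closed set, which makes it easy to catch an incomplete `credential` block before running a recipe. The sketch below is illustrative only: the `Credential` shape mirrors the `credential.*` fields documented in this README, but the rule that `service_principal` needs all three of `client_id`, `client_secret`, and `tenant_id` is an assumption, and `missingCredentialFields` is a hypothetical helper rather than part of the connector.

```typescript
// Values taken from the "Config Value" column above.
type AuthMethod = 'service_principal' | 'managed_identity' | 'cli' | 'default';

// Mirrors the credential.* fields in this README; every field is optional.
interface Credential {
    authentication_method?: AuthMethod;
    client_id?: string;
    client_secret?: string;
    tenant_id?: string;
}

// Hypothetical pre-flight check. Assumption: only service_principal needs explicit secrets.
function missingCredentialFields(cred: Credential): string[] {
    const method = cred.authentication_method ?? 'default';
    if (method !== 'service_principal') return [];
    const required: Array<keyof Credential> = ['client_id', 'client_secret', 'tenant_id'];
    // Report every required field that is absent or empty.
    return required.filter((field) => !cred[field]);
}
```

For example, a `service_principal` credential that sets only `client_id` would report `client_secret` and `tenant_id` as missing, while a `cli` credential would report nothing.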

## Config Details

| Field | Required | Description |
| ---------------------------------- | -------- | ----------------------------------------- |
| `subscription_id` | ✅ | Azure subscription ID |
| `credential.authentication_method` | | Auth method (default: `default`) |
| `credential.client_id` | | App (client) ID for service principal |
| `credential.client_secret` | | Client secret for service principal |
| `credential.tenant_id` | | Tenant (directory) ID |
| `resource_group` | | Filter to specific resource group |
| `factory_pattern` | | Regex allow/deny for factories |
| `pipeline_pattern` | | Regex allow/deny for pipelines |
| `include_lineage` | | Extract lineage (default: `true`) |
| `include_execution_history` | | Extract pipeline runs (default: `false`) |
| `execution_history_days` | | Days of history, 1-90 (default: `7`) |
| `platform_instance_map` | | Map linked services to platform instances |
| `env` | | Environment (default: `PROD`) |
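
To make the `factory_pattern` / `pipeline_pattern` filters and the `execution_history_days` range concrete, here is a rough sketch of how such settings are commonly evaluated. The semantics shown are assumptions, not confirmed by this README: allow defaults to everything, deny takes precedence over allow, and each regex is matched from the start of the name. Both function names are hypothetical.

```typescript
// Shape of an allow/deny filter such as factory_pattern.
interface AllowDenyPattern {
    allow?: string[];
    deny?: string[];
}

// Assumed semantics: deny wins over allow; an empty filter allows everything.
function isAllowed(name: string, pattern: AllowDenyPattern = {}): boolean {
    const allow = pattern.allow ?? ['.*'];
    const deny = pattern.deny ?? [];
    // Anchor each expression at the start of the name (prefix-style matching).
    const matches = (expr: string): boolean => new RegExp(`^(?:${expr})`).test(name);
    if (deny.some(matches)) return false;
    return allow.some(matches);
}

// Enforces the documented 1-90 range and the documented default of 7.
function checkExecutionHistoryDays(days: number = 7): number {
    if (!Number.isInteger(days) || days < 1 || days > 90) {
        throw new RangeError(`execution_history_days must be an integer in 1-90, got ${days}`);
    }
    return days;
}
```

Under these assumptions, the quickstart filter `allow: ["prod-.*"]` would keep a factory named `prod-sales` and skip one named `dev-sales`.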

## Entity Mapping

| ADF Concept | DataHub Entity |
| ------------ | ------------------- |
| Data Factory | Container |
| Pipeline | DataFlow |
| Activity | DataJob |
| Dataset | Dataset |
| Pipeline Run | DataProcessInstance |
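
As a rough illustration of the Pipeline and Activity rows, the sketch below builds DataFlow and DataJob URNs in DataHub's standard URN shapes. The flow ID scheme (`<factory>.<pipeline>`) and the example names are hypothetical; this README does not state how the connector actually names flows or jobs.

```typescript
// Orchestrator name, matching the platform ID introduced in this PR.
const ORCHESTRATOR = 'azure-data-factory';

// DataFlow URN shape: urn:li:dataFlow:(<orchestrator>,<flow id>,<cluster/env>)
function dataFlowUrn(flowId: string, env: string = 'PROD'): string {
    return `urn:li:dataFlow:(${ORCHESTRATOR},${flowId},${env})`;
}

// DataJob URN shape: urn:li:dataJob:(<parent flow urn>,<job id>)
function dataJobUrn(flowUrn: string, jobId: string): string {
    return `urn:li:dataJob:(${flowUrn},${jobId})`;
}

// Hypothetical names: a pipeline "daily-load" in factory "my-factory" with one activity.
const flow = dataFlowUrn('my-factory.daily-load');
const job = dataJobUrn(flow, 'CopySalesData');
```

The nesting of the flow URN inside the job URN is what ties each activity (DataJob) back to its pipeline (DataFlow).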

## Questions

If you've got any questions on configuring this source, feel free to ping us on [our Slack](https://slack.datahubproject.io/).