Commit: tf docs
hwang-db committed Jun 29, 2022
1 parent c6c0088 commit 019db68
Showing 4 changed files with 164 additions and 18 deletions.
103 changes: 87 additions & 16 deletions adb-splunk/README.md
@@ -1,35 +1,106 @@
# Splunk-Databricks integration pattern and quick setup

This is an automated Terraform template that deploys a Databricks workspace and a VM hosting Splunk (using the Docker image https://hub.docker.com/r/splunk/splunk/), and integrates Splunk with Databricks.

This demo is a collaborative work with [email protected].


## Overall Architecture

<img src="../charts/splunk.png" width="600">


## Context

Please read the source repo:
https://github.com/databrickslabs/splunk-integration

Quote: *The Databricks add-on for Splunk, an app, that allows Splunk Enterprise and Splunk Cloud users to run queries and execute actions, such as running notebooks and jobs, in Databricks.*

You can use this splunk-integration app to connect Databricks clusters with Splunk instance(s). The integration is bi-directional: you can query Splunk from Databricks, and also query Databricks Delta tables from Splunk.

What can you do with this integration app? (Quoted from the source repo.)

1. Run Databricks SQL queries right from the Splunk search bar and see the results in the Splunk UI (Fig 1)
2. Execute actions in Databricks, such as notebook runs and jobs, from Splunk (Fig 2 & Fig 3)
3. Use the Splunk SQL database extension to integrate Databricks information with Splunk queries and reports (Fig 4 & Fig 5)
4. Push events, summaries, and alerts to Splunk from Databricks (Fig 6 & Fig 7)
5. Pull events and alerts data from Splunk into Databricks (Fig 8)
## Getting started

Step 1: Clone this repo to your local machine and make sure you have Terraform installed. See https://learn.hashicorp.com/tutorials/terraform/install-cli for how to install Terraform.

Step 2: Navigate to the folder `/adb-splunk`, run `terraform init` and `terraform apply`, then type yes when prompted. This deploys the infrastructure into your Azure subscription; specifically, it deploys a resource group, a vnet with 3 subnets inside, a Databricks workspace, a VM, and a storage account.

Step 3: Terraform will output a public IP address (`splunk_public_ip`); use it in place of the IP in http://20.212.33.56:8000, then log in with the default username and password (`admin` / `password`). This brings you to the Splunk VM landing page.

Step 4: Log into the Splunk VM UI and follow the instructions to interact with Databricks clusters from within Splunk.
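
If you want to customize the deployment, the variables documented under Inputs below can be overridden. A minimal `terraform.tfvars` sketch — the values shown are simply the defaults from the Inputs table, repeated for illustration, and the comments describe the likely intent of each variable:

```hcl
# terraform.tfvars -- illustrative only; every variable has a default, so this file is optional
rglocation       = "southeastasia"  # Azure region for the resource group
workspace_prefix = "adb"            # prefix used in the Databricks workspace name
spokecidr        = "10.179.0.0/20"  # address space for the vnet that hosts the 3 subnets
no_public_ip     = true             # deploy Databricks cluster nodes without public IPs
```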
<!-- BEGIN_TF_DOCS -->
## Requirements

| Name | Version |
| ---------------------------------------------------------------------------- | -------- |
| <a name="requirement_azurerm"></a> [azurerm](#requirement\_azurerm) | >=2.83.0 |
| <a name="requirement_databricks"></a> [databricks](#requirement\_databricks) | >=0.5.1 |
| <a name="requirement_tls"></a> [tls](#requirement\_tls) | >= 3.1 |

## Providers

| Name | Version |
| ---------------------------------------------------------------- | ------- |
| <a name="provider_azurerm"></a> [azurerm](#provider\_azurerm) | 3.11.0 |
| <a name="provider_external"></a> [external](#provider\_external) | 2.2.2 |
| <a name="provider_local"></a> [local](#provider\_local) | 2.2.3 |
| <a name="provider_random"></a> [random](#provider\_random) | 3.3.2 |
| <a name="provider_tls"></a> [tls](#provider\_tls) | 3.4.0 |

## Modules

| Name | Source | Version |
| -------------------------------------------------------------------------- | ---------------------- | ------- |
| <a name="module_adls_content"></a> [adls\_content](#module\_adls\_content) | ./modules/adls_content | n/a |

## Resources

| Name | Type |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------- |
| [azurerm_databricks_workspace.this](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/databricks_workspace) | resource |
| [azurerm_linux_virtual_machine.example](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/linux_virtual_machine) | resource |
| [azurerm_network_interface.splunk-nic](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/network_interface) | resource |
| [azurerm_network_security_group.this](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/network_security_group) | resource |
| [azurerm_public_ip.splunk-nic-pubip](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/public_ip) | resource |
| [azurerm_resource_group.this](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/resource_group) | resource |
| [azurerm_storage_blob.splunk_databricks_app_file](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/storage_blob) | resource |
| [azurerm_storage_blob.splunk_setup_file](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/storage_blob) | resource |
| [azurerm_subnet.private](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/subnet) | resource |
| [azurerm_subnet.public](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/subnet) | resource |
| [azurerm_subnet.splunksubnet](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/subnet) | resource |
| [azurerm_subnet_network_security_group_association.private](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/subnet_network_security_group_association) | resource |
| [azurerm_subnet_network_security_group_association.public](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/subnet_network_security_group_association) | resource |
| [azurerm_virtual_machine_extension.splunksetupagent](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/virtual_machine_extension) | resource |
| [azurerm_virtual_network.this](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/virtual_network) | resource |
| [local_file.private_key](https://registry.terraform.io/providers/hashicorp/local/latest/docs/resources/file) | resource |
| [local_file.setupscript](https://registry.terraform.io/providers/hashicorp/local/latest/docs/resources/file) | resource |
| [random_string.naming](https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/string) | resource |
| [tls_private_key.splunk_ssh](https://registry.terraform.io/providers/hashicorp/tls/latest/docs/resources/private_key) | resource |
| [azurerm_client_config.current](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/data-sources/client_config) | data source |
| [external_external.me](https://registry.terraform.io/providers/hashicorp/external/latest/docs/data-sources/external) | data source |

## Inputs

| Name | Description | Type | Default | Required |
| -------------------------------------------------------------------------------------------------------------- | ----------- | -------- | ----------------- | :------: |
| <a name="input_dbfs_prefix"></a> [dbfs\_prefix](#input\_dbfs\_prefix) | n/a | `string` | `"dbfs"` | no |
| <a name="input_no_public_ip"></a> [no\_public\_ip](#input\_no\_public\_ip) | n/a | `bool` | `true` | no |
| <a name="input_private_subnet_endpoints"></a> [private\_subnet\_endpoints](#input\_private\_subnet\_endpoints) | n/a | `list` | `[]` | no |
| <a name="input_rglocation"></a> [rglocation](#input\_rglocation) | n/a | `string` | `"southeastasia"` | no |
| <a name="input_spokecidr"></a> [spokecidr](#input\_spokecidr) | n/a | `string` | `"10.179.0.0/20"` | no |
| <a name="input_workspace_prefix"></a> [workspace\_prefix](#input\_workspace\_prefix) | n/a | `string` | `"adb"` | no |

## Outputs

| Name | Description |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------- |
| <a name="output_databricks_azure_workspace_resource_id"></a> [databricks\_azure\_workspace\_resource\_id](#output\_databricks\_azure\_workspace\_resource\_id) | n/a |
| <a name="output_splunk_public_ip"></a> [splunk\_public\_ip](#output\_splunk\_public\_ip) | n/a |
| <a name="output_workspace_url"></a> [workspace\_url](#output\_workspace\_url) | n/a |
<!-- END_TF_DOCS -->
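
The `workspace_url` and `databricks_azure_workspace_resource_id` outputs are handy if you later want to manage the workspace with the Databricks Terraform provider. A minimal sketch, assuming you do this from the same root module (this is not part of the template itself):

```hcl
# Sketch only: point the databricks provider at the workspace deployed above.
# Authentication (Azure CLI, service principal, etc.) is left to your environment.
provider "databricks" {
  host                        = azurerm_databricks_workspace.this.workspace_url
  azure_workspace_resource_id = azurerm_databricks_workspace.this.id
}
```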
46 changes: 44 additions & 2 deletions adb-splunk/modules/adls_content/README.md
@@ -1,3 +1,45 @@
## adls_content module

This module creates a storage account in the specified resource group and a container inside it. Outputs are the storage account name and the container name.
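
A minimal sketch of how the root module might call this module (`rg` is the only required input; `storage_account_location` is optional — see the Inputs table below). The resource-group reference is an assumption for illustration:

```hcl
module "adls_content" {
  source                   = "./modules/adls_content"
  rg                       = azurerm_resource_group.this.name  # assumed: resource group created in the root module
  storage_account_location = "southeastasia"                   # optional; this is the default
}

# Downstream resources can then reference:
#   module.adls_content.storage_name
#   module.adls_content.container_name
```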
<!-- BEGIN_TF_DOCS -->
## Requirements

| Name | Version |
|------|---------|
| <a name="requirement_azurerm"></a> [azurerm](#requirement\_azurerm) | >=3.0.0 |
| <a name="requirement_local"></a> [local](#requirement\_local) | >=2.2.3 |
| <a name="requirement_random"></a> [random](#requirement\_random) | >=3.3.2 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_azurerm"></a> [azurerm](#provider\_azurerm) | >=3.0.0 |
| <a name="provider_random"></a> [random](#provider\_random) | >=3.3.2 |

## Modules

No modules.

## Resources

| Name | Type |
|------|------|
| [azurerm_storage_account.personaldropbox](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/storage_account) | resource |
| [azurerm_storage_container.example_container](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/storage_container) | resource |
| [random_string.naming](https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/string) | resource |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_rg"></a> [rg](#input\_rg) | n/a | `string` | n/a | yes |
| <a name="input_storage_account_location"></a> [storage\_account\_location](#input\_storage\_account\_location) | n/a | `string` | `"southeastasia"` | no |

## Outputs

| Name | Description |
|------|-------------|
| <a name="output_container_name"></a> [container\_name](#output\_container\_name) | n/a |
| <a name="output_storage_name"></a> [storage\_name](#output\_storage\_name) | n/a |
<!-- END_TF_DOCS -->
11 changes: 11 additions & 0 deletions adb-splunk/outputs.tf
@@ -1,3 +1,14 @@
output "splunk_public_ip" {
value = azurerm_public_ip.splunk-nic-pubip.ip_address
}

output "databricks_azure_workspace_resource_id" {
// The ID of the Databricks Workspace in the Azure management plane.
value = azurerm_databricks_workspace.this.id
}

output "workspace_url" {
// The workspace URL which is of the format 'adb-{workspaceId}.{random}.azuredatabricks.net'
// this is not named as DATABRICKS_HOST, because it affect authentication
value = "https://${azurerm_databricks_workspace.this.workspace_url}/"
}
22 changes: 22 additions & 0 deletions adb-splunk/workspace.tf
@@ -0,0 +1,22 @@
resource "azurerm_databricks_workspace" "this" {
name = "${local.prefix}-workspace"
resource_group_name = azurerm_resource_group.this.name
location = azurerm_resource_group.this.location
sku = "premium"
tags = local.tags
//infrastructure_encryption_enabled = true
custom_parameters {
no_public_ip = var.no_public_ip
virtual_network_id = azurerm_virtual_network.this.id
private_subnet_name = azurerm_subnet.private.name
public_subnet_name = azurerm_subnet.public.name
public_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.public.id
private_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.private.id
storage_account_name = local.dbfsname
}
# We need this, otherwise destroy doesn't cleanup things correctly
depends_on = [
azurerm_subnet_network_security_group_association.public,
azurerm_subnet_network_security_group_association.private
]
}
