---
layout: post
title: "Kafka Infrastructure as Code(IaC)"
author: dasasathyan
categories: [ Platform Engineering, Data, Infrastructure, Kafka ]
featured: true
hidden: true
teaser: Provision Kafka Infrastructure with JulieOps, Pulumi, Terraform
toc: true
---

# Infrastructure as Code (IaC)

Organizations need automated infrastructure provisioning, but the tooling should not impose too many restrictions on development teams; developers typically do their best work with a high degree of autonomy. Managing infrastructure by hand is error-prone, and replicating configurations for additional clusters is tedious. This is where Infrastructure as Code (IaC) comes in: by describing infrastructure as source code, IaC provisions the same infrastructure consistently across environments. Its advantages include faster provisioning, reduced risk of human error, idempotency, fewer manual configuration steps, and the elimination of configuration drift.

Popular general-purpose IaC tools include Terraform and Pulumi; in the Kafka ecosystem there are also purpose-built tools such as JulieOps.

Apache Kafka is a real-time data streaming technology capable of handling trillions of events. It is a distributed system of servers and clients that communicate over a TCP network protocol. A few terms to keep in mind:

1. Brokers - Brokers are the servers in Kafka that store event streams from various sources. A Kafka cluster typically comprises several brokers. Every broker in a cluster is also a bootstrap server: if you can connect to one broker in a cluster, you can connect to every broker.

2. Topics - Data is written to Kafka by client processes called producers and read by consumers. Events are organized into named categories called topics, and each topic is split into partitions that are distributed across the brokers in the cluster.

3. Kafka Connect - Kafka Connect copies messages between Kafka topics and external applications and data systems. There are two types of connectors: source connectors and sink connectors.

4. Schema Registry - Schema Registry is a centralized repository that facilitates the management and validation of schemas for messages in Kafka topics. With Schema Registry, producers and consumers of Kafka topics can ensure that data remains consistent and compatible as schemas evolve over time.

5. Kafka Streams - The Kafka Streams library provides real-time stream processing capabilities, built on top of the Kafka producer and consumer APIs. It is used to process data in real time, apply transformations, and perform aggregations on messages.

6. KSQL - Similar to Kafka Streams, KSQL is used to perform filtering, aggregations, joins, and windowing operations, and to generate real-time analytics and data transformations against Kafka topics through a SQL-like interface.

All of the above infrastructure, such as topics, connectors, schemas, and ACLs, can be configured with IaC tools like JulieOps, Terraform, and Pulumi. In this post, we provision Kafka resources on Confluent Cloud with each of the three tools and compare them on language support, supported resources, secrets handling, and state management.

## JulieOps

JulieOps, formerly known as Kafka Topology Builder, is an open-source project licensed under the MIT License, with over 350 stars on GitHub. It is designed to simplify the configuration of topics, role-based access control (RBAC), Schema Registry, and other Kafka components. JulieOps is declarative: developers specify what is needed, and the tool takes care of the implementation details. Its interface is a YAML file, which keeps the configuration user-friendly and straightforward. With JulieOps, developers describe their configuration requirements and delegate the rest of the work to the tool.

The [julie-ops][julie-ops] tool helps us provision Kafka-related resources in Confluent Cloud as code. The resources are typically [Topics][Topics], [Access Control][Access Control], [schemas][Handling schemas], [ksql artifacts][ksql artifacts], etc. All of these are configured as [topologies][topologies] in julie-ops.

### Prerequisites

- julie-ops installed locally or available via Docker
- A topology descriptor file listing the resources to provision (a sample follows below)
- The following configuration in a `.properties` file to connect to the Kafka cluster:
```
bootstrap.servers="<BOOTSTRAP_SERVER_URL>"
security.protocol=SASL_SSL
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="<SASL_USERNAME>" password="<SASL_PASSWORD>";
ssl.endpoint.identification.algorithm=https
sasl.mechanism=PLAIN
# Required for correctness in Apache Kafka clients prior to 2.6
client.dns.lookup=use_all_dns_ips
# Confluent Cloud Schema Registry
schema.registry.url="<SCHEMA_REGISTRY_URL>"
basic.auth.credentials.source=USER_INFO
schema.registry.basic.auth.user.info="<SCHEMA_REGISTRY_API_KEY>":"<SCHEMA_REGISTRY_API_SECRET>"
```
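
The topology descriptor is the core of JulieOps' declarative model: it describes, per project, the topics, ACLs, and schemas you want, and JulieOps reconciles the cluster to match on every run. A minimal sketch of a descriptor (all names and values below are illustrative placeholders):

```
context: "acme"
projects:
  - name: "analytics"
    consumers:
      - principal: "User:analytics-app"
    producers:
      - principal: "User:ingest-app"
    topics:
      - name: "pageviews"
        config:
          replication.factor: "3"
          num.partitions: "6"
```

By default, JulieOps composes the full topic name from the context, project, and topic name, so the topic above would be created as `acme.analytics.pageviews`. Since the descriptor is the source of truth for the cluster's state, it is best kept under version control and applied through CI.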

### How to run

```
julie-ops --broker <BROKERS> --clientConfig <PROPERTIES_FILE> --topology <TOPOLOGY_FILE>
```

Once the run completes without errors, the output will look like:

```
log4j:WARN No appenders could be found for logger (org.apache.kafka.clients.admin.AdminClientConfig).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
List of Topics:
<topics that are created>
List of ACLs:
<acls that are created>
List of Principles:
List of Connectors:
List of KSQL Artifacts:
Kafka Topology updated
```

Want a quick start? Check out our sample JulieOps repo [here].

[julie-ops]: https://julieops.readthedocs.io/en/latest/#
[Topics]: https://julieops.readthedocs.io/en/latest/futures/what-topic-management.html
[Handling schemas]: https://julieops.readthedocs.io/en/latest/futures/what-schema-management.html
[Access Control]: https://julieops.readthedocs.io/en/latest/futures/what-acl-management.html
[ksql artifacts]: https://julieops.readthedocs.io/en/latest/futures/what-ksql-management.html
[topologies]: https://julieops.readthedocs.io/en/latest/the-descriptor-files.html?highlight=topology
[here]: https://github.com/Platformatory/kafka-cd-julie

## Pulumi

Selecting the appropriate Infrastructure as Code (IaC) tool is crucial, as each tool has its own advantages and disadvantages. As discussed earlier, IaC automates infrastructure provisioning and eliminates a class of human errors. In this section, we use Pulumi to provision Confluent Cloud topics and connectors. While Pulumi supports several programming languages, such as Python, TypeScript, Go, C#, Java, and YAML, we use TypeScript in this post. The Confluent Cloud provider for Pulumi is built on top of the official Terraform provider from Confluent Inc. and is available across all of Pulumi's supported languages.

### Provisioning Kafka Topics

A few configs are mandatory for creating Kafka topics. To begin with, we need a reference to the cluster on which the topics will be provisioned:

```
// cluster_id, rest_endpoint, the API credentials, and topicNames are assumed to come from Pulumi config
import * as confluent from "@pulumi/confluentcloud";
import { KafkaTopicKafkaCluster, KafkaTopicCredentials } from "@pulumi/confluentcloud/types/input";

let clusterArgs: KafkaTopicKafkaCluster = {
  id: cluster_id,
};
```

Next, we need the Kafka cluster credentials: an API key and secret.

```
let clusterCredentials: KafkaTopicCredentials = {
  key: kafka_api_key,
  secret: kafka_api_secret,
};
```
Then come the topic configs. There are many topic-level configurations; read about them [here](https://kafka.apache.org/documentation/#topicconfigs). Note that the partition count is set through the provider's `partitionsCount` argument rather than as a topic config:

```
// Build the arguments for one topic; the topic name is supplied per topic in the loop below
const topicArgsFor = (topicName: string): confluent.KafkaTopicArgs => ({
  kafkaCluster: clusterArgs,
  topicName: topicName.toLowerCase(),
  restEndpoint: rest_endpoint,
  credentials: clusterCredentials,
  partitionsCount: 6, // a resource argument, not a "num.partitions" config entry
  config: {
    ["retention.ms"]: "-1",
    ["retention.bytes"]: "-1",
  },
});
```

Finally, create the topics by looping over the topic names:
```
const topics = topicNames.map(
  (topicName) =>
    new confluent.KafkaTopic(topicName.toLowerCase(), topicArgsFor(topicName))
);
```
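
Optionally, the created resources can be exported as stack outputs so that `pulumi up` prints them; this is standard Pulumi behavior rather than anything specific to the Confluent provider:

```
// Export the provisioned topic names as stack outputs
export const provisionedTopics = topics.map((t) => t.topicName);
```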

Save the above to an `index.ts` and set the Confluent Cloud credentials with `pulumi config set confluentcloud:cloudApiKey <cloud api key> --secret` and `pulumi config set confluentcloud:cloudApiSecret <cloud api secret> --secret`. Passing the `--secret` flag is important; without it, the secrets are stored in plain text in the Pulumi stack configuration files. When setting the configuration, Pulumi prompts for a stack: select an existing stack or create a new one.
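
For reference, values set with `--secret` appear encrypted in the stack's configuration file rather than in plain text. A hypothetical `Pulumi.dev.yaml` might look like this (ciphertext shortened for illustration):

```
config:
  confluentcloud:cloudApiKey:
    secure: AAABAJ8x...   # encrypted by the stack's secrets provider, not the raw key
  confluentcloud:cloudApiSecret:
    secure: AAABAN2w...
```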

Once the credentials are set, run `pulumi up` to provision the topics in Confluent Cloud.


### Provisioning Kafka Connectors

Kafka Connect allows for the seamless integration of messages between Kafka topics and external applications and data systems. Connectors come in two types:

- Source connectors pull data from an external system and feed it into a Kafka topic.
- Sink connectors take data from a Kafka topic and deliver it to an external system.

Let's provision a Kafka sink connector that writes data from a Kafka topic to Azure Data Lake Storage (ADLS).

The mandatory configs for provisioning a Kafka connector are:

- The name of the resource
- The environment
- The cluster
- A few connector-specific configs, such as the connector class and source topics

The configs can contain secrets such as passwords, API keys, and tokens; Pulumi masks these by default. Sensitive configs must be placed under the `configSensitive` block and non-sensitive configs under the `configNonsensitive` block. The `connector.class` config determines which of the [supported connectors](https://docs.confluent.io/cloud/current/connectors/index.html#supported-connectors) is deployed:

```
let connector_args: confluent.ConnectorArgs = {
  configNonsensitive: {
    ["connector.class"]: "AzureDataLakeGen2Sink",
    ["name"]: "Connector Name",
    ["kafka.auth.mode"]: "KAFKA_API_KEY",
    ["topics"]: topicNames.join(","), // topics are passed as a comma-separated string
    ["input.data.format"]: "JSON",
    ["output.data.format"]: "JSON",
    ["time.interval"]: "HOURLY",
    ["tasks.max"]: "2",
    ["flush.size"]: "1000",
    ["rotate.schedule.interval.ms"]: "3600000",
    ["rotate.interval.ms"]: "3600000",
    ["path.format"]: "'year'=YYYY/'month'=MM/'day'=dd/'hour'=HH",
    ["topics.dir"]: "<Directory in ADLS>",
  },
  configSensitive: {
    ["kafka.api.key"]: kafka_api_key,
    ["kafka.api.secret"]: kafka_api_secret,
    ["azure.datalake.gen2.account.name"]: azure_data_lake_account_name,
    ["azure.datalake.gen2.access.key"]: azure_data_lake_access_key,
  },
  // cluster_environment and cluster reference the target Confluent environment and cluster
  environment: cluster_environment,
  kafkaCluster: cluster,
};

new confluent.Connector("pulumi-connector", connector_args);
```

If the Confluent Cloud credentials are already set up, go ahead and run `pulumi up` to provision the connector. If they are not, set them up following the steps from the topic provisioning section above.


## Terraform

Terraform is a widely used IaC tool that supports a broad range of cloud, datacenter, and service providers. It can provision infrastructure on popular platforms such as AWS, Azure, Google Cloud, and Oracle Cloud, as well as orchestrate Kubernetes resources. The full list of supported providers can be found on the official Terraform Registry. It's worth noting that the Pulumi provider is built on top of the Confluent Terraform provider. Terraform uses the HashiCorp Configuration Language (HCL) to describe and provision infrastructure.

### Provisioning Topics

First, declare the Confluent provider:

```
terraform {
  required_providers {
    confluent = {
      source  = "confluentinc/confluent"
      version = "1.13.0"
    }
  }
}
```
This block declares the Confluent Cloud provider; `terraform init` downloads and installs it.

Next, configure the Confluent credentials:

```
provider "confluent" {
  cloud_api_key    = var.confluent_cloud_api_key    # optionally use the CONFLUENT_CLOUD_API_KEY env var
  cloud_api_secret = var.confluent_cloud_api_secret # optionally use the CONFLUENT_CLOUD_API_SECRET env var
}
```
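
As the comments indicate, the credentials can instead be supplied through environment variables, which keeps them out of `.tfvars` files:

```
export CONFLUENT_CLOUD_API_KEY="<cloud api key>"
export CONFLUENT_CLOUD_API_SECRET="<cloud api secret>"
```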

With the provider configured, declare the topics to provision. The `for_each` meta-argument iterates over the list of topic names and creates one `confluent_kafka_topic` resource per entry:

```
resource "confluent_kafka_topic" "dev_topics" {
  for_each = toset(var.topics)

  kafka_cluster {
    id = var.cluster_id
  }
  topic_name       = each.value
  rest_endpoint    = data.confluent_kafka_cluster.dev_cluster.rest_endpoint
  partitions_count = 6
  config = {
    "retention.ms" = "604800000"
  }
  credentials {
    key    = var.api_key
    secret = var.api_secret
  }
}
```
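
The resource above references variables and a `data.confluent_kafka_cluster` lookup that are not shown. Here is a sketch of what those declarations might look like, assuming the variable names used in the snippet (the `environment_id` variable is an assumption needed by the cluster lookup):

```
variable "cluster_id" { type = string }
variable "environment_id" { type = string }
variable "topics" { type = list(string) }
variable "api_key" {
  type      = string
  sensitive = true
}
variable "api_secret" {
  type      = string
  sensitive = true
}

# Look up the existing cluster to obtain its REST endpoint
data "confluent_kafka_cluster" "dev_cluster" {
  id = var.cluster_id
  environment {
    id = var.environment_id
  }
}
```

With the provider and resources defined, run `terraform init` to install the provider and `terraform apply` to provision the topics.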


Here is a side-by-side comparison of the three tools:

| Features | JulieOps | Terraform | Pulumi |
| -------- | --------- | -------- | -------- |
| Language Support | YAML | HashiCorp Configuration Language (HCL) | Python, TypeScript, JavaScript, Go, C#, F#, Java, YAML |
| Supported Resources to Provision | Topics, RBACs (for Kafka consumers, Kafka producers, Kafka Connect, Kafka Streams applications (microservices), KSQL applications, Schema Registry instances, Confluent Control Center, KSQL server instances), Schemas, ACLs | confluent_api_key, confluent_byok_key, confluent_cluster_link, confluent_connector, confluent_environment, confluent_identity_pool, confluent_identity_provider, confluent_invitation, confluent_kafka_acl, confluent_kafka_client_quota, confluent_kafka_cluster, confluent_kafka_cluster_config, confluent_kafka_mirror_topic, confluent_kafka_topic, confluent_ksql_cluster, confluent_network, confluent_peering, confluent_private_link_access, confluent_role_binding, confluent_schema, confluent_schema_registry_cluster, confluent_schema_registry_cluster_config, confluent_schema_registry_cluster_mode, confluent_service_account, confluent_subject_config, confluent_subject_mode, confluent_transit_gateway_attachment | ApiKey, ByokKey, ClusterLink, Connector, Environment, IdentityPool, IdentityProvider, Invitation, KafkaAcl, KafkaClientQuota, KafkaCluster, KafkaClusterConfig, KafkaMirrorTopic, KafkaTopic, KsqlCluster, Network, Peering, PrivateLinkAccess, Provider, RoleBinding, Schema, SchemaRegistryCluster, SchemaRegistryClusterConfig, SchemaRegistryClusterMode, ServiceAccount, SubjectConfig, SubjectMode, TransitGatewayAttachment |
| Import code from other IaC tools | No | No | Yes |
| Secrets Encryption | Secrets are read from a `.properties` file | Secrets can be sourced from Vault or variables, but are not encrypted in the state file | Secrets are encrypted, in both stack config and state |
| Open Sourced | Yes | Yes | Yes |
| GitHub Stars | 350+ | 81 (Confluent provider) | 6 (Confluent provider) |
| State Store | Stored in a `.cluster-state` file | Stored in a `.tfstate` file or a backend of the user's choice | Managed by the Pulumi Service, or a backend of the user's choice |

# Conclusion

Although Terraform remains the dominant Infrastructure as Code (IaC) tool in the industry, Pulumi is rapidly gaining traction. Each tool has its strengths and weaknesses: Terraform is more established and provides a wider range of resources, while Pulumi is known for its ease of use and a growing community that is continuously improving its functionality. Someone with coding experience but little familiarity with IaC tools may find Pulumi more approachable, thanks to its support for general-purpose languages such as Python, TypeScript, JavaScript, Go, C#, F#, Java, and YAML.

The right tool ultimately depends on your specific needs. If you prioritize stability and access to a vast ecosystem and knowledge base, Terraform may be the better option; if you value developer efficiency and working in a familiar language, Pulumi might be the ideal fit. For teams whose scope is limited to Kafka topics, ACLs, and schemas, JulieOps offers a lightweight, declarative alternative. Whichever you choose, all three can effectively manage Kafka infrastructure as code.