This repo showcases how Geode can be used in a simple streaming data pipeline. The source and sink apps are Spring Cloud Stream apps, and the processor is a simple Spring Boot app that enriches the data flowing through the pipeline with values pulled from a lookup against Geode.
In this demo we will set up a local cluster with minikube and deploy a pipeline that reads from a file source and enriches each payload with data pulled from Geode. We use Micrometer to capture throughput and count metrics.
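To make the enrichment step concrete, here is a minimal sketch of what such a processor could look like. This is not the repo's actual code: the CSV layout, the key position, and the Region bean wiring (via Spring Data for Apache Geode) are assumptions; only the telemetryRegion name matches the region created later in this guide.

import java.util.function.Function;

import org.apache.geode.cache.Region;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;

@SpringBootApplication
public class EnrichmentProcessorApplication {

    public static void main(String[] args) {
        SpringApplication.run(EnrichmentProcessorApplication.class, args);
    }

    // Spring Cloud Stream binds a Function bean to the input and output
    // Kafka topics; every inbound record is passed through this function.
    @Bean
    public Function<String, String> enrich(Region<String, String> telemetryRegion) {
        return line -> {
            String key = line.split(",")[0];          // assume the first CSV field is the lookup key
            String lookup = telemetryRegion.get(key); // side lookup against the Geode region
            return line + "," + lookup;               // append the enrichment to the payload
        };
    }
}

A function like this would also be a natural place to hang the Micrometer counters behind the throughput and count metrics mentioned above.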
You will need the following installed on your local workstation:
On macOS:
$ brew install kubernetes-cli
$ brew install --cask minikube
Note that you will need to have a hypervisor installed; if you don't, check out the Kubernetes guide to installing a hypervisor.
We also recommend k9s for exploring the cluster: https://github.com/derailed/k9s
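k9s is optional, but it makes poking around the cluster much easier. On macOS it can typically be installed with Homebrew:
$ brew install k9s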
$ minikube start --cpus 4 --memory 8096 --vm-driver=hyperkit
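Once the command finishes, you can verify that the cluster is up with:
$ minikube status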
Explore the k8s cluster with:
$ k9s
Note that selecting a container in k9s and pressing s will ssh you into that container, while pressing l will display its logs.
$ cd k8s
$ kubectl apply -f geode
$ kubectl apply -f kafka
$ kubectl apply -f prometheus
$ kubectl apply -f grafana
$ kubectl apply -f geode-stream.yml
After each step, confirm that the pods are deployed. This may take a while, as each container image needs to be downloaded.
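One convenient way to do this without k9s is to watch the pods until every one reports Running:
$ kubectl get pods --watch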
Congratulations! Your infrastructure should now be up and running in Kubernetes. Now let's populate Geode with some lookup data...
$ kubectl cp geode/data/1mil.gfd server-0:/tmp/1mil.gfd
If you're using k9s, select the locator and press s. If you are not using k9s, run the following command, replacing the value in curly braces with the actual name of your locator pod:
$ kubectl exec -ti {locator-0-pod-name} bash
$ gfsh
gfsh > connect --locator=locator-0[10334] --jmx-manager=locator-0[1099]
gfsh > create region --name=telemetryRegion --type=REPLICATE
gfsh > import data --region=telemetryRegion --file=/tmp/1mil.gfd --member=server-0
Note that the file has to be on the member you're pointing to; in this case, /tmp/1mil.gfd was copied onto server-0.
gfsh > query --query='select count(*) from /telemetryRegion'
$ kubectl port-forward {grafana-pod-name} 3000
Then open http://localhost:3000 in your browser and log in with these credentials:
user: admin
password: password
Look for an "Import" option on the menu. For a quickstart, import our sample json dashboard, which you can find under the /k8s/grafana/dashboards/
path.
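If the dashboard stays empty, it can help to confirm that Prometheus is scraping the apps. Port-forward it the same way (9090 is Prometheus's default port) and check http://localhost:9090/targets:
$ kubectl port-forward {prometheus-pod-name} 9090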
The file source app is a Spring Cloud Stream app that:
- looks for files in the /tmp folder
- reads files dropped into /tmp line by line
- publishes each line to a Kafka topic
We will copy a file that contains 1 million sample telemetry records into the file source app. The file source app will begin streaming these records into a Kafka topic, where they will be consumed by our Geode processor application, enriched with data stored in Geode, and published to another Kafka topic. You should begin seeing the total count and throughput in Grafana as the data streams through the pipeline.
$ kubectl cp geode/data/1mil_telemetry.txt {file-source-pod-name}:/tmp/foo/1.txt
You can also view the data going through the processor and sink by watching the logs.
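For example, assuming you substitute your actual pod name for the placeholder, you can follow the processor's logs with:
$ kubectl logs -f {processor-pod-name}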
Since the pipeline runs on Kubernetes, you can try it in your preferred cloud provider. Try scaling the Geode locators and servers and see how that affects your throughput. Have fun with it!
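For instance, assuming the server StatefulSet is simply named server (as the server-0 pod name above suggests), you could scale it with:
$ kubectl scale statefulset server --replicas=3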
$ kubectl delete deployment --all
$ kubectl delete statefulset --all
To tear down minikube:
$ minikube delete