7/29/2025

Building a Kafka + Airflow Pipeline on Kubernetes with Minikube

This blog post walks through setting up a Minikube cluster, deploying a Kafka cluster on top of it, and then running an Apache Airflow DAG pipeline, all on your local machine.


🛠️ Prerequisites

Before we dive in, make sure you have the following installed:

- Minikube
- kubectl
- Helm
- Docker

📦 Step 1: Set Up a Minikube Cluster

Spin up a Minikube cluster:

```
minikube start --memory=12000 --cpus=4
```

Verify the cluster is up:

```
kubectl get nodes
```


⚙️ Step 2: Install the Confluent for Kubernetes (CFK) Operator

Use this script to deploy the CFK operator and a KRaft-based Kafka cluster:
https://github.com/dhanuka84/my-first-apache-airflow-setup/blob/main/cfk_kraft_quickstart.sh


🔌 Step 3: Access the Kafka Control Center

Port-forward the Control Center pod:

```
kubectl port-forward controlcenter-0 9021:9021 -n confluent
```

Visit http://localhost:9021 and create a topic named:

```
my-csv-topic
```
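If you prefer creating the topic programmatically instead of through the Control Center UI, here is a minimal Python sketch using the confluent-kafka `AdminClient`. The bootstrap address is an assumption (adjust it to however you expose the broker from Minikube), and the partition/replication settings simply mirror a single-broker quickstart:

```python
import csv  # stdlib only at module level; Kafka client imported lazily below


def topic_config(name, partitions=1, replication=1):
    """Describe the topic we want; mirrors the settings you would enter in the UI."""
    return {
        "topic": name,
        "num_partitions": partitions,
        "replication_factor": replication,
    }


if __name__ == "__main__":
    # Requires a reachable broker; "localhost:9092" is a hypothetical address.
    from confluent_kafka.admin import AdminClient, NewTopic

    cfg = topic_config("my-csv-topic")
    admin = AdminClient({"bootstrap.servers": "localhost:9092"})
    futures = admin.create_topics([
        NewTopic(
            cfg["topic"],
            num_partitions=cfg["num_partitions"],
            replication_factor=cfg["replication_factor"],
        )
    ])
    for topic, fut in futures.items():
        fut.result()  # raises if topic creation failed
        print(f"created {topic}")
```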

☁️ Step 4: Install Apache Airflow via Helm

Add the official Airflow Helm chart repository and install it:

```
helm repo add apache-airflow https://airflow.apache.org
helm repo update

export NAMESPACE=confluent
export RELEASE_NAME=example-release

helm install $RELEASE_NAME apache-airflow/airflow --namespace $NAMESPACE --create-namespace
```

Expose the Airflow web UI:

```
kubectl port-forward svc/example-release-api-server 9080:8080 -n confluent
```

Log in to Airflow using:

- Username: admin
- Password: admin



🧰 Step 5: Customize the Airflow Image

Build and deploy a custom Airflow image using this script:
👉 https://github.com/dhanuka84/my-first-apache-airflow-setup/blob/main/deploy_airflow.sh

Run:

```
./deploy_airflow.sh 0.0.1
```

Wait for the pods to become ready:

```
kubectl get pods -n confluent
```

🧪 Step 6: Trigger the DAG

Use the Airflow UI to manually trigger your DAG. When it runs, it sends four CSV rows as messages to the `my-csv-topic` Kafka topic.
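The producer side of the DAG might be sketched roughly as below. The CSV sample, the bootstrap address, and the use of confluent-kafka are assumptions for illustration; the actual task in the repo may differ:

```python
import csv
import io


def rows_to_messages(csv_text):
    """Parse CSV text (header + data rows) into one string message per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [",".join(f"{k}={v}" for k, v in row.items()) for row in reader]


# Hypothetical sample payload with four data rows, matching the DAG's behavior.
SAMPLE = """id,name
1,alice
2,bob
3,carol
4,dave
"""

if __name__ == "__main__":
    # Requires a reachable broker; "localhost:9092" is a hypothetical address.
    from confluent_kafka import Producer

    producer = Producer({"bootstrap.servers": "localhost:9092"})
    for msg in rows_to_messages(SAMPLE):
        producer.produce("my-csv-topic", value=msg.encode("utf-8"))
    producer.flush()  # block until all four messages are delivered
```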




📬 Step 7: Validate Kafka Messages

In Control Center UI, view `my-csv-topic` to verify the 4 records were published.
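If you'd rather verify from the command line than the UI, a small consumer sketch can confirm the record count. Again, the bootstrap address and group id are assumptions:

```python
def summarize(messages, expected=4):
    """Report how many records arrived and whether we saw the expected count."""
    return {"count": len(messages), "complete": len(messages) >= expected}


if __name__ == "__main__":
    # Requires a reachable broker; "localhost:9092" is a hypothetical address.
    from confluent_kafka import Consumer

    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",
        "group.id": "csv-check",            # hypothetical consumer group
        "auto.offset.reset": "earliest",    # read from the beginning of the topic
    })
    consumer.subscribe(["my-csv-topic"])

    received = []
    while len(received) < 4:
        msg = consumer.poll(5.0)
        if msg is None:
            break  # timed out; topic may not have all records yet
        if not msg.error():
            received.append(msg.value().decode("utf-8"))
    consumer.close()

    print(summarize(received))
```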



✅ Conclusion

You've set up a complete local pipeline using Kafka, Airflow, and Minikube. This stack is great for prototyping, and the same components can be carried forward into production-grade environments.

