9/21/2025

Hands-On Study Guide: Runtime Security Monitoring with Falco & Grafana Stack

📖 Runtime Security Analogy: The Investigator and the Doctor

This guide explains the distinct but complementary roles of Falco, Falcosidekick, Prometheus, Grafana, and Loki.
To make it intuitive, we’ll use an analogy: a Crime Scene Investigator (CSI), their Dispatcher, their Doctor, and a shared Command Center.





👉 So the distinction is:

  • Falco => Falcosidekick (live UI + fan-out) => Loki = Incident Reports (logs, JSON events)

  • Falco => Prometheus => Grafana = Health Reports (numeric metrics)



🔎 Introduction: Two Jobs, One Mission

In any security operation, two functions are equally important:

  1. Investigating threats – detecting malicious or suspicious activity.

  2. Monitoring the health of the investigator – ensuring the security tools themselves are working reliably.

A tired or overworked investigator might miss a crucial clue. Similarly, a misconfigured security tool might miss a real threat.

Our runtime security stack mirrors this exact model. Different tools play different roles, working together to provide both detection and health monitoring.


👥 The Characters

  • The Crime Scene Investigator (CSI): An expert patrolling the city, spotting crimes based on a rulebook, and writing detailed reports.

  • The Dispatcher: The operator who instantly forwards the CSI’s reports to multiple destinations.

  • The Doctor: A physician whose job is to check on the CSI’s health and performance.

  • The Command Center: The central hub where the Police Chief can view both the crime reports and the investigator’s health charts.


🔄 Mapping the Analogy to the Technology

🕵️ The Crime Scene Investigator = Falco

  • The City: Your Kubernetes cluster.

  • Patrolling the Streets: Falco uses an eBPF probe to observe every system call: processes, files, network connections.

  • The Rulebook: custom-rules.yaml. If an action matches a rule (e.g., “Terminal shell in container”), Falco flags it.

  • Filing Reports: For each event, Falco generates a JSON log with rich details: who, what, where, when. These are sent via http_output.
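To make the "who, what, where, when" concrete, here is a minimal Python sketch that reads one alert in the shape Falco emits when JSON output is enabled. The sample values (pod name, timestamp, output text) are invented for illustration, and the summarize_alert helper is hypothetical; only the top-level keys (time, rule, priority, output, output_fields) follow Falco's JSON output format.

```python
import json

# Illustrative alert: the values are made up, the top-level keys follow
# Falco's JSON output format.
SAMPLE_ALERT = """
{
  "time": "2025-09-21T10:15:30.123456789Z",
  "rule": "Terminal shell in container",
  "priority": "Notice",
  "output": "A shell was spawned in a container (user=root container=testbox)",
  "output_fields": {
    "user.name": "root",
    "proc.cmdline": "sh",
    "container.name": "testbox",
    "k8s.ns.name": "falco",
    "k8s.pod.name": "testbox"
  }
}
"""


def summarize_alert(raw: str) -> dict:
    """Extract the who/what/where/when of a single Falco JSON alert."""
    alert = json.loads(raw)
    fields = alert.get("output_fields", {})
    namespace = fields.get("k8s.ns.name", "?")
    pod = fields.get("k8s.pod.name", "?")
    return {
        "who": fields.get("user.name", "unknown"),
        "what": alert["rule"],
        "where": f"{namespace}/{pod}",
        "when": alert["time"],
        "priority": alert["priority"],
    }


if __name__ == "__main__":
    print(summarize_alert(SAMPLE_ALERT))
```

Which keys appear under output_fields depends on the fields the matching rule references, so real alerts may carry more or fewer entries than this sample.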


📡 The Dispatcher’s Live Feed = Falcosidekick

  • The Dispatcher: Falcosidekick listens for Falco’s reports.

  • The Live Feed: The Falcosidekick Web UI (http://localhost:2802) shows a real-time stream of Falco alerts, which makes it the best way to see what’s happening right now.

  • Forwarding Reports: Falcosidekick can send these alerts to multiple systems simultaneously (Loki, Slack, Elasticsearch, etc.).
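The fan-out idea can be sketched in a few lines of Python. This is a toy model, not Falcosidekick's implementation (Falcosidekick itself is a separate service with its own output plugins); the Dispatcher class, the output names, and failing_output are all hypothetical. It only shows the pattern: one alert in, many destinations out, with a failure in one destination not blocking the others.

```python
from typing import Callable, Dict, List

Alert = Dict[str, object]
Output = Callable[[Alert], None]


class Dispatcher:
    """Toy stand-in for the dispatcher role: forwards each alert to all outputs."""

    def __init__(self) -> None:
        self._outputs: Dict[str, Output] = {}

    def register(self, name: str, output: Output) -> None:
        """Add a named destination (any callable that accepts an alert)."""
        self._outputs[name] = output

    def dispatch(self, alert: Alert) -> Dict[str, bool]:
        """Send the alert to every output and report success per destination."""
        results: Dict[str, bool] = {}
        for name, output in self._outputs.items():
            try:
                output(alert)
                results[name] = True
            except Exception:
                # Design choice of this sketch: isolate failures per output.
                results[name] = False
        return results


def failing_output(alert: Alert) -> None:
    """Simulates a destination that is currently unreachable."""
    raise ConnectionError("destination unreachable")


if __name__ == "__main__":
    archive: List[Alert] = []    # plays the role of Loki
    live_feed: List[Alert] = []  # plays the role of the Web UI
    dispatcher = Dispatcher()
    dispatcher.register("loki", archive.append)
    dispatcher.register("webui", live_feed.append)
    dispatcher.register("slack", failing_output)
    print(dispatcher.dispatch({"rule": "Terminal shell in container"}))
```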


📁 The Case File Archive = Loki

  • Forwarding to the Archive: Falcosidekick pushes every Falco report to Loki.

  • The Archive: Loki stores the full JSON logs, creating a searchable history of every detected security event.

  • Deep Analysis: Using LogQL, you can slice and dice events, filter by rule, namespace, severity, or process.
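As a rough illustration of "slicing and dicing", the sketch below builds a LogQL query and the matching URL for Loki's query_range HTTP endpoint. The label names (priority, rule) are assumptions about how Falcosidekick labels its Loki streams in your setup, and the base URL http://localhost:3100 is a placeholder; check the labels Grafana's Explore view actually shows before relying on them.

```python
from urllib.parse import urlencode


def build_logql(labels: dict, contains: str = "") -> str:
    """Build a LogQL stream selector with an optional line filter.

    Label values are inserted as-is; values containing double quotes
    would need escaping, which this sketch does not handle.
    """
    selector = ", ".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    query = "{" + selector + "}"
    if contains:
        query += f' |= "{contains}"'
    return query


def loki_query_url(base_url: str, query: str, limit: int = 100) -> str:
    """Return the URL for Loki's /loki/api/v1/query_range endpoint."""
    return f"{base_url}/loki/api/v1/query_range?" + urlencode(
        {"query": query, "limit": limit}
    )


if __name__ == "__main__":
    # Assumed labels: adjust to whatever your Loki streams really carry.
    logql = build_logql(
        {"priority": "Notice", "rule": "Terminal shell in container"},
        contains="testbox",
    )
    print(logql)
    print(loki_query_url("http://localhost:3100", logql))
```

The same query string can be pasted into Grafana's Explore view with the Loki datasource selected.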


🩺 The Investigator’s Doctor = Prometheus

  • Routine Check-ups: Prometheus polls Falco’s /metrics endpoint.

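  • Vital Signs Collected (names are illustrative; exact metric names depend on the Falco version, and the query in section 5.2 uses the falcosecurity_ prefix):

    • falco_events_total → how many reports Falco has filed.

    • falco_syscalls_total → how much workload Falco has handled.

    • up → is Falco alive and responding? Prometheus generates this series itself for every scrape target.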

  • Health Reports: Prometheus stores these as time-series metrics. It doesn’t know the contents of crime reports, only statistics about Falco’s activity.
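The check-up can be pictured as two steps: read the plain-text metrics page, then compare counters between two visits. The sketch below does both under simplifying assumptions: sample lines carry no timestamps, the scrape text is invented, and the counter name is taken from the list above rather than from a real Falco endpoint. The helpers parse_metrics and per_second_rate are hypothetical.

```python
def parse_metrics(text: str) -> dict:
    """Parse simple Prometheus text-format lines into {series: value}.

    Assumes "<series> <value>" lines without trailing timestamps.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, value = line.rsplit(" ", 1)
        samples[series] = float(value)
    return samples


def per_second_rate(prev: float, curr: float, interval_s: float) -> float:
    """Average per-second increase of a counter between two scrapes."""
    if curr < prev:
        # A counter that went down was reset (for example, Falco restarted),
        # so count from zero instead of producing a negative rate.
        prev = 0.0
    return (curr - prev) / interval_s


# Two invented scrapes, 15 seconds apart.
SCRAPE_1 = """
# HELP falco_events_total Illustrative counter of alerts
# TYPE falco_events_total counter
falco_events_total 120
"""
SCRAPE_2 = """
# HELP falco_events_total Illustrative counter of alerts
# TYPE falco_events_total counter
falco_events_total 150
"""

if __name__ == "__main__":
    before = parse_metrics(SCRAPE_1)["falco_events_total"]
    after = parse_metrics(SCRAPE_2)["falco_events_total"]
    print(per_second_rate(before, after, interval_s=15.0))
```

Prometheus performs the equivalent calculation itself when you use rate() in a query; the sketch only shows what that number means.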


🖥️ The Unified Command Center = Grafana + Falcosidekick

Grafana and the Falcosidekick UI bring both views together for the Police Chief:

  1. Investigation Board (Falcosidekick): View Falco’s detailed JSON alerts. Perfect for forensic analysis.

  2. Health Dashboard (Prometheus): View charts of Falco’s performance and workload, e.g.:

    • “Spike in high-priority alerts over the last hour?”

    • “Is Falco consuming too much CPU?”


✅ Conclusion: Why Separation is Powerful

This Investigator + Doctor model reflects industry best practice:

  • Loki → for detailed, text-rich, searchable event logs.

  • Prometheus → for fast, efficient, numerical health metrics.

By keeping logs and metrics separate but unifying them in Grafana, you get the best of both worlds:
🔹 Deep forensic detail for every incident.
🔹 Operational visibility into the reliability of your detection engine.


🛡️ Hands-On Study Guide: Runtime Security Monitoring with Falco & Grafana Stack


1. Prerequisites

  • A Kubernetes cluster (Minikube recommended: minikube start --cpus=4 --memory=8192)

  • Helm 3 installed

  • kubectl installed

  • Docker installed (for building sample apps)

  • Check out the following GitHub repository:

https://github.com/dhanuka84/grafana-prometheus-app-monitoring/tree/falco


2. Lab Setup

2.1 Start Minikube

./scripts/start-minikube.sh


2.2 Install Monitoring Stack

Deploy Prometheus, Grafana, and Loki:

./scripts/install.sh


Check:

kubectl get pods -n monitoring

✅ Expected: Pods for Prometheus, Grafana, and Loki running.



3. Install Falco & Falcosidekick

3.1 Key Configurations

  • Falco: eBPF driver, JSON output, Prometheus metrics, HTTP output → Falcosidekick.

  • Falcosidekick: Web UI enabled, Loki output.
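To show how these settings might translate into Helm values, here is a hedged Python sketch that flattens a nested values dictionary into --set flags. Every key and URL in ASSUMED_VALUES is an assumption about the Falco Helm chart and this lab's service names, not a copy of the repository's falco-values.yaml; treat that file and the chart's own documentation as the authority. The flatten and to_set_flags helpers are hypothetical.

```python
# Assumed chart keys and placeholder URLs: verify against the chart you deploy.
ASSUMED_VALUES = {
    "driver": {"kind": "ebpf"},
    "falco": {
        "json_output": True,
        "http_output": {
            "enabled": True,
            "url": "http://falco-falcosidekick:2801",  # hypothetical service name
        },
    },
    "metrics": {"enabled": True},
    "falcosidekick": {
        "enabled": True,
        "webui": {"enabled": True},
        "config": {"loki": {"hostport": "http://loki.monitoring:3100"}},  # hypothetical
    },
}


def flatten(values: dict, prefix: str = "") -> dict:
    """Turn nested dictionaries into dotted paths, e.g. falco.json_output."""
    flat = {}
    for key, value in values.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def to_set_flags(values: dict) -> list:
    """Render each flattened value as a Helm-style '--set path=value' flag."""
    flags = []
    for path, value in sorted(flatten(values).items()):
        rendered = str(value).lower() if isinstance(value, bool) else str(value)
        flags.append(f"--set {path}={rendered}")
    return flags


if __name__ == "__main__":
    for flag in to_set_flags(ASSUMED_VALUES):
        print(flag)
```

In practice a values file passed with -f is easier to review than a long list of --set flags, which is the approach the install script in this lab appears to take.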

3.2 Deploy

./scripts/falco-install.sh


✅ Expected:


  • Falco DaemonSet running in the namespace falco.

  • Falcosidekick service/UI available.


4. Access Dashboards

./scripts/port-forward.sh





5. Test Falco’s Custom Rules


Falco Helm values file used by this lab:

https://github.com/dhanuka84/grafana-prometheus-app-monitoring/blob/falco/scripts/falco-values.yaml



5.1 Simulate “Terminal shell in container”

kubectl -n falco run testbox --image=busybox:1.36 --restart=Never -it -- sh

# inside the pod:

ls

exit
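Why does this command trip the rule? The Python sketch below mimics a heavily simplified version of the check: a new process was started, it runs inside a container, it is a shell, and it has a terminal attached. This is not the real rule condition (Falco's rule uses its own condition language and additional exceptions); the ProcessEvent type, the shell list, and the function name are all hypothetical.

```python
from dataclasses import dataclass

# Illustrative list of shell binaries, not Falco's actual list.
SHELL_BINARIES = {"sh", "bash", "ash", "zsh", "ksh"}


@dataclass
class ProcessEvent:
    evt_type: str      # system call that produced the event, e.g. "execve"
    proc_name: str     # name of the started process
    container_id: str  # "host" when the process is not in a container
    has_tty: bool      # True when a terminal is attached


def is_terminal_shell_in_container(evt: ProcessEvent) -> bool:
    """Simplified stand-in for a 'Terminal shell in container' style check."""
    return (
        evt.evt_type in ("execve", "execveat")
        and evt.container_id != "host"
        and evt.proc_name in SHELL_BINARIES
        and evt.has_tty
    )


if __name__ == "__main__":
    # Roughly what "kubectl run ... -it -- sh" produces inside the pod.
    interactive_shell = ProcessEvent("execve", "sh", "3f2a9c1b7d10", True)
    print(is_terminal_shell_in_container(interactive_shell))
```

The -it flags matter: without an attached terminal, the same shell would not satisfy the last condition of this simplified check.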



5.2 Observe

  • Falcosidekick UI → real-time alert.




  • Grafana (Explore → Prometheus datasource) → query: falcosecurity_falco_cpu_usage_ratio{service="falco-metrics"}
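The same query can also be sent to Prometheus directly through its HTTP API. The sketch below builds the instant-query URL and reads values out of a response in the API's JSON shape. The base URL http://localhost:9090, the pod label, and the sample response values are assumptions for illustration; the actual port depends on what port-forward.sh exposes.

```python
import json
from urllib.parse import urlencode

QUERY = 'falcosecurity_falco_cpu_usage_ratio{service="falco-metrics"}'


def instant_query_url(base_url: str, query: str) -> str:
    """URL for Prometheus' /api/v1/query instant-query endpoint."""
    return f"{base_url}/api/v1/query?" + urlencode({"query": query})


def extract_values(response_text: str) -> dict:
    """Map each result's pod label to its current value.

    Series without a pod label share the key "unknown", so later ones
    overwrite earlier ones; good enough for a sketch.
    """
    body = json.loads(response_text)
    if body.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    return {
        result["metric"].get("pod", "unknown"): float(result["value"][1])
        for result in body["data"]["result"]
    }


# Invented response in the shape of an instant-vector result.
SAMPLE_RESPONSE = """
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {"service": "falco-metrics", "pod": "falco-abcde"},
        "value": [1758449730.0, "0.0123"]
      }
    ]
  }
}
"""

if __name__ == "__main__":
    print(instant_query_url("http://localhost:9090", QUERY))
    print(extract_values(SAMPLE_RESPONSE))
```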

