eBPF: The Kernel Superpower Transforming Observability and Networking
How eBPF lets you run sandboxed programs in the Linux kernel — and why it's revolutionizing monitoring, networking, and security in production infrastructure.
There’s a moment in every production debugging session where you wish you could just see inside the kernel. Watch syscalls in real time. Trace network packets without modifying your applications. Profile CPU hotspots without rebooting.
eBPF makes this possible — safely, at production scale, without touching a line of application code.
It’s not an exaggeration to say eBPF is one of the most impactful Linux technologies of the past decade. Understanding it will make you a better infrastructure engineer.
What Is eBPF?
eBPF (extended Berkeley Packet Filter) lets you run sandboxed programs inside the Linux kernel without changing kernel source code or loading kernel modules.
Think of it as a tiny virtual machine embedded in the kernel. You write a program in a restricted C-like language, compile it to eBPF bytecode, and attach it to a kernel hook point. When that hook fires — a syscall, a network packet, a function entry — your program runs.
The kernel verifies the program before loading it to ensure it:
- Cannot crash the kernel
- Will terminate (no infinite loops)
- Only accesses authorized memory
This safety model is what makes eBPF viable in production. You’re not patching the kernel; you’re injecting safe, verified probes.
Hook Points: Where eBPF Programs Attach
eBPF programs attach to specific events in the kernel:
kprobes / kretprobes
Attach to the entry or return of any kernel function.
# Trace the do_sys_open kernel function
bpftrace -e 'kprobe:do_sys_open { printf("open: %s\n", str(arg1)); }'
tracepoints
Stable, pre-defined instrumentation points in the kernel. More reliable than kprobes across kernel versions.
# Trace all syscall entries
bpftrace -e 'tracepoint:syscalls:sys_enter_* { @[probe] = count(); }'
uprobes
Attach to userspace function calls — profile application code without recompilation.
# Trace malloc calls in any process
bpftrace -e 'uprobe:/lib/x86_64-linux-gnu/libc.so.6:malloc { @allocs = count(); }'
XDP (eXpress Data Path)
Attach to the network driver, processing packets before they enter the kernel’s networking stack. Enables line-rate packet processing.
TC (Traffic Control)
Attach at the TC layer for packet filtering, shaping, and rewriting.
LSM (Linux Security Modules)
Attach to security hooks for mandatory access control enforcement.
eBPF Maps: Sharing Data
eBPF programs can’t return values in the traditional sense — they’re event-driven. Maps are the mechanism for:
- Storing state between invocations
- Passing data from kernel to userspace
- Sharing state between multiple eBPF programs
// Define a hash map in eBPF program
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __type(key, u32);
    __type(value, u64);
    __uint(max_entries, 1024);
} syscall_count SEC(".maps");
// Increment in kernel
u32 syscall_nr = ctx->id;
u64 *count = bpf_map_lookup_elem(&syscall_count, &syscall_nr);
if (count) {
    (*count)++;
} else {
    u64 one = 1;
    bpf_map_update_elem(&syscall_count, &syscall_nr, &one, BPF_ANY);
}
Userspace can read this map via the bpf() syscall, giving you a window into kernel-space data.
Practical Use Cases
1. Zero-Instrumentation Distributed Tracing
Tools like Pixie and Odigos use eBPF to capture HTTP requests, database queries, and gRPC calls — without any application changes or sidecars.
The eBPF program intercepts at the syscall level (read/write on sockets), parses protocol bytes, and exports spans to your tracing backend.
Application (unchanged)
→ write() syscall
→ eBPF hook captures HTTP request headers
→ Publishes to perf ring buffer
→ Userspace agent reads and exports to Jaeger/Tempo
2. Continuous CPU Profiling
Parca, Pyroscope, and Polar Signals use eBPF to capture stack traces at configurable intervals across all processes system-wide.
# Profile CPU usage across entire system for 30 seconds
profile-bpfcc -F 99 30
This produces flame graphs that reveal actual hotspots with low, fixed sampling overhead — no language-specific profilers to deploy, and no application restarts.
3. Network Performance Monitoring
Track TCP retransmits, connection latency, and packet drops at the kernel level:
# Watch TCP retransmits in real time
bpftrace -e 'kprobe:tcp_retransmit_skb {
    printf("retransmit: %s\n", comm);
}'
4. Security Enforcement
Falco uses eBPF to detect anomalous behavior: unexpected syscalls, privilege escalations, unusual network connections — all in real time.
Tetragon (from Cilium) goes further, providing security observability with the ability to block suspicious behavior at the kernel level using LSM hooks.
Cilium: eBPF for Kubernetes Networking
Cilium is the most prominent eBPF-powered production system. It replaces kube-proxy for Kubernetes networking, implementing:
- Service load balancing via XDP (instead of iptables)
- Network policies with identity-aware enforcement
- Hubble: network observability built on eBPF data
Why Replace iptables?
iptables uses linear rule matching — with 10,000 services, packet routing requires traversing 10,000 rules. eBPF hash maps provide O(1) lookup regardless of scale.
Benchmark results from large clusters show 5-10x better throughput and dramatically lower CPU usage on kube-proxy workloads when switching to Cilium.
# Cilium NetworkPolicy using identity labels
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-frontend-to-backend
spec:
  endpointSelector:
    matchLabels:
      app: backend
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend
The eBPF Tooling Ecosystem
bpftrace
High-level scripting language for one-liners and quick probes. Best for exploration and debugging.
# Count syscalls by process name
bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }'
BCC (BPF Compiler Collection)
Python/Lua frontend for writing more complex eBPF programs. Ships with dozens of ready-made tools (opensnoop, execsnoop, tcplife, and more).
libbpf + CO-RE
The recommended approach for production-grade eBPF programs. CO-RE (Compile Once, Run Everywhere) handles kernel version compatibility automatically.
bpftool
Inspect running eBPF programs, maps, and BTF type information:
bpftool prog list # List loaded programs
bpftool map dump id 42 # Dump map contents
bpftool net show # Show network-attached programs
Limitations and Gotchas
Stack size limit: eBPF programs have a 512-byte stack limit. Complex programs require careful design or use of per-CPU maps.
No unbounded loops: The verifier rejects programs that might loop forever. Use bounded loops or tail calls for long-running logic.
Kernel version requirements: Many features require recent kernels. XDP needs 4.8+, BTF type information landed in 4.18, and CO-RE needs a kernel built with BTF support (generally 5.2+). Check your minimum supported kernel before designing around eBPF.
Verifier complexity: The verifier can be overly conservative. Programs that look correct can be rejected; error messages aren’t always helpful.
Getting Started
The fastest path to eBPF observability without writing programs:
# Install bpftrace (Ubuntu/Debian)
apt install bpftrace
# Top processes by syscall count
bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }'
# Files opened system-wide
opensnoop-bpfcc
# New processes as they spawn
execsnoop-bpfcc
For Kubernetes, deploy Pixie (open source) for automatic service maps and request tracing with zero application changes.
eBPF is a paradigm shift. The choice is no longer between “instrument everything” (high overhead) and “observe nothing” (flying blind). eBPF gives you surgical, kernel-level visibility at negligible cost.
Once you’ve debugged a production issue with eBPF tracing that would have been invisible any other way, you’ll wonder how you ever lived without it.