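eBPF Tetragon Build and Debugging Guide

This article walks through building and debugging Tetragon, an eBPF-based tool, locally. It covers the Vagrant and VirtualBox setup, how to capture and analyze eBPF events with Tetragon and its debugging CLI tetra, and how to build Tetragon in Docker and debug it in a Kubernetes cluster.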

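Debugging prerequisites

According to the Tetragon GitHub documentation, debugging is done inside VirtualBox, so install Vagrant and VirtualBox before you start.

The Tetragon source tree ships a Vagrantfile. When the VirtualBox VM is started through Vagrant, all dependencies and tools are installed automatically. See the Tetragon source for details.

Starting the virtual machine

Start the VirtualBox VM and log in to it with the following commands: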

vagrant up
vagrant ssh

On some older Vagrant versions, vagrant up may fail. If it does, delete the disk setting and try again:

diff --git a/Vagrantfile b/Vagrantfile
--- a/Vagrantfile
+++ b/Vagrantfile
@@ -1,6 +1,5 @@
 Vagrant.configure("2") do |config|
   config.vm.box = "ubuntu/impish64"
-  config.vm.disk :disk, size: "50GB"
   config.vm.provision :docker
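
The same change can be applied without editing the file by hand. This is a minimal sketch, assuming the disk setting sits on a single line containing config.vm.disk, as in the diff above. The function name drop_disk_config is made up for illustration, and sed keeps a backup next to the file with a .bak suffix.

```shell
# Hypothetical helper: remove the disk setting from a Vagrantfile.
# Assumes the setting is a single line that contains "config.vm.disk".
drop_disk_config() {
  file=${1:-Vagrantfile}
  # -i.bak edits in place and keeps the original as <file>.bak
  sed -i.bak '/config\.vm\.disk/d' "$file"
}

# Usage, from the root of the Tetragon source tree:
#   drop_disk_config Vagrantfile && vagrant up
```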

Local build

  • First, install the LLVM and libbpf dependencies:
make tools-install
  • Then build the BPF programs and the Go programs:
LD_LIBRARY_PATH=$(realpath ./lib) make

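This command builds every executable and test program. To save time while debugging, you can build only tetra and tetragon:

LD_LIBRARY_PATH=$(realpath ./lib) make tetra tetragon

The tetragon and tetra binaries now exist locally. Their roles are explained in the sections below.

ls tetra*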
tetra  tetragon  tetragon-alignchecker
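
When the build is part of a script, the ls check can be made explicit. This is a small sketch; check_built is a hypothetical helper name, and it only verifies that the two binaries named above exist and are executable in the given directory.

```shell
# Hypothetical helper: verify that the tetra and tetragon binaries were built.
# Takes the build directory as its argument (default: current directory).
check_built() {
  dir=${1:-.}
  status=0
  for bin in tetra tetragon; do
    if [ -x "$dir/$bin" ]; then
      echo "ok: $dir/$bin"
    else
      echo "missing: $dir/$bin" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage:
#   LD_LIBRARY_PATH=$(realpath ./lib) make tetra tetragon && check_built .
```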

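Local debugging

Starting tetragon

When starting tetragon, specify the directory that holds the eBPF .o files with --bpf-lib, and point LD_LIBRARY_PATH at the Tetragon libraries: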

sudo LD_LIBRARY_PATH=$(realpath ./lib) ./tetragon --bpf-lib bpf/objs

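Once started, tetragon loads the eBPF programs and waits for a listener to connect on the configured address (localhost:54321 by default). After a listener connects, tetragon forwards the events it observes to that listener.

The tetragon startup log looks like this: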
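
If tetragon and the listener are started from the same script, it can help to wait until the address accepts connections before launching the listener. The sketch below is one way to do that and is not part of Tetragon. The function name wait_for_port is hypothetical, and the /dev/tcp redirection is a bash feature, so the script must run under bash.

```shell
# Hypothetical helper: poll a TCP address until it accepts a connection.
# Relies on bash's /dev/tcp pseudo-device; other shells will always time out.
wait_for_port() {
  host=$1
  port=$2
  tries=${3:-30}
  i=0
  while [ "$i" -lt "$tries" ]; do
    # Open the connection in a subshell so the descriptor closes right away.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Usage, with the default address from the text above:
#   wait_for_port localhost 54321 && ./tetra getevents
```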

time="2022-05-29T10:37:56Z" level=info msg="Starting tetragon" version=v0.8.0-106-g5c3fd60
time="2022-05-29T10:37:56Z" level=info msg="config settings" 
     config="map[bpf-lib:bpf/objs btf: cilium-bpf: config-dir: config-file: debug:false enable-cilium-api:false enable-export-aggregation:false enable-k8s-api:false enable-process-ancestors:true enable-process-cred:false enable-process-ns:false export-aggregation-buffer-size:10000 export-aggregation-window-size:15s export-allowlist: export-denylist: export-file-compress:false export-file-max-backups:5 export-file-max-size-mb:10 export-file-rotation-interval:0s export-filename: export-rate-limit:-1 force-small-progs:false ignore-missing-progs:false kernel: log-format:text log-level:info metrics-server: netns-dir:/var/run/docker/netns/ process-cache-size:65536 procfs:/proc/ run-standalone:false server-address:localhost:54321 verbose:0]"
time="2022-05-29T10:37:56Z" level=info msg="Available sensors" sensors=
time="2022-05-29T10:37:56Z" level=info msg="Registered tracing sensors" sensors="kprobe sensor, tracepoint sensor"
time="2022-05-29T10:37:56Z" level=info msg="Registered probe types" types="tracepoint sensor, kprobe sensor"
time="2022-05-29T10:37:56Z" level=info msg="Disabling Kubernetes API"
time="2022-05-29T10:37:56Z" level=info msg="Disabling Cilium API"
time="2022-05-29T10:37:56Z" level=info msg="Starting process manager" enableCilium=false enableEventCache=false enableProcessCred=false enableProcessNs=false
time="2022-05-29T10:37:56Z" level=info msg="Exporter configuration" enabled=false fileName=
time="2022-05-29T10:37:56Z" level=info msg="Using metadata file" metadata=
time="2022-05-29T10:37:56Z" level=info msg="Loading sensor" name=__main__
time="2022-05-29T10:37:56Z" level=info msg="Loading kernel version 5.13.19"
time="2022-05-29T10:37:56Z" level=info msg="Starting gRPC server" address="localhost:54321"
time="2022-05-29T10:37:56Z" level=info msg="tetragon, map loaded." map=execve_map path=/sys/fs/bpf/tcpmon/execve_map sensor=__main__
time="2022-05-29T10:37:56Z" level=info msg="tetragon, map loaded." map=execve_map_stats path=/sys/fs/bpf/tcpmon/execve_map_stats sensor=__main__
time="2022-05-29T10:37:56Z" level=info msg="tetragon, map loaded." map=names_map path=/sys/fs/bpf/tcpmon/names_map sensor=__main__
time="2022-05-29T10:37:56Z" level=info msg="tetragon, map loaded." map=tcpmon_map path=/sys/fs/bpf/tcpmon/tcpmon_map sensor=__main__
time="2022-05-29T10:37:56Z" level=info msg="BPF prog was loaded" label=tracepoint/sys_exit prog=bpf/objs/bpf_exit.o
time="2022-05-29T10:37:56Z" level=info msg="BPF prog was loaded" label=kprobe/wake_up_new_task prog=bpf/objs/bpf_fork.o
time="2022-05-29T10:37:56Z" level=info msg="Load probe" Program=bpf/objs/bpf_execve_event_v53.o Type=execve
time="2022-05-29T10:37:57Z" level=info msg="Read ProcFS /proc/ appended 74/218 entries"
time="2022-05-29T10:37:57Z" level=warning msg="Procfs execve event pods/ identifier error" error="open /proc/0/cgroup: no such file or directory"
time="2022-05-29T10:37:57Z" level=info msg="BPF prog was loaded" label=tracepoint/sys_execve prog=bpf/objs/bpf_execve_event_v53.o
time="2022-05-29T10:37:57Z" level=info msg="Loaded BPF maps and events for sensor successfully" sensor=__main__
time="2022-05-29T10:37:57Z" level=info msg="Listening for events..."

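The log shows the config settings tetragon started with (second line), the tetragon version, and so on.

Changing the default log level

The config settings show that the default log level is info. To change it to debug, add the --log-level=debug flag at startup: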
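
The config line is long, so a single setting is easier to read when extracted. This is a rough sketch that assumes the log was saved to a file and that the settings appear as space-separated key:value pairs inside config="map[...]", as in the log above. config_value and tetragon.log are hypothetical names.

```shell
# Hypothetical helper: print one setting from the "config settings" log line.
# Reads the log on stdin; the key is matched only when preceded by "[" or a space.
config_value() {
  key=$1
  sed -n "/config=\"map\[/ s/.*[[ ]${key}:\([^] ]*\).*/\1/p"
}

# Usage, assuming the startup log was saved as tetragon.log:
#   config_value log-level < tetragon.log
#   config_value server-address < tetragon.log
```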

sudo LD_LIBRARY_PATH=$(realpath ./lib) ./tetragon --bpf-lib bpf/objs --log-level=debug

Starting the listener: tetra

tetra is Tetragon's CLI debugging tool. Run tetra with no arguments to see its commands and flags:

Tetragon CLI

Usage:
  tetra [flags]
  tetra [command]

Available Commands:
  bugtool         Produce a tar archive with debug information
  getevents       Print events
  help            Help about any command
  sensors         Manage sensors
  stacktrace-tree Manage stacktrace trees
  status          Print health status
  tracingpolicy   Manage tracing policies
  version         Print version

Flags:
  -d, --debug                   Enable debug messages
  -h, --help                    help for tetra
      --server-address string   gRPC server address (default "localhost:54321")

Use "tetra [command] --help" for more information about a command.

tetra getevents retrieves the eBPF events sent by tetragon. By default tetra prints JSON, for example:

 $ ./tetra getevents | jq "."
{
  "process_exec": {
    "process": {
      "exec_id": "OjI0Njk4Mjk4ODYxNDI6MTI0NDE=",
      "pid": 12441,
      "uid": 0,
      "cwd": "/run/containerd/io.containerd.runtime.v2.task/k8s.io/25ecffd7e2ee8332923500bd930eaebe2725bb643e7e9e2206b611f8666abac0/",
      "binary": "/usr/local/sbin/runc",
      "arguments": "--root /run/containerd/runc/k8s.io --log /run/containerd/io.containerd.runtime.v2.task/k8s.io/c7c9f733b618c7b46f0bd86e054fc5d940fbd7eb627cf009d6d9cd0584c846a6/log.json --log-format json exec --process /tmp/runc-process789407863 --detach --pid-file /run/containerd/io.containerd.runtime.v2.task/k8s.io/c7c9f733b618c7b46f0bd86e054fc5d940fbd7eb627cf009d6d9cd0584c846a6/4c9ea3805e0d2a23de4c0cdb983c52db832689aa821d3363ba8954b1b76359d9.pid c7c9f733b618c7b46f0bd86e054fc5d940fbd7eb627cf009d6d9cd0584c846a6",
      "flags": "execve clone",
      "start_time": "2022-05-29T10:54:19.452Z",
      "auid": 4294967295,
      "parent_exec_id": "OjMzMzgwMDAwMDAwOjI0NzM=",
      "refcnt": 1
    },
    "parent": {
      "exec_id": "OjMzMzgwMDAwMDAwOjI0NzM=",
      "pid": 2473,
      "uid": 0,
      "cwd": "/run/containerd/io.containerd.runtime.v2.task/k8s.io/25ecffd7e2ee8332923500bd930eaebe2725bb643e7e9e2206b611f8666abac0",
      "binary": "/usr/local/bin/containerd-shim-runc-v2",
      "arguments": "-namespace k8s.io -id 25ecffd7e2ee8332923500bd930eaebe2725bb643e7e9e2206b611f8666abac0 -address /run/containerd/containerd.sock",
      "flags": "procFS auid",
      "start_time": "2022-05-29T10:13:43.002Z",
      "auid": 0,
      "parent_exec_id": "OjIyODIwMDAwMDAwOjExOTE=",
      "refcnt": 4294967265
    }
  },
  "time": "2022-05-29T10:54:19.452Z"
}
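
Since the events are JSON, jq can reduce each one to the fields of interest. The sketch below prints the pid and binary of every process_exec event, using only field names that appear in the sample event above. exec_summary is a hypothetical function name.

```shell
# Hypothetical helper: print "<pid> <binary>" for each process_exec event.
# Reads JSON events on stdin and skips events of any other type.
exec_summary() {
  jq -r 'select(.process_exec != null)
         | .process_exec.process
         | "\(.pid) \(.binary)"'
}

# Usage:
#   ./tetra getevents | exec_summary
```

For the sample event above this prints the pid followed by /usr/local/sbin/runc.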

tetra getevents --output compact produces friendlier output:

./tetra getevents --output compact
🚀 process  /usr/sbin/iptables -w 5 -W 100000 -S KUBE-KUBELET-CANARY -t mangle 
💥 exit     /usr/sbin/iptables -w 5 -W 100000 -S KUBE-KUBELET-CANARY -t mangle 0 
🚀 process  /usr/sbin/ip6tables -w 5 -W 100000 -S KUBE-KUBELET-CANARY -t mangle 
💥 exit     /usr/sbin/ip6tables -w 5 -W 100000 -S KUBE-KUBELET-CANARY -t mangle 0 

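Building in Docker

The commands for building in Docker are shown below. Building in Docker also packages the Docker images by default; see the Makefile for details.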

# Build Tetragon agent and operator images
LD_LIBRARY_PATH=$(realpath ./lib) make LOCAL_CLANG=0 image image-operator

# Bootstrap the cluster
contrib/localdev/bootstrap-kind-cluster.sh

# Install Tetragon
contrib/localdev/install-tetragon.sh --image cilium/tetragon:latest --operator cilium/tetragon-operator:latest

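The whole process is fairly slow, because it installs the kind tool and uses kind to set up a single-node k8s cluster.

After the build, tetragon is deployed in the k8s cluster and can be inspected with kubectl: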

kubectl get pods -n kube-system
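
Because the deployment takes a while, a script may want to block until the pods are ready instead of polling the pod list by hand. This is a sketch that assumes the DaemonSet is named tetragon in kube-system, matching the kubectl logs command used later in this article. wait_for_tetragon and the KUBECTL override are illustrative names.

```shell
# Allow the kubectl binary to be overridden, e.g. for testing.
KUBECTL=${KUBECTL:-kubectl}

# Hypothetical helper: block until the tetragon DaemonSet has rolled out.
# The argument is the timeout passed to kubectl (default: 120s).
wait_for_tetragon() {
  timeout=${1:-120s}
  "$KUBECTL" rollout status -n kube-system ds/tetragon --timeout="$timeout"
}

# Usage:
#   wait_for_tetragon 300s && kubectl get pods -n kube-system
```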

Debugging in a k8s cluster

Once tetragon is deployed in the k8s cluster, use the command below to watch the reported events. The output is verbose (JSON), so you can filter it with jq.

kubectl logs -n kube-system ds/tetragon -c export-stdout -f
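
As one example of such filtering, the sketch below keeps only the process_exec events whose binary has a given file name. It assumes the exported events have the same shape as the tetra getevents sample earlier in this article. filter_exec_by_binary is a hypothetical helper name.

```shell
# Hypothetical helper: keep process_exec events whose binary ends in /<name>.
# Reads JSON events on stdin and writes matching events as compact JSON.
filter_exec_by_binary() {
  name=$1
  jq -c --arg name "$name" '
    select(.process_exec.process.binary != null)
    | select(.process_exec.process.binary | endswith("/" + $name))'
}

# Usage:
#   kubectl logs -n kube-system ds/tetragon -c export-stdout -f | filter_exec_by_binary runc
```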

Reference: tetragon.