
Debug gtp5g kernel module using stacktrace and eBPF

Note

Author: Ian Chen
Date: 2024/12/24


Case Study - kernel panic caused by the online charging PDU Session

free5GC relies heavily on the infrastructure provided by the Linux kernel, especially the gtp5g kernel module.

@andy89923 found a reproducible kernel panic.
The following steps always trigger it:

  • Create an online charging PDU Session
  • Ping the Data Network (the traffic should match the IP filter of the charging configuration)

Note that the kernel panic only happens with gtp5g versions newer than v0.8.x.

Figure out the problem

We can get the panic log with dmesg, but the raw stack dump is not very useful for kernel debugging on its own, because it only shows symbol names and offsets.

Fortunately, the decode_stacktrace.sh script shipped with the kernel source can map each frame to a specific line of source code by leveraging vmlinux.
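Each frame in a call trace has the form symbol+offset/size, optionally followed by the owning module in brackets, and this is the information a decoder needs before it can look up a source line. As a rough illustration only (this is not how decode_stacktrace.sh itself is implemented, and the names trace_frame and parse_trace_frame are made up for this sketch), a frame can be split into its parts like this:

```c
#include <stdio.h>
#include <string.h>

/* One call-trace frame, e.g. "gtp5g_dev_xmit+0xc3/0x170 [gtp5g]". */
struct trace_frame {
    char symbol[64];       /* function name */
    unsigned long offset;  /* offset of the address inside the function */
    unsigned long size;    /* total size of the function in bytes */
    char module[32];       /* owning module; empty for built-in kernel code */
};

/*
 * Parse "symbol+0xOFF/0xSIZE [module]" into *out.
 * Returns 0 on success and -1 if the text is not a frame (e.g. "<IRQ>").
 */
int parse_trace_frame(const char *text, struct trace_frame *out)
{
    memset(out, 0, sizeof(*out));

    /* Skip indentation and the "?" that marks an unreliable frame. */
    while (*text == ' ' || *text == '?')
        text++;

    /* The module part is optional, so three converted fields are enough. */
    int fields = sscanf(text, "%63[^+]+0x%lx/0x%lx [%31[^]]]",
                        out->symbol, &out->offset, &out->size, out->module);
    return fields >= 3 ? 0 : -1;
}
```

Frames whose module field is non-empty (gtp5g in our case) are the ones that need the module's own build directory to be resolved, which is why the decoder is given a modules path in addition to vmlinux.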

The original panic logs:

[  +0.004968] ------------[ cut here ]------------
[  +0.000002] kernel BUG at mm/slub.c:307!
[  +0.000109] invalid opcode: 0000 [#1] SMP PTI
[  +0.000056] CPU: 3 PID: 191301 Comm: nrf Tainted: G           OE     5.4.0-131-generic #147-Ubuntu
[  +0.000068] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[  +0.000047] RIP: 0010:kfree+0x236/0x250
[  +0.000048] Code: e7 e8 9e 71 fd ff e9 ef fe ff ff 4d 89 f1 41 b8 01 00 00 00 48 89 d9 48 89 da 4c 89 e6 4c 89 ef e8 6f fa ff ff e9 d0 fe ff ff <0f> 0b 48 8b 05 d1 51 77 01 e9 ff fd ff ff 66 66 2e 0f 1f 84 00 00
[  +0.000108] RSP: 0000:ffffa104c015c7f0 EFLAGS: 00010246
[  +0.000018] RAX: ffff93e58bc98000 RBX: ffff93e58bc98000 RCX: ffff93e58bc98000
[  +0.000017] RDX: 0000000000039962 RSI: bdd6aff4c23d967a RDI: ffff93e58bc98000
[  +0.000017] RBP: ffffa104c015c810 R08: ffff93e58bc98000 R09: ffffa104c015c8d8
[  +0.000018] R10: ffff93e5d302c680 R11: 0000000000000001 R12: fffffc7d8c2f2600
[  +0.000018] R13: ffff93e6adc06bc0 R14: ffffffff99edcf25 R15: ffff93e565a70600
[  +0.000017] FS:  000000c000580090(0000) GS:ffff93e6afac0000(0000) knlGS:0000000000000000
[  +0.000020] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0.000014] CR2: 00007fac6fecf160 CR3: 000000034857c001 CR4: 0000000000760ee0
[  +0.000026] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  +0.000018] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  +0.000021] PKRU: 55555554
[  +0.000007] Call Trace:
[  +0.000014]  <IRQ>
[  +0.000031]  skb_free_head+0x25/0x30
[  +0.000018]  skb_release_data+0x11d/0x180
[  +0.000015]  skb_release_all+0x26/0x30
[  +0.000015]  consume_skb+0x2c/0xb0
[  +0.000046]  gtp5g_dev_xmit+0xc3/0x170 [gtp5g]
[  +0.000016]  ? update_load_avg+0x7c/0x670
[  +0.000017]  dev_hard_start_xmit+0x91/0x1f0
[  +0.000019]  __dev_queue_xmit+0x75f/0x990
[  +0.000016]  ? nfnetlink_has_listeners+0x15/0x20 [nfnetlink]
[  +0.000016]  dev_queue_xmit+0x10/0x20
[  +0.000014]  neigh_direct_output+0x11/0x20
[  +0.000019]  ip_finish_output2+0x17e/0x580
[  +0.000016]  __ip_finish_output+0xf3/0x270
[  +0.000017]  ip_finish_output+0x2d/0xb0
[  +0.000018]  ip_output+0x75/0xf0
[  +0.000010]  ? __ip_finish_output+0x270/0x270
[  +0.000013]  ip_forward_finish+0x58/0x90
[  +0.000012]  ip_forward+0x3b9/0x4c0
[  +0.000010]  ? ip4_key_hashfn+0xb0/0xb0
[  +0.000012]  ip_sublist_rcv_finish+0x3d/0x50
[  +0.000021]  ip_sublist_rcv+0x1c5/0x270
[  +0.000956]  ? ip_rcv_finish_core.isra.0+0x3c0/0x3c0
[  +0.000637]  ip_list_rcv+0x10b/0x130
[  +0.000678]  __netif_receive_skb_list_core+0x228/0x250
[  +0.000576]  netif_receive_skb_list_internal+0x1a1/0x2b0
[  +0.000572]  gro_normal_list.part.0+0x1e/0x40
[  +0.000524]  napi_complete_done+0x91/0x130
[  +0.000557]  virtnet_poll+0x30d/0x450 [virtio_net]
[  +0.000558]  net_rx_action+0x142/0x390
[  +0.000598]  __do_softirq+0xd1/0x2c1
[  +0.000559]  irq_exit+0xae/0xb0
[  +0.000500]  do_IRQ+0x5a/0xf0
[  +0.000504]  common_interrupt+0xf/0xf
[  +0.000488]  </IRQ>
[  +0.000467] RIP: 0033:0x423172
[  +0.000467] Code: 23 4c 89 44 24 38 e8 8d 46 ff ff 48 85 f6 0f 84 a0 00 00 00 48 8b 94 24 88 00 00 00 49 89 f1 48 8b 74 24 48 4d 89 c8 4d 8b 09 <49> 29 d0 4d 85 c9 74 b0 4d 89 ca 49 29 d1 4c 39 ce 77 a5 4c 89 44
[  +0.001066] RSP: 002b:000000c000593e90 EFLAGS: 00000202 ORIG_RAX: ffffffffffffffdb
[  +0.000517] RAX: 000000c000387200 RBX: 0000000000018e00 RCX: 5000000000000000
[  +0.000461] RDX: 000000c000380000 RSI: 0000000000020000 RDI: 0000000000000040
[  +0.000590] RBP: 000000c000593f08 R08: 000000c0003873d0 R09: 0000000000d17b82
[  +0.000601] R10: 0000000000d1ce8e R11: 0000000000018e00 R12: 0000000000000001
[  +0.000619] R13: 13679e0f6eacd19f R14: 000000c0005821a0 R15: 0000000000000000
[  +0.000461] Modules linked in: sctp vxlan xt_multiport xt_set ipt_rpfilter iptable_raw ip_set_hash_ip ip_set_hash_net ip_set wireguard ip6_udp_tunnel veth xfrm_user xfrm_algo nf_conntrack_netlink xt_addrtype xt_nat xt_tcpudp xt_MASQUERADE xt_mark xt_conntrack iptable_mangle ip6table_filter ip6table_mangle ip6table_nat ip6_tables iptable_nat nf_nat br_netfilter bridge stp llc nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nft_compat nf_tables nfnetlink iptable_filter xt_comment bpfilter aufs overlay dummy dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua binfmt_misc intel_rapl_msr intel_rapl_common isst_if_common nfit kvm_intel kvm rapl joydev input_leds serio_raw mac_hid qemu_fw_cfg sch_fq_codel gtp5g(OE) sunrpc ramoops udp_tunnel reed_solomon efi_pstore ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid bochs_drm drm_vram_helper ttm
[  +0.000189]  drm_kms_helper crct10dif_pclmul crc32_pclmul syscopyarea sysfillrect ghash_clmulni_intel sysimgblt aesni_intel crypto_simd cryptd glue_helper fb_sys_fops virtio_net psmouse net_failover drm failover virtio_scsi i2c_piix4 pata_acpi floppy
[  +0.005947] ---[ end trace e017af78fce65824 ]---

You can follow the commands below to make the panic logs more human-friendly:

$ sudo apt install linux-source-5.4.0
$ cd /usr/src/linux-source-5.4.0
$ sudo make -j$(nproc) vmlinux # only needed if you don't have a vmlinux file yet
$ sudo ./scripts/decode_stacktrace.sh ./vmlinux ./ ~/gtp5g/ < ~/panic.log  > ~/out.log

The output will look like:

[  +0.004968] ------------[ cut here ]------------
[  +0.000002] kernel BUG at mm/slub.c:307!
[  +0.000109] invalid opcode: 0000 [#1] SMP PTI
[  +0.000056] CPU: 3 PID: 191301 Comm: nrf Tainted: G           OE     5.4.0-131-generic #147-Ubuntu
[  +0.000068] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[   +0.000047] RIP: 0010:kfree (/usr/src/linux-source-5.4.0/mm/slub.c:307 /usr/src/linux-source-5.4.0/mm/slub.c:302 /usr/src/linux-source-5.4.0/mm/slub.c:3035 /usr/src/linux-source-5.4.0/mm/slub.c:3060 /usr/src/linux-source-5.4.0/mm/slub.c:4027)
[ +0.000048] Code: e7 e8 9e 71 fd ff e9 ef fe ff ff 4d 89 f1 41 b8 01 00 00 00 48 89 d9 48 89 da 4c 89 e6 4c 89 ef e8 6f fa ff ff e9 d0 fe ff ff <0f> 0b 48 8b 05 d1 51 77 01 e9 ff fd ff ff 66 66 2e 0f 1f 84 00 00
All code
========
   0:   e7 e8                   out    %eax,$0xe8
   2:   9e                      sahf
   3:   71 fd                   jno    0x2
   5:   ff                      (bad)
   6:   e9 ef fe ff ff          jmpq   0xfffffffffffffefa
   b:   4d 89 f1                mov    %r14,%r9
   e:   41 b8 01 00 00 00       mov    $0x1,%r8d
  14:   48 89 d9                mov    %rbx,%rcx
  17:   48 89 da                mov    %rbx,%rdx
  1a:   4c 89 e6                mov    %r12,%rsi
  1d:   4c 89 ef                mov    %r13,%rdi
  20:   e8 6f fa ff ff          callq  0xfffffffffffffa94
  25:   e9 d0 fe ff ff          jmpq   0xfffffffffffffefa
  2a:*  0f 0b                   ud2         <-- trapping instruction
  2c:   48 8b 05 d1 51 77 01    mov    0x17751d1(%rip),%rax        # 0x1775204
  33:   e9 ff fd ff ff          jmpq   0xfffffffffffffe37
  38:   66                      data16
  39:   66                      data16
  3a:   2e                      cs
  3b:   0f                      .byte 0xf
  3c:   1f                      (bad)
  3d:   84 00                   test   %al,(%rax)
    ...

Code starting with the faulting instruction
===========================================
   0:   0f 0b                   ud2
   2:   48 8b 05 d1 51 77 01    mov    0x17751d1(%rip),%rax        # 0x17751da
   9:   e9 ff fd ff ff          jmpq   0xfffffffffffffe0d
   e:   66                      data16
   f:   66                      data16
  10:   2e                      cs
  11:   0f                      .byte 0xf
  12:   1f                      (bad)
  13:   84 00                   test   %al,(%rax)
    ...
[  +0.000108] RSP: 0000:ffffa104c015c7f0 EFLAGS: 00010246
[  +0.000018] RAX: ffff93e58bc98000 RBX: ffff93e58bc98000 RCX: ffff93e58bc98000
[  +0.000017] RDX: 0000000000039962 RSI: bdd6aff4c23d967a RDI: ffff93e58bc98000
[  +0.000017] RBP: ffffa104c015c810 R08: ffff93e58bc98000 R09: ffffa104c015c8d8
[  +0.000018] R10: ffff93e5d302c680 R11: 0000000000000001 R12: fffffc7d8c2f2600
[  +0.000018] R13: ffff93e6adc06bc0 R14: ffffffff99edcf25 R15: ffff93e565a70600
[  +0.000017] FS:  000000c000580090(0000) GS:ffff93e6afac0000(0000) knlGS:0000000000000000
[  +0.000020] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0.000014] CR2: 00007fac6fecf160 CR3: 000000034857c001 CR4: 0000000000760ee0
[  +0.000026] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  +0.000018] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  +0.000021] PKRU: 55555554
[  +0.000007] Call Trace:
[  +0.000014]  <IRQ>
[   +0.000031] skb_free_head (/usr/src/linux-source-5.4.0/net/core/skbuff.c:602)
[   +0.000018] skb_release_data (/usr/src/linux-source-5.4.0/net/core/skbuff.c:622)
[   +0.000015] skb_release_all (/usr/src/linux-source-5.4.0/net/core/skbuff.c:676)
[   +0.000015] consume_skb (/usr/src/linux-source-5.4.0/net/core/skbuff.c:690 /usr/src/linux-source-5.4.0/net/core/skbuff.c:848)
[   +0.000046] gtp5g_dev_xmit (/home/ianchen0119/gtp5g/src/gtpu/dev.c:136) gtp5g
[   +0.000016] ? update_load_avg (/usr/src/linux-source-5.4.0/kernel/sched/fair.c:3388 /usr/src/linux-source-5.4.0/kernel/sched/fair.c:3602)
[   +0.000017] dev_hard_start_xmit (/usr/src/linux-source-5.4.0/./include/linux/prandom.h:58 /usr/src/linux-source-5.4.0/net/core/dev.c:3216 /usr/src/linux-source-5.4.0/net/core/dev.c:3234)
[   +0.000019] __dev_queue_xmit (/usr/src/linux-source-5.4.0/./include/net/sch_generic.h:179 /usr/src/linux-source-5.4.0/net/core/dev.c:3453 /usr/src/linux-source-5.4.0/net/core/dev.c:3765)
[   +0.000016] ? nfnetlink_has_listeners+0x15/0x20 nfnetlink
[   +0.000016] dev_queue_xmit (/usr/src/linux-source-5.4.0/net/core/dev.c:3834)
[   +0.000014] neigh_direct_output (/usr/src/linux-source-5.4.0/net/core/neighbour.c:1548)
[   +0.000019] ip_finish_output2 (/usr/src/linux-source-5.4.0/./include/net/neighbour.h:510 /usr/src/linux-source-5.4.0/net/ipv4/ip_output.c:236)
[   +0.000016] __ip_finish_output (/usr/src/linux-source-5.4.0/net/ipv4/ip_output.c:317)
[   +0.000017] ip_finish_output (/usr/src/linux-source-5.4.0/net/ipv4/ip_output.c:326)
[   +0.000018] ip_output (/usr/src/linux-source-5.4.0/net/ipv4/ip_output.c:444)
[   +0.000010] ? __ip_finish_output (/usr/src/linux-source-5.4.0/net/ipv4/ip_output.c:320)
[   +0.000013] ip_forward_finish (/usr/src/linux-source-5.4.0/net/ipv4/ip_forward.c:84)
[   +0.000012] ip_forward (/usr/src/linux-source-5.4.0/./include/linux/netfilter.h:300 /usr/src/linux-source-5.4.0/net/ipv4/ip_forward.c:157)
[   +0.000010] ? ip4_key_hashfn (/usr/src/linux-source-5.4.0/net/ipv4/ip_forward.c:66)
[   +0.000012] ip_sublist_rcv_finish (/usr/src/linux-source-5.4.0/net/ipv4/ip_input.c:539)
[   +0.000021] ip_sublist_rcv (/usr/src/linux-source-5.4.0/net/ipv4/ip_input.c:588)
[   +0.000956] ? ip_rcv_finish_core.isra.0 (/usr/src/linux-source-5.4.0/net/ipv4/ip_input.c:407)
[   +0.000637] ip_list_rcv (/usr/src/linux-source-5.4.0/net/ipv4/ip_input.c:622)
[   +0.000678] __netif_receive_skb_list_core (/usr/src/linux-source-5.4.0/net/core/dev.c:5014 /usr/src/linux-source-5.4.0/net/core/dev.c:5062)
[   +0.000576] netif_receive_skb_list_internal (/usr/src/linux-source-5.4.0/net/core/dev.c:5116 /usr/src/linux-source-5.4.0/net/core/dev.c:5209)
[   +0.000572] gro_normal_list.part.0 (/usr/src/linux-source-5.4.0/./include/linux/compiler.h:295 /usr/src/linux-source-5.4.0/./include/linux/list.h:28 /usr/src/linux-source-5.4.0/net/core/dev.c:5321)
[   +0.000524] napi_complete_done (/usr/src/linux-source-5.4.0/net/core/dev.c:6063 (discriminator 1) /usr/src/linux-source-5.4.0/net/core/dev.c:6051 (discriminator 1))
[   +0.000557] virtnet_poll+0x30d/0x450 virtio_net
[   +0.000558] net_rx_action (/usr/src/linux-source-5.4.0/net/core/dev.c:6366 /usr/src/linux-source-5.4.0/net/core/dev.c:6436)
[   +0.000598] __do_softirq (/usr/src/linux-source-5.4.0/./arch/x86/include/asm/jump_label.h:25 /usr/src/linux-source-5.4.0/./include/linux/jump_label.h:200 /usr/src/linux-source-5.4.0/./include/trace/events/irq.h:142 /usr/src/linux-source-5.4.0/kernel/softirq.c:293)
[   +0.000559] irq_exit (/usr/src/linux-source-5.4.0/kernel/softirq.c:373 /usr/src/linux-source-5.4.0/kernel/softirq.c:413)
[   +0.000500] do_IRQ (/usr/src/linux-source-5.4.0/arch/x86/kernel/irq.c:267 (discriminator 42))
[   +0.000504] common_interrupt (/usr/src/linux-source-5.4.0/arch/x86/entry/entry_64.S:613)
[  +0.000488]  </IRQ>
[  +0.000467] RIP: 0033:0x423172
[ +0.000467] Code: 23 4c 89 44 24 38 e8 8d 46 ff ff 48 85 f6 0f 84 a0 00 00 00 48 8b 94 24 88 00 00 00 49 89 f1 48 8b 74 24 48 4d 89 c8 4d 8b 09 <49> 29 d0 4d 85 c9 74 b0 4d 89 ca 49 29 d1 4c 39 ce 77 a5 4c 89 44
All code
========
   0:   23 4c 89 44             and    0x44(%rcx,%rcx,4),%ecx
   4:   24 38                   and    $0x38,%al
   6:   e8 8d 46 ff ff          callq  0xffffffffffff4698
   b:   48 85 f6                test   %rsi,%rsi
   e:   0f 84 a0 00 00 00       je     0xb4
  14:   48 8b 94 24 88 00 00    mov    0x88(%rsp),%rdx
  1b:   00
  1c:   49 89 f1                mov    %rsi,%r9
  1f:   48 8b 74 24 48          mov    0x48(%rsp),%rsi
  24:   4d 89 c8                mov    %r9,%r8
  27:   4d 8b 09                mov    (%r9),%r9
  2a:*  49 29 d0                sub    %rdx,%r8     <-- trapping instruction
  2d:   4d 85 c9                test   %r9,%r9
  30:   74 b0                   je     0xffffffffffffffe2
  32:   4d 89 ca                mov    %r9,%r10
  35:   49 29 d1                sub    %rdx,%r9
  38:   4c 39 ce                cmp    %r9,%rsi
  3b:   77 a5                   ja     0xffffffffffffffe2
  3d:   4c                      rex.WR
  3e:   89                      .byte 0x89
  3f:   44                      rex.R

Code starting with the faulting instruction
===========================================
   0:   49 29 d0                sub    %rdx,%r8
   3:   4d 85 c9                test   %r9,%r9
   6:   74 b0                   je     0xffffffffffffffb8
   8:   4d 89 ca                mov    %r9,%r10
   b:   49 29 d1                sub    %rdx,%r9
   e:   4c 39 ce                cmp    %r9,%rsi
  11:   77 a5                   ja     0xffffffffffffffb8
  13:   4c                      rex.WR
  14:   89                      .byte 0x89
  15:   44                      rex.R
[  +0.001066] RSP: 002b:000000c000593e90 EFLAGS: 00000202 ORIG_RAX: ffffffffffffffdb
[  +0.000517] RAX: 000000c000387200 RBX: 0000000000018e00 RCX: 5000000000000000
[  +0.000461] RDX: 000000c000380000 RSI: 0000000000020000 RDI: 0000000000000040
[  +0.000590] RBP: 000000c000593f08 R08: 000000c0003873d0 R09: 0000000000d17b82
[  +0.000601] R10: 0000000000d1ce8e R11: 0000000000018e00 R12: 0000000000000001
[  +0.000619] R13: 13679e0f6eacd19f R14: 000000c0005821a0 R15: 0000000000000000
[  +0.000461] Modules linked in: sctp vxlan xt_multiport xt_set ipt_rpfilter iptable_raw ip_set_hash_ip ip_set_hash_net ip_set wireguard ip6_udp_tunnel veth xfrm_user xfrm_algo nf_conntrack_netlink xt_addrtype xt_nat xt_tcpudp xt_MASQUERADE xt_mark xt_conntrack iptable_mangle ip6table_filter ip6table_mangle ip6table_nat ip6_tables iptable_nat nf_nat br_netfilter bridge stp llc nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nft_compat nf_tables nfnetlink iptable_filter xt_comment bpfilter aufs overlay dummy dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua binfmt_misc intel_rapl_msr intel_rapl_common isst_if_common nfit kvm_intel kvm rapl joydev input_leds serio_raw mac_hid qemu_fw_cfg sch_fq_codel gtp5g(OE) sunrpc ramoops udp_tunnel reed_solomon efi_pstore ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid bochs_drm drm_vram_helper ttm
[  +0.000189]  drm_kms_helper crct10dif_pclmul crc32_pclmul syscopyarea sysfillrect ghash_clmulni_intel sysimgblt aesni_intel crypto_simd cryptd glue_helper fb_sys_fops virtio_net psmouse net_failover drm failover virtio_scsi i2c_piix4 pata_acpi floppy
[  +0.005947] ---[ end trace e017af78fce65824 ]---

The kernel panic is triggered inside the kfree function, and I believe the root cause is a double free of the socket buffer. Moreover, gtp5g_dev_xmit is the downlink entry function in GTP5G, so we can narrow down the scope of the problem.

For an online charging session, the FAR (Forwarding Action Rule) action is changed to PKT_DROP after the first uplink packet is sent to the data network, and it stays that way until the UPF gets the quota from the SMF. As a result, the downlink packet that responds to the first uplink packet is freed twice before the UPF gets the new quota.
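To make the failure mode concrete, here is a small user-space model of one common way a drop path ends up releasing a buffer twice: a helper frees the packet when the rule says drop, and its caller frees it again on the error return. This is an illustrative sketch, not the gtp5g source; toy_skb, toy_free, and the xmit_* functions are hypothetical names, and the counter stands in for the real kernel free routine.

```c
/* Toy stand-in for struct sk_buff that counts how often it is released. */
struct toy_skb {
    int free_count;
};

enum toy_action { TOY_FORWARD, TOY_DROP };

/* Stand-in for the kernel's skb free routine; it only counts releases. */
static void toy_free(struct toy_skb *skb)
{
    skb->free_count++;
}

/* Buggy helper: releases the packet itself when the rule says drop. */
static int apply_rule_buggy(struct toy_skb *skb, enum toy_action action)
{
    if (action == TOY_DROP) {
        toy_free(skb);
        return -1;
    }
    return 0; /* forwarded: ownership moves to the next layer */
}

/* Buggy caller: treats the error as "still mine" and releases again. */
void xmit_buggy(struct toy_skb *skb, enum toy_action action)
{
    if (apply_rule_buggy(skb, action) < 0)
        toy_free(skb); /* second release of the same buffer */
}

/* Fixed helper: only reports the verdict and never frees. */
static int apply_rule_fixed(enum toy_action action)
{
    return action == TOY_DROP ? -1 : 0;
}

/* Fixed caller: the single owner releases the packet exactly once. */
void xmit_fixed(struct toy_skb *skb, enum toy_action action)
{
    if (apply_rule_fixed(action) < 0)
        toy_free(skb);
}
```

In the model, xmit_buggy leaves free_count at 2 for a dropped packet, while xmit_fixed leaves it at 1. The general rule it illustrates is that every code path should have exactly one owner responsible for freeing the buffer.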

The details of the charging system design can be found in the CHF design document.

Interestingly, the panic did not happen until we upgraded gtp5g to v0.9.0 or later.
I believe the issue became visible because of #101, which makes the packet counting more accurate (a timing issue).

The potentials of the eBPF for kernel debugging

The article "Live-patching security vulnerabilities inside the Linux kernel with eBPF Linux Security Module", published by Cloudflare, explains well how they leverage eBPF to detect critical events in the kernel and even make the kernel more secure.

It inspired me to start thinking: is it possible to use eBPF to troubleshoot our own kernel module?

[Optional] Compile the kernel with BTF enabled

In our testing environment, we built kernel v6.12.4:

$ wget https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.12.4.tar.xz
$ tar xvf linux-6.12.4.tar.xz

Once you have downloaded the kernel source, you can start building the kernel by following the steps described in the article: Building the Linux kernel with BTF.

What is BTF?

The Extended Berkeley Packet Filter (eBPF) is esteemed for its portability, a primary attribute of which is due to the BPF Type Format (BTF). More details about BTF can be discovered in this comprehensive guide.

Before the advent of Compile Once-Run Everywhere (CO-RE), developers working with eBPF had to compile an individual eBPF object for each kernel version they intended to support. This stipulation led toolkits, such as iovisor/bcc, to depend on runtime compilations to handle different kernel versions.

However, the introduction of CO-RE facilitated a significant shift in eBPF portability, allowing a single eBPF object to be loaded into multiple differing kernels. This is achieved by the libbpf loader, a component within the eBPF's loader and verification architecture. The libbpf loader arranges the necessary infrastructure for an eBPF object, including eBPF map creation, code relocation, setting up eBPF probes, managing links, handling their attachments, among others.

Here's the technical insight: both the eBPF object and the target kernel contain BTF information, generally embedded within their respective ELF (Executable and Linkable Format) files. The libbpf loader leverages this embedded BTF information to calculate the requisite changes such as relocations, map creations, probe attachments, and more for an eBPF object. As a result, this eBPF object can be loaded and have its programs executed across any kernel without the need for object modification, thus enhancing portability.
-- aquasecurity/btfhub
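The core idea in the quote, resolving field locations from type information at load time instead of hard-coding them at compile time, can be modelled in plain user-space C. The sketch below is a toy model under that assumption and does not use libbpf or real BTF; the toy_* names and the two struct layouts are invented for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Toy "type information": a field name and its byte offset in one layout. */
struct toy_field {
    const char *name;
    size_t offset;
};

/* The same logical struct as laid out by two hypothetical kernel builds. */
struct toy_dev_a { char name[16]; int ifindex; };
struct toy_dev_b { unsigned long state; char name[16]; int ifindex; };

const struct toy_field toy_btf_a[] = {
    { "name",    offsetof(struct toy_dev_a, name) },
    { "ifindex", offsetof(struct toy_dev_a, ifindex) },
};
const struct toy_field toy_btf_b[] = {
    { "state",   offsetof(struct toy_dev_b, state) },
    { "name",    offsetof(struct toy_dev_b, name) },
    { "ifindex", offsetof(struct toy_dev_b, ifindex) },
};

/* "Relocation": find a field's offset by name; returns -1 if absent. */
long toy_resolve(const struct toy_field *btf, size_t count, const char *field)
{
    for (size_t i = 0; i < count; i++)
        if (strcmp(btf[i].name, field) == 0)
            return (long)btf[i].offset;
    return -1;
}

/* Read an int field through the resolved offset instead of a hard-coded one. */
int toy_read_int(const void *obj, const struct toy_field *btf, size_t count,
                 const char *field, int *out)
{
    long offset = toy_resolve(btf, count, field);
    if (offset < 0)
        return -1;
    memcpy(out, (const char *)obj + offset, sizeof(*out));
    return 0;
}
```

The same call to toy_read_int works against either layout because the offset of ifindex is looked up by name each time. That is, very roughly, the role the embedded BTF plays when one eBPF object is loaded on kernels whose struct layouts differ.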

Generate BTF for GTP5G

Add the btf target in the Makefile like this:

diff --git a/Makefile b/Makefile
index bec4880..bab04d4 100644
--- a/Makefile
+++ b/Makefile
@@ -98,3 +98,6 @@ uninstall:
        $(DEPMOD)
        rm -f /etc/modules-load.d/gtp5g.conf
        rmmod -f  $(MODULE_NAME)
+
+btf:
+       pahole --btf_encode gtp5g.ko
+       pahole --btf_encode gtp5g.o

And run the commands below:

$ make
$ make btf
$ readelf -S gtp5g.ko | grep BTF
$ sudo make install

You will then be able to see the gtp5g BTF by using bpftool:

$ sudo bpftool btf list | grep gtp5g
410: name [gtp5g]  size 243774B

Use the command below to dump the gtp5g BTF in C format:

$ sudo bpftool btf dump file /sys/kernel/btf/gtp5g format c

Trace the function calls in GTP5G

eBPF programs of type BPF_PROG_TYPE_TRACING (fentry/fexit) are a newer alternative to kprobes and tracepoints, available since Linux kernel v5.5. They add practically zero overhead by leveraging the BPF trampoline.

The command below lists all of the functions that are available for tracing:

$ sudo cat /sys/kernel/tracing/available_filter_functions | grep gtp5g
// ...
gtp5g_dbg_read [gtp5g]
proc_qos_write [gtp5g]
gtp5g_qos_read [gtp5g]
gtp5g_far_read [gtp5g]
proc_pdr_write [gtp5g]
proc_dbg_write [gtp5g]
gtp5g_pdr_read [gtp5g]
proc_qer_write [gtp5g]
proc_far_write [gtp5g]
proc_urr_write [gtp5g]
get_proc_gtp5g_dev_list_head [gtp5g]
init_proc_gtp5g_dev_list [gtp5g]
create_proc [gtp5g]
remove_proc [gtp5g]
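Because grep matches substrings, a stricter check can be handy before attaching a program to one specific function. The sketch below matches a line of the form "function [module]" exactly; it assumes the list format shown above, and the helper names line_is_function and function_is_traceable are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if `line` is exactly "<func> [<module>]" (a trailing newline is fine). */
int line_is_function(const char *line, const char *func, const char *module)
{
    char name[128];
    char mod[64];

    /* Lines for built-in functions have no "[module]" part and do not match. */
    if (sscanf(line, "%127s [%63[^]]]", name, mod) != 2)
        return 0;
    return strcmp(name, func) == 0 && strcmp(mod, module) == 0;
}

/* Scan a stream in the format above; returns 1 if the function is listed. */
int function_is_traceable(FILE *list, const char *func, const char *module)
{
    char line[256];

    while (fgets(line, sizeof(line), list))
        if (line_is_function(line, func, module))
            return 1;
    return 0;
}
```

A caller would open /sys/kernel/tracing/available_filter_functions with fopen (root privileges are typically required) and pass the stream together with, for example, "gtp5g_encap_recv" and "gtp5g". A function that is missing from the list, for instance because the compiler inlined it, is generally not a usable attach target.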

If we want to trace the entry of gtp5g_encap_recv, we can write a program like the one below:

#include "vmlinux.h"
#include <bpf_helpers.h>
#include <bpf_tracing.h>

// fentry program: runs on entry to gtp5g_encap_recv() and receives the same arguments.
SEC("fentry/gtp5g_encap_recv")
int BPF_PROG(gtp5g_recv, struct sock *sk, struct sk_buff *skb)
{
    if (!skb->dev) {
        bpf_printk("device doesn't exist");
        return 0; // bail out instead of dereferencing a NULL device below
    }
    bpf_printk("device name: %s", skb->dev->name);
    return 0;
}

char _license[] SEC("license") = "GPL";

  • Any program whose section name has the fentry prefix is executed before the target function (gtp5g_encap_recv() in this case), and the kernel identifies it as a BPF_PROG_TYPE_TRACING program.
  • The arguments of BPF_PROG follow those of the kernel function you want to trace. For example:
    // The definition in gtp5g; the BPF_PROG arguments should match those of the target function
    static int gtp5g_encap_recv(struct sock *, struct sk_buff *);
    

Load and attach eBPF program

Typically, we can load an eBPF program and attach it to a specific system hook by using bpftool. However, bpftool does not support all of the attachment types provided by the kernel.

Therefore, I use libbpfgo and libbpf to implement an agent program that loads the eBPF program:

package main

import (
    "os"
    "time"

    bpf "github.com/aquasecurity/libbpfgo"
)

func main() {
    bpfModule, err := bpf.NewModuleFromFile("main.o")
    if err != nil {
        panic(err)
    }
    defer bpfModule.Close()

    if err := bpfModule.BPFLoadObject(); err != nil {
        panic(err)
    }

    prog, err := bpfModule.GetProgram("gtp5g_recv")
    if err != nil {
        panic(err)
    }

    link, err := prog.AttachGeneric()
    if err != nil {
        panic(err)
    }
    if link.FileDescriptor() == 0 {
        os.Exit(-1)
    }

    for {
        time.Sleep(10 * time.Second)
    }
}

INFO
You will need to write a Makefile for building the BPF program and its agent.
If you don't know how to get started, please refer to my side project: tinyLB.

To see the output of the attached eBPF program:

$ sudo cat /sys/kernel/debug/tracing/trace_pipe

Conclusion

This article shows how to troubleshoot a kernel panic using the decode_stacktrace.sh script and how to trace function calls in a kernel module using eBPF.
eBPF is a powerful tool for kernel debugging. It can trace function calls in a kernel module with low overhead and can even make the kernel more secure by detecting critical events.

About

Greetings! My name is Ian. I'm currently a Technical Steering Committee member of the free5GC project. We're focusing on delivering a fully open-source 5G core network for academic and industrial use.
If you're interested in our project, pull requests and issues are always welcome!