Tracing the latency of the 5G Registration Procedure with eBPF

Note

Author: Chia-Hui, Chen
Date: 2025/12/24

Overview

In modern cloud-native architectures, system complexity has reached unprecedented levels. When performance bottlenecks occur in high-performance, high-concurrency systems, developers need tools that can reach deep into the system's lower layers to pinpoint the exact sources of latency. This is where eBPF (extended Berkeley Packet Filter) comes into play.

Why eBPF?

eBPF allows us to dynamically inject "probes" into both the kernel and user space without modifying the target program's source code or performing a recompilation. For many debugging and optimization scenarios, eBPF serves as an excellent diagnostic tool, offering the following core advantages:

Non-intrusive Observation
It transparently collects performance data without interfering with the business logic. This means we can perform deep-dive analysis directly in a production environment without worrying about modifying code.
A Bridge Between Kernel and User
It doesn't just track user-space function calls via Uprobes; it can also attach Kprobes (kernel probes) and observe scheduling events. This cross-layer observability allows for "complete" monitoring of a function's lifecycle, capturing not only its active execution time but also any periods in which it was descheduled or blocked by the kernel.

In this article, we will apply these eBPF capabilities to a real-world high-concurrency scenario: Tracing the latency of the 5G Registration Procedure. By monitoring key functions within the 5G Registration Procedure in free5GC, we will demonstrate how to pinpoint bottlenecks that traditional tools miss, providing a transparent view of a 5G NAS message's journey through the Go runtime.

Background

Before diving into performance bottleneck analysis, we must understand two key technical areas: how eBPF enables tracing and how Go’s low-level design makes this task exceptionally challenging.

eBPF and the Mechanics of Uprobes

eBPF is a Linux kernel technology that allows us to execute custom programs within a sandboxed environment inside the kernel without modifying the kernel source code.
Through Uprobes (User-level Probes), we can attach directly to specific memory addresses within a binary. When the program reaches that point, the eBPF program is triggered to collect register and memory information. This allows us to observe the internal state of a program without modifying its source code and with minimal overhead.

Challenges in Tracing Go Applications

ABIInternal: The Hidden Register Trap

Since Go 1.17, Go has used a register-based calling convention (ABIInternal) on x86_64. Older Go versions passed every argument on the stack; now integer and pointer arguments are assigned to registers in the order RAX, RBX, RCX, RDI, RSI, R8, R9, R10, R11. This order also differs from the System V C ABI (RDI, RSI, RDX, RCX, R8, R9), which most generic tools assume.
- Impact: Generic eBPF tools that do not account for Go’s register mapping will capture meaningless data. We must manually locate these registers to extract correct information.

Stack Growth and the Risks of uretprobe

Goroutine stacks are dynamic. When a function detects insufficient stack space, it calls runtime.morestack to expand, "moving" the entire stack content to a larger memory segment.
- Impact: eBPF's uretprobe relies on modifying the return address upon function entry. If a stack move occurs mid-execution, the recorded return address becomes invalid, potentially causing the tracer to fail or, worse, crashing the application.
- Solution: To ensure safety, we can replace uretprobe with manual calculations of function exit offsets, utilizing standard Uprobes to capture exit events accurately.

Invisibility of the G-M-P Scheduler

The Linux kernel recognizes Threads, but remains unaware of Goroutines.
- Impact: Measuring time at the thread level makes it impossible to distinguish which Goroutine is consuming resources or to account for kernel scheduling interference.
- Solution: Instead of blind kernel-level tracing, we target the internal state transition point of the Go Runtime: runtime.casgstatus. By monitoring when a Goroutine enters _Grunning or _Gwaiting, we obtain actual CPU execution time from a "Goroutine-centric" perspective.

Goroutine Migration Between Threads

In Go's G-M-P scheduling model, a Goroutine is not pinned to a specific OS thread. A Goroutine might start executing on Thread A, get descheduled, and later resume on Thread B.
- Impact: Traditional eBPF tools often use the Thread ID (TID) as a key to store timestamps. If a Goroutine migrates during function execution, the exit probe on Thread B will fail to find the start time recorded by the entry probe on Thread A, leading to broken traces or corrupted data.
- Solution: Instead of relying on TID, we extract the Goid (Goroutine ID) directly from the Go runtime. On x86_64, the g pointer is stored in the R14 register. By reading the goid field from the g struct, we obtain a persistent identifier that follows the Goroutine across different threads. Using Goid as our primary key ensures that our timing remains accurate regardless of thread migration.
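
The difference between the two keys can be illustrated with a small simulation. This is only a sketch: the event tuples, thread IDs, and timestamps are made up, and a Python dict stands in for the BPF hash map.

```python
# Hypothetical event stream: (probe, tid, goid, timestamp_ns).
# The goroutine enters the function on thread 11 and returns on thread 12.
events = [
    ("entry", 11, 7, 1_000),
    ("exit", 12, 7, 4_500),
]

def measure(events, key_index):
    """Pair entry/exit events using the chosen field (1 = TID, 2 = goid) as the map key."""
    start = {}        # plays the role of the BPF hash map
    durations = []
    lost = 0
    for ev in events:
        probe, ts, key = ev[0], ev[3], ev[key_index]
        if probe == "entry":
            start[key] = ts
        elif key in start:
            durations.append(ts - start.pop(key))
        else:
            lost += 1  # exit probe found no matching entry timestamp
    return durations, lost

by_tid = measure(events, key_index=1)
by_goid = measure(events, key_index=2)
print("keyed by TID :", by_tid)
print("keyed by goid:", by_goid)
```

Keyed by TID, the exit event is dropped because thread 12 never recorded an entry; keyed by goid, the same two events pair up correctly.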

Methodology

Locating the Entry and Exit Points of the function

To capture the entry and exit points of the function, we can use the objdump command to analyze the disassembled binary.

Note: Function addresses are not fixed and will change every time the program is recompiled. Always re-verify the offsets for your current binary.

Example Command:

objdump -d ./amf --disassemble="github.com/free5gc/amf/internal/gmm.HandleRegistrationRequest"

The output will be like

We can use grep to quickly find all ret instruction addresses.

objdump -d ./amf --disassemble="github.com/free5gc/amf/internal/gmm.HandleRegistrationRequest" | grep ret

0xcac2a6 is one of the ret instruction addresses.

Note: Every ret instruction within the function must be traced to ensure the exit event is captured regardless of the execution path.

We can use head -n 10 to find the entry.

objdump -d ./amf --disassemble="github.com/free5gc/amf/internal/gmm.HandleRegistrationRequest" | head -n 10

0xcabf80 is the entry address.
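
The loader in this article attaches exit probes at an offset relative to the function symbol, so each ret address is converted with `ret - entry`. A small worked example of that arithmetic, using the entry address above and the first three ret addresses from the sample config in this article (your own binary will have different values):

```python
entry = 0xcabf80
rets = [0xcac2a6, 0xcac2af, 0xcac3e5]  # first three ret addresses of HandleRegistrationRequest

# Offset of each ret instruction from the start of the function.
offsets = [r - entry for r in rets]
for r, off in zip(rets, offsets):
    print(f"ret 0x{r:x} -> symbol offset 0x{off:x}")
```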

nm can also be used to find the entry.

nm ./amf | grep -w "github.com/free5gc/amf/internal/gmm.HandleRegistrationRequest"

Note: Go's compiler may "inline" small functions for performance. If a function is inlined, it loses its unique symbol and address, making it untraceable via eBPF. To prevent this, you must add the //go:noinline compiler directive directly above the function definition:
//go:noinline
func YourFunction() { ... }

Get information from Register (ABIInternal)

Get goid

To trace Go applications effectively, we need a persistent way to identify a Goroutine as it moves across different OS threads. This is where the Goid comes in. Extracting it requires two pieces of information: the location of the "g" pointer and the memory offset of the goid field.

Finding the G-Pointer via R14
In the x86_64 architecture, Go’s ABIInternal convention stores the pointer to the current Goroutine structure (runtime.g) in the R14 register.
- The Logic: Whenever an eBPF probe is triggered, we read the value of R14 to get the memory address of the current Goroutine's "home" in memory.

Using pahole to Locate the Offset

pahole -C "runtime.g" ./amf

The output looks like this:

        struct runtime.g {
            runtime.stack              stack;                /*     0    16 */
            uintptr                    stackguard0;          /*    16     8 */
            uintptr                    stackguard1;          /*    24     8 */
            runtime._panic *           _panic;               /*    32     8 */
            runtime._defer *           _defer;               /*    40     8 */
            runtime.m *                m;                    /*    48     8 */
            runtime.gobuf              sched;                /*    56    48 */
            /* --- cacheline 1 boundary (64 bytes) was 40 bytes ago --- */
            uintptr                    syscallsp;            /*   104     8 */
            uintptr                    syscallpc;            /*   112     8 */
            uintptr                    syscallbp;            /*   120     8 */
            /* --- cacheline 2 boundary (128 bytes) --- */
            uintptr                    stktopsp;             /*   128     8 */
            void *                     param;                /*   136     8 */
            internal/runtime/atomic.Uint32 atomicstatus;     /*   144     4 */
            uint32                     stackLock;            /*   148     4 */
            uint64                     goid;                 /*   152     8 */
            runtime.guintptr           schedlink;            /*   160     8 */
            int64                      waitsince;            /*   168     8 */
            runtime.waitReason         waitreason;           /*   176     1 */
            bool                       preempt;              /*   177     1 */
            bool                       preemptStop;          /*   178     1 */
                                        .
                                        .
                                        .
            uint32                     sig;                  /*   220     4 */
            struct []uint8             writebuf;             /*   224    24 */
            uintptr                    sigcode0;             /*   248     8 */
            /* --- cacheline 4 boundary (256 bytes) --- */
            uintptr                    sigcode1;             /*   256     8 */
            uintptr                    sigpc;                /*   264     8 */
            uint64                     parentGoid;           /*   272     8 */
            uintptr                    gopc;                 /*   280     8 */
                                        .
                                        .
                                        .
            /* --- cacheline 6 boundary (384 bytes) --- */
            runtime.synctestBubble *   bubble;               /*   384     8 */
            runtime.gTraceState        trace;                /*   392    32 */
            int64                      gcAssistBytes;        /*   424     8 */
            uintptr                    valgrindStackID;      /*   432     8 */

            /* size: 440, cachelines: 7, members: 61 */
            /* sum members: 439, holes: 1, sum holes: 1 */
            /* last cacheline: 56 bytes */
    };

We can see that the offset of goid is 152:

uint64                     goid;                 /*   152     8 */

Note: This offset is version-dependent and should be verified using pahole for your specific Go runtime.

This tells us that goid is located 152 bytes from the start of the g struct. Therefore, the logic for our BPF program is: $\text{Goid} = \text{Value at } (R14 + 152)$

static __always_inline u64 get_goid(struct pt_regs *ctx) {
    void *g_ptr = (void *)(ctx->r14);
    u64 goid;
    bpf_probe_read_user(&goid, sizeof(goid), (void *)(g_ptr + 152));
    return goid;
}

Note: The Go runtime and all of its Goroutines run entirely in user space, so we use bpf_probe_read_user, which safely copies the data from the user-space process into a buffer owned by the BPF program before we analyze it.
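
Because the goid offset changes between Go versions, it can help to extract it from the pahole output instead of copying it by hand. The following is a minimal sketch: the helper name `field_offset` is hypothetical, and the input is an excerpt of the pahole output shown above rather than a live invocation.

```python
import re

# Excerpt of `pahole -C runtime.g` output, as shown above.
PAHOLE_OUTPUT = """
    internal/runtime/atomic.Uint32 atomicstatus;     /*   144     4 */
    uint32                     stackLock;            /*   148     4 */
    uint64                     goid;                 /*   152     8 */
    runtime.guintptr           schedlink;            /*   160     8 */
    uint64                     parentGoid;           /*   272     8 */
"""

def field_offset(pahole_text, field):
    """Return (offset, size) of `field` from pahole's '/* offset size */' comments."""
    pattern = re.compile(r"\b" + re.escape(field) + r";\s*/\*\s*(\d+)\s+(\d+)\s*\*/")
    match = pattern.search(pahole_text)
    if match is None:
        raise KeyError(f"field {field!r} not found")
    return int(match.group(1)), int(match.group(2))

goid_offset, goid_size = field_offset(PAHOLE_OUTPUT, "goid")
print(f"#define OFFSET_GOID {goid_offset}")
```

The printed line could be pasted into the tracer's header, or the script could be run as part of the build so the offset follows the binary being traced.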

Tracing State Transitions: Capturing casgstatus Arguments

To calculate the actual "CPU Time" spent on a task, we must know when a Goroutine is actively running (_Grunning) and when it switches to another state such as _Gwaiting. This transition is handled by the internal function runtime.casgstatus.

Function Signature and Register Mapping
The function is defined in the Go runtime as:
```
func casgstatus(gp *g, oldval, newval uint32)
```
Because Go uses a Register-based ABI, these arguments are not on the stack but are passed through CPU registers. On x86_64, the mapping is as follows:

gp (The G being modified): Stored in RAX.
oldval (The previous state): Stored in RBX.
newval (The target state): Stored in RCX.
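
The same mapping can be written down as a small lookup table. This sketch assumes every argument is a pointer or an integer that fits in one register; strings, slices, and interfaces take several registers, and floating-point values use a separate register set, so they are not covered here. The helper name `map_args` is hypothetical.

```python
# Integer argument registers of Go's ABIInternal on amd64, in assignment order,
# paired with the matching field names in the kernel's struct pt_regs.
GO_INT_ARG_REGS = [
    ("RAX", "ax"), ("RBX", "bx"), ("RCX", "cx"), ("RDI", "di"), ("RSI", "si"),
    ("R8", "r8"), ("R9", "r9"), ("R10", "r10"), ("R11", "r11"),
]

def map_args(arg_names):
    """Assign one register per argument; valid only for pointer/integer-sized args."""
    if len(arg_names) > len(GO_INT_ARG_REGS):
        raise ValueError("arguments beyond the ninth are passed on the stack")
    return {name: GO_INT_ARG_REGS[i] for i, name in enumerate(arg_names)}

mapping = map_args(["gp", "oldval", "newval"])
for name, (reg, field) in mapping.items():
    print(f"{name:7s} -> {reg:3s} (ctx->{field})")
```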
The Implementation Method
When we attach a Uprobe to runtime.casgstatus, our BPF C code uses specific macros to pull these values directly from the registers:
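```
SEC("uprobe/runtime.casgstatus")
int uprobe_casgstatus(struct pt_regs *ctx) {
    // casgstatus(gp, oldval, newval) arrives in RAX, RBX, RCX
    void *gp = (void *)ctx->ax;
    u32 oldval = (u32)ctx->bx;
    u32 newval = (u32)ctx->cx;

    // read gp->goid; the helper returns an error code, not the value
    u64 target_goid = 0;
    bpf_probe_read_user(&target_goid, sizeof(target_goid), (void *)((u64)gp + 152));

    // handle logic
    return 0;
}
```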
By monitoring these transitions, we can distinguish between Wall Time and CPU Time (the time actually spent processing), providing a much clearer picture of where the function is struggling.
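
To make the Wall Time versus CPU Time distinction concrete, the accounting can be replayed on a made-up timeline. This is a sketch that mirrors the per-goroutine clock used by the tracer source in this article; the timestamps are hypothetical, and the wake-up is simplified to a single _Gwaiting to _Grunning transition (the real runtime passes through _Grunnable first).

```python
G_RUNNING, G_WAITING = 2, 4  # status values from the Go runtime, as used in this article

class GoidClock:
    """Per-goroutine CPU-time accounting driven by casgstatus transitions."""
    def __init__(self, now_ns):
        self.total_cpu_ns = 0
        self.last_start_ns = now_ns
        self.is_on_cpu = True  # the goroutine is running when the entry probe fires

    def on_casgstatus(self, oldval, newval, now_ns):
        if oldval == G_RUNNING and newval != G_RUNNING and self.is_on_cpu:
            self.total_cpu_ns += now_ns - self.last_start_ns
            self.is_on_cpu = False
        elif oldval != G_RUNNING and newval == G_RUNNING:
            self.last_start_ns = now_ns
            self.is_on_cpu = True

    def accumulated(self, now_ns):
        running = now_ns - self.last_start_ns if self.is_on_cpu else 0
        return self.total_cpu_ns + running

# Hypothetical timeline (ns): enter at 0, block at 300, resume at 900, exit at 1000.
clock = GoidClock(now_ns=0)
entry_snapshot = clock.accumulated(0)
clock.on_casgstatus(G_RUNNING, G_WAITING, 300)
clock.on_casgstatus(G_WAITING, G_RUNNING, 900)
cpu_ns = clock.accumulated(1000) - entry_snapshot
wall_ns = 1000 - 0
print(f"wall={wall_ns} ns, cpu={cpu_ns} ns, off-CPU={wall_ns - cpu_ns} ns")
```

In this timeline the function takes 1000 ns of wall time but only 400 ns on the CPU; the remaining 600 ns were spent waiting.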

Implementation

The following is the source code used to trace the latency of the 5G Registration Procedure.

Note: all the following contents are based on x86_64.

Source Code

The header file of eBPF

// tracer.h
#ifndef __TRACER_H__
#define __TRACER_H__

#define user_pt_regs pt_regs

#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>

// Go runtime status
#define G_IDLE 0
#define G_RUNNABLE 1
#define G_RUNNING 2
#define G_SYSCALL 3
#define G_WAITING 4
#define G_MORIBUND_UNUSED 5
#define G_DEAD 6
#define G_ENQUEUE_UNUSED 7
#define G_COPYSTACK 8
#define G_PREEMPTED 9

// Offset
#define OFFSET_GOID    152
#define OFFSET_STARTPC 296
#define OFFSET_GOPC    280

// Register (x86-64 Go ABI)
#define GO_G_REG(ctx) (ctx->r14)
#define GO_ARG1(ctx)  (ctx->ax)
#define GO_ARG2(ctx)  (ctx->bx)
#define GO_ARG3(ctx)  (ctx->cx)
#define GO_ARG4(ctx)  (ctx->di)
#define GO_ARG5(ctx)  (ctx->si)

#endif

The eBPF tracer source code

// tracer.bpf.c
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>

typedef __u64 u64;
typedef __u32 u32;
typedef __u8  u8;

// key for tracking function calls per goroutine
struct proc_key {
    u64 goid;
    u64 func_id;
};

// goroutine CPU time tracking structure
struct goid_clock {
    u64 total_cpu_ns;
    u64 last_start_ns;
    u8  is_on_cpu;
};

// event structure for ring buffer
struct event_t {
    u64 goid;
    u64 total_cpu_ns;
    u32 event_type; 
    u64 func_id; 
};

#define EVENT_ENTER 1
#define EVENT_EXIT  2

#include "tracer.h" 

// global map to track goid CPU time
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1024);
    __type(key, u64); // goid
    __type(value, struct goid_clock);
} goid_clocks SEC(".maps");

// map to store entry snapshots
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 2048);
    __type(key, struct proc_key); // goid + func_id 
    __type(value, u64);           // snapshot vtime
} g_stats SEC(".maps");

struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 256 * 1024);
} events SEC(".maps");

// calculate accumulated vtime for a given goid
static __always_inline u64 get_accumulated_vtime(u64 goid) {
    struct goid_clock *c = bpf_map_lookup_elem(&goid_clocks, &goid);
    if (!c) return 0;
    u64 total = c->total_cpu_ns;
    if (c->is_on_cpu) {
        total += (bpf_ktime_get_ns() - c->last_start_ns);
    }
    return total;
}
// get goid from pt_regs
static __always_inline u64 get_goid(struct pt_regs *ctx) {
    void *g_ptr = (void *)GO_G_REG(ctx);
    u64 goid;
    bpf_probe_read_user(&goid, sizeof(goid), (void *)(g_ptr + OFFSET_GOID));
    return goid;
}
// get goid from gp pointer
static __always_inline u64 get_goid_from_gp(void *gp) {
    u64 goid;
    bpf_probe_read_user(&goid, sizeof(goid), (void *)((u64)gp + OFFSET_GOID));
    return goid;
}
// enter uprobe
SEC("uprobe/handle_entry")
int handle_entry(struct pt_regs *ctx) {
    u64 goid = get_goid(ctx);
    u64 now = bpf_ktime_get_ns();
    u64 cookie = bpf_get_attach_cookie(ctx); 

    // initialize goid tracking map if not exists
    struct goid_clock *c = bpf_map_lookup_elem(&goid_clocks, &goid);
    if (!c) {
        struct goid_clock init_c = { .last_start_ns = now, .is_on_cpu = 1, .total_cpu_ns = 0 };
        bpf_map_update_elem(&goid_clocks, &goid, &init_c, BPF_ANY);
    }

    // store entry snapshot
    struct proc_key key = { .goid = goid, .func_id = cookie };
    u64 vtime_snapshot = get_accumulated_vtime(goid);
    bpf_map_update_elem(&g_stats, &key, &vtime_snapshot, BPF_ANY);
    // submit event
    struct event_t *e = bpf_ringbuf_reserve(&events, sizeof(*e), 0);
    if (e) {
        e->goid = goid;
        e->event_type = EVENT_ENTER;
        e->total_cpu_ns = 0;
        e->func_id = cookie; 
        bpf_ringbuf_submit(e, 0);
    }
    return 0;
}
// enter uprobe for runtime.casgstatus
SEC("uprobe/runtime.casgstatus")
int uprobe_casgstatus(struct pt_regs *ctx) {
    void *gp = (void *)GO_ARG1(ctx);
    u32 oldval = (u32)GO_ARG2(ctx);
    u32 newval = (u32)GO_ARG3(ctx);
    u64 now = bpf_ktime_get_ns();
    u64 target_goid = get_goid_from_gp(gp);
    // lookup the goid tracking map
    struct goid_clock *c = bpf_map_lookup_elem(&goid_clocks, &target_goid);
    if (!c) return 0;

    // update CPU time tracking based on state transition
    if (oldval == G_RUNNING && newval != G_RUNNING) {
        if (c->is_on_cpu) {
            c->total_cpu_ns += (now - c->last_start_ns);
            c->is_on_cpu = 0;
        }
    } else if (oldval != G_RUNNING && newval == G_RUNNING) {
        c->last_start_ns = now;
        c->is_on_cpu = 1;
    }
    return 0;
}
// exit uprobe
SEC("uprobe/handle_exit")
int handle_exit(struct pt_regs *ctx) {
    u64 goid = get_goid(ctx);
    u64 cookie = bpf_get_attach_cookie(ctx); 
    struct proc_key key = { .goid = goid, .func_id = cookie };

    // retrieve the entry snapshot
    u64 *entry_vtime = bpf_map_lookup_elem(&g_stats, &key);
    if (!entry_vtime) return 0;

    // calculate CPU usage
    u64 current_vtime = get_accumulated_vtime(goid);
    u64 cpu_usage = current_vtime - *entry_vtime;
    // submit event
    struct event_t *e = bpf_ringbuf_reserve(&events, sizeof(*e), 0);
    if (e) {
        e->goid = goid;
        e->total_cpu_ns = cpu_usage;
        e->event_type = EVENT_EXIT;
        e->func_id = cookie; 
        bpf_ringbuf_submit(e, 0);
    }

    // delete the snapshot
    bpf_map_delete_elem(&g_stats, &key);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";

The sample yaml file. Only some key functions are shown here for illustrative purposes.

# config.yaml
binaries:
  - path: "../../free5gc/bin/amf"  # binary path
    symbols:
      - name: "github.com/free5gc/amf/internal/gmm.HandleRegistrationRequest"
        fun_id: 101 # use as cookie
        entry: 0xcabf80 # entry address   
        rets: # return addresses
          - 0xcac2a6
          - 0xcac2af
          - 0xcac3e5
          - 0xcac4fd
          - 0xcac6c4
          - 0xcac852
          - 0xcac8d2
          - 0xcac9db
          - 0xcacb13
          - 0xcad326
          - 0xcad385
          - 0xcad3a6
          - 0xcad3c7
          - 0xcad59d
          - 0xcad5fa
          - 0xcad64f
      - name: "github.com/free5gc/amf/internal/gmm.HandleAuthenticationResponse"
        fun_id: 102
        entry: 0xcb71a0
        rets:
          - 0xcb7422
          - 0xcb7443
          - 0xcb7464
          - 0xcb75e3
          - 0xcb7765
          - 0xcb7b1a
          - 0xcb7b74
          - 0xcb7b83
          - 0xcb7ba4
          - 0xcb7bb1
          - 0xcb7d91
          - 0xcb7f05
          - 0xcb8063
          - 0xcb81db
          - 0xcb84d0
          - 0xcb852a
          - 0xcb8539
      - name: "github.com/free5gc/amf/internal/gmm/message.SendRegistrationAccept"
        fun_id: 103
        entry: 0xca8440
        rets:
          - 0xca8938
          - 0xca89d8
          - 0xca8a4e
          - 0xca8ac4
          - 0xca8ad2
      - name: "github.com/free5gc/amf/internal/gmm/message.SendDeregistrationAccept"
        fun_id: 104
        entry: 0xca80a0
        rets:
          - 0xca8232
          - 0xca826b
          - 0xca82d2
          - 0xca8340
          - 0xca83b0
          - 0xca83be
      - name: "github.com/free5gc/amf/internal/gmm.handleRequestedNssai"
        fun_id: 105
        entry: 0xcb06e0
        rets:
          - 0xcb080d
          - 0xcb0c19
          - 0xcb0ec5
          - 0xcb1a7d
          - 0xcb1a8a
          - 0xcb1b37
      - name: "github.com/free5gc/amf/internal/gmm.HandleInitialRegistration"
        fun_id: 106
        entry: 0xcadae0
        rets:
          - 0xcadf77
          - 0xcadfc4
          - 0xcae019
          - 0xcae022
          - 0xcae512
      - name: "github.com/free5gc/amf/internal/gmm.AuthenticationProcedure"
        fun_id: 107
        entry: 0xcb4e60
        rets:
          - 0xcb4f98
          - 0xcb50bc
          - 0xcb5113
          - 0xcb5354
          - 0xcb541b
          - 0xcb554e
          - 0xcb561c
      - name: "github.com/free5gc/amf/internal/gmm.contextTransferFromOldAmf"
        fun_id: 108
        entry: 0xcad6c0
        rets:
          - 0xcad99a
          - 0xcad9db
          - 0xcada2d
          - 0xcada73
          - 0xcada7c
  - path: "../../free5gc/bin/ausf"
    symbols:
      - name: "github.com/free5gc/ausf/internal/sbi/processor.(*Processor).UeAuthPostRequestProcedure"
        fun_id: 203
        entry: 0xb8fda0
        rets:
          - 0xb902ce
          - 0xb90a57
          - 0xb90b6d
          - 0xb90c14
          - 0xb90e9e
          - 0xb911b7

Note: In a real production environment, this list can be expanded to a large number of functions across all 5GC NFs (SMF, UPF, PCF, etc.).
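
Since the addresses change with every rebuild, a quick sanity check before attaching probes can catch stale or mistyped entries. The sketch below works on plain Python dicts shaped like the entries of config.yaml; the helper `check_symbols` and the deliberately broken second entry are hypothetical. The ret-after-entry check matters because the loader subtracts the entry address from each ret address using unsigned integers, which would wrap around on bad input.

```python
def check_symbols(symbols):
    """Return a list of human-readable problems found in a list of symbol entries."""
    problems = []
    seen_ids = set()
    for sym in symbols:
        if sym["fun_id"] in seen_ids:
            problems.append(f"{sym['name']}: duplicate fun_id {sym['fun_id']}")
        seen_ids.add(sym["fun_id"])
        if not sym["rets"]:
            problems.append(f"{sym['name']}: no ret addresses, exit will never be seen")
        for ret in sym["rets"]:
            if ret <= sym["entry"]:
                problems.append(
                    f"{sym['name']}: ret 0x{ret:x} is not after entry 0x{sym['entry']:x}")
    return problems

# Two entries in the shape of config.yaml; the second one is deliberately broken.
symbols = [
    {"name": "gmm.HandleInitialRegistration", "fun_id": 106,
     "entry": 0xcadae0, "rets": [0xcadf77, 0xcadfc4]},
    {"name": "example.Broken", "fun_id": 106,
     "entry": 0xcb0000, "rets": [0xcaffff]},
]
problems = check_symbols(symbols)
for p in problems:
    print(p)
```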

Python script for configuration generation.

# gen_config.py
import subprocess
import re
import sys
import shlex

def get_offsets(binary, search_pattern):
    try:
        nm_cmd = f"nm -n {binary} | grep {shlex.quote(search_pattern)}"
        nm_output = subprocess.check_output(nm_cmd, shell=True).decode().splitlines()

        if not nm_output:
            print(f"Error: Symbol matching '{search_pattern}' not found.")
            return None

        target_line = nm_output[0].split()
        entry_addr_hex = target_line[0]
        full_symbol_name = target_line[2]
        entry_addr = int(entry_addr_hex, 16)

        print(f"[*] Found Full Symbol: {full_symbol_name}")
        print(f"[*] Entry Address: 0x{entry_addr:x}")

        obj_cmd = f"objdump -d {binary} --disassemble={shlex.quote(full_symbol_name)}"
        dump_out = subprocess.check_output(obj_cmd, shell=True).decode()
        rets = re.findall(r'^\s*([0-9a-f]+):\s+(?:c3|c2|cb|ca)\s+ret', dump_out, re.MULTILINE)

        ret_addrs = [int(r, 16) for r in rets]

        return entry_addr, ret_addrs, full_symbol_name

    except subprocess.CalledProcessError as e:
        print(f"Exec error: {e}")
        return None


binary_path = "~/free5gc/bin/ausf" # binary path
target_function = "UeAuthPostRequestProcedure" # function name to search
fun_id = 203  # function ID for tracing; must be unique (203 matches config.yaml above)

result = get_offsets(binary_path, target_function)

if result:
    entry, rets, full_name = result
    print(f"- name: \"{full_name}\"")
    print(f"  fun_id: {fun_id}")
    print(f"  entry: 0x{entry:x}")
    print(f"  rets:")
    for r in rets:
        print(f"    - 0x{r:x}")

A Go-based eBPF loader for probe orchestration and event processing.

// main.go
package main
//go:generate bpf2go -target amd64 bpf ../tracer.bpf.c
import (
    "bytes"
    "encoding/binary"
    "errors"
    "fmt"
    "log"
    "os"
    "os/signal"
    "syscall"

    "github.com/cilium/ebpf/link"
    "github.com/cilium/ebpf/ringbuf"
    "github.com/cilium/ebpf/rlimit"
    "gopkg.in/yaml.v3" 
)

type Config struct {
    Binaries []BinaryConfig `yaml:"binaries"`
}
// BinaryConfig defines a binary to trace and its symbols.
type BinaryConfig struct {
    Path    string         `yaml:"path"`
    Symbols []SymbolConfig `yaml:"symbols"`
}
// SymbolConfig defines a function to trace within a binary.
type SymbolConfig struct {
    Name  string   `yaml:"name"`
    FunID uint64   `yaml:"fun_id"`
    Entry uint64   `yaml:"entry"`
    Rets  []uint64 `yaml:"rets"`
}
// bpfEventT mirrors struct event_t in tracer.bpf.c.
type bpfEventT struct {
    Goid       uint64 // Goroutine ID
    TotalCpuNs uint64 // valid only for exit events
    EventType  uint32 // 1 for enter, 2 for exit
    _          [4]byte // Padding to align to 8 bytes
    FuncId     uint64  // Function ID
}
const (
    EventEnter = 1
    EventExit  = 2
)

func main() {
    if err := rlimit.RemoveMemlock(); err != nil {
        log.Fatal(err)
    }

    // load config.yaml
    configFile, err := os.ReadFile("config.yaml") // path to your config file
    if err != nil {
        log.Fatalf("failed to read config: %v", err)
    }
    var config Config
    if err := yaml.Unmarshal(configFile, &config); err != nil {
        log.Fatalf("failed to parse yaml: %v", err)
    }

    // load BPF Objects
    var objs bpfObjects
    if err := loadBpfObjects(&objs, nil); err != nil {
        log.Fatalf("loading objects: %v", err)
    }
    defer objs.Close()

    // create function ID to name map
    idToName := make(map[uint64]string)
    var links []link.Link
    defer func() {
        for _, l := range links {
            l.Close()
        }
    }()

    // loop through binaries and symbols to attach uprobes
    for _, bin := range config.Binaries {
        ex, err := link.OpenExecutable(bin.Path)
        if err != nil {
            log.Printf("Warning: opening executable %s failed: %v", bin.Path, err)
            continue
        }

        for _, sym := range bin.Symbols {
            // create function ID to name mapping
            idToName[sym.FunID] = sym.Name

            // attach Entry Uprobe with Cookie
            enL, err := ex.Uprobe(sym.Name, objs.HandleEntry, &link.UprobeOptions{
                Offset: 0x0,
                Cookie: sym.FunID,
            })
            if err != nil {
                log.Fatalf("failed to attach entry for %s: %v", sym.Name, err)
            }
            links = append(links, enL)

            // attach Exit Uprobes with Cookie
            for _, retAddr := range sym.Rets {
                retOffset := retAddr - sym.Entry
                exL, err := ex.Uprobe(sym.Name, objs.HandleExit, &link.UprobeOptions{
                    Offset: retOffset,
                    Cookie: sym.FunID,
                })
                if err != nil {
                    log.Fatalf("failed to attach exit at 0x%x for %s: %v", retOffset, sym.Name, err)
                }
                links = append(links, exL)
            }
        }

        // attach casgstatus uprobe
        casgL, err := ex.Uprobe("runtime.casgstatus", objs.UprobeCasgstatus, nil)
        if err != nil {
            log.Printf("Warning: casgstatus attach failed for %s: %v", bin.Path, err)
        } else {
            links = append(links, casgL)
        }
    }

    log.Printf("Tracing 5G Core... Listening for events (Ctrl+C to stop)")

    // read events from ring buffer
    rd, err := ringbuf.NewReader(objs.Events)
    if err != nil {
        log.Fatalf("creating ringbuf reader: %v", err)
    }
    defer rd.Close()

    sig := make(chan os.Signal, 1)
    signal.Notify(sig, os.Interrupt, syscall.SIGTERM)
    go func() {
        <-sig
        log.Println("Received interrupt, shutting down...")
        rd.Close()
    }()

    for {
        record, err := rd.Read()
        if err != nil {
            if errors.Is(err, ringbuf.ErrClosed) {
                return
            }
            log.Printf("reading from ringbuf: %v", err)
            continue
        }

        var event bpfEventT
        if err := binary.Read(bytes.NewBuffer(record.RawSample), binary.LittleEndian, &event); err != nil {
            log.Printf("parsing event: %v", err)
            continue
        }

        // print event info
        funcName := idToName[event.FuncId]
        if funcName == "" {
            funcName = "Unknown_Function"
        }

        switch event.EventType {
        case EventEnter:
            fmt.Printf("[%s] ENTER | Goid: %d\n", funcName, event.Goid)
        case EventExit:
            fmt.Printf("[%s] EXIT  | Goid: %d | Total CPU: %d ns (%.3f ms)\n",
                funcName, event.Goid, event.TotalCpuNs, float64(event.TotalCpuNs)/1e6)
        }
    }
}
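
The 4-byte padding in bpfEventT exists because the C compiler aligns the u64 func_id field of struct event_t to 8 bytes, which makes each record 32 bytes long. As a cross-check of that layout, here is a minimal sketch that packs and decodes one record with Python's struct module; the sample values are hypothetical.

```python
import struct

# Little-endian layout of struct event_t: u64 goid, u64 total_cpu_ns,
# u32 event_type, 4 padding bytes, u64 func_id.
EVENT_FORMAT = "<QQI4xQ"
EVENT_ENTER, EVENT_EXIT = 1, 2

def decode_event(raw):
    """Decode one ring-buffer record into a dict."""
    goid, total_cpu_ns, event_type, func_id = struct.unpack(EVENT_FORMAT, raw)
    return {"goid": goid, "total_cpu_ns": total_cpu_ns,
            "event_type": event_type, "func_id": func_id}

# Hypothetical sample: goroutine 42 leaving function 101 after 1.5 ms on CPU.
raw = struct.pack(EVENT_FORMAT, 42, 1_500_000, EVENT_EXIT, 101)
event = decode_event(raw)
print(len(raw), event)
```

If a field is added to struct event_t, both the Go struct and any such decoder need the same change, including any padding the compiler inserts.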

How to use

Generate vmlinux.h

bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h

Generate the eBPF object file and Go bindings
By running go generate, we produce the compiled bytecode (a .o file) and the matching Go bindings (a .go file). Because main.go passes -target amd64 to bpf2go, these are typically named bpf_x86_bpfel.o and bpf_x86_bpfel.go. They act as the bridge that allows our Go application to load the eBPF program into the kernel.
```
go generate
```
Build eBPF loader
```
go build -o tracer
```
Execute eBPF loader
```
sudo ./tracer
```

Result

Conclusion

In this article, we have navigated the intricate intersection of eBPF technology and the Go runtime to address one of the challenges in 5G core network optimization: transparent, high-precision latency tracing.

References

About Me
Hi, I'm Chia-Hui Chen. I'm currently diving into 5G technology and the free5gc project. I hope you find this blog valuable! Feel free to reach out if you have any feedback or would like to discuss anything further.

GitHub: chchen7