Linux追踪技术ftrace的使用方法（一）

发布时间：2023-02-27 10:22:37 所属栏目：Linux 来源：

导读：前言
一、Single thread tracing
通过写入 set_ftrace_pid，您可以跟踪单个线程，比如：

echo 0 > tracing_on
echo nop > current_tracer
echo > trace
追踪 pid = 411870 的任务：

echo 411870 > set_f

前言
一、Single thread tracing
通过写入 set_ftrace_pid，您可以跟踪单个线程，比如：

echo 0 > tracing_on
echo nop > current_tracer
echo > trace
追踪 pid = 411870 的任务：

echo 411870 > set_ftrace_pid
echo function > current_tracer
echo 1 > tracing_on
cat trace | more

二、Ftrace function_graph
2.1 Graph Tracing
此跟踪器类似于函数跟踪器，只是它在函数的入口和出口处探测函数。这是通过在每个 task_struct 中使用动态分配的返回地址堆栈来完成的。在函数进入时，跟踪器会覆盖每个跟踪函数的返回地址以设置自定义探针。因此原始返回地址存储在task_struct中的返回地址堆栈中。

在函数的两端进行探测会导致特殊功能，例如：
（1）衡量函数执行时间。
（2）有一个可靠的调用栈来绘制函数调用图。

这个跟踪器在以下几种情况下是有用的:
（1）想找到内核异常行为的原因，需要详细了解在任何区域(或特定区域)发生了什么。
（2）正在经历不寻常的延迟，但很难找到它的起源。
（3）快速找到特定功能所采用的路径
（4）查看运行中的内核，看看那里发生了什么。

有几列可以动态启用/禁用。您可以根据需要使用所需的每种选项组合：

function_graph 跟踪器打印函数的调用图，展示其代码流。

只跟踪一个函数及其所有子函数，将其名称echo 到 set_graph_function 中：
echo do_nanosleep > set_graph_function
下面使用 do_nanosleep() 函数上的 function_graph 跟踪器来显示其子函数调用：

echo do_nanosleep > set_graph_function
echo function_graph > current_tracer

echo nop > current_tracer
echo > set_graph_function
输出显示子调用和代码流：do_nanosleep() 调用 hrtimer_start_range_ns().左侧的列显示了 cpu（在此输出中，主要是 cpu 3）和函数的持续时间，以便可以识别延迟.高延迟包括一个字符符号以帮助您注意它们，在此输出中，延迟为 1000337 微秒（1.0 秒）旁边的“$”。

$: Greater than 1 second
@: Greater than 100 ms
*: Greater than 10 ms
#: Greater than 1 ms
!: Greater than 100 μs
+: Greater than 10 μs
这个例子特意没有设置函数过滤器（set_ftrace_filter），这样所有子调用都可以看到。但是，这确实会产生一些开销，从而夸大了报告的持续时间。它通常对于定位高延迟的来源仍然很有用，这会使增加的开销相形见绌。当您希望给定函数的时间更准确时，可以使用函数过滤器来减少跟踪的函数。例如，仅跟踪 do_nanosleep()：

echo do_nanosleep > set_ftrace_filter
以 do_sys_open 函数为例子来测试function_graph：

echo do_sys_open > set_graph_function
echo function_graph > current_tracer
echo funcgraph-proc > trace_options

cpu 字段给出了执行函数的 cpu 号，本例中为2 号 cpu
TASK/PID 地段给出进程名称和进程PID号
DURATION 字段给出了函数执行的时间长度，以 us 为单位。
FUNCTION CALLS 则给出了调用的函数，并显示了调用流程。
需要注意：
对于不调用其它函数的函数，其对应行以“;”结尾，而且对应的 DURATION 字段给出其运行时长；

对于调用其它函数的函数，则在其“}”对应行给出了运行时长，该时间是一个累加值，包括了其内部调用的函数的执行时长。DURATION 字段给出的时长并不是精确的，它还包含了执行 ftrace 自身的代码所耗费的时间，所以示例中将内部函数时长累加得到的结果会与对应的外围调用函数的执行时长并不一致；不过通过该字段还是可以大致了解函数在时间上的运行开销的。

备注：function graph trace 实际是在要跟踪函数的入口处和返回处分别放置了钩子函数，在要跟踪函数的入口处插入钩子函数ftrace_caller，在要跟踪函数的返回处插入钩子函数return_to_handler。因此function_graph tracer即可以跟踪到函数的入口还可以跟踪到函数的返回

2.2 Options
选项可用于更改输出，可以在选项options目录中列出：

ls options/funcgraph-*

这些调整输出并且可以包括或排除详细信息，例如 cpu ID (funcgraph-cpu)、进程名称 (funcgraph-proc)、函数持续时间 (funcgraph-duration) 和延迟标记 (funcgraph-overhead)。
（1）
执行函数的 cpu 编号默认启用。有时最好只跟踪一个 cpu，或者您有时可能会在 cpu 跟踪切换时看到无序的函数调用。

hide: echo nofuncgraph-cpu > trace_options
show: echo funcgraph-cpu > trace_options
（2）
持续时间（函数的执行时间）显示在函数的右括号行上，或者在叶子函数的情况下显示在与当前函数相同的行上。它是默认启用的。

hide: echo nofuncgraph-duration > trace_options
show: echo funcgraph-duration > trace_options

叶子函数：没有调用其他的函数。
（3）
task/pid 字段显示执行该函数的线程 cmdline 和 pid。默认禁用。

hide: echo nofuncgraph-proc > trace_options
show: echo funcgraph-proc > trace_options
比如：

# tracer: function_graph
#
# cpu TASK/PID DURATION FUNCTION CALLS
# | | | | | | | | |
0) sh-4802 | | d_free() {
0) sh-4802 | | call_rcu() {
0) sh-4802 | | __call_rcu() {
0) sh-4802 | 0.616 us | rcu_process_gp_end();
0) sh-4802 | 0.586 us | check_for_new_grace_period();
0) sh-4802 | 2.899 us | }
0) sh-4802 | 4.040 us | }
0) sh-4802 | 5.151 us | }
0) sh-4802 | + 49.370 us | }
(4)
在达到持续时间阈值的情况下，开销字段在持续时间字段之前。

hide: echo nofuncgraph-overhead > trace_options
show: echo funcgraph-overhead > trace_options
depends on: funcgraph-duration
比如：

3) # 1837.709 us | } /* __switch_to */
3) | finish_task_switch() {
3) 0.313 us | _raw_spin_unlock_irq();
3) 3.177 us | }
3) # 1889.063 us | } /* __schedule */
3) ! 140.417 us | } /* __schedule */
3) # 2034.948 us | } /* schedule */
3) * 33998.59 us | } /* schedule_preempt_disabled */
[...]
1) 0.260 us | msecs_to_jiffies();
1) 0.313 us | __rcu_read_unlock();
1) + 61.770 us | }
1) + 64.479 us | }
1) 0.313 us | rcu_bh_qs();
1) 0.313 us | __local_bh_enable();
1) ! 217.240 us | }
1) 0.365 us | idle_cpu();
1) | rcu_irq_exit() {
1) 0.417 us | rcu_eqs_enter_common.isra.47();
1) 3.125 us | }
1) ! 227.812 us | }
1) ! 457.395 us | }
1) @ 119760.2 us | }
[...]
2) | handle_IPI() {
1) 6.979 us | }
2) 0.417 us | scheduler_ipi();
1) 9.791 us | }
1) + 12.917 us | }
2) 3.490 us | }
1) + 15.729 us | }
1) + 18.542 us | }
2) $ 3594274 us | }
+ means that the function exceeded 10 usecs.
! means that the function exceeded 100 usecs.
# means that the function exceeded 1000 usecs.
* means that the function exceeded 10 msecs.
@ means that the function exceeded 100 msecs.
$ means that the function exceeded 1 sec.
（5）

对于开始在跟踪缓冲区中的函数，可以在右括号（the closing bracket）后显示函数名称，从而可以更轻松地使用 grep 搜索函数持续时间。默认禁用。

hide: echo nofuncgraph-tail > trace_options
show: echo funcgraph-tail > trace_options
nofuncgraph-tail 示例（默认）：

0) | putname() {
0) | kmem_cache_free() {
0) 0.518 us | __phys_addr();
0) 1.757 us | }
0) 2.861 us | }
funcgraph-tail 示例：

0) | putname() {
0) | kmem_cache_free() {
0) 0.518 us | __phys_addr();
0) 1.757 us | } /* kmem_cache_free() */ 右括号（代表函数结束）显示函数名称
0) 2.861 us | } /* putname() */ --右括号（代表函数结束）显示函数名称
（6）

可以使用 trace_printk() 对特定函数 function() 添加一些注释例如，如果您想在 function() 函数中添加注释，您只需包含 <linux/ftrace.h> 并在 function() 中调用 trace_printk() ：

trace_printk("I'm a comment!\n")
将产生：

1) | function() {
1) | /* I'm a comment! */
1) 1.449 us | }

（编辑：汽车网）

【声明】本站内容均来自网络，其相关言论仅代表作者个人观点，不代表本站立场。若无意侵犯到您的权利，请及时与联系站长删除相关内容!