tprof (tools v4.0)

Process Tracing Profiling Tool

tprof provides convenience helpers for Erlang process profiling using the trace BIFs.

Warning
This module aims to replace eprof and cprof with a unified API for measuring call count, time, and allocation. It is experimental in Erlang/OTP 27.0.

It is possible to analyze the number of calls, the time spent by function, and heap allocations by function. Profiling can be done ad-hoc or run in a server-aided mode for deeper introspection of the code running in production.

Warning
Avoid hot code reloading for modules participating in the tracing. Reloading a module disables tracing and discards the accumulated statistics. The tprof results will probably be incorrect if the profiled code was reloaded during a profiling session.

There are three kinds of profiling supported by this module:

call_count
call_time
call_memory

The default is call_count, which has the smallest performance impact and memory footprint, but it does not support per-process profiling.

Erlang terms that do not fit in a single machine word are allocated on the process heap. For example, a function returning a tuple of two elements needs to allocate the tuple on the process heap. The actual consumption is three words, because the runtime system also needs an extra word to store the tuple size.
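The three-word figure can be checked from code. The sketch below uses erts_debug:flat_size/1, an internal debugging helper of the runtime rather than part of tprof, so treat it as an illustration only. The module name heap_words_demo is hypothetical.

```erlang
-module(heap_words_demo).
-export([sizes/0]).

%% Returns the heap size, in words, of a few sample terms.
%% erts_debug:flat_size/1 is an internal debugging helper (an assumption
%% of this sketch, not something tprof documents or depends on).
sizes() ->
    #{%% Small integers are immediates and need no heap space.
      small_integer => erts_debug:flat_size(42),
      %% One header word holding the size, plus two element words.
      pair => erts_debug:flat_size({a, b}),
      %% Each list cell takes two words (head and tail).
      three_element_list => erts_debug:flat_size([1, 2, 3])}.
```

Calling heap_words_demo:sizes() reports 3 words for the pair, which matches the tuple example in the paragraph above.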

Note
Expect a slowdown in the program execution when profiling is enabled.
For profiling convenience, measurements are accumulated for functions that are not enabled in some trace pattern. Consider this call stack example:
top_traced_function(...)
not_traced_function()
bottom_traced_function()
Allocations that happened within not_traced_function will be added to the allocations for top_traced_function. However, allocations that occurred within bottom_traced_function are not included in the top_traced_function. To keep track of only each function's own allocations, it is necessary to trace all functions.
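The accumulation rule can be tried out with a small module. This is a minimal sketch: the module alloc_demo and its functions are hypothetical, and the pattern option is used in the same form as in the ad-hoc examples of this page.

```erlang
-module(alloc_demo).
-export([run/0, top/1, middle/1, bottom/1]).

%% Each function builds one two-element tuple at run time,
%% so each one allocates on the process heap.
top(N) -> {top, middle(N)}.
middle(N) -> {middle, bottom(N)}.
bottom(N) -> {N, N}.

%% Profiles top/1 while tracing only top/1 and bottom/1.
%% middle/1 is deliberately left out of the trace pattern, so it
%% plays the role of not_traced_function in the call stack above.
run() ->
    tprof:profile(?MODULE, top, [1],
                  #{type => call_memory,
                    pattern => [{?MODULE, top, '_'},
                                {?MODULE, bottom, '_'}]}).
```

Under the rule described above, the report printed by alloc_demo:run() should attribute the tuple built in middle/1 to top/1, while bottom/1 is listed with only its own allocation. Adding {?MODULE, middle, '_'} to the pattern should separate the three.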

Ad-hoc profiling

Ad-hoc profiling is convenient for profiling a single function call.

For example:

1> tprof:profile(lists, seq, [1, 16], #{type => call_memory}).

****** Process <0.179.0>    -- 100.00 % of total allocations ***
FUNCTION          CALLS  WORDS  PER CALL  [     %]
lists:seq_loop/3      5     32         6  [100.00]
                            32            [ 100.0]
ok

By default tracing is enabled for all functions in all modules. When funs are created in the interactive shell, parts of shell code are also traced:

1> tprof:profile(fun() -> lists:seq(1, 16) end, #{type => call_memory}).

****** Process <0.224.0>    -- 100.00 % of total allocations ***
FUNCTION                   CALLS  WORDS  PER CALL  [    %]
erl_eval:match_list/6          1      3         3  [ 3.19]
erl_eval:do_apply/7            1      3         3  [ 3.19]
lists:reverse/1                1      4         4  [ 4.26]
erl_eval:add_bindings/2        1      5         5  [ 5.32]
erl_eval:expr_list/7           3      7         2  [ 7.45]
erl_eval:ret_expr/3            4     16         4  [17.02]
erl_eval:merge_bindings/4      3     24         8  [25.53]
lists:seq_loop/3               5     32         6  [34.04]

ok

However, it is possible to limit the trace to specific functions or modules:

2> tprof:profile(fun() -> lists:seq(1, 16) end,
                 #{type => call_memory, pattern => [{lists, seq_loop, '_'}]}).
****** Process <0.247.0>    -- 100.00 % of total allocations ***
FUNCTION          CALLS  WORDS  PER CALL  [     %]
lists:seq_loop/3      5     32         6  [100.00]

ok

Ad-hoc profiling results can be printed in a few different ways. The following examples use the test module defined like this:

-module(test).
-export([test_spawn/0]).
test_spawn() ->
    {Pid, MRef} = spawn_monitor(fun () -> lists:seq(1, 32) end),
    receive
        {'DOWN', MRef, process, Pid, normal} ->
            done
    end.

By default, per-process statistics are shown:

1> tprof:profile(test, test_spawn, [], #{type => call_memory}).

****** Process <0.176.0>    -- 23.66 % of total allocations ***
FUNCTION                CALLS  WORDS  PER CALL  [    %]
erlang:spawn_monitor/1      1      2         2  [ 9.09]
erlang:spawn_opt/4          1      6         6  [27.27]
test:test_spawn/0           1     14        14  [63.64]
                                    22            [100.0]

****** Process <0.177.0>    -- 76.34 % of total allocations ***
FUNCTION           CALLS  WORDS  PER CALL  [    %]
erlang:apply/2         1      7         7  [ 9.86]
lists:seq_loop/3       9     64         7  [90.14]
                             71            [100.0]

The following example prints the combined memory allocation of all processes, sorted by the total number of allocated words in descending order:

2> tprof:profile(test, test_spawn, [],
                 #{type => call_memory, report => {total, {measurement, descending}}}).

FUNCTION                CALLS  WORDS  PER CALL  [    %]
lists:seq_loop/3            9     64         7  [68.82]
test:test_spawn/0           1     14        14  [15.05]
erlang:apply/2              1      7         7  [ 7.53]
erlang:spawn_opt/4          1      6         6  [ 6.45]
erlang:spawn_monitor/1      1      2         2  [ 2.15]
                                  93            [100.0]

The profiling data can also be collected for further inspection:

3> {done, ProfileData} = tprof:profile(fun test:test_spawn/0,
                                       #{type => call_memory, report => return}).
<...>
4> tprof:format(tprof:inspect(ProfileData, process, {percent, descending})).

****** Process <0.223.0>    -- 23.66 % of total allocations ***
FUNCTION                CALLS  WORDS  PER CALL  [    %]
test:test_spawn/0           1     14        14  [63.64]
erlang:spawn_opt/4          1      6         6  [27.27]
erlang:spawn_monitor/1      1      2         2  [ 9.09]
                                  22            [100.0]

****** Process <0.224.0>    -- 76.34 % of total allocations ***
FUNCTION           CALLS  WORDS  PER CALL  [    %]
lists:seq_loop/3       9     64         7  [90.14]
erlang:apply/2         1      7         7  [ 9.86]
                             71            [100.0]
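The two steps above (collect, then inspect and format) can be wrapped in a helper. The sketch below is one possible arrangement, not part of tprof: the module tprof_report_demo and the function profile_total/1 are hypothetical names, and it assumes that inspect/3 accepts total as its second argument in the same way that the report option does.

```erlang
-module(tprof_report_demo).
-export([profile_total/1]).

%% Runs Fun under call_memory profiling, prints the combined profile of
%% all traced processes sorted by allocated words (largest first), and
%% returns whatever Fun returned.
profile_total(Fun) when is_function(Fun, 0) ->
    %% With report => return, nothing is printed by profile/2 itself.
    {Result, ProfileData} =
        tprof:profile(Fun, #{type => call_memory, report => return}),
    %% Assumption: 'total' is accepted here as it is in the report option.
    Inspected = tprof:inspect(ProfileData, total, {measurement, descending}),
    tprof:format(Inspected),
    Result.
```

For example, tprof_report_demo:profile_total(fun test:test_spawn/0) would print a single combined table and return done.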

Which processes are profiled depends on the profiling type.

call_count (default) counts calls in all processes.
call_time and call_memory limit the profiling to the processes spawned from the user-provided function (using the set_on_spawn option for erlang:trace/3).

call_time and call_memory can be restricted to profile a single process:

2> tprof:profile(test, test_spawn, [],
                 #{type => call_memory, set_on_spawn => false}).

****** Process <0.183.0>    -- 100.00 % of total allocations ***
FUNCTION                CALLS  WORDS  PER CALL  [    %]
erlang:spawn_monitor/1      1      2         2  [ 9.09]
erlang:spawn_opt/4          1      6         6  [27.27]
test:test_spawn/0           1     14        14  [63.64]

Erlang programs can perform expensive operations in processes other than the original one. You can include multiple, new, or even all processes in the trace when measuring time or memory:

7> pg:start_link().
{ok,<0.252.0>}
8> tprof:profile(fun() -> pg:join(group, self()) end,
                 #{type => call_memory, rootset => [pg]}).
****** Process <0.252.0>    -- 52.86 % of total allocations ***
FUNCTION                      CALLS  WORDS  PER CALL  [    %]
pg:leave_local_update_ets/5       1      2         2  [ 1.80]
gen:reply/2                       1      3         3  [ 2.70]
erlang:monitor/2                  1      3         3  [ 2.70]
gen_server:try_handle_call/4      1      3         3  [ 2.70]
gen_server:try_dispatch/4         1      3         3  [ 2.70]
maps:iterator/1                   2      4         2  [ 3.60]
maps:take/2                       1      6         6  [ 5.41]
pg:join_local_update_ets/5        1      8         8  [ 7.21]
pg:handle_info/2                  1      8         8  [ 7.21]
pg:handle_call/3                  1      9         9  [ 8.11]
gen_server:loop/7                 2      9         4  [ 8.11]
ets:lookup/2                      2     10         5  [ 9.01]
pg:join_local/3                   1     11        11  [ 9.91]
pg:notify_group/5                 2     16         8  [14.41]
erlang:setelement/3               2     16         8  [14.41]
                                       111            [100.0]

****** Process <0.255.0>    -- 47.14 % of total allocations ***
FUNCTION                   CALLS  WORDS  PER CALL  [    %]
erl_eval:match_list/6          1      3         3  [ 3.03]
erlang:monitor/2               1      3         3  [ 3.03]
lists:reverse/1                2      4         2  [ 4.04]
pg:join/3                      1      4         4  [ 4.04]
erl_eval:add_bindings/2        1      5         5  [ 5.05]
erl_eval:do_apply/7            2      6         3  [ 6.06]
gen:call/4                     1      8         8  [ 8.08]
erl_eval:expr_list/7           4     10         2  [10.10]
gen:do_call/4                  1     16        16  [16.16]
erl_eval:ret_expr/3            4     16         4  [16.16]
erl_eval:merge_bindings/4      3     24         8  [24.24]
                                     99            [100.0]

By default, there is no limit for the profiling time. For ad-hoc profiling, it is possible to configure a time limit. If the profiled function does not return before that time expires, the process is terminated with reason kill. Any unlinked child processes started by the user-supplied function are kept; it is the responsibility of the developer to take care of such processes.

9> tprof:profile(timer, sleep, [100000], #{timeout => 1000}).

By default, only one ad-hoc or server-aided profiling session is allowed at any point in time. It is possible to force multiple ad-hoc sessions concurrently, but it is the responsibility of the developer to ensure that trace patterns do not overlap:

1> tprof:profile(fun() -> lists:seq(1, 32) end,
    #{registered => false, pattern => [{lists, '_', '_'}]}).

Server-aided profiling

Server-aided profiling can be done on a system that is up and running. To do that, start the tprof server, and then add trace patterns and processes to trace while the system handles actual traffic. Data can be extracted, inspected, and printed at any time. The following example traces activity of all processes supervised by the Kernel supervisor:

1> tprof:start(#{type => call_memory}).
{ok,<0.200.0>}
2> tprof:enable_trace({all_children, kernel_sup}).
34
3> tprof:set_pattern('_', '_', '_').
16728
4> Sample = tprof:collect().
[{gen_server,try_dispatch,4,[{<0.154.0>,2,6}]},
{erlang,iolist_to_iovec,1,[{<0.161.0>,1,8}]},
<...>
5> tprof:format(tprof:inspect(Sample)).

****** Process <0.154.0>    -- 14.21 % of total allocations ***
FUNCTION                   CALLS  WORDS  PER CALL  [    %]
maps:iterator/1                2      4         2  [15.38]
gen_server:try_dispatch/4      2      6         3  [23.08]
net_kernel:handle_info/2       2     16         8  [61.54]
                                     26            [100.0]

****** Process <0.161.0>    -- 85.79 % of total allocations ***
FUNCTION                        CALLS  WORDS  PER CALL  [    %]
disk_log:handle/2                   2      2         1  [ 1.27]
disk_log_1:maybe_start_timer/1      1      3         3  [ 1.91]
disk_log_1:mf_write_cache/1         1      3         3  [ 1.91]
<...>
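The raw sample can also be processed without inspect/3. The sketch below sums the allocated words per process. It assumes that the sample is a list of {Module, Function, Arity, [{Pid, Calls, Words}]} tuples, which is the shape printed in the example above and not a documented guarantee; the module and function names are hypothetical.

```erlang
-module(tprof_sample_demo).
-export([words_per_process/1]).

%% Sums the allocated words per process over a raw sample.
%% Assumed shape (taken from the printed example, not from a spec):
%%   [{Module, Function, Arity, [{Pid, Calls, Words}]}]
words_per_process(Sample) ->
    lists:foldl(
      fun({_Module, _Function, _Arity, PerProcess}, Acc0) ->
              lists:foldl(
                fun({Pid, _Calls, Words}, Acc) ->
                        %% Add to the running total, or start a new one.
                        maps:update_with(Pid,
                                         fun(Sum) -> Sum + Words end,
                                         Words, Acc)
                end, Acc0, PerProcess)
      end, #{}, Sample).
```

The result is a map from process identifier to total words, which can be used to choose which processes deserve a closer look.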

It is possible to profile the entire running system, and then examine individual processes:

1> tprof:start(#{type => call_memory}).
2> tprof:enable_trace(processes), tprof:set_pattern('_', '_', '_').
9041
3> timer:sleep(10000), tprof:disable_trace(processes), Sample = tprof:collect().
[{user_drv,server,3,[{<0.64.0>,12,136}]},
{user_drv,contains_ctrl_g_or_ctrl_c,1,[{<0.64.0>,80,10}]},
<...>
4> Inspected = tprof:inspect(Sample, process, words), Shell = maps:get(self(), Inspected).
{2743,
[{shell,{enc,0},1,2,2,0.07291286912139992},
<...>
5> tprof:format(Shell).

FUNCTION                           CALLS  WORDS  PER CALL  [    %]
<...>
erl_lint:start/2                       2    300       150  [10.94]
shell:used_records/1                 114    342         3  [12.47]

Summary

Types

column()

Column to sort by inspect/3 or profile/4.

process()

A process identifier (pid) or a registered process name.

profile_line()

Inspected data for a single function of the specified Module.

profile_options()

Ad-hoc profiler options; see profile/4.

profile_result()

Profile of a single process, or combined profile of multiple processes, sorted by a selected column.

rootset()

sort_by()

trace_info()

Raw data extracted from tracing BIFs.

trace_map()

Traced functions (with their arities) grouped by module name, or all if all code is traced.

trace_options()

Options for enabling profiling of the selected processes; see enable_trace/2.

trace_pattern()

trace_type()

Functions

clear_pattern(Mod, Fun, Arity)

Disables tracing functions matching the supplied pattern.

continue()

Resumes previously paused profiling.

disable_trace(Rootset)

Equivalent to disable_trace(Rootset, #{set_on_spawn => true}).

disable_trace/2

Stops accumulating traces for specified processes.

enable_trace(Rootset)

Equivalent to enable_trace(Rootset, #{set_on_spawn => true}).

enable_trace/2

Similar to erlang:trace/3, but supports a few more options for tracing convenience.

format(Inspected)

Formats profile data transformed with inspect/3, outputting to the default output device.

format(IoDevice, Inspected)

Formats profile data transformed with inspect/3, outputting to device IoDevice.

get_trace_map()

Returns a map of module names to functions with their arities.

inspect(Profile)

Equivalent to inspect(Profile, process, percent).

inspect(Profile, Type, SortBy)

Transforms raw data returned by tracing BIFs into a form convenient for subsequent analysis and formatting.

pause()

Pauses trace collection for all currently traced functions, retaining existing traces.

profile(Fun)

Equivalent to profile(Fun, #{}).

profile(Fun, Options)

Does ad-hoc profiling of the call Fun().

profile(Module, Function, Args)

Equivalent to profile(Module, Function, Args, #{}).

profile(Module, Function, Args, Options)

Does ad-hoc profiling for the call apply(Module, Function, Args).

restart()

Clears accumulated profiles and starts profiling if it was paused.

set_pattern(Mod, Fun, Arity)

Enables tracing for all functions matching the supplied pattern.

start()

Starts the server, not supervised.

start_link()

Starts the server supervised by the calling process.

stop()

Stops the tprof server and disables tracing enabled by the server.

Types

column()