Analyze

Analyze is a command-line tool for analyzing profiles produced by apitrace. It can analyze a single profile, or a pair of profiles produced by replaying the same trace on different platforms.

Producing profiles

Profiles are generated by running the apitrace command glretrace on a trace file. Analyze can parse profile info with GPU and CPU timing information. To produce such files, you must run glretrace with options --pcpu and --pgpu. Most of the time, you'll also want to use option --min-cpu-time=0 to ensure that the profile includes all calls. Without it, only calls taking 1 microsecond or more will be captured.

Example:

glretrace --pcpu --pgpu --min-cpu-time=0 traces_linux_10181_payday2_release.trace > profile_data.txt

Beware! When generating profiles by running traces in a Virgl environment, generating GPU timing can significantly skew the CPU timing. In such cases, it is better to capture CPU and GPU timings in separate profiles. (For the curious mind, this is because glretrace inserts query operations around each call. Round-tripping these query ops through Virgl adds to the CPU time.)

Building Analyze

  1. Make sure the Go tools are installed on your machine.
  2. From the directory .../src/platform/dev/contrib/gfx/perf-tools/profile-analysis, run make.

The analyze binary is placed in .../src/platform/dev/contrib/gfx/perf-tools/profile-analysis/bin.

Running Analyze

Once you have a profile and you've successfully built analyze, you're ready to analyze your first profile. Run analyze from the profile-analysis directory (the parent of bin) as follows:

./bin/analyze <path to profile_data.txt>

Profiles can be very large, and loading one might take a few minutes. Once that is done, analyze will enter console mode:

Reading profile_data.txt....
6222 frames read from profile_data.txt
->

To get started, let's type a command to show the 5 most expensive frames by average CPU time:

-> show-frames n=5 s=bycpuavg
Profile: trime_trace_virgl.txt
  frame num   calls           GPU total            CPU total
---------------------------------------------------------------------
          3   64629             353.7 mS                3.6 S
       1119   18811             331.3 mS             513.4 mS
        912   35751             754.5 uS             348.5 mS
       5986    9316               4.7 mS             301.8 mS
       5861    7192               3.6 mS             243.9 mS
->

A few more useful tidbits about the console:

  1. Type help to get basic help and help <command> to get more detailed help for that command.
  2. The console supports history. Use the up/down arrows to move back and forth between older commands.
  3. The console has limited support for tab completion.
  4. Type quit to exit analyze. (Command history is preserved across runs.)

Analyzing dual profiles

Analyze is even more useful when processing two related profiles together. Two profiles are “related” if they were generated from the same trace, albeit on different platforms. To analyze two profiles together, run analyze with both as follows:

./bin/analyze <profile_data1.txt> <profile_data2.txt>

After loading the two profiles, analyze will check that they are compatible. (It does so by ensuring that matching calls in the two profiles call into the same GL function.)

Here's a sample command that works on two profiles simultaneously:

-> show-frame-details 1646 gt=0 ct=500000
Frame details for frames #1646 to 1646
                                                   GPU       CPU     GPU 1  CPU 1         GPU       CPU      GPU 2   CPU 2
 frame                      call name    call #    Prof 1    Prof 1    %      %           Prof 2    Prof 2     %       %
----------------------------------------------------------------------------------------------------------------------------
  1646                glCompileShader  10442713    0.0 nS    1.8 mS   0.0%   3.8%         0.0 nS    1.2 mS   0.0%   0.9%
  1646                  glLinkProgram  10442733    0.0 nS    7.5 mS   0.0%  15.8%         0.0 nS   20.9 mS   0.0%  14.9%
  1646            glDrawRangeElements  10442826   30.8 uS  733.6 uS   0.8%   1.5%        62.2 uS   38.1 uS   0.4%   0.0%
  1646                glCompileShader  10442841    0.0 nS    1.7 mS   0.0%   3.6%         0.0 nS    1.1 mS   0.0%   0.8%
  1646                glCompileShader  10442845    0.0 nS    4.2 mS   0.0%   8.9%         0.0 nS   24.6 uS   0.0%   0.0%
  1646                  glLinkProgram  10442866    0.0 nS    7.4 mS   0.0%  15.5%         0.0 nS   25.8 mS   0.0%  18.4%
  1646            glDrawRangeElements  10442953   29.4 uS  772.0 uS   0.8%   1.6%        64.6 uS   31.5 uS   0.4%   0.0%
  1646                        glFlush  10443734    0.0 nS   21.2 mS   0.0%  44.6%         0.0 nS  125.2 uS   0.0%   0.1%
1 frames out of 1 shown, or 100.0%

When analyzing two related profiles, it is even more important to use option --min-cpu-time=0 when generating the profiles with glretrace. That is because, without it, glretrace will filter out different calls on each platform. You will likely end up with non-matching subsets of GL calls in each profile. They will still load in analyze successfully, but they will be harder to compare accurately.