Profile-information.html 9.3 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186
  1. <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
  2. <html>
  3. <!-- Copyright (C) 1988-2017 Free Software Foundation, Inc.
  4. Permission is granted to copy, distribute and/or modify this document
  5. under the terms of the GNU Free Documentation License, Version 1.3 or
  6. any later version published by the Free Software Foundation; with the
  7. Invariant Sections being "Funding Free Software", the Front-Cover
  8. Texts being (a) (see below), and with the Back-Cover Texts being (b)
  9. (see below). A copy of the license is included in the section entitled
  10. "GNU Free Documentation License".
  11. (a) The FSF's Front-Cover Text is:
  12. A GNU Manual
  13. (b) The FSF's Back-Cover Text is:
  14. You have freedom to copy and modify this GNU Manual, like GNU
  15. software. Copies published by the Free Software Foundation raise
  16. funds for GNU development. -->
  17. <!-- Created by GNU Texinfo 5.2, http://www.gnu.org/software/texinfo/ -->
  18. <head>
  19. <title>GNU Compiler Collection (GCC) Internals: Profile information</title>
  20. <meta name="description" content="GNU Compiler Collection (GCC) Internals: Profile information">
  21. <meta name="keywords" content="GNU Compiler Collection (GCC) Internals: Profile information">
  22. <meta name="resource-type" content="document">
  23. <meta name="distribution" content="global">
  24. <meta name="Generator" content="makeinfo">
  25. <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  26. <link href="index.html#Top" rel="start" title="Top">
  27. <link href="Option-Index.html#Option-Index" rel="index" title="Option Index">
  28. <link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
  29. <link href="Control-Flow.html#Control-Flow" rel="up" title="Control Flow">
  30. <link href="Maintaining-the-CFG.html#Maintaining-the-CFG" rel="next" title="Maintaining the CFG">
  31. <link href="Edges.html#Edges" rel="prev" title="Edges">
  32. <style type="text/css">
  33. <!--
  34. a.summary-letter {text-decoration: none}
  35. blockquote.smallquotation {font-size: smaller}
  36. div.display {margin-left: 3.2em}
  37. div.example {margin-left: 3.2em}
  38. div.indentedblock {margin-left: 3.2em}
  39. div.lisp {margin-left: 3.2em}
  40. div.smalldisplay {margin-left: 3.2em}
  41. div.smallexample {margin-left: 3.2em}
  42. div.smallindentedblock {margin-left: 3.2em; font-size: smaller}
  43. div.smalllisp {margin-left: 3.2em}
  44. kbd {font-style:oblique}
  45. pre.display {font-family: inherit}
  46. pre.format {font-family: inherit}
  47. pre.menu-comment {font-family: serif}
  48. pre.menu-preformatted {font-family: serif}
  49. pre.smalldisplay {font-family: inherit; font-size: smaller}
  50. pre.smallexample {font-size: smaller}
  51. pre.smallformat {font-family: inherit; font-size: smaller}
  52. pre.smalllisp {font-size: smaller}
  53. span.nocodebreak {white-space:nowrap}
  54. span.nolinebreak {white-space:nowrap}
  55. span.roman {font-family:serif; font-weight:normal}
  56. span.sansserif {font-family:sans-serif; font-weight:normal}
  57. ul.no-bullet {list-style: none}
  58. -->
  59. </style>
  60. </head>
  61. <body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000">
  62. <a name="Profile-information"></a>
  63. <div class="header">
  64. <p>
  65. Next: <a href="Maintaining-the-CFG.html#Maintaining-the-CFG" accesskey="n" rel="next">Maintaining the CFG</a>, Previous: <a href="Edges.html#Edges" accesskey="p" rel="prev">Edges</a>, Up: <a href="Control-Flow.html#Control-Flow" accesskey="u" rel="up">Control Flow</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p>
  66. </div>
  67. <hr>
  68. <a name="Profile-information-1"></a>
  69. <h3 class="section">14.3 Profile information</h3>
  70. <a name="index-profile-representation"></a>
  71. <p>In many cases a compiler must make a choice whether to trade speed in
  72. one part of code for speed in another, or to trade code size for code
  73. speed. In such cases it is useful to know information about how often
  74. some given block will be executed. That is the purpose for
  75. maintaining profile within the flow graph.
  76. GCC can handle profile information obtained through <em>profile
  77. feedback</em>, but it can also estimate branch probabilities based on
  78. statics and heuristics.
  79. </p>
  80. <a name="index-profile-feedback"></a>
  81. <p>The feedback based profile is produced by compiling the program with
  82. instrumentation, executing it on a train run and reading the numbers
  83. of executions of basic blocks and edges back to the compiler while
  84. re-compiling the program to produce the final executable. This method
  85. provides very accurate information about where a program spends most
  86. of its time on the train run. Whether it matches the average run of
  87. course depends on the choice of train data set, but several studies
  88. have shown that the behavior of a program usually changes just
  89. marginally over different data sets.
  90. </p>
  91. <a name="index-Static-profile-estimation"></a>
  92. <a name="index-branch-prediction"></a>
  93. <a name="index-predict_002edef"></a>
  94. <p>When profile feedback is not available, the compiler may be asked to
  95. attempt to predict the behavior of each branch in the program using a
  96. set of heuristics (see <samp>predict.def</samp> for details) and compute
  97. estimated frequencies of each basic block by propagating the
  98. probabilities over the graph.
  99. </p>
  100. <a name="index-frequency_002c-count_002c-BB_005fFREQ_005fBASE"></a>
  101. <p>Each <code>basic_block</code> contains two integer fields to represent
  102. profile information: <code>frequency</code> and <code>count</code>. The
  103. <code>frequency</code> is an estimation how often is basic block executed
  104. within a function. It is represented as an integer scaled in the
  105. range from 0 to <code>BB_FREQ_BASE</code>. The most frequently executed
  106. basic block in function is initially set to <code>BB_FREQ_BASE</code> and
  107. the rest of frequencies are scaled accordingly. During optimization,
  108. the frequency of the most frequent basic block can both decrease (for
  109. instance by loop unrolling) or grow (for instance by cross-jumping
  110. optimization), so scaling sometimes has to be performed multiple
  111. times.
  112. </p>
  113. <a name="index-gcov_005ftype"></a>
  114. <p>The <code>count</code> contains hard-counted numbers of execution measured
  115. during training runs and is nonzero only when profile feedback is
  116. available. This value is represented as the host&rsquo;s widest integer
  117. (typically a 64 bit integer) of the special type <code>gcov_type</code>.
  118. </p>
  119. <p>Most optimization passes can use only the frequency information of a
  120. basic block, but a few passes may want to know hard execution counts.
  121. The frequencies should always match the counts after scaling, however
  122. during updating of the profile information numerical error may
  123. accumulate into quite large errors.
  124. </p>
  125. <a name="index-REG_005fBR_005fPROB_005fBASE_002c-EDGE_005fFREQUENCY"></a>
  126. <p>Each edge also contains a branch probability field: an integer in the
  127. range from 0 to <code>REG_BR_PROB_BASE</code>. It represents probability of
  128. passing control from the end of the <code>src</code> basic block to the
  129. <code>dest</code> basic block, i.e. the probability that control will flow
  130. along this edge. The <code>EDGE_FREQUENCY</code> macro is available to
  131. compute how frequently a given edge is taken. There is a <code>count</code>
  132. field for each edge as well, representing same information as for a
  133. basic block.
  134. </p>
  135. <p>The basic block frequencies are not represented in the instruction
  136. stream, but in the RTL representation the edge frequencies are
  137. represented for conditional jumps (via the <code>REG_BR_PROB</code>
  138. macro) since they are used when instructions are output to the
  139. assembly file and the flow graph is no longer maintained.
  140. </p>
  141. <a name="index-reverse-probability"></a>
  142. <p>The probability that control flow arrives via a given edge to its
  143. destination basic block is called <em>reverse probability</em> and is not
  144. directly represented, but it may be easily computed from frequencies
  145. of basic blocks.
  146. </p>
  147. <a name="index-redirect_005fedge_005fand_005fbranch"></a>
  148. <p>Updating profile information is a delicate task that can unfortunately
  149. not be easily integrated with the CFG manipulation API. Many of the
  150. functions and hooks to modify the CFG, such as
  151. <code>redirect_edge_and_branch</code>, do not have enough information to
  152. easily update the profile, so updating it is in the majority of cases
  153. left up to the caller. It is difficult to uncover bugs in the profile
  154. updating code, because they manifest themselves only by producing
  155. worse code, and checking profile consistency is not possible because
  156. of numeric error accumulation. Hence special attention needs to be
  157. given to this issue in each pass that modifies the CFG.
  158. </p>
  159. <a name="index-REG_005fBR_005fPROB_005fBASE_002c-BB_005fFREQ_005fBASE_002c-count"></a>
  160. <p>It is important to point out that <code>REG_BR_PROB_BASE</code> and
  161. <code>BB_FREQ_BASE</code> are both set low enough to be possible to compute
  162. second power of any frequency or probability in the flow graph, it is
  163. not possible to even square the <code>count</code> field, as modern CPUs are
  164. fast enough to execute $2^32$ operations quickly.
  165. </p>
  166. <hr>
  167. <div class="header">
  168. <p>
  169. Next: <a href="Maintaining-the-CFG.html#Maintaining-the-CFG" accesskey="n" rel="next">Maintaining the CFG</a>, Previous: <a href="Edges.html#Edges" accesskey="p" rel="prev">Edges</a>, Up: <a href="Control-Flow.html#Control-Flow" accesskey="u" rel="up">Control Flow</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p>
  170. </div>
  171. </body>
  172. </html>