Processor-pipeline-description.html 28 KB

  1. <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
  2. <html>
  3. <!-- Copyright (C) 1988-2017 Free Software Foundation, Inc.
  4. Permission is granted to copy, distribute and/or modify this document
  5. under the terms of the GNU Free Documentation License, Version 1.3 or
  6. any later version published by the Free Software Foundation; with the
  7. Invariant Sections being "Funding Free Software", the Front-Cover
  8. Texts being (a) (see below), and with the Back-Cover Texts being (b)
  9. (see below). A copy of the license is included in the section entitled
  10. "GNU Free Documentation License".
  11. (a) The FSF's Front-Cover Text is:
  12. A GNU Manual
  13. (b) The FSF's Back-Cover Text is:
  14. You have freedom to copy and modify this GNU Manual, like GNU
  15. software. Copies published by the Free Software Foundation raise
  16. funds for GNU development. -->
  17. <!-- Created by GNU Texinfo 5.2, http://www.gnu.org/software/texinfo/ -->
  18. <head>
  19. <title>GNU Compiler Collection (GCC) Internals: Processor pipeline description</title>
  20. <meta name="description" content="GNU Compiler Collection (GCC) Internals: Processor pipeline description">
  21. <meta name="keywords" content="GNU Compiler Collection (GCC) Internals: Processor pipeline description">
  22. <meta name="resource-type" content="document">
  23. <meta name="distribution" content="global">
  24. <meta name="Generator" content="makeinfo">
  25. <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  26. <link href="index.html#Top" rel="start" title="Top">
  27. <link href="Option-Index.html#Option-Index" rel="index" title="Option Index">
  28. <link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
  29. <link href="Insn-Attributes.html#Insn-Attributes" rel="up" title="Insn Attributes">
  30. <link href="Conditional-Execution.html#Conditional-Execution" rel="next" title="Conditional Execution">
  31. <link href="Delay-Slots.html#Delay-Slots" rel="prev" title="Delay Slots">
  32. <style type="text/css">
  33. <!--
  34. a.summary-letter {text-decoration: none}
  35. blockquote.smallquotation {font-size: smaller}
  36. div.display {margin-left: 3.2em}
  37. div.example {margin-left: 3.2em}
  38. div.indentedblock {margin-left: 3.2em}
  39. div.lisp {margin-left: 3.2em}
  40. div.smalldisplay {margin-left: 3.2em}
  41. div.smallexample {margin-left: 3.2em}
  42. div.smallindentedblock {margin-left: 3.2em; font-size: smaller}
  43. div.smalllisp {margin-left: 3.2em}
  44. kbd {font-style:oblique}
  45. pre.display {font-family: inherit}
  46. pre.format {font-family: inherit}
  47. pre.menu-comment {font-family: serif}
  48. pre.menu-preformatted {font-family: serif}
  49. pre.smalldisplay {font-family: inherit; font-size: smaller}
  50. pre.smallexample {font-size: smaller}
  51. pre.smallformat {font-family: inherit; font-size: smaller}
  52. pre.smalllisp {font-size: smaller}
  53. span.nocodebreak {white-space:nowrap}
  54. span.nolinebreak {white-space:nowrap}
  55. span.roman {font-family:serif; font-weight:normal}
  56. span.sansserif {font-family:sans-serif; font-weight:normal}
  57. ul.no-bullet {list-style: none}
  58. -->
  59. </style>
  60. </head>
  61. <body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000">
  62. <a name="Processor-pipeline-description"></a>
  63. <div class="header">
  64. <p>
  65. Previous: <a href="Delay-Slots.html#Delay-Slots" accesskey="p" rel="prev">Delay Slots</a>, Up: <a href="Insn-Attributes.html#Insn-Attributes" accesskey="u" rel="up">Insn Attributes</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p>
  66. </div>
  67. <hr>
  68. <a name="Specifying-processor-pipeline-description"></a>
  69. <h4 class="subsection">16.19.9 Specifying processor pipeline description</h4>
  70. <a name="index-processor-pipeline-description"></a>
  71. <a name="index-processor-functional-units"></a>
  72. <a name="index-instruction-latency-time"></a>
  73. <a name="index-interlock-delays"></a>
  74. <a name="index-data-dependence-delays"></a>
  75. <a name="index-reservation-delays"></a>
  76. <a name="index-pipeline-hazard-recognizer"></a>
  77. <a name="index-automaton-based-pipeline-description"></a>
  78. <a name="index-regular-expressions"></a>
  79. <a name="index-deterministic-finite-state-automaton"></a>
  80. <a name="index-automaton-based-scheduler"></a>
  81. <a name="index-RISC"></a>
  82. <a name="index-VLIW"></a>
  83. <p>To achieve better performance, most modern processors
  84. (super-pipelined, superscalar <acronym>RISC</acronym>, and <acronym>VLIW</acronym>
  85. processors) have many <em>functional units</em> on which several
  86. instructions can be executed simultaneously. An instruction starts
  87. execution if its issue conditions are satisfied. If not, the
  88. instruction is stalled until its conditions are satisfied. Such an
  89. <em>interlock (pipeline) delay</em> causes interruption of the fetching
  90. of successor instructions (or demands nop instructions, e.g. for some
  91. MIPS processors).
  92. </p>
  93. <p>There are two major kinds of interlock delays in modern processors.
  94. The first one is a data dependence delay determining <em>instruction
  95. latency time</em>. The instruction execution is not started until all
  96. source data have been evaluated by prior instructions (there are more
  97. complex cases when the instruction execution starts even when the data
  98. are not available but will be ready in given time after the
  99. instruction execution start). Taking the data dependence delays into
  100. account is simple. The data dependence (true, output, and
  101. anti-dependence) delay between two instructions is given by a
  102. constant. In most cases this approach is adequate. The second kind
  103. of interlock delay is a reservation delay. A reservation delay
  104. arises when two instructions under execution need the same shared
  105. processor resources, i.e. buses, internal registers, and/or
  106. functional units, which are reserved for some time. Taking this kind
  107. of delay into account is complex, especially for modern <acronym>RISC</acronym>
  108. processors.
  109. </p>
  110. <p>The task of exploiting more processor parallelism is solved by an
  111. instruction scheduler. For a better solution to this problem, the
  112. instruction scheduler has to have an adequate description of the
  113. processor parallelism (or <em>pipeline description</em>). GCC
  114. machine descriptions describe processor parallelism and functional
  115. unit reservations for groups of instructions with the aid of
  116. <em>regular expressions</em>.
  117. </p>
  118. <p>The GCC instruction scheduler uses a <em>pipeline hazard recognizer</em> to
  119. figure out the possibility of the instruction issue by the processor
  120. on a given simulated processor cycle. The pipeline hazard recognizer is
  121. automatically generated from the processor pipeline description. The
  122. pipeline hazard recognizer generated from the machine description
  123. is based on a deterministic finite state automaton (<acronym>DFA</acronym>):
  124. the instruction issue is possible if there is a transition from one
  125. automaton state to another one. This algorithm is very fast, and
  126. furthermore, its speed is not dependent on processor
  127. complexity<a name="DOCF5" href="#FOOT5"><sup>5</sup></a>.
  128. </p>
  129. <a name="index-automaton-based-pipeline-description-1"></a>
  130. <p>The rest of this section describes the directives that constitute
  131. an automaton-based processor pipeline description. The order of
  132. these constructions within the machine description file is not
  133. important.
  134. </p>
  135. <a name="index-define_005fautomaton"></a>
  136. <a name="index-pipeline-hazard-recognizer-1"></a>
  137. <p>The following optional construction describes names of automata
  138. generated and used for the pipeline hazards recognition. Sometimes
  139. the generated finite state automaton used by the pipeline hazard
  140. recognizer is large. If we use more than one automaton and bind functional
  141. units to the automata, the total size of the automata is usually
  142. less than the size of the single automaton. If there is no such
  143. construction, only one finite state automaton is generated.
  144. </p>
  145. <div class="smallexample">
  146. <pre class="smallexample">(define_automaton <var>automata-names</var>)
  147. </pre></div>
  148. <p><var>automata-names</var> is a string giving names of the automata. The
  149. names are separated by commas. All the automata should have unique names.
  150. The automaton name is used in the constructions <code>define_cpu_unit</code> and
  151. <code>define_query_cpu_unit</code>.
  152. </p>
  153. <a name="index-define_005fcpu_005funit"></a>
  154. <a name="index-processor-functional-units-1"></a>
  155. <p>Each processor functional unit used in the description of instruction
  156. reservations should be described by the following construction.
  157. </p>
  158. <div class="smallexample">
  159. <pre class="smallexample">(define_cpu_unit <var>unit-names</var> [<var>automaton-name</var>])
  160. </pre></div>
  161. <p><var>unit-names</var> is a string giving the names of the functional units
  162. separated by commas. Do not use the name &lsquo;<samp>nothing</samp>&rsquo;; it is reserved
  163. for other purposes.
  164. </p>
  165. <p><var>automaton-name</var> is a string giving the name of the automaton with
  166. which the unit is bound. The automaton should be described in
  167. construction <code>define_automaton</code>. You should give
  168. <var>automaton-name</var> whenever an automaton is defined.
  169. </p>
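<p>As a minimal sketch of how <code>define_automaton</code> and
<code>define_cpu_unit</code> fit together, the fragment below declares two
automata and binds units to them. The automaton and unit names are
hypothetical and chosen only for illustration.
</p>
<div class="smallexample">
<pre class="smallexample">;; Hypothetical names: two automata, one per group of units.
(define_automaton &quot;int_core, fp_core&quot;)
;; The integer pipelines are bound to the first automaton.
(define_cpu_unit &quot;int0, int1&quot; &quot;int_core&quot;)
;; The floating point units are bound to the second automaton.
(define_cpu_unit &quot;fadd, fmul&quot; &quot;fp_core&quot;)
</pre></div>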
  170. <p>The assignment of units to automata is constrained by the uses of the
  171. units in insn reservations. The most important constraint is: if a
  172. unit reservation is present on a particular cycle of an alternative
  173. for an insn reservation, then some unit from the same automaton must
  174. be present on the same cycle for the other alternatives of the insn
  175. reservation. The rest of the constraints are mentioned in the
  176. description of the subsequent constructions.
  177. </p>
  178. <a name="index-define_005fquery_005fcpu_005funit"></a>
  179. <a name="index-querying-function-unit-reservations"></a>
  180. <p>The following construction describes CPU functional units analogously
  181. to <code>define_cpu_unit</code>. The reservation of such units can be
  182. queried for an automaton state. The instruction scheduler never
  183. queries the reservation of functional units for a given automaton state, so
  184. as a rule, you don&rsquo;t need this construction. This construction could
  185. be used for future code generation goals (e.g. to generate
  186. <acronym>VLIW</acronym> insn templates).
  187. </p>
  188. <div class="smallexample">
  189. <pre class="smallexample">(define_query_cpu_unit <var>unit-names</var> [<var>automaton-name</var>])
  190. </pre></div>
  191. <p><var>unit-names</var> is a string giving names of the functional units
  192. separated by commas.
  193. </p>
  194. <p><var>automaton-name</var> is a string giving the name of the automaton with
  195. which the unit is bound.
  196. </p>
  197. <a name="index-define_005finsn_005freservation"></a>
  198. <a name="index-instruction-latency-time-1"></a>
  199. <a name="index-regular-expressions-1"></a>
  200. <a name="index-data-bypass"></a>
  201. <p>The following construction is the major one to describe pipeline
  202. characteristics of an instruction.
  203. </p>
  204. <div class="smallexample">
  205. <pre class="smallexample">(define_insn_reservation <var>insn-name</var> <var>default_latency</var>
  206. <var>condition</var> <var>regexp</var>)
  207. </pre></div>
  208. <p><var>default_latency</var> is a number giving the latency time of the
  209. instruction. There is an important difference between the old
  210. description and the automaton based pipeline description. The latency
  211. time is used for all dependencies when we use the old description. In
  212. the automaton based pipeline description, the given latency time is only
  213. used for true dependencies. The cost of anti-dependencies is always
  214. zero and the cost of output dependencies is the difference between
  215. latency times of the producing and consuming insns (if the difference
  216. is negative, the cost is considered to be zero). You can always
  217. change the default costs for any description by using the target hook
  218. <code>TARGET_SCHED_ADJUST_COST</code> (see <a href="Scheduling.html#Scheduling">Scheduling</a>).
  219. </p>
  220. <p><var>insn-name</var> is a string giving the internal name of the insn. The
  221. internal names are used in constructions <code>define_bypass</code> and in
  222. the automaton description file generated for debugging. The internal
  223. name has nothing in common with the names in <code>define_insn</code>. It is a
  224. good practice to use insn classes described in the processor manual.
  225. </p>
  226. <p><var>condition</var> defines what RTL insns are described by this
  227. construction. Make sure that the
  228. <var>condition</var> of at most one
  229. <code>define_insn_reservation</code> construction is TRUE for any given insn;
  230. otherwise, which reservation is used for the insn is not defined.
  231. Such cases are not checked during generation of the pipeline hazards
  232. recognizer because in general recognizing that two conditions may have
  233. the same value is quite difficult (especially if the conditions
  234. contain <code>symbol_ref</code>). Nor is it checked while the
  235. pipeline hazard recognizer runs, because that would slow down the
  236. recognizer considerably.
  237. </p>
  238. <p><var>regexp</var> is a string describing the reservation of the cpu&rsquo;s functional
  239. units by the instruction. The reservations are described by a regular
  240. expression according to the following syntax:
  241. </p>
  242. <div class="smallexample">
  243. <pre class="smallexample"> regexp = regexp &quot;,&quot; oneof
  244. | oneof
  245. oneof = oneof &quot;|&quot; allof
  246. | allof
  247. allof = allof &quot;+&quot; repeat
  248. | repeat
  249. repeat = element &quot;*&quot; number
  250. | element
  251. element = cpu_function_unit_name
  252. | reservation_name
  253. | result_name
  254. | &quot;nothing&quot;
  255. | &quot;(&quot; regexp &quot;)&quot;
  256. </pre></div>
  257. <ul>
  258. <li> &lsquo;<samp>,</samp>&rsquo; is used for describing the start of the next cycle in
  259. the reservation.
  260. </li><li> &lsquo;<samp>|</samp>&rsquo; is used for describing a reservation described by the first
  261. regular expression <strong>or</strong> a reservation described by the second
  262. regular expression <strong>or</strong> etc.
  263. </li><li> &lsquo;<samp>+</samp>&rsquo; is used for describing a reservation described by the first
  264. regular expression <strong>and</strong> a reservation described by the
  265. second regular expression <strong>and</strong> etc.
  266. </li><li> &lsquo;<samp>*</samp>&rsquo; is used for convenience and simply means a sequence in which
  267. the regular expression is repeated <var>number</var> times with cycle
  268. advancing (see &lsquo;<samp>,</samp>&rsquo;).
  269. </li><li> &lsquo;<samp>cpu_function_unit_name</samp>&rsquo; denotes reservation of the named
  270. functional unit.
  271. </li><li> &lsquo;<samp>reservation_name</samp>&rsquo; &mdash; see description of construction
  272. &lsquo;<samp>define_reservation</samp>&rsquo;.
  273. </li><li> &lsquo;<samp>nothing</samp>&rsquo; denotes no unit reservations.
  274. </li></ul>
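<p>As an illustration of the operators listed above, the sketch below
reads one reservation string cycle by cycle. The unit names, the
reservation name, the latency and the attribute value are all hypothetical.
</p>
<div class="smallexample">
<pre class="smallexample">(define_cpu_unit &quot;alu0, alu1, mem, wb&quot;)
;; cycle 1:    reserve alu0 or alu1
;; cycles 2-3: reserve mem (mem*2 is shorthand for &quot;mem, mem&quot;)
;; cycle 4:    reserve no unit
;; cycle 5:    reserve mem and wb together
(define_insn_reservation &quot;sketch_load&quot; 5 (eq_attr &quot;type&quot; &quot;load&quot;)
&quot;(alu0 | alu1), mem*2, nothing, mem + wb&quot;)
</pre></div>
<p>In the grammar, &lsquo;<samp>*</samp>&rsquo; binds most tightly, followed by
&lsquo;<samp>+</samp>&rsquo;, &lsquo;<samp>|</samp>&rsquo; and &lsquo;<samp>,</samp>&rsquo;, so the
parentheses are needed only around the alternative in the first cycle.
</p>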
  275. <a name="index-define_005freservation"></a>
  276. <p>Sometimes unit reservations for different insns contain common parts.
  277. In such a case, you can simplify the pipeline description by describing
  278. the common part with the following construction:
  279. </p>
  280. <div class="smallexample">
  281. <pre class="smallexample">(define_reservation <var>reservation-name</var> <var>regexp</var>)
  282. </pre></div>
  283. <p><var>reservation-name</var> is a string giving the name of <var>regexp</var>.
  284. Functional unit names and reservation names are in the same name
  285. space, so reservation names must differ from the
  286. functional unit names and cannot be the reserved name &lsquo;<samp>nothing</samp>&rsquo;.
  287. </p>
  288. <a name="index-define_005fbypass"></a>
  289. <a name="index-instruction-latency-time-2"></a>
  290. <a name="index-data-bypass-1"></a>
  291. <p>The following construction is used to describe exceptions in the
  292. latency time for a given instruction pair. Such exceptions are called bypasses.
  293. </p>
  294. <div class="smallexample">
  295. <pre class="smallexample">(define_bypass <var>number</var> <var>out_insn_names</var> <var>in_insn_names</var>
  296. [<var>guard</var>])
  297. </pre></div>
  298. <p><var>number</var> defines when the result generated by the instructions
  299. given in string <var>out_insn_names</var> will be ready for the
  300. instructions given in string <var>in_insn_names</var>. Each of these
  301. strings is a comma-separated list of filename-style globs and
  302. they refer to the names of <code>define_insn_reservation</code>s.
  303. For example:
  304. </p><div class="smallexample">
  305. <pre class="smallexample">(define_bypass 1 &quot;cpu1_load_*, cpu1_store_*&quot; &quot;cpu1_load_*&quot;)
  306. </pre></div>
  307. <p>defines a bypass between instructions that start with
  308. &lsquo;<samp>cpu1_load_</samp>&rsquo; or &lsquo;<samp>cpu1_store_</samp>&rsquo; and those that start with
  309. &lsquo;<samp>cpu1_load_</samp>&rsquo;.
  310. </p>
  311. <p><var>guard</var> is an optional string giving the name of a C function which
  312. defines an additional guard for the bypass. The function will get the
  313. two insns as parameters. If the function returns zero, the bypass is
  314. ignored for that pair of insns. The additional guard is necessary to
  315. recognize complicated bypasses, e.g. when the consumer uses the result only as the address
  316. of a &lsquo;<samp>store</samp>&rsquo; insn (not as the stored value).
  317. </p>
  318. <p>If there is more than one bypass with the same output and input insns, the
  319. chosen bypass is the first bypass with a guard in the description whose
  320. guard function returns nonzero. If there is no such bypass, then
  321. the bypass without a guard function is chosen.
  322. </p>
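<p>A guarded bypass and its unguarded fallback might be written as in the
following sketch. The reservation name patterns and the guard function name
&lsquo;<samp>cpu1_store_addr_p</samp>&rsquo; are hypothetical; the guard is assumed to
be a C function in the port that receives the two insns and returns nonzero
when the guarded bypass applies.
</p>
<div class="smallexample">
<pre class="smallexample">;; Guarded bypass: used when cpu1_store_addr_p (hypothetical) returns nonzero.
(define_bypass 1 &quot;cpu1_alu_*&quot; &quot;cpu1_store_*&quot; &quot;cpu1_store_addr_p&quot;)
;; Unguarded bypass for the same insns: chosen when no guard returns nonzero.
(define_bypass 2 &quot;cpu1_alu_*&quot; &quot;cpu1_store_*&quot;)
</pre></div>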
  323. <a name="index-exclusion_005fset"></a>
  324. <a name="index-presence_005fset"></a>
  325. <a name="index-final_005fpresence_005fset"></a>
  326. <a name="index-absence_005fset"></a>
  327. <a name="index-final_005fabsence_005fset"></a>
  328. <a name="index-VLIW-1"></a>
  329. <a name="index-RISC-1"></a>
  330. <p>The following five constructions are usually used to describe
  331. <acronym>VLIW</acronym> processors, or more precisely, to describe a placement
  332. of small instructions into <acronym>VLIW</acronym> instruction slots. They
  333. can be used for <acronym>RISC</acronym> processors, too.
  334. </p>
  335. <div class="smallexample">
  336. <pre class="smallexample">(exclusion_set <var>unit-names</var> <var>unit-names</var>)
  337. (presence_set <var>unit-names</var> <var>patterns</var>)
  338. (final_presence_set <var>unit-names</var> <var>patterns</var>)
  339. (absence_set <var>unit-names</var> <var>patterns</var>)
  340. (final_absence_set <var>unit-names</var> <var>patterns</var>)
  341. </pre></div>
  342. <p><var>unit-names</var> is a string giving names of functional units
  343. separated by commas.
  344. </p>
  345. <p><var>patterns</var> is a string giving patterns of functional units
  346. separated by commas. Currently a pattern is one unit or several units
  347. separated by white space.
  348. </p>
  349. <p>The first construction (&lsquo;<samp>exclusion_set</samp>&rsquo;) means that each
  350. functional unit in the first string cannot be reserved simultaneously
  351. with a unit whose name is in the second string and vice versa. For
  352. example, the construction is useful for describing processors
  353. (e.g. some SPARC processors) with a fully pipelined floating point
  354. functional unit which can execute simultaneously only single floating
  355. point insns or only double floating point insns.
  356. </p>
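<p>One way to sketch the floating point case just described is to model
single and double precision issue as two units and exclude one from the
other. The unit names here are hypothetical.
</p>
<div class="smallexample">
<pre class="smallexample">;; Hypothetical units standing for single and double precision issue.
(define_cpu_unit &quot;fp_single, fp_double&quot;)
;; Neither unit can be reserved while the other one is reserved.
(exclusion_set &quot;fp_single&quot; &quot;fp_double&quot;)
</pre></div>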
  357. <p>The second construction (&lsquo;<samp>presence_set</samp>&rsquo;) means that each
  358. functional unit in the first string cannot be reserved unless at
  359. least one of the patterns of units whose names are in the second string is
  360. reserved. This is an asymmetric relation. For example, it is useful
  361. for describing that <acronym>VLIW</acronym> &lsquo;<samp>slot1</samp>&rsquo; is reserved after
  362. &lsquo;<samp>slot0</samp>&rsquo; reservation. We could describe it by the following
  363. construction
  364. </p>
  365. <div class="smallexample">
  366. <pre class="smallexample">(presence_set &quot;slot1&quot; &quot;slot0&quot;)
  367. </pre></div>
  368. <p>Or &lsquo;<samp>slot1</samp>&rsquo; is reserved only after &lsquo;<samp>slot0</samp>&rsquo; and unit &lsquo;<samp>b0</samp>&rsquo;
  369. reservation. In this case we could write
  370. </p>
  371. <div class="smallexample">
  372. <pre class="smallexample">(presence_set &quot;slot1&quot; &quot;slot0 b0&quot;)
  373. </pre></div>
  374. <p>The third construction (&lsquo;<samp>final_presence_set</samp>&rsquo;) is analogous to
  375. &lsquo;<samp>presence_set</samp>&rsquo;. The difference between them is when checking is
  376. done. When an instruction is issued in a given automaton state
  377. reflecting all current and planned unit reservations, the automaton
  378. state is changed. The first state is a source state, the second one
  379. is a result state. Checking for &lsquo;<samp>presence_set</samp>&rsquo; is done on the
  380. source state reservation, checking for &lsquo;<samp>final_presence_set</samp>&rsquo; is
  381. done on the result reservation. This construction is useful to
  382. describe a reservation which is actually two subsequent reservations.
  383. For example, if we use
  384. </p>
  385. <div class="smallexample">
  386. <pre class="smallexample">(presence_set &quot;slot1&quot; &quot;slot0&quot;)
  387. </pre></div>
  388. <p>the following insn will never be issued (because &lsquo;<samp>slot1</samp>&rsquo; requires
  389. &lsquo;<samp>slot0</samp>&rsquo; which is absent in the source state).
  390. </p>
  391. <div class="smallexample">
  392. <pre class="smallexample">(define_reservation &quot;insn_and_nop&quot; &quot;slot0 + slot1&quot;)
  393. </pre></div>
  394. <p>but it can be issued if we use the analogous &lsquo;<samp>final_presence_set</samp>&rsquo;.
  395. </p>
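<p>Put together, the variant that allows the combined reservation to be
issued could be sketched as follows. The <code>define_cpu_unit</code> line is
added here only to make the fragment complete.
</p>
<div class="smallexample">
<pre class="smallexample">(define_cpu_unit &quot;slot0, slot1&quot;)
;; Checked on the result state, in which slot0 is already reserved.
(final_presence_set &quot;slot1&quot; &quot;slot0&quot;)
(define_reservation &quot;insn_and_nop&quot; &quot;slot0 + slot1&quot;)
</pre></div>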
  396. <p>The fourth construction (&lsquo;<samp>absence_set</samp>&rsquo;) means that each functional
  397. unit in the first string can be reserved only if each pattern of units
  398. whose names are in the second string is not reserved. This is an
  399. asymmetric relation (actually &lsquo;<samp>exclusion_set</samp>&rsquo; is analogous to
  400. this one but it is symmetric). For example, it might be useful in a
  401. <acronym>VLIW</acronym> description to say that &lsquo;<samp>slot0</samp>&rsquo; cannot be reserved
  402. after either &lsquo;<samp>slot1</samp>&rsquo; or &lsquo;<samp>slot2</samp>&rsquo; has been reserved. This
  403. can be described as:
  404. </p>
  405. <div class="smallexample">
  406. <pre class="smallexample">(absence_set &quot;slot0&quot; &quot;slot1, slot2&quot;)
  407. </pre></div>
  408. <p>Or &lsquo;<samp>slot2</samp>&rsquo; cannot be reserved if &lsquo;<samp>slot0</samp>&rsquo; and unit &lsquo;<samp>b0</samp>&rsquo;
  409. are reserved or &lsquo;<samp>slot1</samp>&rsquo; and unit &lsquo;<samp>b1</samp>&rsquo; are reserved. In
  410. this case we could write
  411. </p>
  412. <div class="smallexample">
  413. <pre class="smallexample">(absence_set &quot;slot2&quot; &quot;slot0 b0, slot1 b1&quot;)
  414. </pre></div>
  415. <p>All functional units mentioned in a set should belong to the same
  416. automaton.
  417. </p>
  418. <p>The last construction (&lsquo;<samp>final_absence_set</samp>&rsquo;) is analogous to
  419. &lsquo;<samp>absence_set</samp>&rsquo; but checking is done on the result (state)
  420. reservation. See comments for &lsquo;<samp>final_presence_set</samp>&rsquo;.
  421. </p>
  422. <a name="index-automata_005foption"></a>
  423. <a name="index-deterministic-finite-state-automaton-1"></a>
  424. <a name="index-nondeterministic-finite-state-automaton"></a>
  425. <a name="index-finite-state-automaton-minimization"></a>
  426. <p>You can control the generator of the pipeline hazard recognizer with
  427. the following construction.
  428. </p>
  429. <div class="smallexample">
  430. <pre class="smallexample">(automata_option <var>options</var>)
  431. </pre></div>
  432. <p><var>options</var> is a string giving options which affect the generated
  433. code. Currently there are the following options:
  434. </p>
  435. <ul>
  436. <li> <em>no-minimization</em> disables minimization of the automaton. This is
  437. only worth doing when we are debugging the description and need to
  438. look more closely at the reservations of states.
  439. </li><li> <em>time</em> means printing time statistics about the generation of
  440. automata.
  441. </li><li> <em>stats</em> means printing statistics about the generated automata
  442. such as the number of DFA states, NDFA states and arcs.
  443. </li><li> <em>v</em> means generation of a file describing the resulting automata.
  444. The file has suffix &lsquo;<samp>.dfa</samp>&rsquo; and can be used for the description
  445. verification and debugging.
  446. </li><li> <em>w</em> means generation of warnings instead of errors for
  447. non-critical errors.
  448. </li><li> <em>no-comb-vect</em> prevents the automaton generator from generating
  449. two data structures and comparing them for space efficiency. Using
  450. a comb vector to represent transitions may be better, but it can be
  451. very expensive to construct. This option is useful if the build
  452. process spends an unacceptably long time in genautomata.
  453. </li><li> <em>ndfa</em> makes nondeterministic finite state automata. This affects
  454. the treatment of operator &lsquo;<samp>|</samp>&rsquo; in the regular expressions. The
  455. usual treatment of the operator is to try the first alternative and,
  456. if the reservation is not possible, the second alternative. The
  457. nondeterministic treatment means trying all alternatives; some of them
  458. may be rejected by reservations in the subsequent insns.
  459. </li><li> <em>collapse-ndfa</em> modifies the behavior of the generator when
  460. producing an automaton. An additional state transition to collapse a
  461. nondeterministic <acronym>NDFA</acronym> state to a deterministic <acronym>DFA</acronym>
  462. state is generated. It can be triggered by passing <code>const0_rtx</code> to
  463. <code>state_transition</code>. In such an automaton, cycle advance transitions are
  464. available only for these collapsed states. This option is useful for
  465. ports that want to use the <code>ndfa</code> option, but also want to use
  466. <code>define_query_cpu_unit</code> to assign units to insns issued in a cycle.
  467. </li><li> <em>progress</em> means output of a progress bar showing how many states
  468. were generated so far for the automaton being processed. This is useful
  469. when debugging a <acronym>DFA</acronym> description. If you see too many
  470. generated states, you could interrupt the generator of the pipeline
  471. hazard recognizer and try to figure out a reason for generation of the
  472. huge automaton.
  473. </li></ul>
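<p>For instance, a description under development might enable a few of
these options as sketched below. Each option is given in its own
construction here, which is a conservative assumption about how the option
string is handled; the choice of options is only illustrative.
</p>
<div class="smallexample">
<pre class="smallexample">;; Treat &quot;|&quot; in reservations nondeterministically.
(automata_option &quot;ndfa&quot;)
;; Write the .dfa file describing the resulting automata.
(automata_option &quot;v&quot;)
;; Show how many states have been generated so far.
(automata_option &quot;progress&quot;)
</pre></div>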
  474. <p>As an example, consider a superscalar <acronym>RISC</acronym> machine which can
  475. issue three insns (two integer insns and one floating point insn) in
  476. the same cycle but can finish only two insns. To describe this, we define
  477. the following functional units.
  478. </p>
  479. <div class="smallexample">
  480. <pre class="smallexample">(define_cpu_unit &quot;i0_pipeline, i1_pipeline, f_pipeline&quot;)
  481. (define_cpu_unit &quot;port0, port1&quot;)
  482. </pre></div>
  483. <p>All simple integer insns can be executed in any integer pipeline and
  484. their result is ready in two cycles. The simple integer insns are
  485. issued into the first pipeline unless it is reserved, otherwise they
  486. are issued into the second pipeline. Integer division and
  487. multiplication insns can be executed only in the second integer
  488. pipeline and their results are ready in 8 and 4
  489. cycles, respectively. The integer division is not pipelined, i.e. the subsequent
  490. integer division insn cannot be issued until the current division
  491. insn has finished. Floating point insns are fully pipelined and their
  492. results are ready in 3 cycles. Where the result of a floating point
  493. insn is used by an integer insn, an additional delay of one cycle is
  494. incurred. To describe all of this we could specify
  495. </p>
  496. <div class="smallexample">
  497. <pre class="smallexample">(define_cpu_unit &quot;div&quot;)
  498. (define_insn_reservation &quot;simple&quot; 2 (eq_attr &quot;type&quot; &quot;int&quot;)
  499. &quot;(i0_pipeline | i1_pipeline), (port0 | port1)&quot;)
  500. (define_insn_reservation &quot;mult&quot; 4 (eq_attr &quot;type&quot; &quot;mult&quot;)
  501. &quot;i1_pipeline, nothing*2, (port0 | port1)&quot;)
  502. (define_insn_reservation &quot;div&quot; 8 (eq_attr &quot;type&quot; &quot;div&quot;)
  503. &quot;i1_pipeline, div*7, div + (port0 | port1)&quot;)
  504. (define_insn_reservation &quot;float&quot; 3 (eq_attr &quot;type&quot; &quot;float&quot;)
  505. &quot;f_pipeline, nothing, (port0 | port1)&quot;)
  506. (define_bypass 4 &quot;float&quot; &quot;simple,mult,div&quot;)
  507. </pre></div>
  508. <p>To simplify the description we could define the following reservation
  509. </p>
  510. <div class="smallexample">
  511. <pre class="smallexample">(define_reservation &quot;finish&quot; &quot;port0|port1&quot;)
  512. </pre></div>
  513. <p>and use it in all <code>define_insn_reservation</code> as in the following
  514. construction
  515. </p>
  516. <div class="smallexample">
  517. <pre class="smallexample">(define_insn_reservation &quot;simple&quot; 2 (eq_attr &quot;type&quot; &quot;int&quot;)
  518. &quot;(i0_pipeline | i1_pipeline), finish&quot;)
  519. </pre></div>
  520. <div class="footnote">
  521. <hr>
  522. <h4 class="footnotes-heading">Footnotes</h4>
  523. <h3><a name="FOOT5" href="#DOCF5">(5)</a></h3>
  524. <p>However, the size of the automaton depends on
  525. processor complexity. To limit this effect, machine descriptions
  526. can split orthogonal parts of the machine description among several
  527. automata: but then, since each of these must be stepped independently,
  528. this does cause a small decrease in the algorithm&rsquo;s performance.</p>
  529. </div>
  530. <hr>
  531. <div class="header">
  532. <p>
  533. Previous: <a href="Delay-Slots.html#Delay-Slots" accesskey="p" rel="prev">Delay Slots</a>, Up: <a href="Insn-Attributes.html#Insn-Attributes" accesskey="u" rel="up">Insn Attributes</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p>
  534. </div>
  535. </body>
  536. </html>