<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- Copyright (C) 1988-2017 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being "Funding Free Software", the Front-Cover
Texts being (a) (see below), and with the Back-Cover Texts being (b)
(see below). A copy of the license is included in the section entitled
"GNU Free Documentation License".
(a) The FSF's Front-Cover Text is:
A GNU Manual
(b) The FSF's Back-Cover Text is:
You have freedom to copy and modify this GNU Manual, like GNU
software. Copies published by the Free Software Foundation raise
funds for GNU development. -->
<!-- Created by GNU Texinfo 5.2, http://www.gnu.org/software/texinfo/ -->
<head>
<title>Using the GNU Compiler Collection (GCC): Vector Extensions</title>
<meta name="description" content="Using the GNU Compiler Collection (GCC): Vector Extensions">
<meta name="keywords" content="Using the GNU Compiler Collection (GCC): Vector Extensions">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link href="index.html#Top" rel="start" title="Top">
<link href="Option-Index.html#Option-Index" rel="index" title="Option Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="C-Extensions.html#C-Extensions" rel="up" title="C Extensions">
<link href="Offsetof.html#Offsetof" rel="next" title="Offsetof">
<link href="Return-Address.html#Return-Address" rel="prev" title="Return Address">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.smallquotation {font-size: smaller}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.indentedblock {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
div.smalldisplay {margin-left: 3.2em}
div.smallexample {margin-left: 3.2em}
div.smallindentedblock {margin-left: 3.2em; font-size: smaller}
div.smalllisp {margin-left: 3.2em}
kbd {font-style:oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
pre.smalldisplay {font-family: inherit; font-size: smaller}
pre.smallexample {font-size: smaller}
pre.smallformat {font-family: inherit; font-size: smaller}
pre.smalllisp {font-size: smaller}
span.nocodebreak {white-space:nowrap}
span.nolinebreak {white-space:nowrap}
span.roman {font-family:serif; font-weight:normal}
span.sansserif {font-family:sans-serif; font-weight:normal}
ul.no-bullet {list-style: none}
-->
</style>
</head>
<body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000">
<a name="Vector-Extensions"></a>
<div class="header">
<p>
Next: <a href="Offsetof.html#Offsetof" accesskey="n" rel="next">Offsetof</a>, Previous: <a href="Return-Address.html#Return-Address" accesskey="p" rel="prev">Return Address</a>, Up: <a href="C-Extensions.html#C-Extensions" accesskey="u" rel="up">C Extensions</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<a name="Using-Vector-Instructions-through-Built_002din-Functions"></a>
<h3 class="section">6.50 Using Vector Instructions through Built-in Functions</h3>
<p>On some targets, the instruction set contains SIMD vector instructions that
operate on multiple values contained in one large register at the same time.
For example, on the x86 the MMX, 3DNow! and SSE extensions can be used
this way.
</p>
<p>The first step in using these extensions is to provide the necessary data
types. This should be done using an appropriate <code>typedef</code>:
</p>
<div class="smallexample">
<pre class="smallexample">typedef int v4si __attribute__ ((vector_size (16)));
</pre></div>
<p>The <code>int</code> type specifies the base type, while the attribute specifies
the vector size for the variable, measured in bytes. For example, the
declaration above causes the compiler to set the mode for the <code>v4si</code>
type to be 16 bytes wide and divided into <code>int</code>-sized units. For
a 32-bit <code>int</code> this means a vector of 4 units of 4 bytes, and the
corresponding mode of <code>v4si</code> is <acronym>V4SI</acronym>.
</p>
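<p>As a small sketch of the size relationship just described, the following
code derives the element count from the sizes involved. The helper name
<code>v4si_element_count</code> is hypothetical and not part of GCC; the
result is 4 only under the assumption that <code>int</code> is 32 bits wide.
</p>

```c
#include <stddef.h>

/* Same typedef as in the text: a 16-byte vector whose base type is int. */
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: number of elements in a v4si, computed as the
   total vector size divided by the size of one element.  This is 4
   when int is 32 bits wide. */
static size_t
v4si_element_count (void)
{
  return sizeof (v4si) / sizeof (int);
}
```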
<p>The <code>vector_size</code> attribute is only applicable to integral and
floating-point scalars, although arrays, pointers, and function return values
are allowed in conjunction with this construct. Only sizes that are
a power of two are currently allowed.
</p>
<p>All the basic integer types can be used as base types, both as signed
and as unsigned: <code>char</code>, <code>short</code>, <code>int</code>, <code>long</code>,
<code>long long</code>. In addition, <code>float</code> and <code>double</code> can be
used to build floating-point vector types.
</p>
<p>Specifying a combination that is not valid for the current architecture
causes GCC to synthesize the instructions using a narrower mode.
For example, if you specify a variable of type <code>v4si</code> (mode
<acronym>V4SI</acronym>) and your architecture does not allow for this
specific SIMD type, GCC produces code that uses 4 <code>SI</code>s.
</p>
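<p>To illustrate, the sketch below declares a 32-byte vector type, which many
targets cannot hold in a single register unless extra target options are
given (an assumption about the build, not something stated above). GCC is
still expected to accept the code and split the operation as needed. The
names <code>v8si</code> and <code>add_v8si</code> are hypothetical.
</p>

```c
/* A 32-byte vector: eight elements when int is 32 bits wide. */
typedef int v8si __attribute__ ((vector_size (32)));

/* Hypothetical helper.  The vectors are passed by pointer so that the
   example does not depend on how the target passes wide vectors to
   functions. */
static void
add_v8si (const v8si *a, const v8si *b, v8si *sum)
{
  /* One source-level addition; the compiler chooses how many machine
     instructions are needed for the current target. */
  *sum = *a + *b;
}
```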
<p>The types defined in this manner can be used with a subset of normal C
operations. Currently, GCC allows using the following operators
on these types: <code>+, -, *, /, unary minus, ^, |, &amp;, ~, %</code>.
</p>
<p>The operations behave like C++ <code>valarrays</code>. Addition is defined as
the addition of the corresponding elements of the operands. For
example, in the code below, each of the 4 elements in <var>a</var> is
added to the corresponding element in <var>b</var> and the resulting
vector is stored in <var>c</var>.
</p>
<div class="smallexample">
<pre class="smallexample">typedef int v4si __attribute__ ((vector_size (16)));
v4si a, b, c;
c = a + b;
</pre></div>
<p>Subtraction, multiplication, division, and the bitwise operations
operate in a similar manner. Likewise, the result of using the unary
minus or complement operators on a vector type is a vector whose
elements are the negative or complemented values of the corresponding
elements in the operand.
</p>
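<p>A minimal sketch of several element-wise operators used together follows.
The function names are hypothetical and chosen only for illustration.
</p>

```c
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: computes a[i]*a[i] - b[i]*b[i] for every element,
   written as (a + b) * (a - b).  Each operator acts on corresponding
   elements of its operands. */
static v4si
diff_of_squares (v4si a, v4si b)
{
  return (a + b) * (a - b);
}

/* Unary minus negates every element. */
static v4si
negate_each (v4si a)
{
  return -a;
}

/* The complement operator flips every bit of every element. */
static v4si
complement_each (v4si a)
{
  return ~a;
}
```

<p>With <code>a = {5,6,7,8}</code> and <code>b = {1,2,3,4}</code>,
<code>diff_of_squares (a, b)</code> yields <code>{24,32,40,48}</code>.
</p>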
<p>It is possible to use shifting operators <code>&lt;&lt;</code>, <code>&gt;&gt;</code> on
integer-type vectors. The operation is defined as follows: <code>{a0,
a1, &hellip;, an} &gt;&gt; {b0, b1, &hellip;, bn} == {a0 &gt;&gt; b0, a1 &gt;&gt; b1,
&hellip;, an &gt;&gt; bn}</code>. Vector operands must have the same number of
elements.
</p>
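<p>The shift definition above can be sketched as follows, using an unsigned
base type so that the right shift is a plain logical shift. The names
<code>v4su</code> and <code>shift_right_each</code> are hypothetical.
</p>

```c
typedef unsigned int v4su __attribute__ ((vector_size (16)));

/* Hypothetical helper: shifts element i of a right by counts[i] bits.
   As with scalar shifts, each count is assumed to be smaller than the
   width of one element. */
static v4su
shift_right_each (v4su a, v4su counts)
{
  return a >> counts;
}
```

<p>For instance, shifting <code>{16,16,16,16}</code> by <code>{0,1,2,3}</code>
gives <code>{16,8,4,2}</code>.
</p>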
<p>For convenience, a binary vector operation may have a scalar as one
operand. In that case the compiler transforms
the scalar operand into a vector where each element is the scalar from
the operation. The transformation happens only if the scalar can be
safely converted to the vector-element type.
Consider the following code.
</p>
<div class="smallexample">
<pre class="smallexample">typedef int v4si __attribute__ ((vector_size (16)));
v4si a, b, c;
long l;
a = b + 1; /* a = b + {1,1,1,1}; */
a = 2 * b; /* a = {2,2,2,2} * b; */
a = l + a; /* Error, cannot convert long to int. */
</pre></div>
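<p>As a runnable variant of the valid cases above, the sketch below mixes
<code>int</code> scalars with an <code>int</code> vector. The function name
<code>scale_and_offset</code> is hypothetical.
</p>

```c
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: computes scale * v[i] + offset for every element.
   Both scalars have the element type int, so the compiler can expand
   each of them into a vector with four identical elements. */
static v4si
scale_and_offset (v4si v, int scale, int offset)
{
  return scale * v + offset;
}
```

<p>With <code>v = {1,2,3,4}</code>, <code>scale_and_offset (v, 3, 1)</code>
yields <code>{4,7,10,13}</code>.
</p>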
<p>Vectors can be subscripted as if the vector were an array with
the same number of elements and base type. Out-of-bounds accesses
invoke undefined behavior at run time. Warnings for out-of-bounds
accesses in vector subscripts can be enabled with
<samp>-Warray-bounds</samp>.
</p>
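<p>A short sketch of subscripting, both for reading and for writing elements,
might look like this. The helper names are hypothetical, and the caller is
assumed to pass an in-range index.
</p>

```c
#include <stddef.h>

typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: adds up all elements by indexing the vector like
   an array.  The loop bound is derived from the sizes, so it stays
   within the vector. */
static int
sum_elements (v4si v)
{
  int sum = 0;
  for (size_t i = 0; i < sizeof (v) / sizeof (v[0]); i++)
    sum += v[i];
  return sum;
}

/* Hypothetical helper: returns a copy of v with one element replaced.
   index must be smaller than the element count; anything else is an
   out-of-bounds access. */
static v4si
set_element (v4si v, size_t index, int value)
{
  v[index] = value;
  return v;
}
```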
<p>Vector comparison is supported with standard comparison
operators: <code>==, !=, &lt;, &lt;=, &gt;, &gt;=</code>. Comparison operands can be
vector expressions of integer-type or real-type. Comparison between
integer-type vectors and real-type vectors is not supported. The
result of the comparison is a vector of the same width and number of
elements as the comparison operands with a signed integral element
type.
</p>
<p>Vectors are compared element-wise, producing 0 when the comparison is false
and -1 (a constant of the appropriate type where all bits are set)
otherwise. Consider the following example.
</p>
<div class="smallexample">
<pre class="smallexample">typedef int v4si __attribute__ ((vector_size (16)));
v4si a = {1,2,3,4};
v4si b = {3,2,1,4};
v4si c;
c = a &gt; b; /* The result would be {0, 0,-1, 0} */
c = a == b; /* The result would be {0,-1, 0,-1} */
</pre></div>
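<p>Because a true comparison yields an element with all bits set, the result
can serve as a bit mask. The sketch below uses that property to select the
larger element from each pair with bitwise operators only. The function name
<code>max_each</code> is hypothetical, and this is one possible idiom rather
than a built-in facility.
</p>

```c
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: element-wise maximum of a and b.  gt[i] is -1
   (all bits set) where a[i] > b[i] and 0 elsewhere, so a & gt keeps the
   elements of a that win and b & ~gt keeps the remaining elements of b. */
static v4si
max_each (v4si a, v4si b)
{
  v4si gt = a > b;
  return (a & gt) | (b & ~gt);
}
```

<p>With <code>a = {1,2,3,4}</code> and <code>b = {3,2,1,4}</code>,
<code>max_each (a, b)</code> yields <code>{3,2,3,4}</code>.
</p>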
<p>In C++, the ternary operator <code>?:</code> is available. <code>a?b:c</code>, where
<code>b</code> and <code>c</code> are vectors of the same type and <code>a</code> is an
integer vector with the same number of elements of the same size as <code>b</code>
and <code>c</code>, computes all three arguments and creates a vector
<code>{a[0]?b[0]:c[0], a[1]?b[1]:c[1], &hellip;}</code>. Note that unlike in
OpenCL, <code>a</code> is thus interpreted as <code>a != 0</code> and not <code>a &lt; 0</code>.
As in the case of binary operations, this syntax is also accepted when
one of <code>b</code> or <code>c</code> is a scalar that is then transformed into a
vector. If both <code>b</code> and <code>c</code> are scalars and the type of
<code>true?b:c</code> has the same size as the element type of <code>a</code>, then
<code>b</code> and <code>c</code> are converted to a vector type whose elements have
this type and with the same number of elements as <code>a</code>.
</p>
<p>In C++, the logical operators <code>!, &amp;&amp;, ||</code> are available for vectors.
<code>!v</code> is equivalent to <code>v == 0</code>, <code>a &amp;&amp; b</code> is equivalent to
<code>a!=0 &amp; b!=0</code> and <code>a || b</code> is equivalent to <code>a!=0 | b!=0</code>.
For mixed operations between a scalar <code>s</code> and a vector <code>v</code>,
<code>s &amp;&amp; v</code> is equivalent to <code>s?v!=0:0</code> (the evaluation
short-circuits) and <code>v &amp;&amp; s</code> is equivalent to <code>v!=0 &amp; (s?-1:0)</code>.
</p>
<a name="index-_005f_005fbuiltin_005fshuffle"></a>
<p>Vector shuffling is available using functions
<code>__builtin_shuffle (vec, mask)</code> and
<code>__builtin_shuffle (vec0, vec1, mask)</code>.
Both functions construct a permutation of elements from one or two
vectors and return a vector of the same type as the input vector(s).
The <var>mask</var> is an integral vector with the same width (<var>W</var>)
and element count (<var>N</var>) as the output vector.
</p>
<p>The elements of the input vectors are numbered in memory ordering of
<var>vec0</var> beginning at 0 and <var>vec1</var> beginning at <var>N</var>. The
elements of <var>mask</var> are considered modulo <var>N</var> in the single-operand
case and modulo <em>2*<var>N</var></em> in the two-operand case.
</p>
<p>Consider the following example.
</p>
<div class="smallexample">
<pre class="smallexample">typedef int v4si __attribute__ ((vector_size (16)));
v4si a = {1,2,3,4};
v4si b = {5,6,7,8};
v4si mask1 = {0,1,1,3};
v4si mask2 = {0,4,2,5};
v4si res;
res = __builtin_shuffle (a, mask1); /* res is {1,2,2,4} */
res = __builtin_shuffle (a, b, mask2); /* res is {1,5,3,6} */
</pre></div>
<p>Note that <code>__builtin_shuffle</code> is intentionally semantically
compatible with the OpenCL <code>shuffle</code> and <code>shuffle2</code> functions.
</p>
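<p>Two common permutations can be sketched with the shuffle functions as
follows. The helper names <code>reverse_v4si</code> and
<code>interleave_low</code> are hypothetical, and the code assumes a GCC
version that provides <code>__builtin_shuffle</code> for C.
</p>

```c
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: reverses the element order of v using the
   single-operand form.  Mask element i names the input element that
   ends up at position i. */
static v4si
reverse_v4si (v4si v)
{
  v4si mask = {3, 2, 1, 0};
  return __builtin_shuffle (v, mask);
}

/* Hypothetical helper: interleaves the first two elements of a and b
   using the two-operand form.  Indices 0 to 3 select from a and
   indices 4 to 7 select from b. */
static v4si
interleave_low (v4si a, v4si b)
{
  v4si mask = {0, 4, 1, 5};
  return __builtin_shuffle (a, b, mask);
}
```

<p>With <code>a = {1,2,3,4}</code> and <code>b = {5,6,7,8}</code>,
<code>reverse_v4si (a)</code> yields <code>{4,3,2,1}</code> and
<code>interleave_low (a, b)</code> yields <code>{1,5,2,6}</code>.
</p>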
<p>You can declare variables and use them in function calls and returns, as
well as in assignments and some casts. You can specify a vector type as
a return type for a function. Vector types can also be used as function
arguments. It is possible to cast from one vector type to another,
provided they are of the same size (in fact, you can also cast vectors
to and from other data types of the same size).
</p>
<p>You cannot operate between vectors of different lengths or different
signedness without a cast.
</p>
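<p>As a sketch of the cast rule, the code below adds a signed vector to an
unsigned one by first casting the signed operand to the unsigned vector type
of the same size. The names <code>v4su</code> and <code>add_mixed</code> are
hypothetical; the cast reinterprets the bits of each element and does not
convert values.
</p>

```c
typedef int v4si __attribute__ ((vector_size (16)));
typedef unsigned int v4su __attribute__ ((vector_size (16)));

/* Hypothetical helper: u and s differ in signedness, so s is cast to
   v4su before the addition.  Both types are 16 bytes wide, which is
   what makes the cast valid.  The addition then follows the usual
   wrap-around rules for unsigned arithmetic. */
static v4su
add_mixed (v4su u, v4si s)
{
  return u + (v4su) s;
}
```

<p>With <code>u = {10,20,30,40}</code> and <code>s = {-1,-2,-3,-4}</code>,
<code>add_mixed (u, s)</code> yields <code>{9,18,27,36}</code>.
</p>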
</body>
</html>