<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- Copyright (C) 2006-2017 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being "Funding Free Software", the Front-Cover
texts being (a) (see below), and with the Back-Cover Texts being (b)
(see below). A copy of the license is included in the section entitled
"GNU Free Documentation License".
(a) The FSF's Front-Cover Text is:
A GNU Manual
(b) The FSF's Back-Cover Text is:
You have freedom to copy and modify this GNU Manual, like GNU
software. Copies published by the Free Software Foundation raise
funds for GNU development. -->
<!-- Created by GNU Texinfo 5.2, http://www.gnu.org/software/texinfo/ -->
<head>
<title>GNU libgomp: OpenACC Library Interoperability</title>
<meta name="description" content="GNU libgomp: OpenACC Library Interoperability">
<meta name="keywords" content="GNU libgomp: OpenACC Library Interoperability">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link href="index.html#Top" rel="start" title="Top">
<link href="Library-Index.html#Library-Index" rel="index" title="Library Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="index.html#Top" rel="up" title="Top">
<link href="The-libgomp-ABI.html#The-libgomp-ABI" rel="next" title="The libgomp ABI">
<link href="CUDA-Streams-Usage.html#CUDA-Streams-Usage" rel="prev" title="CUDA Streams Usage">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.smallquotation {font-size: smaller}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.indentedblock {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
div.smalldisplay {margin-left: 3.2em}
div.smallexample {margin-left: 3.2em}
div.smallindentedblock {margin-left: 3.2em; font-size: smaller}
div.smalllisp {margin-left: 3.2em}
kbd {font-style:oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
pre.smalldisplay {font-family: inherit; font-size: smaller}
pre.smallexample {font-size: smaller}
pre.smallformat {font-family: inherit; font-size: smaller}
pre.smalllisp {font-size: smaller}
span.nocodebreak {white-space:nowrap}
span.nolinebreak {white-space:nowrap}
span.roman {font-family:serif; font-weight:normal}
span.sansserif {font-family:sans-serif; font-weight:normal}
ul.no-bullet {list-style: none}
-->
</style>
</head>
<body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000">
<a name="OpenACC-Library-Interoperability"></a>
<div class="header">
<p>
Next: <a href="The-libgomp-ABI.html#The-libgomp-ABI" accesskey="n" rel="next">The libgomp ABI</a>, Previous: <a href="CUDA-Streams-Usage.html#CUDA-Streams-Usage" accesskey="p" rel="prev">CUDA Streams Usage</a>, Up: <a href="index.html#Top" accesskey="u" rel="up">Top</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Library-Index.html#Library-Index" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<a name="OpenACC-Library-Interoperability-1"></a>
<h2 class="chapter">8 OpenACC Library Interoperability</h2>
<a name="Introduction-1"></a>
<h3 class="section">8.1 Introduction</h3>
<p>The OpenACC library uses the CUDA Driver API, and may interact with
programs that use the CUDA Runtime library directly, or with another library
based on the Runtime library, e.g., CUBLAS<a name="DOCF2" href="#FOOT2"><sup>2</sup></a>.
This chapter describes the use cases and the changes that are
required in order to use the OpenACC library together with the CUBLAS and
Runtime libraries within a single program.
</p>
<a name="First-invocation_003a-NVIDIA-CUBLAS-library-API"></a>
<h3 class="section">8.2 First invocation: NVIDIA CUBLAS library API</h3>
<p>In this first use case (see below), a function in the CUBLAS library,
namely <code>cublasCreate()</code>, is called prior to any of the functions
in the OpenACC library.
</p>
<p>When invoked, <code>cublasCreate()</code> initializes the CUBLAS library and
allocates the hardware resources on the host and the device on behalf of the
caller. Once the initialization and allocation have completed, a handle is
returned to the caller. The OpenACC library also requires initialization and
allocation of hardware resources. Since the CUBLAS library has already
allocated the hardware resources for the device, all that is left to do is to
initialize the OpenACC library and acquire the hardware resources on the host.
</p>
<p>Prior to calling the OpenACC function that initializes the library and
allocates the host hardware resources, you need to acquire the device number
that was allocated during the call to <code>cublasCreate()</code>. Calling the
runtime library function <code>cudaGetDevice()</code> accomplishes this. Once
acquired, the device number is passed along with the device type as
parameters to the OpenACC library function <code>acc_set_device_num()</code>.
</p>
<p>Once the call to <code>acc_set_device_num()</code> has completed, the OpenACC
library uses the context that was created during the call to
<code>cublasCreate()</code>. In other words, both libraries share the
same context.
</p>
<div class="smallexample">
<pre class="smallexample">    /* Create the handle */
    s = cublasCreate(&amp;h);
    if (s != CUBLAS_STATUS_SUCCESS)
    {
        fprintf(stderr, &quot;cublasCreate failed %d\n&quot;, s);
        exit(EXIT_FAILURE);
    }

    /* Get the device number */
    e = cudaGetDevice(&amp;dev);
    if (e != cudaSuccess)
    {
        fprintf(stderr, &quot;cudaGetDevice failed %d\n&quot;, e);
        exit(EXIT_FAILURE);
    }

    /* Initialize OpenACC library and use device 'dev' */
    acc_set_device_num(dev, acc_device_nvidia);
</pre></div>
<div align="center">Use Case 1
</div>
<a name="First-invocation_003a-OpenACC-library-API"></a>
<h3 class="section">8.3 First invocation: OpenACC library API</h3>
<p>In this second use case (see below), a function in the OpenACC library,
namely <code>acc_set_device_num()</code>, is called prior to any of the
functions in the CUBLAS library.
</p>
<p>In the use case presented here, the function <code>acc_set_device_num()</code>
is used to both initialize the OpenACC library and allocate the hardware
resources on the host and the device. The parameters of the call specify
which device to use and what device type to use, i.e.,
<code>acc_device_nvidia</code>. This is only one way to initialize the OpenACC
library and allocate the appropriate hardware resources. Environment
variables provide another; they are discussed in the next section.
</p>
<p>Once the call to <code>acc_set_device_num()</code> has completed, other OpenACC
functions can be called, as seen with the multiple calls made to
<code>acc_copyin()</code>. In addition, calls can be made to functions in the
CUBLAS library. In the use case, a call to <code>cublasCreate()</code> is made
subsequent to the calls to <code>acc_copyin()</code>.
As seen in the previous use case, a call to <code>cublasCreate()</code>
initializes the CUBLAS library and allocates the hardware resources on the
host and the device. However, since the device has already been allocated,
<code>cublasCreate()</code> only initializes the CUBLAS library and allocates
the appropriate hardware resources on the host. The context that was created
as part of the OpenACC initialization is shared with the CUBLAS library,
as in the first use case.
</p>
<div class="smallexample">
<pre class="smallexample">    dev = 0;

    acc_set_device_num(dev, acc_device_nvidia);

    /* Copy the first set to the device */
    d_X = acc_copyin(&amp;h_X[0], N * sizeof (float));
    if (d_X == NULL)
    {
        fprintf(stderr, &quot;copyin error h_X\n&quot;);
        exit(EXIT_FAILURE);
    }

    /* Copy the second set to the device */
    d_Y = acc_copyin(&amp;h_Y1[0], N * sizeof (float));
    if (d_Y == NULL)
    {
        fprintf(stderr, &quot;copyin error h_Y1\n&quot;);
        exit(EXIT_FAILURE);
    }

    /* Create the handle */
    s = cublasCreate(&amp;h);
    if (s != CUBLAS_STATUS_SUCCESS)
    {
        fprintf(stderr, &quot;cublasCreate failed %d\n&quot;, s);
        exit(EXIT_FAILURE);
    }

    /* Perform saxpy using CUBLAS library function */
    s = cublasSaxpy(h, N, &amp;alpha, d_X, 1, d_Y, 1);
    if (s != CUBLAS_STATUS_SUCCESS)
    {
        fprintf(stderr, &quot;cublasSaxpy failed %d\n&quot;, s);
        exit(EXIT_FAILURE);
    }

    /* Copy the results from the device */
    acc_memcpy_from_device(&amp;h_Y1[0], d_Y, N * sizeof (float));
</pre></div>
<div align="center">Use Case 2
</div>
<a name="OpenACC-library-and-environment-variables"></a>
<h3 class="section">8.4 OpenACC library and environment variables</h3>
<p>There are two environment variables associated with the OpenACC library
that may be used to control the device type and device number:
<code>ACC_DEVICE_TYPE</code> and <code>ACC_DEVICE_NUM</code>, respectively. These two
environment variables can be used as an alternative to calling
<code>acc_set_device_num()</code>. In the second use case, the device
type and device number were specified using <code>acc_set_device_num()</code>.
If, however, the aforementioned environment variables were set, then the
call to <code>acc_set_device_num()</code> would not be required.
</p>
<p>The use of the environment variables is only relevant when an OpenACC function
is called prior to a call to <code>cublasCreate()</code>. If <code>cublasCreate()</code>
is called prior to a call to an OpenACC function, then you must call
<code>acc_set_device_num()</code><a name="DOCF3" href="#FOOT3"><sup>3</sup></a>.
</p>
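<p>As a minimal sketch of how a program might log these settings at startup,
the following C fragment reads the two variables with the standard
<code>getenv()</code> function. The helper names <code>parse_device_num()</code>
and <code>report_acc_env()</code> are hypothetical and are not part of the
OpenACC API. The OpenACC library performs its own parsing of these variables,
which this sketch does not attempt to reproduce.
</p>
<div class="smallexample">
<pre class="smallexample">    #include &lt;errno.h&gt;
    #include &lt;limits.h&gt;
    #include &lt;stdio.h&gt;
    #include &lt;stdlib.h&gt;

    /* Hypothetical helper (not an OpenACC function): parse STR as a
       non-negative decimal device number.  Returns 0 and stores the
       value in *NUM on success, -1 otherwise.  */
    int
    parse_device_num(const char *str, int *num)
    {
        char *end;
        long v;

        if (str == NULL || *str == '\0')
            return -1;
        errno = 0;
        v = strtol(str, &amp;end, 10);
        if (errno != 0 || *end != '\0' || v &lt; 0 || v &gt; INT_MAX)
            return -1;
        *num = (int) v;
        return 0;
    }

    /* Hypothetical helper: print the values the program was launched
       with, as seen through the process environment.  */
    void
    report_acc_env(FILE *out)
    {
        const char *type = getenv(&quot;ACC_DEVICE_TYPE&quot;);
        int num;

        fprintf(out, &quot;ACC_DEVICE_TYPE: %s\n&quot;,
                type != NULL ? type : &quot;(unset)&quot;);
        if (parse_device_num(getenv(&quot;ACC_DEVICE_NUM&quot;), &amp;num) == 0)
            fprintf(out, &quot;ACC_DEVICE_NUM: %d\n&quot;, num);
        else
            fprintf(out, &quot;ACC_DEVICE_NUM: (unset or not a valid number)\n&quot;);
    }
</pre></div>
<p>A program could call <code>report_acc_env(stderr)</code> before its first
OpenACC call to record which device type and device number were requested
through the environment.
</p>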
<div class="footnote">
<hr>
<h4 class="footnotes-heading">Footnotes</h4>
<h3><a name="FOOT2" href="#DOCF2">(2)</a></h3>
<p>See section 2.26,
&quot;Interactions with the CUDA Driver API&quot;, in
&quot;CUDA Runtime API&quot;, Version 5.5, and section 2.27, &quot;VDPAU
Interoperability&quot;, in &quot;CUDA Driver API&quot;, TRM-06703-001, Version 5.5,
for additional information on library interoperability.</p>
<h3><a name="FOOT3" href="#DOCF3">(3)</a></h3>
<p>More complete information
about <code>ACC_DEVICE_TYPE</code> and <code>ACC_DEVICE_NUM</code> can be found in
sections 4.1 and 4.2 of the &quot;<a href="http://www.openacc.org/">OpenACC</a>
Application Programming Interface&quot;, Version 2.0.</p>
</div>
<hr>
<div class="header">
<p>
Next: <a href="The-libgomp-ABI.html#The-libgomp-ABI" accesskey="n" rel="next">The libgomp ABI</a>, Previous: <a href="CUDA-Streams-Usage.html#CUDA-Streams-Usage" accesskey="p" rel="prev">CUDA Streams Usage</a>, Up: <a href="index.html#Top" accesskey="u" rel="up">Top</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Library-Index.html#Library-Index" title="Index" rel="index">Index</a>]</p>
</div>
</body>
</html>