Internals.html 21 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404
  1. <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
  2. <html>
  3. <!-- Copyright (C) 2011-2017 Free Software Foundation, Inc.
  4. Permission is granted to copy, distribute and/or modify this document
  5. under the terms of the GNU Free Documentation License, Version 1.2 or
  6. any later version published by the Free Software Foundation; with no
  7. Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
  8. A copy of the license is included in the section entitled "GNU
  9. Free Documentation License". -->
  10. <!-- Created by GNU Texinfo 5.2, http://www.gnu.org/software/texinfo/ -->
  11. <head>
  12. <title>GNU libitm: Internals</title>
  13. <meta name="description" content="GNU libitm: Internals">
  14. <meta name="keywords" content="GNU libitm: Internals">
  15. <meta name="resource-type" content="document">
  16. <meta name="distribution" content="global">
  17. <meta name="Generator" content="makeinfo">
  18. <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  19. <link href="index.html#Top" rel="start" title="Top">
  20. <link href="Library-Index.html#Library-Index" rel="index" title="Library Index">
  21. <link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
  22. <link href="index.html#Top" rel="up" title="Top">
  23. <link href="GNU-Free-Documentation-License.html#GNU-Free-Documentation-License" rel="next" title="GNU Free Documentation License">
  24. <link href="The-libitm-ABI.html#The-libitm-ABI" rel="prev" title="The libitm ABI">
  25. <style type="text/css">
  26. <!--
  27. a.summary-letter {text-decoration: none}
  28. blockquote.smallquotation {font-size: smaller}
  29. div.display {margin-left: 3.2em}
  30. div.example {margin-left: 3.2em}
  31. div.indentedblock {margin-left: 3.2em}
  32. div.lisp {margin-left: 3.2em}
  33. div.smalldisplay {margin-left: 3.2em}
  34. div.smallexample {margin-left: 3.2em}
  35. div.smallindentedblock {margin-left: 3.2em; font-size: smaller}
  36. div.smalllisp {margin-left: 3.2em}
  37. kbd {font-style:oblique}
  38. pre.display {font-family: inherit}
  39. pre.format {font-family: inherit}
  40. pre.menu-comment {font-family: serif}
  41. pre.menu-preformatted {font-family: serif}
  42. pre.smalldisplay {font-family: inherit; font-size: smaller}
  43. pre.smallexample {font-size: smaller}
  44. pre.smallformat {font-family: inherit; font-size: smaller}
  45. pre.smalllisp {font-size: smaller}
  46. span.nocodebreak {white-space:nowrap}
  47. span.nolinebreak {white-space:nowrap}
  48. span.roman {font-family:serif; font-weight:normal}
  49. span.sansserif {font-family:sans-serif; font-weight:normal}
  50. ul.no-bullet {list-style: none}
  51. -->
  52. </style>
  53. </head>
  54. <body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000">
  55. <a name="Internals"></a>
  56. <div class="header">
  57. <p>
  58. Next: <a href="GNU-Free-Documentation-License.html#GNU-Free-Documentation-License" accesskey="n" rel="next">GNU Free Documentation License</a>, Previous: <a href="The-libitm-ABI.html#The-libitm-ABI" accesskey="p" rel="prev">The libitm ABI</a>, Up: <a href="index.html#Top" accesskey="u" rel="up">Top</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Library-Index.html#Library-Index" title="Index" rel="index">Index</a>]</p>
  59. </div>
  60. <hr>
  61. <a name="Internals-1"></a>
  62. <h2 class="chapter">4 Internals</h2>
  63. <a name="TM-methods-and-method-groups"></a>
  64. <h3 class="section">4.1 TM methods and method groups</h3>
  65. <p>libitm supports several ways of synchronizing transactions with each other.
  66. These TM methods (or TM algorithms) are implemented in the form of
  67. subclasses of <code>abi_dispatch</code>, which provide methods for
  68. transactional loads and stores as well as callbacks for rollback and commit.
  69. All methods that are compatible with each other (i.e., that let concurrently
  70. running transactions still synchronize correctly even if different methods
  71. are used) belong to the same TM method group. Pointers to TM methods can be
  72. obtained using the factory methods prefixed with <code>dispatch_</code> in
  73. <samp>libitm_i.h</samp>. There are two special methods, <code>dispatch_serial</code> and
  74. <code>dispatch_serialirr</code>, that are compatible with all methods because they
  75. run transactions completely in serial mode.
  76. </p>
  77. <a name="TM-method-life-cycle"></a>
  78. <h4 class="subsection">4.1.1 TM method life cycle</h4>
  79. <p>The state of TM methods does not change after construction, but they do alter
  80. the state of transactions that use this method. However, because
  81. per-transaction data gets used by several methods, <code>gtm_thread</code> is
  82. responsible for setting an initial state that is useful for all methods.
  83. After that, methods are responsible for resetting/clearing this state on each
  84. rollback or commit (of outermost transactions), so that the transaction
  85. executed next is not affected by the previous transaction.
  86. </p>
  87. <p>There is also global state associated with each method group, which is
  88. initialized and shut down (<code>method_group::init()</code> and <code>fini()</code>)
  89. when switching between method groups (see <samp>retry.cc</samp>).
  90. </p>
  91. <a name="Selecting-the-default-method"></a>
  92. <h4 class="subsection">4.1.2 Selecting the default method</h4>
  93. <p>The default method that libitm uses for freshly started transactions (but
  94. not necessarily for restarted transactions) can be set via an environment
  95. variable (<code>ITM_DEFAULT_METHOD</code>), whose value should be equal to the name
  96. of one of the factory methods returning abi_dispatch subclasses but without
  97. the &quot;dispatch_&quot; prefix (e.g., &quot;serialirr&quot; instead of
  98. <code>GTM::dispatch_serialirr()</code>).
  99. </p>
  100. <p>Note that this environment variable is only a hint for libitm and might not
  101. be supported in the future.
  102. </p>
  103. <a name="Nesting_003a-flat-vs_002e-closed"></a>
  104. <h3 class="section">4.2 Nesting: flat vs. closed</h3>
  105. <p>We support two different kinds of nesting of transactions. In the case of
  106. <em>flat nesting</em>, the nesting structure is flattened and all nested
  107. transactions are subsumed by the enclosing transaction. In contrast,
  108. with <em>closed nesting</em>, nested transactions that have not yet committed
  109. can be rolled back separately from the enclosing transactions; when they
  110. commit, they are subsumed by the enclosing transaction, and their effects
  111. will be finally committed when the outermost transaction commits.
  112. <em>Open nesting</em> (where nested transactions can commit independently of the
  113. enclosing transactions) are not supported.
  114. </p>
  115. <p>Flat nesting is the default nesting mode, but closed nesting is supported and
  116. used when transactions contain user-controlled aborts
  117. (<code>__transaction_cancel</code> statements). We assume that user-controlled
  118. aborts are rare in typical code and used mostly in exceptional situations.
  119. Thus, it makes more sense to use flat nesting by default to avoid the
  120. performance overhead of the additional checkpoints required for closed
  121. nesting. User-controlled aborts will correctly abort the innermost enclosing
  122. transaction, whereas the whole (i.e., outermost) transaction will be restarted
  123. otherwise (e.g., when a transaction encounters data conflicts during
  124. optimistic execution).
  125. </p>
  126. <a name="Locking-conventions"></a>
  127. <h3 class="section">4.3 Locking conventions</h3>
  128. <p>This section documents the locking scheme and rules for all uses of locking
  129. in libitm. We have to support serial(-irrevocable) mode, which is implemented
  130. using a global lock as explained next (called the <em>serial lock</em>). To
  131. simplify the overall design, we use the same lock as catch-all locking
  132. mechanism for other infrequent tasks such as (de)registering clone tables or
  133. threads. Besides the serial lock, there are <em>per-method-group locks</em> that
  134. are managed by specific method groups (i.e., groups of similar TM concurrency
  135. control algorithms), and lock-like constructs for quiescence-based operations
  136. such as ensuring privatization safety.
  137. </p>
  138. <p>Thus, the actions that participate in the libitm-internal locking are either
  139. <em>active transactions</em> that do not run in serial mode, <em>serial
  140. transactions</em> (which (are about to) run in serial mode), and management tasks
  141. that do not execute within a transaction but have acquired the serial mode
  142. like a serial transaction would do (e.g., to be able to register threads with
  143. libitm). Transactions become active as soon as they have successfully used the
  144. serial lock to announce this globally (see <a href="#serial_002dlock_002dimpl">Serial lock
  145. implementation</a>). Likewise, transactions become serial transactions as soon as
  146. they have acquired the exclusive rights provided by the serial lock (i.e.,
  147. serial mode, which also means that there are no other concurrent active or
  148. serial transactions). Note that active transactions can become serial
  149. transactions when they enter serial mode during the runtime of the
  150. transaction.
  151. </p>
  152. <a name="State_002dto_002dlock-mapping"></a>
  153. <h4 class="subsection">4.3.1 State-to-lock mapping</h4>
  154. <p>Application data is protected by the serial lock if there is a serial
  155. transaction and no concurrently running active transaction (i.e., non-serial).
  156. Otherwise, application data is protected by the currently selected method
  157. group, which might use per-method-group locks or other mechanisms. Also note
  158. that application data that is about to be privatized might not be allowed to be
  159. accessed by nontransactional code until privatization safety has been ensured;
  160. the details of this are handled by the current method group.
  161. </p>
  162. <p>libitm-internal state is either protected by the serial lock or accessed
  163. through custom concurrent code. The latter applies to the public/shared part
  164. of a transaction object and most typical method-group-specific state.
  165. </p>
  166. <p>The former category (protected by the serial lock) includes:
  167. </p><ul>
  168. <li> The list of active threads that have used transactions.
  169. </li><li> The tables that map functions to their transactional clones.
  170. </li><li> The current selection of which method group to use.
  171. </li><li> Some method-group-specific data, or invariants of this data. For example,
  172. resetting a method group to its initial state is handled by switching to the
  173. same method group, so the serial lock protects such resetting as well.
  174. </li></ul>
  175. <p>In general, such state is immutable whenever there exists an active
  176. (non-serial) transaction. If there is no active transaction, a serial
  177. transaction (or a thread that is not currently executing a transaction but has
  178. acquired the serial lock) is allowed to modify this state (but must of course
  179. be careful to not surprise the current method group&rsquo;s implementation with such
  180. modifications).
  181. </p>
  182. <a name="Lock-acquisition-order"></a>
  183. <h4 class="subsection">4.3.2 Lock acquisition order</h4>
  184. <p>To prevent deadlocks, locks acquisition must happen in a globally agreed-upon
  185. order. Note that this applies to other forms of blocking too, but does not
  186. necessarily apply to lock acquisitions that do not block (e.g., trylock()
  187. calls that do not get retried forever). Note that serial transactions are
  188. never return back to active transactions until the transaction has committed.
  189. Likewise, active transactions stay active until they have committed.
  190. Per-method-group locks are typically also not released before commit.
  191. </p>
  192. <p>Lock acquisition / blocking rules:
  193. </p><ul>
  194. <li> Transactions must become active or serial before they are allowed to
  195. use method-group-specific locks or blocking (i.e., the serial lock must be
  196. acquired before those other locks, either in serial or nonserial mode).
  197. </li><li> Any number of threads that do not currently run active transactions can
  198. block while trying to get the serial lock in exclusive mode. Note that active
  199. transactions must not block when trying to upgrade to serial mode unless there
  200. is no other transaction that is trying that (the latter is ensured by the
  201. serial lock implementation.
  202. </li><li> Method groups must prevent deadlocks on their locks. In particular, they
  203. must also be prepared for another active transaction that has acquired
  204. method-group-specific locks but is blocked during an attempt to upgrade to
  205. being a serial transaction. See below for details.
  206. </li><li> Serial transactions can acquire method-group-specific locks because there
  207. will be no other active nor serial transaction.
  208. </li></ul>
  209. <p>There is no single rule for per-method-group blocking because this depends on
  210. when a TM method might acquire locks. If no active transaction can upgrade to
  211. being a serial transaction after it has acquired per-method-group locks (e.g.,
  212. when those locks are only acquired during an attempt to commit), then the TM
  213. method does not need to consider a potential deadlock due to serial mode.
  214. </p>
  215. <p>If there can be upgrades to serial mode after the acquisition of
  216. per-method-group locks, then TM methods need to avoid those deadlocks:
  217. </p><ul>
  218. <li> When upgrading to a serial transaction, after acquiring exclusive rights
  219. to the serial lock but before waiting for concurrent active transactions to
  220. finish (see <a href="#serial_002dlock_002dimpl">Serial lock implementation</a> for details),
  221. we have to wake up all active transactions waiting on the upgrader&rsquo;s
  222. per-method-group locks.
  223. </li><li> Active transactions blocking on per-method-group locks need to check the
  224. serial lock and abort if there is a pending serial transaction.
  225. </li><li> Lost wake-ups have to be prevented (e.g., by changing a bit in each
  226. per-method-group lock before doing the wake-up, and only blocking on this lock
  227. using a futex if this bit is not group).
  228. </li></ul>
  229. <p><strong>TODO</strong>: Can reuse serial lock for gl-*? And if we can, does it make
  230. sense to introduce further complexity in the serial lock? For gl-*, we can
  231. really only avoid an abort if we do -wb and -vbv.
  232. </p>
  233. <a name="Serial-lock-implementation"></a>
  234. <h4 class="subsection">4.3.3 Serial lock implementation</h4>
  235. <a name="serial_002dlock_002dimpl"></a>
  236. <p>The serial lock implementation is optimized towards assuming that serial
  237. transactions are infrequent and not the common case. However, the performance
  238. of entering serial mode can matter because when only few transactions are run
  239. concurrently or if there are few threads, then it can be efficient to run
  240. transactions serially.
  241. </p>
  242. <p>The serial lock is similar to a multi-reader-single-writer lock in that there
  243. can be several active transactions but only one serial transaction. However,
  244. we do want to avoid contention (in the lock implementation) between active
  245. transactions, so we split up the reader side of the lock into per-transaction
  246. flags that are true iff the transaction is active. The exclusive writer side
  247. remains a shared single flag, which is acquired using a CAS, for example.
  248. On the fast-path, the serial lock then works similar to Dekker&rsquo;s algorithm but
  249. with several reader flags that a serial transaction would have to check.
  250. A serial transaction thus requires a list of all threads with potentially
  251. active transactions; we can use the serial lock itself to protect this list
  252. (i.e., only threads that have acquired the serial lock can modify this list).
  253. </p>
  254. <p>We want starvation-freedom for the serial lock to allow for using it to ensure
  255. progress for potentially starved transactions (see <a href="#progress_002dguarantees">Progress Guarantees</a> for details). However, this is currently not enforced by
  256. the implementation of the serial lock.
  257. </p>
  258. <p>Here is pseudo-code for the read/write fast paths of acquiring the serial
  259. lock (read-to-write upgrade is similar to write_lock:
  260. </p><div class="example">
  261. <pre class="example">// read_lock:
  262. tx-&gt;shared_state |= active;
  263. __sync_synchronize(); // or STLD membar, or C++0x seq-cst fence
  264. while (!serial_lock.exclusive)
  265. if (spinning_for_too_long) goto slowpath;
  266. // write_lock:
  267. if (CAS(&amp;serial_lock.exclusive, 0, this) != 0)
  268. goto slowpath; // writer-writer contention
  269. // need a membar here, but CAS already has full membar semantics
  270. bool need_blocking = false;
  271. for (t: all txns)
  272. {
  273. for (;t-&gt;shared_state &amp; active;)
  274. if (spinning_for_too_long) { need_blocking = true; break; }
  275. }
  276. if (need_blocking) goto slowpath;
  277. </pre></div>
  278. <p>Releasing a lock in this spin-lock version then just consists of resetting
  279. <code>tx-&gt;shared_state</code> to inactive or clearing <code>serial_lock.exclusive</code>.
  280. </p>
  281. <p>However, we can&rsquo;t rely on a pure spinlock because we need to get the OS
  282. involved at some time (e.g., when there are more threads than CPUs to run on).
  283. Therefore, the real implementation falls back to a blocking slow path, either
  284. based on pthread mutexes or Linux futexes.
  285. </p>
  286. <a name="Reentrancy"></a>
  287. <h4 class="subsection">4.3.4 Reentrancy</h4>
  288. <p>libitm has to consider the following cases of reentrancy:
  289. </p><ul>
  290. <li> Transaction calls unsafe code that starts a new transaction: The outer
  291. transaction will become a serial transaction before executing unsafe code.
  292. Therefore, nesting within serial transactions must work, even if the nested
  293. transaction is called from within uninstrumented code.
  294. </li><li> Transaction calls either a transactional wrapper or safe code, which in
  295. turn starts a new transaction: It is not yet defined in the specification
  296. whether this is allowed. Thus, it is undefined whether libitm supports this.
  297. </li><li> Code that starts new transactions might be called from within any part
  298. of libitm: This kind of reentrancy would likely be rather complex and can
  299. probably be avoided. Therefore, it is not supported.
  300. </li></ul>
  301. <a name="Privatization-safety"></a>
  302. <h4 class="subsection">4.3.5 Privatization safety</h4>
  303. <p>Privatization safety is ensured by libitm using a quiescence-based approach.
  304. Basically, a privatizing transaction waits until all concurrent active
  305. transactions will either have finished (are not active anymore) or operate on
  306. a sufficiently recent snapshot to not access the privatized data anymore. This
  307. happens after the privatizing transaction has stopped being an active
  308. transaction, so waiting for quiescence does not contribute to deadlocks.
  309. </p>
  310. <p>In method groups that need to ensure publication safety explicitly, active
  311. transactions maintain a flag or timestamp in the public/shared part of the
  312. transaction descriptor. Before blocking, privatizers need to let the other
  313. transactions know that they should wake up the privatizer.
  314. </p>
  315. <p><strong>TODO</strong> Ho to implement the waiters? Should those flags be
  316. per-transaction or at a central place? We want to avoid one wake/wait call
  317. per active transactions, so we might want to use either a tree or combining
  318. to reduce the syscall overhead, or rather spin for a long amount of time
  319. instead of doing blocking. Also, it would be good if only the last transaction
  320. that the privatizer waits for would do the wake-up.
  321. </p>
  322. <a name="Progress-guarantees"></a>
  323. <h4 class="subsection">4.3.6 Progress guarantees</h4>
  324. <a name="progress_002dguarantees"></a>
  325. <p>Transactions that do not make progress when using the current TM method will
  326. eventually try to execute in serial mode. Thus, the serial lock&rsquo;s progress
  327. guarantees determine the progress guarantees of the whole TM. Obviously, we at
  328. least need deadlock-freedom for the serial lock, but it would also be good to
  329. provide starvation-freedom (informally, all threads will finish executing a
  330. transaction eventually iff they get enough cycles).
  331. </p>
  332. <p>However, the scheduling of transactions (e.g., thread scheduling by the OS)
  333. also affects the handling of progress guarantees by the TM. First, the TM
  334. can only guarantee deadlock-freedom if threads do not get stopped. Likewise,
  335. low-priority threads can starve if they do not get scheduled when other
  336. high-priority threads get those cycles instead.
  337. </p>
  338. <p>If all threads get scheduled eventually, correct lock implementations will
  339. provide deadlock-freedom, but might not provide starvation-freedom. We can
  340. either enforce the latter in the TM&rsquo;s lock implementation, or assume that
  341. the scheduling is sufficiently random to yield a probabilistic guarantee that
  342. no thread will starve (because eventually, a transaction will encounter a
  343. scheduling that will allow it to run). This can indeed work well in practice
  344. but is not necessarily guaranteed to work (e.g., simple spin locks can be
  345. pretty efficient).
  346. </p>
  347. <p>Because enforcing stronger progress guarantees in the TM has a higher runtime
  348. overhead, we focus on deadlock-freedom right now and assume that the threads
  349. will get scheduled eventually by the OS (but don&rsquo;t consider threads with
  350. different priorities). We should support starvation-freedom for serial
  351. transactions in the future. Everything beyond that is highly related to proper
  352. contention management across all of the TM (including with TM method to
  353. choose), and is future work.
  354. </p>
  355. <p><strong>TODO</strong> Handling thread priorities: We want to avoid priority inversion
  356. but it&rsquo;s unclear how often that actually matters in practice. Workloads that
  357. have threads with different priorities will likely also require lower latency
  358. or higher throughput for high-priority threads. Therefore, it probably makes
  359. not that much sense (except for eventual progress guarantees) to use
  360. priority inheritance until the TM has priority-aware contention management.
  361. </p>
  362. <hr>
  363. <div class="header">
  364. <p>
  365. Next: <a href="GNU-Free-Documentation-License.html#GNU-Free-Documentation-License" accesskey="n" rel="next">GNU Free Documentation License</a>, Previous: <a href="The-libitm-ABI.html#The-libitm-ABI" accesskey="p" rel="prev">The libitm ABI</a>, Up: <a href="index.html#Top" accesskey="u" rel="up">Top</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Library-Index.html#Library-Index" title="Index" rel="index">Index</a>]</p>
  366. </div>
  367. </body>
  368. </html>