<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- Copyright (C) 1988-2020 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being "Free Software" and "Free Software Needs
Free Documentation", with the Front-Cover Texts being "A GNU Manual,"
and with the Back-Cover Texts as in (a) below.
(a) The FSF's Back-Cover Text is: "You are free to copy and modify
this GNU Manual. Buying copies from GNU Press supports the FSF in
developing GNU and promoting software freedom." -->
<!-- Created by GNU Texinfo 5.1, http://www.gnu.org/software/texinfo/ -->
<head>
<title>Debugging with GDB: Character Sets</title>
<meta name="description" content="Debugging with GDB: Character Sets">
<meta name="keywords" content="Debugging with GDB: Character Sets">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link href="index.html#Top" rel="start" title="Top">
<link href="Concept-Index.html#Concept-Index" rel="index" title="Concept Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="Data.html#Data" rel="up" title="Data">
<link href="Caching-Target-Data.html#Caching-Target-Data" rel="next" title="Caching Target Data">
<link href="Core-File-Generation.html#Core-File-Generation" rel="previous" title="Core File Generation">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.smallquotation {font-size: smaller}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.indentedblock {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
div.smalldisplay {margin-left: 3.2em}
div.smallexample {margin-left: 3.2em}
div.smallindentedblock {margin-left: 3.2em; font-size: smaller}
div.smalllisp {margin-left: 3.2em}
kbd {font-style:oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
pre.smalldisplay {font-family: inherit; font-size: smaller}
pre.smallexample {font-size: smaller}
pre.smallformat {font-family: inherit; font-size: smaller}
pre.smalllisp {font-size: smaller}
span.nocodebreak {white-space:nowrap}
span.nolinebreak {white-space:nowrap}
span.roman {font-family:serif; font-weight:normal}
span.sansserif {font-family:sans-serif; font-weight:normal}
ul.no-bullet {list-style: none}
-->
</style>
</head>
<body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000">
<a name="Character-Sets"></a>
<div class="header">
<p>
Next: <a href="Caching-Target-Data.html#Caching-Target-Data" accesskey="n" rel="next">Caching Target Data</a>, Previous: <a href="Core-File-Generation.html#Core-File-Generation" accesskey="p" rel="previous">Core File Generation</a>, Up: <a href="Data.html#Data" accesskey="u" rel="up">Data</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html#Concept-Index" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<a name="Character-Sets-1"></a>
<h3 class="section">10.20 Character Sets</h3>
<a name="index-character-sets"></a>
<a name="index-charset"></a>
<a name="index-translating-between-character-sets"></a>
<a name="index-host-character-set"></a>
<a name="index-target-character-set"></a>
<p>If the program you are debugging uses a different character set to
represent characters and strings than the one <small>GDB</small> uses itself,
<small>GDB</small> can automatically translate between the character sets for
you. The character set <small>GDB</small> uses we call the <em>host
character set</em>; the one the inferior program uses we call the
<em>target character set</em>.
</p>
<p>For example, if you are running <small>GDB</small> on a <small>GNU</small>/Linux system that
uses the ISO Latin-1 character set, but you are using <small>GDB</small>&rsquo;s
remote protocol (see <a href="Remote-Debugging.html#Remote-Debugging">Remote Debugging</a>) to debug a program
running on an IBM mainframe, which uses the <small>EBCDIC</small> character set,
then the host character set is Latin-1, and the target character set is
<small>EBCDIC</small>. If you give <small>GDB</small> the command <code>set
target-charset EBCDIC-US</code>, then <small>GDB</small> translates between
<small>EBCDIC</small> and Latin-1 as you print character or string values, or use
character and string literals in expressions.
</p>
<p><small>GDB</small> has no way to automatically recognize which character set
the inferior program uses; you must tell it, using the <code>set
target-charset</code> command, described below.
</p>
<p>Here are the commands for controlling <small>GDB</small>&rsquo;s character set
support:
</p>
<dl compact="compact">
<dt><code>set target-charset <var>charset</var></code></dt>
<dd><a name="index-set-target_002dcharset"></a>
<p>Set the current target character set to <var>charset</var>. To display the
list of supported target character sets, type
<kbd>set&nbsp;<span class="nolinebreak">target-charset</span>&nbsp;<span class="key">TAB</span><span class="key">TAB</span><!-- /@w --></kbd>.
</p>
</dd>
<dt><code>set host-charset <var>charset</var></code></dt>
<dd><a name="index-set-host_002dcharset"></a>
<p>Set the current host character set to <var>charset</var>.
</p>
<p>By default, <small>GDB</small> uses a host character set appropriate to the
system it is running on; you can override that default using the
<code>set host-charset</code> command. On some systems, <small>GDB</small> cannot
automatically determine the appropriate host character set. In this
case, <small>GDB</small> uses &lsquo;<samp>UTF-8</samp>&rsquo;.
</p>
<p><small>GDB</small> can only use certain character sets as its host character
set. If you type <kbd>set&nbsp;<span class="nolinebreak">host-charset</span>&nbsp;<span class="key">TAB</span><span class="key">TAB</span><!-- /@w --></kbd>,
<small>GDB</small> will list the host character sets it supports.
</p>
</dd>
<dt><code>set charset <var>charset</var></code></dt>
<dd><a name="index-set-charset"></a>
<p>Set the current host and target character sets to <var>charset</var>. As
above, if you type <kbd>set&nbsp;charset&nbsp;<span class="key">TAB</span><span class="key">TAB</span><!-- /@w --></kbd>,
<small>GDB</small> will list the names of the character sets that can be used
for both host and target.
</p>
</dd>
<dt><code>show charset</code></dt>
<dd><a name="index-show-charset"></a>
<p>Show the names of the current host and target character sets.
</p>
</dd>
<dt><code>show host-charset</code></dt>
<dd><a name="index-show-host_002dcharset"></a>
<p>Show the name of the current host character set.
</p>
</dd>
<dt><code>show target-charset</code></dt>
<dd><a name="index-show-target_002dcharset"></a>
<p>Show the name of the current target character set.
</p>
</dd>
<dt><code>set target-wide-charset <var>charset</var></code></dt>
<dd><a name="index-set-target_002dwide_002dcharset"></a>
<p>Set the current target&rsquo;s wide character set to <var>charset</var>. This is
the character set used by the target&rsquo;s <code>wchar_t</code> type. To
display the list of supported wide character sets, type
<kbd>set&nbsp;<span class="nolinebreak">target-wide-charset</span>&nbsp;<span class="key">TAB</span><span class="key">TAB</span><!-- /@w --></kbd>.
</p>
</dd>
<dt><code>show target-wide-charset</code></dt>
<dd><a name="index-show-target_002dwide_002dcharset"></a>
<p>Show the name of the current target&rsquo;s wide character set.
</p></dd>
</dl>
<p>Here is an example of <small>GDB</small>&rsquo;s character set support in action.
Assume that the following source code has been placed in the file
<samp>charset-test.c</samp>:
</p>
<div class="smallexample">
<pre class="smallexample">#include &lt;stdio.h&gt;
char ascii_hello[]
  = {72, 101, 108, 108, 111, 44, 32, 119,
     111, 114, 108, 100, 33, 10, 0};
char ibm1047_hello[]
  = {200, 133, 147, 147, 150, 107, 64, 166,
     150, 153, 147, 132, 90, 37, 0};
int
main (void)
{
  printf (&quot;Hello, world!\n&quot;);
  return 0;
}
</pre></div>
<p>In this program, <code>ascii_hello</code> and <code>ibm1047_hello</code> are arrays
containing the string &lsquo;<samp>Hello, world!</samp>&rsquo; followed by a newline,
encoded in the <small>ASCII</small> and <small>IBM1047</small> character sets.
</p>
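<p>To make the idea of translation concrete, here is a minimal C sketch, not
taken from <small>GDB</small>&rsquo;s sources, that derives a partial
<small>IBM1047</small>-to-<small>ASCII</small> table from the two arrays above and uses it
to decode <code>ibm1047_hello</code>. The helper names
<code>build_partial_table</code> and <code>translate</code> are hypothetical, the arrays
are redeclared as <code>unsigned char</code> so they can index the table, and the
table covers only the bytes that occur in the sample string:
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The two sample arrays from charset-test.c.  They are declared as
   unsigned char here (an assumption of this sketch) so that values
   above 127 can be used directly as table indexes.  */
static const unsigned char ascii_hello[]
  = {72, 101, 108, 108, 111, 44, 32, 119,
     111, 114, 108, 100, 33, 10, 0};
static const unsigned char ibm1047_hello[]
  = {200, 133, 147, 147, 150, 107, 64, 166,
     150, 153, 147, 132, 90, 37, 0};

/* Hypothetical helper: fill TABLE so that TABLE[b] is the ASCII code
   for IBM1047 byte b, using only the byte pairs found in the two
   sample strings.  Bytes not seen in the sample map to '?'.  */
static void
build_partial_table (unsigned char table[256])
{
  size_t i;

  assert (sizeof ascii_hello == sizeof ibm1047_hello);
  memset (table, '?', 256);
  for (i = 0; i < sizeof ibm1047_hello; i++)
    table[ibm1047_hello[i]] = ascii_hello[i];
}

/* Hypothetical helper: translate the NUL-terminated string SRC through
   TABLE into DST, which must have room for strlen (SRC) + 1 bytes.  */
static void
translate (const unsigned char table[256], const unsigned char *src,
           char *dst)
{
  while (*src != 0)
    *dst++ = (char) table[*src++];
  *dst = '\0';
}
```

<p>After a call to <code>build_partial_table</code>, translating
<code>ibm1047_hello</code> yields the same text as <code>ascii_hello</code>. A real
conversion would use a complete table for the character set rather than one
inferred from a single string.
</p>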
<p>We compile the program, and invoke the debugger on it:
</p>
<div class="smallexample">
<pre class="smallexample">$ gcc -g charset-test.c -o charset-test
$ gdb -nw charset-test
GNU gdb 2001-12-19-cvs
Copyright 2001 Free Software Foundation, Inc.
&hellip;
(gdb)
</pre></div>
<p>We can use the <code>show charset</code> command to see what character sets
<small>GDB</small> is currently using to interpret and display characters and
strings:
</p>
<div class="smallexample">
<pre class="smallexample">(gdb) show charset
The current host and target character set is `ISO-8859-1'.
(gdb)
</pre></div>
<p>For the sake of printing this manual, let&rsquo;s use <small>ASCII</small> as our
initial character set:
</p><div class="smallexample">
<pre class="smallexample">(gdb) set charset ASCII
(gdb) show charset
The current host and target character set is `ASCII'.
(gdb)
</pre></div>
<p>Let&rsquo;s assume that <small>ASCII</small> is indeed the correct character set for our
host system &mdash; in other words, let&rsquo;s assume that if <small>GDB</small> prints
characters using the <small>ASCII</small> character set, our terminal will display
them properly. Since our current target character set is also
<small>ASCII</small>, the contents of <code>ascii_hello</code> print legibly:
</p>
<div class="smallexample">
<pre class="smallexample">(gdb) print ascii_hello
$1 = 0x401698 &quot;Hello, world!\n&quot;
(gdb) print ascii_hello[0]
$2 = 72 'H'
(gdb)
</pre></div>
<p><small>GDB</small> uses the target character set for character and string
literals you use in expressions:
</p>
<div class="smallexample">
<pre class="smallexample">(gdb) print '+'
$3 = 43 '+'
(gdb)
</pre></div>
<p>The <small>ASCII</small> character set uses the number 43 to encode the &lsquo;<samp>+</samp>&rsquo;
character.
</p>
<p><small>GDB</small> relies on the user to tell it which character set the
target program uses. If we print <code>ibm1047_hello</code> while our target
character set is still <small>ASCII</small>, we get gibberish:
</p>
<div class="smallexample">
<pre class="smallexample">(gdb) print ibm1047_hello
$4 = 0x4016a8 &quot;\310\205\223\223\226k@\246\226\231\223\204Z%&quot;
(gdb) print ibm1047_hello[0]
$5 = 200 '\310'
(gdb)
</pre></div>
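<p>The escapes in that output are simply the byte values written in octal:
200 is 310 in octal, 133 is 205, and so on, while bytes such as 107
(&lsquo;<samp>k</samp>&rsquo;) and 64 (&lsquo;<samp>@</samp>&rsquo;) happen to be printable
<small>ASCII</small> and appear unchanged. The following C sketch illustrates the
idea. It is a simplification written for this example, not
<small>GDB</small>&rsquo;s actual printing code: the function name
<code>render_as_ascii</code> is hypothetical, and it does not treat quotes,
backslashes, or newlines specially.
</p>

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not GDB's actual code: write the bytes of the
   NUL-terminated string SRC into DST, copying printable ASCII bytes
   (32 through 126) unchanged and rendering every other byte as a
   backslash followed by three octal digits.  Output is truncated if
   DST_SIZE is too small; 4 * strlen (SRC) + 1 bytes always suffice.
   Returns the number of characters written, not counting the
   terminating NUL.  */
static size_t
render_as_ascii (const unsigned char *src, char *dst, size_t dst_size)
{
  size_t used = 0;

  if (dst_size == 0)
    return 0;

  /* Each byte needs at most four characters, plus room for the NUL.  */
  for (; *src != 0 && used + 5 <= dst_size; src++)
    {
      if (*src >= 32 && *src <= 126)
        dst[used++] = (char) *src;
      else
        used += (size_t) snprintf (dst + used, dst_size - used,
                                   "\\%03o", (unsigned int) *src);
    }
  dst[used] = '\0';
  return used;
}
```

<p>Applied to the bytes of <code>ibm1047_hello</code>, this sketch produces the same
sequence of escapes and literal characters shown between the quotes in the
session above.
</p>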
<p>If we type <code>set target-charset</code> followed by <tt class="key">TAB</tt><tt class="key">TAB</tt>,
<small>GDB</small> lists the character sets it supports:
</p>
<div class="smallexample">
<pre class="smallexample">(gdb) set target-charset
ASCII EBCDIC-US IBM1047 ISO-8859-1
(gdb) set target-charset
</pre></div>
<p>We can select <small>IBM1047</small> as our target character set, and examine the
program&rsquo;s strings again. Now <code>ascii_hello</code> prints incorrectly, because
its bytes are interpreted as <small>IBM1047</small>, but
<small>GDB</small> translates the contents of <code>ibm1047_hello</code> from the
target character set, <small>IBM1047</small>, to the host character set,
<small>ASCII</small>, and they display correctly:
</p>
<div class="smallexample">
<pre class="smallexample">(gdb) set target-charset IBM1047
(gdb) show charset
The current host character set is `ASCII'.
The current target character set is `IBM1047'.
(gdb) print ascii_hello
$6 = 0x401698 &quot;\110\145%%?\054\040\167?\162%\144\041\012&quot;
(gdb) print ascii_hello[0]
$7 = 72 '\110'
(gdb) print ibm1047_hello
$8 = 0x4016a8 &quot;Hello, world!\n&quot;
(gdb) print ibm1047_hello[0]
$9 = 200 'H'
(gdb)
</pre></div>
<p>As above, <small>GDB</small> uses the target character set for character and
string literals you use in expressions:
</p>
<div class="smallexample">
<pre class="smallexample">(gdb) print '+'
$10 = 78 '+'
(gdb)
</pre></div>
<p>The <small>IBM1047</small> character set uses the number 78 to encode the &lsquo;<samp>+</samp>&rsquo;
character.
</p>
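<p>The two <code>print '+'</code> results can be summarized in a small C sketch: the
numeric value of a character literal is a function of the target character
set. The table and the function name <code>plus_in_target_charset</code> are
hypothetical and hold only the two values shown in the sessions above; they
are not part of <small>GDB</small>.
</p>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table for this sketch only: the code that each target
   character set uses for '+', taken from the two sessions above.  */
struct plus_code
{
  const char *charset;
  int code;
};

static const struct plus_code plus_codes[] = {
  {"ASCII", 43},
  {"IBM1047", 78},
};

/* Return the numeric value of the literal '+' under the target
   character set CHARSET, or -1 if this sketch does not know it.  */
static int
plus_in_target_charset (const char *charset)
{
  size_t i;

  for (i = 0; i < sizeof plus_codes / sizeof plus_codes[0]; i++)
    if (strcmp (plus_codes[i].charset, charset) == 0)
      return plus_codes[i].code;
  return -1;
}
```

<p>The same character therefore has a different value in an expression
depending on which target character set is selected when the expression is
evaluated.
</p>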
<hr>
<div class="header">
<p>
Next: <a href="Caching-Target-Data.html#Caching-Target-Data" accesskey="n" rel="next">Caching Target Data</a>, Previous: <a href="Core-File-Generation.html#Core-File-Generation" accesskey="p" rel="previous">Core File Generation</a>, Up: <a href="Data.html#Data" accesskey="u" rel="up">Data</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html#Concept-Index" title="Index" rel="index">Index</a>]</p>
</div>
</body>
</html>