Formatting information

A beginner's introduction to typesetting with LaTeX

Appendix C — The ASCII character set

Peter Flynn

Silmaril Consultants
Textual Therapy Division

v. 3.6 (March 2005)



This edition of Formatting Information was prompted by the generous help I have received from TeX users too numerous to mention individually. Shortly after TUGboat published the November 2003 edition, I was reminded by a spate of email of the fragility of documentation for a system like LaTeX which is constantly under development. There have been revisions to packages; issues of new distributions, new tools, and new interfaces; new books and other new documents; corrections to my own errors; suggestions for rewording; and in one or two cases mild abuse for having omitted package X which the author felt to be indispensable to users.

I am grateful as always to the people who sent me corrections and suggestions for improvement. Please keep them coming: only this way can this book reflect what people want to learn. The same limitation still applies, however: no mathematics, as there are already a dozen or more excellent books on the market, as well as other online documents, dealing with mathematical typesetting in TeX and LaTeX in finer and better detail than I am capable of.

The structure remains the same, but I have revised and rephrased a lot of material, especially in the earlier chapters where a new user cannot be expected yet to have acquired any depth of knowledge. Many of the screenshots have been updated, and most of the examples and code fragments have been retested.

As I was finishing this edition, I was asked to review an article for The PracTeX Journal, which grew out of the Practical TeX Conference in 2004. The author specifically took the writers of documentation to task for failing to explain things more clearly, and as I read more, I found myself agreeing, and resolving to clear up some specific problem areas as far as possible. It is very difficult for people who write technical documentation to remember how they struggled to learn what has now become a familiar system. So much of what we do is second nature, and a lot of it actually has nothing to do with the software, but more with the way in which we view and approach information, and the general level of knowledge of computing. If I have obscured something by making unreasonable assumptions about your knowledge, please let me know so that I can correct it.

Peter Flynn is author of The HTML Handbook and Understanding SGML and XML Tools, and editor of The XML FAQ.



The ASCII character set



    The American Standard Code for Information Interchange (ASCII) was invented in 1963, and after several revisions settled down in 1986 as standard X3.4 of the American National Standards Institute (ANSI). It represents the 95 basic codes for the unaccented printable characters and punctuation of the Latin alphabet, plus 33 internal ‘control characters’ originally intended for the control of computers, programs, and external devices like printers and screens.

    Many other character sets (strictly speaking, ‘character repertoires’) have been standardised for accented Latin characters and for all other non-Latin writing systems, but these are intended for representing the symbols people use when writing text on computers. Most programs and computers use ASCII internally for all their coding, the exceptions being XML-based languages like XSLT, which are inherently designed to be usable with any writing system, and a few specialist systems like APL.

    Although the TeX and LaTeX file formats can easily be used with many other encoding systems (see the discussion of the inputenc package in section 2.7), they are based on ASCII. It is therefore important to know where to find all 95 of the printable characters, as some of them are not often used in other text-formatting systems. The following table shows all 128 characters, with their decimal, octal (base-8), and hexadecimal (base-16) code numbers.

    Table 1. The ASCII characters

    Oct    0    1    2    3    4    5    6    7    Hex
    '00↑   NUL  SOH  STX  ETX  EOT  ENQ  ACK  BEL  "0↑
    '01↑   BS   HT   LF   VT   FF   CR   SO   SI   "0↓
    '02↑   DLE  DC1  DC2  DC3  DC4  NAK  SYN  ETB  "1↑
    '03↑   CAN  EM   SUB  ESC  FS   GS   RS   US   "1↓
    '04↑   SP   !    "    #    $    %    &    '    "2↑
    '05↑   (    )    *    +    ,    -    .    /    "2↓
    '06↑   0    1    2    3    4    5    6    7    "3↑
    '07↑   8    9    :    ;    <    =    >    ?    "3↓
    '10↑   @    A    B    C    D    E    F    G    "4↑
    '11↑   H    I    J    K    L    M    N    O    "4↓
    '12↑   P    Q    R    S    T    U    V    W    "5↑
    '13↑   X    Y    Z    [    \    ]    ^    _    "5↓
    '14↑   `    a    b    c    d    e    f    g    "6↑
    '15↑   h    i    j    k    l    m    n    o    "6↓
    '16↑   p    q    r    s    t    u    v    w    "7↑
    '17↑   x    y    z    {    |    }    ~    DEL  "7↓
           8    9    A    B    C    D    E    F

    (SP is the space character.)

    The index numbers in the first and last columns are for finding the octal (base-8) and hexadecimal (base-16) values respectively. Replace the arrow with the number or letter from the top of the column (if the arrow points up) or from the bottom of the column (if the arrow points down).

    Example: The Escape character (ESC) is octal '033 (03 for the row, 3 for the number at the top of the column because the arrow points up) or hexadecimal "1B (1 for the row, B for the letter at the bottom of the column because the arrow points down).
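
    The apostrophe and double-quote prefixes used in this example are also how TeX itself writes octal and hexadecimal numbers, so the two readings of ESC can be checked in a small test document. This is only a sketch: the file is a hypothetical stand-alone test, not part of the book's own examples.

```latex
\documentclass{article}
\begin{document}
% TeX reads ' as an octal prefix and " as a hexadecimal prefix
% (hexadecimal digits A-F must be written in upper case).
Octal \texttt{'033} is decimal \number'033.\par
Hexadecimal \texttt{"1B} is decimal \number"1B.\par
% A backquote prefix gives the code of the character that follows it.
The letter A has code \number`\A.
\end{document}
```

    Both of the first two lines should print the same decimal value for ESC, which confirms that the octal and hexadecimal index columns agree.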

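    For the decimal value, multiply the octal row number by eight and add the column number from the top line (that makes ESC 3 × 8 + 3 = 27). Remember that the row number is itself written in octal, so row '12 counts as ten, not twelve.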
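
    The same arithmetic can be left to TeX as a check. The sketch below assumes an e-TeX-enabled engine (any current LaTeX installation should qualify), because it relies on \numexpr; the choice of the letter W as a second example is mine, purely for illustration.

```latex
\documentclass{article}
\begin{document}
% Row '03, column 3 (ESC): the row number is written in octal.
ESC: \the\numexpr '03 * 8 + 3 \relax\par
% Row '12, column 7 (W): '12 is ten in decimal, so this is 10 * 8 + 7.
W: \the\numexpr '12 * 8 + 7 \relax
\end{document}
```

    Writing the row number with its apostrophe prefix lets TeX do the octal-to-decimal conversion, which avoids the easy mistake of treating a row such as '12 as twelve.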
