Section 6.1: Data Representation (Frame 2)

To translate, just replace each character with its code, written either as the raw binary bit pattern or as the decimal number that corresponds to that pattern. For example, "Java!" would be:

   J           a           v           a           !
   74          97          118         97          33
   01001010    01100001    01110110    01100001    00100001
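The same translation can be sketched in a few lines of Java. This is only an illustrative sketch: the class name AsciiEncode and the helper methods toBinary8 and encode are made up for this example, and it assumes every character in the input has a code below 256 so that it fits in 8 bits.

```java
public class AsciiEncode {
    // Return the binary pattern for one character, left-padded with zeros to 8 bits.
    // Assumption: the character's code is below 256.
    static String toBinary8(char c) {
        String bits = Integer.toBinaryString(c);
        return String.format("%8s", bits).replace(' ', '0');
    }

    // Replace every character of the text with its 8-bit pattern, separated by spaces.
    static String encode(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(toBinary8(c));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Print each character with its decimal code and its bit pattern.
        for (char c : "Java!".toCharArray()) {
            System.out.println(c + "  " + (int) c + "  " + toBinary8(c));
        }
    }
}
```

Running main should reproduce the three rows of the table above, one character per line.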

ASCII is just one of many codes developed for representing character sets; however, it is so widely used that it is almost unthinkable to build a computer using any other code now. EBCDIC (Extended Binary Coded Decimal Interchange Code) is another code, used by IBM mainframe computers. Other computer manufacturers had their own codes in the early days; CDC (Control Data Corporation), for example, used a 6-bit code.

A newer code, Unicode, has become very popular. It is an internationalized extension of ASCII: all of the ASCII codes are contained in Unicode with the same values. Unicode was originally designed around 16 bits per character, which allows 65,536 possible characters; the standard has since grown beyond that, to a code space of 1,114,112 code points. This allows other languages to include the special symbols for their alphabets.
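Java's char type holds a 16-bit Unicode value, so it gives a convenient way to see that ASCII sits inside Unicode. The sketch below is illustrative only: the class name UnicodeDemo and the helper isAscii are invented for this example, and the sample characters are arbitrary choices.

```java
public class UnicodeDemo {
    // True when the character falls in the ASCII range (codes 0 through 127).
    static boolean isAscii(char c) {
        return c < 128;
    }

    public static void main(String[] args) {
        // J, !, e with acute accent, Greek capital omega
        char[] samples = { 'J', '!', '\u00e9', '\u03a9' };
        for (char c : samples) {
            // ASCII characters keep their ASCII codes; the others get larger codes.
            System.out.println((int) c + (isAscii(c) ? "  ASCII" : "  beyond ASCII"));
        }
    }
}
```

The first two samples print the same codes they have in ASCII (74 and 33), while the accented and Greek letters print codes above 127 that ASCII has no room for.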

ASCII, EBCDIC and other codes use a fixed size for their codewords. Standard ASCII is a 7-bit code, although today it is almost always stored in 8 bits per codeword, with the extra bit either set to zero or used by extended character sets. EBCDIC has always been an 8-bit code. A codeword is the smallest chunk of information that is encoded using one of these systems, usually a single character or digit. Codewords are not broken up, nor do they have any internal structure.
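One practical consequence of fixed-size codewords is that a long run of bits can be decoded simply by cutting it into equal pieces. The following Java sketch shows the idea under the assumption of 8 bits per codeword; the class name FixedWidthDecode, the constant CODEWORD_BITS, and the method decode are hypothetical names chosen for this example.

```java
public class FixedWidthDecode {
    // Assumption for this sketch: every codeword is exactly 8 bits wide.
    static final int CODEWORD_BITS = 8;

    // Cut the bit string into 8-bit codewords and turn each one back into a character.
    static String decode(String bits) {
        if (bits.length() % CODEWORD_BITS != 0) {
            throw new IllegalArgumentException("bit count must be a multiple of " + CODEWORD_BITS);
        }
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < bits.length(); i += CODEWORD_BITS) {
            String codeword = bits.substring(i, i + CODEWORD_BITS);
            text.append((char) Integer.parseInt(codeword, 2));
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String bits = "01001010" + "01100001" + "01110110" + "01100001" + "00100001";
        System.out.println(decode(bits)); // prints Java!
    }
}
```

No separators are needed between codewords: because each one has the same width, the decoder always knows where the next one begins.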