Section 18.2
How I/O actually works

In this section we will see how a computer actually accomplishes I/O transfers when it has IN and OUT instructions and external ports connected to peripherals.

Fig. 18.2.1 shows a CPU connected to a tape reader. Four ports are needed for proper communication: two for status, one for commands, and one for the actual data.

Ports 0 and 1 are the status ports, port 2 is the command port, and port 3 is the data port. Only the tape drive will put values on the data port, and only the CPU will put commands on the command port, but both the CPU and the tape drive will read and write the two status ports.

Fig. 18.2.1: CPU attached to a tape reader, showing four ports

There are two status ports because one is under the control of the CPU while the other is controlled by the tape reader. Each port should have exactly one writer, although it could have many readers.
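The one-writer rule can be sketched in a few lines of Python. This is only an illustrative model, not part of the CSC-1: the Port class and the device names "CPU" and "TR" are hypothetical, the port numbering follows Fig. 18.2.2 (port 0 is written by the CPU, port 1 by the tape reader), and the bit widths (2 for status, 3 for commands, 8 for data) are assumptions based on the codes shown later in this section.

```python
class Port:
    """A latched value with exactly one permitted writer and any number of readers."""

    def __init__(self, name, writer, width_bits):
        self.name = name
        self.writer = writer               # the only device allowed to write
        self.mask = (1 << width_bits) - 1  # keep only the bits the port can hold
        self.value = 0                     # latched value, stable until the next write

    def write(self, device, value):
        if device != self.writer:
            raise PermissionError(f"{device} may not write port {self.name}")
        self.value = value & self.mask

    def read(self):
        # Any device may read; reading never changes the latched value.
        return self.value


# Hypothetical wiring of the four ports described in the text.
ports = {
    0: Port("CPU status", writer="CPU", width_bits=2),
    1: Port("TR status", writer="TR", width_bits=2),
    2: Port("command", writer="CPU", width_bits=3),
    3: Port("data", writer="TR", width_bits=8),
}
```

In this model both devices may call read() on either status port, but a call such as ports[3].write("CPU", 5) is rejected because only the tape reader owns the data port.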

A port is a special kind of register that sits at the "edge" of a chip. The word "port" comes from the Latin word for gate or door, as it provides an opening into the state of the chip. Each port has a number of flip-flops that latch a value into it, keeping it stable while the values are propagated over to another device, the reader. The reader will latch these signals into its copy of the port, namely the flip-flops on its end that will hold the values stable.

One thing that makes ports different from the internal registers of the CPU is that timing must be taken into account. Since the various chips or devices in a communications system each have their own internal clock, we cannot count on them being synchronized. Thus, they have to obey a protocol for when to read the values from the port's flip-flops and when to change them. This timing protocol is one of the trickiest aspects of all computer hardware.

When the CPU wishes to send a command to the peripheral, it writes a value into the command port. There may be many possible commands. Here are the most important, encoded using both numbers and words:

001	START	begin reading the tape and transferring data to the CPU
010	HALT	stop reading the tape
011	REWIND	physically rewind the tape back to the beginning
100	ALERT	flash a light on the external case of the tape reader to alert the human who is operating the device
101	QUERY	have the tape reader tell the CPU what state it is in
110	PAUSE	suspend sending data until further notice
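As a sketch of how a program might name these codes instead of scattering raw bit patterns through the code, the table can be written as a Python enumeration. The names Command and decode are hypothetical; only the six bit patterns come from the table above, and the patterns 000 and 111 are treated as unassigned because the table does not list them.

```python
from enum import IntEnum


class Command(IntEnum):
    """The 3-bit command codes the CPU writes to the command port."""
    START = 0b001   # begin reading the tape and transferring data
    HALT = 0b010    # stop reading the tape
    REWIND = 0b011  # rewind the tape to the beginning
    ALERT = 0b100   # flash the light on the tape reader's case
    QUERY = 0b101   # ask the tape reader what state it is in
    PAUSE = 0b110   # suspend sending data until further notice


def decode(bits):
    """Return the command name for a 3-bit pattern, or None if it is unassigned."""
    try:
        return Command(bits).name
    except ValueError:
        return None
```

For example, decode(0b011) gives "REWIND", while decode(0b111) gives None.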

The peripheral constantly transmits status information back to the CPU via the status register, such as whether the operation completed or experienced an error, or how much data was transferred. If the CPU explicitly asks what the tape reader is doing, perhaps because the reader hasn't responded with data in a while, the reader is obliged to tell it using the status register. In addition, the status register may be used to pace the CPU and the peripheral so that they do not get out of step with each other and lose data.

In our example, the two status ports are two bits long each. Remember, one is for the CPU to give status information to the tape reader and the other is for the opposite.

Here are the codes that the two devices put into their status ports. (TR is short for Tape Reader.)

CPU status port (which the TR reads)

00   CPU is quiescent
01   CPU has issued IN; now waiting for TR to respond
10   CPU has read the TR's result byte and accepted it

TR status port (which the CPU reads)

00   TR is ready to accept a command from the CPU
01   TR is getting the next byte
10   TR has gotten the next byte and it is ready to be read

For each byte read and sent to the CPU, the tape reader and the CPU communicate via these two ports. When the CPU issues an IN command to start the TR working, it sets its status port to 01. The TR reads this and knows that the CPU is waiting for it to fetch the byte. While the TR fetches the next byte off the tape, the CPU continually reads the TR's status port, waiting for it to change from 01 to 10.

Finally the byte is ready and the TR puts it onto the data port. Then it sets its status port to 10. The order of these events is critical: if the CPU is told that the data port has the correct value before the TR has actually put it there, the CPU will read the wrong bit pattern.

While the CPU copies the value from its side of the data port into internal registers (such as the main accumulator), the TR spins in a loop looking at the CPU's status port. Until that changes to 10, the TR must wait. Finally the CPU sets its status port to 10, meaning it has taken the data byte out of the port and is done. The TR can go back to square one and wait for the next command.
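The handshake just described can be traced with a small Python simulation. This is a sketch under stated assumptions, not a model of real hardware: the two devices simply take turns inside one loop, the function name transfer_one_byte and the parameter fetch_ticks (how many turns the tape reader needs to fetch a byte) are hypothetical, and only the status codes come from the tables above.

```python
# Status codes from the two tables above.
CPU_QUIESCENT, CPU_WAITING, CPU_ACCEPTED = 0b00, 0b01, 0b10
TR_READY, TR_FETCHING, TR_DATA_READY = 0b00, 0b01, 0b10


def transfer_one_byte(tape_byte, fetch_ticks=3):
    """Simulate one handshake; return (byte the CPU received, event log)."""
    cpu_status, tr_status, data_port = CPU_QUIESCENT, TR_READY, 0
    received = None
    ticks_left = fetch_ticks
    log = []

    cpu_status = CPU_WAITING                 # CPU asks for a byte
    log.append("CPU: status=01")
    while True:
        # --- tape reader's turn ---
        if tr_status == TR_READY and cpu_status == CPU_WAITING:
            tr_status = TR_FETCHING
            log.append("TR: status=01")
        elif tr_status == TR_FETCHING:
            ticks_left -= 1                  # the tape is slow
            if ticks_left <= 0:
                data_port = tape_byte        # data goes onto the port first...
                tr_status = TR_DATA_READY    # ...and only then the status changes
                log.append("TR: data written, status=10")
        elif tr_status == TR_DATA_READY and cpu_status == CPU_ACCEPTED:
            tr_status = TR_READY             # back to square one
            log.append("TR: status=00")
            return received, log
        # --- CPU's turn: poll the TR's status port ---
        if cpu_status == CPU_WAITING and tr_status == TR_DATA_READY:
            received = data_port
            cpu_status = CPU_ACCEPTED
            log.append("CPU: byte read, status=10")
```

Calling transfer_one_byte(0x41) returns the byte 0x41 together with a log in which "TR: data written, status=10" always precedes "CPU: byte read, status=10", which is the ordering the text calls critical.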

This constant monitoring of a wire, waiting for an event, is called polling. Devices that run on separate clocks can use polling to coordinate their actions. It is rather like two people who are working on different parts of the same project. One is always asking the other, "Are you ready?" or "Are you done with that and ready for the next thing?" Inside the CPU's innermost mechanism, every event is synchronized to a common clock and all events have a well-defined order, so polling is not necessary inside the CPU itself. Only when there is no common clock signal and no clear ordering of events does a technique such as polling become necessary.

Fig. 18.2.2 shows a simple program in the style of the CSC-1 computer that does the CPU's end of the transfer. This program is actually too simple: it has no provision for stopping, and the STD instruction is extremely unrealistic because it stores every byte into the same variable, overwriting the old value. A real program would probably put the data bytes into an array.

          LDI  001b         ;form the code for the START command
          OUT  2            ;send START command to the tape reader
          LDI  01b          ;put CPU's status into its status port
          OUT  0            ;which the TR reads as "Go read for me now!"

WHILE0:   NOP               ;get data bytes until done (see below...)

                            ;a spin loop to wait until the TR has delivered next byte
WHILE1:   IN   1            ;read the tape reader's status
          SUB  DATAREADY    ;compare to 10b, which means data is ready to pick up
          JZ   ENDWHILE1    ;if equal, then done
          JMP  WHILE1       ;else go back to top of loop
ENDWHILE1:NOP

          IN   3            ;read data byte on port 3 from the tape reader
          STD  X            ;store into main memory somewhere (X)
          LDI  10b          ;load 10b, the "data accepted" status code
          OUT  0            ;write to status port so tape reader sees it
                            ;the tape reader is now spinning, waiting for CPU
          LDI  100          ;pause the loop to allow the tape reader
                            ;to catch up

WHILE2:   JZ   ENDWHILE2    ;busy loop to waste 100 time units here because TR is slower than CPU
          SUB  ONE          ;It counts down from 100 to 0 by subtracting 1
          JMP  WHILE2       ;...sometime inside this loop, the tape reader sets the status
                            ;   register back to "00" so the CPU will see it
ENDWHILE2:NOP

          JMP  WHILE0       ;do it all over again to read next byte
ENDWHILE0:HLT

DATAREADY:NUM  10b          ;"10" is the code for the Tape Reader saying data is ready to get
ONE:      NUM  1

Fig. 18.2.2: Simple polling program to read bytes from a device
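For comparison, here is a rough Python sketch of the same CPU-side loop that stores each byte into a list instead of a single variable. It is not a translation of the CSC-1 listing. The FakeTapeReader class is a hypothetical stand-in device that advances one step each time the CPU reads port 1, fetch_polls is an invented parameter for how slow the tape is, and the loop differs from Fig. 18.2.2 in two labeled ways: it sets the CPU's status to 01 before every byte, and it polls for the TR's 00 status rather than waiting a fixed 100 time units.

```python
class FakeTapeReader:
    """Stand-in device: every CPU read of port 1 advances the reader one step."""

    def __init__(self, tape, fetch_polls=2):
        self.tape = list(tape)
        self.fetch_polls = fetch_polls
        self.ports = [0, 0, 0, 0]    # 0: CPU status, 1: TR status, 2: command, 3: data
        self.countdown = 0

    def out(self, port, value):      # the CPU's OUT instruction
        self.ports[port] = value

    def inp(self, port):             # the CPU's IN instruction
        if port == 1:
            self._step()
        return self.ports[port]

    def _step(self):
        cpu, tr = self.ports[0], self.ports[1]
        if tr == 0b00 and cpu == 0b01 and self.tape:
            self.ports[1] = 0b01                      # start fetching
            self.countdown = self.fetch_polls
        elif tr == 0b01:
            self.countdown -= 1
            if self.countdown <= 0:
                self.ports[3] = self.tape.pop(0)      # data first
                self.ports[1] = 0b10                  # then "data ready"
        elif tr == 0b10 and cpu == 0b10:
            self.ports[1] = 0b00                      # CPU accepted the byte


def read_bytes(dev, count):
    """CPU side: read `count` bytes by polling and return them as a list."""
    buffer = []
    dev.out(2, 0b001)                # send START on the command port
    for _ in range(count):
        dev.out(0, 0b01)             # CPU status: waiting for a byte
        while dev.inp(1) != 0b10:    # spin until the TR says data is ready
            pass
        buffer.append(dev.inp(3))    # store into the array, not one variable
        dev.out(0, 0b10)             # CPU status: data accepted
        while dev.inp(1) != 0b00:    # spin until the TR is ready again
            pass
    return buffer
```

With dev = FakeTapeReader([72, 73, 10]), the call read_bytes(dev, 3) returns [72, 73, 10]. Asking for more bytes than the tape holds would spin forever in this sketch, which is exactly the stopping problem discussed next.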

To stop the whole program, we would have to ask the tape reader whether it has just sent the last byte. The indication that it has is called the end of file flag. This flag could either be a status code sent through port 1 to the CPU, or it could be a special data value that signals "no more data." The ASCII system has a number of control characters (between 0 and 31, inclusive) and many of them have mnemonic abbreviations intended for data transmission protocols. For example, the byte value 00000100 (which is decimal 4) is EOT, or "End Of Transmission." It could signal that the data stream is finished.
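The special-data-value approach can be sketched as follows. The function name collect_until_eot and its next_byte argument (any callable that returns the next byte from the device) are hypothetical; only the EOT value of 4 comes from the text.

```python
EOT = 0b00000100  # ASCII "End Of Transmission", decimal 4


def collect_until_eot(next_byte):
    """Call next_byte() repeatedly; stop at EOT and return the bytes before it."""
    data = []
    while True:
        b = next_byte()
        if b == EOT:
            return bytes(data)   # the EOT marker itself is not stored
        data.append(b)
```

For instance, if the device delivers the bytes of "HI" followed by 4, the function returns b"HI". One consequence of this design is that the value 4 can no longer appear as ordinary data, since it would end the transfer early.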

Alternatively, the CPU could set up a counter and get exactly 1024 bytes and then stop. However, the tape might not hold 1024 bytes, so the tape reader would still have to signal an "end of data" to the CPU. Likewise, an error or malfunction would require the tape reader to send a status code telling the CPU to abort the loop. If the CPU ever gets into an infinite loop waiting for a peripheral, the whole computer will freeze and the user will see something odious like the hated "blue screen of death."
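One common defence against spinning forever is to put an upper bound on the number of polls. The sketch below is an assumption about how that might look, not something the text prescribes: poll_until, read_status, and max_polls are hypothetical names, and the limit would have to be chosen to suit the real device's speed.

```python
def poll_until(read_status, wanted, max_polls):
    """Poll read_status() until it returns `wanted`; give up after max_polls tries.

    Returns the number of polls that were needed, or raises TimeoutError.
    """
    for attempt in range(1, max_polls + 1):
        if read_status() == wanted:
            return attempt
    raise TimeoutError(f"status never became {wanted:02b} after {max_polls} polls")
```

A caller that catches TimeoutError can then report the problem or retry, rather than leaving the whole machine frozen in the spin loop.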

We have left many other things unanswered, such as techniques to detect end of file on the tape or input errors. Communicating through the status register is the main way this is done. The status register from the tape reader to the CPU would need more than two bits to accommodate codes for all possible conditions. The rules for these codes and for when they are set make I/O programming quite complicated and often messy.