Section 9.4
A program with a loop

The next program we will look at is a version of Euclid's algorithm for finding the greatest common divisor (gcd) of two integers. Here's the .lis file:

   0:  ;
   0:  ;  This algorithm is Euclid's algorithm for computing
   0:  ;  the GCD of two ints
   0:  ;
   0:  WHILE:     LOD   X         ; while (x != y) {
   1:             SUB   Y         ;
   2:             JZ    ENDWHILE  ;
   3:  IF:        JN    ELSE      ;      if (x > y)
   4:  THEN:      STD   X         ;           x = x - y;
   5:             JMP   ENDIF     ;
   6:  ELSE:      LOD   Y         ;      else
   7:             SUB   X         ;
   8:             STD   Y         ;           y = y - x;
   9:  ENDIF:     NOP             ;
  10:             JMP   WHILE     ; }
  11:  ENDWHILE:  HLT             ; /* stop */
  12:
  12:  =1000
1000:  X:         NUM   18        ; int x;
1001:  =2000
2000:  Y:         NUM   24        ; int y;

This program has both a while loop and an if/then/else. Actually, there are no while loops or if statements in assembler, only jumps, which are sometimes called branches because flow of control lines probably look like branches of a tree, especially when a complicated set of nested if/then/elses occurs. But then why aren't loops called snakes?

Another term for jumps or branches is goto. In the 1960s an influential paper by Edsgar Dijkstra was published called "Go To Statement Considered Harmful." It spurred great controversy but ultimately most computer science educators and many practitioners were won over to the goto-less style of programming. A new methodology emerged, called structured programming, which eschewed the spaghetti code of earlier years for a cleaner, clearer style using only whiles and if/then/elses. Today we still use this methodology, though gotos are useful in a few exceptional cases.

Since in assembler, there are only jumps, all other control structures are built out of them. It is possible for an assembler programmer to carefully and consistently use a labeling discipline which mimics the kind of code a C compiler would produce for whiles and if/then/elses. This method is shown in the gcd program above alongside the C code which might have generated this code is shown as comments off to the side.

A while loop in assembler consists of a first statement that has a label, usually TOP or WHILE to which an unconditional jump is taken at the bottom of the loop. Thus, we should see a JMP WHILE, and we do in line 10. At the top of the loop are the instructions which implement the continuation test, which is (x != y) in this program.

The way comparisons are done in most computers is to subtract one number from the other and then test various conditions. Two numbers will be equal only if the difference in the two numbers is 0, i.e. the result of subtraction is 0, in which case the Z condition bit will be set. This is why JZ is used in this program. Here's the skeleton of a while loop that continues execution if x != y:

   0:  WHILE:     LOD   X         ; while (x != y) {
   1:             SUB   Y         ;
   2:             JZ    ENDWHILE  ;
             body of the while loop
  10:             JMP   WHILE     ; }
  11:  ENDWHILE:  etc.

Lines 0, 1 and 2 test if x equals y. If so, a jump beyond the end of the while loop is taken. It would be perfectly feasible to do this another way, although given the fact that there is no JNZ (jump if not Zero) instruction in the CSC-1 computer, this is the shortest code to accomplish this task.

Compilers have to produce code that is very similar to the above, based on what instructions are available in the target computer. Many UNIX C compilers will let you see the assembler code by specifying the -s option on the compile command. With gcc, the popular GNU C compiler, using -s causes the compiler to write a file ending in .s which contains the assembler code:

%  gcc -s myprog.c
%  cat myprog.s

Unfortunately, most real assembler languages are much more complicated than the CSC-1, so you would probably be dismayed by all the extra junk in the assembler program.

If/then and if/then/else statements have a regular form that can be turned into assembler code mechanically. The code to implement the following statement is given in lines 3 through 9 (assuming that the CNVZ bits were already set by a previous subtraction).

if (x > y)
     x = x - y;
else
     y = y - x;

   3:  IF:        JN    ELSE      ;      if (x > y)
   4:  THEN:      STD   X         ;           x = x - y;
   5:             JMP   ENDIF     ;
   6:  ELSE:      LOD   Y         ;      else
   7:             SUB   X         ;
   8:             STD   Y         ;           y = y - x;
   9:  ENDIF:     NOP             ;

This segment of code uses an optimization to produce shorter code -- it does not subtract a-b again since the results of this arithmetic operation done earlier are still valid, still in the condition bits.

Remember that x-y was performed and the condition bits set. Since we got this far, we know that Z != 1, which means they are not equal. But is x > y? If x is greater than y, then x-y would be positive. If it is positive then we should execute the THEN clause. This can be negated logically to say that if x-y is negative then we should go to the ELSE clause, giving rise to JN ELSE. Had we insisted on using the other form, i.e. jumping to the THEN clause, then we would have needed another instruction:

   3:  IF:        JP    THEN      ;      if (x > y)
   4:             JMP   ELSE
   5:  THEN:      STD   X         ;           x = x - y;
   6:             JMP   ENDIF     ;
   7:  ELSE:      LOD   Y         ;      else
   8:             SUB   X         ;
   9:             STD   Y         ;           y = y - x;
  10:  ENDIF:     NOP             ;

Notice the NOP instruction at the end. This serves as a harmless way to attach a label to a do-nothing instruction. If we had a smart compiler or more experienced human assembler programmer, we could have compressed the ENDIF: label into the following statement, which happens to be JMP WHILE:

   3:  IF:        JN    ELSE      ;      if (x > y)
   4:  THEN:      STD   X         ;           x = x - y;
   5:             JMP   ENDIF     ;
   6:  ELSE:      LOD   Y         ;      else
   7:             SUB   X         ;
   8:             STD   Y         ;           y = y - x;
   9:  ENDIF:     JMP   WHILE     ;

These kinds of optimizations are very important in compiler design. Such improvements may look tiny and trivial but remember, these sections of code may be executed billions of times each day, so even a 1% reduction in program size or running speed may have significant economic impact.

In "Insanely Great," a history of the Macintosh computer written by Steven Levy, author of the famous book "Hackers", Levy tells how Steve Jobs, then head of Macintosh development in Apple Computer, pushed his programmers to shave seconds off the boot time of the computer. Jobs did calculations based on how many millions of people he thought would buy and used Macintoshes and how many times they would boot their computers every year, and came up with the fact that shaving 2 seconds off the boot time would save many centuries of human waiting time and frustration. Of course, no one human would ever have access to all those saved-up seconds, but never mind.

The very first FORTRAN compiler was full of sophisticated optimizations that produced excellent assembly code, almost as good as the best human assembler programmers, because the development team thought people would not use FORTRAN unless the programs it produced were almost as fast as what humans could write. Nowadays, nobody would dream of going back to assembly programming full-scale, but every little bit of time saved helps.

One last comment about the if/then/else, and that is that there is a JMP ENDIF right before the ELSE clause. This points out once again that there are no control structures in machine language -- only sequence and gotos. All the constructs of high level languages: while, do/while, if/then/else, switch, etc. are fictions created by compilers.

The gcd program does not do input and output, which is deliberate. The results of this algorithm are found in either X or Y. Just inspect memory to find this "output." The CSC-1 computer does not have explicit I/O instructions, which is not an oversight. Many real computers do not have them either, but rather rely on memory-mapped I/O. What this means is that I/O devices respond to signals that are done by reading from or writing to special memory locations. This is the way the Motorola 68000 chip communicates with the outside world. Other chips, like the Intel x86 series, actually have explicit IN and OUT instructions, but may still use memory-mapped I/O.