CIS 451 Lab 3: Instruction Types

Author: This lab is derived by Zack Kurmas from a lab designed by Greg Wolffe. Some modifications by Andrew Kalafut


Objective: The purpose of this lab is to closely examine an example instruction set --- specifically, the instruction set for the MIPS R2000 RISC chip.

Deliverables: Submit in hardcopy your answers to the numbered questions.


This project should preferably be done in groups of 2. If there are sufficient lab computers available, you may work alone. However, if you have not used the MARS software in CIS 251, I encourage you to find a partner who did.


Overview

  We will be using a graphical MIPS simulator called MARS. In this lab, you will use MARS to examine the contents of memory and CPU registers and to show the machine instructions generated for each assembly language statement.

Resources


Getting Started with MARS

MARS is installed locally. Just type mars at the command line. You can also download the jar file and run your own copy on your personal machine. The current version of MARS is 4.2. Check that you are using the current version; older versions generate slightly different machine code.

Download the assembly program file named exampleCIT-1.s (either cut and paste it or use a right-click). Then, open it in MARS (File --> Open, or click the "Open a file for editing" button). This file is a minimal MIPS program. The data portion is empty; the code portion consists of a main function that simply calls the exit system call. Comments are given by the text on a line following the # symbol.

Notice the Registers display on the right. In this particular MIPS processor, all registers hold 32 bits of data.  Their current contents are printed in hexadecimal.  (Registers will be described in more detail in the next section.)

Click the Assemble button (the one with a screwdriver and wrench). In the "Execute" window, you will see the contents of both the text (code) and data segments of memory. MARS simulates a MIPS processor with a single, 4GB, byte-addressable memory. This is in contrast to the processor we are building in class, which has separate memories for instructions and data.
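To make the "single, 4GB, byte-addressable memory" concrete, here is a small Python sketch of the arithmetic. It assumes 32-bit addresses and 4-byte words; the constant and function names are made up for this illustration and are not part of MARS.

```python
# Sketch: size of a byte-addressable memory with 32-bit addresses.
# ADDRESS_BITS and BYTES_PER_WORD are illustrative names, not MARS settings.
ADDRESS_BITS = 32
BYTES_PER_WORD = 4

memory_bytes = 2 ** ADDRESS_BITS            # one address per byte
memory_words = memory_bytes // BYTES_PER_WORD
highest_address = memory_bytes - 1


def is_word_aligned(address):
    """Return True if the address is a multiple of the word size."""
    return address % BYTES_PER_WORD == 0


print(f"bytes: {memory_bytes} ({memory_bytes // 2**30} GiB)")
print(f"words: {memory_words}")
print(f"highest address: 0x{highest_address:08x}")
print(is_word_aligned(0x00000008), is_word_aligned(0x0000000A))
```

Word accesses such as lw expect a word-aligned address, which is why the addresses of consecutive instructions and words differ by 4.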

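The Text Segment window displays the instructions.  Each line contains 5 columns:

  Bkpt: a checkbox for setting a breakpoint at that instruction
  Address: the memory address of the instruction, in hexadecimal
  Code: the 32-bit machine instruction, in hexadecimal
  Basic: the machine instruction written in basic assembly, with register numbers and numeric operands
  Source: the corresponding line from your source file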


MIPS R2000 Architecture

Let's take a closer look at the registers in this processor. The PC register is, of course, the Program Counter. Upon loading a program, the PC will be initialized to the beginning of the text segment. (In other words, the program counter holds the address of the first instruction to be executed.) Registers are designated either by their number (R0..R31) or by their name (e.g., $a0, $sp). Register R0 is hard-wired to always hold the value 0. Register $at is reserved for use by the assembler (it is often used as an address register) and the $k registers are for the use of the operating system (kernel). Registers $v0 and $v1 are used for system call argument passing and return, and $a0..$a3 are used to pass arguments to functions. The $t registers are for temporary storage and the $s registers are used for semi-permanent storage (across function calls). Finally, there is the stack pointer ($sp), the global pointer ($gp), and the return address holder ($ra).
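The correspondence between register names and numbers can be written out as a short Python sketch. It follows the standard MIPS register convention; the list and helper names are invented for this example, and $fp (register 30) is included for completeness even though it is not discussed above.

```python
# Sketch: standard MIPS register names indexed by register number.
# REGISTER_NAMES and register_number are illustrative names only.
REGISTER_NAMES = (
    ["$zero", "$at", "$v0", "$v1"]            # 0..3
    + [f"$a{i}" for i in range(4)]            # 4..7   function arguments
    + [f"$t{i}" for i in range(8)]            # 8..15  temporaries
    + [f"$s{i}" for i in range(8)]            # 16..23 saved across calls
    + ["$t8", "$t9", "$k0", "$k1"]            # 24..27
    + ["$gp", "$sp", "$fp", "$ra"]            # 28..31
)


def register_number(name):
    """Return the register number (0..31) for a register name."""
    return REGISTER_NAMES.index(name)


for name in ["$at", "$a0", "$s0", "$sp", "$ra"]:
    print(name, "= R" + str(register_number(name)))
```

The "Basic" view in MARS shows registers by number, so a lookup like this is handy when matching it against your source code.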

Download, load, and assemble exampleCIT-2.s. This program simply declares a variable named val and stores the value 42 in it. The .word directive declares val to be an entire memory word in size. Knowing its value, you should now be able to find where val is stored in memory (it will be located somewhere in the data segment). Recall that all values are in hexadecimal.

Note: At this point, you may find the built-in KDE calculator (with hex conversion) useful.
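If you would rather not use a calculator, the same conversions can be done in a few lines of Python. This is only a convenience sketch; the variable names are arbitrary.

```python
# Sketch: decimal/hexadecimal conversion for a 32-bit word.
value = 42

# Format as an 8-digit hexadecimal word, the way a 32-bit value is displayed.
as_word = f"0x{value:08x}"

# Convert a hexadecimal string back to decimal.
back = int("2a", 16)

print(as_word)
print(back)
```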

  1. What is the address of the first instruction that will be executed when running the program?
  2. Where did you find this information?
  3. Which memory locations contain the value for the variable val?
  4. Diagram and label a process's address space in MARS. By this, I mean, draw a column labeled 0x00000000 at the bottom and 0xffffffff at the top. Then, show which portions of this address space are used for instructions, user data, the stack, kernel data, etc. Hint: You will have to look in different MARS windows to find these different parts. Also, the stack grows "down", meaning that each data item added to the stack has a smaller address than the previous item.
  5. What is the maximum program size for this configuration? (In other words, how many instructions can your program have before they overflow into data memory?)

Examining instructions

Load and assemble the exampleCIT-3.s file. The run button will cause execution of your program.  However, you will usually want to step through your program one instruction at a time; to do this, use the step button (an arrow with a "1" on it). Watch the value of the Program Counter as you step through the code.

Closely examine the instructions corresponding to source code lines 10 through 13. Normally, each assembly language instruction corresponds to one machine-language instruction. In this case, the second addi and the two lw assembly instructions are broken into multiple machine-language instructions. In the case of the lw, the first instruction (lui) loads a value into register R1 (the $at register; recall its use).  The second instruction actually performs the desired data transfer.
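The two-instruction pattern relies on splitting a 32-bit value into two 16-bit halves. The Python sketch below illustrates that arithmetic with a made-up value, 0x12345678, which is not an address from any of the lab files. It assumes the lower half is below 0x8000, so the sign extension that lw applies to its offset does not change the result. The function and variable names are invented for this example.

```python
# Sketch: splitting a 32-bit value into the halves used by a lui + lw pair.
# 0x12345678 is a hypothetical value, not an address from the lab files.

def split_halves(value):
    """Return the (upper, lower) 16-bit halves of a 32-bit value."""
    upper = (value >> 16) & 0xFFFF
    lower = value & 0xFFFF
    return upper, lower


upper, lower = split_halves(0x12345678)

# lui places its immediate in the upper 16 bits and zeros the lower 16 bits.
at_register = upper << 16

# The second instruction adds its 16-bit offset to the base register.
# (Assumes lower < 0x8000, so sign extension leaves it unchanged.)
effective_address = at_register + lower

print(f"upper = 0x{upper:04x}, lower = 0x{lower:04x}")
print(f"$at after lui = 0x{at_register:08x}")
print(f"effective address = 0x{effective_address:08x}")
```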

  1. What lines of machine language does the addi pseudo-instruction on line 11 produce?
  2. What does "lui" stand for? (Hint: Look on page A-57 in the SPIM guide.)
  3. Why do the two addi instructions result in different numbers of actual instructions? (Hint: Write each constant out in hex.)
  4. Explain how the MIPS assembler applies the "Make the common case fast" principle to addi.
  5. What is the hex representation of the "immediate" parameter to the lui instruction generated for line 12? Why is this the immediate value that is used (i.e., what does it represent)? Hint: Look in the Data Segment window.
  6. Now, look at the second machine instruction generated for line 12 (the lw instruction). Notice that this instruction has three parameters. Describe the function of all three parameters. (Hint: Examine the machine language generated for lines 13 and 15 as well as the description of lw on page A-67.)
  7. Are all three parameters necessary in order for lw to be able to access the entire 4GB memory space; or, could you eliminate the offset parameter? (In other words: If lw did not have an offset parameter, would there be memory locations that could not be read using lw?) If so, give an example. If not, give a sequence of machine instructions that could be used to load val2 into $t1 with a 2-parameter version of lw.
  8. Explain why the three-parameter version of lw is useful. Include an explanation of how it can be used to "make the common case fast." (In other words, how it can be used to reduce the number of instructions needed by the program.) (Hint: Look for redundant code in the execute window for exampleCIT-3.s.)
  9. Explain the cost of the three-parameter version of lw. In particular, include an explanation of how the third parameter can potentially slow the computer. (Think in terms of the hardware configuration needed for this instruction.)

Now, load add_xy.s, which requests two integers from the user and prints their sum. Run and review this code until you remember / learn how it works. The system calls for performing I/O are discussed on page A-43 of the SPIM guide.

  1. How is the li pseudo-instruction implemented? In other words, which "real" instructions are used to implement the li pseudo-instruction? (Remember, register 0 always contains the value 0.)
  2. How does MIPS implement the move pseudo-instruction?
  3. Would a built-in move be faster than the MIPS implementation? Why or why not? Consider the effects on both the time for the individual instruction, and the overall speed of the processor.

Instruction formats

  1. For each instruction in add_xy.s marked with a "*" in the comment (17, 18, 29, 30, 41, 42, and 72) complete a table showing how the assembly-language instruction is mapped into a machine-language instruction. For pseudo-instructions, create one table for each machine instruction produced by the assembler. You may use this template for your tables. Use the example below as a model:
    Assembly instruction          add $s2, $s0, $s1
    Machine instruction (hex)     0x02119020
    Machine instruction (binary)  000000  10000  10001  10010  00000   100000
    Instruction field (decimal)   0       16     17     18     0       32
    Field function                opcode  rs     rt     rd     unused  function
  2. What is the value of the immediate parameter for the beq instruction on line 30? Where does this number come from (i.e., how does the assembler calculate it)?
  3. What is the value of the immediate parameter for the j instruction on line 72? (Be careful, you need to look at the actual hex value of the instruction, not the number in the "Basic" column.) Where does this number come from (i.e., how does the assembler calculate it)?
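The model table for add $s2, $s0, $s1 can be double-checked with a short Python sketch that pulls the six R-type fields out of the hex word 0x02119020. The function names are invented for this example. The keys shamt and funct correspond to the "unused" and "function" columns of the table. The sketch handles only the R-type layout (6, 5, 5, 5, 5, 6 bits); other instruction formats divide the 32 bits differently.

```python
# Sketch: decoding an R-type MIPS instruction word into its fields.
# decode_r_type and field_bits are illustrative helper names.

def decode_r_type(word):
    """Return the six R-type fields of a 32-bit instruction word."""
    return {
        "opcode": (word >> 26) & 0x3F,  # bits 31..26
        "rs": (word >> 21) & 0x1F,      # bits 25..21
        "rt": (word >> 16) & 0x1F,      # bits 20..16
        "rd": (word >> 11) & 0x1F,      # bits 15..11
        "shamt": (word >> 6) & 0x1F,    # bits 10..6 ("unused" for add)
        "funct": word & 0x3F,           # bits 5..0 ("function")
    }


def field_bits(word):
    """Return the word in binary, grouped by R-type field widths."""
    bits = f"{word:032b}"
    groups = []
    position = 0
    for width in [6, 5, 5, 5, 5, 6]:
        groups.append(bits[position:position + width])
        position += width
    return " ".join(groups)


fields = decode_r_type(0x02119020)
print(field_bits(0x02119020))
print(fields)
```

The decoded values match the "Instruction field (decimal)" row of the model table: 0, 16, 17, 18, 0, 32.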