SPARC Laboratory Manual
Arthur B. Maccabe
Jeff Vandyke
Department of Computer Science
The University of New Mexico
Albuquerque, NM 87131
Introduction
This laboratory manual was developed to provide a hands-on introduction to the
SPARC architecture. The labs are based on ISEM, an instructional SPARC emulator developed at the University of New Mexico.
The ISEM package is available via anonymous ftp. To obtain a copy, ftp to cs.unm.edu
and cd to pub/ISEM. The README file in this directory should provide you with the
information needed to obtain a working copy for your environment. ISEM currently runs
on most Unix boxes. There are plans to port ISEM to the DOS/Windows environment as
well as the Mac. If you have any difficulty getting a copy of ISEM or would like more
information regarding the status of the ports, send email to [email protected].
In addition to an instruction set emulator for the SPARC, the ISEM package includes
emulations for several devices (a character mapped display, a bitmapped display, a UART,
etc.), an assembler, and a linker. The assembler is a slightly modified version of the GNU
assembler (gas version 2.1.1). The primary modification is the addition of several synthetic
operations to support loads and stores from/to arbitrary locations in memory. These operations are described in the first few laboratory write-ups.
The lab manual is complete in that it covers all of the SPARC operations and instruction
formats. As such, students should not require individual copies of The SPARC Architecture Manual. However, we have found it useful to have copies of the SPARC Architecture
Manual available to students on a reference basis.
This lab manual has been designed to accompany Computer Systems: Architecture, Organization, and Programming by Maccabe (Richard D. Irwin, 1993). However, the manual does
not directly reference the text and, as such, could be used with other text books.
Laboratory 1
Using ISEM (the Instructional
SPARC Emulator)
1.1 Goal
To describe the translation of assembly language programs and introduce basic features of
ISEM, the instructional SPARC emulator.
1.2 Objectives
After completing this lab, you will be able to:
1.3 Discussion
In this lab we describe the translation process and introduce the basic features of ISEM.
We begin by describing the translation process: the steps used to translate an assembly
language program into an executable program. After describing the translation process,
we describe how you can execute and test executable programs using ISEM.
The first thing you need is a simple SPARC program. Figure 1.1 illustrates a simple
SPARC program. We will use this program in the remainder of this lab.
Activity 1.1
Using a text editor, enter the program shown in Figure 1.1 into a file called foo.s.
        .data                   ! variables
x:      .word   0x42            ! initialize x to 0x42
y:      .word   0x20            ! initialize y to 0x20
z:      .word   0               ! initialize z to 0

        .text                   ! instructions
start:
        set     x, %r2          ! &x --> %r2
        ld      [%r2], %r2      ! x --> %r2
        set     y, %r3          ! &y --> %r3
        ld      [%r3], %r3      ! y --> %r3
        add     %r2, %r2, %r2   ! r2 + r2 --> %r2
        add     %r2, %r3, %r2   ! r2 + r3 --> %r2
        set     z, %r3          ! &z --> %r3
        st      %r2, [%r3]      ! r2 --> z (z = 2x + y)
        ta      0
end:
Hexadecimal ISEM reports all of its results in hexadecimal. To simplify your interaction
with ISEM, we will use hexadecimal notation in our programs. The program shown in
Figure 1.1 uses the hexadecimal constants: 0x42 and 0x20.
We will use the listing specification -als (to generate a source listing and a list of symbols). The output specification consists of a -o followed by the name of the output file.
Figure 1.2 illustrates the interaction that results when you assemble the program in foo.s.
Activity 1.2
Assemble the program in foo.s, placing the object code in the file foo.o.
The linker is called isem-ld. Linker commands start with the name of the linker, isem-ld,
followed by an output specification, followed by the name of the file containing the object
program.
Activity 1.3
Link the object code in foo.o, placing the executable program in the file foo.
isem-as -als -o foo.o foo.s
SPARC GAS  foo.s                                        page 1

   1                    .data                   ! variables
   2 0030 00000042  x:  .word 0x42              ! initialize x to 0x42
   3 0034 00000020  y:  .word 0x20              ! initialize y to 0x20
   4 0038 00000000  z:  .word 0                 ! initialize z to 0
   5
   6                    .text                   ! instructions
   7                start:
   8 0000 05000000      set x, %r2              ! &x --> %r2
   8      8410A000
   9 0008 C4008000      ld [%r2], %r2           ! x --> %r2
  10 000c 07000000      set y, %r3              ! &y --> %r3
  10      8610E000
  11 0014 C600C000      ld [%r3], %r3           ! y --> %r3
  12 0018 84008002      add %r2, %r2, %r2       ! r2 + r2 --> %r2
  13 001c 84008003      add %r2, %r3, %r2       ! r2 + r3 --> %r2
  14 0020 07000000      set z, %r3              ! &z --> %r3
  14      8610E000
  15 0028 C420C000      st %r2, [%r3]           ! r2 --> z (z = 2x + y)
  16 002c 91D02000      ta 0
  17                end:

SPARC GAS  foo.s                                        page 2

DEFINED SYMBOLS
        foo.s:2       2:00000030 x
        foo.s:3       2:00000034 y
        foo.s:4       2:00000038 z
        foo.s:7       1:00000000 start
        foo.s:17      1:00000030 end

UNDEFINED SYMBOLS
also tells you the current value of the program counter (PC) and the next program counter
(nPC). Finally, ISEM shows you the next instruction that it will execute (i.e., the instruction
pointed to by the PC).
ISEM> load foo
Loading File: foo
2000 bytes loaded into Text region at address 8:0
2000 bytes loaded into Data region at address a:2000
PC: 08:00000020   nPC: 00000024   PSR: 0000003e   C:0
start : sethi 0x8, %g2
Note that the instruction (sethi 0x8, %g2) doesn't look like the first instruction in the
sample program (set x, %r2). We will discuss the reason for this when we consider synthetic
operations in Lab 9. For now, it is sufficient to know that a set instruction may be implemented
using two instructions: a sethi instruction followed by an or instruction.
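To make the sethi/or pairing concrete, the following Python sketch splits a 32-bit constant the way a sethi/or pair would carry it: sethi supplies the upper 22 bits and or fills in the lower 10 bits. The function names split_set and emulate_set are our own and are not part of ISEM or the assembler; this is an illustration of the idea, not a description of how the assembler is implemented.

```python
def split_set(value):
    """Split a 32-bit constant into the two fields carried by a sethi/or pair."""
    value &= 0xFFFFFFFF
    hi22 = value >> 10      # upper 22 bits: the operand of sethi
    lo10 = value & 0x3FF    # lower 10 bits: the operand of or
    return hi22, lo10


def emulate_set(value):
    """Rebuild the constant the way the two instructions would."""
    hi22, lo10 = split_set(value)
    reg = hi22 << 10        # sethi: write bits 31..10, clear bits 9..0
    reg |= lo10             # or: fill in bits 9..0
    return reg


# 0x2060 is the address that ISEM reports for x in the sample program.
hi, lo = split_set(0x2060)
print(hex(hi), hex(lo))
```

For the address 0x2060 the upper field is 0x8, which agrees with the sethi 0x8, %g2 instruction that ISEM displays.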
The trace command
You can execute your program, one instruction at a time, using the trace command. The
trace command executes a single instruction and reports the values stored in the registers,
followed by the next instruction to be executed.
Figure 1.5 illustrates three successive executions of the trace command. Note that register 2 (first row, third column) now has the value 0x00000042, the value used in the initialization of x.
To complete the execution of the sample program, you need to issue nine more trace
commands (a total of twelve trace commands). As you issue trace commands, note how
the values in registers 2 and 3 change. When you have executed all of the instructions in
the sample program, ISEM will print the message Program exited normally. Figure 1.6
illustrates the execution of the last two trace commands.
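Before tracing, it can help to predict the values you expect to see. The Python sketch below mirrors the effect of the sample program on registers 2 and 3 and on z. It models only the arithmetic (a dictionary stands in for memory, and the set/ld pairs are collapsed into one step), so treat it as a checking aid under those assumptions rather than as an emulator.

```python
MASK = 0xFFFFFFFF  # registers hold 32-bit values

# A dictionary stands in for the data region; the names are the program's labels.
memory = {"x": 0x42, "y": 0x20, "z": 0}

r2 = memory["x"]            # set x, %r2 ; ld [%r2], %r2
r3 = memory["y"]            # set y, %r3 ; ld [%r3], %r3
r2 = (r2 + r2) & MASK       # add %r2, %r2, %r2
r2 = (r2 + r3) & MASK       # add %r2, %r3, %r2
memory["z"] = r2            # set z, %r3 ; st %r2, [%r3]

print(hex(r2), hex(memory["z"]))
```

The predicted result, 0xa4, is the value that appears in register 2 after the final trace command and in the dump of z.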
Activity 1.5
ISEM> trace
     ----0--- ----1--- ... ----7---
G    00000000 00000000 ... 00000000
O    00000000 00000000 ... 00000000
L    00000000 00000000 ... 00000000
I    00000000 00000000 ... 00000000
PC: 08:00000024   PSR: 0000003e   C:0
start+04 : or

ISEM> trace
     ----0--- ----1--- ... ----7---
G    00000000 00000000 ... 00000000
O    00000000 00000000 ... 00000000
L    00000000 00000000 ... 00000000
I    00000000 00000000 ... 00000000
PC: 08:00000028   PSR: 0000003e   C:0
start+08 : ld [%g2], %g2

ISEM> trace
     ----0--- ----1--- ... ----7---
G    00000000 00000000 ... 00000000
O    00000000 00000000 ... 00000000
L    00000000 00000000 ... 00000000
I    00000000 00000000 ... 00000000
PC: 08:0000002c   PSR: 0000003e   C:0
start+0c : sethi 0x8, %g3
ISEM> trace
     ----0--- ----1--- ... ----7---
G    00000000 00000000 ... 00000000
O    00000000 00000000 ... 00000000
L    00000000 00000000 ... 00000000
I    00000000 00000000 ... 00000000
PC: 08:0000004c   PSR: 0000003e   C:0
start+2c : ta [%g0 + 0x0]

ISEM> trace
Program exited normally.
     ----0--- ----1--- ----2--- ----3--- ... ----7---
G    00000000 00000000 000000a4 00002068 ... 00000000
O    00000000 00000000 00000000 00000000 ... 00000000
L    00000000 0000004c 00000050 00000000 ... 00000000
I    00000000 00000000 00000000 00000000 ... 00000000
PC: 09:00000800   nPC: 00000804   C:0
end+7b0 : jmpl [%l2], %g0
ISEM> dump z,z
0a:00002068  00 00 00 a4 00 00 00 00    ........
Use the dump command to examine the values stored in the memory locations x, y, and z.
Memory Addresses When you are interacting with ISEM, memory addresses can be
specified using integer constants or the labels defined in an assembly language program.
Once you have loaded an executable program, you can use the symb command to display the values
defined by the program. Figure 1.8 illustrates the symb command.
ISEM> symb
Symbol List
end   : 00000050
start : 00000020
x     : 00002060
y     : 00002064
z     : 00002068
Activity 1.7
Use the symb command to examine the labels defined by the sample program.
Use the edit command to change the values associated with x and y.
Activity 1.9
Register names
%r0-%r31
%pc, %npc
%y
%f0-%f31
%fq
%fsr
%psr
%wim
%tbr
For example, the command reg %pc start resets the PC to the start of the program.
You can use this command when you want to rerun the sample program.
The run command
You can use the run command to execute your program, starting with the instruction
pointed to by the PC. This command does not take any arguments and executes instructions until it encounters a breakpoint, an illegal instruction, or a program termination instruction (ta 0).
Activity 1.10 Use the reg command to reset the value of %pc. Then use the run command to rerun the
sample program.
Figure 1.9 illustrates the run command. This interaction starts by setting the %pc and
then issuing the run command. Note that the run command produces an error message.
In this case, the run command stopped executing program instructions because it encountered an illegal instruction. Whenever you load a program, ISEM makes sure that there is
an illegal instruction following the last instruction in your program.
ISEM> reg %pc start
Register: %pc = 20
ISEM> run
Program exited normally.
ISEM> reg %pc start
Register: %pc = 20
ISEM> break start+20
ISEM> run
Breakpoint encountered at start+14
1.4 Summary
In this lab we have described the steps in the translation process and introduced the basic
functions provided by ISEM.
The ISEM assembler, isem-as, translates an assembly language program into object
code. The linker, isem-ld, translates an object code program into an executable program.
Figure 1.11 summarizes the steps used to translate an assembly language program into an
executable program.
foo.s (assembly code) --> isem-as --> foo.o (object code) --> isem-ld --> foo (executable code)

isem-ld -o foo foo.o
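[Figure: ISEM state and the commands that access it. Memory holds the text (code) region, the data region, and the symbol table; the registers include %r0 (zero), %r1 (reserved), %r2 through %r31, and %pc. The load, dump, edit, and symb commands work with memory and the symbol table; the reg command works with the registers.]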
Meaning
Set an execution breakpoint.
Display the contents of memory
Set the contents of a memory location.
Provide information about an isem topic or command.
Load an executable program.
Exit ISEM.
Display the symbol table.
Display or set the contents of a register.
Execute instructions to a breakpoint or illegal instruction.
Execute the next instruction.
Notes:
Items in square brackets ([ . . . ]) are optional.
address     a memory address (may be a label or number)
value       an integer value (0x. . . means hexadecimal notation)
filename    the name of a file
register    the name of a register
1.6 Exercises
1. After you have successfully assembled the sample program, perform the following
modifications to the second line of the program (x: .word 0x42). In each case, you
should start with the original sample program and you should write down the error
message (if any) produced by the assembler.
a. remove the : following x,
b. remove the . preceding word.
2. After you have successfully assembled the sample program, perform the following
modifications to the eleventh line of the program (ld [%r3], %r3). In each case, you
should start with the original sample program and you should write down the error
dump x,z
dump z,x
dump z
dump x
dump start,end
4. In examining Figure 1.4, note that the value of the PC is given as 08:00000020. What
does the 08: mean?
5. In examining Figure 1.7, note that the address for z is given as 0a:00002068. What
does the 0a: mean?
Laboratory 2
Assembly Language Programming
2.1 Goal
To introduce the fundamentals of assembly language programming.
2.2 Objectives
After completing this lab, you will be able to write assembly language programs that use:
2.3 Discussion
In this lab we introduce the fundamentals of SPARC assembly language programming. In
particular, we consider basic assembler directives, register naming conventions, the (synthetic) load and store operations, the integer addition and subtraction operations, and the
(synthetic) register copy and register set operations. We begin by considering the structure
of assembly language programs.
A line that only has spaces or tabs (i.e., white space) is an empty line. Empty lines are
ignored by the assembler.
A label definition line consists of a label definition. A label definition consists of an
identifier followed by a colon (:). As in most programming languages, an identifier
must start with a letter (or an underscore) and may be followed by any number of
letters, underscores, and digits.
A directive line consists of an optional label definition, followed by the name of an
assembler directive, followed by the arguments for the directive. In this lab we will
consider three assembler directives: .data, .word, and .text.
Every line can conclude with a comment. Comments begin with the character !.
Whenever it encounters a !, the assembler ignores the ! and the remaining characters
on the line.
Activity 2.1 Consider the SPARC program presented in Figure 1.1. For each nonempty line in the program,
identify any labels defined and identify any assembler directives and assembly language instructions.
2.3.2 Directives
In this lab we introduce three directives: .data, .text, and .word. The first two (.data and
.text) are used to separate variable declarations and assembly language instructions. The
.word directive is used to allocate and initialize space for a variable.
Each group of variable declarations should be preceded by a .data directive. Each
group of assembly language instructions should be preceded by a .text directive. Using
these directives, you could mix variable declarations and assembly language instructions;
however, for the present, your assembly language programs should consist of a group of
variable declarations followed by a group of assembly language instructions.
A variable declaration starts with a label definition (the name of the variable), followed
by a .word directive, followed by the initial value for the variable. The assembler supports
a fairly flexible syntax for specifying the initial value. For now, we will use simple integer values to initialize our variables. By default, the assembler assumes that numbers are
expressed using decimal notation. You can use hexadecimal notation if you use the 0x
prefix. Example 2.1 illustrates a group of variable declarations.
Example 2.1 Give directives to allocate space for three variables, x, y, and z. You should initialize these
variables to decimal 23, hexadecimal 3fce, and decimal 42, respectively.
        .data                   ! start a group of variable declarations
x:      .word   23              ! int x = 23
y:      .word   0x3fce          ! int y = 0x3fce
z:      .word   42              ! int z = 42
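As a rough picture of what three .word directives place in memory, the Python sketch below builds the byte image for the initial values 23, 0x3fce, and 42. It assumes that each .word occupies four bytes stored most significant byte first, which is consistent with the way the dump command displays z in Lab 1. The helper name word_image is ours.

```python
import struct


def word_image(values):
    """Return the bytes produced by a sequence of .word directives.

    Assumes 32-bit words stored big-endian (most significant byte first).
    """
    return b"".join(struct.pack(">I", v & 0xFFFFFFFF) for v in values)


image = word_image([23, 0x3FCE, 42])

# Each variable starts four bytes after the previous one.
for name, offset in (("x", 0), ("y", 4), ("z", 8)):
    print(name, image[offset:offset + 4].hex())
```

Because each word is four bytes long, consecutive variables sit at addresses that differ by four, just as x, y, and z do in the symbol list from Lab 1.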
2.3.3 Labels
In an assembly language program, a label is simply a name for an address. For example,
given the declarations shown in Example 2.1, x is a name for the address of a memory
location that was initialized to 23. On the SPARC an address is a 32-bit value. As such,
labels are 32-bit values when they are used in assembly language programs.
Alternate names    Group name
%g0-%g7            Global registers
%o0-%o7            Output registers
%l0-%l7            Local registers
%i0-%i7            Input registers
We will consider the meanings of these group names when we consider register windows in Lab 11. In the
meantime, we will use the %r names in our assembly language programs. As you may
have noted, ISEM uses the alternate names when it reports the contents of the registers
and when it shows the next instruction to execute.
Register Names Register names on the SPARC always start with a percent sign (%).
For example, the integer registers are named %r0 through %r31.
2.3.5 %r0
The value stored in %r0 is always zero and cannot be altered. If an instruction specifies
%r0 as the destination, the result is simply discarded. It is not an error to execute
an instruction that specifies %r0 as the destination for the result; however, the contents of
%r0 will not be altered when this instruction is executed.
%r0 Register %r0 always holds the value zero. The value stored in this register cannot
be altered.
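One way to picture this behavior is a register file in which writes to register 0 are silently dropped. The Python class below is a minimal sketch of that idea; the class and method names are our own and do not correspond to anything in ISEM.

```python
class RegisterFile:
    """A toy model of the 32 integer registers, with %r0 hard-wired to zero."""

    def __init__(self):
        self._regs = [0] * 32

    def read(self, n):
        return self._regs[n]

    def write(self, n, value):
        if n == 0:
            return                      # the result is discarded; no error is raised
        self._regs[n] = value & 0xFFFFFFFF


regs = RegisterFile()
regs.write(2, 0x42)
regs.write(0, 0x42)                     # legal, but has no effect
print(hex(regs.read(2)), hex(regs.read(0)))
```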
Assembler syntax        Operation implemented
set siconst32, rd       reg[rd] = siconst32
Example 2.2 Using set instructions, write code that will load the value 0x42 into register %r2 and the
address of x (from Example 2.1) into register %r3.
        set     0x42, %r2
        set     x, %r3
Assembler syntax        Operation implemented
ld [rs], rd             reg[rd] = memory[reg[rs]]
st rs, [rd]             memory[reg[rd]] = reg[rs]
Notes:
rd
rs
Source and destination registers When we introduce assembly language syntax, the
names rs and rd are used to denote source and destination registers, respectively. When an
instruction uses multiple source registers, we use subscripts to distinguish these registers.
Assembler syntax           Operation implemented
add rs1, rs2, rd           reg[rd] = reg[rs1] + reg[rs2]
add rs, siconst13, rd      reg[rd] = reg[rs] + siconst13
sub rs1, rs2, rd           reg[rd] = reg[rs1] - reg[rs2]
sub rs, siconst13, rd      reg[rd] = reg[rs] - siconst13
a = (a + b) - (c - d).
        .data
a:      .word   0x42
b:      .word   0x43
c:      .word   0x44
d:      .word   0x45

        .text
start:  set     a, %r1
        ld      [%r1], %r2      ! a --> %r2
        set     b, %r1
        ld      [%r1], %r3      ! b --> %r3
        set     c, %r1
        ld      [%r1], %r4      ! c --> %r4
        set     d, %r1
        ld      [%r1], %r5      ! d --> %r5

        add     %r2, %r3, %r2   ! a + b --> %r2
        sub     %r4, %r5, %r3   ! c - d --> %r3
        sub     %r2, %r3, %r2   ! (a + b) - (c - d) --> %r2

        set     a, %r1
        st      %r2, [%r1]      ! (a + b) - (c - d) --> a
end:    ta      0
Activity 2.2 Using a text editor, enter the program shown in Example 2.3 into a file, assemble it, link it, and
test it using isem.
Assembler syntax        Operation implemented
mov rs, rd              reg[rd] = reg[rs]
mov siconst13, rd       reg[rd] = signextend(siconst13)
Because you can always use the set operation to load a 13-bit value to an integer register,
the second version of the mov operation is redundant for integer registers. However, as we
will discuss, this version of the mov operation is used to load the other state registers on
the SPARC.
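The signextend operation can be pictured with a few lines of Python. The sketch below checks whether a constant fits in a signed 13-bit field and shows how a 13-bit field is widened to a full register value. The function names are our own, and the range used (-4096 through 4095) is simply the range of a 13-bit two's complement number.

```python
def fits_simm13(value):
    """True if value can be written as a signed 13-bit constant."""
    return -4096 <= value <= 4095


def sign_extend13(field):
    """Widen a 13-bit field to a signed integer by copying bit 12 upward."""
    field &= 0x1FFF
    return field - 0x2000 if field & 0x1000 else field


def to_register(value):
    """The 32-bit pattern a register would hold for a signed value."""
    return value & 0xFFFFFFFF


print(fits_simm13(23), fits_simm13(5000))
print(hex(to_register(sign_extend13(0x1FFF))))
```

A field of all ones (0x1FFF) therefore stands for -1, and a register loaded from it holds 0xffffffff.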
2.4 Summary
In this lab we have introduced the basics of SPARC assembly language programming. We
began by considering the structure of an assembly language program. We then considered
the names and uses of the integer registers. We then introduced three assembler directives:
.text, .data, and .word. The first two (.text and .data) are used to identify sections of an assembly language program. The last two (.data and .word) are used to declare and initialize
variables. We will consider additional assembler directives in later labs. We concluded the
lab by introducing six assembly language operations: set, load, store, add, sub, and mov.
Figure 2.1 provides a graphical illustration for several of the operations that we have
introduced in this lab. In particular, this figure illustrates the data paths used in the load,
store, addition, and subtraction operations.
[Figure: memory (text and data), the instruction register (ifetch, with a 13-bit siconst field), and the integer registers (%r0-%r31), connected by the load, store, and add/sub data paths.]
Figure 2.1 Illustrating the load, store, add, and subtract operations
The set and mov operations are synthetic (or pseudo) operations. That is, these operations are not really SPARC operations. Instead, the assembler translates these operations
into one or more SPARC operations when it assembles your program. We will consider
synthetic operations in Lab 9.
2.6 Exercises
1. Suppose that the SPARC did not provide a (synthetic) register copy operation. Explain
how you could emulate this operation.
2. For each of the following statements, write, assemble, and test a SPARC assembly
language fragment that implements the statement. Be certain to declare and initialize
all variables in your assembly language programs.
a. a = c + d.
b. a = (c + d) - (c + b + d - e).
c. a = (d - 13) + (a + 23).
d. a = d + 9832.
e. a = 87765 - c.
Laboratory 3
Implementing Control Structures
3.1 Goal
To cover the implementation of control structures using the SPARC instruction set.
3.2 Objectives
After completing this lab, you will be able to write assembly language programs that use:
3.3 Discussion
In this lab we introduce a subset of the SPARC branching operations. In particular, we
introduce the operations that provide conditional and unconditional branching based on
the bits in a condition code register.
We begin by considering the bits in the condition code register of a SPARC processor.
After introducing these bits, we consider the operations that affect the bits in the condition
code register. We then consider the conditional and unconditional branching operations
that use the bits in the condition code register to control branching. Next, we introduce
nullification (annulment) in the branching operations of the SPARC. We conclude by considering several examples to illustrate the SPARC operations and by introducing the compare operation provided by the SPARC.
Operation name
addcc
subcc
cannot be stored in 32 bits, and cleared when the result can be stored in 32 bits. Finally, the
C bit is set when the operation generates a carry out of the most significant bit, and cleared
otherwise.
In most contexts, you will be most interested in the N and Z bits of the condition code
register and we will emphasize these bits in the remainder of this lab. We will consider the
remaining bits in the condition code register (the C and V bits) at greater length in Lab 13.
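To experiment with the condition code bits away from the emulator, the Python sketch below computes N, Z, V, and C for a 32-bit addition and subtraction. It follows the usual two's complement definitions (for subtraction, C is set when a borrow is needed) and reflects our reading of the behavior rather than any ISEM source code; the function names are hypothetical.

```python
MASK = 0xFFFFFFFF


def addcc(a, b):
    """Return (result, flags) for a 32-bit add that sets the condition codes."""
    a &= MASK
    b &= MASK
    full = a + b
    res = full & MASK
    flags = {
        "N": res >> 31,                           # result is negative
        "Z": int(res == 0),                       # result is zero
        "V": ((a ^ res) & (b ^ res)) >> 31,       # signed overflow
        "C": int(full > MASK),                    # carry out of bit 31
    }
    return res, flags


def subcc(a, b):
    """Return (result, flags) for a 32-bit subtract that sets the condition codes."""
    a &= MASK
    b &= MASK
    res = (a - b) & MASK
    flags = {
        "N": res >> 31,
        "Z": int(res == 0),
        "V": ((a ^ b) & (a ^ res)) >> 31,         # signed overflow
        "C": int(b > a),                          # borrow
    }
    return res, flags


print(subcc(1, 1))
print(addcc(0x7FFFFFFF, 1))
```

You can compare the flags this sketch predicts with the values ISEM reports after each trace command.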
Assembler syntax    Branch condition
ba   target         1 (always)
bn   target         0 (never)
bne  target         not Z
be   target         Z
bg   target         not (Z or (N xor V))
ble  target         Z or (N xor V)
bge  target         not (N xor V)
bl   target         N xor V
bgu  target         not (C or Z)
bleu target         C or Z
bcc  target         not C
bcs  target         C
bpos target         not N
bneg target         N
bvc  target         not V
bvs  target         V
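The branch conditions can be written out directly as Boolean expressions. The Python sketch below does this for each operation, so that you can ask whether a branch would be taken for a given setting of the N, Z, V, and C bits. The names BRANCH_CONDITIONS and branch_taken are our own.

```python
# Each entry maps an operation name to its branch condition.
# N, Z, V, and C are the condition code bits, given as 0 or 1.
BRANCH_CONDITIONS = {
    "ba":   lambda N, Z, V, C: True,
    "bn":   lambda N, Z, V, C: False,
    "bne":  lambda N, Z, V, C: not Z,
    "be":   lambda N, Z, V, C: Z,
    "bg":   lambda N, Z, V, C: not (Z or (N ^ V)),
    "ble":  lambda N, Z, V, C: Z or (N ^ V),
    "bge":  lambda N, Z, V, C: not (N ^ V),
    "bl":   lambda N, Z, V, C: N ^ V,
    "bgu":  lambda N, Z, V, C: not (C or Z),
    "bleu": lambda N, Z, V, C: C or Z,
    "bcc":  lambda N, Z, V, C: not C,
    "bcs":  lambda N, Z, V, C: C,
    "bpos": lambda N, Z, V, C: not N,
    "bneg": lambda N, Z, V, C: N,
    "bvc":  lambda N, Z, V, C: not V,
    "bvs":  lambda N, Z, V, C: V,
}


def branch_taken(op, N, Z, V, C):
    """True if the named branch is taken for the given condition code bits."""
    return bool(BRANCH_CONDITIONS[op](N, Z, V, C))


# After a subtraction that produces zero, Z is 1 and the other bits are 0.
print(branch_taken("be", 0, 1, 0, 0), branch_taken("bg", 0, 1, 0, 0))
```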
In addition to the operation names defined in Table 10, the SPARC defines several synonyms for these operations. These synonyms are summarized in Table 3.3.
Like most RISC machines, the SPARC uses a branch delay slot. By default, the instruction following a branch instruction is executed whenever the branch instruction is executed.
Operation name    Synonym for
bnz               bne
bz                be
bgeu              bcc
blu               bcs
SPARC assemblers provide a special (synthetic) operation, nop, for situations when it is
not convenient to put a useful instruction in the delay slot of a branch instruction. In assembly language a nop instruction has no operands (i.e., a nop instruction is fully specified
by the name of the operation). When a nop instruction is executed, it does not alter any
of the registers or values stored in memory. However, the use of nop instructions causes
the processor to execute more instructions and, as such, increases the time required to execute the program. Example 3.1 illustrates the conditional and unconditional branching
operations.
Example 3.1 Translate the following C code fragment into SPARC assembly language.
int temp;
int x = 0;
int y = 0x9;
int z = 0x42;
temp = y;
while( temp > 0 ) {
    x = x + z;
    temp = temp - 1;
}
To simplify the translation, we fill the branch delay slots with nop instructions.
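        .data
x:      .word   0
y:      .word   0x9
z:      .word   0x42

        .text
start:  set     y, %r1
        ld      [%r1], %r2      ! temp (initially y) is kept in %r2
        set     z, %r1
        ld      [%r1], %r3      ! z is kept in %r3
        mov     %r0, %r4        ! x (initially 0) is kept in %r4
        add     %r2, 1, %r2     ! the test at the bottom of the loop subtracts 1
        ba      test
        nop
top:    add     %r4, %r3, %r4   ! x = x + z
test:   subcc   %r2, 1, %r2     ! temp = temp - 1, setting the condition codes
        bg      top
        nop
        set     x, %r1
        st      %r4, [%r1]      ! store x
end:    ta      0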
Activity 3.1 After each trace command, ISEM reports the values of the bits in the condition code register.
Type the program shown in Example 3.1 into a file, assemble it, link it, and load it into ISEM. Trace the program
execution, noting how each instruction affects the bits in the SPARC condition code register.
The SPARC keeps track of the instructions to execute using two program counters: PC,
and nPC. The first program counter, PC, holds the address of the next instruction to execute. The second program counter, nPC, holds the next value for PC. Usually, the SPARC
updates the program counters at the end of each instruction execution by assigning the
current value of nPC to PC, and adding 4 to the value of nPC. When it executes a branching operation, the SPARC assigns the current value of nPC to PC and then updates the
value of nPC. If the branch is taken, nPC is assigned the value of the target specified in the
instruction; otherwise, nPC is incremented by 4. The branch delay slot arises because the
PC is assigned the old value of nPC (before nPC is assigned the target of the branch).
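The update rule for the two program counters is short enough to write down. The Python sketch below applies it to a taken branch, using made-up addresses, and shows that the instruction in the delay slot is the next one executed. The function name step and the addresses are ours, chosen only for illustration.

```python
def step(pc, npc, branch_target=None):
    """Return the (PC, nPC) pair after executing the instruction at pc.

    branch_target is the target address if the instruction is a taken
    branch, and None otherwise.
    """
    new_pc = npc                                   # PC always receives the old nPC
    if branch_target is not None:
        new_npc = branch_target                    # taken branch
    else:
        new_npc = npc + 4                          # everything else
    return new_pc, new_npc


# A taken branch at 0x100 with target 0x200 (hypothetical addresses).
pc, npc = 0x100, 0x104
pc, npc = step(pc, npc, branch_target=0x200)
print(hex(pc), hex(npc))      # the delay slot at 0x104 executes next
pc, npc = step(pc, npc)
print(hex(pc), hex(npc))      # then execution continues at the target
```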
Activity 3.2 After each trace command, ISEM reports the values of PC and nPC. Run the program shown
in Example 3.1 noting the changes to PC and nPC.
3.3.3 Nullification
Every branching instruction can specify that the effect of the instruction in the branch delay slot is to be nullified (annulled in SPARC terminology) if the branch specified by the
conditional branching instruction is not taken. In assembly language, this conditional nullification is specified by appending a suffix of ,a to the name of the branching operation.
Example 3.2 illustrates conditional nullification.
Example 3.2 Rewrite the code fragment shown in Example 3.1 so that the code has meaningful instructions
in the branch delay slots.
        .data
x:      .word   0
y:      .word   0x9
z:      .word   0x42

        .text
start:  set     y, %r1
        ld      [%r1], %r2
        set     z, %r1
        ld      [%r1], %r3
        mov     %r0, %r4

        add     %r2, 1, %r2
top:    subcc   %r2, 1, %r2
        bg,a    top
        add     %r4, %r3, %r4

        set     x, %r1
        st      %r4, [%r1]      ! store x
end:    ta      0
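To see why the annul suffix matters in Example 3.2, the Python sketch below walks through the loop (the subcc, the bg,a, and the add in the delay slot) at the level of register values. It assumes the loop label sits on the subcc instruction and that the values are small enough that no overflow occurs; the function names are hypothetical.

```python
def delay_slot_takes_effect(taken, annul):
    """For a conditional branch: does the delay slot instruction take effect?"""
    return taken or not annul


def run_loop(y, z, annul=True):
    """Mirror the loop of Example 3.2 and return the final value of x."""
    r2 = y + 1                      # add   %r2, 1, %r2
    r3 = z
    r4 = 0                          # mov   %r0, %r4
    while True:
        r2 -= 1                     # subcc %r2, 1, %r2
        taken = r2 > 0              # bg,a  top
        if delay_slot_takes_effect(taken, annul):
            r4 += r3                # add   %r4, %r3, %r4 (delay slot)
        if not taken:
            break
    return r4


print(hex(run_loop(0x9, 0x42)))
print(hex(run_loop(0x9, 0x42, annul=False)))
```

With annulment the add takes effect nine times; without it the add would take effect once more when the loop exits, and x would be too large by z.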
Assembler syntax        Operation implemented
cmp sr1, sr2            reg[sr1] - reg[sr2]
cmp sr, siconst13       reg[sr] - siconst13
3.4 Summary
In this lab we have introduced the condition code register, the basic branching operations,
and the integer comparison operation. The branching operations include two unconditional branch operations (ba and bn) and a host of conditional branching operations. The
SPARC branching operations have a branch delay slot. That is, the instruction following a branch instruction is executed whenever the branch instruction is executed. The
SPARC provides conditional annulment of the instruction in the branch delay slot. When
the branch operation specifies annulment (using the operator suffix ,a) and the branch is not taken, the effects of the
instruction are canceled (note, the instruction is executed, but the execution has no effect).
3.6 Exercises
1. Suppose that your assembler did not provide an integer comparison operation. Explain how you could implement this operation using the other SPARC operations
that we have considered in this and previous labs.
2. Consider the SPARC code presented in Example 3.2. Currently, the loop is executed
y times. If y is larger than z it would be better to execute the loop z times.
Rewrite the code shown in Example 3.2 to take advantage of this observation.
3. Write a SPARC program that has four variables: x , y > , z , and w. Your program
should assign the quotient of x/y to z and the remainder to w. (You should write this
code using the operations presented in this and previous labs. Do not use the SPARC
integer multiplication or division operations.)
4. Write a SPARC program that will compute the greatest common divisor of a and b
and assign this value to c.
Laboratory 4
Multiplication and Division
4.1 Goal
To cover the SPARC operations related to multiplication and division.
4.2 Objectives
After completing this lab, you will be able to write assembly language programs that use:
4.3 Discussion
In this lab we consider the SPARC operations related to integer multiplication and division.
We begin by considering the signed integer multiplication and division operations.
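[Figure: smul/umul take their operands from the integer registers (%r0-%r31) and place the most significant 32 bits of the product in the Y register and the least significant 32 bits in an integer register. sdiv/udiv take their operands from the Y register and the integer registers and place the quotient in an integer register.]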
Table 4.1 Assembly language formats for the integer multiplication and division operations
Operation                 Assembler syntax            Operation implemented
integer multiplication    mul-op rs1, rs2, rd         {reg[%y], reg[rd]} = reg[rs1] * reg[rs2]
                          mul-op rs, iconst13, rd     {reg[%y], reg[rd]} = reg[rs] * iconst13
integer division          div-op rs1, rs2, rd         reg[rd] = {reg[%y], reg[rs1]} / reg[rs2]
                          div-op rs, iconst13, rd     reg[rd] = {reg[%y], reg[rs]} / iconst13
Notes:
iconst13 denotes an integer constant. This constant is signed when it is used with a signed
operation (e.g., smul) and unsigned when it is used with an unsigned operation (e.g., umul).
The value must be represented in 13 bits.
{x, y} denotes a 64-bit value (or storage location) constructed from two 32-bit values x and
y. The first of these values, x, is the most significant.
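The role of the Y register is easier to see with numbers. The Python sketch below models a signed multiplication that splits its 64-bit product between Y and the destination register, and a signed division that takes its 64-bit dividend from Y and a source register. It is a simplified model under stated assumptions: it truncates quotients toward zero, and it ignores overflow and division by zero. The function names are our own.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def smul(rs1, rs2):
    """Return (y, rd): the upper and lower 32 bits of the signed product."""
    product = (to_signed(rs1, 32) * to_signed(rs2, 32)) & MASK64
    return product >> 32, product & MASK32


def sdiv(y, rs1, rs2):
    """Return rd: the signed quotient of the 64-bit value {y, rs1} and rs2."""
    dividend = to_signed(((y & MASK32) << 32) | (rs1 & MASK32), 64)
    divisor = to_signed(rs2, 32)
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient            # truncate toward zero
    return quotient & MASK32


y, low = smul(0x42, 0x43)
print(hex(y), hex(low), hex(sdiv(y, low, 0x44)))
```

Note that a negative product leaves Y filled with ones, which is why a division that follows a multiplication can use Y as the upper half of its dividend without further adjustment.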
Table 4.2 The signed and unsigned integer multiplication and division operations
Operation                          Operation names
signed integer multiplication      smul    smulcc
unsigned integer multiplication    umul    umulcc
signed integer division            sdiv    sdivcc
unsigned integer division          udiv    udivcc
a = (a * b)/c
        .data
a:      .word   0x42
b:      .word   0x43
c:      .word   0x44

        .text
start:  set     a, %r1
        ld      [%r1], %r2
        set     b, %r1
        ld      [%r1], %r3
        set     c, %r1
        ld      [%r1], %r4

        smul    %r2, %r3, %r2   ! a * b --> %r2
        sdiv    %r2, %r4, %r2   ! %r2 / c --> %r2

        set     a, %r1
        st      %r2, [%r1]      ! %r2 --> a
end:    ta      0
The signed and unsigned operations are distinguished by the way they interpret their
operands. The signed operations interpret their source operands as signed integers and
produce signed integer results. The unsigned operations interpret their source operands
as unsigned integers and produce unsigned integer results.
Assembler syntax        Operation implemented
mov %y, rd              reg[rd] = reg[%y]
mov rs, %y              reg[%y] = reg[rs]
mov siconst13, %y       reg[%y] = siconst13
When you store a value into the Y register (using a mov instruction), it takes three
instruction cycles before the Y register is actually updated. This means that you need to
make sure there are at least three instructions between an instruction that uses the Y register
as a destination and an instruction that uses the value stored in the Y register.
Writing to the Y register Remember, always make sure that there are at least three instructions between any instruction that writes to the %y register and an instruction that
uses the value in the %y register.
Example 4.2 Write a SPARC assembly language fragment to evaluate the statement a = (a + b)/c. Again,
you should assume that a, b, and c are signed integers and that all results can be represented in 32 bits.
        .data
a:      .word   0x42
b:      .word   0x43
c:      .word   0x44

        .text
start:  mov     %r0, %y
        ! AT LEAST 3 INSTRUCTIONS BETWEEN THE MOV
        ! AND SDIV INSTRUCTIONS
        set     a, %r1
        ld      [%r1], %r2
        set     b, %r1
        ld      [%r1], %r3
        set     c, %r1
        ld      [%r1], %r4

        add     %r2, %r3, %r2   ! a + b --> %r2
        sdiv    %r2, %r4, %r2   ! %r2 / c --> %r2

        set     a, %r1
        st      %r2, [%r1]      ! %r2 --> a
end:    ta      0
4.4 Summary
In this lab we have covered the integer multiplication and division operations provided
by the SPARC. As with the other arithmetic operations (add and sub), there are versions of
the multiplication and division operations that update the condition code bits and other
multiplication and division operations that do not alter the condition code bits.
All of the multiplication and division operations use a special purpose register, the Y register (%y). You can use the mov operation to examine and set the contents of
the Y register.
4.6 Exercises
Laboratory 5
Bit Manipulation and Character
I/O
5.1 Goal
To cover the bit manipulation operations provided by the SPARC and the character I/O
traps provided by ISEM.
5.2 Objectives
After completing this lab, you will be able to write assembly language programs that use:
5.3 Discussion
In this lab we introduce the bit manipulation operations of the SPARC. In particular, we
consider the logical operations and, or, and xor. In addition, we consider the synthetic
operation not. We follow this with a discussion of the shift operations sll, srl, and sra. The
lab ends with a short discussion of the character I/O facilities provided by ISEM.
0x42
0x43
0x44
0x45
33
a = (a&b) ^ (cjd).
34
start:
end:
Example 5.2
bit).
Assembler syntax             Operation implemented
and   rs1, rs2, rd           reg[rd] = reg[rs1] & reg[rs2]
and   rs1, siconst13, rd     reg[rd] = reg[rs1] & siconst13
or    rs1, rs2, rd           reg[rd] = reg[rs1] | reg[rs2]
or    rs1, siconst13, rd     reg[rd] = reg[rs1] | siconst13
xor   rs1, rs2, rd           reg[rd] = reg[rs1] ^ reg[rs2]
xor   rs1, siconst13, rd     reg[rd] = reg[rs1] ^ siconst13
andn  rs1, rs2, rd           reg[rd] = reg[rs1] & ~reg[rs2]
andn  rs1, siconst13, rd     reg[rd] = reg[rs1] & ~siconst13
orn   rs1, rs2, rd           reg[rd] = reg[rs1] | ~reg[rs2]
orn   rs1, siconst13, rd     reg[rd] = reg[rs1] | ~siconst13
xnor  rs1, rs2, rd           reg[rd] = reg[rs1] ^ ~reg[rs2]
xnor  rs1, siconst13, rd     reg[rd] = reg[rs1] ^ ~siconst13
        .text
        set     a, %r1
        ld      [%r1], %r2      ! a --> %r2
        set     b, %r1
        ld      [%r1], %r3      ! b --> %r3
        set     c, %r1
        ld      [%r1], %r4      ! c --> %r4
        set     d, %r1
        ld      [%r1], %r5      ! d --> %r5
        and     %r2, %r3, %r2   ! a & b --> %r2
        or      %r4, %r5, %r4   ! c | d --> %r4
        xor     %r2, %r4, %r2   ! (a & b) ^ (c | d) --> %r2
        set     a, %r1
        st      %r2, [%r1]      ! %r2 --> a
        ta      0
        .data
n:      .word   0xaaaaaaaa
        .text
start:  set     n, %r1
        ld      [%r1], %r2      ! n --> %r2
        andn    %r2, 0x1fe0, %r2 ! %r2 & ~0x1fe0 --> %r2
        st      %r2, [%r1]      ! %r2 --> n
end:    ta      0
In this case, the mask (0x1fe0) can be represented using 12 bits and, as such, we can use an andn
instruction with an immediate value.
Activity 5.1
Operation name
andcc
orcc
xorcc
andncc
orncc
xnorcc
Assembler syntax     Operation implemented
not  rs, rd          reg[rd] = ~reg[rs]
not  rd              reg[rd] = ~reg[rd]
Assembler syntax            Operation implemented
sll  rs1, rs2, rd           reg[rd] = reg[rs1] << reg[rs2]
sll  rs1, siconst13, rd     reg[rd] = reg[rs1] << siconst13
srl  rs1, rs2, rd           reg[rd] = reg[rs1] >> reg[rs2]
srl  rs1, siconst13, rd     reg[rd] = reg[rs1] >> siconst13
sra  rs1, rs2, rd           reg[rd] = reg[rs1] >> reg[rs2], vacated bits filled with bit 31 of reg[rs1]
sra  rs1, siconst13, rd     reg[rd] = reg[rs1] >> siconst13, vacated bits filled with bit 31 of reg[rs1]
As shown in Table 5.4, the shift operations have two source operands. The first operand
specifies the value to be shifted. This value must be stored in an integer register. The second source operand specifies the amount of the shift. This operand may be stored in an
integer register or it may be a small constant value. The processor only uses the least significant 5 bits of the second source operand (and ignores the remaining bits of this operand).
Example 5.3 illustrates the shift operations.
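The behavior described above can be checked with a small model. The following Python sketch is not part of ISEM; the function names simply mirror the SPARC mnemonics, and registers are modeled as 32-bit unsigned values. It shows that only the least significant 5 bits of the shift count are used and that sra replicates bit 31.

```python
MASK32 = 0xFFFFFFFF  # registers are modeled as 32-bit unsigned values

def sll(value, count):
    """Shift left logical: zeros enter from the right."""
    return (value << (count & 0x1F)) & MASK32

def srl(value, count):
    """Shift right logical: zeros enter from the left."""
    return (value & MASK32) >> (count & 0x1F)

def sra(value, count):
    """Shift right arithmetic: copies of bit 31 enter from the left."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32          # reinterpret the pattern as a negative integer
    return (value >> (count & 0x1F)) & MASK32

print(hex(sll(0x80000001, 1)))    # 0x2
print(hex(srl(0x80000000, 4)))    # 0x8000000
print(hex(sra(0x80000000, 4)))    # 0xf8000000
print(hex(sll(1, 33)))            # 0x2, because 33 & 0x1f == 1
```

The last line is the case worth remembering: a shift count of 33 behaves like a shift count of 1.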
Example 5.3 Write a SPARC program to count the number of bits that are set (i.e., 1) in the memory location
n (a variable). The result should be stored in %r2. In writing this code, you may use any of the remaining
registers as temporaries.
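        .data
n:      .word   0xaaaaaaaa
        .text
start:  set     n, %r1
        ld      [%r1], %r3      ! n --> %r3
        clr     %r2             ! the count starts at zero
loop:   andcc   %r3, 1, %r0     ! test the least significant bit
        be      cont
        nop
        inc     %r2             ! the bit is set, count it
cont:   srl     %r3, 1, %r3     ! move on to the next bit
        cmp     %r3, 0
        bne     loop            ! continue while any bits remain
        nop
end:    ta      0

        ta      1
        ta      2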
The first of these instructions prints the character in the least significant byte of register
%r8 (= %o0) to standard output and the second reads a character from standard input and
places the result in the least significant byte of %r8, clearing the most significant 24 bits of
this register. Example 5.4 illustrates the use of these I/O instructions.
Example 5.4 Write a SPARC/ISEM program that reads a one digit number, adds five to the number and
prints the result.
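        .text
start:  ta      2               ! read a character into %r8
        sub     %r8, '0', %r8   ! convert the digit to a number
        add     %r8, 5, %r8     ! add five
        cmp     %r8, 9
        ble     one_dg          ! a one digit result?
        nop
        mov     %r8, %r7        ! save the result
        set     '1', %r8
        ta      1               ! print the leading '1'
        mov     %r7, %r8        ! restore the result
        sub     %r8, 10, %r8    ! keep the ones digit
one_dg: add     %r8, '0', %r8   ! convert the number to a digit
        ta      1               ! print the digit
        ta      0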
5.4 Summary
In this lab we have introduced the bitwise and shift operations provided by the SPARC. In
addition, we have introduced the notation used for character data and primitive mechanisms for character input and output. Example 5.5 illustrates all of these operations.
Example 5.5 Write a SPARC/ISEM program to print the binary representation of the unsigned integer in
memory location n. In writing the code, you may use any of the registers as temporaries.
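        .data
n:      .word   0xaaaaaaaa
        .text
start:  set     n, %r1
        ld      [%r1], %r2      ! n --> %r2
        set     1<<31, %r3      ! the mask starts at the most significant bit
        set     32, %r4         ! the count of bits to print
loop:   andcc   %r2, %r3, %r0   ! test the current bit
        be      print
        set     '0', %r8        ! assume the bit is clear (bd)
        set     '1', %r8        ! the bit is set
print:  ta      1               ! print %r8
        deccc   %r4             ! decrement count
        bg      loop            ! continue until count == 0
        srl     %r3, 1, %r3     ! shift mask right (bd)
end:    ta      0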
5.5 Exercises
1. Write a SPARC program that compares the memory contents of the word pointed to
by register %r2 to the contents of register %r3 on a bit-by-bit basis. For all bits i, if
the value of bit i of [%r2] is smaller than the value of bit i of %r3, set bit i of %r3 to 1;
otherwise, set bit i of %r3 to 0.
2. Write a SPARC program that distinguishes ASCII-coded hexadecimal digits from
other bytes and which converts valid digit codes to the corresponding hexadecimal
value. Only valid digit codes are to be converted. Leave invalid characters unaltered.
Suppose that register %r2 holds the character on program entry, and that %r2
holds the result on program exit. Use register %r3 to report on validity. If valid, %r3
must equal 0; if invalid, %r3 must equal 1.
Digit   Code         Hex Value
0       0x30         0x0
...     ...          ...
9       0x39         0x9
a,A     0x61,0x41    0xa
...     ...          ...
f,F     0x66,0x46    0xf
3. Write a SPARC program to print the hexadecimal representation of an unsigned integer. The unsigned integer should be named n and declared in the data segment. Your
program should finish by printing a newline.
Laboratory 6
Assembler Directives, Assembler
Expressions, and Addressing
Modes
6.1 Goal
To cover several assembler directives and assembler expressions provided by the GNU
assembler (gas) and the SPARC addressing modes.
6.2 Objectives
After completing this lab, you will be able to write assembly language programs that use:
6.3 Discussion
In this lab we introduce three new assembler directives: a directive to allocate space, a
directive to define symbolic constants, and a directive to include header files. After
describing these directives, we discuss assembler expressions and introduce the distinction
between relocatable values and absolute values. We conclude this lab by discussing the
memory addressing modes provided by the SPARC and the SETHI instruction.
is an expression that defines the value of the symbol. This directive can be written using
standard directive syntax (e.g., .set symbol, expression), or it can be written using the infix
= operator (e.g., symbol = expression).
In many cases, you will want to collect a group of definitions for symbolic constants into
a header file that can be included in several different programs or modules. (By including
the same file in each of the programs or modules, you can be sure that all of the programs
and modules use the same values for the symbolic constants.) The .include directive supports this style of programming. This directive takes a single argument, a string that gives
the name of the file to include. The code from the included file logically replaces the .include
directive. When the assembler is finished processing the included file, it resumes after the
.include directive in the original file.
Table 6.1 summarizes the directives that we have introduced in this section.
Table 6.1 Assembler directives
Operation            Assembler syntax
Allocate space       .skip n
Symbolic constant    .set symbol, expression
                     symbol = expression
Include file         .include filename
Operator   Operation               Precedence
-          Unary minus             highest
+          Unary plus              highest
*          Multiplication          high middle
/          Division                high middle
<<         Left shift              high middle
>>         Right shift             high middle
|          Bitwise inclusive or    low middle
&          Bitwise and             low middle
^          Bitwise exclusive or    low middle
!          Bitwise and not         low middle
+          Addition                lowest
-          Binary subtraction      lowest
You can use absolute values with any of the operators. If all of the operands for an
operator are absolute values, the expression using the operator is an absolute value.
You can only use relocatable values in expressions using the binary addition and subtraction operators. When you use binary addition, at most one operand can be a relocatable
value, the other operand must be an absolute value. If one operand is relocatable, the value
of the expression is relocatable.
When you use binary subtraction, you cannot subtract a relocatable value from an absolute value.
When you subtract an absolute value from a relocatable value, the result is a relocatable value.
When you subtract two relocatable values, the two values must be defined
in the same assembler segment (e.g., text or data), and the result is an absolute value.
Addressing mode      Assembler syntax    Implementation      Effective address
                     [r1+r2]             basic mode          reg[r1]+reg[r2]
                     [r1+siconst13]      basic mode          reg[r1]+siconst13
                     [siconst13+r1]                          siconst13+reg[r1]
                     [r1-siconst13]                          reg[r1]-siconst13
register indirect    [r]                 [r+%r0]             reg[r]
direct memory        [siconst13]         [%r0+siconst13]     siconst13
Addressing modes can only be used with the load and store instructions. Table 6.4
summarizes the load word and store word operations.
Assembler syntax     Operation implemented
ld  address, rd      reg[rd] = memory[eff_addr(address)]
st  rs, address      memory[eff_addr(address)] = reg[rs]
Notes:
address
eff_addr(x)
Example 6.1 Show how you would load a 32-bit value, bignum, into %r2 without using the set instruction.
        .set    bignum, 0x87654321
        sethi   bignum>>10, %r2
        or      %r2, bignum&0x3ff, %r2
        ta      0
Note the use of the expression bignum>>10 to extract the most significant 22 bits
of bignum and the expression bignum&0x3ff to extract the least significant 10 bits of
bignum. To make your code more readable, SPARC assemblers provide two special operators: %hi(x) yields the most significant 22 bits of x, while %lo(x) yields the least significant
10 bits of x. Note that these operators are written using function call notation.
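The arithmetic behind these two operators is easy to check by hand. The Python sketch below is only a model of the expressions; the function names hi and lo are stand-ins for the assembler operators, and the shift by 10 models sethi placing its 22-bit operand in the upper bits of the register while clearing the low 10 bits.

```python
def hi(x):
    """Most significant 22 bits of a 32-bit value, as %hi(x) yields."""
    return (x >> 10) & 0x3FFFFF

def lo(x):
    """Least significant 10 bits of a 32-bit value, as %lo(x) yields."""
    return x & 0x3FF

bignum = 0x87654321
after_sethi = hi(bignum) << 10         # register contents after the sethi
after_or = after_sethi | lo(bignum)    # register contents after the or

print(hex(hi(bignum)))     # 0x21d950
print(hex(lo(bignum)))     # 0x321
print(hex(after_sethi))    # 0x87654000
print(hex(after_or))       # 0x87654321
```

Because the low 10 bits are zero after the sethi, the or cannot disturb the upper 22 bits, which is why the two pieces always recombine into the original value.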
Example 6.2 Rewrite the code fragment given in Example 6.1, using the %hi and %lo operators.
        .set    bignum, 0x87654321
        sethi   %hi(bignum), %r2
        or      %r2, %lo(bignum), %r2
        ta      0
Example 6.3 Write an assembly language fragment to sum up the elements in the array. Give directives to
declare an array of 20 words and an additional word to hold the sum.
        .data
arr:    .skip   20*4
sum:    .word   0
        .text
start:  set     arr, %r2        ! %r2 is the base address
        mov     %r0, %r3        ! %r3 is the index value
        mov     %r0, %r4        ! %r4 is the running sum
        set     20, %r5         ! %r5 is the number of elems to add
loop:   ld      [%r2+%r3], %r6  ! load the next element
        add     %r4, %r6, %r4   ! add it to the running sum
        subcc   %r5, 1, %r5     ! one fewer element to add
        bne     loop            ! continue until no elements remain
        add     %r3, 4, %r3     ! advance the index by one word (bd)
        sethi   %hi(sum), %r1
        st      %r4, [%r1+%lo(sum)]
end:    ta      0
Note that the code in Example 6.3 stores the result into sum using a sethi instruction
followed by a st instruction. In previous examples we have used a set instruction followed
by a st instruction to accomplish the same task. However, the set instruction is actually a
synthetic instruction, and the assembler implements it using a sethi instruction
followed by an or instruction (as we showed in Example 6.1). As such, our earlier
code actually requires three instructions for every load or store. Using the sethi instruction
directly, we can avoid an unnecessary instruction.
6.4 Summary
6.5 Review Questions
6.6 Exercises
1. The %hi operator yields the most significant 22 bits while %lo operator yields the
least significant 10 bits of a 32-bit value. Considering that the SPARC uses 13-bit
signed integers in lots of contexts, it might seem that it would be better to have the
%hi and %lo operators yield 19 and 13 bits respectively. What problems would this
cause?
2. Write a SPARC assembly language program to count the number of ones in a bit
string. The bit string should be named BitString and the length (in bits) of the bit
string should be named Length. The result should be stored as a word named Count.
Note: there is no limit on the number of bits in the bit string.
3. Given an array, A, and the number of elements in the array, n, write a SPARC program
to sort the array. You may use any method you like to sort the array.
Laboratory 7
Operand Sizes and Unsigned
Values
7.1 Goal
To complete our coverage of the load and store instructions provided by the SPARC and to
cover a collection of useful synthetic operations.
7.2 Objectives
After completing this lab, you will be able to write assembly language programs that use:
7.3 Discussion
In this lab we introduce the assembler directives and operations associated with different sized
operands and unsigned operands. We begin by considering the operand sizes
supported by the SPARC. Then we consider assembler directives used to allocate different
amounts of memory. Next, we consider the load and store operations for different sized operands.
Then we consider the load operations for unsigned operands. Finally, we conclude
by considering a small collection of useful synthetic operations.
Size
8 bits
16 bits
32 bits
64 bits
Size
8 bits
16 bits
32 bits
64 bits
The SPARC load and store operations require that halfword values be aligned on even
addresses (i.e., halfword alignment) and that word and double word values be aligned on
addresses that are a multiple of four (i.e., word aligned). You can use the .align directive to
make sure that your variables are aligned as needed. This directive takes two arguments,
a number and an optional pad value. When the assembler encounters an .align directive,
it makes sure that the next address in the current assembler segment is a multiple of the
first argument. For example, the directive .align 8 will ensure that the next address is
a multiple of 8. To ensure that the next address meets the alignment requirements, the
assembler emits pad bytes. If the second argument is supplied, the assembler uses this
value when it emits pad bytes; otherwise, the assembler emits zeros. Example 7.1 illustrates
the size and alignment directives.
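The number of pad bytes the assembler must emit follows directly from the current address and the requested alignment. The Python sketch below illustrates that arithmetic only; it is not how gas is implemented, and the variable names (flag, count, total) and the starting address of 0 are hypothetical.

```python
def pad_bytes(address, alignment):
    """Pad bytes needed to move address up to the next multiple of alignment."""
    return (-address) % alignment

def layout(items, start=0):
    """Assign addresses to (name, alignment, size) items, padding as needed."""
    addresses = {}
    address = start
    for name, alignment, size in items:
        address += pad_bytes(address, alignment)   # what .align would skip
        addresses[name] = address
        address += size
    return addresses

print(pad_bytes(13, 8))    # 3, so the next address is 16
# Hypothetical variables: a byte, then a halfword, then a word.
print(layout([("flag", 1, 1), ("count", 2, 2), ("total", 4, 4)]))
# {'flag': 0, 'count': 2, 'total': 4}
```

Note that an address that is already aligned needs no padding, so an .align directive placed in front of an aligned variable costs nothing.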
Example 7.1 Consider the following C declarations. Assuming that a character is one byte, a short integer is
two bytes, and an integer is four bytes, give assembler directives to allocate and initialize the memory specified
by these declarations.
short int short1 = 22;
char ch1 = 'a';
short int short2 = 33;
char ch2 = 'A';
int int1 = 0;
        .data
        .align  2               ! halfword align
short1: .hword  22              ! allocate and initialize a halfword
ch1:    .byte   'a'             ! allocate and initialize a byte
        .align  2               ! halfword align
short2: .hword  33              ! allocate and initialize a second halfword
ch2:    .byte   'A'             ! allocate and initialize a second byte
        .align  4               ! word align
int1:   .word   0               ! allocate and initialize a word
Load   Store   Notes
ldb    stb
ldh    sth     the address must be halfword aligned
ld     st      the address must be word aligned
ldd    std     the address must be word aligned and the register must be an even number
The load operations take two operands: the (source) memory address followed by the
(destination) register. Similarly, the store operations take two operands: the (source) register
followed by the (destination) memory address.
The load byte, halfword, and word operations set all 32 bits of the destination register.
The load byte operation (ldb) fetches an 8-bit value, sign extends this value to 32 bits,
and loads the resulting value into the destination register. The load halfword operation (ldh)
fetches a 16-bit value and sign extends this value to 32 bits. The load word operation (ld)
fetches a 32-bit value and loads this value into the destination register. The load doubleword
operation (ldd) fetches two 32-bit values and loads them into consecutive registers, starting
with the register specified in the instruction (e.g., %r2 and %r3).
The store byte operation (stb) stores the least significant 8 bits of the source register
to the destination memory location. The store halfword operation (sth) stores the least
significant 16 bits of the source register into the destination memory location. The store
word operation (st) stores the contents of a register to the destination memory address. The
store doubleword operation (std) stores the contents of two consecutive registers (starting
with the register specified in the instruction).
The memory addresses used with the operations that load and store halfwords must
be even (i.e., halfword aligned). The memory addresses used with the operations that
load and store words and doublewords must be multiples of four (i.e., word aligned). The
register used with the operations that load and store doublewords must be even (e.g., %r2
but not %r3). When these operations are used, the most significant 32 bits are stored in the
even register.
Table 7.4 The load and store instructions
Operation           Instruction syntax    Operation implemented
load byte           ldb  address, rd      reg[rd] = signextend( memory[address]8 )
load halfword       ldh  address, rd      reg[rd] = signextend( memory[address]16 )
load word           ld   address, rd      reg[rd] = memory[address]32
load doubleword     ldd  address, rd      reg[rd] = memory[address]32
                                          reg[rd+1] = memory[address+4]32
store byte          stb  rs, address      memory[address]8 = reg[rs]8
store halfword      sth  rs, address      memory[address]16 = reg[rs]16
store word          st   rs, address      memory[address]32 = reg[rs]32
store doubleword    std  rs, address      memory[address]32 = reg[rs]32
                                          memory[address+4]32 = reg[rs+1]32
Example 7.2 Rewrite the code presented in Example 6.3 using an array of 20 bytes (i.e., chars) instead of
words. (You should still store the sum in a word.)
        .data
arr:    .skip   20
sum:    .word   0
        .text
start:  set     arr, %r2        ! %r2 is the base address
        mov     %r0, %r3        ! %r3 is the index value
        mov     %r0, %r4        ! %r4 is the running sum
        set     20, %r5         ! %r5 is the number of elems to add
loop:   ldb     [%r2+%r3], %r6  ! load the next byte (sign extended)
        add     %r4, %r6, %r4   ! add it to the running sum
        subcc   %r5, 1, %r5     ! one fewer element to add
        bne     loop            ! continue until no elements remain
        add     %r3, 1, %r3     ! advance the index by one byte (bd)
        sethi   %hi(sum), %r1
        st      %r4, [%r1+%lo(sum)]
end:    ta      0
Instruction syntax     Operation implemented
ldub  address, rd      reg[rd] = zerofill( memory[address]8 )
lduh  address, rd      reg[rd] = zerofill( memory[address]16 )
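The difference between the signextend and zerofill notations used in the tables only shows when the most significant bit of the loaded value is set. The Python sketch below models the two functions on 32-bit registers; the function names are taken from the tables, and the implementation is an illustration rather than ISEM code.

```python
def signextend(value, bits):
    """Extend a bits-wide value to 32 bits by replicating its top bit."""
    low_mask = (1 << bits) - 1
    value &= low_mask
    if value & (1 << (bits - 1)):            # top bit set: fill with ones
        value |= 0xFFFFFFFF & ~low_mask
    return value

def zerofill(value, bits):
    """Extend a bits-wide value to 32 bits by filling with zeros."""
    return value & ((1 << bits) - 1)

print(hex(signextend(0xF0, 8)))      # 0xfffffff0, as ldb would load
print(hex(zerofill(0xF0, 8)))        # 0xf0, as ldub would load
print(hex(signextend(0x7F, 8)))      # 0x7f, identical when the top bit is clear
print(hex(signextend(0x8000, 16)))   # 0xffff8000, as ldh would load
```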
Rewrite the code presented in Example 6.3 using the operations defined in Table 7.6.
        .data
arr:    .skip   20*4
sum:    .word   0
        .text
start:  set     arr, %r2        ! %r2 is the base address
        clr     %r3             ! %r3 is the index value
        clr     %r4             ! %r4 is the running sum
        set     20, %r5         ! %r5 is the number of elems to add
loop:   ld      [%r2+%r3], %r6
        add     %r4, %r6, %r4
        deccc   %r5
        bne     loop
        inc     4, %r3
        sethi   %hi(sum), %r1
        st      %r4, [%r1+%lo(sum)]
end:    ta      0
Instruction syntax        Operation implemented
clr    rd                 reg[rd] = 0
clr    address            memory[address]32 = 0
clrh   address            memory[address]16 = 0
clrb   address            memory[address]8 = 0
neg    rd                 reg[rd] = -reg[rd]
neg    rs, rd             reg[rd] = -reg[rs]
inc    rd                 reg[rd] = reg[rd] + 1
inc    siconst13, rd      reg[rd] = reg[rd] + siconst13
inccc  rd                 reg[rd] = reg[rd] + 1
inccc  siconst13, rd      reg[rd] = reg[rd] + siconst13
dec    rd                 reg[rd] = reg[rd] - 1
dec    siconst13, rd      reg[rd] = reg[rd] - siconst13
deccc  rd                 reg[rd] = reg[rd] - 1
deccc  siconst13, rd      reg[rd] = reg[rd] - siconst13
7.4 Summary
7.5 Review Questions
1. Explain why the SPARC does not provide unsigned store operations.
7.6 Exercises
1. Suppose that the SPARC did not have a load unsigned byte operation. Explain how
you could implement this operation using the load byte operation (recall that this operation
always sign extends the value being loaded). Note that the Intel i860 processor
provides a load byte operation but does not provide a load unsigned byte operation.
Laboratory 8
The ISEM Graphics Accelerator
8.1 Goal
To cover uses of the graphics accelerator device provided by ISEM. (Currently, this device
is only available in the X11 environment.)
8.2 Objectives
After completing this lab, you will be able to write assembly language programs that use:
8.3 Discussion
In the previous labs, we have focused on assembly language programming, presenting
SPARC instructions and assembler directives. Now, it's time for some fun! In this lab
we present the ISEM graphics accelerator device, gx. In addition to showing you how a
simple device works, this lab will give you an opportunity to review the assembly language
constructs covered in the previous labs.
When you open the gx device, it creates a black and white graphics window. By issuing
gx commands, you can instruct the gx device to draw lines, fill rectangles, and copy
rectangles in the window. The visible gx display is 512 x 512 pixels. Individual pixels in
the visible region are addressed by an (x, y) pair. The pixel in the upper left corner of the
display is addressed by the pair (0, 0). The pixel in the lower right corner is addressed by
the pair (511, 511).
In addition to the visible pixels, the gx device provides a 512x64 rectangle of pixels
that are not displayed. These pixels are commonly used with the blt operation (described
later in this lab). They are addressed using the pixel addresses (0, 512) through (511, 575).
Figure 8.1 illustrates the pixel addresses provided by the gx device.
The gx device is a memory mapped device. This means that the gx device registers are
mapped into memory locations and can be accessed using the standard load and
store operations. (The SPARC architecture doesn't provide any special I/O instructions, so
all devices must be memory mapped when they are used with a SPARC.) The gx device
has 256 registers: a status register, a command register, and 254 argument registers. Each
register is one word (four bytes, 32 bits) wide.
The status register is mapped into memory location 0x100000. Storing a value into this
location has no effect; however, when you load a register using this memory location, you
will actually read the status register of the gx device. The value of the status register is 0
when the gx device hasn't been (opened and) displayed. This register has the value 1 when
the device has been opened and mapped onto the display. The gx command register is
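[Figure 8.1: pixel addresses provided by the gx device. The visible (displayed) region is 512 x 512 pixels, from (0,0) in the upper left corner to (511,511) in the lower right corner. The hidden (not displayed) region below it is 512 x 64 pixels, from (0,512) to (511,575).]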
ISEM memory     gx device
0x100000        status
0x100004        command
0x100008        arg 1
0x10000c        arg 2
0x100010        arg 3
...             ...
0x1003fc        arg 254
Command   Name    Arguments          Operation performed
0         open                       open the device
1         close                      close the device
2         color   n                  set the drawing color (n = 0 for black, n = 1 for white)
3         op      n                  set the drawing function (see Table 8.2)
4         line    x1 y1 x2 y2        draw a line from pixel (x1, y1) to (x2, y2)
5         fill    x y w h            fill the rectangle with upper left corner (x, y), width w, and height h
6         blt     x1 y1 w h x2 y2    copy the rectangle with upper left corner (x1, y1), width w, and height h
                                     to the rectangle with upper left corner (x2, y2)
filling a rectangle) or the source pixel (when copying a rectangle) to the destination pixel.
Table 8.2 summarizes the other drawing functions provided by the gx device.
The line command draws a line from one pixel to another based on the current drawing
color and function. This command takes four arguments: two arguments to specify the
coordinates for each pixel.
The fill command fills a rectangle based on the current drawing color and function.
This command takes four arguments: the first two arguments specify the coordinates of the
upper left corner of the rectangle, the third argument specifies the width of the rectangle,
and the fourth argument specifies the height of the rectangle.
The blt command copies (subject to the drawing function) a rectangle from one part of
the gx memory to another. This command takes six arguments: the first four arguments
specify the source rectangle, the last two specify the upper left corner of the destination
rectangle.
To issue a gx command, you first store the arguments in the gx argument registers and
then store the command into the gx command register. The order in which you perform
these stores is critical. You must store into the command register after you have stored into the
argument registers. The gx device reads its argument registers as soon as you store a value
in the command register.
Value   Drawing operation
0x0     0
0x1     source AND destination
0x2     source AND NOT destination
0x3     source
0x4     NOT source AND destination
0x5     destination
0x6     source XOR destination
0x7     source OR destination
0x8     NOT source AND NOT destination
0x9     NOT source XOR destination
0xa     NOT destination
0xb     source OR NOT destination
0xc     NOT source
0xd     NOT source OR destination
0xe     NOT source OR NOT destination
0xf     1
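One way to read Table 8.2 is to treat each 4-bit function value as a truth table indexed by the source and destination pixel values. This is an observation about the table, not a description of how ISEM implements the device. The Python sketch below (the name draw and the bit indexing are illustrative assumptions) checks that reading against all sixteen rows.

```python
def draw(function, source, destination):
    """Evaluate drawing function 0x0..0xf on one-bit source and destination pixels."""
    # Bit 0 of the function covers (s=1, d=1), bit 1 (1, 0), bit 2 (0, 1), bit 3 (0, 0).
    index = (1 - source) * 2 + (1 - destination)
    return (function >> index) & 1

# The sixteen rows of Table 8.2, written out on one-bit values (1 - x is NOT x).
TABLE_8_2 = [
    lambda s, d: 0,
    lambda s, d: s & d,
    lambda s, d: s & (1 - d),
    lambda s, d: s,
    lambda s, d: (1 - s) & d,
    lambda s, d: d,
    lambda s, d: s ^ d,
    lambda s, d: s | d,
    lambda s, d: (1 - s) & (1 - d),
    lambda s, d: (1 - s) ^ d,
    lambda s, d: 1 - d,
    lambda s, d: s | (1 - d),
    lambda s, d: 1 - s,
    lambda s, d: (1 - s) | d,
    lambda s, d: (1 - s) | (1 - d),
    lambda s, d: 1,
]

for function, expected in enumerate(TABLE_8_2):
    for s in (0, 1):
        for d in (0, 1):
            assert draw(function, s, d) == expected(s, d)
print("all 16 drawing functions match the table")
```

For example, function 0x3 simply copies the source pixel, while function 0x6 (XOR) lets you draw a shape and later erase it by drawing it a second time.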
Symbolic constants that make devices, like the gx device, easier to use are commonly
defined in header files. Figure 8.3 presents a header file for the gx device. Example 8.1
illustrates the gx device and a use of the gx header file.
Example 8.1 Write an ISEM program that opens the gx device and draws a line from the upper left corner
to lower right corner.
        .include "gx.h"
        .text
main:   set     GX_BUFFER, %r1
        st      %r0, [%r1+GX_CMD]       ! open display
        ld      [%r1+GX_STATUS], %r2    ! load status word
wait:   cmp     %r2, 0
        be      wait                    ! wait until the device is displayed
        ld      [%r1], %r2              ! reload status word (bd)
        ! set up the command arguments
        st      %r0, [%r1+GX_LINE_X1]   ! x1 = 0
        st      %r0, [%r1+GX_LINE_Y1]   ! y1 = 0
        mov     511, %r2
        st      %r2, [%r1+GX_LINE_X2]   ! x2 = 511
        st      %r2, [%r1+GX_LINE_Y2]   ! y2 = 511
        ! issue the command only after the arguments are in place
        mov     GX_LINE, %r2
        st      %r2, [%r1+GX_CMD]       ! draw the line
!
! gx.h -- symbolic constants for the gx device
!
        .set    GX_BUFFER,  0x100000    ! start address for GX registers

        .set    GX_OPEN,    0           ! command numbers
        .set    GX_CLOSE,   1
        .set    GX_COLOR,   2
        .set    GX_OP,      3
        .set    GX_LINE,    4
        .set    GX_FILL,    5
        .set    GX_BLIT,    6

        .set    GX_STATUS,  0
        .set    GX_CMD,     4
        .set    GX_ARG,     8

        .set    GX_FILL_X,  8           ! arguments for fill
        .set    GX_FILL_Y,  12
        .set    GX_FILL_W,  16
        .set    GX_FILL_H,  20

        .set    GX_LINE_X1, 8           ! arguments for line
        .set    GX_LINE_Y1, 12
        .set    GX_LINE_X2, 16
        .set    GX_LINE_Y2, 20

        .set    GX_BLIT_X1, 8           ! arguments for blit
        .set    GX_BLIT_Y1, 12
        .set    GX_BLIT_W,  16
        .set    GX_BLIT_H,  20
        .set    GX_BLIT_X2, 24
        .set    GX_BLIT_Y2, 28
        ! all done -- the display will remain until you exit isem
end:    ta      0
8.4 Summary
8.5 Review Questions
8.6 Exercises
1. A bitmap is a rectangle of black and white pixels. In X11, bitmaps are stored row-by-row
in an array of bytes, i.e., consecutive memory locations. Each bit in this array
represents a pixel on the screen: 1 for black and 0 for white (the inverse of the color
convention used for the gx device). Each row of the bitmap is stored in an integral
number of bytes. If the number of columns is not a multiple of 8, the last byte is
padded with zeros. The first byte of the array represents the leftmost 8 pixels in the
top row. The next byte represents the next 8 pixels in the top row or the first 8 pixels
of the next row if there are fewer than 9 columns in the bitmap. Within a byte, the
least significant bit represents the leftmost pixel.
As an example, Figure 8.4 illustrates a simple bitmap and assembly language declarations
for the X11 representation of this bitmap.
height: .word   8
width:  .word   8
bits:   .byte   0x01, 0x02, 0x04, 0x08
        .byte   0x10, 0x20, 0x40, 0x80

width:  .word   31
height: .word   13
bits:   .byte   0x00, 0x00, 0x00, 0x00, 0x7e, 0xbf, 0xdf, 0x18, 0x7e
        .byte   0xbf, 0xdf, 0x1d, 0x18, 0x83, 0xc1, 0x1f, 0x18, 0x83
        .byte   0xc1, 0x1f, 0x18, 0xbf, 0xc7, 0x1a, 0x18, 0xbf, 0xc7
        .byte   0x18, 0x18, 0xb0, 0xc1, 0x18, 0x18, 0xb0, 0xc1, 0x18
        .byte   0x7e, 0xbf, 0xdf, 0x18, 0x7e, 0xbf, 0xdf, 0x18, 0x00
        .byte   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
Laboratory 9
The SPARC Instruction Formats
9.1 Goal
To cover the instruction encoding and decoding for the SPARC.
9.2 Objectives
After completing this lab, you will be able to:
9.3 Discussion
In this lab we consider instruction encoding and decoding for the operations that we have
introduced in previous labs. In particular, we will consider encodings for instructions that
use the data manipulation and branching operations. After we introduce instruction encoding, we consider the translation of synthetic operations. Finally, we conclude this lab
by considering instruction decoding on the SPARC.
All SPARC instructions are encoded in a single 32-bit instruction word; there are no
extension words.
First format: [rs1+rs2], rd (load instructions) or rd, [rs1+rs2] (store instructions)
bits     31-30   29-25   24-19   18-14   13   12-5   4-0
field    11      rd      op3     rs1     0    asi    rs2

Second format: [rs1+siconst13], rd (load instructions) or rd, [rs1+siconst13] (store instructions)
bits     31-30   29-25   24-19   18-14   13   12-0
field    11      rd      op3     rs1     1    siconst13
Operation   op3        Operation   op3
ld          000000     st          000100
ldub        000001     stb         000101
lduh        000010     sth         000110
ldd         000011     std         000111
ldb         001001
ldh         001010
Because this instruction uses two registers in the address specification, it is encoded using the first
format shown in Figure 9.1. As such, we must determine the values for the rd, op3, rs1, and rs2 fields.
The following table summarizes these encodings:
Field   Symbolic value   Encoded value
rd      %r11             01011
op3     ldd              000011
rs1     %r4              00100
rs2     %r7              00111
These encodings lead to the following machine instruction:
bits     31-30   29-25   24-19    18-14   13   12-5       4-0
value    11      01011   000011   00100   0    00000000   00111
That is, 1101 0110 0001 1001 0000 0000 0000 0111 in binary, or 0xD6190007.
If the assembly language instruction only uses a single register in the address specification
(e.g., register indirect addressing), the register is encoded in one of the source register
fields (i.e., rs1 or rs2) while %r0 is encoded in the other. It doesn't matter which field holds
the register specified in the assembly language instruction and which field holds the encoding
for %r0. However, isem-as encodes %r0 in rs2.
Example 9.2
        ldub    [%r23], %r19
Because this instruction uses registers in the address specification, it is encoded using the first format
shown in Figure 9.1. As such, we must determine the values for the rd, op3, rs1, and rs2 fields. The
following table summarizes these encodings:
Field   Symbolic value   Encoded value
rd      %r19             10011
op3     ldub             000001
rs1     %r23             10111
rs2     %r0              00000
These encodings lead to the following machine instruction:
bits     31-30   29-25   24-19    18-14   13   12-5       4-0
value    11      10011   000001   10111   0    00000000   00000
That is, 1110 0110 0000 1101 1100 0000 0000 0000 in binary, or 0xE60DC000.
In the second format the 32-bit instruction is divided into six fields. As in the previous
format, the first field holds the 2-bit value 11. However, unlike the previous format, the
fifth field holds the 1-bit value 1. The remaining fields, rd, op3, rs1, and siconst13, hold
encodings for the destination register, the operation, the source register, and the constant
value, respectively. When this format is used, the integer constant is encoded using the
13-bit 2's complement representation and stored in the siconst13 field of the instruction.
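Hand assembly of the two load/store formats amounts to shifting each field into position and combining the results. The Python sketch below is a checking aid, not part of ISEM; the function names are hypothetical, and the field positions are those given for Figure 9.1 (bits 31-30, 29-25, 24-19, 18-14, 13, then either 12-5 and 4-0 or 12-0).

```python
def encode_mem_reg(op3, rd, rs1, rs2, asi=0):
    """First format: address given by two registers (bit 13 is 0)."""
    return (0b11 << 30) | (rd << 25) | (op3 << 19) | (rs1 << 14) | (asi << 5) | rs2

def encode_mem_imm(op3, rd, rs1, const):
    """Second format: address given by a register and a 13-bit signed constant."""
    if not -4096 <= const <= 4095:
        raise ValueError("constant does not fit in 13 signed bits")
    return (0b11 << 30) | (rd << 25) | (op3 << 19) | (rs1 << 14) | (1 << 13) | (const & 0x1FFF)

# ldd with rd = %r11, rs1 = %r4, rs2 = %r7 (op3 = 000011)
print(hex(encode_mem_reg(0b000011, 11, 4, 7)))    # 0xd6190007
# ldub with rd = %r19, rs1 = %r23, rs2 = %r0 (op3 = 000001)
print(hex(encode_mem_reg(0b000001, 19, 23, 0)))   # 0xe60dc000
# ld with rd = %r2, rs1 = %r1 and the constant 8 (op3 = 000000)
print(hex(encode_mem_imm(0b000000, 2, 1, 8)))     # 0xc4006008
```

Masking the constant with 0x1FFF produces the 13-bit 2's complement pattern, so a constant of -4 is stored as 1 1111 1111 1100.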
bits     31-30   29-25   24-22   21-0
field    00      rd      100     const22

bits     31-30   29-25   24-22   21-0
value    00      00010   100     1000 0111 0110 0101 0100 00
That is, 0000 0101 0010 0001 1101 1001 0101 0000 in binary, or 0x0521D950.
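The same encoding can be reproduced mechanically. In the Python sketch below, encode_sethi is a hypothetical helper name; it takes the full 32-bit value and keeps its most significant 22 bits for the const22 field, which is what the %hi operator supplies.

```python
def encode_sethi(rd, value):
    """Encode sethi %hi(value), rd using the fields 00, rd, 100, const22."""
    const22 = (value >> 10) & 0x3FFFFF      # most significant 22 bits of value
    return (0b00 << 30) | (rd << 25) | (0b100 << 22) | const22

print(hex(encode_sethi(2, 0x87654321)))     # 0x521d950
```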
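First format (two source registers):
bits     31-30   29-25   24-19   18-14   13   12-5            4-0
field    10      rd      op3     rs1     0    unused (zero)   rs2

Second format (a source register and a constant):
bits     31-30   29-25   24-19   18-14   13   12-0
field    10      rd      op3     rs1     1    siconst13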
Table 9.2 summarizes the operation encodings for the data manipulation operations
that we have covered in the previous labs. When an instruction using one of these operations is encoded, the operator encoding is placed in the op3 field of the machine instruction.
Example 9.4
        sub     %r16, %r26, %r27
Because this instruction uses two source registers, it is encoded using the first format shown in
Figure 9.3. As such, we must determine the values for the op3, rd, rs1, and rs2 fields. The following table
summarizes these encodings:
Field   Symbolic value   Encoded value
rd      %r27             11011
op3     sub              000100
rs1     %r16             10000
rs2     %r26             11010
These encodings lead to the following machine instruction:
bits     31-30   29-25   24-19    18-14   13   12-5       4-0
value    10      11011   000100   10000   0    00000000   11010
That is, 1011 0110 0010 0100 0000 0000 0001 1010 in binary, or 0xB624001A.
Operation   op3        Operation   op3
add         000000     addcc       010000
and         000001     andcc       010001
andn        000101     andncc      010101
or          000010     orcc        010010
orn         000110     orncc       010110
udiv        001110     udivcc      011110
umul        001010     umulcc      011010
smul        001011     smulcc      011011
sdiv        001111     sdivcc      011111
sub         000100     subcc       010100
xor         000011     xorcc       010011
xnor        000111     xnorcc      010111
sll         100101
srl         100110
sra         100111
bits     31-30   29-25   24-19    18-14   13   12-0
value    10      10011   011011   11101   1    1111 1111 0100 1
That is, 1010 0110 1101 1111 0111 1111 1110 1001 in binary, or 0xA6DF7FE9.
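The constant field in the encoding above holds a negative value, which is the part of hand assembly that is easiest to get wrong. The Python sketch below (function names are hypothetical) converts between a signed constant and its 13-bit 2's complement pattern, and then rebuilds the two data manipulation encodings shown in this section from their fields.

```python
def to_simm13(const):
    """13-bit 2's complement pattern for a constant in -4096..4095."""
    if not -4096 <= const <= 4095:
        raise ValueError("constant does not fit in 13 signed bits")
    return const & 0x1FFF

def from_simm13(field):
    """Signed value of a 13-bit 2's complement pattern."""
    return field - 0x2000 if field & 0x1000 else field

def encode_alu_reg(op3, rd, rs1, rs2):
    """Data manipulation instruction with two source registers (bit 13 is 0)."""
    return (0b10 << 30) | (rd << 25) | (op3 << 19) | (rs1 << 14) | rs2

def encode_alu_imm(op3, rd, rs1, const):
    """Data manipulation instruction with a register and a constant (bit 13 is 1)."""
    return (0b10 << 30) | (rd << 25) | (op3 << 19) | (rs1 << 14) | (1 << 13) | to_simm13(const)

print(from_simm13(0b1111111101001))                  # -23
print(hex(encode_alu_reg(0b000100, 27, 16, 26)))     # 0xb624001a
print(hex(encode_alu_imm(0b011011, 19, 29, -23)))    # 0xa6df7fe9
```

Decoding the constant field first shows that the second instruction is smulcc %r29, -23, %r19.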
bits     31-30   29   28-25   24-22   21-0
field    00      a    cond    010     disp22
The a field of a machine instruction is set (i.e., 1) for instructions that use the annul suffix
(,a). This field is clear (i.e., 0) for conditional branching instructions that do not nullify
the results of the next instruction. The cond field of a machine instruction encodes the
condition under which the branch is taken. Table 9.3 summarizes the operation encodings
for the branching operations supported by the SPARC.
Table 9.3 Operation encodings for the conditional branching operations
Operation     cond     Operation    cond
ba            1000     bn           0000
bne (bnz)     1001     be (bz)      0001
bg            1010     ble          0010
bge           1011     bl           0011
bgu           1100     bleu         0100
bcc (bgeu)    1101     bcs (blu)    0101
bpos          1110     bneg         0110
bvc           1111     bvs          0111
Hand assemble the branch instruction in the following SPARC code fragment.
        cmp     %r2, 8
        bne     l1
        nop
        inc     %r3
l1:
In this case, the target is 3 instructions from the branch instruction, so the disp22 field will be the 22-bit
binary encoding of 3.
Field    Symbolic value   Encoded value
a                         0
cond     bne              1001
disp22   l1               0000 0000 0000 0000 0000 11
These encodings lead to the following machine instruction:
bits     31-30   29   28-25   24-22   21-0
value    00      0    1001    010     0000 0000 0000 0000 0000 11
That is, 0001 0010 1000 0000 0000 0000 0000 0011 in binary, or 0x12800003.
Example 9.7
Hand assemble the branch instruction in the following SPARC code fragment.
top:    add     %r2, %r3, %r2
        deccc   %r4
        bne     top
In this case, the target is 2 instructions (back) from the branch instruction, so the disp22 field will be
the 22-bit binary encoding of -2.
Field     Symbolic value   Encoded value
a                          0
cond      bne              1001
disp22    top              1111 1111 1111 1111 1111 10
These encodings lead to the following machine instruction:
 31 30 | 29 | 28  25 | 24 22 | 21                          0
  00   | 0  |  1001  |  010  | 1111 1111 1111 1111 1111 10
That is, 0001 0010 1011 1111 1111 1111 1111 1110 in binary, or 0x12BFFFFE.
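Both branch examples can be reproduced with a few lines of Python. This is a sketch only; encode_branch and the COND dictionary are hypothetical names, and the dictionary holds just the subset of Table 9.3 needed here. The displacement is counted in instructions (words), not bytes, exactly as in the examples.

```python
# A few cond encodings copied from Table 9.3.
COND = {"ba": 0b1000, "bne": 0b1001, "be": 0b0001, "bl": 0b0011}


def encode_branch(operation, disp22, annul=False):
    """Pack a conditional branch: 00 | a | cond | 010 | disp22."""
    if not -(1 << 21) <= disp22 < (1 << 21):
        raise ValueError("displacement does not fit in 22 signed bits")
    return (
        (0b00 << 30)                # bits 30-31: primary opcode
        | (int(annul) << 29)        # bit 29: a field (annul suffix)
        | (COND[operation] << 25)   # bits 25-28: cond
        | (0b010 << 22)             # bits 22-24: op2 for integer branches
        | (disp22 & 0x3FFFFF)       # bits 0-21: two's complement displacement
    )


forward = encode_branch("bne", 3)     # target 3 instructions ahead
backward = encode_branch("bne", -2)   # target 2 instructions back
print(f"0x{forward:08X} 0x{backward:08X}")
```

The forward branch of 3 instructions packs to 0x12800003 and the backward branch of 2 instructions packs to 0x12BFFFFE.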
tst rs
Implementation
andn    rd, rs, rd
andn    rs, siconst13, rd
or      rd, rs, rd
or      rd, siconst13, rd
andcc   rs1, rs2, %g0
andcc   rs, siconst13, %g0
xor     rd, rs, rd
xor     rs, siconst13, rd
or      %g0, %g0, rd
stb     %g0, [address]
sth     %g0, [address]
st      %g0, [address]
subcc   rs1, rs2, %g0
subcc   rs, siconst13, %g0
sub     rd, 1, rd
sub     rd, siconst13, rd
subcc   rd, 1, rd
subcc   rd, siconst13, rd
add     rd, 1, rd
add     rd, siconst13, rd
addcc   rd, 1, rd
addcc   rd, siconst13, rd
or      %g0, rs, rd
or      %g0, siconst13, rd
rd      statereg, rd
wr      %g0, rs, statereg
wr      %g0, siconst13, statereg
sub     %g0, rs, rd
sub     %g0, rd, rd
xnor    rd, %g0, rd
xnor    rs, %g0, rd
or      %g0, iconst, rd
  (or)
sethi   %hi(iconst), rd
  (or)
sethi   %hi(iconst), rd
or      rd, %lo(iconst), rd
orcc    %g0, rs, %g0
provide the secondary opcode. If the primary opcode is 01, the instruction is a call instruction and the remaining bits (bits 0-29) are a displacement for the program counter (we will
discuss the call instruction at greater length in Lab 10). Otherwise, if the primary opcode
is either 10 or 11, bits 19-24 of the instruction provide the secondary opcode. Figure 9.5
illustrates the positions of the secondary opcodes based on the primary opcode.
Once you have determined the primary and secondary opcodes, you'll be able to
determine the instruction and, knowing the instruction, decode the remaining fields of
the instruction. If the primary opcode is 01, the instruction is a call instruction and you can
easily complete the decoding of the instruction.
If the primary opcode is 00, the instruction is an unimplemented instruction, a condi-
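 31 30 | 29   25 | 24 22 | 21       0
  00   |         |  op2  |

 31 30 | 29                         0
  01   |

 31 30 | 29   25 | 24 19 | 18       0
  10   |         |  op3  |

 31 30 | 29   25 | 24 19 | 18       0
  11   |         |  op3  |
Figure 9.5 Positions of the secondary opcodes based on the primary opcode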
op2   Instruction
000   The unimplemented instruction
001   illegal
010   Conditional branch, integer unit
011   illegal
100   SETHI
101   illegal
110   Conditional branch, floating point unit
111   Conditional branch, coprocessor
The data manipulation instructions are encoded with a primary opcode of 10. Table 9.6
shows how the 6-bit value in the op3 field is used to determine the instruction when the
primary opcode is 10.
Table 9.6 Decoding the op3 field when the primary opcode is 10
         xxx000    xxx001    xxx010    xxx011    xxx100    xxx101    xxx110    xxx111
000xxx   add       and       or        xor       sub       andn      orn       xnor
001xxx   addx                umul      smul      subx                udiv      sdiv
010xxx   addcc     andcc     orcc      xorcc     subcc     andncc    orncc     xnorcc
011xxx   addxcc              umulcc    smulcc    subxcc              udivcc    sdivcc
100xxx   taddcc    tsubcc    taddcctv  tsubcctv  mulscc    sll       srl       sra
101xxx   rd        rd        rd        rd
110xxx   wr        wr        wr        wr        FPU op    FPU op    CP op     CP op
111xxx   jmpl      rett      trap      flush     save      restore
Instructions that access memory are encoded with a primary opcode of 11. Table 9.7
shows how the 6-bit value in the op3 field is used to determine the instruction when the
primary opcode is 11.
When you decode an instruction that has a primary opcode of 10 or 11, you will need to
examine bit 13 to determine whether bits 0-12 of the instruction hold an immediate value
Table 9.7 Decoding the op3 field when the primary opcode is 11
         xxx000    xxx001    xxx010    xxx011    xxx100    xxx101    xxx110    xxx111
000xxx   ld        ldub      lduh      ldd       st        stb       sth       std
001xxx             ldsb      ldsh                          ldstub              swap
010xxx   lda       lduba     lduha     ldda      sta       stba      stha      stda
011xxx             ldsba     ldsha                         ldstuba             swapa
100xxx   ldf       ldfsr               lddf      stf       stfsr     stdfq     stdf
101xxx
110xxx   ldc       ldcsr               lddc      stc       stcsr     stdcq     stdc
111xxx
In binary, this instruction is 00 00100 100 000100.... That is, the primary opcode is 00 and op2 is 100.
From Table 9.5, this is a sethi instruction. Using the sethi format to partition the bits yields:
 31 30 29   25 24   22 21
Example 9.9
In binary, this instruction is 00 01000 010 000000.... That is, the primary opcode is 00 and op2 is
010. From Table 9.5, this is a conditional branch instruction. Using the conditional branch format to
partition the bits yields:
 31 30 29 28   25 24   22 21
Example 9.10
In binary, the instruction is 10 00011 000000 0001.... That is, the primary opcode is 10 and op3 is
000000. From Table 9.6, this is an add instruction. Because bit 13 is 1, we use the second format in
Figure 9.3 to decode this instruction.
 31 30 | 29     25 | 24  19 | 18      14 | 13 | 12                        0
  10   | rd=00011  | 000000 | rs1=00101  | 1  | siconst13=0 0000 0000 1110
Thus, the destination is %r3, the source register is %r5, and the constant is 0xE. The following instruction will be assembled as 0x8601600E.
        add     %r5, 14, %r3
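The decoding procedure described in this section (look at the primary opcode, then pick the field layout) can be sketched in Python. This is an illustrative helper, not part of ISEM; decode_fields and sign_extend are hypothetical names, and the sketch only splits the word into fields without looking the opcodes up in Tables 9.5 to 9.7.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def decode_fields(word):
    """Split a 32-bit SPARC instruction word into its fields."""
    op = (word >> 30) & 0x3
    if op == 0b01:
        # call: bits 0-29 hold the displacement
        return {"op": op, "disp30": sign_extend(word & 0x3FFFFFFF, 30)}
    if op == 0b00:
        # secondary opcode op2 is in bits 22-24
        return {"op": op, "op2": (word >> 22) & 0x7, "low22": word & 0x3FFFFF}
    # primary opcode 10 or 11: secondary opcode op3 is in bits 19-24
    fields = {
        "op": op,
        "rd": (word >> 25) & 0x1F,
        "op3": (word >> 19) & 0x3F,
        "rs1": (word >> 14) & 0x1F,
        "i": (word >> 13) & 0x1,
    }
    if fields["i"]:
        fields["simm13"] = sign_extend(word & 0x1FFF, 13)  # bits 0-12
    else:
        fields["rs2"] = word & 0x1F                        # bits 0-4
    return fields


print(decode_fields(0x8601600E))
```

For 0x8601600E the fields come out as rd = 3, op3 = 0, rs1 = 5, i = 1 and simm13 = 14, which matches add %r5, 14, %r3 in Example 9.10.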
9.4 Summary
9.5 Review Questions
9.6 Exercises
Laboratory 10
Leaf Procedures on the SPARC
10.1 Goal
To introduce the calling conventions associated with leaf procedures on the SPARC.
10.2 Objectives
After completing this lab, you will be able to write assembly language programs that:
10.3 Discussion
This lab is the first of three labs that cover procedure calling conventions on the SPARC.
In this lab we consider the conventions associated with leaf procedures: procedures that
do not make calls to other procedures. In Lab 11 we consider the use of register windows
on the SPARC. In Lab 12 we complete our coverage of procedure calling conventions by
considering the standard calling conventions used by compilers.
Use
zero
temporary
caller's variables
return value
parameters
stack pointer
return address
caller's variables
(actually, the address of the call instruction used to call the leaf procedure) is stored in
register %r15. A leaf procedure can alter this register; however, it will be difficult to return
to the point of the call if you alter the value in %r15.
An optimized leaf procedure should only alter the values stored in registers %r1 and
%r8-%r13. If the leaf procedure requires more local storage than these registers provide,
or if the parameters do not fit in these registers, the leaf procedure cannot be implemented
as an optimized leaf procedure. We will discuss the techniques used to implement other
types of procedures in the next two labs.
Assembler syntax    Operation implemented
call label          %r15 = PC
                    PC = nPC
                    nPC = label
retl                PC = nPC
                    nPC = %r15 + 8
Example 10.1 Write a SPARC procedure that prints a NULL terminated string. The address of the string to
print will be passed as the first parameter (i.e., in %r8).
        .text
! pr_str - print a null terminated string
!
! Parameters:   %r8 - pointer to string (initially)
!
! Temporaries:  %r8 - character to print
!               %r9 - pointer to string
!
pr_str: mov     %r8, %r9
pr_lp:  ldub    [%r9], %r8
        cmp     %r8, 0
        be      pr_dn
        nop
        ta      1               ! print character
        ba      pr_lp
        inc     %r9
pr_dn:  retl
        nop
Example 10.2 Write a SPARC assembly language fragment that calls the procedure presented in Example 10.1.
        .data
str:    .asciz
        .text
main:   set     str, %r8
        call    pr_str
        nop
end:    ta      0
 31 30 | 29       0
  01   |  disp30
Example 10.3 Show how the call instruction in the following SPARC assembly code fragment is encoded.
        .text
main:   set     str, %r8        ! setup the first argument
        call    pr_string       ! call print string
        nop                     ! (branch delay)
        ta      0               ! exit gracefully
pr_string:
        mov     %r8, %r9
In this case, the target is 3 instructions from the call instruction, so the disp30 field is set to the 30-bit
binary encoding of 3.
This leads to the following machine instruction:
 31 30 | 29                                     0
  01   | 0000 0000 0000 0000 0000 0000 0000 11
That is, 0100 0000 0000 0000 0000 0000 0000 0011 in binary, or 0x40000003.
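The same arithmetic can be sketched in Python. The names call_displacement and encode_call are hypothetical, and the two addresses below are made up purely for illustration; the only fact taken from the example is that the target lies 3 instructions (12 bytes) past the call.

```python
def call_displacement(call_address, target_address):
    """Distance from the call to its target, counted in 4-byte instructions."""
    offset = target_address - call_address
    if offset % 4 != 0:
        raise ValueError("instructions must be word aligned")
    return offset // 4


def encode_call(disp30):
    """Pack a call instruction: primary opcode 01 followed by disp30."""
    return (0b01 << 30) | (disp30 & 0x3FFFFFFF)


# Assumed addresses: the call at 0x2008 and pr_string three instructions later.
disp = call_displacement(0x2008, 0x2014)
word = encode_call(disp)
print(disp, f"0x{word:08X}")
```

Because the displacement is masked to 30 bits, a call to an earlier address (a negative displacement) is packed in two's complement form in the same way.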
The retl instruction is actually a synthetic instruction that is translated to a jmpl (jump
and link) instruction. The jmpl instruction has two operands: an address, and a destination
register. The address is similar to the addresses used in the load and store instructions;
however, the brackets surrounding the address in the load and store instructions are omitted in the jmpl instruction. When a SPARC processor executes a jmpl instruction, it saves
the address of the jmpl instruction in the destination register and sets the next program
counter to the address specified in the instruction. Figure 10.2 illustrates the formats used
to encode jmpl instructions.
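A. Instructions of the form: jmpl rs1+rs2, rd
 31 30 | 29 25 | 24   19 | 18 14 | 13 | 12         5 | 4   0
  10   |  rd   | 111000  |  rs1  | 0  | unused(zero) |  rs2
B. Instructions of the form: jmpl rs1+siconst13, rd
 31 30 | 29 25 | 24   19 | 18 14 | 13 | 12                 0
  10   |  rd   | 111000  |  rs1  | 1  |      siconst13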
A retl instruction is translated to the instruction jmpl %r15+8, %r0. This instruction is encoded using
the second format shown in Figure 10.2. As such, we must determine the values for the
rd, rs1, and siconst13 fields. The following table summarizes these encodings:
Field       Symbolic value   Encoded value
rd          %r0              00000
rs1         %r15             01111
siconst13   8                0000 0000 0100 0
These encodings lead to the following machine instruction:
 31 30 | 29   25 | 24   19 | 18   14 | 13 | 12              0
  10   |  00000  | 111000  |  01111  | 1  | 0000 0000 0100 0
That is, 1000 0001 1100 0011 1110 0000 0000 1000 in binary, or 0x81C3E008.
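As a cross-check, the second jmpl format can be packed in Python. This is a sketch under the field layout described above; encode_jmpl_imm is a hypothetical name.

```python
JMPL_OP3 = 0b111000   # op3 value for jmpl


def encode_jmpl_imm(rs1, siconst13, rd):
    """Pack jmpl rs1+siconst13, rd using the format with bit 13 set."""
    if not -4096 <= siconst13 <= 4095:
        raise ValueError("constant does not fit in 13 signed bits")
    return (
        (0b10 << 30)             # bits 30-31: primary opcode
        | (rd << 25)             # bits 25-29: destination register
        | (JMPL_OP3 << 19)       # bits 19-24: op3
        | (rs1 << 14)            # bits 14-18: source register
        | (1 << 13)              # bit 13: a constant follows
        | (siconst13 & 0x1FFF)   # bits 0-12: the constant
    )


# retl is jmpl %r15+8, %r0
retl_word = encode_jmpl_imm(rs1=15, siconst13=8, rd=0)
print(f"0x{retl_word:08X}")
```

Using %r0 as the destination discards the saved address, which is why retl behaves as a plain jump back to the instruction after the delay slot of the call.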
10.4 Summary
A leaf procedure is a procedure that never calls any other procedure. In this lab we have
introduced the SPARC instructions used to write leaf procedures: call and retl. In the next
two labs, we will examine more general procedure calling conventions.
10.6 Exercises
1. Write a SPARC assembly language program consisting of a main program and a
procedure, pr_octal, that prints an unsigned integer in octal notation.
2. Write a SPARC assembly language program consisting of a main program and a
procedure, pr_hex, that prints an unsigned integer in hexadecimal notation.
3. Write a SPARC procedure, called strcmp, that compares two strings. Your procedure should accept two parameters, s1 and s2, both pointers to NULL terminated
strings. Your procedure should return an integer based on the comparison. In particular,
Laboratory 11
Register Windows
11.1 Goal
To introduce register windows.
11.2 Objectives
After completing this lab, you will be able to use:
Register windows,
The save and restore operations,
The (synthetic) return operation.
11.3 Discussion
In this lab we introduce a more general procedure calling mechanism that uses register
windows. We introduce the save and restore instructions and another synthetic instruction,
ret, for returning from procedures.
Register names   Alternate names   Group name
%r0-%r7          %g0-%g7           Global registers
%r8-%r15         %o0-%o7           Output registers
%r16-%r23        %l0-%l7           Local registers
%r24-%r31        %i0-%i7           Input registers
The alternate names reflect the uses of the registers when procedures use register windows. The global registers (%g0-%g7) are shared by all procedures. The output registers
are used for parameters when calling another procedure. That is, the output registers are
outputs from the caller to the called procedure. The local registers (%l0-%l7) are used to
store local values used by a procedure. The input registers (%i0-%i7) are used for the parameters passed into the procedure. That is, the input registers are inputs passed from the
caller to the called procedure.
When you consider the relationship between the output and input registers, the trick is
to make the caller's output registers the same as the called procedure's input registers. On
the SPARC, this is done using overlapping register windows.
All procedures share the global registers (%g0-%g7). The remaining registers, %o0-%o7,
%l0-%l7, and %i0-%i7, are called a register window. When a procedure starts its execution, it allocates a set of 16 registers (using the save instruction as described in the following section). The new register set provides the procedure with its own output and local
registers (%o0-%o7 and %l0-%l7). The procedure's input registers (%i0-%i7) are overlapped with the caller's output registers. Figure 11.1 illustrates the overlapping of register
windows between the caller and the callee.
Global registers (shared by all procedures): %g0-%g7 (%r0-%r7)

Callee's names           Caller's names
%o0-%o5 (%r8-%r13)                                 \
%o6 (%r14), %sp                                     |
%o7 (%r15)                                          |  callee's
%l0-%l7 (%r16-%r23)                                 |  register window
%i0-%i5 (%r24-%r29)      %o0-%o5 (%r8-%r13)         |  \
%i6 (%r30), %fp          %o6 (%r14), %sp            |   |  (overlap)
%i7 (%r31)               %o7 (%r15)                /    |
                         %l0-%l7 (%r16-%r23)            |  caller's
                         %i0-%i5 (%r24-%r29)            |  register window
                         %i6 (%r30), %fp                |
                         %i7 (%r31)                    /
Figure 11.1 Overlapping register windows
In addition to the register names we have discussed, Figure 11.1 introduces two new
names: %sp and %fp. The first of these, %sp, denotes the stack pointer. In an assembly
language program, %sp is simply another name for %o6. Similarly, %fp denotes the frame
pointer and is simply another name for %i6. We will discuss the special uses of these
registers (and hence the additional names) in the next lab when we consider stack frame
organization.
An implementation of the SPARC integer unit may have between 40 and 520 integer
registers. Every SPARC has 8 global registers, plus a circular stack of 2 to 32 register sets.
Each register set has 16 registers. The number of register sets is implementation dependent. The number of register sets provided by a particular implementation of the SPARC architecture has been given the name NWINDOWS. ISEM provides 32 register sets, the maximum
number supported by the SPARC Architecture. Most hardware implementations provide
7 or 8 register sets.
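The limits of 40 and 520 follow directly from the numbers in this paragraph: 8 globals plus 16 registers for each register set. A small Python sketch of that arithmetic follows (the function name total_integer_registers is hypothetical):

```python
GLOBAL_REGISTERS = 8
REGISTERS_PER_SET = 16


def total_integer_registers(nwindows):
    """Integer registers in an implementation with nwindows register sets."""
    if not 2 <= nwindows <= 32:
        raise ValueError("NWINDOWS must be between 2 and 32")
    return GLOBAL_REGISTERS + REGISTERS_PER_SET * nwindows


for nwindows in (2, 7, 8, 32):
    print(nwindows, total_integer_registers(nwindows))
```

With NWINDOWS = 32, as in ISEM, the total is 520; a typical hardware implementation with 8 register sets has 136.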
Syntax                        Operation implemented
save rs1, rs2, rd             res = reg[rs1] + reg[rs2]
                              CWP = (CWP - 1) % NWINDOWS
                              reg[rd] = res
save rs1, siconst13, rd       res = reg[rs1] + siconst13
                              CWP = (CWP - 1) % NWINDOWS
                              reg[rd] = res
restore rs1, rs2, rd          res = reg[rs1] + reg[rs2]
                              CWP = (CWP + 1) % NWINDOWS
                              reg[rd] = res
restore rs1, siconst13, rd    res = reg[rs1] + siconst13
                              CWP = (CWP + 1) % NWINDOWS
                              reg[rd] = res
restore                       CWP = (CWP + 1) % NWINDOWS
                              (restore caller's register window)
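The effect of changing the CWP can be illustrated with a toy model in Python. This is a sketch and not how ISEM is implemented: the class name and the slot numbering are assumptions chosen so that a window's input registers share storage with the output registers of the window above it. The model ignores the addition performed by save and restore and never raises window overflow or underflow traps.

```python
class WindowedRegisters:
    """Toy model of the global registers plus a circular stack of windows."""

    def __init__(self, nwindows=8):
        self.nwindows = nwindows
        self.cwp = 0                            # current window pointer
        self.globals = [0] * 8                  # %r0-%r7
        self.windowed = [0] * (16 * nwindows)   # every register set

    def _slot(self, r):
        # %r8-%r31 map onto 24 consecutive slots starting at cwp * 16.
        # The last 8 of them (the inputs) are the outputs of window cwp + 1.
        return (self.cwp * 16 + (r - 8)) % (16 * self.nwindows)

    def read(self, r):
        if r < 8:
            return 0 if r == 0 else self.globals[r]
        return self.windowed[self._slot(r)]

    def write(self, r, value):
        if r == 0:
            return                              # %r0 is always zero
        if r < 8:
            self.globals[r] = value
        else:
            self.windowed[self._slot(r)] = value

    def save(self):
        self.cwp = (self.cwp - 1) % self.nwindows

    def restore(self):
        self.cwp = (self.cwp + 1) % self.nwindows


regs = WindowedRegisters()
regs.write(8, 42)            # caller puts an argument in %o0 (%r8)
regs.save()                  # callee allocates its window
argument = regs.read(24)     # callee sees the argument in %i0 (%r24)
regs.write(24, 7)            # callee leaves a result in %i0
regs.restore()               # back to the caller's window
result = regs.read(8)        # caller finds the result in %o0
print(argument, result)
```

The modulus makes the stack of windows circular: with 8 windows, a save from window 0 moves to window 7.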
Syntax        Operation implemented
call label    %o7 = PC
              PC = nPC
              nPC = label
ret           PC = nPC
              nPC = %i7 + 8
Example 11.1 Write a SPARC assembly language procedure, pr_str, that will print a NULL terminated
string. Your procedure should take a single argument, the address of the string to print. In writing this
procedure, you should assume that the procedure pr_char is available for printing a character.
        .text
! pr_str - print a null terminated string
!
! Temporaries: %i0 - pointer to string
pr_lp:  ldub    [%i0], %o0      ! load character
        cmp     %o0, 0          ! check for null
        be      pr_dn
        nop
        call    pr_char         ! print character
        nop
        ba      pr_lp
        inc     %i0
pr_dn:  ret
        restore
Example 11.2 Write a main SPARC assembly language fragment that allocates space for the stack and
calls the pr_str procedure in the previous example.
str:
.data
.asciz
.align 8
stack_top:
. = . + 2048
stack_bot:
.text
start:
set
end:
set
call
nop
ta
Example 11.3 Write a procedure that recursively calculates the Nth Fibonacci number. You may assume
that N is non-negative and will be small enough that register overflow will not occur.
! fib - calculate the Nth Fibonacci number
!
!       fib(N) = fib(N-1) + fib(N-2)
!       fib(0) = fib(1) = 1
fib:    save                    ! PROLOGUE
        cmp     %i0, 1
        bg      fib_call        ! call recursively
        nop
        ret                     ! EPILOGUE
        restore %g0, 1, %o0     ! return 1
fib_call:
        call    fib
        sub     %i0, 1, %o0
        mov     %o0, %l0
        call    fib
        sub     %i0, 2, %o0
        ret                     ! EPILOGUE
        restore %l0, %o0, %o0   ! return fib(N-1) + fib(N-2)
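When testing the procedure in ISEM it helps to have reference values for this particular definition, in which fib(0) and fib(1) are both 1. A minimal Python sketch of the same recurrence:

```python
def fib(n):
    """Reference model of the recurrence above: fib(0) = fib(1) = 1."""
    if n <= 1:
        return 1
    return fib(n - 1) + fib(n - 2)


expected = [fib(n) for n in range(8)]
print(expected)
```

The first eight values are 1, 1, 2, 3, 5, 8, 13 and 21. Note that computing fib(N) nests calls N levels deep, so each level of the SPARC version occupies its own register window.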
11.3.5 Exceptions
Both the save and restore operations can generate exceptions (or traps). Before the CWP is
modified, the bit in the WIM corresponding to the new value for the CWP is tested. If the
bit in the WIM is 1, an exception is generated. For a save instruction, this causes a window
overflow trap. For a restore instruction, this causes a window underflow trap.
These traps are normally handled by the operating system and are transparent to the
application programmer. In tkisem these traps are handled by the ROM code. We will discuss
the code used to handle these traps in Lab 17.
Operation   op3
save        111100
restore     111101
Like the retl instruction, the ret instruction is a synthetic instruction, based on the jmpl
instruction. The ret instruction is translated to jmpl %i7+8, %g0.
11.4 Summary
This lab presents a more general mechanism for procedures on the SPARC. Register windows provide easy access to a large collection of registers and can reduce the need to save
registers in memory. While this mechanism has many advantages, there are several disadvantages to keep in mind. First, the mechanism only provides six registers for procedure
parameters. If you write a procedure with more than six parameters, you will need to
use the stack for any parameters beyond six. Second, most implementations only have 7
or 8 register sets. So, if your call sequence gets deeper than NWINDOWS (as it probably
will in most recursive procedures), you are again forced to use the stack.
Laboratory 12
Standard Calling Conventions
12.1 Goal
To cover the standard procedure calling conventions for the SPARC.
12.2 Objectives
After completing this lab, you will be able to write assembly language procedures that:
12.3 Discussion
In most cases, you will not want to write entire programs in assembly language. Instead,
you will want to write most of the program in a high-level language (like C) and only write
a few procedures in assembly language: the procedures that cannot be easily optimized
in the high-level language or that need to take advantage of special features provided by
the machine.
In this lab, we complete our presentation of the SPARC application binary interface
(ABI). The SPARC ABI is a set of conventions that are expected to be followed by all compilers and assembly language programmers. These conventions cover the uses of registers
and the structure of the stack frame. If you follow the conventions specified by the SPARC
ABI in your assembly language procedures, it will be possible to call your procedures from
procedures written in high-level languages. You will also be able to call procedures written
in high-level languages from your assembly language procedures.
In Lab 10 we covered the portion of the SPARC ABI that deals with optimized leaf
procedures. In Lab 11 we covered the conventions related to register usage for procedures
that are not implemented as optimized leaf procedures. In this lab we cover the conventions
related to the allocation and structure of stack frames. Throughout this lab we will assume
that we are not implementing an optimized leaf procedure.
As such, allocation of a stack frame is implemented by subtracting a value from the current
stack pointer (actually, this is usually done by adding a negative number to the stack pointer).
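                         smaller addresses
%sp (=%o6)       ->  +------------------------+
                     |  Current stack frame   |     ^
%fp (=%i6)       ->  +------------------------+     |  Stack growth
(previous %sp)       |  Caller's stack frame  |     |
                     +------------------------+
                         larger addresses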
12.3.2 Parameters
As we noted in the previous two labs, registers %o0-%o5 are used for the first six parameters passed to a procedure. If a procedure has more than 6 parameters, the remaining
parameters are passed on the stack. SPARC procedures do not push parameters (beyond
the sixth parameter) onto the procedure call stack. Instead, they allocate space in their stack
frame for the parameters and copy parameters into this space. This means that the called
procedure will find its parameters (beyond the sixth) in the caller's stack frame. The called
procedure can access these parameters using the frame pointer (%fp) with positive offsets.
returns a structured value, the result may not fit in a register. In this case, the calling
procedure must allocate space for the return value (probably in its stack frame). The calling
procedure then puts the address of this space into the hidden parameter before making the
call.
The remaining 6 words can be used by the called procedure to store the first six arguments (the ones passed in %o0-%o5). In most cases, the called procedure will be able
to access these parameters in the registers %i0-%i5 and will not need to store them in the
caller's stack frame. However, if the called procedure needs to take the address of a parameter, it needs to store the parameter into memory (you can't take the address of a register).
In addition to the regions that we have discussed, a procedure may allocate additional
stack space for: alignment (the stack pointer should always be a multiple of 8), outgoing
parameters (beyond the sixth parameter), automatic local arrays and other automatic local variables that don't fit in the local registers %l0-%l7, temporaries, and floating point
registers. Figure 12.2 illustrates the organization of a SPARC stack frame.
%sp               ->  space for %i0-%i7 and %l0-%l7          (required for all procedures)
%sp + 64          ->  hidden parameter                       \ (required for nonleaf
%sp + 68          ->  space for the first six parameters     /  procedures)
%sp + 92          ->  outgoing parameters past the sixth
%sp + (92 + 4p)   ->  temporaries and pad
%fp - (4l)        ->  automatic local variables
%fp
Figure 12.2 Organization of a SPARC stack frame
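The offsets in the figure lend themselves to a small calculator. The Python sketch below is illustrative only; the function names and the decision to express local storage in bytes are assumptions. It uses the fixed 92-byte area (64 bytes for the register window, 4 for the hidden parameter, 24 for the first six parameters), adds 4 bytes per outgoing parameter past the sixth, and pads the total to a multiple of 8.

```python
FIXED_AREA = 64 + 4 + 6 * 4    # window save area + hidden parameter + 6 parameters


def frame_size(extra_params=0, local_bytes=0):
    """Bytes a nonleaf procedure subtracts from %sp, padded to a multiple of 8."""
    size = FIXED_AREA + 4 * extra_params + local_bytes
    return (size + 7) & ~7     # round up for alignment


def stack_param_offset(n):
    """Offset from %fp (in the callee) of parameter n, for n of 7 or more."""
    if n < 7:
        raise ValueError("the first six parameters are passed in registers")
    return FIXED_AREA + 4 * (n - 7)


print(frame_size(extra_params=1))     # one outgoing parameter past the sixth
print(frame_size(local_bytes=40))     # a local array of ten 4-byte integers
print(stack_param_offset(7))          # where the callee finds its 7th parameter
```

A frame holding a ten-word local array needs 92 + 40 = 132 bytes, which pads to 136, and a seventh parameter is found at offset 92 from the frame pointer.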
{
    return p1 + p2 + p3 + p4 + p5 + p6 + p7;
}
In this case, the parameters p1-p6 will be in registers %i0-%i5. The parameter p7 will be in the
caller's stack frame at offset 92 (that is, %fp + 92).
        .text
add7:   save    %sp, -64, %sp   ! this is a leaf procedure
        ld      [%fp+92], %l0
        add     %i0, %i1, %i0   ! add in p2
        add     %i0, %i2, %i0   ! add in p3
        add     %i0, %i3, %i0   ! add in p4
        add     %i0, %i4, %i0   ! add in p5
        add     %i0, %i5, %i0   ! add in p6
        ret
        restore %i0, %l0, %o0   ! add in p7
Example 12.2 Translate the following C function, which includes a call to the function defined in Example 12.1.
int test( int x1, int x2 )
{
    int l1, l2;
    l1 = x1 + x2;
    l2 = x2 - x1;
    return add7( x1, x2, l1, l2, l1+l2, l2-l1, l2+l2 );
}
test:
.text
save
add
sub
! l1 = x1 + x2;
! l2 = x2 - x1;
mov
mov
mov
mov
add
sub
%i0,
%i1,
%l0,
%l1,
%l0,
%l1,
!
!
!
!
!
!
add
call
st
pad)
%o0
%o1
%o2
%o3
%l1, %o4
%l0, %o5
ret
restore %g0, %o0, %o0
first parameter
second parameter
third parameter
fourth parameter
fifth parameter
sixth parameter
! temp = l2+l2
! seventh parameter (delay slot)
It is also common to access local variables stored in the stack using negative offsets from
the frame pointer.
Example 12.3 Translate the following C function into a SPARC procedure. You should assume that the
procedures read_int and write_int are defined elsewhere.
void read10( )
{
    int i;
    int a[10];
    for( i = 0 ; i < 10 ; i++ )
        a[i] = read_int();
}
In this case, we will use %l0 for i (scaled by 4) and the array a will be stored in the local space starting
at %fp - 40.
        .text
read10: save    %sp, -(92+4*10+4), %sp  ! we need 40 bytes (10 words) for the array
        sub     %fp, 40, %l1            ! address of the array
        clr     %l0                     ! i = 0
top1:   call    read_int
        nop
        st      %o0, [%l1+%l0]          ! a[i] = read_int();
        inc     4, %l0                  ! increment i += 4
        cmp     %l0, 40                 ! i < 10*4
        bl      top1
        nop
        mov     36, %l0                 ! i = 9*4
top2:   ld      [%l1+%l0], %o0
        call    write_int               ! write_int( a[i] )
        nop
        deccc   4, %l0                  ! i -= 4
        bge     top2                    ! i >= 0
        nop
        ret
        restore
12.4 Summary
12.5 Review Questions
12.6 Exercises
Laboratory 13
Integer Arithmetic on the SPARC
13.1 Goal
To cover
13.2 Objectives
After completing this lab, you will be able to write SPARC programs that:
13.3 Discussion
13.4 Summary
13.5 Review Questions
13.6 Exercises
Laboratory 14
The Floating Point Coprocessor
14.1 Goal
To cover the floating point coprocessor on the SPARC.
14.2 Objectives
After completing this lab, you will be able to write SPARC programs that:
14.3 Discussion
14.4 Summary
14.5 Review Questions
14.6 Exercises
Laboratory 15
Linking and Loading
15.1 Goal
To cover the translation process implemented by the ISEM tools.
15.2 Objectives
After completing this lab, you will:
coff
15.3 Discussion
15.4 Summary
15.5 Review Questions
15.6 Exercises
Laboratory 16
Traps
16.1 Goal
To cover the basic SPARC trap mechanism and trap instructions.
16.2 Objectives
After completing this lab, you will be able to write trap handlers:
trap always
conditional traps
16.3 Discussion
16.3.1 The Processor Status Register (PSR)
Figure 16.1 presents the fields in the processor status register.
ASI
7
8
9
10
Laboratory 17
Exceptions and Exception
Handling
17.1 Goal
To cover exception handling on the SPARC.
17.2 Objectives
After completing this lab, you will:
exception handlers.
17.3 Discussion
17.4 Summary
17.5 Review Questions
17.6 Exercises
Laboratory 18
Interrupts and Interrupt Handling
18.1 Goal
To cover interrupts and interrupt handling on the SPARC.
18.2 Objectives
After completing this lab, you will:
interrupt handlers.
18.3 Discussion
18.4 Summary
18.5 Review Questions
18.6 Exercises
Laboratory 19
Context Switching
19.1 Goal
To cover context switching on the SPARC.
19.2 Objectives
After completing this lab, you will:
multitasking.
19.3 Discussion
19.4 Summary
19.5 Review Questions
19.6 Exercises