Lecture 10
Announcement
• Programming Assignment 1 grade is out
• You got a “0” if dlc could not compile your code
• Mainly due to the “parse error”
• Talk to a TA about this (within two weeks)
Carnegie Mellon
Announcement
• Programming Assignment 2
• Details: https://www.cs.rochester.edu/courses/252/fall2024/labs/assignment2.html
• Due on Oct. 2nd (Wednesday), 11:59 PM (extended by two days)
• You (may still) have 3 slip days
Announcement
• Midterm on Oct. 9th, Wednesday
• Open-book; e-book allowed.
• No cheat sheet.
• Exams for CSC 252 and CSC 452 will be slightly different
• More problems
• Covers everything until today’s lecture (Sep. 30th)
void call_echo() {
    echo();
}
unix> ./bufdemo-nsp
Type a string:012345678901234567890123
012345678901234567890123
unix> ./bufdemo-nsp
Type a string:0123456789012345678901234
Segmentation Fault
So far in 252…
• C Program: int, float; if, else; +, -, >>
• Assembly Program
• Instruction Set Architecture: ret, call; movq, addq; jmp, jne
• Processor Microarchitecture
• Circuits: Transistors
So far in 252…
• ISA is the interface between assembly programs and microarchitecture
• Assembly view:
  • How to program the machine, based on instructions and processor states (registers, memory, condition codes, etc.)?
  • Instructions are executed sequentially.
• Microarchitecture view:
  • What hardware needs to be built to run assembly programs?
  • How to run programs as fast (energy-efficient) as possible?
Y86-64 Instructions
halt
nop
cmovXX rA, rB        rrmovq, cmovle, cmovl, cmove, cmovne, cmovge, cmovg
irmovq V, rB
rmmovq rA, D(rB)
mrmovq D(rB), rA
OPq rA, rB           addq, subq, andq, xorq
jXX Dest             jmp, jle, jl, je, jne, jge, jg
call Dest
ret
pushq rA
popq rA
Encoding Opcodes
• 27 instructions, so we need 5 bits for encoding the opcode
• Alternatively, group the instructions into categories: e.g., 12 categories, so 4 bits
• There are four instructions within the OPq category, so 2 additional bits. Similarly, 3 more bits each for jXX and cmovXX.
• Which one is better?
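The bit counts in the two schemes above can be checked with a short calculation. This is only a sketch: `bits_needed` is a hypothetical helper, and the instruction counts (27 total, 12 categories, 4 OPq variants, 7 each for jXX and cmovXX) are taken from the instruction list in the slides.

```python
def bits_needed(n_choices: int) -> int:
    """Smallest number of bits that can distinguish n_choices distinct values."""
    return (n_choices - 1).bit_length()

# Scheme 1: one flat opcode field covering all 27 instructions.
flat_bits = bits_needed(27)

# Scheme 2: a category field plus a per-category function field.
category_bits = bits_needed(12)
sub_bits = {
    "OPq": bits_needed(4),     # addq, subq, andq, xorq
    "jXX": bits_needed(7),     # jmp, jle, jl, je, jne, jge, jg
    "cmovXX": bits_needed(7),  # rrmovq, cmovle, cmovl, cmove, cmovne, cmovge, cmovg
}
worst_case_two_level = category_bits + max(sub_bits.values())

print("flat:", flat_bits)
print("category:", category_bits, "sub-fields:", sub_bits)
print("two-level worst case:", worst_case_two_level)
```

The flat scheme is more compact, while the two-level scheme spends more bits but keeps related instructions under a shared prefix, which is the trade-off the question on the slide is asking about.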
Encoding Opcodes
Byte                 0
halt                 0 0
nop                  1 0
cmovXX rA, rB        2 fn
irmovq V, rB         3 0
rmmovq rA, D(rB)     4 0
mrmovq D(rB), rA     5 0
OPq rA, rB           6 fn
jXX Dest             7 fn
call Dest            8 0
ret                  9 0
pushq rA             A 0
popq rA              B 0
• Design decision chosen by the textbook authors (it doesn't have to be this way!)
  • Use 4 bits to encode the instruction category
  • Another 4 bits to encode the specific instruction within a category
  • So 1 byte for encoding the opcode
  • Is this better than the alternative of using 5 bits?
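The one-byte opcode layout (4-bit category, 4-bit function) can be sketched as a pair of pack/unpack helpers. The names `ICODE`, `pack_opcode`, and `unpack_opcode` are illustrative, not part of any Y86-64 tool; the category values are the ones listed in the slides.

```python
# Category codes (high nibble of byte 0), as listed in the slides.
ICODE = {
    "halt": 0x0, "nop": 0x1, "cmovXX": 0x2, "irmovq": 0x3,
    "rmmovq": 0x4, "mrmovq": 0x5, "OPq": 0x6, "jXX": 0x7,
    "call": 0x8, "ret": 0x9, "pushq": 0xA, "popq": 0xB,
}

def pack_opcode(icode: int, fn: int = 0) -> int:
    """Combine a 4-bit category and a 4-bit function code into one byte."""
    if not (0 <= icode <= 0xF and 0 <= fn <= 0xF):
        raise ValueError("icode and fn must each fit in 4 bits")
    return (icode << 4) | fn

def unpack_opcode(byte: int) -> tuple:
    """Split an opcode byte back into (category, function)."""
    return byte >> 4, byte & 0xF

# xorq is function 3 within the OPq category, so its opcode byte is 0x63.
print(hex(pack_opcode(ICODE["OPq"], 3)))
print(unpack_opcode(0x70))  # jmp: category 7, function 0
```

Because both fields are whole nibbles, the category and function can be read directly off the hex digits of the first byte.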
Encoding Registers
Each register has a 4-bit ID
• Same encoding as in x86-64
• Register ID 15 (0xF) indicates “no register”
%rax 0 %r8 8
%rcx 1 %r9 9
%rdx 2 %r10 A
%rbx 3 %r11 B
%rsp 4 %r12 C
%rbp 5 %r13 D
%rsi 6 %r14 E
%rdi 7 No Register F
Encoding Registers
Byte                 0 1 2 3 4 5 6 7 8 9
halt                 0 0
nop                  1 0
cmovXX rA, rB        2 fn   rA rB
irmovq V, rB         3 0    F rB
rmmovq rA, D(rB)     4 0    rA rB
mrmovq D(rB), rA     5 0    rA rB
OPq rA, rB           6 fn   rA rB
jXX Dest             7 fn
call Dest            8 0
ret                  9 0
pushq rA             A 0    rA F
popq rA              B 0    rA F
Instruction Example
Addition Instruction
• Assembly Form: addq rA, rB
• Encoded Representation: 6 0 rA rB
And
andq rA, rB    6 2 rA rB
Exclusive-Or
xorq rA, rB    6 3 rA rB
Move Instructions
Byte                 0 1 2 3 4 5 6 7 8 9
halt                 0 0
nop                  1 0
cmovXX rA, rB        2 fn   rA rB
irmovq V, rB         3 0    F rB    V (bytes 2-9)
rmmovq rA, D(rB)     4 0    rA rB   D (bytes 2-9)
mrmovq D(rB), rA     5 0    rA rB   D (bytes 2-9)
OPq rA, rB           6 fn   rA rB
jXX Dest             7 fn
call Dest            8 0
ret                  9 0
pushq rA             A 0    rA F
popq rA              B 0    rA F
• The instruction length limits the immediate value and displacement.
irmovq $0xabcd, %rdx
Encoding: 30 f2 cd ab 00 00 00 00 00 00
rrmovq %rsp, %rbx
Encoding: 20 43
mrmovq -12(%rbp), %rcx
Encoding: 50 15 f4 ff ff ff ff ff ff ff
rmmovq %rsi, 0x41c(%rsp)
Encoding: 40 64 1c 04 00 00 00 00 00 00
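The encodings above can be reproduced with a small encoder. This is a sketch, not an actual Y86-64 assembler: the function names and argument order are made up for illustration, and the 8-byte little-endian two's-complement layout of V and D is inferred from the byte sequences on the slide. The `20 43` encoding corresponds to `rrmovq %rsp, %rbx` under the register table (rA = 4, rB = 3).

```python
import struct

# Register IDs from the "Encoding Registers" slide; 0xF means "no register".
REG = {
    "%rax": 0x0, "%rcx": 0x1, "%rdx": 0x2, "%rbx": 0x3,
    "%rsp": 0x4, "%rbp": 0x5, "%rsi": 0x6, "%rdi": 0x7,
    "%r8": 0x8, "%r9": 0x9, "%r10": 0xA, "%r11": 0xB,
    "%r12": 0xC, "%r13": 0xD, "%r14": 0xE,
}
NO_REG = 0xF

def reg_byte(ra: int, rb: int) -> bytes:
    """Pack two 4-bit register IDs into the register-specifier byte."""
    return bytes([(ra << 4) | rb])

def const8(value: int) -> bytes:
    """8-byte little-endian two's-complement constant (V or D)."""
    return struct.pack("<q", value)

def irmovq(v, rb):
    return bytes([0x30]) + reg_byte(NO_REG, REG[rb]) + const8(v)

def rrmovq(ra, rb):
    return bytes([0x20]) + reg_byte(REG[ra], REG[rb])

def mrmovq(d, rb, ra):  # mrmovq D(rB), rA
    return bytes([0x50]) + reg_byte(REG[ra], REG[rb]) + const8(d)

def rmmovq(ra, d, rb):  # rmmovq rA, D(rB)
    return bytes([0x40]) + reg_byte(REG[ra], REG[rb]) + const8(d)

def show(data: bytes) -> str:
    return " ".join(f"{b:02x}" for b in data)

print(show(irmovq(0xabcd, "%rdx")))
print(show(rrmovq("%rsp", "%rbx")))
print(show(mrmovq(-12, "%rbp", "%rcx")))
print(show(rmmovq("%rsi", 0x41c, "%rsp")))
```

Note how the negative displacement -12 becomes `f4 ff ff ff ff ff ff ff`: the low byte comes first and the sign is extended across all eight bytes.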
Jump/Call Instructions
Byte                 0 1 2 3 4 5 6 7 8 9
halt                 0 0
nop                  1 0
cmovXX rA, rB        2 fn   rA rB
irmovq V, rB         3 0    F rB    V (bytes 2-9)
rmmovq rA, D(rB)     4 0    rA rB   D (bytes 2-9)
mrmovq D(rB), rA     5 0    rA rB   D (bytes 2-9)
OPq rA, rB           6 fn   rA rB
jXX Dest             7 fn   Dest (bytes 1-8)
call Dest            8 0    Dest (bytes 1-8)
ret                  9 0
pushq rA             A 0    rA F
popq rA              B 0    rA F
• The assembler assumes a start address for the program and then calculates the address of each instruction.
OPq rA, rB    6 fn rA rB
        addq %rax,%rsi              60 06
        call <foo>                  80 00 01 00 00 00 00 00 00
        jmp .L0                     70 00 02 00 00 00 00 00 00
        … …
0x200   .L0 irmovq $0xabcd, %rdx    30 f2 cd ab 00 00 00 00 00 00
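How an assembler could arrive at the `Dest` bytes for `jmp .L0` can be sketched as a pass that assigns addresses followed by an encoding step. Everything here is illustrative: `assign_addresses` and `encode_dest` are hypothetical helpers, and the start address 0x1ec is an assumption chosen so that `.L0` lands at 0x200 with no instructions in between (the slide elides what sits between `jmp` and `.L0`).

```python
# Instruction lengths in bytes, following the encoding table in the slides.
LENGTH = {
    "halt": 1, "nop": 1, "ret": 1,
    "rrmovq": 2, "addq": 2, "pushq": 2, "popq": 2,
    "jmp": 9, "call": 9,
    "irmovq": 10, "rmmovq": 10, "mrmovq": 10,
}

def assign_addresses(program, start=0):
    """program is a list of (label or None, mnemonic) pairs.

    Returns the address of each instruction and a label -> address table.
    """
    addresses, symbols = [], {}
    pc = start
    for label, mnemonic in program:
        if label is not None:
            symbols[label] = pc
        addresses.append(pc)
        pc += LENGTH[mnemonic]
    return addresses, symbols

def encode_dest(opcode: int, target: int) -> bytes:
    """Opcode byte followed by the 8-byte little-endian absolute address."""
    return bytes([opcode]) + target.to_bytes(8, "little")

program = [
    (None, "addq"),     # addq %rax,%rsi
    (None, "call"),     # call <foo>
    (None, "jmp"),      # jmp .L0
    (".L0", "irmovq"),  # .L0 irmovq $0xabcd, %rdx
]
addresses, symbols = assign_addresses(program, start=0x1EC)  # assumed start
jmp_bytes = encode_dest(0x70, symbols[".L0"])

print([hex(a) for a in addresses])
print(" ".join(f"{b:02x}" for b in jmp_bytes))
```

The label is only known once every preceding instruction has been sized, which is why the address calculation has to happen before the jump can be encoded.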
Jump Instructions
Jump Unconditionally
jmp Dest 7 0 Dest
ret 9 0
call Dest    8 0 Dest (essentially the start address of the callee), e.g., call foo
• The instruction length limits how far you can jump or call. What if the target has a very long address that can't fit in 8 bytes?
  • One alternative: use a very long instruction encoding format.
    • Simple to encode, but space-inefficient (wastes bits on jumps to short addresses)
  • Another alternative: encode the relative address, not the absolute address
    • E.g., encode (.L4 - current address) in Dest
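The space argument for relative addressing can be illustrated with a small calculation. This is a sketch of a hypothetical relative encoding, not how Y86-64 encodes `Dest` (the slides show absolute addresses); the addresses below are invented for the example. The offset is measured from the current instruction, as on the slide, whereas x86-64 measures it from the following instruction.

```python
def min_signed_bytes(value: int) -> int:
    """Fewest whole bytes that can hold value in two's complement."""
    n = 1
    while not (-(1 << (8 * n - 1)) <= value < (1 << (8 * n - 1))):
        n += 1
    return n

def relative_dest(target: int, current: int) -> int:
    """Signed distance from the current instruction to the jump target."""
    return target - current

current = 0x4005F7       # hypothetical address of the jump instruction
forward = 0x400620       # hypothetical .L4 a little further ahead
backward = 0x4005D0      # hypothetical loop head just behind

for target in (forward, backward):
    offset = relative_dest(target, current)
    print(hex(target), "absolute bytes:", min_signed_bytes(target),
          "relative offset:", offset, "bytes:", min_signed_bytes(offset))
```

Nearby targets, which are the common case for loops and branches, need only a byte or so of offset regardless of how large the absolute addresses are.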
• What if you want to jump really far away from the current instruction?
  • Indirect jump: use a combination of absolute + relative addresses (“far jumps” in x86). Elegant design.
Stack Operations
pushq rA A 0 rA F
• Decrement %rsp by 8
• Store word from rA to memory at %rsp
• Like x86-64
popq rA B 0 rA F
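The effect of the two stack instructions on %rsp and memory can be sketched with a toy machine model. The `Machine` class and its dictionary-based memory are hypothetical. The `pushq` steps follow the bullets above; the slide lists no steps for `popq`, so the sketch assumes the usual x86-64 behavior (read the word at %rsp, then increment %rsp by 8).

```python
class Machine:
    """Toy model: a register file plus sparse word-addressed memory."""

    def __init__(self, rsp: int):
        self.regs = {"%rsp": rsp}
        self.mem = {}  # address -> 8-byte word

    def pushq(self, ra: str) -> None:
        value = self.regs[ra]            # read rA before %rsp changes
        self.regs["%rsp"] -= 8           # decrement %rsp by 8
        self.mem[self.regs["%rsp"]] = value  # store word at new %rsp

    def popq(self, ra: str) -> None:
        value = self.mem[self.regs["%rsp"]]  # read word at %rsp
        self.regs["%rsp"] += 8               # increment %rsp by 8
        self.regs[ra] = value                # save word in rA

m = Machine(rsp=0x200)
m.regs["%rax"] = 0xABCD
m.pushq("%rax")
rsp_after_push = m.regs["%rsp"]
m.popq("%rdx")
print(hex(rsp_after_push), hex(m.regs["%rdx"]), hex(m.regs["%rsp"]))
```

A push followed by a pop leaves %rsp where it started and copies the pushed value into the popped register, which is the invariant call and ret rely on.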
Miscellaneous Instructions
nop 1 0
• Don’t do anything
halt 0 0