
An Introduction To Computer Architecture


EEE 415 July 2021

Course Instructor(s):
Section A: Dr. Sajid Muhaimin Choudhury, Assistant Professor
Email: [email protected]
Office: ECE222, ECE Building

Section B: Dr. Md. Zunaid Baten, Associate Professor


Email: [email protected]

Section C: Mr. Sadman Sakib Ahbab, Lecturer


Email: [email protected]

N.B. the following slides were prepared by Dr. Sajid Muhaimin Choudhury; slight modifications have been made by Dr. Md. Zunaid Baten

EEE 415 - Department of EEE, BUET © 2021 Dr. Sajid Muhaimin Choudhury

EEE 415
Course Information


EEE 415 - Old Syllabus


Basic components of a computer system. Simple-As-Possible (SAP) computer: SAP-1, selected concepts from
SAP-2 and SAP-3 (jump, call, return, stack, push and pop). Evolution of microprocessors.
Introduction to Intel 8086 microprocessor: features, architecture, Minimum mode operation of 8086
microprocessor: system timing diagrams of read and write cycles, memory banks, design of decoders for RAM,
ROM and PORT.
Introduction to Intel 8086 Assembly Language Programming: basic instructions, logic, shift and rotate
instructions, addressing modes, stack management and procedures, advanced arithmetic instructions for
multiplication and division, instructions for BCD and double precision numbers, introduction to 8086
programming with C language.
Hardware Interfacing with Intel 8086 microprocessor: programmable peripheral interface, programmable
interrupt controller, programmable timer, serial communication interface, keyboard and display interface (LED,
7 segment, dot matrix and LCD).


EEE 415 - New Syllabus


• Fundamentals of microprocessor and computer design, processor data path, architecture, microarchitecture,
complexity, metrics, and benchmark; Instruction Set Architecture, introduction to CISC and RISC, Instruction-Level
Parallelism, pipelining, pipelining hazards and data dependency, branch prediction, exceptions and limits, super-pipelined
vs superscalar processing; Memory hierarchy and management, Direct Memory Access, Translation Lookaside Buffer;
cache, cache policies, multi-level cache, cache performance; Multicore computing, message passing, shared memory,
cache-coherence protocol, memory consistency, paging, Vector Processor, Graphics Processing Unit, IP Blocks, Single
Instruction Multiple Data and SoC with microprocessors. Simple Arm/RISC-V based processor design with Verilog HDL.
• Introduction to embedded systems design, software concurrency and Real-Time Operating Systems, Arm Cortex M /
RISC-V microcontroller architecture, registers and I/O, memory map and instruction sets, endianness and image,
Assembly language programming of Arm Cortex M / RISC-V based embedded microprocessors (jump, call-return, stack,
push and pop, shift, rotate, logic instructions, port operations, serial communication and interfacing), system clock,
exceptions and interrupt handling, timing analysis of interrupts, general purpose digital interfacing, analog interfacing,
timers: PWM, real-time clock, serial communication, SPI, I2C, UART protocols, Embedded Systems for Internet of
Things (IoT)


Introduction to OBE
• In this course, we are going to follow the approach of Outcome
Based Education (OBE)
• Outcome-based education (OBE) is an educational theory that
bases each part of an educational system around goals
(outcomes). By the end of the educational experience, each
student should have achieved the goal

BUGS EEE - Introducing Outcome Based Education

Courtesy: Dr. Yasin, IQAC OBE Workshop

Program Objectives of Dept of EEE BUET
PO1 Engineering Knowledge: Apply knowledge of mathematics, science, and engineering to solve complex electrical and electronic engineering problems. (*K1 to K4).
PO2 Problem Analysis: Identify, formulate, research literature, interpret data, and analyze complex electrical and electronic engineering problems using principles of
mathematical, natural and engineering sciences. (K1 to K4).

PO3 Design/development Solution: Design solutions to complex engineering problems and design systems, components, or processes that meet the needs relevant
to electrical and electronic engineering with appropriate considerations to public health and safety, cultural, societal, and environmental considerations. (K5).

PO4 Investigation: Conduct investigations of complex problems using research-based knowledge and research methods including design of experiments, analysis
and interpretation of data, and synthesis of information to provide valid conclusions. (K8).

PO5 Modern tool usage: Use techniques, skills, and modern engineering tools to solve complex and practical engineering problems related to electrical and
electronic engineering with understanding of the limitations. (K6).

PO6 The Engineer and Society: Apply reasoning to assess societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to
professional engineering practice and solutions to complex engineering problems. (K7).

PO7 Environment and sustainability: Understand and evaluate the sustainability and impact of professional engineering work in the solution of
complex engineering problems in societal and environmental contexts. (K7).
PO8 Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of engineering practice. (K7).
PO9 Individual work and team work: Function effectively as an individual, and as a member or leader in diverse teams and in multi-disciplinary settings.
PO10 Communication: Communicate effectively on complex engineering activities with the electrical and electronic engineering and other inter-disciplinary communities
and with society at large, such as being able to comprehend and write effective reports and design documentation, make effective presentations, and give and receive
clear instructions.

PO11 Project management and finance: Demonstrate knowledge and understanding of engineering management principles and economic decision-making and
apply these to one's own work, as a member and leader in a team, to manage projects and in multidisciplinary environments.
PO12 Life-long Learning: Recognize the need for, and ability to engage in life-long learning


Course Outcome
• Syllabus of each course needs to be modified to include
specific
Course Outcomes (CO)
• Each Course will have several COs
• Each CO will be mapped with 1 or more PO
• Each CO must be evaluated through assessments (CTs, Term
final Exam questions, Presentation etc)
• The marks obtained on each exam question are used to calculate how far
each student has attained each CO
• The CO-PO mapping then shows how far each student has attained each PO

Course Outcomes of EEE 415


After completing this course, the students will be able to -

CO No. | CO Statement | Corresponding PO(s)* | Domains and Taxonomy level(s)** | Delivery Method(s) and Activity(-ies) | Assessment Tool(s)
CO1 | Explain the architecture, instruction set, memory and input/output interface of an Arm microprocessor and different principles of Embedded Systems | PO1 | C4 | Lectures, Handouts | Class test, Final exam
CO2 | Construct simple microprocessor systems using state-of-the-art tools like the Arm Assembly compiler and Verilog HDL, understanding the limitations | PO7 | C3 | Lectures, Handouts | Assignment
CO3 | Illustrate emerging technologies and trends in microprocessor design to recognize the need to always learn the state of the art | PO12 | C2 | -- | Video Presentation, Report

Cognitive Domain Taxonomy Levels: C1: Remember; C2: Understand; C3: Apply; C4: Analyse; C5: Evaluate; C6: Create

Grading Policy:
1. Class Attendance – Class participation and attendance will be recorded in every class. 30 Marks as per university AC
policy

2. Continuous Assessment –
• Assignment and/or (video) presentation
• Class tests

3. Final Examination – A comprehensive term final examination will be held at the end of the Term following the
guideline of the Academic Council

Distribution of Marks
Class Attendance – 10%
Continuous Assessment – 20%
Final Examination – 70%


Textbooks

[Harris] Sarah Harris and David Harris, “Digital Design and Computer Architecture, ARM
Edition,” Morgan Kaufmann (2015)

[Patterson] David A. Patterson and John L. Hennessy, “Computer Organization and Design:
The Hardware/Software Interface, ARM Edition,” Morgan Kaufmann

[Zhu] Yifeng Zhu, “Embedded Systems with ARM Cortex-M Microcontrollers in
Assembly Language and C”

This Course
• Microprocessor Part:
• Aims to introduce the key concepts and ideas in computer architecture
• Explores the design of modern microprocessors
• Examines important trends and current and future challenges

• Embedded Systems Part:


• Aims to introduce basic concepts of Embedded System design
• Introduces Basic embedded system interfacing and design


Weekly Scheduled Course Plan


Week | Lectures | Topic | Textbook
1 | 1-3 | Fundamentals of microprocessor and computer design, processor data path, architecture, microarchitecture, introduction to CISC and RISC, complexity, metrics, and benchmark | Patterson 1
2 | 4-6 | Arm Assembly, Micro Architecture | Harris 6.2, Harris 7
3 | 7-9 | |
4 | 10-12 | |
5 | 13-15 | Architecture | Harris 6.3-
6 | 16-18 | Memory | Harris 8


Fundamentals of
Microprocessor and
Computer Architecture

Week Content
• What is computer architecture?
• Computer architecture arena and design goals
• Historical performance of computer architecture
• Future trends with multicore processors,
systems on chip (SoCs), and beyond

A system on a chip (SoC), also written system-on-a-chip or system-on-chip, is an integrated circuit that integrates all or most components of
a computer or other electronic system. These components almost always include a central processing unit (CPU), memory interfaces, and
on-chip input/output devices. Other components, such as radio modems and a graphics processing unit (GPU), may also be present, all on a
single substrate or microchip.

EEE 415 - Department of EEE, BUET © 2019 Arm Limited
In the beginning...

The Difference Engine ENIAC


Physical configuration specifies the computation
EDSAC was the second electronic digital stored-program
computer to go into regular service

Introduction
• The modern computer is less than 100 years old.
• The first electromechanical and valve-based machines
were produced in the 1930s and 1940s.
• Today’s machines are many orders of magnitude faster,
lower power, more reliable, and cheaper.

A single-board computer (SBC) is a complete computer built on a single circuit
board, with microprocessor(s), memory, input/output (I/O) and other features
required of a functional computer.

Figures: EDSAC replica (2018)¹; Raspberry Pi 2; Arm Cortex-M0²

1. EDSAC photo, CC BY-SA 4.0
2. The courier mail, CC BY-SA 4.0
Introduction
• The modern microprocessor is often the engine behind
how we communicate and work.
• It has helped to create our digital world where
communication, computation, and storage are almost free.
• It underpins many scientific breakthroughs and can help
us make better use of the world’s limited resources.

The ARM Cortex-M is a group of 32-bit RISC ARM processor cores licensed by ARM. These cores
are optimized for low-cost and energy-efficient integrated circuits, which have been embedded
in tens of billions of consumer devices.

Figures: Die photo of Arm Cortex-M3 microcontroller with 16KB flash memory and 4KB RAM¹; Google Data Center²

1. By Zeptobars, CC BY 3.0
2. By Connie Zhou, CC BY-NC 4.0
Orientation

The internet
Motherboard

Computer Motherboard

Figure labels: Power, Memory
The processors go here…

Processors are Cool!
• Chips are made of silicon
• Aka “sand”
• The second most abundant element in the Earth’s crust, after oxygen
• Extremely pure (<1 part per billion of impurities)
• This is the purest stuff people make
Building Chips

Building Chips

Photolithography (figure panels, in order):
1. Grow silicon dioxide (SiO2) on the silicon wafer
2. Apply photoresist
3. Expose to UV through a mask
4. Patterned resist
5. Etch SiO2 (or not)
6. Deposit metal
Building Blocks: Transistors

Building Blocks: Wires
State of the Art CPU
The Apple M1 Pro and M1 Max are systems-on-chip (SoCs) designed by Apple Inc.
for the MacBook Pro laptop series and the Mac Studio desktop series, based on
the licensed ARM architecture.
Major Players in the Microprocessor world

Major Players in the Microprocessor world
Advanced RISC Machines
Ltd., now known as ARM

Introducing Abstraction

Abstractions of the Physical World…

Figure labels: Physics/Materials; Devices; Micro-architecture; Processors; Architectures. Part of this stack is marked “This Course”.
…for the Rest of the System

Figure labels: CSE; JVM; Processor Architectures; Abstraction; Compilers; Languages; Software Engineers/Applications.
Levels of Abstraction
• Architecture
• A set of specifications that allows developers to write software and firmware
• These include the instruction set.
• Microarchitecture
• The logical organization of the inner structure of the computer
• Hardware or Implementation
• The realization or the physical structure, i.e., logic design and chip packaging


Why Study Architecture?


• Get a fundamental understanding of computers
• Culmination and practical application of all things learned from EEE 201, EEE 207, EEE 303, EEE 313

• It’s cool!
• Microprocessors are among the most sophisticated devices manufactured by people
• How they work (and even that they work) as reliably and as quickly as they do is amazing.

• Architecture is undergoing a revolution


• Modern application-specific SoC design is on the rise (AI, crypto, big data)
• Custom silicon design offers major job opportunities and economic prospects
The Computer Architecture Arena

Figure: computer architecture shown together with Technology, Markets, Application characteristics, and New applications.

Source: “Early 21st Century Processors,” S. Vajapeyam and M. Valero, IEEE Computer, April 2004
Design Goals
• Functional – hardware is hard to correct (unlike software). Verification is perhaps the highest single cost in the design
process. We also need to test our chips once they have been manufactured; this too can be a costly
process and requires careful thought at the design stage

• Performance – what does this mean? No single best answer, e.g., sports car vs. off-road 4x4 vehicle –
performance will always depend on the “workload”

• Power – a first-order design constraint for most designs today. Power limits the performance of most
systems.

• Security – e.g., the ability to control access to sensitive data or prevent carefully crafted malicious inputs
from hijacking control of the processor

• Cost – design cost (complexity), die costs (i.e., the size or area of our chip), packaging, etc.

• Reliability – do we need to try to detect and/or tolerate faults during operation?

Markets and Features
Each target market will require a different trade-off in terms of power consumption, cost,
area, performance, security, reliability, etc.
Here are some example processor classes (different families) from Arm:
• Cortex-A: high-performance application processors, e.g., for mobile phones
• Cortex-R: deterministic real-time performance, fault detection, and tolerance.
• Cortex-M: energy-efficient embedded devices (“microcontroller” class cores)
• Neoverse: scalable networks of processors on a single chip
• e.g., 8, 16, 64, or 128 cores. Used in datacenters, edge servers, and storage

The ARM Cortex-A and Cortex-R are groups of 32-bit and 64-bit RISC ARM processor cores licensed by Arm.
The ARM Cortex-M is a group of 32-bit RISC ARM processor cores licensed by Arm.

Arm Neoverse is a server chip microarchitecture that ARM's customers — the big chipmakers of the
world — can design chips around for servers in the big datacenters that power the internet.

The Smartphone
• A single smartphone will contain many different processor cores.
• Why not use a single processor?

Figure: smartphone subsystems, each labeled with the Arm cores it uses (Cortex-A, Cortex-R, and Cortex-M): camera, sensor hub, touchscreen & sensor hub, power management, flash controller, apps processor, GPS, 2G/3G/4G/5G, Bluetooth, Wi-Fi.
Architecture
• The Architecture is a contract between the hardware and the software.
• The hardware defines a set of operations, their semantics, and rules for their use.
• The software agrees to follow these rules.
The Stored Program Computer

• A very simple model
• Several questions
  • How are programs represented?
  • How do we get algorithms out of our brains and into that representation?
  • How does the computer interpret a program?

Figure: a Processor, IO, and a Memory that holds both Data and Program.
From Brain to Bits
Your brain
  → (Brain/Fingers/SWE) →
Programming language (C, C++, Java)
  → (Compiler) →
Assembly language
  → (Assembler) →
Machine code (e.g., .o files)
  → (Linker) →
Executable (e.g., .exe files)
C Code

int i;
int sum = 0;
int j = 4;

for (i = 0; i < 10; i++) {
    sum = sum + i * j;   /* matches t1 = i * j; sum = sum + t1 on the next slide */
}
In the Compiler
Control flow graph w/ high-level instructions  |  Control flow graph w/ real instructions

sum = 0           addi $s0, $zero, 0
j = 4             addi $s1, $zero, 4
i = 0             addi $s2, $zero, 0

i < 10?           addi $t0, $zero, 10
                  bge  $s2, $t0
(edges: false / true)   (edges: true / false)

t1 = i * j        mult $t0, $s1, $s2
sum = sum + t1    add  $s0, $t0
i++;              addi $s2, $s2, 1

...               ...
Out of the Compiler
Left: control flow graph with real instructions. Right: the resulting assembly language:

        addi $s0, $zero, 0
        addi $s1, $zero, 4
        addi $s2, $zero, 0
top:
        addi $t0, $zero, 10
        bge  $s2, $t0, after
body:
        mult $t0, $s1, $s2
        add  $s0, $t0
        addi $s2, $s2, 1
        br   top
after:
        ...
Program Execution
• This is the algorithm for a stored-program computer
• The Program Counter (PC) is the key

Instruction Fetch: Read instruction from program storage
Instruction Decode: Determine required actions and instruction size
Operand Fetch: Locate and obtain operand data
Execute: Compute result value
Result Store: Deposit results in storage for later use
Next Instruction: Determine successor instruction (i.e., compute next PC).
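The execution steps on this slide can be sketched as a toy interpreter in C. This is only an illustration under stated assumptions: the two-byte instruction format, the opcodes (OP_LOAD, OP_ADD, OP_STORE, OP_HALT), the Machine struct, and the run function are all hypothetical and are not Arm or MIPS. Program and data share one memory array, as in the stored-program model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_SIZE 32

/* Hypothetical toy opcodes; each instruction is two bytes: opcode, address. */
enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, OP_STORE = 3 };

typedef struct {
    uint8_t mem[MEM_SIZE]; /* one memory holds both program and data */
    uint8_t pc;            /* program counter: address of the next instruction */
    uint8_t acc;           /* single accumulator register */
} Machine;

/* Runs until OP_HALT, an unknown opcode, or max_steps; returns instructions fetched. */
size_t run(Machine *m, size_t max_steps)
{
    size_t steps = 0;
    while (steps < max_steps) {
        assert(m->pc + 1 < MEM_SIZE);
        /* Instruction fetch */
        uint8_t opcode = m->mem[m->pc];
        uint8_t addr   = m->mem[m->pc + 1];
        steps++;
        /* Instruction decode */
        if (opcode == OP_HALT || opcode > OP_STORE)
            break;
        /* Operand fetch */
        assert(addr < MEM_SIZE);
        uint8_t operand = m->mem[addr];
        /* Execute, then result store */
        switch (opcode) {
        case OP_LOAD:  m->acc = operand; break;
        case OP_ADD:   m->acc = (uint8_t)(m->acc + operand); break;
        case OP_STORE: m->mem[addr] = m->acc; break;
        }
        /* Next instruction: every toy instruction is 2 bytes, no branches */
        m->pc = (uint8_t)(m->pc + 2);
    }
    return steps;
}
```

For example, placing the bytes OP_LOAD 16, OP_ADD 17, OP_STORE 18, OP_HALT 0 at address 0, with mem[16] = 2 and mem[17] = 3, and calling run leaves 5 in mem[18]. A real processor would also update the PC differently for branches, which this sketch leaves out.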

Historical Performance

Historical Performance Gains
• The “iron law” of processor performance:

Time = instructions executed x clocks per instruction (CPI) x clock period

• Clocks per instruction (CPI)


• We will also refer to Instructions Per Cycle (IPC), i.e., 1/CPI.
• Early machines were limited by transistor count. As a result, they often required
multiple clock cycles to execute each instruction (CPI >> 1).
• As transistor budgets improved, we could aim to get closer to a CPI of 1.
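The iron law can be written as a small C helper. This is a minimal sketch; the function names exec_time_s and ipc_from_cpi are illustrative, and the clock period is assumed to be given as a frequency in Hz (period = 1 / frequency).

```c
#include <assert.h>

/* Execution time in seconds: instructions x CPI x clock period. */
double exec_time_s(double instructions, double cpi, double clock_hz)
{
    assert(instructions >= 0.0 && cpi > 0.0 && clock_hz > 0.0);
    return instructions * cpi / clock_hz; /* clock period = 1 / clock_hz */
}

/* IPC is the reciprocal of CPI. */
double ipc_from_cpi(double cpi)
{
    assert(cpi > 0.0);
    return 1.0 / cpi;
}
```

For instance, a program of 1e9 instructions at a CPI of 2 on a 1 GHz clock takes 2 seconds; halving the CPI or doubling the clock frequency would each halve that time.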

Moore’s Law
• Moore’s Law predicts that the number of
transistors we can integrate onto a chip, for
the same cost, doubles every 2 years.

Gordon Moore and Robert Noyce at Intel in 1970


Source: IntelFreePress, CC BY-SA-2.0
Moore’s Law
• Processor transistor budgets grew
quickly as microarchitectures became
more complex.
• 1985 – Intel 386
275K transistors, die size = 43 mm²
• 2002 – Intel Pentium 4
42M transistors, die size = 217 mm²
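As a rough check of these two data points against a doubling every 2 years, the sketch below counts doublings. The helper names doublings_to_reach and project_transistors are illustrative, and the projection deliberately counts only whole doubling periods.

```c
#include <assert.h>

/* Counts how many doublings are needed to grow from start to at least target. */
int doublings_to_reach(double start, double target)
{
    assert(start > 0.0 && target > 0.0);
    int n = 0;
    while (start < target) {
        start *= 2.0;
        n++;
    }
    return n;
}

/* Projects a transistor count forward, counting only whole doubling periods. */
double project_transistors(double start, int years, int doubling_period_years)
{
    assert(start > 0.0 && years >= 0 && doubling_period_years > 0);
    int doublings = years / doubling_period_years;
    double count = start;
    for (int i = 0; i < doublings; i++)
        count *= 2.0;
    return count;
}
```

Going from 275K to 42M transistors takes between 7 and 8 doublings (275K × 2^7 = 35.2M, 275K × 2^8 = 70.4M). Over the 17 years from 1985 to 2002 that is one doubling roughly every 2.1 to 2.4 years, close to the 2-year figure.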

Source: Wgsimon, Wikipedia, CC BY-SA 3.0
Clocks Per Instruction (CPI)
• Eventually, the industry was also able to fetch and execute multiple instructions per
clock cycle. This reduced CPI to below 1.
• When we fetch and execute multiple instructions together, we often refer to
Instructions Per Cycle (IPC), which is 1/CPI.
• For instructions to be executed at the same time, they must be independent.
• Again, growing transistor budgets were exploited to help find and exploit this
Instruction-Level Parallelism (ILP).

Parallelism and Pipelining
Figure: two timing diagrams of a car assembly line with stages Chassis, Engine, and Paint. Each diagram plots cars A, B, and C against time, in order of manufacturing (car A, then B, then C).
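The benefit of pipelining the car assembly line can be estimated with a short calculation. This is a sketch under simplifying assumptions that are not stated on the slide: every stage takes the same time and the pipeline never stalls. The function names are illustrative.

```c
#include <assert.h>

/* Total time when each item finishes all stages before the next one starts. */
unsigned long sequential_time(unsigned long items, unsigned long stages,
                              unsigned long stage_time)
{
    return items * stages * stage_time;
}

/* Total time when a new item enters the first stage every stage_time. */
unsigned long pipelined_time(unsigned long items, unsigned long stages,
                             unsigned long stage_time)
{
    assert(stages > 0);
    if (items == 0)
        return 0;
    return (stages + items - 1) * stage_time;
}
```

With three cars and three stages of one time unit each, building the cars one after another takes 9 units, while the pipelined line takes 5. The time for a single car is unchanged at 3 units; only throughput improves.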

IPC and Instruction Count
• From 1985 to 2002, performance improved by ~800 times.
• Clock frequency improved quickly between 1985 and 2002:
  • ~10x from smaller, faster transistors, and
  • ~10x from pipelining and circuit-level advances.
• The remaining gains (~8x) were from a reduction in instruction count, better compiler optimizations, and improvements in IPC.

Figure: plot with axis “Clock Frequency (MHz)”.

A. Danowitz, K. Kelley, J. Mao, J. P. Stevenson, and M. Horowitz.
Clock Frequency, Stanford CPU DB. Accessed on Nov. 5, 2019.
[Online]. Available: http://cpudb.stanford.edu/visualize/clock_frequency
Slowing Single-core Performance Gains
Sustaining single core performance gains became difficult due to:
• The limits of pipelining
• The limits of Instruction-Level Parallelism (ILP)
• Power consumption
• The performance of on-chip wires
As a result, performance gains slowed from 52% to 21% per year for the highest
performance processors.
• Dynamic power: $P = \alpha f C V^{2}$, where $f$ is the clock frequency, $C$ the capacitance, $V$ the supply voltage, and $\alpha$ the fraction of time
the transistors switch (the activity factor).
• Dennard scaling predicted that power density would remain constant as transistor size decreases. That would have
allowed us to increase $f$ (as $V$ decreases with transistor scaling).
• In reality, however, short-channel effects and other issues in small transistors increase power consumption,
therefore limiting $f$.
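The power relation on this slide is easy to evaluate numerically. The sketch below is illustrative only: the function name dynamic_power_w and the example values are assumptions, and it models switching power alone, ignoring leakage.

```c
#include <assert.h>

/* Dynamic switching power in watts: alpha * f * C * V^2. */
double dynamic_power_w(double alpha, double f_hz, double c_farads, double v_volts)
{
    assert(alpha >= 0.0 && alpha <= 1.0);
    return alpha * f_hz * c_farads * v_volts * v_volts;
}
```

With an activity factor of 0.5, a 1 GHz clock, 1 nF of switched capacitance, and a 1 V supply, this gives about 0.5 W. Halving the voltage cuts the power to a quarter, while doubling the frequency doubles it, which is why a falling supply voltage was so important for raising clock frequency.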

Multicore Processors
• Eventually, it made sense to shift from
single-core to multicore designs.
• From ~2005, multicore designs became
mainstream.
• The number of cores on a single chip
increased over time.
• Individual cores were designed to be as
power efficient as possible.

e.g., 4 x Arm Cortex-A72 processors,


each with their own L1 caches and a
shared L2 cache
Specialization
• Today, we often need to look beyond general-purpose programmable processors to
meet our design goals.
• We trade flexibility for efficiency.
• We remove the ability to run all programs and design for a narrow workload, perhaps
even a single algorithm.
• These “accelerators” can be 10-1000x better than a general-purpose solution in terms
of power and performance.

Specialization

Figures: Graphics Processing Unit (GPU); Neural Processor Unit (NPU)
Today’s SoC Designs
• A modern mobile phone SoC (2019) may contain more than 7 billion transistors.
• It will integrate:
  • Multiple processor cores
  • A GPU
  • A large number of specialized accelerators
  • Large amounts of on-chip memory
  • High bandwidth interfaces to off-chip memory

Figure: A high-level block diagram of a mobile phone SoC (blocks: memory interfaces, L3 cache memory, 4 “big” cores, 4 “small” cores, GPU, Neural Processor Unit (NPU), other accelerators).
Trends in Computer Architecture

Approach | Outcome
Early computers | Gains from bit-level parallelism
Pipelining | + Instruction-level parallelism
Multicore/GPUs | + Thread-level parallelism/data-level parallelism
Greater integration (large SoCs), heterogeneity, and specialization | + Accelerator-level parallelism

Time runs from the first row to the last.

Note: Memory hierarchy developments have also been significant. The
memory hierarchy typically consumes a large fraction of the transistor
budget.
The Future – The End of Moore’s Law?
• The end of Moore’s Law has been predicted many times.
• Scaling has perhaps slowed in recent years, but transistor density continues to improve.
• Eventually, 2D scaling will have to slow down.
• We are ultimately limited by the size of atoms!
• Where next?
• Going 3D - Future designs may take advantage of multiple layers of transistors on a single chip.
– Note: the gains are linear rather than exponential.
• Better packaging and integration technologies (e.g., chip stacking)
• New types of memory
• New materials and devices
