
EEE415 Week01 Introduction

This document provides an overview of the EEE 415 - Microprocessors and Embedded Systems course at Bangladesh University of Engineering and Technology. The course introduces students to microprocessor architecture, programming, and operating principles using ARM microprocessors. It also covers embedded system design and real-time operating systems. The document outlines the course objectives, instructors, and broader context of where this course fits within the electrical engineering curriculum.


6/3/2023

EEE 415 - Microprocessors and Embedded Systems

An Introduction to Computer Architecture
Week 1

Dr. Sajid Muhaimin Choudhury, Assistant Professor


Department of Electrical and Electronic Engineering
Bangladesh University of Engineering and Technology

EEE 415 - Microprocessors and Embedded Systems

Introducing the Course
Week 1
Lecture 1.1


Welcome to EEE 415


• Microprocessor and Embedded Systems

• The last Compulsory course with lab in your syllabus!

EEE 415 – Microprocessor and Embedded Systems Dr. Sajid Muhaimin Choudhury 3
Department of EEE, BUET
3

Course Objectives
• Illustrate the architecture, programming and operating principle of an ARM microprocessor
• Introduce microprocessor design using Verilog HDL
• Interpret assembly language programs by executing ARM instruction sets
• Introduce design of embedded systems and RTOS


EEE 415 July 2023 Course Teachers

Course Instructor(s):
Section A: Dr. Sajid Muhaimin Choudhury, Assistant Professor
Email: [email protected]
Office: ECE222, ECE Building
Website: http://sajid.buet.ac.bd/

Section B: Dr. Md. Zunaid Baten, Associate Professor
Email: [email protected]
Office: ECE122, ECE Building
Website: https://mdzunaid.buet.ac.bd/

Section C: Mr. Sadman Sakib Ahbab, Lecturer
Email: [email protected]
Office: ECE530, ECE Building
Website: https://eee.buet.ac.bd/faculty/details/sadman-sakib-ahbab

About Section Instructor


Dr. Sajid Muhaimin Choudhury
Education:
• BSc. Engg (2009), Department of EEE, BUET
• MSc. Engg (2011), Department of EEE, BUET
– Microstrip Patch Antenna (Hexaflake Patch Structure)
• PhD (2019), Purdue University, West Lafayette, IN, USA
– Teaching assistant for ECE 362 – Microprocessor and Interfacing
– Metasurface Hologram and Thermophotovoltaics
Career:
• Lecturer, IICT, BUET, Nov 2009-Jan 2010
• Lecturer, Department of EEE, BUET, 2010-2013
• Assistant Professor, Department of EEE, BUET, 2013 to date


About Section Instructor


Research Interest
• Plasmonics and Nanophotonics
• Plasmonic Biosensing
• High Speed Modulator
• Photocatalytic Water Splitting
• Quantum Computing
• Embedded Systems Design

Volunteering and Leadership Experience


• Founding Chair, IEEE Photonics Society Bangladesh Chapter
• Founding Chair, The Optical Society – OSA Bangladesh Section
• Founding Moderator, BUET Optical and Photonics Society
• Chair, IEEE YP Bangladesh 2020-21

EEE 415
Course Information


Broader Picture: Where This Course Fits

Levels of abstraction, from top to bottom, with the courses that cover them:
• programs, device drivers: CSE / Comp Engg
• instructions, registers, datapaths, controllers: EEE 415
• adders, memories, AND gates, NOT gates: EEE 303, EEE 467
• amplifiers, filters: EEE 101, 105, 207, 315, 465*
• transistors, diodes: EEE 201, 313
• electrons: PHY 165, 209, 461*

Related courses:
• PHY 165 - Electricity and Magnetism, Modern Physics and Mechanics
• EEE 101 - Electrical Circuits I
• EEE 105 - Electrical Circuits II
• EEE 201 - Electronic Circuits I
• EEE 203 - Energy Conversion I
• EEE 207 - Electronic Circuits II
• EEE 209 - Engineering Electromagnetics
• EEE 303 - Digital Electronics
• EEE 313 - Solid State Devices
• EEE 315 - Power Electronics
• EEE 415 - Microprocessors and Embedded Systems
• EEE 465* - Analog Integrated Circuits
• EEE 467* - VLSI Circuits and Design

Course Outcomes of EEE 415


After completing this course, the students will be able to -
CO No. | CO Statement | Corresponding PO(s)* | Domains and Taxonomy level(s)** | Delivery Method(s) and Activity(-ies) | Assessment Tool(s)
CO1 | Explain the architecture, instruction set, memory and input/output interface of an ARM microprocessor and different principles of embedded systems | PO1 | C4 | Lectures, Handouts | Class test, Final exam
CO2 | Construct simple microprocessor systems using state-of-the-art tools like the Arm assembly compiler and Verilog HDL, understanding their limitations | PO7 | C3 | Lectures, Handouts | Assignment
CO3 | Illustrate emerging technologies and trends in microprocessor design to recognize the need to always learn the state of the art | PO12 | C2 | -- | Video Presentation, Report
**Cognitive Domain Taxonomy Levels: C1: Remember; C2: Understand; C3: Apply; C4: Analyse; C5: Evaluate; C6: Create

Grading Policy: (As approved by Academic Council)


1. Class Attendance – Class participation and attendance will be recorded in every class. 30 Marks as per university AC policy

2. Continuous Assessment –
• 1 assignment (20) – Mandatory
• 3 Class tests, best 2 counted (20)

3. Final Examination – A comprehensive term final examination will be held at the end of the Term following the guideline of the Academic Council

Distribution of Marks
Class Attendance – 10%
Continuous Assessment – 20%
Final Examination – 70%


Class Assignments and Presentations


Please make sure that you have been added to the Microsoft Teams team dedicated to your section. We will post assignments and announcements there.


Textbooks

[Harris] Sarah Harris, David Harris, “Digital Design and Computer Architecture, ARM Edition”, Morgan Kaufmann (2015)

[Zhu] Yifeng Zhu, “Embedded Systems with ARM Cortex-M Microcontrollers in Assembly Language and C”

[Patterson] David A. Patterson and John L. Hennessy, “Computer Organization and Design – The Hardware / Software Interface, ARM Edition”, Morgan Kaufmann

Class/Lecture Schedule


Undergraduate Academic Calendar – Jan 2023 Semester


Public Holidays
• 10 Jun 2023 (Sat) – BUET Admission Test
• 29 Jul 2023 (Sat) – Muharram
• 13 Sep 2023 (Wed) – Akhari Chahar Shamba
• 28 Sep 2023 (Thu) – Eid e Miladunnabi
• We anticipate needing 5 extra classes during the semester.
• So we will hold an extra class every Tuesday at 8 am for the first 5 weeks. Please keep that slot free in your routine.

EEE 415 - New Syllabus


• Fundamentals of microprocessor and computer design, processor data path, architecture,
microarchitecture, complexity, metrics, and benchmark; Instruction Set Architecture, introduction to CISC and
RISC, Instruction-Level Parallelism, pipelining, pipelining hazards and data dependency, branch prediction,
exceptions and limits, super-pipelined vs superscalar processing; Memory hierarchy and management, Direct
Memory Access, Translation Lookaside Buffer; cache, cache policies, multi-level cache, cache performance;
Multicore computing, message passing, shared memory, cache-coherence protocol, memory consistency,
paging, Vector Processor, Graphics Processing Unit, IP Blocks, Single Instruction Multiple Data and SoC with
microprocessors. Simple Arm/RISC-V based processor design with VerilogHDL
• Introduction to embedded systems design, software concurrency and Realtime Operating Systems, Arm
Cortex M / RISC-V microcontroller architecture, registers and I/O, memory map and instruction sets,
endianness and image, Assembly language programming of Arm Cortex M / RISC-V based embedded
microprocessors (jump, call-return, stack, push and pop, shift, rotate, logic instructions, port operations, serial
communication and interfacing), system clock, exceptions and interrupt handling, timing analysis of
interrupts, general purpose digital interfacing, analog interfacing, timers: PWM, real-time clock, serial
communication, SPI, I2C, UART protocols, Embedded Systems for Internet of Things (IoT)
Approved in Academic Council

This Course
• Microprocessor Part:
• Aims to introduce the key concepts and ideas in computer architecture
• Explores the design of modern microprocessors
• Examines important trends and current and future challenges

• Embedded Systems Part:


• Aims to introduce basic concepts of Embedded System design
• Introduces Basic embedded system interfacing and design


Weekly Lecture Plan


Week | Lectures | Topic | Textbook
01 | 1-3 | Fundamentals of microprocessor and computer design, processor data path, architecture, microarchitecture, introduction to CISC and RISC, complexity, metrics, and benchmark | Patterson 1
02 | 4-6 | Assembly Language | Harris 6.1-6.3
03 | 7-9 | Assembly Language Programming | Harris 6.3
04 | 10-12 | Machine Language, Compiling, Assembling | Harris 6.4
05 | 13-15 | Performance Analysis, Single Cycle and Multicycle Processor | Harris 7.1-7.4
06 | 16-18 | Pipelining, Hazards, Advanced Microarchitecture | Harris 7.5, 7.7
07 | 19-21 | Memory Systems – Cache and Virtual Memory | Harris 8.2-8.4
08 | 22-24 | Introducing Embedded System Design, IoT, Arm Cortex-M4 | Lecture Slides
09 | 25-27 | General Purpose Input Output | Zhu 14
10 | 28-30 | General Purpose Timers | Zhu 15.1-15.3
11 | 31-33 | Interrupts | Zhu 11, 15.4
12 | 34-36 | ADC + DAC | Zhu 20, 21
13 | 37-39 | Serial Communication | Zhu 22

EEE 415 - Microprocessors and Embedded Systems

Fundamentals of Microprocessor and Computer Architecture
Week 1
Lecture 1.1

Week Content
• What is computer architecture?
• Computer architecture arena and design goals
• Historical performance of computer architecture
• Future trends with multicore processors,
systems on chip (SoCs), and beyond


Introduction
• The modern computer is less than 100 years old.
• The first electromechanical and valve-based machines were produced in the 1930s and 1940s.
• Today’s machines are many orders of magnitude faster, lower power, more reliable, and cheaper.

Figures: EDSAC replica (2018) [1]; Raspberry Pi 2, Arm Cortex-M0 [2]

1. EDSAC photo, CC BY-SA 4.0
2. The courier mail, CC BY-SA 4.0

What is architecture?
Architecture is the art and technique of designing and building, as distinguished from the skills associated with construction. It is both the process and the product of sketching, conceiving, planning, designing, and constructing buildings or other structures.

The term comes from Latin architectura; from Ancient Greek ἀρχιτέκτων (arkhitéktōn) 'architect'; from ἀρχι- (arkhi-) 'chief', and τέκτων (téktōn) 'creator'.
Source: Architecture - Wikipedia

What is architecture?
How do you build a machine that computes?
Quickly, safely, cheaply, efficiently, in technology X, for application Y, etc.

Civilization advances by extending the number of important operations which we can perform without thinking about them. -- Alfred North Whitehead


Introduction
• The modern microprocessor is often the engine behind how we communicate and work.
• It has helped to create our digital world, where communication, computation, and storage are almost free.
• It underpins many scientific breakthroughs and can help us make better use of the world’s limited resources.

Figures: Die photo of Arm Cortex-M3 microcontroller with 16KB flash memory and 4KB RAM [1]; Google Data Center [2]

1. By Zeptobars, CC BY 3.0
2. By Connie Zhou, CC BY-NC 4.0

Orientation
The internet
Motherboard


Computer Motherboard
[Figure: motherboard photo with labels: IO, System Bus (PCI), Memory, Power]
Architecture begins about here.

Microprocessor

EEE 415

The processors go here…


Microprocessors are made with Sand!


• In this course, we will primarily look at the high-level design of chips.
• Still, it is worth recalling the details of how a chip is made.

• Chips are made of silicon
• Aka “sand” (sand is largely silicon dioxide)
• Silicon is the second most abundant element in the earth’s crust, after oxygen.
• Extremely pure (<1 part per billion impurities)
• This is the purest stuff people make
• Two fundamental processes:
• Wafer formation
• Chip fabrication


Building Chips: The wafer

Pure sand → Melt in furnace → Ingot formation → Cut and polish

Building Chips: Fabrication


Photolithography steps shown in the figure:
1. Grow silicon dioxide (SiO2) on the silicon wafer
2. Apply photoresist
3. Expose to UV through a mask
4. Patterned resist
5. Etch SiO2
6. Deposit metal (or not)


The chip manufacturing process


Intel Core i7 Wafer

• 300mm wafer, 280 chips, 32nm technology


• Each chip is 20.7 x 10.5 mm

Integrated Circuit Cost


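$$\text{Cost per die} = \frac{\text{Cost per wafer}}{\text{Dies per wafer} \times \text{Yield}}$$

$$\text{Dies per wafer} \approx \frac{\text{Wafer area}}{\text{Die area}}$$

$$\text{Yield} = \frac{1}{\left(1 + \left(\text{Defects per area} \times \text{Die area}/2\right)\right)^{2}}$$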

• Nonlinear relation to area and defect rate


• Wafer cost and area are fixed
• Defect rate determined by manufacturing process
• Die area determined by architecture and circuit design
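
As a rough illustration of the cost formulas on this slide, the Python sketch below computes the cost per die. The wafer diameter and die size come from the Intel Core i7 wafer slide; the wafer cost and defect density are hypothetical values chosen only for illustration, and the function names are not from the course material.

```python
import math


def dies_per_wafer(wafer_diameter_mm, die_area_mm2):
    """Approximate dies per wafer as wafer area / die area (ignores edge losses)."""
    wafer_area_mm2 = math.pi * (wafer_diameter_mm / 2) ** 2
    return wafer_area_mm2 / die_area_mm2


def die_yield(defects_per_mm2, die_area_mm2):
    """Yield = 1 / (1 + defects per area * die area / 2)^2."""
    return 1.0 / (1.0 + defects_per_mm2 * die_area_mm2 / 2.0) ** 2


def cost_per_die(wafer_cost, wafer_diameter_mm, die_area_mm2, defects_per_mm2):
    """Cost per die = cost per wafer / (dies per wafer * yield)."""
    good_dies = dies_per_wafer(wafer_diameter_mm, die_area_mm2) * die_yield(
        defects_per_mm2, die_area_mm2
    )
    return wafer_cost / good_dies


# From the slides: 300 mm wafer, 20.7 mm x 10.5 mm die.
DIE_AREA_MM2 = 20.7 * 10.5
# Hypothetical values, for illustration only.
WAFER_COST = 5000.0
DEFECTS_PER_MM2 = 0.001

base = cost_per_die(WAFER_COST, 300, DIE_AREA_MM2, DEFECTS_PER_MM2)
doubled = cost_per_die(WAFER_COST, 300, 2 * DIE_AREA_MM2, DEFECTS_PER_MM2)
print(f"cost per die: {base:.2f}")
print(f"cost per die at twice the area: {doubled:.2f} ({doubled / base:.2f}x)")
```

Doubling the die area more than doubles the cost per die, because fewer dies fit on the wafer and the yield drops as well. The area-ratio estimate gives about 325 dies for this wafer, more than the 280 quoted for the Core i7 wafer, since it ignores partial dies at the wafer edge.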


Cost Optimization in Processing


1. With high volumes, the manufacturing process can be tuned to a particular
design, increasing the yield.
2. It is less work to design a high-volume part than a low-volume part.
3. The masks used to make the chip are expensive, so the cost per chip is lower
for higher volumes.
4. Engineering development costs are high and largely independent of volume;
thus, the development cost per die is lower with high-volume parts.
5. High-volume parts usually have smaller die sizes than low-volume parts and
therefore, have higher yield per wafer.


Building Blocks: Transistors


Building Blocks: Wires


State of the Art CPU


Major Players in the Microprocessor world


EEE 415 - Microprocessors and Embedded Systems

Introducing Architecture
Week 1
Lecture 1.2


What is architecture?
In this course, we will assume that we have the
building blocks of logic in the lower abstraction
levels.

Just as a reminder, we discuss the Silicon


fabrication process in the next few slides


Introducing Abstraction


Abstractions of the Physical World…

[Figure labels: Physics/Materials, Devices, Micro-architecture, Processors, Architectures; a bracket marks “This Course”]


…for the Rest of the System


[Figure labels: Processor Architectures, Abstraction, JVM, Compilers, Languages, Software Engineers/Applications; a bracket marks “CSE”]

Levels of Abstraction
• Architecture
• A set of specifications that allows developers to write software and firmware
• These include the instruction set.
• (Handwritten note: firmware is the low-level controller for a specific device.)
• Microarchitecture
• The logical organization of the inner structure of the computer hardware

• Hardware or Implementation
• The realization or the physical structure, i.e., logic design and chip packaging


Why Study Architecture?


• Get a fundamental understanding of computers
• Culmination and practical application of all things learned from EEE 201,
EEE 207, EEE 303, EEE 313

• It’s cool!
• Microprocessors are among the most sophisticated devices manufactured
by people
• How they work (and even that they work) as reliably and as quickly as they
do is amazing.

• Architecture is undergoing a revolution


• Application-specific SoC design is on the rise (AI, crypto, big data)
• Custom silicon design offers strong job opportunities and economic prospects


What is architecture?
In computer engineering, computer architecture is a description of
the structure of a computer system made from component parts.[1]
It can sometimes be a high-level description that ignores details of the
implementation.[2]
At a more detailed level, the description may include the instruction
set architecture design, microarchitecture design, logic design,
and implementation.[3]

1. Dragoni, Nicole (n.d.). "Introduction to peer to peer computing" (PDF). DTU Compute – Department of Applied Mathematics and Computer Science. Lyngby, Denmark.
2. Clements, Alan. Principles of Computer Hardware (Fourth ed.).
3. Hennessy, John; Patterson, David. Computer Architecture: A Quantitative Approach (Fifth ed.). p. 11.


Architecture Specifications
• The architecture specifies a contract
between the hardware and software.
• Many different compatible processors
may be implemented, e.g., to meet
different power consumption, cost, area,
and performance goals.
• When software is written to conform to an architecture specification, it can be portable across the processors that implement that architecture.
IBM System/360
Source of photo: Erik Pitti, CC-BY-2.0


Computer Architecture
• Computer architecture is much more than the task of defining the instruction-set
or high-level architecture.
• The computer architect must contribute to, and understand, all levels of design
in order to deliver the most appropriate design for a particular application and
target market.


Computer Architecture
• Computer architecture is concerned with how best to exploit fabrication
technology to meet marketplace demands.
• e.g., how best might we use five billion transistors and a power budget of two watts to design
the chip at the heart of a mobile phone?
• Computer architecture builds on a few simple concepts, but is challenging as we
must constantly seek new solutions.
• What constitutes the “best” design changes over time and depending on our
use-case. It involves considering many different trade-offs.


Computer Architecture
• Each level of design imposes different requirements and constraints, which change over time.
• History and economics: there is commercial pressure to evolve in a way that minimizes disruption and possible costs to the ecosystem (e.g., software).
• There is also a need to look forward and not design for yesterday’s technology and workloads!
• Design decisions should be carefully justified through experimentation.

Levels of design, from top to bottom:
Markets
Applications
Operating Systems
Programming languages and compilers
Architecture
Microarchitecture
Hardware
Fabrication Technology


The Computer Architecture Arena

Computer
architecture
Application
characteristics
Markets

New
applications

Technology

Source: “Early 21st Century Processors,” S. Vajapeyam and M. Valero, IEEE Computer, April 2004

Design Goals
• Functional – hard to correct (unlike software). Verification is perhaps the highest single cost in
the design process. We also need to test our chips once they have been manufactured, again
this can be a costly process and requires careful thought at the design stage

• Performance – what does this mean? No single best answer, e.g., sports car vs. off-road 4x4
vehicle – performance will always depend on the “workload”

• Power – a first-order design constraint for most designs today. Power limits the performance of
most systems.

• Security – e.g., the ability to control access to sensitive data or prevent carefully crafted
malicious inputs from hijacking control of the processor

• Cost – design cost (complexity), die costs (i.e., the size or area of our chip), packaging, etc.

• Reliability – do we need to try to detect and/or tolerate faults during operation?


Markets and Features


Each target market will require a different trade-off in terms of power consumption,
cost, area, performance, security, reliability, etc.
Here are some example processor classes from Arm:
• Cortex-A: high-performance application processors, e.g., for mobile phones
• Cortex-R: deterministic real-time performance, fault detection, and tolerance.
• Cortex-M: energy-efficient embedded devices (“microcontroller” class cores)
• Neoverse: scalable networks of processors on a single chip
• e.g., 8, 16, 64, or 128 cores. Used in datacenters, edge servers, and storage


The Smartphone
• A single smartphone will contain many different processor cores.
• Why not use a single processor?

[Figure: smartphone block diagram. Blocks: Camera, Sensor hub, Touchscreen & sensor hub, Power management, Flash controller, Apps processor, GPS, 2G/3G/4G/5G, Bluetooth, Wi-Fi. Each block is labelled with one or more Cortex-A, Cortex-R, or Cortex-M cores.]

Eight Great Ideas for Computer Architecture

§1.2 Eight Great Ideas in Computer Architecture
• Design for Moore’s Law

• Use abstraction to simplify design

• Make the common case fast

• Performance via parallelism

• Performance via pipelining

• Performance via prediction

• Hierarchy of memories

• Dependability via redundancy



Below Your Program


§1.3 Below Your Program
• Application software
• Written in high-level language
• System software
• Compiler: translates HLL code to
machine code
• Operating System: service code
– Handling input/output
– Managing memory and storage
– Scheduling tasks & sharing resources
• Hardware
• Processor, memory, I/O controllers


Levels of Program Code


• High-level language
• Level of abstraction closer to problem
domain
• Provides for productivity and portability
• Assembly language
• Textual representation of instructions
• Hardware representation
• Binary digits (bits)
• Encoded instructions and data


§1.4 Under the Covers

Components of a Computer

The BIG Picture
• Same components for all kinds of computer
• Desktop, server, embedded
• Input/output includes
• User-interface devices
– Display, keyboard, mouse
• Storage devices
– Hard disk, CD/DVD, flash
• Network adapters
– For communicating with other computers


Opening the Box


Capacitive multitouch LCD screen

3.8 V, 25 Watt-hour battery

Computer board


Inside the Processor (CPU)


• Datapath: performs operations on data
• Control: sequences datapath, memory, ...
• Cache memory
• Small fast SRAM memory for immediate access to data


Inside the Processor


• Apple A5


The Architecture Question

• How do we build a computer from contemporary silicon device technology that executes general-purpose programs quickly, efficiently, and at reasonable cost?

• i.e., how do we build the computer on your desk?


In the beginning...

The Difference Engine ENIAC


Physical configuration specifies the computation

The Stored Program Computer

• The program is data


• i.e., it is a sequence of numbers that the machine interprets

• A very elegant idea


• The same technologies can store and manipulate programs and data
• Programs can manipulate programs.


The Stored Program Computer

• A very simple model
• Several questions
• How are programs represented?
• How do we get algorithms out of our brains and into that representation?
• How does the computer interpret a program?

[Figure: Processor, IO, and Memory blocks; Memory holds both Data and Program]


Representing Programs
• We need some basic building blocks -- call
them “instructions”

• What does “execute a program” mean?


• What instructions do we need?
• What should instructions look like?
• Is it enough to just specify the instructions?
• How complex should an instruction be?

Program Execution
• This is the algorithm for a stored-program computer
• The Program Counter (PC) is the key

• Instruction Fetch: Read instruction from program storage (mem[PC])
• Instruction Decode: Determine required actions and instruction size
• Operand Fetch: Locate and obtain operand data
• Execute: Compute result value
• Result Store: Deposit results in storage for later use
• Next Instruction: Determine successor instruction (i.e., compute next PC). Usually this means PC = PC + <instruction size in bytes>
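
The execution steps above can be sketched as a small interpreter loop. This is a toy machine, not ARM: the opcodes, the 4-byte instruction layout (opcode, a, b, c), and the four-register file are all assumptions made for illustration.

```python
# Hypothetical opcodes for a toy stored-program machine (not a real ISA).
LOADI, ADD, SUB, JNZ, HALT = range(5)
INSTR_SIZE = 4  # every instruction occupies 4 bytes: opcode, a, b, c


def run(memory, max_steps=1000):
    """Run the program stored in `memory` and return the final register values."""
    regs = [0, 0, 0, 0]
    pc = 0
    for _ in range(max_steps):
        # Instruction fetch: read the instruction at mem[PC].
        opcode, a, b, c = memory[pc:pc + INSTR_SIZE]
        # Next instruction: by default PC = PC + instruction size in bytes.
        next_pc = pc + INSTR_SIZE
        # Decode, operand fetch, execute, and result store.
        if opcode == LOADI:      # regs[a] = immediate b
            regs[a] = b
        elif opcode == ADD:      # regs[a] = regs[b] + regs[c]
            regs[a] = regs[b] + regs[c]
        elif opcode == SUB:      # regs[a] = regs[b] - regs[c]
            regs[a] = regs[b] - regs[c]
        elif opcode == JNZ:      # branch to byte address b if regs[a] != 0
            if regs[a] != 0:
                next_pc = b
        elif opcode == HALT:
            return regs
        else:
            raise ValueError(f"unknown opcode {opcode} at address {pc}")
        pc = next_pc
    raise RuntimeError("step limit reached without HALT")


# The program is just a sequence of numbers: it sums 5 + 4 + 3 + 2 + 1 into r0.
program = [
    LOADI, 0, 0, 0,   # address 0:  r0 = 0 (accumulator)
    LOADI, 1, 5, 0,   # address 4:  r1 = 5 (counter)
    LOADI, 2, 1, 0,   # address 8:  r2 = 1 (constant)
    ADD,   0, 0, 1,   # address 12: r0 = r0 + r1
    SUB,   1, 1, 2,   # address 16: r1 = r1 - r2
    JNZ,   1, 12, 0,  # address 20: loop back to 12 while r1 != 0
    HALT,  0, 0, 0,   # address 24: stop
]

print(run(program))  # prints [15, 0, 1, 0]
```

Because the program lives in the same list of numbers that a program could also read or write, the sketch also reflects the stored-program idea: the machine only distinguishes instructions from data by what the PC points at.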

What instructions do we need?

• Basic operations are a good choice.


• Motivated by the programs people write.
• Math: Add, subtract, multiply, bit-wise operations
• Control: branches, jumps, and function calls.
• Data access: Load and store.
• The exact set of operations depends on many, many things
• Application domain, hardware trade-offs, performance, power, complexity requirements.
• You will see these trade-offs first hand in the assignment


What should instructions look like?

• They will be numbers -- i.e., strings of bits
• It is easiest if they are all the same size, say 32 bits
• We can break up these bits into “fields” -- like members in a class or struct.
• This sets some limits
• On the number of different instructions we can have
• On the range of values any field of the instruction can
specify
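
To make the idea of fields concrete, here is a minimal sketch that packs and unpacks a 32-bit instruction word. The layout (a 6-bit opcode, three 5-bit register fields, and an 11-bit immediate) is invented for illustration and does not correspond to any particular ISA.

```python
# Hypothetical 32-bit layout, most significant bits first:
# opcode (6 bits) | rd (5) | rs1 (5) | rs2 (5) | imm (11)
FIELDS = [("opcode", 6), ("rd", 5), ("rs1", 5), ("rs2", 5), ("imm", 11)]


def encode(**values):
    """Pack field values into one 32-bit integer."""
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode(word):
    """Split a 32-bit integer back into its named fields."""
    fields = {}
    shift = 32
    for name, width in FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields


word = encode(opcode=3, rd=1, rs1=2, rs2=4, imm=100)
print(f"{word:032b}")
print(decode(word))
```

The two limits mentioned above show up directly: with this assumed layout, a 6-bit opcode allows at most 64 distinct instructions, and an 11-bit immediate can only hold values from 0 to 2047, so `encode` rejects anything larger.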


Is specifying the instructions sufficient?

• No! We must also specify what the instructions operate on.
• This is called the “Architectural State” of the machine
• Registers -- a few named data values that instructions can operate on
• Memory -- a much larger array of bytes that is available for storing values.
(Note: a cache stores frequently required data because it is fast to access; registers store the data of ongoing operations.)
• How big is memory? 32 bits or 64 bits of addressing.
• 64 is the standard today for desktops and mobiles.
• 32 for lower end phones
• 16/32 for embedded processors
• We also need to specify semantics of function calls
• The “Stack Discipline,” “Calling convention,” or “Application binary interface (ABI)”.

How complex should instructions be?

• More complexity
• More different instruction types are required.
• Increased design and verification costs
• More complex hardware.
• More difficult to use -- What’s the right instruction in this context?
• Less complexity
• Programs will require more instructions -- poor code density
• Programs can be more difficult for humans to understand
• In the limit, decrement-and-branch-if-negative is sufficient
• Imagine trying to decipher programs written using just one instruction.
• It takes many, many of these instructions to emulate simple operations.
• Today, what matters most is the compiler
• The machine must be able to understand the program
• A program must be able to decide which instructions to use

Big “A” Architecture


• The Architecture is a contract between the hardware and the software.
• The hardware defines a set of operations, their semantics, and rules for their use.
• The software agrees to follow these rules.
• The hardware can implement those rules IN ANY WAY IT CHOOSES!
  • Directly in hardware
  • Via a software layer
  • Via a trained monkey with a pen and paper.

• This is a classic interface -- they are everywhere in computer science.
• “Interface,” “Separation of concerns,” “API,” “Standard”


EEE 415 - Microprocessors and Embedded Systems

Improving Computer Performance: Historic Perspective
Week 1
Lecture 1.2

Historical Performance Gains


• By 1985, it was possible to integrate a complete microprocessor onto a single
die or “chip.”
• As fabrication technology improved, and transistors got smaller, the
performance of a single core improved quickly.
• Performance improved at the rate of 52% per year for nearly 20 years
(measured using SPEC benchmark data).
• Note: the data are for desktop/server processors
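
A quick way to see what a 52% annual improvement means is to compound it. The sketch below is only arithmetic on the figure quoted above; the function name and the choice of year spans are illustrative assumptions.

```python
def compound_gain(annual_rate, years):
    """Total speedup after `years` of improving by `annual_rate` each year."""
    return (1.0 + annual_rate) ** years


for years in (1, 5, 10, 16):
    print(years, "years:", round(compound_gain(0.52, years), 1), "x")
```

Compounding 52% per year for about 16 years gives a gain of roughly 800x, which is the same order as the 1985 to 2002 improvement quoted later in these slides.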


Historical Performance Gains


• Clock frequency improved quickly between 1985 and 2002:
• ~10x from faster transistors, and
• ~10x from pipelining and circuit-level advances.
• So overall, ~100X of the total 800X gains came from reduced clock periods.

[Figure: clock frequency (MHz) versus year]
A. Danowitz, K. Kelley, J. Mao, J. P. Stevenson, and M. Horowitz. Clock Frequency, Stanford CPU DB. Accessed on Nov. 5, 2019. [Online]. Available: http://cpudb.stanford.edu/visualize/clock_frequency

Historical Performance Gains


• From 1985 to 2002, performance improved by ~800 times.
• Over time, technology scaling provided much greater numbers of faster and
lower power transistors.
• The “iron law” of processor performance:

Time = instructions executed x clocks per instruction (CPI) x clock period

• Clocks per instruction (CPI)


• We will also refer to Instructions Per Cycle (IPC), i.e., 1/CPI.
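
The iron law is easy to experiment with in code. The following is a minimal sketch; the workload numbers (one billion instructions, a 1 ns clock period) are made up for illustration and do not describe any real processor.

```python
def execution_time(instructions, cpi, clock_period_s):
    """Iron law: time = instructions executed x CPI x clock period."""
    return instructions * cpi * clock_period_s


def ipc(cpi):
    """Instructions Per Cycle is the reciprocal of CPI."""
    return 1.0 / cpi


# Hypothetical workload: 1e9 instructions on a 1 GHz machine (1 ns period).
baseline = execution_time(1_000_000_000, cpi=2.0, clock_period_s=1e-9)
better_cpi = execution_time(1_000_000_000, cpi=1.0, clock_period_s=1e-9)

print("baseline time (s):", baseline)
print("time with CPI halved (s):", better_cpi)
print("IPC at CPI = 2.0:", ipc(2.0))
```

Each of the three factors scales the run time linearly, so halving CPI (doubling IPC) has the same effect as halving the clock period or halving the instruction count.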


Clocks Per Instruction (CPI)


• Early machines were limited by transistor count. As a result, they often required
multiple clock cycles to execute each instruction (CPI >> 1).
• As transistor budgets improved, we could aim to get closer to a CPI of 1.
• This is easy if we don’t care at all about clock frequency.
• Creating a high-frequency design with a good CPI is much harder. We need to
keep our high-performance processor busy and avoid stalls, which would
increase our CPI. This requires many different techniques and costs transistors
(area) and power.


Clocks Per Instruction (CPI)


• Eventually, the industry was also able to fetch and execute multiple instructions
per clock cycle. This reduced CPI to below 1.
• When we fetch and execute multiple instructions together, we often refer to
Instructions Per Cycle (IPC), which is 1/CPI.
• For instructions to be executed at the same time, they must be independent.
• Again, growing transistor budgets were exploited to help find and exploit this
Instruction-Level Parallelism (ILP).


Relative Performance
• Define Performance = 1/Execution Time
• “X is n times faster than Y”
$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$
• Example: time taken to run a program
• 10s on A, 15s on B
• Execution Time_B / Execution Time_A = 15s / 10s = 1.5
• So A is 1.5 times faster than B

Measuring Execution Time


• Elapsed time
• Total response time, including all aspects
– Processing, I/O, OS overhead, idle time
• Determines system performance
• CPU time
• Time spent processing a given job
– Discounts I/O time, other jobs’ shares
• Comprises user CPU time and system CPU time
• Different programs are affected differently by CPU and system
performance


CPU Clocking
• Operation of digital hardware governed by a constant-rate clock
[Figure: clock waveform marking the clock period; within each cycle, data transfer and computation are followed by a state update]
• Clock period: duration of a clock cycle
• e.g., 250ps = 0.25ns = $250 \times 10^{-12}$ s
• Clock frequency (rate): cycles per second
• e.g., 4.0GHz = 4000MHz = $4.0 \times 10^{9}$ Hz

CPU Time
$$\text{CPU Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} = \frac{\text{CPU Clock Cycles}}{\text{Clock Rate}}$$
• Performance improved by
• Reducing the number of clock cycles
• Increasing the clock rate (i.e., a higher frequency)
• Hardware designers must often trade off clock rate against cycle count


CPU Time Example


• Computer A: 2GHz clock, 10s CPU time
• Designing Computer B
• Aim for 6s CPU time
• Can do faster clock, but causes 1.2 × clock cycles
• How fast must Computer B clock be?
$$\text{Clock Rate}_B = \frac{\text{Clock Cycles}_B}{\text{CPU Time}_B} = \frac{1.2 \times \text{Clock Cycles}_A}{6\,\text{s}}$$
$$\text{Clock Cycles}_A = \text{CPU Time}_A \times \text{Clock Rate}_A = 10\,\text{s} \times 2\,\text{GHz} = 20 \times 10^{9}$$
$$\text{Clock Rate}_B = \frac{1.2 \times 20 \times 10^{9}}{6\,\text{s}} = \frac{24 \times 10^{9}}{6\,\text{s}} = 4\,\text{GHz}$$
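
The same calculation can be checked in a few lines of code. This is a sketch of the worked example above; the function and parameter names are chosen here for illustration.

```python
def required_clock_rate(cpu_time_a, clock_rate_a, cycle_factor, target_time_b):
    """Clock rate B needs so that it finishes in `target_time_b` seconds.

    cycle_factor is how many times more clock cycles B needs than A.
    """
    cycles_a = cpu_time_a * clock_rate_a   # Clock Cycles = CPU Time x Clock Rate
    cycles_b = cycle_factor * cycles_a
    return cycles_b / target_time_b        # Clock Rate = Clock Cycles / CPU Time


rate_b = required_clock_rate(
    cpu_time_a=10.0, clock_rate_a=2e9, cycle_factor=1.2, target_time_b=6.0
)
print(rate_b / 1e9, "GHz")
```

Changing `cycle_factor` shows the trade-off directly: the more extra cycles the faster design costs, the higher its clock must be to hit the same target time.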

Instruction Count and CPI


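$$\text{Clock Cycles} = \text{Instruction Count} \times \text{Cycles per Instruction}$$
$$\text{CPU Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time} = \frac{\text{Instruction Count} \times \text{CPI}}{\text{Clock Rate}}$$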
• Instruction Count for a program
• Determined by program, ISA and compiler
• Average cycles per instruction
• Determined by CPU hardware
• If different instructions have different CPI
– Average CPI affected by instruction mix


CPI Example
• Computer A: Cycle Time = 250ps, CPI = 2.0
• Computer B: Cycle Time = 500ps, CPI = 1.2
• Same ISA
• Which is faster, and by how much?
$$\text{CPU Time}_A = \text{Instruction Count} \times \text{CPI}_A \times \text{Cycle Time}_A = I \times 2.0 \times 250\,\text{ps} = I \times 500\,\text{ps}$$
A is faster…
$$\text{CPU Time}_B = \text{Instruction Count} \times \text{CPI}_B \times \text{Cycle Time}_B = I \times 1.2 \times 500\,\text{ps} = I \times 600\,\text{ps}$$
…by this much:
$$\frac{\text{CPU Time}_B}{\text{CPU Time}_A} = \frac{I \times 600\,\text{ps}}{I \times 500\,\text{ps}} = 1.2$$

CPI in More Detail


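• If different instruction classes take different numbers of cycles
$$\text{Clock Cycles} = \sum_{i=1}^{n} \left(\text{CPI}_i \times \text{Instruction Count}_i\right)$$
• Weighted average CPI
$$\text{CPI} = \frac{\text{Clock Cycles}}{\text{Instruction Count}} = \sum_{i=1}^{n} \left(\text{CPI}_i \times \frac{\text{Instruction Count}_i}{\text{Instruction Count}}\right)$$
where $\text{Instruction Count}_i / \text{Instruction Count}$ is the relative frequency of class $i$.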


CPI Example
• Alternative compiled code sequences using
instructions in classes A, B, C

Class              A   B   C
CPI for class      1   2   3
IC in sequence 1   2   1   2
IC in sequence 2   4   1   1

• Sequence 1: IC = 5
• Clock Cycles = 2×1 + 1×2 + 2×3 = 10
• Avg. CPI = 10/5 = 2.0
• Sequence 2: IC = 6
• Clock Cycles = 4×1 + 1×2 + 1×3 = 9
• Avg. CPI = 9/6 = 1.5
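
The weighted-average CPI calculation for the two sequences can be reproduced with a short script. The dictionary layout and function names are assumptions for this sketch; the numbers come from the table above.

```python
# Cycles per instruction for each instruction class (from the table).
CPI_PER_CLASS = {"A": 1, "B": 2, "C": 3}


def clock_cycles(counts):
    """Sum of CPI_i x Instruction Count_i over all classes."""
    return sum(CPI_PER_CLASS[cls] * n for cls, n in counts.items())


def average_cpi(counts):
    """Weighted average CPI = total clock cycles / total instruction count."""
    return clock_cycles(counts) / sum(counts.values())


seq1 = {"A": 2, "B": 1, "C": 2}
seq2 = {"A": 4, "B": 1, "C": 1}

for name, seq in (("Sequence 1", seq1), ("Sequence 2", seq2)):
    print(name, "cycles =", clock_cycles(seq), "avg CPI =", average_cpi(seq))
```

Sequence 2 executes more instructions (6 versus 5) yet needs fewer cycles (9 versus 10), which is why instruction count alone is not a reliable measure of performance.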

Performance Summary
The BIG Picture

$$\text{CPU Time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}$$

• Performance depends on
• Algorithm: affects IC, possibly CPI
• Programming language: affects IC, CPI
• Compiler: affects IC, CPI
• Instruction set architecture: affects IC, CPI, Tc

§1.7 The Power Wall


Power Trends

• In CMOS IC technology
$$\text{Power} = \text{Capacitive load} \times \text{Voltage}^2 \times \text{Frequency}$$
[Figure annotations beneath the terms: Power ×30; Voltage 5V → 1V; Frequency ×1000]

Reducing Power
• Suppose a new CPU has
• 85% of capacitive load of old CPU
• 15% voltage and 15% frequency reduction
$$\frac{P_{new}}{P_{old}} = \frac{C_{old} \times 0.85 \times (V_{old} \times 0.85)^2 \times F_{old} \times 0.85}{C_{old} \times V_{old}^2 \times F_{old}} = 0.85^4 = 0.52$$
• The power wall
• We can’t reduce voltage further
• We can’t remove more heat
• How else can we improve performance?
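
The power ratio above follows directly from the CMOS power relation, and a small function makes it easy to try other scaling factors. This is a sketch; the function name and parameters are illustrative.

```python
def power_ratio(cap_scale, voltage_scale, freq_scale):
    """P_new / P_old when load, voltage, and frequency are each scaled.

    Follows Power = Capacitive load x Voltage^2 x Frequency, so the
    voltage scale factor enters squared.
    """
    return cap_scale * voltage_scale**2 * freq_scale


ratio = power_ratio(cap_scale=0.85, voltage_scale=0.85, freq_scale=0.85)
print(round(ratio, 2))  # prints 0.52
```

Because voltage is squared, it is the most effective lever of the three, which is why being unable to lower it further is such a hard limit.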


What Is Pipelining?
[Figure: two timing diagrams of car manufacturing with the stages Engine, Chassis, and Paint; horizontal axis: time; vertical axis: order of manufacturing (Car A, B, and then C)]
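
The car-manufacturing analogy can be sketched numerically. The model below makes simplifying assumptions that are not stated on the slide: every stage takes exactly one time unit, the stage order is Engine, Chassis, Paint, and there are no stalls.

```python
def sequential_time(num_cars, num_stages):
    """Each car passes through every stage before the next car starts."""
    return num_cars * num_stages


def pipelined_time(num_cars, num_stages):
    """A new car enters the first stage every time unit once the line is busy."""
    return num_stages + (num_cars - 1)


def pipelined_schedule(cars, stages):
    """Map each time step to the (car, stage) pairs active in that step."""
    schedule = {}
    for i, car in enumerate(cars):
        for j, stage in enumerate(stages):
            schedule.setdefault(i + j, []).append((car, stage))
    return schedule


cars = ["A", "B", "C"]
stages = ["Engine", "Chassis", "Paint"]

print("sequential:", sequential_time(len(cars), len(stages)), "time units")
print("pipelined: ", pipelined_time(len(cars), len(stages)), "time units")
for step, work in sorted(pipelined_schedule(cars, stages).items()):
    print(step, work)
```

Under these assumptions, three cars take 9 time units one after another but only 5 when pipelined. The time to build any single car is unchanged; only the throughput improves.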

IPC and Instruction Count


• Of the 800x improvement in performance (1985-2002), ~100x is from clock frequency improvements.
• The remaining gains (~8x) were from a reduction in instruction count, better compiler optimizations, and improvements in IPC.

The accompanying graph shows these improvements. It plots performance (SpecInt2000 benchmark performance per MHz for Intel processors against time).

[Figure: SpecInt2000 per MHz (0 to 0.0035) versus year (1984 to 2004)]

A Shorter Critical Path


• We can also try to reduce the number of gates on our critical path.
• This can be done by inserting additional registers to break complex logic into different “pipeline” stages.
• Advances were also made that improved circuit-level design techniques.
• The length of our critical paths reduced by ~10x (1985-2002).

[Figure: critical path length (FO4 delays) versus year]
A. Danowitz, K. Kelley, J. Mao, J. P. Stevenson, and M. Horowitz. Stanford CPU DB. Accessed on Nov. 5, 2019. [Online]. Available: http://cpudb.stanford.edu

Moore’s Law
• Moore’s Law predicts that the number of
transistors we can integrate onto a chip, for
the same cost, doubles every 2 years.

Gordon Moore and Robert Noyce at Intel in 1970


Source: IntelFreePress, CC BY-SA-2.0

Moore’s Law
• Processor transistor budgets grew quickly as microarchitectures became more complex.
• 1985 – Intel 386
275K transistors, die size = 43 mm²
• 2002 – Intel Pentium 4
42M transistors, die size = 217 mm²

Source: Wgsimon, Wikipedia, CC BY-SA 3.0
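
The two data points above can be used to estimate the doubling period they imply. This is a rough sketch based on only these two chips; the function name is illustrative, and the result ignores the change in die size.

```python
import math


def doubling_period(count_start, count_end, years):
    """Years per doubling implied by growth from count_start to count_end."""
    doublings = math.log2(count_end / count_start)
    return years / doublings


# Intel 386 (1985, 275K transistors) to Intel Pentium 4 (2002, 42M transistors).
period = doubling_period(275e3, 42e6, 2002 - 1985)
print(round(period, 2), "years per doubling")
```

The result is a little over two years per doubling, which is in line with the two-year figure in the statement of Moore’s Law.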

Better Performance and Lower Power


Technology Scaling: Faster Transistors


• From 1985 to 2002, we saw ~7 new process generations.
• Scaling provides smaller and faster transistors. Performance improves ~1.4x per generation, so for 7 generations, we have ~10x faster logic gates.

[Figure: feature size (um) versus year]
A. Danowitz, K. Kelley, J. Mao, J. P. Stevenson, and M. Horowitz. Stanford CPU DB. Accessed on Nov. 5, 2019. [Online]. Available: http://cpudb.stanford.edu
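
The "~10x from 7 generations" figure is a compounding calculation, and it combines with the ~10x critical-path reduction quoted earlier in these slides to give the ~100x clock frequency gain. A minimal sketch of that arithmetic, with illustrative variable names:

```python
PER_GENERATION_SPEEDUP = 1.4   # gate speedup per process generation (from the slide)
GENERATIONS = 7                # process generations, 1985 to 2002 (from the slide)
CRITICAL_PATH_REDUCTION = 10   # ~10x fewer gate delays per cycle (from the slides)

gate_speedup = PER_GENERATION_SPEEDUP ** GENERATIONS
clock_gain = gate_speedup * CRITICAL_PATH_REDUCTION

print(round(gate_speedup, 1))  # prints 10.5
print(round(clock_gain))       # prints 105
```

Both factors multiply because the clock period is roughly the number of gate delays on the critical path times the delay of each gate.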

Instruction Count
• Increased datapath width (e.g., 16-bit to 32-bit to 64-bit)
• Larger register files (fewer load/store instructions)
• More complex instructions?
• SIMD instructions


Limits to Single Core Performance


• On-chip wiring
• Wire delays scale relatively poorly compared to logic delays.
• This limits the amount of state reachable in one clock cycle.
• Unfortunately, this limits the performance of large complex processors.
• Limits to pipelining
• The cost of interruptions grows, e.g., the impact of cache misses and mispredicted branches.
• Ultimately, some components are difficult or expensive to pipeline.
• There are also practical limits to distributing very high-frequency clocks, registers represent a
finite delay, and we may struggle to balance logic between pipeline stages.
• Limits of Instruction-Level Parallelism (ILP)
• Large amounts of ILP are very difficult to discover and exploit efficiently.
• Our returns on investment quickly diminish, i.e., we must use more power and more
transistors to expose and exploit ever smaller amounts of ILP.


Limits to Single Core Performance


• Power consumption
• Historical performance gains have been
impressive, but power consumption
also grew very quickly during the 1980s
and 1990s.
• This happened even with
improvements in fabrication technology
and reductions in supply voltage.
• Power quickly became, and remains, a
first-order design constraint for all
significant markets.

Figure source: Original data collected and plotted by M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond, and C. Batten. Dotted-line extrapolations by C. Moore: Chuck Moore, 2011, “Data processing in exascale-class computer systems,” The Salishan Conference on High Speed Computing, April 27, 2011.

Multicore Processors,
Systems on Chip (SoC), and Beyond


Slowing Single-core Performance Gains


To summarize, sustaining single core performance gains became difficult due to:
• The limits of pipelining
• The limits of Instruction-Level Parallelism (ILP)
• Power consumption
• The performance of on-chip wires

As a result performance gains slowed from 52% to 21% per year for the highest
performance processors.


Multicore Processors
• Eventually, it made sense to shift from
single-core to multicore designs.
• From ~2005, multicore designs
became mainstream.
• The number of cores on a single chip
increased over time.
• Clock frequencies increased more
slowly.
• Individual cores were designed to be
as power efficient as possible.
e.g., 4 x Arm Cortex-A72 processors,
each with their own L1 caches and a
shared L2 cache

Multicore Processors
Exploiting multiple cores comes with its own set of challenges and limitations:
• Power consumption may still limit performance.
• We need to write scalable and correct parallel programs to exploit them.
• We might not be able to find enough parallel threads to take advantage of our
cores.
• On-chip and off-chip communication will limit performance gains.
• Off-chip bandwidth is limited and may throttle our many cores.
• Cores also need to communicate to maintain a coherent view of memory.


Specialization
• Today, we often need to look beyond general-purpose programmable processors
to meet our design goals.
• We trade flexibility for efficiency.
• We remove the ability to run all programs and design for a narrow workload,
perhaps even a single algorithm.
• These “accelerators” can be 10-1000x better than a general-purpose solution in
terms of power and performance.


Specialization

[Figures: Graphics Processing Unit (GPU) and Neural Processor Unit (NPU)]

Specialization
What does specialization allow us to do?
• Remove infrequently used parts of the processor.
• Tune the instruction set for common operations or replace with hardwired control.
• Exploit forms of parallelism abundant in the application(s) – we often see a
specialized processing element and local memory reproduced many times.
• Instantiate specialized memories and tune their widths and sizes.
• Provide specialized interconnect between components.
• Optimize data-use patterns.


Specialization

Data assume a 45 nm process @0.9 V


M. Horowitz, Computing's energy problem (and what we can do about it), IEEE, March 6, 2014.
[Online]. Available: https://ieeexplore.ieee.org/document/6757323

Limits to Specialization
• There are costs associated with designing each new accelerator.
• The chip, or “ASIC,” produced may only be competitive in a smaller target
market, reducing profitability.
• Specialization reduces flexibility.
• The logic invested in specialized accelerators is no longer general-purpose.
• Algorithm changes may render specialized hardware obsolete.
• Once we’ve specialized, further gains may be difficult to achieve.
• Specialization isn’t immune to the concept of diminishing returns.


Today’s SoC Designs


• A modern mobile phone SoC (2019) may contain more than 7 billion transistors.
• It will integrate:
• Multiple processor cores
• A GPU
• A large number of specialized accelerators
• Large amounts of on-chip memory
• High bandwidth interfaces to off-chip memory

[Figure: A high-level block diagram of a mobile phone SoC, showing memory interfaces, L3 cache memory, 4 “big” cores, 4 “small” cores, a GPU, a Neural Processor Unit (NPU), and other accelerators]

Trends in Computer Architecture

(In order of time, earliest first)
Early computers: Gains from bit-level parallelism
Pipelining and superscalar issue: + Instruction-level parallelism
Multicore/GPUs: + Thread-level parallelism/data-level parallelism
Greater integration (large SoCs), heterogeneity, and specialization: + Accelerator-level parallelism

Note: Memory hierarchy developments have also been significant. The memory hierarchy typically consumes a large fraction of the transistor budget.

The Future – The End of Moore’s Law?


• The end of Moore’s Law has been predicted many times.
• Scaling has perhaps slowed in recent years, but transistor density continues to
improve.
• Eventually, 2D scaling will have to slow down.
• We are ultimately limited by the size of atoms!
• Where next?
• Going 3D - Future designs may take advantage of multiple layers of transistors on a single
chip.
– Note: the gains are linear rather than exponential.
• Better packaging and integration technologies (e.g., chip stacking)
• New types of memory
• New materials and devices


From Sensors and Smartphones to Servers


Relative core area (1 square represents the area of the Arm Cortex-M0 core):
• 1X: An area optimized microcontroller core (e.g., Arm Cortex-M0)
• 13X: High-performance 32-bit core (e.g., Arm Cortex-M7). Used in automotive, sensor hub, and other embedded applications.
• 130X: Mid-range 64-bit processor (e.g., Arm Cortex-A55). For smartphones, TVs, network infrastructure, …
• 520X: High-performance processor (e.g., Arm Cortex-A73). For mobile and consumer devices.
• 1380X: 1 laptop or server class processor (e.g., A76 core with 512KB of L2 cache)

Next Week
• Introduction to Assembly Language Programming
