High Performance Computing: Course Introduction

This document introduces a graduate course on high performance computing (HPC). The course aims to provide concepts and techniques in HPC, including parallelization methods, performance evaluation, and programming models. It will also cover state-of-the-art HPC technologies like multicore processors, GPUs, and high-performance networks. The course takes a computer science perspective and will study how these technologies enable advanced applications in fields like simulation, analytics, machine learning and more.


HIGH

PERFORMANCE
COMPUTING
Course Introduction

Master Degree Program in Computer Science and Networking


Academic Year 2020-2021
Dr. Gabriele Mencagli, PhD
Department of Computer Science
University of Pisa, Italy

01/20/2021 High Performance Computing, G. Mencagli 1


Course Goals and Content
 Providing a solid framework of concepts and techniques in High-
Performance Computing (HPC)
• Parallelization methodology and models
• Techniques to enhance parallel processing
• Performance evaluation (analytic cost models)
• Support to parallel programming models and software development tools

A methodology for studying current and future systems

 Technology: state-of-the-art and trends


• Parallel processors
• Multiprocessors
• Multicore / Manycore / GPUs / FPGAs
• High-performance networks
• Shared vs distributed memory architectures
• Programming models and their run-time support

(Studied according to a Computer Science slant!)



Technologies and Applications
 HPC: an enabling technology for advanced industries and users
 Technology pull → more application domains needing HPC
• Large-scale simulations
• Big data, real-time analytics, data stream processing
• Machine learning and artificial intelligence
• Real-time control
• Data-intensive and compute-intensive applications
 Technology push → parallel machines are evolving
• Processor technology: multi-/many-core (10-100 or even 1,000 cores on the same chip),
GPUs, FPGAs
• Memory technology: from traditional DRAM to ‘intelligent’ memories
• Interconnect technology: high-bandwidth, low-latency networks (e.g., InfiniBand,
Myrinet)
• Computing power: Petascale (10^15 operations per second) – Exascale (10^18
operations per second)
• How many processors/cores are expected in the near future?
• Additional challenge: minimization of power/energy consumption (Green Computing)
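To get a feeling for these scales, a back-of-the-envelope estimate (the sustained per-core rate assumed below is an illustrative value, not from the slides):

```python
# Rough estimate of how many cores an exascale machine needs.
PETA = 10**15  # petascale: 10^15 operations per second
EXA = 10**18   # exascale:  10^18 operations per second

# Assumed sustained rate per core (illustrative assumption only).
ops_per_core = 10**10  # 10 GFLOP/s per core

cores_for_exascale = EXA // ops_per_core
print(cores_for_exascale)  # 100000000 -> on the order of 10^8 cores
```

At this assumed rate, roughly a hundred million cores are needed, which is why minimizing per-core power consumption becomes a first-order design constraint.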



Historical Path of
Computing Systems



Early Stages
 The first computer was designed in the years 1943-1946 by researchers at the
University of Pennsylvania. It was called ENIAC (Electronic Numerical Integrator
and Computer) and was based on vacuum tubes
It was developed to support the war effort, computing trajectory tables for new
weapons
It weighed 30 tons and occupied 1,500 square feet of floor space

 Fundamental contribution by John von Neumann in 1945: the concept of the
”Stored Program”: the program can be represented in a form suitable
for storing in memory alongside the data
• First implementation of this idea in 1952: the IAS Computer (Princeton)
• Memory of 1,000 locations called words (less than 40 KB in total)
• It allows computers to be easily programmed

[Figure 2.1 Structure of the IAS Computer: the Central Processing Unit (CPU),
with Arithmetic-Logic Unit (CA) and Control Unit (CC), connected to the Main
Memory (M) and the I/O Equipment (I, O), with the program stored in memory]



Revolution of Transistors
 Transistors were invented in 1947 at Bell Labs (USA). They are much
smaller and cheaper than the vacuum tubes used in first-generation
computers
 More complex arithmetic and logic units and control units
 High-level programming languages were introduced
 First computers of this kind were officially developed in the late 1950s.
Front-runner companies delivering this new technology were NCR and
RCA, followed by IBM with the 7000 series

IBM 700/7000 Series: 700 (vacuum tubes) → 7000 (transistors)



Integrated Circuits
 Transistors were initially developed as self-contained components. In 1958
the concept of the Integrated Circuit was invented
 Before, electronic equipment was composed of discrete components (e.g.,
transistors, capacitors and resistors) separately packaged into their own
containers and wired together onto circuit boards
 The entire manufacturing process was expensive and cumbersome
 An integrated circuit packs large numbers of tiny transistors into a small chip
composed of semiconductor material (normally silicon)
 This results in circuits that are orders of magnitude smaller, faster, and
less expensive than those constructed of discrete electronic components
 Components of Integrated Circuits: memory cells (data storage), logic
gates (data processing), paths (data movement)

01/20/2021 High Performance Computing, G. Mencagli 7


Wafer and Chips
 A tiny wafer of silicon is divided into a matrix of small areas, each a few
millimeters square
 The identical circuit pattern is replicated on each square area
 Each area is separated in a chip
 Chips are composed of gates, memory cells and a number of input and
output attachment points
 Chips are packaged in a housing that protects them and provides pins for
attachment to other devices
 These are called packages and are ready to be assembled to build
complex systems
 A number of these packages can then be interconnected on a printed
circuit board to produce larger and more complex circuits

[Figure 2.7 Relationship Among Wafer, Chip, and Gate]
Moore’s Law
 Initially, only a few gates or memory cells could be reliably
manufactured and packaged together
 These early integrated circuits are referred to as Small Scale
Integration (SSI)

 Evolution steps
• SSI (Small Scale Integration)
• LSI (Large Scale Integration)
• VLSI (Very Large Scale Integration)
• ULSI (Ultra Large Scale Integration)
• GSI (Giga Scale Integration)

 Moore’s Law (1965): the observed number of transistors that can be put
on a single chip doubles every year (circuits become faster and cheaper)
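The growth law above can be sketched numerically; the starting transistor count and the 2-year doubling period used below are illustrative assumptions (Moore's original 1965 figure was one year, later revised to roughly two):

```python
# Exponential growth implied by Moore's Law: N(t) = N0 * 2**(t / T),
# where T is the doubling period in years.
def transistors(n0: int, years: float, doubling_period: float) -> float:
    return n0 * 2 ** (years / doubling_period)

# Illustrative: starting from 2,000 transistors, after 10 years with a
# 2-year doubling period we get 5 doublings, i.e. a 32x increase.
print(transistors(2_000, 10, 2))  # 64000.0
```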
Performance Balance

Techniques to balance processor and memory performance:

• Increase the number of bits retrieved at one time, by making the memory
“wider” rather than “deeper” and by using wider data paths
• Reduce the frequency of memory access, by incorporating increasingly
complex and efficient cache structures between the processor and main memory
• Change the memory interface to make it more efficient, by including a cache
or other buffering scheme on the memory chip
• Increase the interconnect bandwidth between processors and memory, by
using higher-speed networks to buffer and structure data flow



Clock Speed and Power Density
 Increase the processor speed by shrinking logic gate size
• With more gates, packed more tightly, we can increase the clock rate
• Because propagation time for signals is reduced
 Increase size and speed of caches
 Change processor organization and architecture (pipelining, superscalar
processors)
 However, power density increases with the density of logic and with the
clock speed. Dissipating heat becomes a complex issue
 A new architectural design was needed

[Figure 2.11 Processor Trends: transistors (thousands), frequency (MHz),
power (W) and number of cores, 1970-2010]
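The power wall described above follows from the standard CMOS dynamic-power approximation (a textbook relation, not derived in the slides):

```latex
P_{\mathrm{dyn}} \;\approx\; \alpha \, C \, V^{2} \, f
```

where α is the switching activity factor, C the switched capacitance, V the supply voltage and f the clock frequency. Since sustaining a higher f typically also requires a higher V, power grows far faster than linearly with clock speed, which is why raising the clock rate indefinitely stopped being viable.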



Chip Multi-Processors (CMPs)
 New strategy: placing multiple processors (cores) on the same chip, with a
large shared cache. This technology is called Multicore or CMP
 The power consumption of memory logic is much less than that of
processing logic
 As the logic density continues to rise, the trend was to increase the number
of cores integrated on the same chip (from 2 to 8, 16 and more)
• The use of multiple processors on the same chip provides the potential to
increase performance without increasing the clock rate
• The strategy is to use several simpler processors on the chip rather than
one more complex processor
• With more processors, larger caches are justified
• As caches became larger, it made performance sense to create two and
then three levels of cache on a chip
 However, software should support the effective use of multiple processors



HPC Computing Platforms
 Multicores are now used as building blocks for large Multiple-Instruction
Multiple-Data (MIMD) parallel systems
 CMP-based Shared-Memory Systems and CMP-based Distributed-Memory
Systems
• Shared memory: several multicore CPUs share a memory system through a
proper internal interconnection network
• Distributed memory: several shared-memory systems are interconnected
through a network. Each memory can be accessed only by the CPUs in the
same shared-memory node
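The two organizations can be mimicked in miniature with Python's standard library (an analogy only: threads share one address space like cores in a shared-memory node, while workers exchanging messages over an explicit channel behave like distributed-memory nodes; all names below are illustrative):

```python
import threading
import queue

# Shared-memory style: workers update one shared structure directly
# (synchronization is the programmer's responsibility).
counter = [0]
lock = threading.Lock()

def shared_worker() -> None:
    for _ in range(1000):
        with lock:
            counter[0] += 1

threads = [threading.Thread(target=shared_worker) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter[0])  # 4000

# Distributed-memory style: no shared variables between workers; partial
# results travel over an explicit channel (standing in for the network).
def distributed_worker(out: queue.Queue, chunk: range) -> None:
    out.put(sum(chunk))  # send a message instead of touching shared state

channel: queue.Queue = queue.Queue()
workers = [threading.Thread(target=distributed_worker, args=(channel, r))
           for r in (range(0, 50), range(50, 100))]
for w in workers: w.start()
for w in workers: w.join()
total = channel.get() + channel.get()
print(total)  # 4950
```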



Graphics Processing Units
 In addition to multicores and their interconnection in shared- and
distributed-memory systems, GPUs represent a further fundamental
building block of HPC systems
 Born to accelerate graphics tasks. However, they are now available for
general-purpose computing (GP-GPU) through proper programming
models (e.g., CUDA and OpenCL)

Major vendors
• Nvidia
• AMD
• ASUS
• Intel
• …

 Much more resources devoted to processing units


 Single-Instruction Multiple-Threads (SIMT): combination of Single-
Instruction Multiple-Data (SIMD) with hardware multi-threading
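The SIMD idea, one instruction applied to many data elements, can be sketched in plain Python (a conceptual model only; a real GPU executes the same instruction across thousands of hardware lanes in lockstep):

```python
# Scalar (SISD) style: one operation on one data element per step.
def saxpy_scalar(a: float, x: list, y: list) -> list:
    result = []
    for xi, yi in zip(x, y):
        result.append(a * xi + yi)   # one element at a time
    return result

# SIMD/SIMT style: conceptually, every "lane" (GPU thread) executes the
# SAME instruction a*x[i] + y[i] on a DIFFERENT element i.
def saxpy_simd(a: float, x: list, y: list) -> list:
    return [a * xi + yi for xi, yi in zip(x, y)]

x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
print(saxpy_simd(2.0, x, y))  # [12.0, 24.0, 36.0, 48.0]
```

SAXPY (a·x + y) is the classic example of a data-parallel kernel: there are no dependencies between elements, so all lanes can proceed independently.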



HPC and Clouds
 HPC systems are notoriously expensive architectures
 Cloud Computing is a way for Cloud providers to offer HPC resources to
users in a cost-effective manner
 Pay-as-you-go model



Top500 List
Last update: June 2020



Course Content



Approach of the Course
 Study basic HPC methodologies and technologies thoroughly
• Cloud Computing is based on this knowledge, and adds some technological issues (job
scheduling, dynamic QoS control, elasticity, security)
• Cloud: 2nd year with a specific course
 Parallel architectures, at multi-core level
• Shared memory
• Interconnection networks
• Clusters of shared-memory machines: data centers
 Parallel programming models
 How to build and how to use HPC machines, and what can be expected from
them (performance models)
 Methodological and conceptual imprint, with application to current
and future technologies
 No laboratory activity in this course
 2nd year: laboratory, commercial and open-source programming tools



“Hardware-Software Gap”

Programmability Wall



Parallel Processing Models
 Parallelization of sequential computations
• Which application design methodology?
• Which programming technologies?
• What are the key issues in the design of run-time supports for parallel
processing frameworks?
• How to design and how to dimension a parallel program
• How to map a parallel program onto a parallel machine
• How to evaluate/predict performance
• What is performance? What are the important metrics?
 Uniform ‘hardware-software’ methodology
• How to model parallel architectures
• How to model parallel programs
• How to identify bottlenecks both in the parallel program design as well as at
the architectural level
• How parallel programs are designed for, and executed by, parallel
architectures
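As an example of the analytic cost models mentioned above, a minimal sketch of the classic speedup and efficiency metrics under Amdahl's model (the serial fraction used below is an illustrative value):

```python
# Amdahl's law: with serial fraction s and n processors,
#   speedup(n) = 1 / (s + (1 - s) / n),  efficiency(n) = speedup(n) / n.
def speedup(serial_fraction: float, n: int) -> float:
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)

def efficiency(serial_fraction: float, n: int) -> float:
    return speedup(serial_fraction, n) / n

# Illustrative: suppose 10% of the program is inherently serial.
for n in (1, 4, 16, 64):
    print(n, round(speedup(0.10, n), 2), round(efficiency(0.10, n), 2))
# Speedup saturates toward 1/s = 10 no matter how many processors we add,
# while efficiency steadily drops: the serial part becomes the bottleneck.
```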



Course Organization
and
Working Approach



Course Program and Lectures
 The program will cover two fundamental parts
• Parallel Programming Models
• Parallel Architectures
 We will study the two parts in an interleaved manner

Par. Models Par. Arch.s Par. Models Par. Arch.s


 We have three lectures per week, for a total of 36 lectures (72 hours of
classroom teaching)
Tuesday 11:15 – 13:00
Wednesday 09:15 – 11:00
Thursday 16:15 – 18:00

 Two thirds of the hours will be used for introducing and explaining theoretical
concepts
 One third of the hours will be used for doing exercises



Organization and Exam
 Question time
• Tuesday 14:00 – 17:00
• Send me an email to book a question time slot
• Question time will be provided through Microsoft Teams
 Homeworks
• They are proposed generally every two weeks
• They consist of a set of useful exercises related to the last lectures
• Not mandatory (but strongly recommended)
 Exam modality
• Written part with exercises
• Oral part with small exercises and theoretical questions
• The rules can be changed due to the Covid-19 pandemic
 Exam attempts (resits)
• 6 sessions: January, February, May, June, July and September
• Students have to register (through the UNIPI web portal) for the exam
session they want to attend
• Students have to take the written part and the oral part in the same session
(except in some exceptional cases)
Home Page of the Course
 The official web page of the course is at the following URL (within my personal
homepage)
http://www.di.unipi.it/~mencagli

 Downloadable material
• Slides
• Homeworks
• Exam results and solutions
Microsoft Teams (2020-2021)
 Due to the Covid-19 pandemic, the Department of Computer Science
(University of Pisa) decided that all the lectures of the Academic Year 2020-
2021 (first semester) will be given online only
 The course contents (i.e. slides, homeworks, exams with solutions) will be
continuously uploaded to my personal web page
 Such content will also be available in the official Microsoft Teams page of
the HPC course (code m66f66n) together with video recordings
 Several channels are available
• General: channel with general posts about the working
activities of the course. In the “file” section you can find
some useful files about the course
• Question time: channel to book a question time and to
start an online meeting to discuss with the professor about
problems and explanations
• Lecture recordings: channel where the lectures will be
given according to the official time schedule. Lectures
will be recorded and made available to students for their
offline studying activity



Reference Book
 Pisa University Press: paper edition and digital edition (2014)
 100% faithful coverage of course lectures and exercises (plus other topics,
fully separate from the exam program)
 We will follow the book accurately, step-by-step
 Slides are not sufficient for a good preparation!



Errata Corrige
 In the home page of the course, a small document is available
containing integrations and the errata corrige of the book

 Please, download and print it


Working Approach
 Critical aptitude and synthesis capability must be properly developed
 Exam: merely repeating parts of the book/slides is only a necessary
condition – in no way is it sufficient!
 REQUIRED: ABILITY TO SOLVE PROBLEMS BY PROPERLY
COMBINING VARIOUS COURSE CONCEPTS AND TECHNIQUES

IMPORTANT
 Interaction with the teacher is strongly recommended
• Questions during the lectures
• Presentation and discussion of exercises and problems
• Homeworks (correction mainly during question time)
• Question time to be booked by email and then provided through the
Microsoft Teams platform online



Background and Prerequisites
 I will assume that students know some basic concepts of Computer
Architectures from their previous studies
 To be more precise, a Bachelor degree-level course on Structured
Computer Architecture is assumed
• Hardware level (registers, combinational and sequential circuits)
• Firmware level (firmware units, microprogramming)
• Assembler level (RISC vs CISC instruction sets)
• CPU architecture, I/O, Communication
• Virtual Memory, Addressing Spaces
• Compiler (basic concepts)
• Memory hierarchies and caching
• Process level and its run-time support
 Capability to use concepts in studying real, complex systems and
their interrelation with software and programming tools



Entry Test
 What is it? A self-evaluation test on background concepts and
prerequisites
 Nature of the test
• Completely optional
 Is it part of the Exam?
• Absolutely not; it is a test useful for students to check their level of preparation
 When?
• Second week of the course
• The test will be available online through Microsoft Teams
 Modality
• Written test with some multiple-choice questions about concepts that students should
have studied during their Bachelor Degree program
 In case of a low score (< 70%)?
• Students are kindly invited to study the Background part of the textbook



Appendix on Background

Last 100 pages of the textbook….



Appendix on Background
 The student must be aware of how the Appendix on prerequisites
has to be used
 No specific lecture will be dedicated to Appendix subjects
 Instead, prerequisites will be recalled when needed
 In no way can such SHORT reminders replace the full treatment of the
prerequisite subjects
 The student is strongly invited to use the Appendix in order to
• be (or become) able to apply the needed concepts and techniques
• and to fill any possible gap whenever necessary

 Other reference books on prerequisites (optional)


• D.A. Patterson, J.L. Hennessy, “Computer Organization and Design:
the Hardware/Software Interface”, Morgan Kaufmann Publishers Inc.
• A.S. Tanenbaum, “Structured Computer Organization”, Prentice-Hall

