
Unit 3 : Concurrent Computing – Thread Programming
Manjrasoft
IBM Power Systems

• Introduction to Thread Programming
• Parallel Computing with Threads
• Programming Applications with Threads
Unit Objectives

After completing this unit you should be able to understand:

• Concurrent Computing – Thread Programming
• Parallelism for Single-Machine Computation
• Programming Applications with Threads
• What is a Thread? Overview of the relations between threads and processes
• Thread APIs
• Thread Support in Java and .NET
• Techniques for Parallel Computation with Threads
• Multithreading with Aneka
• Introducing the Thread Programming Model
• Aneka Threads vs. Common Threads
• Thread Life Cycle
• Type Serialization
• Programming Applications with Aneka Threads
Concurrent Computing – Thread Programming

• Throughput computing focuses on delivering high volumes of computation in the form of transactions.
• Initially related to the field of transaction processing, throughput computing has since been extended beyond that domain.
• Advances in hardware technology led to the creation of multi-core systems, which have made it possible to deliver high-throughput computations even on a single computer system.
• Throughput computing is realized by means of multiprocessing and multithreading. Multiprocessing is the execution of multiple programs on a single machine, whereas multithreading relates to the possibility of multiple instruction streams within the same program.
• Here we discuss the concept of multithreading and describe how it supports the development of high-throughput computing applications.
• We discuss how multithreaded programming, originally conceived to be contained within the boundaries of a single machine, can be extended to a distributed context, and which limitations apply.
• The Aneka Thread Programming Model will be taken as a reference model.
Parallelism for single machine computation

• Parallelism has been a technique for improving the performance of computers since the early 1960s, when Burroughs Corporation designed the D825, the first MIMD multiprocessor ever produced.
• From there on, a variety of parallel strategies have been developed.
• In particular, multiprocessing, which is the use of multiple processing units within a single machine, has gained a good deal of interest and given birth to several parallel architectures.
• One of the most important distinctions is made in terms of the symmetry of processing units.
• Asymmetric multiprocessing involves the concurrent use of different processing units that are specialized to perform different functions.
Parallelism for single machine computation Contd…

• Symmetric multiprocessing features the use of similar or identical processing units to share the computation load.
• Other examples are non-uniform memory access (NUMA) and clustered multiprocessing, which, respectively, define a specific architecture for accessing a shared memory between processors and the use of multiple computers joined together as a single virtual computer.
• Symmetric and asymmetric multiprocessing are the techniques used to increase the performance of commodity computer hardware.
• The introduction of graphics processing units (GPUs), which are de facto processors, is an application of asymmetric processing, whereas multicore technology is the latest evolution of symmetric multiprocessing.
Parallelism for single machine computation Contd…

• Multiprocessor and especially multicore technologies are now of fundamental importance because of the physical constraints imposed on frequency scaling, which had been the common practice for performance gains in recent years.
• It was no longer possible to increase the frequency of the processor clock without paying in terms of power consumption and cooling, and this condition became unsustainable in May 2004, when Intel officially cancelled the development of two new microprocessors in favor of multicore development.
• This date is generally considered the end of the frequency-scaling era and the beginning of multicore technology. Other issues also determined the end of frequency scaling, such as the continuously increasing gap between processor and memory speeds and the difficulty of increasing instruction-level parallelism in order to keep a single high-performance core busy.
Parallelism for single machine computation Contd…

• Multicore systems are composed of a single processor that features multiple processing cores that share the memory.
• Each core generally has its own L1 cache, and the L2 cache is common to all the cores, which connect to it by means of a shared bus.
• Dual- and quad-core configurations are quite popular nowadays and constitute the standard hardware configuration for commodity computers.
• Architectures with many more cores are also available but are not designed for the commodity market.
• Multicore technology has been used not only as a support for processor design but also in other devices, such as GPUs and network devices, thus becoming a standard practice for improving performance.

[Figure: a multicore CPU on a single die — Core 1 through Core N each have a private L1 cache and share an L2 cache, which connects to main memory over a shared bus.]
Parallelism for single machine computation Contd…

• Multiprocessing is just one technique that can be used to achieve parallelism, and it does so by leveraging parallel hardware architectures.
• Parallel architectures are better exploited when programs are designed to take advantage of their features.
• In particular, an important role is played by the operating system, which defines the runtime structure of applications by means of the abstractions of process and thread.
• A process is the runtime image of an application, or better, a program that is running, while a thread identifies a single flow of execution within a process.
• A system that allows the execution of multiple processes at the same time supports multitasking. It supports multithreading when it provides structures for explicitly defining multiple threads within a process.
Parallelism for single machine computation Contd…

• Multitasking and multithreading can be implemented on top of computer hardware that is constituted of a single processor and a single core, as was the common practice before the introduction of multicore technology.
• In this case, the operating system gives the illusion of concurrent execution by interleaving the execution of instructions of different processes, and of different threads within the same process.
• This is also the case in multiprocessor/multicore systems whenever the number of threads or processes is higher than the number of processors or cores.
• Nowadays, almost all commonly used operating systems support multitasking and multithreading.
• Moreover, all the mainstream programming languages incorporate the abstractions of process and thread within their APIs, whereas direct support of multiple processors and cores for developers is very limited and often confined to specific libraries, which are available for a subset of programming languages such as C/C++.
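The interleaving described above can be observed with a small Java sketch (the thread count of 8 is an arbitrary choice for this illustration and will typically exceed the core count of a commodity machine); the operating system schedules the threads onto the available cores and all of them run to completion:

```java
public class Interleaving {
    public static void main(String[] args) throws InterruptedException {
        int nThreads = 8; // usually more threads than physical cores
        Thread[] workers = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            // Each thread executes a short CPU-bound loop; the OS
            // scheduler interleaves the threads on the available cores.
            workers[i] = new Thread(() -> {
                long sum = 0;
                for (int k = 0; k < 1_000_000; k++) sum += k;
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join(); // wait for every thread
        System.out.println("All " + nThreads + " threads completed");
    }
}
```

Whether the threads actually run in parallel or are purely interleaved depends on the hardware, but the program is written identically in both cases.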
Programming Applications with Threads

• Modern applications perform multiple operations at the same time. Developers organize programs in terms of threads in order to express intra-process concurrency.
• The use of threads might be implicit or explicit.
• Implicit threading happens when the underlying APIs use internal threads to perform specific tasks supporting the execution of applications, such as graphical user interface (GUI) rendering, or garbage collection in the case of virtual machine-based languages.
• Explicit threading is characterized by the use of threads within a program by application developers, who use this abstraction to introduce parallelism.
What is a Thread?

• A thread identifies a single control flow, which is a logical sequence of instructions, within a process.
• By logical sequence of instructions, we mean a sequence of instructions that have been designed to be executed one after the other.
• More commonly, a thread identifies a kind of yarn that is used for sewing, and the feeling of continuity expressed by the interlocked fibers of that yarn is used to recall the idea that the instructions of a thread express a logically continuous sequence of operations.
• Operating systems that support multithreading identify threads as the minimal building blocks for expressing running code.
• This means that, despite their explicit use by developers, any sequence of instructions executed by the operating system is within the context of a thread.
• As a consequence, each process contains at least one thread but, in several cases, is composed of many threads having variable lifetimes. Threads within the same process share the memory space and the execution context; besides this, there is no substantial difference between threads belonging to different processes.
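A minimal Java sketch of the shared memory space just described (the thread count and iteration count are arbitrary choices for the example): two threads of the same process update one shared counter, which is possible precisely because they share the process's memory:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SharedCounter {
    // Shared state: lives in the process memory, visible to all threads.
    static final AtomicInteger counter = new AtomicInteger();

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                counter.incrementAndGet(); // atomic update of shared memory
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join(); // wait for both threads to finish
        t2.join();
        System.out.println("Final counter: " + counter.get());
    }
}
```

Because AtomicInteger serializes the updates, the result is deterministically 200000; with a plain int field the two threads would race and the final value would be unpredictable.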
What is a Thread? Contd…

• In a multitasking environment the operating system assigns different time slices to each process and interleaves their execution.
• The process of temporarily stopping the execution of one process, saving all the information in the registers (and in general the state of the CPU in order to restore it later), and replacing it with the information related to another process is known as a context switch.
• This operation is generally considered demanding, and the use of multithreading minimizes the latency imposed by context switches, thus allowing the execution of multiple tasks in a lighter fashion.
• The state representing the execution of a thread is minimal compared to the one describing a process, which is why switching between threads of the same process is cheaper than switching between processes.
Overview of the relation between threads and processes

[Figure: a process with a shared memory area; a main thread and several additional threads run along an execution timeline, each with its own thread-local storage and its own instruction stream (program counter).]
Thread APIs

• Even though the support for multithreading varies according to the operating system and the specific programming languages that are used to develop applications, it is possible to identify a minimum set of features that are commonly available across all the implementations.
• Here we discuss:
  – POSIX Threads
  – Threading Support in Java and .NET
POSIX Threads

• POSIX (Portable Operating System Interface for Unix) is a set of standards for application programming interfaces for the portable development of applications over the Unix operating system flavors.
• The standards address Unix-based operating systems, but an implementation of the same specification has also been provided for Windows-based systems.
• The POSIX standard defines the following operations: creation of threads with attributes; termination of a thread; and waiting for thread completion (the join operation). In addition to the logical structure of a thread, other abstractions are introduced in order to support proper synchronization among threads, such as semaphores, conditions, reader-writer locks, and others.
• The model proposed by POSIX has been taken as a reference for other implementations that might provide developers with a different interface but a similar behavior.
POSIX Threads Contd…

• What is important to remember from a programming point of view is the following:
  – A thread identifies a logical sequence of instructions.
  – A thread is mapped to a function that contains the sequence of instructions to execute.
  – A thread can be created, terminated, or joined.
  – A thread has a state that determines its current condition: whether it is executing, stopped, terminated, waiting for I/O, etc.
  – The sequence of states that the thread undergoes is partly determined by the operating system scheduler and partly by the application developers.
  – Threads share the memory of the process, and since they are executed concurrently they need synchronization structures.
  – Different synchronization abstractions are provided to solve different synchronization problems.
• A default implementation of the POSIX.1c specification has been provided for the C language. All the functions and data structures available are exposed in the pthread.h header file, which is part of the standard C implementations.
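Since the POSIX model has been taken as a reference by other implementations, the points above map directly onto Java's thread API; this small sketch (class and variable names are ours) shows a thread being created from a function, joined, and queried for its state before and after execution:

```java
public class ThreadLifecycle {
    public static void main(String[] args) throws InterruptedException {
        // The thread is mapped to a function (here, a lambda) that
        // contains the logical sequence of instructions to execute.
        Thread worker = new Thread(() -> {
            long sum = 0;
            for (int i = 0; i < 1_000; i++) sum += i;
        });

        System.out.println("State after creation: " + worker.getState());
        worker.start(); // creation of the thread of execution
        worker.join();  // wait for thread completion (the join operation)
        System.out.println("State after join: " + worker.getState());
    }
}
```

The two printed states, NEW and TERMINATED, are the endpoints of the state sequence that the scheduler and the developer jointly determine.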
Threading Support in Java and .NET

• Languages such as Java and C# provide a rich set of functionalities for multithreaded programming by using an object-oriented approach.
• Since both Java and .NET execute code on top of a virtual machine, the APIs exposed by the libraries refer to managed or logical threads.
• These are mapped to physical threads (i.e., those made available as abstractions by the underlying operating system) by the runtime environment in which programs developed with these languages execute.
• Despite such a mapping process, managed threads are considered, from a programming point of view, as physical threads and expose the same functionalities.
Threading Support in Java and .NET Contd…

• Both Java and .NET express the thread abstraction with the class Thread, exposing the common operations performed on threads: start, stop, suspend, resume, abort, sleep, join, and interrupt.
• Start and stop/abort are used to control the lifetime of the thread instance, while suspend and resume are used to programmatically pause, and then continue, the execution of a thread.
• These two operations are generally deprecated in both implementations, which favor the use of appropriate techniques involving proper locks or the use of the sleep operation.
• The sleep operation allows pausing the execution of a thread for a predefined period of time.
• It is different from the join operation, which makes one thread wait until another thread has completed.
• These waiting states can be interrupted by using the interrupt operation, which resumes the execution of the thread and generates an exception within the code of the thread to notify it of the abnormal resumption.
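A short Java sketch of the sleep, join, and interrupt operations described above (the sleep duration is an arbitrary choice): the main thread interrupts a sleeping thread, whose waiting state is resumed abnormally through an InterruptedException:

```java
public class InterruptSketch {
    public static void main(String[] args) throws InterruptedException {
        Thread sleeper = new Thread(() -> {
            try {
                Thread.sleep(60_000); // pause for a predefined period
                System.out.println("Woke up normally");
            } catch (InterruptedException e) {
                // interrupt() resumed the waiting thread and raised
                // this exception to notify the abnormal resumption.
                System.out.println("Sleep was interrupted");
            }
        });
        sleeper.start();
        Thread.sleep(100);   // give the sleeper time to reach sleep()
        sleeper.interrupt(); // break the waiting state
        sleeper.join();      // wait until the sleeper has completed
        System.out.println("Sleeper terminated");
    }
}
```

Even if the interrupt arrives before the sleeper actually enters sleep(), the interrupted status is recorded and the exception is thrown as soon as sleep() is called, so the outcome is the same.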
Threading Support in Java and .NET Contd…

• The two frameworks provide different support for implementing synchronization among threads.
• In general, the basic features for implementing mutexes, critical regions, and reader-writer locks are completely covered by means of the basic class libraries or additional libraries.
• More advanced constructs than the thread abstraction are available in both languages. In the case of Java, most of them are contained in the java.util.concurrent package (see http://download.oracle.com/javase/6/docs/api/java/util/concurrent/package-summary.html), while the rich set of APIs for concurrent programming in .NET is further extended by the .NET Parallel Extensions framework.
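As a small illustration of the more advanced constructs in java.util.concurrent (the task and pool sizes here are arbitrary choices): an ExecutorService runs a set of tasks on a fixed pool of threads and collects their results through Future objects, with no manual thread management:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExecutorSketch {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<Integer>> results = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            final int n = i;
            // Each submitted task runs on one of the pool's threads.
            results.add(pool.submit(() -> n * n));
        }
        int total = 0;
        for (Future<Integer> f : results) {
            total += f.get(); // blocks until the task's result is ready
        }
        pool.shutdown();
        System.out.println("Sum of squares 1..5 = " + total);
    }
}
```

The design point is that tasks, not threads, become the unit of work — the pool decides how tasks map onto threads, much as a runtime maps logical threads onto physical ones.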
Techniques for Parallel Computation with Threads

• Developing parallel applications requires an understanding of the problem and its logical structure.
• Understanding the dependencies and the correlation of tasks within an application is fundamental for designing the right program structure and for introducing parallelism where appropriate.
• Decomposition is a useful technique that helps in understanding whether a problem can be divided into components (or tasks) that can be executed concurrently.
• If such a decomposition is possible, it also provides a starting point for a parallel implementation, since it allows the breaking down of the problem into independent units of work that can be executed concurrently with the support provided by threads.
• The two main decomposition/partitioning techniques used are domain decomposition and functional decomposition.
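Domain decomposition can be sketched in Java as follows (the array size and thread count are arbitrary choices for the example): the data domain, an array, is split into partitions, each partition is summed by a separate thread, and the partial results are then combined:

```java
public class DomainDecomposition {
    public static void main(String[] args) throws InterruptedException {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) data[i] = 1;

        int nThreads = 4;
        long[] partial = new long[nThreads]; // one result slot per thread
        Thread[] workers = new Thread[nThreads];
        int chunk = data.length / nThreads;

        // Domain decomposition: each thread processes its own partition.
        for (int t = 0; t < nThreads; t++) {
            final int id = t;
            final int from = t * chunk;
            final int to = (t == nThreads - 1) ? data.length : from + chunk;
            workers[t] = new Thread(() -> {
                long s = 0;
                for (int i = from; i < to; i++) s += data[i];
                partial[id] = s; // no contention: each thread has its own slot
            });
            workers[t].start();
        }
        for (Thread w : workers) w.join();

        long total = 0;
        for (long p : partial) total += p; // combine the partial results
        System.out.println("Total = " + total);
    }
}
```

Because the partitions are independent, the units of work need no synchronization beyond the final join — exactly the property that makes such problems good candidates for threads (and, later, for distribution).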
Multithreading with Aneka

• As applications become increasingly complex, there is greater demand for computational power than can be delivered by a single multicore machine.
• Often, this demand cannot be addressed with the computing capacity of a single machine.
• It is then necessary to leverage distributed infrastructures such as Clouds.
• Decomposition techniques can be applied to partition a given application into several units of work that, rather than being executed as threads on a single node, can be submitted for execution by leveraging Clouds.
Multithreading with Aneka Contd…

• Even though a distributed facility can dramatically increase the degree of parallelism of applications, its use comes with a cost in terms of application design and performance.
• For example, since the different units of work are not executing within the same process space but on different nodes, both the code and the data need to be moved to a different execution context.
• The same happens for results, which need to be collected remotely and brought back to the master process.
• Moreover, if there is any communication among the different workers, it is necessary to redesign the communication model, possibly by leveraging the APIs provided by the middleware, if any.
• In other words, the transition from a single-process multithreaded execution to a distributed execution is not transparent, and application redesign and re-implementation are often required.
Multithreading with Aneka Contd…

• The amount of effort required to convert an application often depends on the facilities offered by the middleware managing the distributed infrastructure.
• Aneka, as a middleware for managing clusters, Grids, and Clouds, provides developers with advanced capabilities for implementing distributed applications.
• In particular, it takes traditional thread programming a step further. It lets you write multithreaded applications in the traditional way, with the added twist that each of these threads can now be executed outside the parent process and on a separate machine.
• In reality, these "threads" are independent processes executing on different nodes and do not share memory or other resources, but they allow you to write applications using the same thread constructs for concurrency and synchronization as with traditional threads.
• Aneka threads, as they are called, let you easily port existing multithreaded compute-intensive applications to distributed versions that can run faster by utilizing multiple machines simultaneously, with a minimum conversion effort.
Introducing the Thread Programming Model

• Aneka offers the capability of implementing multithreaded applications over the Cloud by means of the Thread Programming Model.
• This model introduces the abstraction of a distributed thread, also called an Aneka thread, which mimics the behavior of local threads but executes over a distributed infrastructure.
• The Thread Programming Model has been designed for transparently porting high-throughput multithreaded parallel applications over a distributed infrastructure and provides the best advantage in the case of embarrassingly parallel applications.
Introducing the Thread Programming Model Contd…

• Each application designed for Aneka is represented by a local object that interfaces to the middleware.
• According to the different programming models supported by the framework, such an interface exposes different capabilities, which are tailored to efficiently support the design and implementation of applications following a specific programming style.
• In the case of the Thread Programming Model, the application is designed as a collection of threads whose collective execution represents the application run.
• Threads are created and controlled by the application developer, while Aneka is in charge of scheduling their execution once they have been started.
• Threads are transparently moved and remotely executed while developers control them from local objects that act like proxies of the remote threads.
• This approach makes the transition from local multithreaded applications to distributed applications quite easy and seamless.
Introducing the Thread Programming Model Contd…

• The Thread Programming Model exhibits APIs that mimic the ones exposed by the .NET base class libraries for threading.
• In this way developers do not have to completely rewrite applications in order to leverage Aneka: the process of porting local multithreaded applications is as simple as replacing the System.Threading.Thread class and introducing the AnekaApplication class.
• There are three major elements that constitute the object model of applications based on the Thread Programming Model:
  – Application
  – Threads
  – Thread Manager
Introducing the Thread Programming Model Contd…

• Application:
  – This class represents the interface to the Aneka middleware and constitutes a local view of a distributed application.
  – In the case of the Thread Programming Model, the single units of work are created by the programmer. Therefore, the specific class used will be Aneka.Entity.AnekaApplication<T,M>, with T and M properly selected.
• Threads:
  – Threads represent the main abstractions of the model and constitute the building blocks of the distributed application.
  – Aneka provides the Aneka.Threading.AnekaThread class, which represents a distributed thread. This class exposes a subset of the methods exposed by the System.Threading.Thread class, reduced to those operations and properties that make sense, or can be efficiently implemented, in a distributed context.
• Thread Manager:
  – This is an internal component that is used to keep track of the execution of distributed threads and provide feedback to the application.
  – Aneka provides a specific version of the manager for this model, which is implemented in the Aneka.Threading.ThreadManager class.
Introducing the Thread Programming Model Contd…

• As a result, porting a local multithreaded application to Aneka involves defining an instance of the AnekaApplication<AnekaThread, ThreadManager> class and replacing any occurrence of System.Threading.Thread with Aneka.Threading.AnekaThread. Developers can then create threads, control their life cycle, and coordinate their execution just as with local threads.
• Aneka applications expose additional properties, such as events that notify the completion of threads, their failure, the completion of the entire application, and thread state transitions.
• These operations are also available for the Thread Programming Model and constitute additional features that can be leveraged while porting local multithreaded applications, where this support needs to be explicitly programmed.
• Also, the AnekaApplication class provides support for files, which are automatically and transparently moved in the distributed environment.
Programming Applications with Aneka Threads

• In order to show how it is possible to quickly port a multithreaded application to Aneka threads, we provide a distributed implementation of the examples previously discussed for local threads.
Aneka Threads Application Model

• The Thread Programming Model is a programming model in which the units of work are created as Aneka threads by the programmer. Therefore, it is necessary to utilize the AnekaApplication<W,M> class, which is the application reference class for all the programming models falling into this category.
• The Aneka APIs make strong use of generics and characterize the support given to different programming models through template specialization.
• Hence, to develop distributed applications with Aneka threads it is necessary to specialize the template type as follows:

  AnekaApplication<AnekaThread, ThreadManager>

• This will be the class type for all the distributed applications using the Thread Programming Model. These two types are defined in the Aneka.Threading namespace, contained in the Aneka.Threading.dll library of the Aneka SDK.
Aneka Threads Application Model Contd…

• Another important component of the application model is the Configuration class, which is defined in the Aneka.Entity namespace (Aneka.dll).
• This class contains a set of properties that allow the application class to configure its interaction with the middleware, such as: the address of the Aneka index service, which constitutes the main entry point of Aneka Clouds; the user credentials required to authenticate the application with the middleware; some additional tuning parameters; and an extended set of properties that might be used to convey additional information to the middleware.
