Multithreading Python
Multithreading in Python
• Multithreading lets a CPU run different parts (threads) of a
process concurrently, which improves CPU utilization.
• In Python, multithreading runs multiple threads concurrently by
rapidly switching the CPU between them (called context switching).
• What does multithreading mean?
• Multithreading is the ability of a central processing unit (CPU) (or a
single core in a multi-core processor) to provide multiple threads of
execution concurrently, supported by the operating system.
• What is the difference between multithreading and multiprocessing?
• Multiprocessing systems include multiple complete processing units in
one or more cores and provide parallelism. Multithreading aims to
increase utilization of a single core by using thread-level parallelism as
well as instruction-level parallelism, and provides concurrency.
Thread
• The information describing a thread is stored in a Thread Control
Block (TCB):
1. Thread Identifier: Unique id (TID) is assigned to every
new thread
2. Stack pointer: Points to thread’s stack in the process.
Stack contains the local variables under thread’s scope.
3. Program counter: a register which stores the address of
the instruction currently being executed by thread.
4. Thread state: can be running, ready, waiting, start or
done.
5. Thread’s register set: registers assigned to thread for
computations.
6. Parent process Pointer: A pointer to the Process control
block (PCB) of the process that the thread lives on.
• On a single CPU with a single core, only one thread can be
executed at any point in time. The operating system divides CPU
time into slices and assigns them to threads based on their priority,
so that some threads get more time and others less.
• The time duration during which the instructions belonging to a
thread are executing on the CPU is called a time slice.
Process vs Thread
• What is a process in computer architecture?
• A process is an instance of a computer program that is being executed. Any process has 3 basic
components:
• An executable program.
• The associated data needed by the program (variables, work space, buffers, etc.)
• The execution context of the program (state of the process)
• What is a thread in computer architecture?
• A thread is an entity within a process that can be scheduled for execution. It is the smallest
unit of processing that an OS can schedule, and it shares memory with the other threads of its process.
• A thread is a sequence of instructions within a program that can be executed independently
of other code. It forms a control flow within the same program/process, i.e. it is a subset of a process.
• Threads executed on a single-processor system create the illusion for the user that multiple
activities within a process are executed at the same time.
Advantages and Disadvantages of Multithreading
Advantages of Multithreading
• The user is not blocked, because threads run independently of each other.
• Better use of system resources is possible, since threads execute tasks in parallel.
• Enhanced performance on multi-processor machines.
• Multi-threaded servers and interactive GUIs use multithreading extensively.
Disadvantages of Multithreading
• Threads share data, so unsynchronized access can cause race conditions.
• Incorrect use of locks can lead to deadlocks.
• Context switching between threads adds overhead.
• Multithreaded programs are harder to write, test and debug.
Parallelism using Multithreading
• On a single-processor machine, the appearance of parallelism is
achieved by thread scheduling or time slicing.
• Every process has at least one thread, i.e. the main thread of the
process itself. A process can start multiple threads. The
operating system executes these threads like parallel
"processes".
• Multithreaded programs can run faster on computer
systems with multiple CPUs, because these threads
can be executed truly concurrently. (In standard CPython builds the
global interpreter lock lets only one thread execute Python bytecode
at a time, so this gain applies mainly to I/O-bound work.)
How about memory sharing in multithreaded programs?
• Threads of a process share the memory of global
variables. If a global variable is changed in one thread,
the change is visible to all threads. A thread can also have
its own local variables.
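The sharing rule above can be illustrated with a small sketch. The names `shared_log` and `worker` are hypothetical and chosen only for illustration: every thread appends to one global list, while `local_total` is a local variable that exists separately in each thread.

```python
import threading

shared_log = []  # global: visible to every thread in the process

def worker(n):
    # local_total is local to this call, so each thread has its own copy
    local_total = sum(range(n))
    # append() modifies the single shared list object
    shared_log.append((n, local_total))

threads = [threading.Thread(target=worker, args=(n,)) for n in (3, 4, 5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread sees the entries added by all three threads.
# The order of entries may vary between runs, so it is sorted here.
print(sorted(shared_log))  # [(3, 3), (4, 6), (5, 10)]
```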
Lifecycle of a Thread
Thread Class
• In Python, the threading module is used to create threads. The
older low-level thread module (named _thread in Python 3) has been
considered "deprecated" for quite a long time.
• The Thread class of the threading module provides all the
major functionality required to create and manage a thread.
• Thread objects are instances of the Thread class, where each
object represents an activity to be performed in a separate thread
of control.
• To create threads, the threading module can be used in two ways:
1. By inheriting the Thread class and
overriding __init__() and run()
2. By using a thread function, i.e. by passing a callable object to
the constructor
By inheriting the Thread class
• The steps to create a thread by inheriting the Thread class are as follows:
• 1. Define a class which extends the Thread class
• 2. Override the __init__() constructor (and call Thread.__init__() inside it)
• 3. Override the run() method
• 4. Once a thread object has been created, call start() to begin the execution of
its activity. Call join() to block the calling thread until that activity finishes.
• The constructor of the Thread class has the signature Thread(group=None,
target=None, name=None, args=(), kwargs=None, *, daemon=None)
• It takes the following arguments:
• target: the callable object (for example a function, given by name) to be
invoked by the run() method.
• args: the arguments to be passed to the target function, as a tuple
• t = threading.Thread(target=f, args=(i,))
• The other arguments are optional:
• group: reserved for a future ThreadGroup concept; it should be left as None.
• name: the name of the thread. If not given, the implementation assigns a unique name of the form "Thread-N", where N is a small decimal number.
• kwargs: a dictionary of keyword arguments for the target function. It helps in initialising the
child thread with data.
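The constructor arguments described above can be sketched as follows. The function `greet` and the dictionary `results` are hypothetical names used only for this illustration; `target`, `name`, `args` and `kwargs` are the documented parameters of `threading.Thread`.

```python
import threading

results = {}

def greet(greeting, name="world"):
    # Store the message under the name of the thread that ran this call
    results[threading.current_thread().name] = f"{greeting}, {name}"

# args supplies positional arguments, kwargs supplies keyword arguments
t = threading.Thread(
    target=greet,
    name="greeter",
    args=("Hello",),
    kwargs={"name": "Python"},
)
t.start()
t.join()

print(results)  # {'greeter': 'Hello, Python'}
```

Note the trailing comma in `args=("Hello",)`: without it the parentheses would not create a tuple.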
Thread Pool
• activeCount() (spelled active_count() in current Python versions): Returns the number of Thread objects
which are still alive.
• main_thread(): Returns the main thread object. In normal conditions, the main thread is the thread from
which the Python interpreter was started.
• start(): Starts the activity of a thread. It must be called only once for each thread, because it raises
a RuntimeError if called multiple times. Internally, it arranges for the run() method to be invoked,
which executes the target function or callable object.
• run(): Represents the activity of a thread. It can be overridden by a class that extends the
Thread class.
• join(): Blocks the calling thread until the thread on which join() was called
terminates. To let the threads run concurrently, call join() only after all the threads
have been started.
Methods of Thread class
• start(): To start a thread, use the start() method of the Thread object. It must be called at most once per thread object. It
arranges for the object's run() method to be invoked in a separate thread of control. start() raises
a RuntimeError if called more than once on the same thread object.
• Once the thread's activity is started, the thread is considered 'alive'. It stops being alive when its run() method
terminates, either normally or by raising an unhandled exception.
• join(): To make the current thread wait until another thread is complete, call that thread's join() method.
This blocks the calling thread until the thread whose join() method is called has terminated.
• Significance of join()
• When join() is invoked from the main thread, the main thread waits until the child thread on which join() is invoked
exits.
• If join() is not invoked, the main thread may continue before the child thread has finished, so it may use results
that the child has not yet produced. This can break program invariants and the integrity of the data on which the program operates.
• join() can also be given a timeout value. It always returns None, so is_alive() must be called afterwards to find out whether the thread finished.
• A thread that calls join() on itself would deadlock, so a RuntimeError is raised in that case.
Calling join() on a thread which has not yet been started also raises a RuntimeError.
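The start() and join() rules listed above can be checked with a short sketch. The function `slow_task` and the variable names are hypothetical; the sleep durations are arbitrary values chosen so that the timeout expires before the thread finishes.

```python
import threading
import time

def slow_task():
    time.sleep(0.5)

t = threading.Thread(target=slow_task)

# join() before start() raises RuntimeError
join_error = None
try:
    t.join()
except RuntimeError as exc:
    join_error = exc

t.start()

# join() with a timeout gives up after about 0.1 s and returns None,
# so is_alive() is needed to see whether the thread really finished
t.join(timeout=0.1)
still_running = t.is_alive()

t.join()  # no timeout: wait until the thread terminates
finished = not t.is_alive()

# start() a second time on the same object raises RuntimeError
start_error = None
try:
    t.start()
except RuntimeError as exc:
    start_error = exc

print(still_running, finished)
print(type(join_error).__name__, type(start_error).__name__)
```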
By the use of thread function
# Creating a thread by passing a callable object to the constructor
import threading

# Callable function forming the thread body: it prints the numbers 0 to 19
def Produce():
    i = 0  # a local variable of the thread function
    for x in range(20):
        print(i)
        i += 1

print("Main thread started")
ChildThread = threading.Thread(target=Produce)
# Start the child thread
ChildThread.start()
# Let the main thread wait for the child thread to complete
ChildThread.join()
print("Main thread resumed")
print("Main thread exiting")
ACTIVITY
1. Write a Python program that prints the
Fibonacci series using a thread.
2. Write a Python program that creates 2
threads, one for printing the square of a number
and one for printing its cube.
ACTIVITY 1
from threading import Thread

class FibonacciThread(Thread):  # A thread class to produce Fibonacci numbers
    def __init__(self):
        Thread.__init__(self)

    def run(self):  # override the run method to define the thread body
        print("Fibonacci Thread Started")
        firstTerm = 0
        secondTerm = 1
        nextTerm = 0
        for i in range(0, 20):
            if i <= 1:
                nextTerm = i
            else:
                nextTerm = firstTerm + secondTerm
                firstTerm = secondTerm
                secondTerm = nextTerm
            print(nextTerm)
        print("Fibonacci Thread Ending")

print("Main Thread Started")
MyFibonacciThread = FibonacciThread()
MyFibonacciThread.start()
print("Main Thread Started to wait for the Fibonacci Thread to complete")
MyFibonacciThread.join()
print("Main Thread Resumed")
print("Main Thread Ending")
ACTIVITY 2
import threading

def print_cube(num):
    print("Cube: {}".format(num * num * num))

def print_square(num):
    print("Square: {}".format(num * num))

if __name__ == "__main__":
    # creating threads
    t1 = threading.Thread(target=print_square, args=(10,))
    t2 = threading.Thread(target=print_cube, args=(10,))
    # starting thread 1
    t1.start()
    # starting thread 2
    t2.start()
    # wait until thread 1 is completely executed
    t1.join()
    # wait until thread 2 is completely executed
    t2.join()
Other Thread Methods
• Every thread has a name associated with it. The name can be
passed to the constructor, and it can be set or retrieved through the
name attribute (the older setName() and getName() methods are
deprecated).
• A daemon flag can be associated with any thread. The
significance of this flag is that the entire Python program exits
when only daemon threads are left. Daemon threads are threads
that run in the background to support the main thread or other
non-daemon threads. A daemon thread does not block the main
thread from exiting; it is stopped abruptly when the program
exits. A classic example from runtimes such as the JVM is the
garbage collector thread.
• The main thread object corresponds to the initial thread of
control in the Python program. It is not a daemon thread.
EXAMPLE
import threading
import time

def f1():
    print(threading.current_thread().name, 'Starting')
    time.sleep(1)
    print(threading.current_thread().name, 'Exiting')

def f2():
    print(threading.current_thread().name, 'Starting')
    time.sleep(2)
    print(threading.current_thread().name, 'Exiting')

t1 = threading.Thread(target=f1)  # use default name
t2 = threading.Thread(name='f2', target=f2)
t1.start()
t2.start()
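The daemon flag described earlier in this section can be sketched in the same style. The function names `background` and `foreground` and the thread names are hypothetical; `daemon` is the documented constructor argument and attribute of `threading.Thread`.

```python
import threading
import time

def background():
    # Loops forever; as a daemon thread it does not keep the program alive
    while True:
        time.sleep(0.1)

def foreground():
    time.sleep(0.2)

d = threading.Thread(target=background, name="heartbeat", daemon=True)
# daemon=False is what a thread created by the main thread gets by default;
# it is written out here only to make the contrast explicit
w = threading.Thread(target=foreground, name="worker", daemon=False)
d.start()
w.start()

print(d.name, d.daemon)
print(w.name, w.daemon)

w.join()  # wait only for the non-daemon thread

# The daemon thread is still running at this point; it is stopped
# when the program exits, without being joined
print(d.is_alive())
```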
Time Module
• The time module is used to make one of the threads sleep.
• sleep(sec): suspends execution of the calling thread for the number of seconds
given in the argument.
• time(): returns the number of seconds elapsed since the
epoch.
• A library is a collection of related functionality, whereas a module provides
a single piece of functionality. A library typically contains multiple modules.
• A module is a Python file with a .py extension that contains a collection of
functions and global variables. It can be executed directly, and modules are
organized into packages.
• A library is a collection of related, reusable code that lets you
perform many tasks without writing the code yourself. We use it by
importing it into our program and calling its functions with the
dot (.) operator.
MULTITHREADING
import time
from threading import Thread

def sleeper(i):
    print("thread %d sleeps for 5 seconds" % i)
    time.sleep(5)
    print("thread %d woke up" % i)

for i in range(10):
    t = Thread(target=sleeper, args=(i,))
    t.start()
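A variation of the program above, sketched under assumptions, shows with time() that the sleeping threads overlap instead of running one after another. The sleep is shortened to 0.2 seconds (an arbitrary value) and the threads are kept in a list named `threads` so that the main thread can join them.

```python
import time
from threading import Thread

def sleeper(seconds):
    time.sleep(seconds)

start = time.time()

threads = [Thread(target=sleeper, args=(0.2,)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

elapsed = time.time() - start

# Run one after another, ten 0.2 s sleeps would need about 2 s.
# Run as threads, the total is close to a single sleep.
print("elapsed: %.2f s" % elapsed)
```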
Thread Synchronization
• Thread synchronization is a mechanism which
ensures that two or more concurrent threads do not
simultaneously execute a particular program segment
known as a critical section.
• Synchronization in a multithreaded program means serialised
access to a critical resource. It can also be achieved using join().
• What is a critical section?
• A critical section is a part of the program where a
shared resource is accessed.
Issue with Critical Section
• A race condition occurs when two or more threads can access
shared data and they try to change it at the same time. As a
result, the values of variables may be unpredictable and vary
depending on the timing of context switches between the threads.
• The need for synchronization of a critical resource arises from
the fact that some interleavings of reads and writes lead to
data anomalies: one thread overwrites a value prematurely, and
other threads then read the wrong value.
• Race conditions in a multithreaded program lead to
inconsistent reads and writes, resulting in data anomalies and
erroneous program behaviour.
Thread Synchronization
• Issues in thread synchronization
• Deadlock
• Race condition
• Deadlocks are the most feared issue that developers face when
writing concurrent/multithreaded applications in Python.
RACE CONDITION
A race condition occurs when two or more threads can access shared data and try to change its value
at the same time. Because of this, the values of variables may be unpredictable and vary depending on
the timing of context switches between the threads.
# program without synchronization
import threading

# global variable x
x = 0

def increment():
    global x
    x += 1

def thread_task():
    """
    task for thread
    calls increment function 100000 times.
    """
    for _ in range(100000):
        increment()

def main_task():
    global x
    # setting global variable x as 0
    x = 0
    # creating threads
    t1 = threading.Thread(target=thread_task)
    t2 = threading.Thread(target=thread_task)
    # start threads
    t1.start()
    t2.start()
    # wait until threads finish their job
    t1.join()
    t2.join()

if __name__ == "__main__":
    for i in range(10):
        main_task()
        print("Iteration {0}: x = {1}".format(i, x))
How is synchronization achieved in Python?
1. Using Locks
2. Using RLocks
3. Using Semaphores
4. Using Conditions
5. Using Events
Locking Mechanism
• The threading module provides a Lock class to deal with
race conditions. It provides the following methods:
1. acquire(blocking=True): acquires the lock. The call can be
blocking or non-blocking.
• When invoked with the blocking argument set to True (the
default), the calling thread blocks until the lock is unlocked,
then sets it to locked and returns True.
• When invoked with the blocking argument set to False, the calling
thread does not block. If the lock is unlocked, it is set to
locked and True is returned; otherwise False is returned immediately.
2. release(): releases the lock.
• When the lock is locked, it is reset to unlocked and the call returns. If any
other threads are blocked waiting for the lock to become
unlocked, exactly one of them is allowed to proceed.
• release() should only be called in the locked state.
If the lock is already unlocked, a RuntimeError is raised
(threading.ThreadError is an alias for it in Python 3).
• If the same thread calls acquire() again
without calling release() first, the thread deadlocks.
3. locked(): returns True if the Lock object
is currently acquired.
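The behaviour of acquire(), release() and locked() described above can be tried out with the following sketch. The names `counter` and `add_many` are hypothetical. The `with lock:` form is the context-manager equivalent of an acquire()/release() pair and is used here as a stylistic choice, not something prescribed by the text above.

```python
import threading

lock = threading.Lock()

# Non-blocking acquire on a free lock succeeds and returns True
first = lock.acquire(blocking=False)
# A second non-blocking attempt returns False at once instead of deadlocking
second = lock.acquire(blocking=False)
held = lock.locked()
lock.release()

# Releasing a lock that is already unlocked raises RuntimeError
release_error = None
try:
    lock.release()
except RuntimeError as exc:
    release_error = exc

counter = 0

def add_many(n):
    global counter
    for _ in range(n):
        # acquire on entry, release on exit (even if an exception occurs)
        with lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(first, second, held, counter)  # True False True 40000
```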
RACE CONDITION RESOLUTION USING LOCK MECHANISM
# Multithreading program with synchronisation using locks
import threading

# global variable x
x = 0

def increment():
    global x
    x += 1

def thread_task(lock):
    for _ in range(100000):
        lock.acquire()
        increment()
        lock.release()

def main_task():
    global x
    # setting global variable x as 0
    x = 0
    # creating a lock
    lock = threading.Lock()
    # creating threads
    t1 = threading.Thread(target=thread_task, args=(lock,))
    t2 = threading.Thread(target=thread_task, args=(lock,))
    # start threads
    t1.start()
    t2.start()
    # wait until threads finish their job
    t1.join()
    t2.join()

if __name__ == "__main__":
    for i in range(10):
        main_task()
        print("Iteration {0}: x = {1}".format(i, x))
RLock Object
• An RLock is a reentrant lock. It is a synchronization primitive that the thread holding it can acquire again and again.
• When locked, an RLock belongs to a certain thread; when unlocked, no thread owns it.
• The standard Lock doesn't know which thread is currently holding the lock. If the lock is held, any thread that
attempts to acquire it will block, even if that thread itself is already holding the lock. In such cases, an RLock
(re-entrant lock) is used.
• acquire() can be called multiple times by the same thread without blocking. Keep in mind that release() needs to be
called the same number of times to unlock the resource.
• It is also possible to nest acquire()/release() pairs. Only the outermost release() resets the lock to the 'unlocked' state
and allows another blocked thread to continue.
• One good use case for RLocks is recursion, when a parent call of a function would otherwise block its nested call.
Thus, the main use for RLocks is nested access to shared resources.
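RACE CONDITION RESOLUTION USING RLOCK MECHANISM
import threading

num = 0
lock = threading.Lock()

lock.acquire()
num += 1
lock.acquire()  # This will block: a plain Lock cannot be re-acquired by its holder.
num += 2
lock.release()
# Creating the lock with threading.RLock() instead lets the second acquire()
# succeed; release() must then be called twice to free the lock.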
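For comparison, the same sequence with a reentrant lock runs to completion. This is a minimal sketch; the helper `probe` and the list `acquired_elsewhere` are hypothetical names added only to check from a second thread that the lock has really been freed.

```python
import threading

num = 0
rlock = threading.RLock()

rlock.acquire()
num += 1
rlock.acquire()   # same thread: an RLock does not block here
num += 2
rlock.release()   # inner release: the lock is still held
rlock.release()   # outermost release: the lock is now free

acquired_elsewhere = []

def probe():
    # Try to take the lock from another thread without waiting
    got = rlock.acquire(blocking=False)
    acquired_elsewhere.append(got)
    if got:
        rlock.release()

t = threading.Thread(target=probe)
t.start()
t.join()

print(num, acquired_elsewhere)  # 3 [True]
```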
Semaphore Object
• A semaphore is based on an internal counter which is decremented each time acquire() is called and
incremented each time release() is called. If the counter is equal to 0, acquire() blocks. It is the Python
implementation of Dijkstra's semaphore concept: P() and V(). Using a semaphore makes sense when you want
to control access to a resource with limited capacity, such as a server.
• Semaphores are used when the resources to be shared among the threads are limited in number.
• A semaphore can be acquired 'n' times without waiting, where 'n' is the number of resources managed by
the semaphore. The resources can be sessions, threads in a thread pool and so on.
• In Python, the semaphore is implemented by the Semaphore class of the threading module. Each call to the
acquire() method of a Semaphore object decreases the counter, until it becomes zero. Once the
counter is zero, any further call to acquire() blocks until another thread calls release().
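The counter behaviour described above can be sketched as follows. `MAX_CONNECTIONS`, `use_resource`, `active` and `peak` are hypothetical names; the capacity of 2 and the sleep duration are arbitrary. The semaphore is used through `with`, which calls acquire() on entry and release() on exit.

```python
import threading
import time

MAX_CONNECTIONS = 2  # assumed capacity of the limited resource
slots = threading.Semaphore(MAX_CONNECTIONS)
state_lock = threading.Lock()  # protects the two counters below
active = 0
peak = 0

def use_resource():
    global active, peak
    with slots:  # blocks while MAX_CONNECTIONS threads are already inside
        with state_lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.05)  # pretend to use the resource
        with state_lock:
            active -= 1

threads = [threading.Thread(target=use_resource) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# peak records the largest number of threads that held a slot at once;
# the semaphore guarantees it never exceeds MAX_CONNECTIONS
print(peak)
```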
Event Object and Condition Object
• Event Objects
• This is a simple mechanism. One thread signals an event and the other thread(s) wait for it.
• Events are based on an internal flag which threads can set() or clear(). Other threads can wait() for
the internal flag to be set. The wait() method blocks until the flag becomes true.
• Condition Objects
• This is a synchronization mechanism where one thread waits for a specific condition and another thread
signals that this condition has occurred. Once the condition has occurred, the waiting thread re-acquires the
lock to get exclusive access to the shared resource.
• A Condition object is a more advanced version of the Event object. It too acts as a
communicator between threads and can be used to notify() other threads about a change in the state
of the program.
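Both mechanisms can be sketched in a few lines. The function names `waiter`, `consumer` and `producer` and the item "job-1" are hypothetical. `wait_for()` is used on the condition so that the consumer also works if the producer happens to run first.

```python
import threading

# Event: one thread sets a flag, waiting threads are released
ready = threading.Event()
log = []

def waiter():
    ready.wait()  # blocks until ready.set() is called
    log.append("event seen")

# Condition: a consumer waits until a shared list is non-empty
items = []
cond = threading.Condition()
consumed = []

def consumer():
    with cond:  # the condition's lock must be held to wait
        # wait_for releases the lock while waiting and re-acquires it
        # before returning, once the predicate is true
        cond.wait_for(lambda: len(items) > 0)
        consumed.append(items.pop())

def producer():
    with cond:
        items.append("job-1")
        cond.notify()  # wake one waiting thread, if any

threads = [threading.Thread(target=f) for f in (waiter, consumer, producer)]
for t in threads:
    t.start()
ready.set()
for t in threads:
    t.join()

print(log, consumed)  # ['event seen'] ['job-1']
```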