OS Research 1

Uploaded by hazemzomahashem
Name: حازم ابو هاشم محمد
Sec: 4
Id: 323240115
An Analysis of Python Threads: Parallelism vs. Concurrency in Single-Core and Multi-Core
CPU Environments
Introduction to Python Threads:
Threading is a common way to implement concurrency or parallelism in programming. Python
offers numerous constructs and classes to leverage threads for better performance and
responsiveness. However, it is also important to follow thread-safety best practices to avoid
critical issues like race conditions and deadlocks.

In this article, we will take a deep dive into threading in Python. Let’s discuss what threading
is and why you might want to use it. Then we will look at ways to create and manage
threads in Python. We will also explore some of the challenges of using threading and how
to avoid them.

What is threading?
Threading (or multi-threading) is an execution model that enables programmers to implement
concurrency or parallelism. A thread is a lightweight unit of a program that can run independently. All
threads share the same memory space and resources of the main program.
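
To make the shared-memory point concrete, here is a minimal sketch in which several threads write to one list owned by the main program. The names shared_results and record_square are illustrative only, and the example relies on list.append being safe to call from several threads in CPython.

```python
import threading

shared_results = []  # one list object, visible to every thread in the process

def record_square(n):
    # Each thread appends to the same list; no copy of the data is made.
    shared_results.append(n * n)

workers = [threading.Thread(target=record_square, args=(n,)) for n in range(5)]
for t in workers:
    t.start()
for t in workers:
    t.join()

# Threads may finish in any order, so sort before printing.
print(sorted(shared_results))  # [0, 1, 4, 9, 16]
```

The order in which the threads append is not guaranteed, which is why the result is sorted before it is printed.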

Concurrency vs. parallelism:


In a single-core environment, when you implement multi-threading, the processor switches
between different threads, allowing each to run for a brief time slice. This process is known
as concurrency.
It's important to note that concurrency doesn't guarantee true parallel execution. Threads
take turns using the single core, creating an illusion of parallelism to make the program feel
more dynamic.

However, the real power of multi-threading is realized in a multi-core environment, where
each thread can run on a separate processor core simultaneously. This simultaneous
execution is termed parallelism. It can yield significant performance gains, because threads
on different cores make progress at the same time instead of taking turns on one core. In
CPython, the Global Interpreter Lock discussed later in this article limits this kind of
parallelism for Python bytecode.

Why do we use threading?


There are several benefits of using threading in Python or programming in general:

• Improved performance: Threads allow you to overlap work that would otherwise wait. For
example, if you have to make API calls to two different servers, you can create and
run two threads at once, one for each API call.
• Responsiveness: Threading boosts program responsiveness by allowing it to
handle multiple requests simultaneously. For example, a web server may create a
new thread for each incoming request. This allows the web server to concurrently
respond to the requests of multiple users, enhancing overall experience.
• Simplified code: When done right, threading can simplify your code by allowing you
to break down large tasks into smaller, more manageable chunks. This adds to the
overall maintainability of a codebase.
• Increased scalability: Threading can also improve the scalability of your program by
allowing it to use multiple cores of one machine. For example, if you are migrating
from a single-core to a multi-core architecture, you can leverage thread parallelism to
scale up your application. Threads cannot span machines; scaling across machines
requires separate processes.
• Simplified communication: Threads share the same memory space, which makes
communication between them more straightforward than with multiple processes.
This simplifies the implementation of tasks that require sharing data or coordination
between different parts of a program.
Multithreading vs multiprocessing:
Multi-processing involves running multiple processes simultaneously. Unlike a thread, each
process gets its own dedicated memory space. Multi-processing is well-suited for intensive
CPU-bound tasks, and it can take full advantage of multi-core processors. Each process
may run on a separate CPU core, increasing overall performance.

Processes typically consume more system resources than threads due to their independent
memory space. This can also limit the number of processes you can run concurrently. Inter-
process communication (IPC) between processes is generally more complicated and
expensive than inter-thread communication.

Threading vs. asynchronous programming:


Asynchronous programming is a way of writing event-driven, non-blocking code. In an
asynchronous program, time-consuming tasks (like I/O tasks) are performed in the
background, without blocking the main thread. This task delegation is done through
callbacks, promises (called futures in Python), coroutines, or other asynchronous techniques.

Asynchronous programming is a great fit for I/O-bound scenarios, such as web scraping,
network requests, or database queries. It prevents blocking and maximizes the utilization of
a single thread.
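
As a rough illustration of this single-threaded style, the sketch below uses Python's asyncio module to wait on two simulated I/O operations at once. The coroutine name fetch and its arguments are hypothetical, and asyncio.sleep stands in for a real network or database call.

```python
import asyncio

async def fetch(name, delay):
    # asyncio.sleep yields control to the event loop instead of blocking the thread
    await asyncio.sleep(delay)
    return f"{name} done"

async def main():
    # Both coroutines wait at the same time on one thread;
    # gather returns results in the order the awaitables were passed in.
    return await asyncio.gather(fetch("a", 0.1), fetch("b", 0.05))

results = asyncio.run(main())
print(results)  # ['a done', 'b done']
```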

When to use threads vs. processes vs. asynchronous programming:
Use threads when:

• You want to boost responsiveness by handling I/O tasks efficiently.
• You want efficient communication between different tasks.
• You want more granular control over the execution and scheduling of tasks in a
single process.
• Resource efficiency is a priority. If you want to execute a large number of concurrent
tasks without overloading system resources, threads are a good choice.

Use processes when:

• You have multi-core CPUs and have to perform CPU-bound tasks.
• You require strong isolation between tasks.
• You want to scale to multiple machines, with each process running on a different
machine.
• Resource intensiveness is acceptable.

Use asynchronous programming when:

• You want to simplify the code by avoiding the need to worry about thread
synchronization.
• You want to improve the performance of the program by avoiding the overhead of too
many context switches.
• You are building an application that handles real-time updates to data, like chat or
streaming applications. Asynchronous apps excel at handling streams of data without
blocking.
• You are building a responsive web application on an event-driven JavaScript runtime or
library, like Node.js or React.

Threading in Python – A definitive guide:


Now that we have a good understanding of what threading is and when to use it, let's transition to
talking about implementing threading in Python.

Overview of the Threading module in Python:


The Python standard library offers a handy “threading” module to work with threads. The
module makes it easy to create and manage threads in Python programs. Let’s get started!

A note regarding the Global Interpreter Lock


It’s worth mentioning here that the CPython implementation uses a Global Interpreter Lock
(GIL) for thread synchronization. The GIL restricts the execution of Python bytecode to one
thread at a time, even on multi-core processors.

For applications that require maximum resource utilization on multi-core machines, the
official Python documentation recommends using the “multiprocessing” module. However,
it's important to note that I/O-bound tasks can still benefit greatly from threading. During I/O-
bound operations, like file I/O or database queries, the GIL is released, allowing multiple
threads to progress concurrently.
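
The following sketch illustrates why I/O-bound work still benefits from threads under the GIL. It is only an approximation: time.sleep stands in for a blocking I/O call, the function name fake_io is made up for this example, and exact timings vary by machine.

```python
import threading
import time

def fake_io(seconds):
    # time.sleep releases the GIL while waiting, much like a blocking I/O call
    time.sleep(seconds)

start = time.perf_counter()
threads = [threading.Thread(target=fake_io, args=(0.3,)) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# The three waits overlap, so the total is close to 0.3 s rather than 0.9 s.
print(f"3 waits of 0.3 s finished in about {elapsed:.2f} s")
```

A CPU-bound function in place of fake_io would not show the same overlap in CPython, because only one thread can execute Python bytecode at a time.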

Creating threads

You can create a new thread by calling the threading.Thread() constructor. The constructor
accepts different arguments, including the thread target function (target), the thread name
(name), and a list or tuple of arguments for the target (args). The target function contains
the code that the thread will execute when it starts.

For example, the following piece of code imports the threading module, defines a target
function, and then creates a new thread object using the threading.Thread() constructor.

import threading

def my_function():
    # Your thread's task goes here
    pass

my_thread = threading.Thread(target=my_function)

Starting threads

The above code created a thread object, but didn’t start it. To start the execution of a thread,
we use the start() method exposed by the Thread object. Invoking this function executes the
thread’s target function concurrently with the main program.

my_thread.start()

Calling join() on a thread

To wait for a thread to complete its execution, we can call the join() method on the Thread
object. This causes the calling thread (often the main program) to block until the thread
terminates.

my_thread.join()
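
Putting the three steps together, the sketch below creates a named thread, passes arguments to its target, starts it, and waits for it to finish. The function greet and the thread name "greeter" are illustrative choices, not part of the threading API.

```python
import threading

greetings = []

def greet(name, punctuation="!"):
    # current_thread().name reports the name given to the Thread constructor
    worker = threading.current_thread().name
    greetings.append(f"Hello, {name}{punctuation} (from {worker})")

t = threading.Thread(
    target=greet,
    name="greeter",
    args=("Ada",),                  # positional arguments for the target
    kwargs={"punctuation": "?"},    # keyword arguments for the target
)
t.start()   # begin running greet() in the new thread
t.join()    # block until greet() has returned

print(greetings[0])  # Hello, Ada? (from greeter)
```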

Daemon threads

Daemon threads are threads that run in the background and don't prevent the main program
from exiting. You can make a thread a daemon either:

• When you create it, by passing daemon=True to the constructor.

Or

• By setting the Thread object's daemon property to True before invoking the start()
method.

For example, the following code sets the daemon property to True, and then calls start().

my_thread.daemon = True
my_thread.start()
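
The constructor form can be sketched as follows. The heartbeat function is a hypothetical background task; the example also shows that the daemon flag can no longer be changed once the thread has started.

```python
import threading
import time

def heartbeat():
    # Endless background loop; as a daemon it will not keep the program alive
    while True:
        time.sleep(0.05)

background = threading.Thread(target=heartbeat, daemon=True)
background.start()
print(background.daemon, background.is_alive())  # True True

# Changing the flag after start() is rejected with a RuntimeError.
try:
    background.daemon = False
    changed_after_start = True
except RuntimeError:
    changed_after_start = False
print(changed_after_start)  # False
```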

Other useful functions

There are several other functions exposed by the threading module that a developer should
know:

• threading.active_count(): Returns the number of Thread objects that are currently alive.
• threading.excepthook(): Handles exceptions that a thread's run() method raises and does
not catch. It can be replaced to customize that handling.
• threading.current_thread(): Returns the Thread object for the calling thread.
• threading.get_native_id(): Returns the kernel-assigned native thread identifier for the
current thread.
• threading.main_thread(): Returns the main thread object, representing the initial
thread of the program.
• threading.stack_size(): Returns the stack size used when creating new threads. Optionally,
you can provide a "size" argument to set a new stack size.
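
A short sketch of a few of these helpers in use is given below. The dictionary info and the thread name "worker-1" are illustrative, and threading.get_native_id() is assumed to be available (Python 3.8 or later on common platforms).

```python
import threading

info = {}

def report():
    me = threading.current_thread()
    info["name"] = me.name
    info["is_main"] = me is threading.main_thread()
    info["native_id"] = threading.get_native_id()
    # Counts this worker plus the main thread (and any other live threads)
    info["alive_count"] = threading.active_count()

worker = threading.Thread(target=report, name="worker-1")
worker.start()
worker.join()

print(info["name"], info["is_main"])  # worker-1 False
```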

Synchronizing threads using locks, RLocks, semaphores, and condition variables:
Thread synchronization ensures that multiple threads can access shared data safely. Used
correctly, it prevents race conditions; used carelessly, it can itself introduce deadlocks.

Race conditions are errors that can occur when multiple threads access the same data at
the same time and at least one of them modifies it, so that the result depends on the timing
of the threads. Deadlocks are situations where two or more threads are waiting for each
other to release a resource. This can cause the threads to block indefinitely, halting the
program.

Locks
Locks are synchronization primitives that ensure that only one thread can execute a block of
code at a time. The threading module offers a Lock class that can be used for this purpose.
The Lock class has two main methods: acquire() and release().

At any time, a lock object can be in one of two possible states: “locked” or “unlocked”.

• When you call acquire() on a locked Lock object, it blocks the current thread until
another thread calls release() on the same lock.
• When you call acquire() on an unlocked Lock object, the state of the Lock is
immediately changed to “locked”.
• When you call release() on a locked Lock object, the object’s state is immediately
changed to “unlocked”. Calling release() on an already unlocked object raises a
RuntimeError.

The following code gives a simplified example of how to create and use a lock.

import threading

# Create a lock
lock = threading.Lock()

# Acquire the lock
lock.acquire()

# Critical section
# ...

# Release the lock when done
lock.release()
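
To show a lock protecting real shared state, here is a minimal sketch of a counter incremented by several threads. The names state, state_lock, and add_many are hypothetical; the try/finally guarantees the lock is released even if the critical section fails.

```python
import threading

state = {"counter": 0}
state_lock = threading.Lock()

def add_many(times):
    for _ in range(times):
        state_lock.acquire()
        try:
            # Critical section: read-modify-write on shared data
            state["counter"] += 1
        finally:
            state_lock.release()

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(state["counter"])  # 40000
```

Because every increment happens while the lock is held, no update can be lost, and the final value is always 4 x 10,000.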

RLocks (Reentrant locks):

An RLock, or Reentrant Lock, is an extension of the basic lock that can be acquired multiple
times by the same thread. It's especially useful in recursion scenarios, or when a function
calls another function that also needs the lock already held by the calling function.

The threading module provides the RLock class for this purpose. Consider the following
example where the same thread acquires and releases the RLock multiple times:

import threading

class SharedData:
    def __init__(self):
        self.counter1 = 1
        self.counter2 = 2
        self.lock = threading.RLock()

    def incrementCounter1(self):
        self.lock.acquire()  # acquire again (re-entrant)
        try:
            self.counter1 = self.counter1 + 1
        finally:
            self.lock.release()

    def updateCounter2(self):
        self.lock.acquire()  # acquire again (re-entrant)
        try:
            self.counter2 = self.counter2 + self.counter1
        finally:
            self.lock.release()

    def updateCounters(self):
        self.lock.acquire()  # first acquire
        try:
            self.incrementCounter1()
            self.updateCounter2()
        finally:
            self.lock.release()  # outermost release: the lock is now free
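
The difference from a plain Lock can be checked directly. The sketch below, with illustrative names walk and visited, re-acquires an RLock on every recursive call, then uses non-blocking acquire calls to contrast the two lock types without risking a deadlock.

```python
import threading

rlock = threading.RLock()
visited = []

def walk(depth):
    rlock.acquire()  # re-acquired by the same thread on every recursive call
    try:
        visited.append(depth)
        if depth > 0:
            walk(depth - 1)
    finally:
        rlock.release()

t = threading.Thread(target=walk, args=(3,))
t.start()
t.join()
print(visited)  # [3, 2, 1, 0]

# A plain Lock cannot be taken twice by the same thread...
plain = threading.Lock()
plain.acquire()
plain_second = plain.acquire(blocking=False)  # False: already locked
plain.release()

# ...whereas an RLock can, as long as each acquire is matched by a release.
r_first = rlock.acquire(blocking=False)   # True
r_second = rlock.acquire(blocking=False)  # True
rlock.release()
rlock.release()
print(plain_second, r_first, r_second)  # False True True
```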


Semaphores

Semaphores are objects that maintain counters for controlling access to a resource. They
allow a specific number of threads to access a resource concurrently. Each acquire() call
decrements the counter, and each release() call increments it. If the counter is 0, the
next acquire() call blocks until another thread calls release().

The threading module includes the Semaphore class for this purpose:

import threading

# Create a semaphore with a maximum of 3 allowed threads
semaphore = threading.Semaphore(3)

# Acquire the semaphore
semaphore.acquire()

# Critical section that the semaphore protects
# ...

# Release the semaphore when done
semaphore.release()
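
As a rough check that the counter really caps concurrency, the sketch below lets eight threads compete for three slots and records the highest number seen inside the protected region. All names (slots, stats, use_resource) are hypothetical, and the semaphore is used through the “with” statement, which calls acquire() and release() on entry and exit.

```python
import threading
import time

slots = threading.Semaphore(3)      # at most 3 threads inside at once
stats_lock = threading.Lock()       # protects the bookkeeping below
stats = {"active": 0, "peak": 0}

def use_resource():
    with slots:
        with stats_lock:
            stats["active"] += 1
            stats["peak"] = max(stats["peak"], stats["active"])
        time.sleep(0.05)            # simulate work on the limited resource
        with stats_lock:
            stats["active"] -= 1

threads = [threading.Thread(target=use_resource) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(stats["peak"] <= 3)  # True
```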

Condition variables

Condition variables are synchronization primitives that allow threads to wait for specific
conditions to become true before proceeding. A condition variable is always linked to a lock.
It is typically used to coordinate the execution of different threads in response to some
shared state.
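The Condition class in the threading module allows us to implement condition variables.
Calling the wait() method of a condition variable object releases the linked lock and blocks
until another thread calls notify() or notify_all(); the lock is reacquired before wait()
returns. The wait_for() method does the same, but first checks a predicate and returns
immediately if it is already true.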

Consider this example where a job processing thread waits for a job producing thread to
create a job before starting its processing. The line comments provide explanations for the
different lines of code.

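import threading
from collections import deque

condition = threading.Condition()
jobs = deque()  # illustrative shared job store (not part of the threading module)

def job_available():
    return bool(jobs)

def consume_job():
    with condition:
        condition.wait_for(job_available)  # wait for the producer thread to notify
        return jobs.popleft()  # fetch the new job

def produce_job():
    with condition:
        jobs.append("new job")  # create a new job
        condition.notify()  # notify the waiting thread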
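
For a version that can be run end to end, here is a sketch in which one producer thread hands three jobs to one consumer thread through a condition variable. The names pending, processed, and TOTAL_JOBS are assumptions made for the example; the deque is only a simple stand-in for a real job source.

```python
import threading
from collections import deque

condition = threading.Condition()
pending = deque()     # jobs waiting to be processed (guarded by the condition's lock)
processed = []
TOTAL_JOBS = 3

def consumer():
    for _ in range(TOTAL_JOBS):
        with condition:
            # Returns at once if a job is already queued, otherwise waits for notify()
            condition.wait_for(lambda: len(pending) > 0)
            job = pending.popleft()
        processed.append(job.upper())  # do the work outside the lock

def producer():
    for n in range(1, TOTAL_JOBS + 1):
        with condition:
            pending.append(f"job-{n}")
            condition.notify()  # wake the waiting consumer, if any

consumer_thread = threading.Thread(target=consumer)
producer_thread = threading.Thread(target=producer)
consumer_thread.start()
producer_thread.start()
producer_thread.join()
consumer_thread.join()

print(processed)  # ['JOB-1', 'JOB-2', 'JOB-3']
```

Using wait_for() with a predicate, rather than a bare wait(), means a notification sent before the consumer starts waiting is not lost.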

Writing synchronization primitives using the “with” statement

The lock-like primitives that the threading module provides (Lock, RLock, Semaphore,
BoundedSemaphore, and Condition) can be used with the “with” statement. “with” plays a role
similar to “Resource Acquisition Is Initialization” (RAII): the primitive is acquired when
the block is entered and released automatically when the block exits, even if an exception
is raised.
By using the “with” statement, you avoid forgetting a release() call, a common cause of
deadlocks, and enhance the readability and maintainability of your code.

with my_lock:
    # important code here
    pass

is the same as:

my_lock.acquire()
try:
    # important code here
    pass
finally:
    my_lock.release()
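
The automatic release can be verified with a small sketch in which the critical section raises an exception. The function risky_update is hypothetical; Lock.locked() is used only to inspect the lock's state afterwards.

```python
import threading

my_lock = threading.Lock()

def risky_update():
    with my_lock:
        # The exception propagates, but "with" still releases the lock on the way out
        raise ValueError("something went wrong inside the critical section")

try:
    risky_update()
except ValueError:
    pass

print(my_lock.locked())  # False
```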

Creating thread pools:


The “concurrent.futures” module in Python offers a “ThreadPoolExecutor” class, which
allows developers to create and manage thread pools for handling asynchronous tasks.
Thread pools are a great way to optimize resource utilization by reusing existing threads,
instead of constantly creating new ones and destroying them.

The following code creates a thread pool and uses it to perform some asynchronous tasks.
The line comments provide explanations for the different lines of code.

import concurrent.futures

# Function to simulate a time-consuming task
def perform_task(task_id):
    print(f"Task {task_id} started.")
    # Simulate some work
    result = task_id * 2
    print(f"Task {task_id} completed with result: {result}")
    return result

# Create a ThreadPoolExecutor instance
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    # List of tasks to execute
    tasks = [11, 12, 13, 14, 15, 16, 17]

    # Submit tasks to the executor; each call returns a Future
    futures = [executor.submit(perform_task, task_id) for task_id in tasks]

    # Wait for the list of tasks to finish
    concurrent.futures.wait(futures)

# Get the results of the completed tasks
for future in futures:
    result = future.result()
    print(f"Result: {result}")
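
Two other common ways of collecting results from a pool are sketched below: executor.map(), which returns results in input order, and concurrent.futures.as_completed(), which yields futures as they finish. The function word_length and the sample words are made up for illustration.

```python
import concurrent.futures

def word_length(word):
    return len(word)

words = ["thread", "pool", "executor"]

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    # map() preserves the order of the inputs
    lengths = list(executor.map(word_length, words))

    # as_completed() yields each future as soon as it is done, in any order
    futures = {executor.submit(word_length, w): w for w in words}
    by_word = {futures[f]: f.result() for f in concurrent.futures.as_completed(futures)}

print(lengths)  # [6, 4, 8]
```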

Using synchronized queues for safer thread communication:


Using synchronized queues is a common way to implement safe communication between
threads in Python. The Queue class from the queue module allows you to implement a
synchronized queue that can have multiple producers and multiple consumers.

The queue module supports three types of queues:

• First in, first out (FIFO): The items are removed in the order they were added.
• Last in, first out (LIFO): This queue functions like a stack, where the most recently
added items are the first to be removed.
• Priority: Items are removed in sorted order, so the entry with the lowest value (for
example, the smallest priority number) is removed first.

The following code creates a LIFO queue and defines a producer thread that inserts some
items into the queue. It also initializes a consumer thread that processes values from the
queue. The line comments provide explanations for the different lines of code.

import threading
import queue

# Create a synchronized LIFO queue
lifo_queue = queue.LifoQueue()

# Function to simulate a producer adding items to the queue
def producer():
    for i in range(1, 6):
        lifo_queue.put(i)
        print(f"Produced: {i}")

# Function to simulate a consumer removing items from the queue
def consumer():
    while True:
        item = lifo_queue.get()  # blocks until an item is available
        if item is None:  # sentinel value: time to exit
            lifo_queue.task_done()
            break
        print(f"Consumed: {item}")
        lifo_queue.task_done()

# Create producer and consumer threads
producer_thread = threading.Thread(target=producer)
consumer_thread = threading.Thread(target=consumer)

# Start the threads
producer_thread.start()
consumer_thread.start()

# Wait for the producer to end
producer_thread.join()
# Wait until every queued item has been processed
lifo_queue.join()
# Signal the consumer to exit
lifo_queue.put(None)
consumer_thread.join()

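
To compare the three queue types side by side, the sketch below puts the same values into each and drains them from a single thread. The helper drain is hypothetical; checking empty() is only reliable here because no other thread touches the queues.

```python
import queue

def drain(q):
    # Single-threaded helper: remove and return every item currently in the queue
    items = []
    while not q.empty():
        items.append(q.get())
    return items

fifo = queue.Queue()
lifo = queue.LifoQueue()
prio = queue.PriorityQueue()

for value in (3, 1, 2):
    fifo.put(value)
    lifo.put(value)
    prio.put(value)

fifo_order = drain(fifo)
lifo_order = drain(lifo)
prio_order = drain(prio)

print(fifo_order)  # [3, 1, 2]  insertion order
print(lifo_order)  # [2, 1, 3]  most recent first
print(prio_order)  # [1, 2, 3]  lowest value first
```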
Worker threads vs. per-request threads:
Worker threads and per-request threads are two common threading models used in Python
and other programming languages.

In a worker thread model, a fixed pool of threads is created at the start of the
application. These threads are designed to be long-lived, with the main thread consistently
distributing incoming workloads across them.

Conversely, in a per-request model, the main thread spawns a new thread for each
incoming request. These threads are short-lived: they terminate after processing the
request.

Depending on your resource configurations and application requirements, you can use
either worker threads or per-request threads.

Use worker threads when:

• You want to have a smaller memory footprint by reducing the overhead of thread creation
and destruction.
• Your users can tolerate slight delays in responses, especially during peak hours, as worker
threads may be busy processing other tasks.
• You have long-running tasks that shouldn’t block the main thread.

Use per-request threads when:

• You have abundant system resources, and can handle a large memory footprint, even during
peak usage.
• Your application performs short-lived tasks in response to user requests, such as serving
web requests.
• Your users require near-instantaneous responses, and your infrastructure can support the
rapid creation and management of short-lived threads.
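
The two models can be contrasted in a few lines. This is only a sketch: the handler handle, the request IDs, and the thread names are invented for the example, and a ThreadPoolExecutor is assumed as the worker pool.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def handle(request_id):
    # Report which thread served the request
    return request_id, threading.current_thread().name

requests = list(range(6))

# Worker model: two long-lived threads share all six requests.
with ThreadPoolExecutor(max_workers=2, thread_name_prefix="worker") as pool:
    pooled = list(pool.map(handle, requests))

# Per-request model: one short-lived thread per request.
per_request = []

def handle_and_store(request_id):
    per_request.append(handle(request_id))

threads = [
    threading.Thread(target=handle_and_store, args=(r,), name=f"request-{r}")
    for r in requests
]
for t in threads:
    t.start()
for t in threads:
    t.join()

worker_names = {name for _, name in pooled}
request_names = {name for _, name in per_request}
print(len(worker_names) <= 2, len(request_names))  # True 6
```

The pool never uses more than two threads regardless of load, while the per-request version creates as many threads as there are requests.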
Conclusion:
Multithreading is an important concept for developers to grasp, regardless of the language
they are using. Python offers built-in classes and constructs that can be used to efficiently
and safely manage a large number of threads.

This article has introduced you to some of the most important classes and constructs. You
can use this knowledge to build scalable, multi-threaded applications that are free of race
conditions and deadlocks.
