Understanding Concurrency
- Anshul Sharma
Agenda
● About the Author
● What is concurrency?
● Threading vs AsyncIO
● When is concurrency useful?
● How to speed up an I/O bound program?
● Synchronous version
● threading version
● asyncio version
● multiprocessing version
● How to speed up a CPU bound program?
● Synchronous version
● threading version
● asyncio version
● multiprocessing version
● Conclusion
What will you learn today?
● What is concurrency?
● Which features of Python support concurrency?
● Which feature should you use to speed up execution, and when?
What will you not learn today?
● The detailed programming APIs that Python provides for concurrency.
About the Author
● Product Engineer at Udaan
● Career Coach at Scaler Academy
● Previously worked at Springboard, Amazon & Jabong
● Technology stack familiar with:
● Python/Django
● Java
● Currently learning Kotlin and cooperative multitasking
● Social media handles:
● LinkedIn (https://www.linkedin.com/in/raunify/)
● GitHub (https://github.com/raun)
What is concurrency?
● The dictionary definition of concurrency is simultaneous occurrence.
● In Python, the things that occur simultaneously go by different names (thread, task, process).
● Each one can be stopped at certain points, and the CPU or brain that is processing them can switch to a
different one.
● In essence, only multiprocessing actually does work simultaneously, because it uses multiple processors
available on the machine.
● Threading and AsyncIO both run on a single processor: in standard CPython, the global interpreter lock
(GIL) lets only one thread execute Python code at a time. They just cleverly find ways to take turns to
speed up the overall process.
So how do Threading and AsyncIO differ?
Threading vs AsyncIO
● Threading
○ In threading, the operating system actually knows about each thread and can interrupt it at any
time to start running a different thread. This is called Pre-emptive multitasking.
○ Pre-emptive multitasking is handy in the sense that code in the thread doesn’t need to do
anything to make the switch. It can also be difficult because of that “at any time” phrase: a switch
can happen in the middle of any statement, even a trivial one such as x = x + 1.
● AsyncIO
○ Tasks must cooperate by announcing when they are ready to be switched out. This is called
Cooperative multitasking.
○ That means that the code in the AsyncIO task has to change slightly to announce that it is ready
to be switched out.
○ The benefit of doing this extra work up front is that you always know where your task will be
swapped out.
○ This can simplify parts of your software design.
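To make the "you always know where your task will be swapped out" point concrete, here is a minimal sketch of cooperative switching. The names worker, main and log are illustrative assumptions, not part of the talk. Control can change hands only at the await line, so each task's "start" and "end" entries always stay together.

```python
import asyncio


async def worker(name: str, log: list) -> None:
    for step in range(2):
        # There is no await between these two appends, so no other task
        # can run in between them.
        log.append(f"{name} start {step}")
        log.append(f"{name} end {step}")
        # The only place where this task announces it can be switched out.
        await asyncio.sleep(0)


async def main() -> list:
    log: list = []
    # Run two workers concurrently on the same event loop.
    await asyncio.gather(worker("A", log), worker("B", log))
    return log


if __name__ == "__main__":
    for line in asyncio.run(main()):
        print(line)
```

With threads, the same two appends could in principle be separated by a switch to another thread, which is why shared data needs extra protection there.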
When is concurrency useful?
● Concurrency can make a big difference for two types of problems.
○ CPU-bound
■ There are classes of programs that do significant computation without talking to the
network or accessing a file.
■ These are the CPU-bound programs, because the resource limiting the speed of your
program is the CPU, not the network or the file system.
○ I/O-bound
■ I/O-bound problems cause your program to slow down because it frequently must wait for
input/output (I/O) from some external resource.
■ They arise frequently when your program is working with things that are much slower than
your CPU.
How to speed up an I/O bound program?
Sample Problem:
Downloading content over the network. For our example, we will be downloading web pages from a few sites,
but it really could be any network traffic.
Synchronous Version
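A minimal sketch of a synchronous version, under stated assumptions: time.sleep stands in for the network wait so the example runs without network access, and the URLs, the 0.05-second delay and the helper download_site are made up for illustration. download_all_sites is the function name used in the speaker notes.

```python
import time

# Hypothetical site list; the real demo downloads actual web pages.
SITES = ["https://example.com/a", "https://example.com/b"] * 5
SIMULATED_LATENCY = 0.05  # seconds; stands in for the network wait


def download_site(url: str) -> int:
    """Pretend to download url and return a fake byte count."""
    time.sleep(SIMULATED_LATENCY)  # blocks: nothing else runs meanwhile
    return len(url)


def download_all_sites(sites: list) -> list:
    # One site at a time; each call waits for the previous one to finish.
    return [download_site(url) for url in sites]


if __name__ == "__main__":
    start = time.perf_counter()
    sizes = download_all_sites(SITES)
    elapsed = time.perf_counter() - start
    print(f"Downloaded {len(sizes)} sites in {elapsed:.2f} s")
```

The total time is roughly the number of sites multiplied by the per-site wait, because the waits never overlap.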
threading Version
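A minimal sketch of a threading version, again with a simulated download. The speaker notes mention ThreadPoolExecutor, threading.local() and one requests.Session() per thread; here a hypothetical FakeSession class stands in for requests.Session so the example needs no network, and get_session, SITES and the delay are assumptions.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

SITES = ["https://example.com/a", "https://example.com/b"] * 5
SIMULATED_LATENCY = 0.05  # seconds; stands in for the network wait

# Looks like a global, but every thread sees its own attributes on it.
thread_local = threading.local()


class FakeSession:
    """Hypothetical stand-in for requests.Session."""

    def get(self, url: str) -> int:
        time.sleep(SIMULATED_LATENCY)  # other threads can run during the wait
        return len(url)


def get_session() -> FakeSession:
    # Create one session per thread, on first use in that thread.
    if not hasattr(thread_local, "session"):
        thread_local.session = FakeSession()
    return thread_local.session


def download_site(url: str) -> int:
    return get_session().get(url)


def download_all_sites(sites: list, max_workers: int = 5) -> list:
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # map() runs download_site on every site using the pool of threads.
        return list(executor.map(download_site, sites))


if __name__ == "__main__":
    start = time.perf_counter()
    sizes = download_all_sites(SITES)
    elapsed = time.perf_counter() - start
    print(f"Downloaded {len(sizes)} sites in {elapsed:.2f} s")
```

The pool size of 5 is arbitrary; the right value depends on the workload and the machine.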
asyncio basics
● A single Python object, called the event loop, controls how and when each task gets run.
● The event loop is aware of each task and knows what state it’s in.
● Assume a simplified event loop in which every task is in one of two states: ready to work, or waiting for
some external thing to finish. (The real Python event loop can have more states than these two.)
● Your simplified event loop maintains two lists of tasks, one for each of these states.
● Your simplified event loop selects one of the ready tasks and starts it running again. That task is in
complete control until it cooperatively hands control back to the event loop.
● When the running task gives control back to the event loop, the event loop places that task into
either the ready or waiting list and then goes through each of the tasks in the waiting list to see if it
has become ready by an I/O operation completing.
● An important point of asyncio is that the tasks never give up control without intentionally doing so.
They never get interrupted in the middle of an operation. This allows us to share resources a bit
more easily in asyncio than in threading.
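The simplified event loop described above can be sketched in a few lines. This is a toy model, not how asyncio is implemented: tasks are plain generators, a yield hands control back to the loop, and the yielded value is the time at which the simulated I/O completes. The names task and run_event_loop are hypothetical.

```python
import time
from collections import deque


def task(name, delay, log):
    """A toy task: a generator that yields whenever it has to wait."""
    log.append(f"{name}: started")
    # Hand control back to the loop; ask to be resumed after `delay` seconds.
    yield time.monotonic() + delay
    log.append(f"{name}: finished")


def run_event_loop(tasks):
    ready = deque(tasks)  # tasks that can do work right now
    waiting = []          # (wake_time, task) pairs waiting on simulated I/O
    while ready or waiting:
        # Go through the waiting list and move finished waits to ready,
        # earliest wake-up time first.
        now = time.monotonic()
        waiting.sort(key=lambda pair: pair[0])
        while waiting and waiting[0][0] <= now:
            ready.append(waiting.pop(0)[1])
        if not ready:
            time.sleep(0.001)  # nothing is ready; avoid a busy loop
            continue
        current = ready.popleft()
        try:
            # The task is in complete control until its next yield.
            wake_time = next(current)
        except StopIteration:
            continue  # the task ran to completion
        waiting.append((wake_time, current))


if __name__ == "__main__":
    log = []
    run_event_loop([task("A", 0.05, log), task("B", 0.01, log)])
    print(log)
```

Both tasks start before either finishes, and the task with the shorter wait finishes first, even though everything runs on one thread.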
async & await keywords
● async :
○ It is a flag telling Python that the function about to be defined uses await. (This is not always
strictly true.)
○ The async with statement creates a context manager from an object you would normally
await. The general idea is to flag this context manager as something that can get swapped out.
● await : When your code awaits a function call, it’s a signal that the call is likely to be something
that takes a while and that the task should give up control.
As I’m sure you can imagine, there’s some complexity in managing the interaction between the event loop
and the tasks. This will become clearer with the next example.
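A small sketch of the three pieces of syntax together. The coroutine slow_double and the use of asyncio.Lock are illustrative assumptions, chosen only because a lock is a simple object that supports async with.

```python
import asyncio


async def slow_double(x: int) -> int:
    # await marks the point where this task gives up control.
    await asyncio.sleep(0.01)
    return 2 * x


async def main() -> int:
    lock = asyncio.Lock()
    # async with: entering the block may itself have to wait (here, for
    # the lock), so the task can be swapped out at this point too.
    async with lock:
        return await slow_double(21)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Calling slow_double(21) without await would only create a coroutine object; the work happens when the coroutine is awaited or scheduled as a task.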
asyncio Version
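A minimal sketch of an asyncio version, with asyncio.sleep standing in for the network wait. In a real program an async HTTP client library would take its place, since a blocking call would stall the event loop. SITES, the delay and download_site are assumptions for illustration.

```python
import asyncio
import time

SITES = ["https://example.com/a", "https://example.com/b"] * 5
SIMULATED_LATENCY = 0.05  # seconds; stands in for the network wait


async def download_site(url: str) -> int:
    # While this task waits, the event loop runs the other tasks.
    await asyncio.sleep(SIMULATED_LATENCY)
    return len(url)


async def download_all_sites(sites: list) -> list:
    # Create one task per site, then wait for all of them to finish.
    tasks = [asyncio.create_task(download_site(url)) for url in sites]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    start = time.perf_counter()
    sizes = asyncio.run(download_all_sites(SITES))
    elapsed = time.perf_counter() - start
    print(f"Downloaded {len(sizes)} sites in {elapsed:.2f} s")
```

There is no pool size to tune here: every download gets its own task, and all the waits overlap on a single thread.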
multiprocessing version
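A minimal sketch of a multiprocessing version, using the same simulated download. The site list, the delay and download_site are assumptions. The `if __name__ == "__main__":` guard matters here, because worker processes may re-import the module.

```python
import multiprocessing
import time

SITES = ["https://example.com/a", "https://example.com/b"] * 5
SIMULATED_LATENCY = 0.05  # seconds; stands in for the network wait


def download_site(url: str) -> int:
    """Pretend to download url and return a fake byte count."""
    time.sleep(SIMULATED_LATENCY)
    return len(url)


def download_all_sites(sites: list) -> list:
    # Pool() starts one worker process per CPU by default and farms the
    # calls out to those processes.
    with multiprocessing.Pool() as pool:
        return pool.map(download_site, sites)


if __name__ == "__main__":
    start = time.perf_counter()
    sizes = download_all_sites(SITES)
    elapsed = time.perf_counter() - start
    print(f"Downloaded {len(sizes)} sites in {elapsed:.2f} s")
```

Starting processes costs more than starting threads or tasks, so for a small I/O-bound job like this one the startup overhead can dominate.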
How to speed up a CPU-bound program?
Sample Problem:
We will use a somewhat silly function to create something that takes a long time to run on the CPU. This
function computes the sum of the squares of each number from 0 to the passed-in value(n)
Synchronous Version
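A minimal sketch of the synchronous CPU-bound version. cpu_bound is the name used in the speaker notes; find_sums, the input sizes and treating the upper bound as inclusive are assumptions.

```python
import time


def cpu_bound(number: int) -> int:
    # Sum of squares 0^2 + 1^2 + ... + number^2.
    # (Including `number` itself is an assumption about the original.)
    return sum(i * i for i in range(number + 1))


def find_sums(numbers: list) -> list:
    # Single thread, single process: one computation after another.
    return [cpu_bound(n) for n in numbers]


if __name__ == "__main__":
    # 20 calls with a different number each time; the sizes are kept small
    # here so the sketch finishes quickly.
    numbers = [200_000 + x for x in range(20)]
    start = time.perf_counter()
    find_sums(numbers)
    print(f"Duration {time.perf_counter() - start:.2f} s")
```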
threading & asyncio versions
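A minimal sketch of a threading version of the same computation, assuming the same cpu_bound function. The helper names and the pool size are hypothetical. It returns the same results as the synchronous loop. On a standard CPython build, where the threads take turns on one processor, it should not be expected to run faster.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def cpu_bound(number: int) -> int:
    # Sum of squares from 0 to number (inclusive bound is an assumption).
    return sum(i * i for i in range(number + 1))


def find_sums_sync(numbers: list) -> list:
    return [cpu_bound(n) for n in numbers]


def find_sums_threaded(numbers: list, max_workers: int = 5) -> list:
    # There is no waiting to overlap here: every thread needs the CPU.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(cpu_bound, numbers))


if __name__ == "__main__":
    numbers = [200_000 + x for x in range(20)]
    for label, func in [("sync", find_sums_sync), ("threads", find_sums_threaded)]:
        start = time.perf_counter()
        func(numbers)
        print(f"{label}: {time.perf_counter() - start:.2f} s")
```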
multiprocessing Version
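A minimal sketch of a multiprocessing version of the same computation. cpu_bound comes from the speaker notes; find_sums and the input sizes are assumptions. Each worker process has its own interpreter, so the calls can run on several CPUs at once.

```python
import multiprocessing
import time


def cpu_bound(number: int) -> int:
    # Sum of squares from 0 to number (inclusive bound is an assumption).
    return sum(i * i for i in range(number + 1))


def find_sums(numbers: list) -> list:
    # Pool() starts one worker process per CPU by default; map() splits
    # the list of numbers across those workers.
    with multiprocessing.Pool() as pool:
        return pool.map(cpu_bound, numbers)


if __name__ == "__main__":
    numbers = [200_000 + x for x in range(20)]
    start = time.perf_counter()
    find_sums(numbers)
    print(f"Duration {time.perf_counter() - start:.2f} s")
```

This works well here because each call is independent; problems that need the processes to exchange data are harder to split up.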
Conclusion
We have covered a lot of ground here. It’s time to review those ideas and discuss the decision points.
● The first step of this process is deciding if you should use a concurrency module. While the examples here make
each of the libraries look pretty simple, concurrency always comes with extra complexity and can often result in bugs
that are difficult to find.
● Hold out on adding concurrency until you have a known performance issue and then determine which type of
concurrency you need.
● Once you’ve decided that you should optimize your program, figure out whether it is CPU-bound or I/O-bound.
● CPU-bound problems only really gain from using multiprocessing. threading and asyncio do not help this
type of problem at all.
● For I/O-bound problems, there’s a general rule of thumb in the Python community: “Use asyncio when you can,
threading when you must.”
● asyncio can provide the best speed up for this type of program, but sometimes you will require critical libraries that
have not been ported to take advantage of asyncio.
Q&A
References
● Demo code: https://github.com/raun/concurrency-demo
● Presentation link:

Editor's Notes

  • #10: Why the synchronous version rocks:
    ● It is easy to write and easy to debug, because there is only one train of thought.
    Problems with the synchronous version:
    ● It is slow.
    ● Being slow is not necessarily a big issue if the program runs rarely, or if its completion time is within the SLA.
    ● But what if your program runs frequently (for example, daily tasks), or takes hours to run? Then we have to think about something better.
  • #11: download_all_sites() changed from calling the function once per site to a more complex structure.
    ● In this version you create a ThreadPoolExecutor, which seems like a complicated thing. Let’s break that down: ThreadPoolExecutor = Thread + Pool + Executor.
    ● This object creates a pool of threads, each of which can run concurrently. The Executor is the part that controls how and when each thread in the pool runs.
    ● The map() method runs the passed-in function on each of the sites in the list. The great part is that it automatically runs them concurrently using the pool of threads it manages.
    ● The other interesting change is that each thread needs to create its own requests.Session() object. Because the operating system controls when your task gets interrupted and another task starts, any data shared between the threads needs to be protected, or thread-safe.
    ● threading.local() creates an object that looks like a global but is specific to each individual thread.
    How to decide the number of threads in the pool:
    ● This requires a little trial and error. You might expect one thread per download to be the fastest but, at least on my system, it was not. Why is that?
    ● Two competing forces are at work: the speedup due to concurrency, and the overhead of creating and destroying threads.
    ● On my machine, the optimal value was 10 threads.
    ● Another interesting thing: with 10 threads, the run time fell to slightly more than one tenth of the original time. Why is that?
    Why the threading version rocks:
    ● It is fast, or at least faster than the synchronous version, because it lets your program overlap the waiting times and get the final result sooner.
    Problems with the threading version:
    ● It takes a little more code, and you have to think about how to keep objects and code thread-safe.
    ● Threads can interact in ways that are subtle and hard to detect. These interactions can cause race conditions, which can result in random, intermittent bugs that are difficult to debug.
  • #14: Why the asyncio version rocks:
    ● It is faster than the threading version.
    ● It forces you to think about when a given task will get swapped out, which can help you create a better, faster design.
    ● It is much more scalable. With the threading version you have to fine-tune the number of threads based on the problem and the hardware you are working with.
    Problems with the asyncio version:
    ● You need special async versions of libraries to gain the full advantage of asyncio. Had you just used requests for downloading the sites, it would have been much slower, because requests is not designed to notify the event loop that it is blocked.
    ● All of the advantages of cooperative multitasking get thrown away if one of the tasks doesn’t cooperate. A minor mistake in code can cause a task to run off and hold the processor for a long time, starving other tasks that need running. There is no way for the event loop to break in if a task does not hand control back to it.
  • #15: The code is much shorter than the asyncio version.
    ● At a high level, multiprocessing creates a new instance of the Python interpreter to run on each CPU and then farms out part of your program to run on it.
    Why the multiprocessing version rocks:
    ● It is relatively easy to set up and requires little extra code.
    ● It takes full advantage of the CPU power on your computer.
    Problems with the multiprocessing version:
    ● It is slower than the asyncio and threading versions.
    ● It requires some setup beyond the synchronous version.
    ● You have to spend some time thinking about which variables will be accessed in each process.
  • #17: This code calls cpu_bound() 20 times with a different large number each time. It does all of this on a single thread in a single process on a single CPU. Unlike the I/O-bound examples, the CPU-bound examples are usually fairly consistent in their run times. Why is that? Clearly we can do better than this. This is all running on a single CPU with no concurrency. How much do you think rewriting this code using threading or asyncio will speed this up?
  • #18: In your I/O-bound example, much of the overall time was spent waiting for slow operations to finish. threading and asyncio sped this up by allowing you to overlap the times you were waiting instead of doing them sequentially. On a CPU-bound problem, however, there is no waiting. The CPU is cranking away as fast as it can to finish the problem. One CPU is doing all of the work of the non-concurrent code plus the extra work of setting up threads or tasks.
  • #19: Why the multiprocessing version rocks:
    ● It is easy to set up with very little code.
    ● It takes full advantage of the CPU power on your computer.
    Problems with the multiprocessing version:
    ● Splitting your problem up so that each processor can work independently can sometimes be difficult.
    ● Many solutions require more communication between the processes. This adds complexity compared with the synchronous version, and sometimes the communication overhead can outweigh the benefit.