concurrent.futures

Needed updates:

  • difference between thread and process

  • how to choose the number of workers

  • add descriptive text to explain the process

import time
import random
import concurrent.futures
def sisyphus(upper_bound_input):
    '''
    Return a random integer between 1 and upper_bound_input (inclusive)
    after waiting for 1 second.
    The upper bound is passed to the function as the only input.
    '''
    r_int = random.randint(1, upper_bound_input)
    time.sleep(1)  # stand-in for a slow, blocking (I/O-bound) operation
    return r_int
%%time

list_of_uppers = [7, 10, 50, 22, 3, 49, 100, 231]

# Sequential baseline: eight calls at 1 second each, so about 8 seconds of wall time
main_results = [sisyphus(x) for x in list_of_uppers]

main_results
CPU times: user 1.27 ms, sys: 982 µs, total: 2.25 ms
Wall time: 8.02 s
[2, 10, 4, 12, 2, 41, 20, 51]

Threads vs processes

From the TutorialsPoint tutorial Concurrency in Python (see the section on threads and processes):

Now that we have studied both the Executor classes — ThreadPoolExecutor and ProcessPoolExecutor — we need to know when to use which executor. We need to choose ProcessPoolExecutor in case of CPU-bound workloads and ThreadPoolExecutor in case of I/O-bound workloads.

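If we use ProcessPoolExecutor, then we do not need to worry about the global interpreter lock (GIL) because it uses multiprocessing. Moreover, the execution time will be less when compared to ThreadPoolExecutor.

The last claim applies to CPU-bound work. The sisyphus function above spends its time in time.sleep, which releases the GIL, so it behaves like an I/O-bound task and threads are sufficient; this is why the timing comparison in this section uses ThreadPoolExecutor.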
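To make the CPU-bound case concrete, here is a minimal sketch that maps the same pure-Python computation over both executor classes and times each run. The names count_primes and run_with are hypothetical helpers introduced for illustration only, and the workload size is an arbitrary assumption. It is meant to be run as a script: the `if __name__ == "__main__":` guard matters on platforms where worker processes are started by spawning a fresh interpreter.

```python
import concurrent.futures
import time


def count_primes(limit):
    """CPU-bound stand-in task: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_with(executor_class, inputs, max_workers=4):
    """Map count_primes over inputs with the given executor class.

    Returns the list of results and the elapsed wall time in seconds.
    """
    start = time.perf_counter()
    with executor_class(max_workers=max_workers) as executor:
        results = list(executor.map(count_primes, inputs))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    limits = [20_000] * 8  # assumed workload, small enough to finish quickly
    for cls in (concurrent.futures.ThreadPoolExecutor,
                concurrent.futures.ProcessPoolExecutor):
        results, seconds = run_with(cls, limits)
        print(f"{cls.__name__}: {seconds:.2f} s")
```

On a multi-core machine running a standard CPython build, the process pool would be expected to finish this kind of workload faster, because the threads take turns holding the GIL. The actual numbers depend on the hardware and on the cost of starting the worker processes, so none are quoted here.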

num_workers = 4
%%time

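# Four threads run the eight 1-second calls in two waves, so about 2 seconds of wall time
with concurrent.futures.ThreadPoolExecutor(max_workers=num_workers) as executor:
    # map returns an iterator that yields results in input order
    fast_results = executor.map(sisyphus, list_of_uppers)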

list(fast_results)
CPU times: user 1.58 ms, sys: 961 µs, total: 2.54 ms
Wall time: 2.01 s
[7, 5, 29, 4, 1, 49, 59, 91]
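The drop from about 8 seconds to about 2 seconds follows from the worker count: with blocking tasks of equal length, the wall time is roughly the task duration multiplied by the number of waves needed, that is ceil(tasks / workers). The sketch below is one way to explore this when choosing num_workers. The helpers nap and wall_time are hypothetical, and the shorter 0.2 second delay is an assumption made only to keep the experiment quick.

```python
import concurrent.futures
import time


def nap(seconds):
    """I/O-bound stand-in: block for `seconds`, then return the duration."""
    time.sleep(seconds)
    return seconds


def wall_time(num_workers, num_tasks=8, seconds=0.2):
    """Run num_tasks naps on a thread pool and return the elapsed wall time."""
    start = time.perf_counter()
    with concurrent.futures.ThreadPoolExecutor(max_workers=num_workers) as executor:
        # Consume the iterator so every task has finished before timing stops
        list(executor.map(nap, [seconds] * num_tasks))
    return time.perf_counter() - start


if __name__ == "__main__":
    for num_workers in (1, 2, 4, 8):
        print(f"{num_workers} workers: {wall_time(num_workers):.2f} s")
```

For blocking tasks like these, adding threads keeps helping until there are as many workers as tasks. For CPU-bound work in a process pool, the number of available cores (os.cpu_count()) is the more natural ceiling, since extra processes would only compete for the same cores.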