difference between thread and process
how to choose the number of workers
add descriptive text to explain the process
import time
import random
import concurrent.futures

def sisyphus(upper_bound_input):
    '''
    This function generates a random number and then waits for 1 second
    before returning it. The upper bound is passed to the function as the
    only input.
    '''
    r_int = random.randint(1, upper_bound_input)
    time.sleep(1)
    return r_int

%%time
list_of_uppers = [7, 10, 50, 22, 3, 49, 100, 231]
main_results = [sisyphus(x) for x in list_of_uppers]
main_results
CPU times: user 1.27 ms, sys: 982 µs, total: 2.25 ms
Wall time: 8.02 s
[2, 10, 4, 12, 2, 41, 20, 51]
Threads vs processes
Now that we have studied both executor classes, ThreadPoolExecutor and ProcessPoolExecutor, we need to know when to use each. Choose ProcessPoolExecutor for CPU-bound workloads and ThreadPoolExecutor for I/O-bound workloads. The sisyphus function spends almost all of its time in time.sleep, so it behaves like an I/O-bound task and is a good fit for threads.
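With ProcessPoolExecutor, the global interpreter lock (GIL) is not a bottleneck, because each worker is a separate process with its own interpreter. For CPU-bound work this usually makes it faster than ThreadPoolExecutor, whose threads take turns holding the GIL. For I/O-bound work the advantage disappears: threads release the GIL while they wait, and processes cost more to start and to exchange data with.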
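To make the CPU-bound case concrete, here is a minimal sketch that runs the same pure-Python computation under both executors. The names count_primes and timed_map are hypothetical helpers introduced only for this illustration, and the workload sizes are arbitrary. On a standard CPython build with several cores, the process pool would typically finish sooner, but the actual timings depend on the machine, so none are shown.

```python
import concurrent.futures
import time


def count_primes(limit):
    """CPU-bound toy task (hypothetical): count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def timed_map(executor_cls, limits, max_workers=4):
    """Map count_primes over `limits` with the given executor class.

    Returns the list of results and the elapsed wall time in seconds.
    """
    start = time.perf_counter()
    with executor_cls(max_workers=max_workers) as executor:
        results = list(executor.map(count_primes, limits))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    limits = [100_000] * 4  # four equally sized CPU-bound tasks
    for cls in (concurrent.futures.ThreadPoolExecutor,
                concurrent.futures.ProcessPoolExecutor):
        results, seconds = timed_map(cls, limits)
        print(f"{cls.__name__}: {seconds:.2f} s")
```

ProcessPoolExecutor has to pickle the function and its arguments to send them to the workers, so count_primes is defined at module level. On platforms that start workers by spawning a fresh interpreter, this is most reliable when run as a script, where the `if __name__ == "__main__":` guard applies, rather than from a notebook cell.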
num_workers = 4

%%time
with concurrent.futures.ThreadPoolExecutor(max_workers=num_workers) as executor:
    fast_results = executor.map(sisyphus, list_of_uppers)
list(fast_results)
CPU times: user 1.58 ms, sys: 961 µs, total: 2.54 ms
Wall time: 2.01 s
[7, 5, 29, 4, 1, 49, 59, 91]
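The two wall times above (about 8 s sequentially, about 2 s with 4 workers) fit a simple back-of-the-envelope model: when every task takes the same time and does nothing but wait, the tasks run in "waves" of max_workers at a time. The sketch below works through that arithmetic; estimated_wall_time is a hypothetical helper written for this illustration, and the model ignores scheduling and startup overhead.

```python
import math


def estimated_wall_time(n_tasks, n_workers, seconds_per_task=1.0):
    """Rough estimate for equal-length, wait-only tasks.

    Tasks run in waves of `n_workers`; each wave lasts one task duration.
    """
    waves = math.ceil(n_tasks / n_workers)
    return waves * seconds_per_task


# Eight 1-second tasks, as in list_of_uppers above.
for n_workers in (1, 2, 4, 8, 16):
    print(n_workers, estimated_wall_time(8, n_workers))
# 1 8.0
# 2 4.0
# 4 2.0
# 8 1.0
# 16 1.0
```

Under this model, going beyond 8 workers gains nothing for 8 tasks, which is one way to reason about the number of workers for wait-dominated work. For CPU-bound work in a process pool, the number of available cores is usually the more relevant limit.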