Python Quick Tip: Simple ThreadPool Parallelism
Parallelism isn't always easy, but by breaking our code down into a form that can be applied over a map, we can easily adjust it to be run in parallel!
A map is a built-in higher-order function that applies a given function to each element of a list, returning a list of results.
multiprocessing library is usually used for separate processes, however it has a neat
dummy module that works over threads. In fact, it's so trivial that you only need to set the number of threads and give it your function to be mapped over. This method does all of the hard work for you.
Don't believe me? Check it out for yourself, let's square a bunch of numbers in parallel:
from multiprocessing.dummy import Pool as ThreadPool def squareNumber(n): return n ** 2 # function to be mapped over def calculateParallel(numbers, threads=2): pool = ThreadPool(threads) results = pool.map(squareNumber, numbers) pool.close() pool.join() return results if __name__ == "__main__": numbers = [1, 2, 3, 4, 5] squaredNumbers = calculateParallel(numbers, 4) for n in squaredNumbers: print(n)
The number of threads will usually be equivalent to the number of cores you have. If you have hyperthreading on your processor, than you will be able to double the number of threads used.
The results returned will simply be a standard list of all of the squared numbers in this example. Too easy, right?