Python Quick Tip: Simple ThreadPool Parallelism

Published Oct 28, 2015Last updated Feb 09, 2017
Python Quick Tip: Simple ThreadPool Parallelism

Parallelism isn't always easy, but by breaking our code down into a form that can be applied over a map, we can easily adjust it to be run in parallel!

A map is a built-in higher-order function that applies a given function to each element of a list, returning a list of results.

The multiprocessing library is usually used for separate processes, however it has a neat dummy module that works over threads. In fact, it's so trivial that you only need to set the number of threads and give it your function to be mapped over. This method does all of the hard work for you.

Don't believe me? Check it out for yourself, let's square a bunch of numbers in parallel:

Code Example

from multiprocessing.dummy import Pool as ThreadPool

def squareNumber(n):
    return n ** 2

# function to be mapped over
def calculateParallel(numbers, threads=2):
    pool = ThreadPool(threads)
    results = pool.map(squareNumber, numbers)
    pool.close()
    pool.join()
    return results

if __name__ == "__main__":
    numbers = [1, 2, 3, 4, 5]
    squaredNumbers = calculateParallel(numbers, 4)
    for n in squaredNumbers:
        print(n)
    

The number of threads will usually be equivalent to the number of cores you have. If you have hyperthreading on your processor, than you will be able to double the number of threads used.

The results returned will simply be a standard list of all of the squared numbers in this example. Too easy, right?

Discover and read more posts from Lance Pioch
get started