How to Pool Map With Multiple Arguments in Python

Muhammad Waiz Khan Feb 12, 2024
  1. Parallel Function Execution Using the pool.map() Method
  2. Parallel Function Execution With Multiple Arguments Using the pool.starmap() Method
  3. Parallel Function Execution With Multiple Arguments Using the Pool.apply_async() Method
  4. Conclusion

This article will explain different methods to perform parallel function execution using the multiprocessing module in Python.

The multiprocessing module provides the functionalities to perform parallel function execution with multiple inputs and distribute input data across different processes.

We can parallelize the function’s execution with different input values by using the following methods in Python.

Parallel Function Execution Using the pool.map() Method

The pool.map(function, iterable) method applies the function provided as input to each item of the input iterable and returns a list of the results, blocking until all of them are ready. Therefore, if we want to perform parallel execution of a function with different inputs, we can use the pool.map() method.

The below example code demonstrates how to use the pool.map() method to parallelize the function execution in Python.

from multiprocessing import Pool


def myfunc(x):
    return 5 + x


if __name__ == "__main__":
    with Pool(3) as p:
        print(p.map(myfunc, [1, 2, 3]))

Here, myfunc is a simple function that takes a single argument x and returns the result of 5 + x. This function is designed to be easily parallelizable, as it operates independently on its input.

We then open a pool of three worker processes with the context manager with Pool(3) as p:. This setup ensures the pool is automatically closed and cleaned up once we exit the block.

The parallel processing is carried out by p.map(myfunc, [1, 2, 3]), where we use the map function of the Pool object, akin to Python’s built-in map. This function applies myfunc to each element in the iterable [1, 2, 3], with each application executed in parallel across the pool’s worker processes.

Output:

[6, 7, 8]
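
Note that p.map() blocks until every result has been computed and returns them all at once as a list. If you would rather consume results one at a time, Pool also provides the imap() method; below is a minimal sketch of the same example using it.

from multiprocessing import Pool


def myfunc(x):
    return 5 + x


if __name__ == "__main__":
    with Pool(3) as p:
        # imap() yields results lazily, in input order, instead of
        # building the entire result list before returning
        for result in p.imap(myfunc, [1, 2, 3]):
            print(result)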

If the input function has more arguments than we want to vary, we can freeze some of them using the functools.partial() function and parallelize over the rest. When more than one argument remains free, as in the example below, we pair partial() with the pool.starmap() method.

The below example demonstrates how to parallelize the function execution with multiple arguments using partial() and pool.starmap() in Python.

import multiprocessing
from functools import partial

# Example function with three arguments
def multiply(x, y, z):
    return x * y * z


# Main execution block
if __name__ == "__main__":
    # Creating a pool of workers
    with multiprocessing.Pool() as pool:
        # Freezing the first argument of the function 'multiply' to 10
        partial_multiply = partial(multiply, 10)

        # List of tuples, each containing two arguments
        args_list = [(2, 3), (4, 5), (6, 7)]

        # Mapping the partial function across the pool
        results = pool.starmap(partial_multiply, args_list)

    # Displaying the results
    print("Results:", results)

In this example, we have defined a function multiply(x, y, z) that simply multiplies its three arguments.

Our goal is to execute this function in parallel for various combinations of y and z, while keeping x constant. To achieve this, we utilize functools.partial().

We create the partial_multiply object using partial(multiply, 10). Here, multiply is the target function, and 10 is the fixed value for the first argument x.

This partial object behaves like the multiply function where x is always 10.
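
To see this equivalence directly, calling the partial object with the two remaining arguments produces the same result as calling multiply with 10 prepended; here is a quick standalone illustration.

from functools import partial


def multiply(x, y, z):
    return x * y * z


partial_multiply = partial(multiply, 10)

# The partial object supplies only y and z; x is already fixed to 10
print(partial_multiply(2, 3))  # 60, same as multiply(10, 2, 3)
print(multiply(10, 2, 3))  # 60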

Our args_list contains tuples, each representing a pair of arguments (y and z) for which the multiply function needs to be executed. Notice that we don’t need to specify the first argument (x) as it’s already set in partial_multiply.

We then use the multiprocessing.Pool to create a pool of worker processes. With pool.starmap(), we map the partial_multiply function over args_list.

The starmap method is essential here, allowing the elements of args_list to be unpacked into the function arguments.

Finally, we print out the results. Each element in results corresponds to the product of 10 (our fixed x) and the respective elements in args_list.

Output:

Results: [60, 200, 420]

As the above example shows, the shortcoming of this approach is that we cannot vary the frozen argument: partial() fixes its value for every call.
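
When only one argument needs to vary, however, partial() pairs naturally with plain pool.map(). Below is a minimal sketch that freezes the first two arguments of multiply (the values 10 and 20 are arbitrary choices for illustration).

import multiprocessing
from functools import partial


def multiply(x, y, z):
    return x * y * z


if __name__ == "__main__":
    # Freeze x=10 and y=20 so that only z varies across the workers
    partial_multiply = partial(multiply, 10, 20)

    with multiprocessing.Pool() as pool:
        # With a single free argument, plain map() is sufficient
        results = pool.map(partial_multiply, [1, 2, 3])

    print("Results:", results)  # Results: [200, 400, 600]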

Parallel Function Execution With Multiple Arguments Using the pool.starmap() Method

If we want to execute a function in parallel with multiple arguments, we can do so using the pool.starmap(function, iterable) method.

Like the pool.map(function, iterable) method, the pool.starmap(function, iterable) method applies the function provided as input to each item of the iterable and returns a list of results. However, it expects each item of the iterable to itself be an iterable (such as a tuple) whose elements are unpacked as the function's arguments.

By using the pool.starmap() method, we can provide different values to all arguments of the function, unlike the pool.map() method.

We can perform parallel function execution with multiple arguments in Python using the pool.starmap() method in the following way.

import multiprocessing

# A sample function that takes multiple arguments
def calculate_area(length, width):
    return length * width


# Main execution block
if __name__ == "__main__":
    # Define a list of tuples for arguments
    rectangle_dimensions = [(1, 2), (3, 4), (5, 6)]

    # Create a pool of worker processes
    with multiprocessing.Pool() as pool:
        # Use starmap to apply 'calculate_area' across the list of tuples
        areas = pool.starmap(calculate_area, rectangle_dimensions)

    # Print the results
    print("Areas:", areas)

In our example, we define a function calculate_area(length, width) to calculate the area of a rectangle. We intend to use this function across a list of rectangle dimensions, represented as tuples (length, width).

We then create a pool of worker processes using with multiprocessing.Pool() as pool:. This context manager ensures that the pool is properly managed (e.g., closed) after its block of code is executed.

Inside this block, we call pool.starmap(calculate_area, rectangle_dimensions). The calculate_area function is our target function, and rectangle_dimensions is an iterable of tuples, where each tuple contains a pair of length and width for a rectangle.

What starmap() does here is essentially unpack each tuple from rectangle_dimensions and pass its elements as arguments to calculate_area. For instance, the first tuple (1, 2) is unpacked and passed as calculate_area(1, 2).

This operation is performed in parallel across different processes in the pool, making it highly efficient for large datasets or computationally intensive tasks.
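
Conceptually, starmap() behaves like the following sequential comprehension, except that the calls are distributed across the pool's worker processes; this standalone snippet is shown for illustration only.

# Sequential equivalent of pool.starmap(): each tuple is unpacked
# with * into the function's positional arguments
def calculate_area(length, width):
    return length * width


rectangle_dimensions = [(1, 2), (3, 4), (5, 6)]
areas = [calculate_area(*dims) for dims in rectangle_dimensions]
print(areas)  # [2, 12, 30]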

Finally, we print the results stored in areas. Each element in areas is the area of a rectangle corresponding to the respective tuple in rectangle_dimensions.

Output:

Areas: [2, 12, 30]

Parallel Function Execution With Multiple Arguments Using the Pool.apply_async() Method

In Python’s multiprocessing landscape, Pool.apply_async() offers a flexible way to execute functions asynchronously in parallel processes. This method is particularly useful for functions that require multiple arguments, allowing each process to handle complex computations independently and concurrently.

The apply_async() method from the multiprocessing.Pool class is designed to apply a function asynchronously to a pool of worker processes. This method is ideal for tasks that require non-blocking computations and where each task might take a variable amount of time to complete.

import multiprocessing

# Function to compute the power of a number
def power(base, exponent):
    return base**exponent


# Main execution block
if __name__ == "__main__":
    # Create a pool of worker processes
    with multiprocessing.Pool() as pool:
        # List of arguments
        args_list = [(2, 3), (4, 2), (3, 3), (5, 2)]

        # List to store the result objects
        results = [pool.apply_async(power, args) for args in args_list]

        # Retrieve and print the results
        for res in results:
            print("Result:", res.get())

In our code, we define a function power(base, exponent) that calculates the power of a number. To handle this function concurrently across multiple processes, we utilize the apply_async() method from Python’s multiprocessing module.

We start by creating a pool of worker processes using the context manager with multiprocessing.Pool() as pool:. This setup allows us to manage the pool efficiently, ensuring that resources are properly allocated and freed up after use.

Inside this context manager, we loop over our args_list, which contains tuples of arguments to be passed to the power function. Each tuple represents a distinct set of parameters for which we want to calculate the power.

For each tuple in args_list, we call pool.apply_async(power, args). It returns a result object immediately, without waiting for the function to complete.

We collect these result objects in the results list. Each result object is a placeholder for the eventual output of the power function.
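
Each of these result objects is a multiprocessing AsyncResult, which offers a few useful methods besides get(). The following standalone sketch demonstrates them (the 5-second timeout is an arbitrary choice).

import multiprocessing


def power(base, exponent):
    return base**exponent


if __name__ == "__main__":
    with multiprocessing.Pool() as pool:
        res = pool.apply_async(power, (2, 10))
        res.wait(timeout=5)  # Block for at most 5 seconds
        if res.ready():  # Has the call completed?
            print(res.successful())  # True if it finished without raising
            print(res.get())  # 1024; get() re-raises any worker exception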

By using apply_async(), we effectively distribute the computation of power across multiple processes.

Output:

Result: 8
Result: 16
Result: 27
Result: 25
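
apply_async() also accepts an optional callback argument: a function that runs in the parent process as soon as each result is ready, so we never block on get() at all. Below is a minimal sketch using the same power function; note that the results may be printed in any order.

import multiprocessing


def power(base, exponent):
    return base**exponent


def collect(result):
    # Runs in the parent process as each result becomes available
    print("Result:", result)


if __name__ == "__main__":
    with multiprocessing.Pool() as pool:
        for args in [(2, 3), (4, 2), (3, 3), (5, 2)]:
            pool.apply_async(power, args, callback=collect)
        # close() stops new submissions; join() waits for all tasks
        pool.close()
        pool.join()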

Conclusion

In this comprehensive exploration, we delved into Python’s multiprocessing module, uncovering methods for parallel function execution. The pool.map() method showcased parallelization with single-argument functions, while pool.starmap() demonstrated handling multiple arguments efficiently.

For asynchronous and non-blocking tasks, we examined the versatility of Pool.apply_async(). Through practical code examples, we illustrated the utilization of these methods, catering to both novices and seasoned developers.

The presented techniques empower Python programmers to harness the potential of multi-core processors, optimizing code for enhanced performance. By mastering these methods, developers can seamlessly integrate parallel processing into their applications, unlocking efficiency gains and maximizing computing resources.
