But seriously, it can be genuinely useful in certain scenarios. In this guide, we'll show you how to implement multiprocessing logging in Python using QueueHandler and a few related tools.
Before anything else, let's create a simple script that demonstrates basic multiprocessing with logging. We're going to use the `multiprocessing` module, which has been part of the standard library since Python 2.6 (the QueueHandler we use later requires Python 3.2+). Here's what our script looks like:
# Import necessary modules
import logging   # Built-in logging module
import random    # Used to generate a random sleep duration
import time      # Used to simulate work with time.sleep()
from multiprocessing import Process  # Process class for spawning worker processes

# Define a function for the worker process
def worker(name):
    logger = logging.getLogger(__name__)   # Get a logger named after the current module
    logger.info('Started %s', name)        # Log an info message when the worker starts
    for i in range(10):
        time.sleep(random.random())        # Sleep for a random fraction of a second to simulate work
        logger.debug('Doing something...') # Log a debug message for each unit of work
    logger.info('Finished %s', name)       # Log an info message when the worker finishes

# Check whether the module is being run as the main program
if __name__ == '__main__':
    logging.basicConfig(level=logging.DEBUG)   # Configure the root logger to show DEBUG and above
    processes = []
    for i in range(5):
        p = Process(target=worker, args=(i,))  # Create a process that runs the worker function
        p.start()                              # Start the process
        processes.append(p)
    for p in processes:
        p.join()                               # Wait for all workers to finish
This script defines a `worker` function that does some basic work and logs messages using the built-in `logging` module. We also set up a basic configuration for the root logger with `basicConfig` (more on configuration below). When you run this script, it starts 5 worker processes and logs their progress to the console. Note that this relies on the child processes inheriting the logging configuration, which happens with the fork start method (the default on Linux); with the spawn start method (Windows, and macOS since Python 3.8) each child starts with an unconfigured root logger.
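For reference, `basicConfig` can also take a format string. The one below is just an illustration that adds a timestamp and the process name to each line; the script above does not require it:

import logging

logging.basicConfig(
    level=logging.DEBUG,  # Show DEBUG and above
    format='%(asctime)s [%(processName)-10s] %(levelname)-8s %(message)s',  # Include the process name
)
logging.getLogger(__name__).info('Configured')  # Example log call using that configuration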
But what if we want the worker processes to hand their log records to a single place instead of each writing to the console on its own? That's where `QueueHandler` comes in! Here's an updated version of our script that uses a `QueueHandler` for multiprocessing logging:
# Import necessary libraries
import logging           # Core logging functionality
import logging.handlers  # Provides QueueHandler
import random            # Used to generate a random sleep duration
import time              # Used to simulate work with time.sleep()
from multiprocessing import Process, Queue  # Process and Queue from the multiprocessing library

# Define the worker function; it receives its name and the shared log queue
def worker(name, queue):
    logger = logging.getLogger(__name__)            # Get a logger named after the current module
    logger.setLevel(logging.DEBUG)                  # Set the logging level to DEBUG
    handler = logging.handlers.QueueHandler(queue)  # Create a QueueHandler that sends records to the shared queue
    formatter = logging.Formatter('%(asctime)s [%(processName)-10s] %(levelname)-8s %(message)s')
    handler.setFormatter(formatter)                 # Attach the formatter to the handler
    logger.addHandler(handler)                      # Attach the handler to the logger
    logger.info('Started %s', name)                 # Log an info message when the worker starts
    for i in range(10):
        time.sleep(random.random())                 # Sleep for a random fraction of a second
        logger.debug('Doing something...')          # Log a debug message for each unit of work
    logger.info('Finished %s', name)                # Log an info message when the worker finishes

# Check if the script is being run directly
if __name__ == '__main__':
    q = Queue()                                     # Create the shared queue that collects log records
    processes = []
    for i in range(5):
        p = Process(target=worker, args=(i, q))     # Pass the queue to each worker explicitly
        p.start()                                   # Start the process
        processes.append(p)
    for p in processes:
        p.join()                                    # Wait for all workers to finish
In this version of the script, each worker sets up its own handler, a `QueueHandler` from `logging.handlers`, and we format the messages to include the process name for each log entry. When you run this script, it starts 5 worker processes that all push their log records onto the shared queue instead of writing directly to the console or a file, so something still has to consume them, as shown in the sketch below.
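The standard companion to `QueueHandler` is `logging.handlers.QueueListener`, which runs in the parent process, drains the queue, and hands each record to ordinary handlers. Here is a minimal sketch of how the `__main__` block above could be extended; the plain `StreamHandler` is just one possible choice of output, and `worker` is the function defined in the script above:

import logging
import logging.handlers
from multiprocessing import Process, Queue

if __name__ == '__main__':
    q = Queue()                                             # Shared queue, as in the script above
    console = logging.StreamHandler()                       # Ordinary handler that writes to stderr
    listener = logging.handlers.QueueListener(q, console)   # Listener pulls records off the queue
    listener.start()                                        # Runs in a background thread in the main process

    processes = [Process(target=worker, args=(i, q)) for i in range(5)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()

    listener.stop()                                         # Flush remaining records and stop the listener

Because the `QueueHandler` in each worker already applied its formatter, the listener's handler only needs to write the pre-formatted message.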
But wait, what if we want to watch those logs in real time? That's where our trusty friend `tail` comes in. If we redirect the script's output to a file, we can monitor it as it grows:
# Monitor the log output in real time using the `tail` command.
# First, run the script `multiprocessing_logging.py` in the background with `&`,
# redirecting its output to a log file.
python multiprocessing_logging.py > multiprocessing_logging.log 2>&1 &
# Give the script a moment to start and produce some output.
sleep 1
# Follow the log file as it grows.
# The `-f` flag makes `tail` keep reading new lines as they are appended,
# and `grep` filters the stream so only lines containing "Started" are shown.
tail -f multiprocessing_logging.log | grep 'Started'
This runs our Python script in the background with its output redirected to `multiprocessing_logging.log`, sleeps for a second, then uses `tail -f` to follow the file while `grep` filters out every line that doesn't contain "Started". This way, we can see when each worker process starts without scrolling through all the other log messages.
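If you would rather have Python own the log file instead of relying on shell redirection, the queue listener idea from earlier can simply be pointed at a `FileHandler`. This is a minimal sketch under that assumption; the filename `workers.log` is arbitrary:

import logging
import logging.handlers
from multiprocessing import Queue

q = Queue()
file_handler = logging.FileHandler('workers.log')            # 'workers.log' is an example name
listener = logging.handlers.QueueListener(q, file_handler)   # Records from the queue go to the file
listener.start()
# ... start worker processes that log through QueueHandler(q), join them, then:
# listener.stop()

With that in place, `tail -f workers.log` follows the file directly.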
And there you have it: multiprocessing logging in Python! It may seem like a small thing, but it can be incredibly useful in certain scenarios, especially if you're working with large datasets or long-running processes. Give it a try!