Python Serialization and Deserialization

To start: what exactly is serialization? Well, it’s basically taking complex in-memory data (like objects or dictionaries) and turning it into a format that can be stored on disk or transmitted over a network. It’s like writing down a recipe you know by heart so you can send it to your amigos in Mexico, who can then cook the same dish from the written copy.

Now, why would we want to do this? Well, let’s say you have a massive dataset that takes up way too much memory on your computer. Instead of keeping all that data loaded into RAM (which can slow down your machine), you can serialize it and store it on disk or transmit it over the internet. Saving it to disk is called “data persistence”: storing data so it doesn’t disappear when you close your program.

So, how do we actually do this in Python? Well, the standard library has a few different modules that can help us out: `pickle`, `json`, and `marshal`. But for the sake of simplicity (and because it handles most Python objects with no extra work), let’s focus on `pickle`.
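Before narrowing things down to pickle, it may help to see how it differs from the json module. The snippet below is a minimal sketch, not a full comparison; the `record` dictionary and the `tags` key are made-up sample data.

```python
import json
import pickle

# Hypothetical sample data, used only for illustration
record = {"name": "John Doe", "age": 30}

# json produces human-readable text (a str)
as_json = json.dumps(record)

# pickle produces an opaque binary blob (bytes)
as_pickle = pickle.dumps(record)

print(type(as_json).__name__, type(as_pickle).__name__)

# Both formats round-trip this simple dictionary to an equal object
assert json.loads(as_json) == record
assert pickle.loads(as_pickle) == record

# json only understands basic types; a set, for example, is rejected
try:
    json.dumps({"tags": {"a", "b"}})
except TypeError as exc:
    print("json could not serialize the set:", exc)

# pickle handles the same value without extra work
restored = pickle.loads(pickle.dumps({"tags": {"a", "b"}}))
print(restored)
```

The trade-off, roughly: json output is text that other languages can read, while pickle output is Python-specific but covers far more Python types.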

Here’s an example script to get you started:

# Import the pickle module
import pickle

# Create a dictionary with some data
my_data = {
    "name": "John Doe",
    "age": 30,
    "occupation": "Software Engineer"
}

# Serialize the data and save it to a file named "output.pkl"
# ("wb" opens the file for writing in binary mode, which pickle requires)
with open("output.pkl", "wb") as f:
    pickle.dump(my_data, f)

# The data is now saved in "output.pkl" and can be read back later with pickle.load()

In this example, we’re creating a dictionary called `my_data`, which contains some basic information about someone named John Doe. We then use the `pickle.dump()` function to write that data to a file called “output.pkl”.
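If you’re curious what actually lands on disk, here is a small sketch that writes the same dictionary and then inspects the raw bytes. It assumes you’re happy writing to a temporary directory (so nothing is left behind) and passes the optional `protocol` argument explicitly, which the original script leaves at its default.

```python
import os
import pickle
import tempfile

my_data = {"name": "John Doe", "age": 30, "occupation": "Software Engineer"}

# Write into a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "output.pkl")

    with open(path, "wb") as f:
        # protocol is optional; HIGHEST_PROTOCOL selects the newest pickle format
        pickle.dump(my_data, f, protocol=pickle.HIGHEST_PROTOCOL)

    size_in_bytes = os.path.getsize(path)

    # Read the file back as raw bytes, without deserializing it
    with open(path, "rb") as f:
        raw = f.read()

print(f"wrote {size_in_bytes} bytes")
# Binary pickles (protocol 2 and up) begin with the byte 0x80
print(raw[:1])
```

The file is not meant to be read by humans, which is why it has to be opened in binary mode for both writing and reading.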

Now let’s say you want to load that serialized data back into Python:

# Import the pickle module
import pickle

# Open the output file in binary read mode ("rb") and deserialize the data
with open("output.pkl", "rb") as f:
    my_data = pickle.load(f)

# Print the deserialized dictionary, which should contain the information about John Doe
print(my_data)

In this example, we’re opening up our output file and reading in the serialized data using `pickle.load()`. This function automatically converts the byte stream back into a Python object (in this case, a dictionary).
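To see what “converts the byte stream back into a Python object” means in practice, here is a rough sketch of a full round trip done in memory with `pickle.dumps()` and `pickle.loads()`, the bytes-based counterparts of `dump()` and `load()`. The `hired` and `location` fields are invented for the example.

```python
import datetime
import pickle

# Hypothetical record mixing several Python types
original = {
    "name": "John Doe",
    "hired": datetime.date(2020, 1, 15),
    "location": (40.7, -74.0),
}

payload = pickle.dumps(original)   # object -> bytes
restored = pickle.loads(payload)   # bytes -> a brand-new object

print(restored == original)   # True: same contents
print(restored is original)   # False: a separate copy, not the same object

# Types survive the trip: the date is still a date, the tuple still a tuple
print(type(restored["hired"]).__name__, type(restored["location"]).__name__)
```

One thing worth keeping in mind: only unpickle data that comes from a source you trust, since loading a pickle can execute arbitrary code.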

And that’s it! You now have a basic understanding of serialization and deserialization in Python using pickle.
