You might have heard of garbage collection before, or maybe you haven’t. Either way, we’re going to take a closer look at Python’s implementation and hopefully make it a little less intimidating for everyone.
First things first: what is garbage collection? In simple terms, it’s the process by which a language runtime automatically frees memory that your program is no longer using. This is especially helpful in languages like Python, where we don’t have to manage memory allocation explicitly (as we do in C, with `malloc` and `free`).
So how does Python do garbage collection? Well, its main technique is called reference counting. Essentially, every object in Python has a counter that keeps track of the number of references pointing to it. When an object is no longer being used by your program (i.e., all its references have been removed), its counter drops to zero and CPython reclaims its memory right away.
Here’s how you can see this in action:
# Create a list with three elements
x = [1, 2, 3]
# Get the memory address of x
print(id(x))
# Remove the first element from the list (drops the list's reference to that element)
del x[0]
# Check that the length has decremented by one
print(len(x))
# See if the memory address is still the same
print(id(x))
# Output (the address will differ on your machine):
# 140708032032640
# 2
# 140708032032640
# Explanation:
# The first line creates a list with three elements and assigns it to the variable x.
# The second line uses the id() function to get the memory address of the list object x.
# The third line uses the del keyword to remove the first element from the list. This drops the list's reference to that element; the reference count of the list itself is unchanged.
# The fourth line uses the len() function to check the length of the list, which should now be two.
# The fifth line uses the id() function again to check if the memory address of the list object x has changed after removing an element. Since the address is the same, it means that the list object was modified in place rather than creating a new object.
In this example, we create a new list `x`, print its identity using the `id()` function (in CPython, the object’s memory address), remove the first element, and then check that the length has dropped by one. Deleting the element releases the list’s reference to that element; it does not change the reference count of the list itself. Finally, we print the address again to confirm it is unchanged, which shows that the list was modified in place. This matters because Python variables are names bound to objects, so multiple variables can point to the same object in memory.
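To make the idea of several variables pointing to one object concrete, here is a minimal sketch. The names `y` and `z` are introduced purely for illustration and are not part of the example above.

```python
# Two names bound to one list object
x = [1, 2, 3]
y = x                    # no copy is made; y refers to the same object as x

print(y is x)            # True: both names refer to one object
print(id(y) == id(x))    # True: same identity

y.append(4)              # mutate the list through one name...
print(x)                 # ...and the change shows through the other: [1, 2, 3, 4]

z = list(x)              # an explicit copy creates a separate object
print(z is x)            # False: different objects
print(z == x)            # True: equal contents
```

Each extra name bound to the list counts as one more reference to it, which is exactly what the reference counter tracks.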
So what happens when an object’s reference count goes down to zero? Well, that’s when things get interesting! CPython combines two garbage collection mechanisms: Reference Counting and Cyclic Garbage Collection (CGC). Let’s look at each one up close.
Reference counting is the primary mechanism in CPython (the most popular implementation of Python), and it works like this: when an object is created and bound to a name, its reference count starts at 1. Each additional reference to it, such as another variable, a list slot, or a function argument, increases the count by 1. And when a reference goes away, the count decreases by 1. When the count reaches zero, CPython frees the object’s memory immediately; no separate collection pass is needed.
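You can watch these counts change with `sys.getrefcount`. The sketch below assumes CPython; the `Node` class and the variable names are hypothetical, chosen only for the demonstration. Because `sys.getrefcount` itself holds a temporary reference while it runs, the example compares differences rather than absolute values.

```python
import sys
import weakref

class Node:
    """Hypothetical placeholder class used only for this demonstration."""

obj = Node()
base = sys.getrefcount(obj)          # includes a temporary reference held by the call

alias = obj                          # a second name now refers to the object
after_alias = sys.getrefcount(obj)

del alias                            # the second name goes away again
after_del = sys.getrefcount(obj)

print(after_alias - base)            # 1: one extra reference
print(after_del - base)              # 0: back to the starting count

probe = weakref.ref(obj)             # a weak reference does not keep obj alive
del obj                              # the last strong reference is removed
print(probe() is None)               # True on CPython: the object was freed at once
```

The weak reference at the end is just a way to observe that the object disappeared as soon as its count hit zero, without waiting for any collector to run.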
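Cyclic Garbage Collection (CGC) is used to handle circular references between objects. This can happen when, for example, two lists contain each other or an object holds a reference to itself. In these cases, reference counting alone won’t work, because each object in the cycle keeps the others’ counts above zero even after the program can no longer reach any of them. To solve this problem, CPython runs a separate cycle detector alongside reference counting. It is often loosely described as mark-and-sweep, the family of tracing techniques that the garbage collectors in runtimes such as Java and C# build on, although CPython’s version differs in its details.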
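Here is a small sketch of a reference cycle that survives `del` and is only reclaimed by the cycle collector. It assumes CPython and uses a hypothetical `Node` class instead of lists, because lists cannot be weakly referenced and a weak reference is how the sketch observes the object.

```python
import gc
import weakref

class Node:
    """Hypothetical class for the demo; holds a reference to another Node."""
    def __init__(self):
        self.other = None

gc.disable()                   # pause automatic collection so the demo is deterministic

a, b = Node(), Node()
a.other, b.other = b, a        # a -> b -> a: a reference cycle
probe = weakref.ref(a)         # observe a without keeping it alive

del a, b                       # no names remain, but the objects still reference each other
alive_after_del = probe() is not None

gc.collect()                   # run the cycle collector explicitly
alive_after_collect = probe() is not None

gc.enable()
print(alive_after_del)         # True: reference counting alone could not free the cycle
print(alive_after_collect)     # False: the cycle collector reclaimed it
```

In normal programs you rarely need to call `gc.collect()` yourself; it is called here only to make the moment of collection visible.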
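The basic idea behind classic mark-and-sweep is that you first “mark” every object reachable from the program’s roots (global variables, the call stack, and so on), then “sweep” through memory and free whatever was left unmarked. CPython’s cycle collector approaches the problem from the other side: it looks only at container objects (lists, dicts, instances, and other objects that can hold references), subtracts the references those containers hold to each other, and treats any group kept alive solely by its own internal references as garbage. The cost of a pass grows with the number of tracked container objects, so CPython groups them into generations and scans younger objects more often than older ones.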
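The mark and sweep phases can be sketched on a toy model. This is not CPython’s actual algorithm: the “heap” is just a dictionary, and `mark_and_sweep`, `heap`, and `roots` are hypothetical names made up for this illustration.

```python
def mark_and_sweep(heap, roots):
    """Toy collector.

    heap:  dict mapping an object name to the list of names it references.
    roots: names the program can reach directly (think: variables).
    Returns the set of names that were freed.
    """
    # Mark phase: walk everything reachable from the roots.
    marked = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name in marked:
            continue
        marked.add(name)
        stack.extend(heap[name])

    # Sweep phase: free everything that was never marked.
    garbage = set(heap) - marked
    for name in garbage:
        del heap[name]
    return garbage

heap = {
    "a": ["b"],
    "b": [],
    "c": ["d"],   # c and d reference each other...
    "d": ["c"],   # ...but nothing reachable points to them
}
freed = mark_and_sweep(heap, roots=["a"])
print(sorted(freed))   # ['c', 'd']
print(sorted(heap))    # ['a', 'b']
```

Note that `c` and `d` are freed even though each is still referenced by the other, which is exactly the case reference counting cannot handle.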
While reference counting does most of the work, CGC comes into play when dealing with cyclic references. And if you ever find yourself struggling with memory management or performance issues, don’t hesitate to reach out for help.