To start: what is Cython? Well, it’s basically Python on juice (or maybe more accurately, C on juice). It lets you write code that looks like Python but compiles down to C for maximum speed. And when we’re talking about scientific computing with SciPy, every millisecond counts!
So why bother Cythonizing our beloved SciPy functions? Well, let’s take a look at this example:
# Import necessary libraries
import numpy as np
from scipy import linalg

# Define a function that takes two matrices, A and B
def my_function(A, B):
    # Multiply A and B, then invert the product using SciPy
    y = linalg.inv(np.dot(A, B))
    return y
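As a quick sanity check of the math, here is a self-contained version (with A and B passed in explicitly, and small hypothetical inputs): the inverse times the original product should give back the identity matrix.

```python
import numpy as np
from scipy import linalg

def my_function(A, B):
    # Invert the product of A and B
    return linalg.inv(np.dot(A, B))

# Small hypothetical inputs, just for checking
A = np.array([[2.0, 0.0], [0.0, 2.0]])
B = np.array([[1.0, 1.0], [0.0, 1.0]])

Y = my_function(A, B)
# inv(A @ B) @ (A @ B) should be (numerically) the identity
print(np.allclose(Y @ np.dot(A, B), np.eye(2)))  # True
```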
This function looks innocent enough, but if we time it on a large dataset, we might notice that it’s not as fast as we’d like:
# Time my_function on a pair of large input matrices
# (%timeit runs the function several times and reports the best average)
%timeit my_function(A, B)
# Output: 10 loops, best of 3: 2.57 s per loop
(As an aside, Numba’s @numba.jit decorator is another way to compile Python code down to machine code, but in this post we’ll stick with Cython.)
That’s way too slow for our needs. But don’t be scared: Cython to the rescue! Here’s what that same function looks like after we’ve converted it using Cython:
# my_function.pyx
# Import necessary libraries
import numpy as np
from scipy import linalg
cimport numpy as np  # C-level NumPy API for Cython

# Define the function with statically typed, 2-dimensional NumPy array arguments
def my_function(np.ndarray[np.double_t, ndim=2] A,
                np.ndarray[np.double_t, ndim=2] B):
    # Declare the intermediate product with a static type
    cdef np.ndarray[np.double_t, ndim=2] C
    # Multiply A and B (the typed arguments let Cython skip Python-level checks)
    C = np.dot(A, B)
    # Invert the product using SciPy and return the result
    return linalg.inv(C)
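One note on compiling: unlike plain Python, Cython code has to be built before it can be imported. A minimal build script sketch (assuming the code above is saved as my_function.pyx; the file names here are illustrative):

```python
# setup.py - minimal build script for a Cython extension (illustrative)
from setuptools import setup
from Cython.Build import cythonize
import numpy as np

setup(
    ext_modules=cythonize("my_function.pyx"),
    include_dirs=[np.get_include()],  # needed because the .pyx does "cimport numpy"
)
```

Run `python setup.py build_ext --inplace`, then `import my_function` as usual. In a Jupyter notebook, the `%%cython` cell magic handles this build step behind the scenes.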
Wow! That’s a lot of code just to convert our function to Cython. But trust me, it’s worth it for the speed boost we get in return:
# The original, pure-Python version of the function
def my_function(x):
    # Create an empty list to store the results
    result = []
    # Square each element and append it to the result list
    for i in range(len(x)):
        result.append(x[i] ** 2)
    # Return the result list
    return result

# Time the execution of the pure-Python version
%timeit my_function(x)
# 10 loops, best of 3: 258 ms per loop

# The Cython version (in a .pyx file or a %%cython cell)
cimport cython

@cython.boundscheck(False)  # Disable bounds checking for faster execution
@cython.wraparound(False)   # Disable negative indexing for faster execution
def my_function_cython(double[:] x):
    # The typed memoryview double[:] gives C-speed element access
    result = []
    for i in range(x.shape[0]):
        result.append(x[i] ** 2)
    return result

# Time the execution of the Cython version
%timeit my_function_cython(x)
# 100 loops, best of 3: 2.58 ms per loop
# Roughly a 100x speedup over the pure-Python version, thanks to the typed
# memoryview and the disabled bounds checking and negative indexing.
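As a quick correctness check of the pure-Python version (the Cython version should return exactly the same values; the small input here is hypothetical):

```python
def my_function(x):
    # Square each element, collecting the results in a list
    result = []
    for i in range(len(x)):
        result.append(x[i] ** 2)
    return result

print(my_function([1.0, 2.0, 3.0]))  # [1.0, 4.0, 9.0]
```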
Holy cow! That’s a massive improvement. And if you think about it, that’s the beauty of Cython: we can write code in Python but get C-level performance out of it. It’s like having our cake and eating it too (or maybe more accurately, having our cake and getting to eat it faster).
And if you’re feeling adventurous, why not try converting your own favorite functions? Who knows, maybe you’ll be the next person to discover the holy grail of scientific computing!