MA 346 LOYO -- Cython
By Zach and Dennis
Why was Cython invented?
Python is widely used because it has relatively easy-to-use syntax and is supported by many libraries, making it a very popular language in many different fields.
However, while Python may be very flexible, it is lacking in speed and efficiency. In situations that call for computations over lots of data or numerically intensive code, such as loop-heavy algorithms, Python's inefficiencies become very pronounced, with run times reaching several minutes to hours, or simply timing out if you're using an online service such as Deepnote.
Cython addresses these issues by attempting to combine the efficient qualities of the C programming language with the easy-to-use syntax of Python. In fact, combining C with Python is nothing new. For example, several popular data science libraries that we have used in class, such as NumPy and SciPy, have their performance-critical parts written in compiled languages such as C.
How does Cython work?
Cython essentially extends the capabilities of Python by providing a set of optional type declarations and syntax elements that allow the code to be translated into C and then compiled. C is widely regarded as one of the fastest general-purpose programming languages.
For example, Python supports "dynamic typing", meaning that you're not required to state the data type of a variable (e.g. str, int, float, bool, etc.) when declaring it. While this is very convenient for programming, it is less efficient at run time. Cython allows users to add explicit static type declarations, which can result in much bigger gains in performance.
Example: Dynamic Typing vs. Static Typing
In Python, users do not enter the data type of their variables when declaring them. Python automatically determines the data type of each variable at run time.
In Cython, users have the option to write out the data type of their variables when declaring them. As a result, the type of each such variable is already known at compile time.
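As a small illustration of the difference, here is a minimal sketch of dynamic typing in plain Python. The variable names are ours, chosen for illustration, and the Cython declarations appear only as comments because `cdef` is Cython syntax and is not valid in a plain Python file.

```python
# Dynamic typing: the same name can be bound to values of different types,
# and Python only discovers each type while the program is running.
value = 42
first_type = type(value).__name__

value = "forty-two"
second_type = type(value).__name__

print(first_type, second_type)  # prints: int str

# In a Cython (.pyx) file or %%cython cell, a static declaration would fix
# the type at compile time instead, for example:
#     cdef int count = 42
#     cdef double ratio = 0.5
```

Because the typed variables can never hold anything other than a C int or a C double, Cython can generate plain C arithmetic for them instead of going through Python objects.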
Other Benefits of Cython
Cython can work as a Python compiler. This means that (most) Python code will compile perfectly in Cython, even if it probably wouldn't gain much in performance.
For programmers, this means that when converting their code to be used in Cython, they can change it incrementally instead of rewriting it from the ground up. This also means that they can keep most of their code the same while only replacing the computation-heavy parts with Cython to get its benefits.
Python vs. Cython Efficiency Examples
Below we have included several code comparisons between Python and Cython to illustrate the efficiency benefits, measured in seconds needed to run the code.
This is an example implementation of an integral function taken from Cython's documentation. Because Python represents every number as a full object and checks types dynamically on every operation, including each call to the f(x) function, the Python run time will be slow.
However, converting the Python code to Cython with static data types yields some significant performance gains.
At the time of writing, the Python code cell took 0.0037 seconds to run, while the Cython cell took 5.1022e-05 seconds to run, a decrease of almost 99%!
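For reference, the plain Python side of this comparison can be sketched as follows. This follows the integration example in the Cython tutorial as we recall it; the bounds, the value of N, and the timing harness are our assumptions and will not reproduce the exact timings quoted above. The typed Cython version is shown only as comments, since it is not valid plain Python.

```python
import timeit


def f(x):
    # Integrand from the Cython tutorial example: x^2 - x
    return x ** 2 - x


def integrate_f(a, b, N):
    # Left Riemann sum of f over [a, b] using N rectangles
    s = 0
    dx = (b - a) / N
    for i in range(N):
        s += f(a + i * dx)
    return s * dx


# Average the run time over a few calls (N=1000 is an arbitrary choice)
elapsed = timeit.timeit(lambda: integrate_f(0.0, 1.0, 1000), number=10) / 10
print(f"integrate_f averaged {elapsed:.6f} s per call")

# A statically typed Cython version (in a %%cython cell or .pyx file)
# would look roughly like this:
#     def f(double x):
#         return x ** 2 - x
#
#     def integrate_f(double a, double b, int N):
#         cdef int i
#         cdef double s = 0
#         cdef double dx = (b - a) / N
#         for i in range(N):
#             s += f(a + i * dx)
#         return s * dx
```

The exact integral of x^2 - x over [0, 1] is -1/6, which gives a quick sanity check that both versions compute the same thing before comparing their speed.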
Exact Same Code -- Bubble Sort in Python & Cython
Even with absolutely no changes in the code, simply copying and pasting our bubble sort algorithm into Cython yields some performance gains.
For our test we decided to generate a list of 10,000 random numbers ranging from 0 to 10,000 (there are likely to be duplicates).
The Cython code snippet (~6 seconds) took less than half the time of the Python code snippet (~13 seconds), and this is just for indexing and comparison operations.
Bubble sort is a fun algorithm to test because it is one of the simplest sorting algorithms and runs with a relatively poor time complexity of O(n^2). This means that as the size of our data set (n) grows, the time it takes for our algorithm to sort the data grows in proportion to n^2.
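A minimal sketch of a bubble sort like the one described above is shown below. This is our own illustrative version, not necessarily the exact code from the notebook. To keep it quick to run, it sorts 1,000 numbers rather than 10,000; SIZE and the random seed are assumptions you can change.

```python
import random
import time


def bubble_sort(values):
    """Sort a list in place with bubble sort and return it."""
    n = len(values)
    for i in range(n - 1):
        # After pass i, the last i + 1 items are already in their final place
        for j in range(n - 1 - i):
            if values[j] > values[j + 1]:
                values[j], values[j + 1] = values[j + 1], values[j]
    return values


random.seed(0)  # fixed seed so repeated runs sort the same data
SIZE = 1_000    # the test described above used 10,000
data = [random.randint(0, 10_000) for _ in range(SIZE)]
expected = sorted(data)  # reference result from Python's built-in sort

start = time.perf_counter()
result = bubble_sort(data)
elapsed = time.perf_counter() - start
print(f"Sorted {SIZE} numbers in {elapsed:.3f} s")
```

Pasting the same function unchanged into a cell that starts with %%cython is the "exact same code" experiment; because the work is O(n^2), raising SIZE from 1,000 to 10,000 should increase the run time roughly a hundredfold.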
One Last Example -- Factorial Computation
Factorials are interesting to explore because we can program them using recursion! Let's see how Python and Cython handle this code.
As of the time of writing, the Python code snippet took ~0.14 seconds, while Cython took ~0.07 seconds.
While there is a measurable difference between the run times of computing 1900! in Python and Cython, the difference was not as extreme as in prior examples.
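A recursive factorial of the kind described above can be sketched like this. This is our own minimal version, not necessarily the notebook's code. One assumption worth calling out: CPython's default recursion limit is 1000, so computing 1900! recursively requires raising it first; the value 5000 is an arbitrary choice with some headroom.

```python
import math
import sys

# The default recursion limit (1000) is too low for a recursion depth of 1900
sys.setrecursionlimit(5000)


def factorial(n):
    # Base case: 0! and 1! are both 1
    if n <= 1:
        return 1
    # Recursive case: n! = n * (n - 1)!
    return n * factorial(n - 1)


big = factorial(1900)
print(f"1900! has {big.bit_length()} bits")
print("matches math.factorial:", big == math.factorial(1900))
```

Most of the time here is spent multiplying very large integers, which Python already does in C internally. That is one plausible reason why compiling this function with Cython gives a smaller speedup than in the earlier examples.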
Through our examples we saw some of the really cool things that Cython can do in terms of static typing and speed boosts, but we also came to realize that Python in and of itself is an incredible, all-purpose programming language. There is always a time and place for finding the right tool for the job, and with all of its libraries, extensions, and plugins, Python is a wise choice for an aspiring developer or data scientist. No matter how much compute power the task demands, the Python toolbox is versatile enough to contain something for everyone.
How do I get started with Cython?
Cython is usually included along with your Anaconda installation. If this isn't the case, you can install it through Anaconda or pip.
conda install cython
# or through pip
pip install cython
Cython in a Jupyter Notebook
To enable support for Cython compilation in a Jupyter Notebook, you need to load the Cython Jupyter extension.
This line should go in a cell all by itself:
%load_ext cython
To use Cython in a code snippet, prefix the code snippet with this line, which tells the cell to be compiled to C:
%%cython
# the rest of your code goes here!
Importing a Cython module in a Script File
To write Cython modules for use in your Python scripts outside of a notebook, the recommended option is to use a helper script.
You will need the Cython pip package to compile the code:
pip install cython
Your main Cython file should have the extension .pyx, and your helper file can be called something like setup.py.
The setup.py file should look something like this:
from setuptools import setup
from Cython.Build import cythonize

setup(
    name='Cython example application',
    ext_modules=cythonize("cython_example.pyx"),
    zip_safe=False,
)
The cython_example.pyx file can contain whatever Cython code you see fit, for example:
def example_function():
    cdef char* cython_string = "Hello World From a Cython Module!"
    print(cython_string)
To compile your Cython module, run the helper script as follows:
python setup.py build_ext --inplace
In the directory in which you compiled, there should now be a number of mysterious new folders and files. What actually happened is that you just built a module (aka library) for use in your other Python scripts (so long as you access the module from the directory in which you built it)!
To test this you can run an interactive Python shell in your terminal:
Then attempt to import the function from the module:
from cython_example import example_function
Now you can call example_function() to print a string that was created in Cython!
The opportunities for building longer modules are endless, and you can integrate them directly into your Python scripts as long as your code lives within the same directory that you built your Cython module from.
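One possible way to use the compiled module from a script is sketched below. It is our own suggestion rather than part of the walkthrough. It checks whether cython_example (the module name from setup.py above) can be found, and otherwise defines a hypothetical pure-Python stand-in so the script still runs in a directory where the module has not been built.

```python
import importlib.util

# find_spec returns None when no module with this name can be located
if importlib.util.find_spec("cython_example") is not None:
    # The compiled extension built by "python setup.py build_ext --inplace"
    from cython_example import example_function
else:
    # Hypothetical stand-in, used only when the Cython module is missing
    def example_function():
        print("cython_example is not built here; using a pure-Python stand-in.")

example_function()
```

With this pattern the rest of the script can call example_function() without caring which version it got, which is handy while you are converting code to Cython piece by piece.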