Python C Extensions
An interesting feature offered to developers by the CPython implementation is the ease of interfacing C code to Python.
Reasons to interface C with Python?
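Speed. C can be many times faster than Python for CPU-bound code; figures such as 50x are often quoted, but the actual speedup depends heavily on the workload.
Certain legacy C libraries work just as well as you want them to, so you don’t want to rewrite them in Python.
Low-level resource access, from memory to file interfaces.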
3 key methods
There are three key methods developers use to call C functions from their Python code:
ctypes
SWIG
Python/C API
Each method comes with its own merits and demerits.
CTypes
The easiest way to call C functions from Python.
Provides C compatible data types and functions to load DLLs so that calls can be made to C shared libraries without having to modify them. The fact that the C side needn’t be touched adds to the simplicity of this method.
Example
Simple C code to add two numbers, save it as add.c
Compile the C file to a .so file (a DLL on Windows), for example with gcc -shared -fPIC -o adder.so add.c, which generates an adder.so file.
Now in python:
Output:
The ctypes interface allows us to pass native Python integers and strings by default when calling C functions.
For other types, such as booleans or floats, we have to use the correct ctypes types. This is seen when passing parameters to adder.add_float(): we first create the required c_float values from Python floats, and then use them as arguments to the C code.
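The ctypes steps above can be sketched in a single script. This is a minimal sketch under several assumptions: a Unix-like system with gcc or cc on the PATH, and an add.c whose contents are guessed here (add_float is named in the text, add_int is a hypothetical companion). The helper names build_and_load and demo are illustrative only.

```python
import ctypes
import os
import shutil
import subprocess
import tempfile

# Assumed contents of add.c; add_int is a hypothetical companion to add_float.
ADD_C = """
int add_int(int a, int b) { return a + b; }
float add_float(float a, float b) { return a + b; }
"""


def build_and_load(workdir):
    """Compile add.c into adder.so and load it; return None if no compiler is found."""
    compiler = shutil.which("gcc") or shutil.which("cc")
    if compiler is None:
        return None
    src = os.path.join(workdir, "add.c")
    lib = os.path.join(workdir, "adder.so")
    with open(src, "w") as f:
        f.write(ADD_C)
    subprocess.run([compiler, "-shared", "-fPIC", "-o", lib, src], check=True)
    adder = ctypes.CDLL(lib)
    # Declare the C signatures so ctypes converts arguments and results correctly.
    adder.add_int.argtypes = [ctypes.c_int, ctypes.c_int]
    adder.add_int.restype = ctypes.c_int
    adder.add_float.argtypes = [ctypes.c_float, ctypes.c_float]
    adder.add_float.restype = ctypes.c_float
    return adder


def demo():
    """Return (int_sum, float_sum), or None when the library could not be built."""
    adder = build_and_load(tempfile.mkdtemp())
    if adder is None:
        return None
    int_sum = adder.add_int(4, 5)
    float_sum = adder.add_float(ctypes.c_float(5.5), ctypes.c_float(4.1))
    return int_sum, float_sum
```

Setting restype to c_float matters: without it ctypes assumes the C function returns an int and would misread the float result. With argtypes declared, plain Python floats could also be passed directly instead of explicit c_float objects.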
Clean, but limited: it’s not possible to manipulate objects on the C side.
SWIG
Simplified Wrapper and Interface Generator.
The developer must write an extra interface file, which is the input to SWIG (the command line utility).
Python developers generally don’t use this method, because it is in most cases unnecessarily complex.
Great method when you have a C/C++ code base, and you want to interface it to many different languages.
The C code, example.c, which has a variety of functions and variables
The interface file (this will remain the same irrespective of the language you want to port your C code to):
And now to compile it
Finally, the Python output
Slightly more involved effort. But it’s worth it if you are targeting multiple languages.
Python/C API
The most widely used method.
You can manipulate python objects in your C code.
Requires your C code to be specifically written for interfacing with Python code.
All Python objects are represented as a PyObject struct, and the Python.h header file provides various functions to manipulate it. For example, if the PyObject is a list (its type is PyList_Type), then we can use the PyList_Size() function on the struct to get the length of the list. This is equivalent to calling len(list) in Python. Most of the basic functions and operations that exist for native Python objects are made available in C via the Python.h header.
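The equivalence between PyList_Size() and len() can be checked without writing any C, because ctypes exposes the running interpreter's C API as ctypes.pythonapi. This is only a sketch for experimentation, assuming CPython; real extensions call these functions from C.

```python
import ctypes

# Handle to the C API of the running CPython interpreter.
api = ctypes.pythonapi

# Declare the C signature: Py_ssize_t PyList_Size(PyObject *list)
api.PyList_Size.argtypes = [ctypes.py_object]
api.PyList_Size.restype = ctypes.c_ssize_t

numbers = [10, 20, 30]
size_from_c = api.PyList_Size(numbers)

print(size_from_c == len(numbers))
```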
Example: write a C extension that adds all the elements in a python list (numbers)
The final interface we’d like to have, the Python file that uses the C extension:
The above looks like any ordinary python file, but the addList module is written in C!
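A sketch of what such a calling file might look like. The add function name follows from the addList_add naming convention used in this chapter; the fallback to the built-in sum is an assumption added here so the snippet still runs before the extension has been built.

```python
try:
    import addList  # the compiled C extension, once it has been built
    add = addList.add
except ImportError:
    # Hypothetical pure-Python stand-in used only when the extension is missing.
    add = sum

numbers = [1, 2, 3, 4, 5]
message = "Sum of list " + str(numbers) + " = " + str(add(numbers))
print(message)
```

Falling back to a pure-Python implementation when a compiled module is unavailable is a common pattern for optional C accelerators.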
The C code that gets built into the addList Python module:
A step-by-step explanation:
The <Python.h> file contains all the required types (to represent Python object types) and function definitions (to operate on Python objects).
Next we write the function which we plan to call from Python. Conventionally the function names are {module-name}_{function-name}, which in this case is addList_add. More about the function later.
Then we fill in the info table, which contains all the relevant info about the functions we want to have in the module. Every row corresponds to a function, with the last one being a sentinel value (a row of null elements).
Finally comes the module initialization block, which has the signature PyMODINIT_FUNC init{module-name} (in Python 3 it is PyMODINIT_FUNC PyInit_{module-name}).
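The four pieces listed above can be sketched as follows. To stay in one language, the C source is held in a Python string and written to disk by a small helper. Everything here is an assumption rather than the original listing: the file name adder.c, the docstrings, the helper write_source, and the use of the Python 3 form of the API (PyLong_AsLong, PyModuleDef, PyInit_addList) in place of the Python 2 names used in the text.

```python
import tempfile
from pathlib import Path

# Assumed Python 3 version of the extension source; not the original listing.
ADDER_C = r'''
#include <Python.h>

/* The function we plan to call from Python: {module-name}_{function-name}. */
static PyObject *addList_add(PyObject *self, PyObject *args) {
    PyObject *listObj;
    if (!PyArg_ParseTuple(args, "O", &listObj))
        return NULL;
    Py_ssize_t length = PyList_Size(listObj);
    int sum = 0;
    for (Py_ssize_t i = 0; i < length; i++) {
        PyObject *item = PyList_GetItem(listObj, i);  /* borrowed reference */
        sum += (int)PyLong_AsLong(item);
    }
    return Py_BuildValue("i", sum);
}

/* The info table: one row per function, ending with a sentinel row. */
static PyMethodDef addList_funcs[] = {
    {"add", addList_add, METH_VARARGS, "Add all the elements of a list."},
    {NULL, NULL, 0, NULL}
};

static struct PyModuleDef addList_module = {
    PyModuleDef_HEAD_INIT, "addList", "Adds up lists of numbers.", -1, addList_funcs
};

/* The module initialization block (Python 3 naming). */
PyMODINIT_FUNC PyInit_addList(void) {
    return PyModule_Create(&addList_module);
}
'''


def write_source(directory=None):
    """Write the sketch to adder.c; uses a fresh temp directory by default."""
    target = Path(directory or tempfile.mkdtemp()) / "adder.c"
    target.write_text(ADDER_C)
    return target
```

This sketch omits error handling for non-list arguments and non-integer elements, which a production extension would need.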
The function addList_add accepts its arguments as a PyObject struct (args is also a tuple, but since everything in Python is an object, we use the generic PyObject notion). The incoming arguments are parsed (basically, the tuple is split into individual elements) by PyArg_ParseTuple(). The first parameter is the argument variable to be parsed. The second parameter is a string that tells us how to parse each element in the args tuple. The character in the Nth position of the string gives the type of the Nth element in the args tuple: for example, ‘i’ means integer, ‘s’ means string and ‘O’ means a Python object. After that come multiple arguments, which are the locations where you would like PyArg_ParseTuple() to store the elements it has parsed. The number of such arguments is equal to the number of arguments the module function expects to receive, and positional integrity is maintained. For example, if we expected a string, an integer and a Python list in that order, the call would look like PyArg_ParseTuple(args, "siO", &s, &n, &list), with s, n and list declared as char *, int and PyObject * respectively.
In this case we only have to extract a list object and store it in the variable listObj. We then use the PyList_Size() function on our list object to get its length. This is similar to how you would call len(list) in Python.
Now we loop through the list and get each element using the PyList_GetItem(list, index) function. This returns a PyObject*. Since we know that the elements are Python integers (PyInt_Type), we just use the PyInt_AsLong(PyObject *) function (PyLong_AsLong in Python 3) to get the required value. We do this for every element and finally get the sum.
The sum is converted to a Python object and is returned to the Python code with the help of Py_BuildValue(). Here the “i” indicates that the value we want to build is a Python integer object.
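The same sequence of C API calls can be tried from Python through ctypes.pythonapi. This is a sketch for experimentation only, assuming CPython 3: PyLong_AsLong is the Python 3 name for PyInt_AsLong, and PyLong_FromLong stands in for Py_BuildValue("i", ...) because variadic C functions are awkward to call through ctypes. The function name add_list is hypothetical.

```python
import ctypes

api = ctypes.pythonapi

api.PyList_Size.argtypes = [ctypes.py_object]
api.PyList_Size.restype = ctypes.c_ssize_t

# PyList_GetItem returns a borrowed reference, so keep it as a raw pointer
# instead of letting ctypes take ownership of it.
api.PyList_GetItem.argtypes = [ctypes.py_object, ctypes.c_ssize_t]
api.PyList_GetItem.restype = ctypes.c_void_p

api.PyLong_AsLong.argtypes = [ctypes.c_void_p]
api.PyLong_AsLong.restype = ctypes.c_long

# PyLong_FromLong returns a new reference, which ctypes can own safely.
api.PyLong_FromLong.argtypes = [ctypes.c_long]
api.PyLong_FromLong.restype = ctypes.py_object


def add_list(list_obj):
    """Mirror the C loop: size, get each item, convert to long, build the result."""
    length = api.PyList_Size(list_obj)
    total = 0
    for index in range(length):
        item = api.PyList_GetItem(list_obj, index)
        total += api.PyLong_AsLong(item)
    return api.PyLong_FromLong(total)
```

The distinction between borrowed and new references is the main thing this sketch highlights: the C code does not need to release the items returned by PyList_GetItem, but it does hand ownership of the final integer object back to the caller.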
Now we build the C module. Save the following code as setup.py and run python setup.py install.
This should now build the C file and install it as the Python module we want.
After all this hard work, we’ll now test whether the module works:
And here is the output
So as you can see, we have developed our first successful Python C extension using the Python.h API. This method does seem complex at first, but once you get used to it, it can prove to be quite useful.
Other alternatives