Build CFFI wheels with Hatch

Jul 01, 2023 Tags: Python Cffi

In my previous post I suggested that you should use setuptools if you wanted to build and publish cffi extensions. At first I followed my own suggestion to publish these cffi bindings, but later I decided I wanted to try something different, to challenge myself but also to improve my understanding of how python packages are built. To achieve this I decided to use a relatively new python package manager, Hatch.

The following post is not a complete tutorial: I will only explain the general ideas and some tricky concepts. The complete code can be found here: cffi_build.py and hatch_build.py

Hatch plugin

I chose Hatch because, as I wrote in my previous post, it has a very nice and well documented plugin API. It is also very powerful: you can customize almost every step of the project life-cycle.

If we want to build a wheel for our ffi bindings we need to compile them during the build process and then include the compilation output in our wheel. To achieve this we need to write a build hook plugin. Hatch provides an easy way to do this: we only need to create a new file named hatch_build.py in the root of our repo and subclass BuildHookInterface, just like this

from hatchling.builders.hooks.plugin.interface import BuildHookInterface

class CustomBuildHook(BuildHookInterface):
    def initialize(self, version, build_data):
        if self.target_name != "wheel":
            return

Compilation step

Now we need to actually compile our project. Unfortunately during the build process our package isn't installed yet, so we can't import our build script like we usually do, and we need to get a bit creative. Fortunately cffi has already solved this problem, so I copied their solution from here and simplified it a little bit.

from pathlib import Path

from cffi import FFI

class CustomBuildHook(BuildHookInterface):
    def get_ffi_object(self, script: Path, ffi_name: str) -> FFI:
        with open(script) as f:
            src = f.read()
        code = compile(src, script, "exec")
        build_vars = {"__name__": "__cffi__", "__file__": script}
        exec(code, build_vars, build_vars)
        if ffi_name not in build_vars:
            raise NameError(f"{script!r} does not define {ffi_name!r}")
        ffi = build_vars[ffi_name]
        # The name may refer to a function that returns the FFI object
        if not isinstance(ffi, FFI):
            ffi = ffi()
        if not isinstance(ffi, FFI):
            raise TypeError(f"{ffi_name!r} did not produce an FFI object")
        return ffi
    ...

Instead of importing the module we get the source file, compile it, and then extract the ffi object from it. Now we just need to execute this function in the initialize function, with the correct parameters. Instead of hardcoding them in the code we can get them from pyproject.toml.

...
[tool.hatch.build.targets.wheel.hooks.custom]
cffi_modules = ["cffi_build.py:create_cffi"]
...

We follow the same syntax as setuptools-based cffi projects: the first part of the string is the path to the script, and the second is the name of the CFFI object or of a function that returns it.
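The "script:name" convention described above is easy to check in isolation. The following is only a sketch: parse_cffi_module is a hypothetical helper name, not part of Hatch or cffi, and splitting on the last colon is my own choice so that a path containing a colon would still parse.

```python
def parse_cffi_module(spec: str) -> tuple[str, str]:
    """Split "path/to/script.py:name" into (script, name).

    Hypothetical helper, shown only to illustrate the string format.
    """
    # rpartition splits on the LAST colon, so only the final part is the name
    script, sep, name = spec.rpartition(":")
    if not sep or not script or not name:
        raise ValueError(f"expected 'script.py:name', got {spec!r}")
    return script, name


parsed = [parse_cffi_module(s) for s in ["cffi_build.py:create_cffi"]]
```

Compared with a bare split(":"), this fails early with a readable message when an entry in pyproject.toml is malformed.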

...
def initialize(self, version, build_data):
    if self.target_name != "wheel":
        return

    cffi_config = [x.split(":") for x in self.config.get("cffi_modules", [])]

    for script, ffi_name in cffi_config:

        ffi = self.get_ffi_object(script, ffi_name)
        ffi.compile()
...
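The loading step used by get_ffi_object can be tried on its own, outside of a build hook. The sketch below is a simplified illustration under a few assumptions: load_object and the generated demo_build.py are made-up names, and a plain dict stands in for the FFI object so that the example runs without cffi installed.

```python
import tempfile
from pathlib import Path


def load_object(script: Path, name: str):
    """Run a script by path and return one of its top-level names.

    Hypothetical helper that mirrors the compile/exec idea, nothing more.
    """
    source = script.read_text()
    code = compile(source, str(script), "exec")
    # Fresh namespace; since __name__ is not "__main__", any
    # `if __name__ == "__main__":` block in the script is skipped.
    namespace = {"__name__": "__cffi__", "__file__": str(script)}
    exec(code, namespace, namespace)
    if name not in namespace:
        raise KeyError(f"{name!r} is not defined in {script}")
    obj = namespace[name]
    # Accept either the object itself or a zero-argument factory function
    return obj() if callable(obj) else obj


with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo_build.py"
    demo.write_text("def create():\n    return {'module': '_demo'}\n")
    loaded = load_object(demo, "create")
```

The script is never imported, so it does not need to be installed or reachable through sys.path.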

Including the build artifacts

Almost there. Now we only need to tell hatch to include the compiled file in our target wheel. To do this we modify build_data, a dictionary which is passed to our initialize function (you can read more about it here). We need to register each compiled file in it and set two more fields:

from pathlib import Path
...
def initialize(self, version, build_data):
    if self.target_name != "wheel":
        return

    cffi_config = [x.split(":") for x in self.config.get("cffi_modules", [])]

    for script, ffi_name in cffi_config:

        ffi = self.get_ffi_object(script, ffi_name)
        filepath = ffi.compile()

        build_data["force_include"][filepath] = Path(filepath).name

    build_data["pure_python"] = False
    build_data["infer_tag"] = True
...

build_data["force_include"] is itself a dictionary, in which each key is the path of an artifact file on the filesystem, and the value is the path of that artifact inside the wheel. The path on the filesystem is the return value of ffi.compile(). To get the path in the wheel we need only the last part of it, the filename; I used pathlib.Path to do this, but you can also use os.path.
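As a small illustration of the mapping described above, here is how one entry could be built. The path is a made-up example of what ffi.compile() might return, and force_include is a plain dict standing in for build_data["force_include"].

```python
import os.path
from pathlib import Path

# Hypothetical path, similar in shape to what ffi.compile() might return
filepath = "/tmp/build/_example.cpython-311-x86_64-linux-gnu.so"

# Key: where the file lives on disk. Value: where it goes inside the wheel.
force_include = {}
force_include[filepath] = Path(filepath).name

# The os.path equivalent gives the same filename
same_with_os_path = os.path.basename(filepath)
```

Using only the filename places the compiled module at the top level of the wheel.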

We also need to tell hatch that our wheel is not python-only, as it contains compiled files. To do this we simply set the pure_python field to False. Finally we need to specify the tag of our wheel. Fortunately hatch can infer it automatically, so we only need to set build_data["infer_tag"] to True.

We finally have a working hatch plugin. To build the wheel just type in your terminal

hatch build -t wheel

You can now happily delete your setup.py.
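If you want to double check that the compiled file really ended up in the wheel, remember that a wheel is just a zip archive. The following is a minimal sketch: compiled_members is a hypothetical helper, and the demo runs against a fake wheel created on the fly, so in a real project you would point it at the .whl file produced by the build instead.

```python
import tempfile
import zipfile
from pathlib import Path

COMPILED_SUFFIXES = (".so", ".pyd", ".dylib")


def compiled_members(wheel: Path) -> list[str]:
    """Return the names of compiled files stored in a wheel."""
    with zipfile.ZipFile(wheel) as archive:
        return [n for n in archive.namelist() if n.endswith(COMPILED_SUFFIXES)]


# Self-contained demo on a fake wheel (the file names are made up)
with tempfile.TemporaryDirectory() as tmp:
    fake = Path(tmp) / "demo-0.1-py3-none-any.whl"
    with zipfile.ZipFile(fake, "w") as archive:
        archive.writestr("demo/__init__.py", "")
        archive.writestr("_demo.abi3.so", b"")
    found = compiled_members(fake)
```

An empty result would suggest that the force_include entry was not picked up by the build.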

Compile without setuptools

Unfortunately we are not finished yet. You may not realize it, but under the hood ffi.compile() uses distutils, a python project which used to be part of the standard library but now lives inside setuptools. So we still need setuptools to build our wheel!

To ditch setuptools completely we thus need to roll our own equivalent of the compile function. I assume you are on linux, or a unix-like system, so we will use gcc to compile our code. We will need two invocations: one to compile our code into an object file, and one to link it into a shared object that can be imported by python.

But first we have to extract some information from the ffi object: its name, the libraries we have to link against, and their location. We also need the correct file extension for our shared object, which we can obtain using sysconfig, a python module in the standard library.

import sysconfig
...
for script, ffi_name in cffi_config:

    ffi = self.get_ffi_object(script, ffi_name)

    name, source, source_extension, kwds = ffi._assigned_source

    libraries = kwds["libraries"]
    library_dirs = kwds["library_dirs"]

    c_filename = name + ".c"
    o_filename = name + ".o"
    # EXT_SUFFIX already starts with a dot, e.g. ".cpython-311-x86_64-linux-gnu.so"
    so_filename = name + sysconfig.get_config_var("EXT_SUFFIX")

    build_dir = Path("build")
    c_path = build_dir / c_filename
    so_path = build_dir / so_filename

    ...

    build_data["force_include"][str(so_path)] = so_filename
...
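To see what sysconfig reports on your own machine, you can run a quick check like the one below. The module name _example is made up. Note that the value of EXT_SUFFIX already begins with a dot, so it is worth looking at the resulting filename before using it.

```python
import sysconfig

# Platform-specific suffix for extension modules; it starts with a dot
ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")

name = "_example"  # hypothetical module name
so_filename = name + ext_suffix

print(ext_suffix)
print(so_filename)
```

The exact value depends on the interpreter and platform, which is why it is read from sysconfig rather than hardcoded.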

Now we just need to call the compiler with the appropriate flags. To do this we will use the subprocess module.

import subprocess
...
    ffi.emit_c_code(str(c_path))
    compile_command = [
        "gcc",
        f"-I{sysconfig.get_path('include')}",
        f"-I{sysconfig.get_path('platinclude')}",
        "-fPIC",
        "-c",
        str(c_filename),
        "-o",
        str(o_filename),
    ]
    link_command = [
        "gcc",
        "-shared",
        str(o_filename),
        *[f"-L{libs_dir}" for libs_dir in library_dirs],
        *[f"-l{lib}" for lib in libraries],
        "-o",
        str(so_filename),
    ]
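    # check_call raises CalledProcessError if gcc exits with a non-zero status
    subprocess.check_call(compile_command, cwd=build_dir)  # nosec B603 B607
    subprocess.check_call(link_command, cwd=build_dir)  # nosec B603 B607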
...

There are two flags that require an explanation:

The -I flags built from sysconfig.get_path('include') and sysconfig.get_path('platinclude') are needed to compile the code on macOS, as the compiler doesn't find the python headers automatically.
-fPIC tells gcc to generate position-independent code, which is needed to successfully compile a cffi extension into a shared object. You can learn more about it here.

Apart from those two I think that the flags are very easy to understand if you have ever compiled a c file. Now everything is complete: you have successfully compiled a cffi extension without setuptools!
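The two gcc invocations can also be wrapped in small functions, which makes it possible to inspect the commands without running the compiler. This is only a sketch under the same assumptions as the rest of this section (gcc on a unix-like system); build_commands and run_build are hypothetical names, not part of Hatch or cffi.

```python
import subprocess
import sysconfig


def build_commands(name, libraries=(), library_dirs=(), cc="gcc"):
    """Return (compile_command, link_command) for a module called `name`."""
    so_filename = name + sysconfig.get_config_var("EXT_SUFFIX")
    compile_command = [
        cc,
        f"-I{sysconfig.get_path('include')}",
        f"-I{sysconfig.get_path('platinclude')}",
        "-fPIC",
        "-c", f"{name}.c",
        "-o", f"{name}.o",
    ]
    link_command = [
        cc, "-shared", f"{name}.o",
        *[f"-L{d}" for d in library_dirs],
        *[f"-l{lib}" for lib in libraries],
        "-o", so_filename,
    ]
    return compile_command, link_command


def run_build(build_dir, name, **kwargs):
    """Run both steps; check=True raises CalledProcessError on failure."""
    for command in build_commands(name, **kwargs):
        subprocess.run(command, cwd=build_dir, check=True)


# Only build the command lists here; nothing is compiled
compile_cmd, link_cmd = build_commands("_example", libraries=["m"])
```

Passing check=True means a failed compilation stops the build instead of silently producing a wheel without the shared object.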

Conclusion

As I said in the introduction, you can find the complete source code here and here. For the sake of brevity I left some things out of this post, like support for the ABI level and the out-of-line mode of cffi, and the structure of the real code is a little different. Still, I think that after reading this post navigating the code will be quite easy.

Anyway, I hope you liked this post! Let me know if you followed my suggestions and decided to build wheels using the Hatch package manager.