Wednesday, 20 June 2012

Cross-compiling Python for ARM, with muddle

This week I needed to cross-compile Python. The "local" platform is an x86, and the target is an ARM. My specific purpose is to be able to use the KBUS Python bindings on the target platform, which means I also care about things like ctypes.

Unfortunately, Python is not terribly simple to cross-compile, partly because it relies on building itself twice. Luckily, someone else has already documented what to do - namely Paul Gibson at http://randomsplat.com/id5-cross-compiling-python-for-embedded-linux.html. That blog post describes the problem, provides the necessary patches for several versions of Python, and then describes how to build it at the command line.

(At time of writing, it doesn't have a patch for Python 2.7.3, but 2.7.2 is recent enough for our purposes, and still available from http://www.python.org/download/releases/2.7.2)

So the issue becomes simply how to amend the procedure for a muddle environment.

As is explained in the RandomSplat blog post, we need to build Python twice, once for the host platform, and once for the target. In muddle, that means we need two roles, so our muddle build description needs to contain something like:

    # Python. Once on the host and once for the target
    make.twolevel(builder, 'python', [HOST_TOOLS], 'thirdparty', 'Python-2.7.2')
    make.twolevel(builder, 'python', [THIRDPARTY], 'thirdparty', 'Python-2.7.2')
    pkg.append_env_for_package(builder, 'python', [THIRDPARTY],
                                                'TARGET_CONFIGURE_TARGET', 'arm-none-linux-gnueabi')
    # We need the host Python to build the target Python
    do_depend(builder, 'python' , [THIRDPARTY], [ ('python', HOST_TOOLS) ])

The TARGET_CONFIGURE_TARGET environment variable is used to tell the muddle Makefile what our target architecture actually is.
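
To make the role of that variable a little more concrete, here is a minimal Python sketch (not part of muddle) of how the target triple relates to the values the Makefile later in this post passes to make. The function name cross_settings is hypothetical, and deriving CROSS_COMPILE by appending a hyphen to the triple is an assumption that happens to hold for arm-none-linux-gnueabi; the Makefile below simply hard-codes that prefix.

```python
import os
import platform


def cross_settings(environ=None):
    """Derive the cross-build variables from TARGET_CONFIGURE_TARGET.

    Hypothetical helper for illustration only; muddle does not provide it.
    """
    if environ is None:
        environ = os.environ
    target = environ['TARGET_CONFIGURE_TARGET']
    return {
        # The architecture the resulting Python will run on
        'HOSTARCH': target,
        # Assumption: the toolchain prefix is the triple plus a hyphen
        'CROSS_COMPILE': target + '-',
        # The machine we are building on, as reported by "uname -m"
        'BUILDARCH': platform.machine(),
    }
```

Passing a plain dictionary instead of os.environ makes it easy to try this outside a muddle build.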

Whilst we are at it, it is also worth telling other packages where to put their Python packages - site-packages is traditional, but we have to say where that is on the target system. So something like:

    # Tell KBUS where to put its Python module
    pkg.append_env_for_package(builder, 'kbus', [SUPPORT],
                                                'TARGET_PYTHON_SITE_DIR',
                                                 '/usr/lib/python2.7/site-packages')

is useful - the muddle Makefile for KBUS can then use the TARGET_PYTHON_SITE_DIR environment variable in its install target (and I hope eventually to update the muddle Makefile we supply with KBUS on Google Code to make use of such a value).
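
As an illustration of what an install step might do with that value, here is a small Python sketch. It assumes that the package installs under the directory named by MUDDLE_INSTALL, and that TARGET_PYTHON_SITE_DIR is a path as seen on the target, so the two are joined to get the location on the build machine. The function name python_install_dir is hypothetical, and this is not what the KBUS Makefile currently does.

```python
import os


def python_install_dir(environ=None):
    """Return the build-machine directory that maps to site-packages on the target.

    Hypothetical helper: assumes MUDDLE_INSTALL is the install root and
    TARGET_PYTHON_SITE_DIR is an absolute path on the target.
    """
    if environ is None:
        environ = os.environ
    install_root = environ['MUDDLE_INSTALL']
    site_dir = environ['TARGET_PYTHON_SITE_DIR']
    # Strip the leading '/' so os.path.join does not discard install_root
    return os.path.join(install_root, site_dir.lstrip('/'))
```

Stripping the leading slash matters: os.path.join throws away everything before an absolute component, which would otherwise install straight into the build machine's own /usr.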

So, I downloaded the Python package, unpacked it into its source directory, and made it available to the build - for instance:

  $ pushd src/thirdparty
  $ tar -zxvf Python-2.7.2.tgz
  $ cd Python-2.7.2
  $ git init
  $ git add *
  $ git commit -m 'Initial commit of Python 2.7.2'

I then applied the appropriate patches from the RandomSplat blog, and committed them. I carefully included the URL of the blog post in the commit message, for future reference.

(I also needed to make a remote repository available for Python 2.7.2, and do the "muddle import; muddle push" dance to link up to it, and so on - but I assume you know how to do that.)

Finally, we need a muddle Makefile to drive all of this. Given we've got two roles, we could specify a muddle Makefile for each of them, but I think it is actually easier, in this case, to have everything in a single file. So we start with the traditional:

    # Muddle makefile for python.
    CFLAGS  += $(MUDDLE_INCLUDE_DIRS:%=-I%)
    LDFLAGS += $(MUDDLE_LIB_DIRS:%=-L%)

and then some comments explaining why we've got two roles.

(By the way, Blogger doesn't seem to know what to do with tabs in my quotes, so I'm afraid there are many spaces instead. Still, you weren't going to cut-and-paste my Makefile examples anyway, were you?)

Then we handle the first (host) role:

    ifeq ($(MUDDLE_ROLE), host-tools)
    # Building on the host
    all:
            (cd $(MUDDLE_OBJ_OBJ); make python Parser/pgen)

    # When Python builds, it compiles all the .py files in $(MUDDLE_SRC)/Lib,
    # even if we're otherwise doing an out-of-tree build.
    # I can't see a way around it, so we'll copy the whole tree instead.
    config:
            -mkdir -p $(MUDDLE_OBJ_OBJ)
            $(MUDDLE) copywithout $(MUDDLE_SRC) $(MUDDLE_OBJ_OBJ) .git
            (cd $(MUDDLE_OBJ_OBJ); PYTHONPATH=  PYTHONHOME=  ./configure)

    install:
            # No need to install the host-tools Python

As the comments say, it seemed necessary to copy the source tree because of .py files getting compiled. Ah well.
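
For anyone unfamiliar with "muddle copywithout", the effect is roughly that of the following Python sketch, which copies a tree while leaving out the named directories. This is an approximation using the standard library, not muddle's own implementation; the function name copy_without is hypothetical, and unlike the Makefile above it assumes the destination does not exist yet.

```python
import shutil


def copy_without(src, dst, without=('.git',)):
    """Copy the tree at src to dst, skipping any entry named in 'without'.

    Illustrative only: shutil.copytree creates dst itself, so dst must not
    already exist (on older Pythons this is always required).
    """
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns(*without))
```

Leaving out .git keeps the object directory from carrying a second copy of the repository history around.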

Next we have the target part of the build. In the actual makefile, this includes some comments referencing the RandomSplat blog post, and repeating the actual command lines suggested there for building. This is useful context for the lines I actually use, but I shan't repeat them here. So:

    else
    # Cross compiling
    # We use the host-tools Python we assume we already built
    HOSTPYTHONDIR = $(shell $(MUDDLE) query objdir package:python{host-tools})/obj
    HOSTPYTHON = $(HOSTPYTHONDIR)/python
    HOSTPGEN = $(HOSTPYTHONDIR)/Parser/pgen

    all:
            (cd $(MUDDLE_OBJ_OBJ); \
             make \
                 PYTHONPATH= \
                 PYTHONHOME= \
                 CFLAGS='$(CFLAGS)' \
                 LDFLAGS='$(LDFLAGS)' \
                 HOSTPYTHON='$(HOSTPYTHON)' \
                 HOSTPGEN='$(HOSTPGEN)' \
                 CROSS_COMPILE_TARGET=yes \
                 CROSS_COMPILE=arm-none-linux-gnueabi- \
                 HOSTARCH=$(TARGET_CONFIGURE_TARGET) \
                 BUILDARCH=$(shell uname -m) \
            )

    # Interestingly, building this doesn't try to amend $(MUDDLE_SRC)/Lib, so
    # we don't need to copy the sources. Of course, the host-tools Python we
    # use in our build will be using its copy of the sources when it needs them,
    # but they are identical to those in $(MUDDLE_SRC), so we don't mind.
    #
    # To constrain the configure stage to only look at libraries in our
    # MUDDLE_LIB_DIRS, we set LD_LIBRARY_PATH to MUDDLE_PKGCONFIG_DIRS_AS_PATH
    config:
            -mkdir -p $(MUDDLE_OBJ_OBJ)
            (cd $(MUDDLE_OBJ_OBJ); \
             PYTHONPATH= \
             PYTHONHOME= \
             HOSTPYTHON='$(HOSTPYTHON)' \
             CFLAGS='$(CFLAGS)' \
             LDFLAGS='$(LDFLAGS)' \
             LD_LIBRARY_PATH=$(MUDDLE_PKGCONFIG_DIRS_AS_PATH) \
             $(MUDDLE_SRC)/configure \
                 --host=$(TARGET_CONFIGURE_TARGET) \
                 --prefix=/usr \
            )

    install:
            (cd $(MUDDLE_OBJ_OBJ); \
             make install \
                 PYTHONPATH= \
                 PYTHONHOME= \
                 CFLAGS='$(CFLAGS)' \
                 CPPFLAGS='$(CFLAGS)' \
                 LDFLAGS='$(LDFLAGS)' \
                 HOSTPYTHON='$(HOSTPYTHON)' \
                 DESTDIR=$(MUDDLE_INSTALL) \
                 CROSS_COMPILE_TARGET=yes \
                 CROSS_COMPILE=arm-none-linux-gnueabi- \
                 HOSTARCH=$(TARGET_CONFIGURE_TARGET) \
            )
    endif

The thing I had wrong here, for a while, was not specifying BUILDARCH. The patches use it to tell the setup.py code that builds _ctypes what to do, and leaving it out meant that ctypes support was not compiled. This is the sort of mistake one needs to look out for when taking someone else's command lines (which work) and removing things from them "because they're not needed" - sometimes they are needed after all!
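
One way to catch this sort of omission early is to look in the install tree for the _ctypes extension module once the build has finished. The following is a minimal sketch, not something muddle provides: the function name has_ctypes is hypothetical, and it assumes the --prefix=/usr and Python 2.7 layout used above, where extension modules end up in lib-dynload with a plain .so suffix.

```python
import os


def has_ctypes(install_root, version='2.7'):
    """Report whether the _ctypes extension was installed under install_root.

    Assumes prefix /usr and the stock Python 2.7 naming (_ctypes.so);
    other versions or distributions may name the file differently.
    """
    dynload = os.path.join(install_root, 'usr', 'lib',
                           'python' + version, 'lib-dynload')
    # Check the exact name, so that _ctypes_test.so does not count
    return os.path.isfile(os.path.join(dynload, '_ctypes.so'))
```

Calling it with the directory that was passed as DESTDIR (MUDDLE_INSTALL in the Makefile above) gives a quick yes or no, without needing to boot the target.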

Finally, we have some common trailing code:

    clean:
            (cd $(MUDDLE_OBJ_OBJ); make clean)

    distclean:
            rm -rf $(MUDDLE_OBJ_OBJ)

just as you might expect.

Meanwhile, I note that there was a talk on cross-compiling Python at PyCon 2012: https://us.pycon.org/2012/schedule/presentation/11/. There doesn't seem to be anything other than the abstract for the talk, but I do note that the last item in it is "Invitation to discuss ways to make Python more accessible to embedded developers", so maybe there's some impetus towards a solution building up.


