24.4.07. Python's Make Rake and Bake, another and again

Ian Bicking wrote a post recently titled “Python’s Makefile”. He advocates using / re-using distutils… er… setuptools. (I can’t keep them straight - they’ve both become absolute nightmares in my opinion). He then goes off about entry points, separate setup.cfg files, and other things that still go way over my head. The example he shows is convoluted, and I’m ultimately not entirely sure what he’s really advocating (besides the idea - which isn’t bad - of using the near-standard setup.py file/system instead of re-inventing).

But he mentions, earlier:

“Because really people are talking about something more like rake — something where you can put together a bunch of code management tools. These aren’t commands provided by the code, these are commands used on the code.

“We do have the infrastructure for this in Python, but no one is really using it. So I’m writing this to suggest people use it more: the setup.py file. So where in another environment someone does rake COMMAND, we can do python setup.py COMMAND.”

For me, having an easy way to say bla bla COMMAND isn’t as important as having a good system for automating common tasks that I and/or my colleagues do frequently. As we started to depend on more and more code from internal and external repositories, due to our increased re-use when building on Zope 3, I really needed to automate checkouts and exports. Not everything was neatly packaged as an egg, or the released egg didn’t have a bugfix applied, and I still don’t understand how to make eggs work well with Zope 3 in a manner that I’m comfortable with.

I was initially excited about zc.buildout as a way to automate the monotonous but important tasks that revolve around setting up both deployment and development environments. But I didn’t like how zc.buildout specified its tasks/commands in INI format. It was relatively easy to write new ‘recipes’, so I wrote some recipes to do Subversion and CVS checkouts/exports.

But the INI format just pissed me off. It didn’t fit my needs, basically, wherein I needed more conditional control. More code control. And managing complex sets of parameters required making new top-level sections instead of nesting. Before long I was staring at a very long and very narrow file. And in the end, it was building Zope in a way that wouldn’t work for us. So I abandoned it.

I briefly looked at some tools that let you write these task files in “pure” Python. Of these, SCons appeared to be the closest thing in Python to Rake, which uses Ruby. But SCons seemed far more focused on general compilation issues (compiling C, Java, etc.), and that’s never a problem that crosses my path.

I just wanted something like rake. What I liked about every Rakefile that I’ve seen is that it’s been quite readable. Rake makes common file / path commands readily available as Ruby methods, classes, and objects. Rake takes advantage of Ruby’s syntax, particularly blocks (and optional parentheses), in a way that makes it not seem like, well, Ruby. It looks like something makefile-ish, something shell-scripting-ish, etc. That’s what I wanted; but, of course, in Python.

So I came up with a system. It’s not yet released to the world - far from finished, and there are many competing ideas out there that I don’t feel like competing with - but it’s already proven to be very useful internally. Generally, it’s been used to automate what I mentioned above: retrieving software from multiple repositories, both Subversion and CVS, and placing them in the proper directories. In particular, we try to stick with certain revisions for third party dependencies, and I got tired of trying to capture this information in READMEs and other files that we could refer to when installing certain configurations. It’s even been useful for downloading such software and applying internal patches::

patch = Command('patch')

@task('mysqldbda')
def mysqldbda():
    """ Installs mysqldbda from subversion and applies patch """
    svn = Subversion('svn://svn.zope.org/repos/main')
    svn.co('mysqldbda/tags/mysqldbda-1.0.0', target='mysqldbda')

    # patch mysqldbda
    log.info("patching mysqldbda")
    patchfile = path('fixes/mysqlda.1-5-07.patch')
    if patchfile.exists():
        print patch.read('-p1', '-i', patchfile)

@task('formencode')
def formencode():
    svn = Subversion('http://svn.colorstudy.com/FormEncode')
    svn.co('tags/0.6/formencode')

task('install', ['mysqldbda', 'formencode'])

It’s also been useful for tasks like getting MochiKit and generating all sorts of packed versions. A lot of what makes this possible is the path.py module, which provides a more object-oriented interface over os, os.path, and other Python file utilities.

ROCKFILEPATH = globals().get('ROCKFILEPATH', path('.'))
MOCHIKIT_LIB = ROCKFILEPATH/'libs'/'mochikit'
MOCHIKIT_DL = ROCKFILEPATH/'mochikit_dl'
MOCHIKIT_SRC = MOCHIKIT_DL/'MochiKit'
SCRATCH = MOCHIKIT_LIB/'_scratch.js'
mochikit = namespace('mochikit')

@mochikit.task('get')
def getmochikit():
    if MOCHIKIT_DL.exists() and bool(MOCHIKIT_DL.listdir()):
        return
    svn = Subversion('http://svn.mochikit.com/mochikit')
    svn.co('trunk', target=MOCHIKIT_DL)

@mochikit.task('clearmochilib')
def clearmochilib():
    for jscript in MOCHIKIT_LIB.files('*.js'):
        jscript.remove()

@mochikit.task('make-noexport')
def makenoexport():
    info = Subversion().info(MOCHIKIT_DL)
    src = NOEXPORT.safe_substitute(**info)
    file(MOCHIKIT_LIB/'NoExport.js','w').write(src)

@mochikit.task('build', ['get', 'clearmochilib', 'make-noexport'])
def mochi_install():
    for source in MOCHIKIT_SRC.files('*.js'):
        log.info('copy %s -> %s' % (source, MOCHIKIT_LIB))
        source.copy(MOCHIKIT_LIB)

# Javascript Packing tools (JSPack not shown - essentially it's a wrapper
# around combining and piping Javascript through Dojo's custom_rhino.jar
# to use its compression system)
def packmodules(sourcedir, modules, target):
    mods = [ (sourcedir/mod) for mod in modules ]
    log.info('Packing %s modules', path(target).name)
    JSPack(mods, target).run()

    if SCRATCH.exists():
        SCRATCH.remove()

def jsmin(sources, target):
    packmodules(MOCHIKIT_LIB, sources, MOCHIKIT_LIB/'min'/target)

@mochikit.task('minimize')
def mochiMinimize():
    """
    Generates packed versions of most individual MochiKit files, while
    combining a few core ones together.
    """
    mindir = MOCHIKIT_LIB/'min'
    for jscript in mindir.files('*.js'):
        jscript.remove()
    jsmin(['NoExport.js', 'Base.js', 'Iter.js', 'DOM.js'], 'base-iter-dom.js')
    jsmin(['Style.js', 'Signal.js'], 'style-signal.js')
    jsmin(['Async.js'], 'async.js')
    jsmin(['Color.js'], 'color.js')
    # ...

mochikit.task('install', ['build', 'minimize']).comment('INSTALL!')

I don’t think this falls under the jurisdiction of setup.py (distutils/setuptools). Nor would I want to specify these as zc.buildout recipes and have a separate configuration file to then name all of the files and directories. And, being Python, I don’t really have to deal with compilation steps so I don’t need wrappers around gcc and friends. I’m not (yet) specifying how to build large deployment scenarios. I just need to automate some development tasks, and I need to be able to write them easily. I want to write them in Python, but I want to ensure that they don’t accidentally get imported into normal projects (hence, the files above don’t have a .py extension). And as this is a specialized task, I’ll allow myself to get away with Python shortcuts that I would never touch in normal development, such as import *. In fact, it’s the import * that gives me a lot of the common commands/tools, such as the classes for interacting with Subversion and CVS, managing working directories, etc.
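How the runner loads such an extension-less file isn't shown here, so the following is only a rough sketch of one plausible approach: read the file, compile it, and execute it in a namespace that already holds the shared tools. The names load_taskfile and Taskfile are hypothetical, and pre-populating the namespace is an assumption standing in for the import * described above.

```python
import os
import tempfile


def load_taskfile(filename, tools):
    """Execute an extension-less task file with `tools` available as globals.

    Hypothetical helper: the file is never imported as a module, so it
    cannot be picked up by accident from normal project code.
    """
    namespace = dict(tools)
    namespace['__file__'] = filename
    with open(filename) as handle:
        source = handle.read()
    # Passing the filename to compile() keeps it visible in tracebacks.
    exec(compile(source, filename, 'exec'), namespace)
    return namespace


# Demonstration with a throwaway task file (the name 'Taskfile' is made up).
with tempfile.TemporaryDirectory() as workdir:
    taskfile = os.path.join(workdir, 'Taskfile')
    with open(taskfile, 'w') as handle:
        handle.write("TARGET = join(BASE, 'libs')\n")
    result = load_taskfile(taskfile, {'join': os.path.join, 'BASE': workdir})
    target = result['TARGET']
    expected = os.path.join(workdir, 'libs')
```

Because the task file sees whatever the runner puts in its globals, a line like ROCKFILEPATH = globals().get('ROCKFILEPATH', path('.')) can pick up a value supplied by the runner and fall back to a default otherwise.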

This really stemmed from reading this article by Martin Fowler about people wanting to replace ant with Rake with the advent of JRuby. In the post, Martin states:

“The thing with build scripts is that you need both declarative and procedural qualities. The heart of a build file is defining tasks and the dependencies between them. This is the declarative part, and is where tools like ant and make excel. The trouble is that as builds get more complex these structures aren’t enough. You begin to need conditional logic; in particular you need the ability to define your own abstractions. (See my rake article for examples.)

“Rake’s strength is that it gives you both of these. It provides a simple declarative syntax to define tasks and dependencies, but because this syntax is an internal DomainSpecificLanguage, you can seamlessly weave in the full power of Ruby.”

At that point, I decided that this was the way to go: use Python decorators to wrap ‘task’ functions. The wrapper maintains dependency links, comments, and other things of interest to the internal system; and the wrapper allows the task name to be independent of the function name, allowing easier-to-type tasks for use from the file system. But the ‘task’ function is plain Python. Or, as some of the examples above show, task can be called without the @ symbol that makes it a decorator. Multiple callable actions can be added to a task, potentially allowing for a more ‘declarative’ style:

mochikit.task('minimize').using_action(
  JSMinMap(
    {'style-signal.js': ['Style.js', 'Signal.js']},
    {'async.js': ['Async.js']},
  ))

Useful, I imagine, for very common patterns. Er. “Recipes”. In any case, it’s a very useful kind of tool. It beats any setup.py-, INI-, or XML-based automation language any day.
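For the curious, the decorator idea can be sketched in a few dozen lines. This is not the actual internal system, just a minimal illustration under my own assumptions: Task, TASKS, and run are made-up names, namespaces are left out, and there is no reporting of circular dependencies. It does show the three pieces described above: a task name that is independent of the function name, a task() that works with or without the @, and prerequisites that run before the task's own actions.

```python
# Hypothetical sketch: Task, TASKS, task() and run() are illustrative names,
# not the API of the unreleased system described in this post.
TASKS = {}


class Task:
    """A named unit of work with prerequisites and zero or more actions."""

    def __init__(self, name):
        self.name = name
        self.depends = []
        self.actions = []
        self.description = ''

    def __call__(self, func):
        # Decorator use: register func as an action and return the Task,
        # so the task name stays independent of func.__name__.
        self.actions.append(func)
        self.description = self.description or (func.__doc__ or '').strip()
        return self

    def using_action(self, action):
        """Attach another callable without decorator syntax."""
        self.actions.append(action)
        return self

    def comment(self, text):
        self.description = text
        return self


def task(name, depends=()):
    """Create the task called `name`, or extend it if it already exists."""
    current = TASKS.setdefault(name, Task(name))
    current.depends.extend(d for d in depends if d not in current.depends)
    return current


def run(name, done=None):
    """Run prerequisites depth-first, then the task's actions, once each."""
    done = set() if done is None else done
    if name in done:
        return
    done.add(name)  # marked before recursing, so a cycle cannot loop forever
    current = TASKS[name]
    for dep in current.depends:
        run(dep, done)
    for action in current.actions:
        action()


# Usage: two decorated tasks plus a purely declarative one.
ran = []


@task('get')
def fetch_sources():
    """Pretend to check out sources."""
    ran.append('get')


@task('build', ['get'])
def build_everything():
    ran.append('build')


task('install', ['build', 'get']).comment('INSTALL!')
run('install')
```

After run('install'), each prerequisite has executed exactly once, in dependency order, even though 'get' is reachable by two routes. Returning the Task from every method is what lets calls chain, as in task(...).comment(...).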
