4.5.07. ABC may be easy as 123, but it can't beat zope.interface

I guess the deadline may have come and gone for getting in PEPs for Python 3000. Guido’s already written up a PEP Parade.

Of particular interest to me has been the appearance of PEPs for Abstract Base Classes (PEP 3119) and the more exhaustive PEP 3124 which covers “Overloading, Generic Functions, Interfaces, and Adaptation.”

Both of these aim to provide ways of saying “this is file-ish”, “this is string-ish,” without requiring subclassing from a concrete “built-in” type/class. But I think they both fall short a little bit, while zope.interface (from the Zope 3 family) provides the best solution.

PEP 3119 (Abstract Base Classes) has a section covering comparisons to alternative techniques, and it specifically mentions “For now, I’ll leave it to proponents of Interfaces to explain why Interfaces are better.” So this is my brief attempt at explaining why.

A quote from PEP 3119 that I particularly like is “Like all other things in Python, these promises are in the nature of a gentlemen’s agreement…” The Interfaces as specified and used in Zope 3 and some other systems are the same way. They are not “bondage and discipline” Interfaces. They are not the ultra-rigid Eiffel contracts, nor are they the rigid and limited Interfaces used by Java. They are basically a specification, and they can be used (as mentioned in PEP 3119) to provide additional metadata about a specification. There are some simple tools in zope.interface.verify to check an implementation against a specification, but those are typically used in test suites; they’re not enforced hard by any system.

The agreement might be “I need a seekable file”, which might mean it expects the methods/messages ‘read’, ‘seek’, and ‘tell’. If you only provide ‘read’ and ‘seek’, then it’s your fault for not living up to the agreement. That’s no different than the Python of today. What Interfaces and Abstract Base Classes aim to provide is a better clarification of what’s expected. Sometimes “file-like” in Python (today) means it just needs a ‘read’ method; sometimes it means the full suite of file methods (read, readlines, seek, tell). Same thing with sequences: sometimes it just means “something iterable”; other times it means “supports append and extend and pop”.
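
To make that concrete, here's a toy sketch in plain modern Python (not zope.interface itself; require_seekable and ReadOnlyThing are made-up names for illustration) of the kind of check that the tools in zope.interface.verify perform against a "seekable file" agreement:

```python
import io

# The informal "seekable file" agreement: these methods must exist.
SEEKABLE_FILE_METHODS = ('read', 'seek', 'tell')

def require_seekable(obj):
    """Raise TypeError naming any missing methods; return True otherwise."""
    missing = [name for name in SEEKABLE_FILE_METHODS
               if not callable(getattr(obj, name, None))]
    if missing:
        raise TypeError('not seekable-file-ish, missing: ' + ', '.join(missing))
    return True

class ReadOnlyThing(object):
    """Provides only 'read' and 'seek', so it breaks the agreement."""
    def read(self, size=-1):
        return ''
    def seek(self, pos):
        pass

print(require_seekable(io.BytesIO(b'data')))  # True: a real file object passes
try:
    require_seekable(ReadOnlyThing())
except TypeError as e:
    print(e)  # not seekable-file-ish, missing: tell
```

In a test suite this plays the role verifyObject plays in Zope 3: the agreement is checked where you care about it, not enforced everywhere.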

Another side benefit of Interfaces as specification is that they provide a common language for, well, specifications. Many PEPs propose some sort of API, especially informational PEPs like WSGI (PEP 333) or API for Cryptographic Hash Functions (PEP 247). I’ll use PEP 247 as an example for my attempt at explaining why Zope 3’s Interfaces are Better.

A problem with Abstract Base Classes is this: they’re limited to classes. Even when PEP 3119 mentions Interfaces, it does so like this:

“Interfaces” in this context refers to a set of proposals for additional metadata elements attached to a class which are not part of the regular class hierarchy…

It then goes on to mention that such specifications (in some proposals and implementations) may be mutable, and says that's a problem, since classes are shared state and someone could mutate the specification and violate its intent. That's a separate discussion that I'm not going to go into here.

What is important is this severely limited focus on classes. zope.interface works on objects as well: not just normal 'instance of a class' objects, but classes themselves, and also modules.

There are two important verbs in zope.interface: implements and provides. provides is the most important one - it means that this Object, whatever that object may be, provides the specified interface directly.

implements is often used in class definitions. It means “instances of this class will provide the specified interface”. It can also be thought of in terms of Factories and/or Adaptation - “calling this object will give you something that provides the desired interface.”

“What does that matter?” you might ask. Well, there are all sorts of ways to compose objects in Python. A module is an object. It has members. A class is an object. An instance of a class is, of course, an object. Functions and methods are also objects in Python, but for the most part what we care about here are Modules, Classes, and Instances.

Because when it comes down to actual usage in code, it doesn’t particularly matter what an object is. In PEP 3124, the author (Phillip J. Eby) shows the following interface:

class IStack(Interface):
    @abstract
    def push(self, ob):
        """Push 'ob' onto the stack"""

    @abstract
    def pop(self):
        """Pop a value and return it"""

Ignore the @abstract decorators, as they’re artifacts of the rest of his PEP and/or related to PEP 3119. What is important is the use of self.

“self” is an artifact of implementation that is invisible in use. Sure, you can write a Stack implementation like this (note: I’m going to use zope.interface terminology and style from here on out):

import zope.interface

class Stack(object):
    zope.interface.implements(IStack)

    def __init__(self):
        self._stack = []

    def push(self, ob):
        self._stack.append(ob)

    def pop(self):
        return self._stack.pop()

But when it’s being used, it’s used like this:

def do_something_with_a_stack(stack):
    # ...
    top = stack.pop()

stack_instance = Stack()
print IStack.providedBy(stack_instance)
# True
print IStack.providedBy(Stack)
# False

stack_instance.push(1)
# works fine
Stack.push(1)
# raises an exception because `Stack.push(1)` is passing `1` 
# to `self`.. unbound method, bla bla bla.

Notice that there is no ‘self’ reference visibly used when dealing with the IStack implementation. This is an extremely important detail. What are some other ways that we might provide the IStack interface?

One way is to do it with class methods and properties, effectively making a singleton. (This isn’t a good way to do it, and is just here as an example).

import zope.interface

class StackedClass(object):
    zope.interface.classProvides(IStack)

    _stack = []

    @classmethod
    def push(class_, ob):
        class_._stack.append(ob)

    @classmethod
    def pop(class_):
        return class_._stack.pop()

print IStack.providedBy(StackedClass)
# True

StackedClass.push(1)
# this time it works, because `StackedClass.push(1)` is a class method,
# and is passing `StackedClass` to the `class_` parameter, and `1` 
# to `ob`.

Another variation of the above is using Static Methods:

import zope.interface

class StaticStack(object):
    zope.interface.classProvides(IStack)

    _stack = []

    @staticmethod
    def push(ob):
        StaticStack._stack.append(ob)

    @staticmethod
    def pop():
        return StaticStack._stack.pop()

Again, StaticStack.push(1) and StaticStack.pop() work fine. Now let’s try a third way - in a module! Let’s call this module mstack (file: mstack.py):

import zope.interface

# (assumes IStack is importable from wherever it was defined)
zope.interface.moduleProvides(IStack)

_stack = []

def push(ob):
    _stack.append(ob)

def pop():
    return _stack.pop()

Then in other code:

import mstack

mstack.push(1)
mstack.push(2)

print IStack.providedBy(mstack)
# True

print mstack.pop()
# 2

So whether we’re dealing with the instance in the first example (stack_instance), the classes in the second two examples (StackedClass and StaticStack), or the module in the last example (mstack), they’re all objects that live up to the IStack agreement. So having self in the Interface is pointless. self is a binding detail.
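
To underline that last point, here's a small sketch in plain modern Python (no zope.interface needed; drain and the two stack flavors are made up for illustration): one consumer, two providers with completely different binding machinery, and the consumer can't tell them apart.

```python
def drain(stack):
    """Use any object honoring the IStack agreement; no 'self' in sight."""
    stack.push('a')
    stack.push('b')
    return stack.pop()

class InstanceStack(object):
    # Binding machinery: instance methods ('self').
    def __init__(self):
        self._stack = []
    def push(self, ob):
        self._stack.append(ob)
    def pop(self):
        return self._stack.pop()

class ClassStack(object):
    # Binding machinery: class methods ('class_').
    _stack = []
    @classmethod
    def push(class_, ob):
        class_._stack.append(ob)
    @classmethod
    def pop(class_):
        return class_._stack.pop()

print(drain(InstanceStack()))  # b
print(drain(ClassStack))       # b
```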

Jim Fulton, the main author of zope.interface, taught me this a long time ago. Because in Zope 2, you could also make an IStack implementation using a Folder and a pair of Python Scripts. Well, those Python Scripts (as used in Zope 2 “through-the-web” development) have at least 4 binding arguments. Instead of ‘self’, the initial arguments are context, container, script, and traverse_subpath. Just like self is automatically taken care of by the class-instance binding machinery, the four Zope Python Script binding arguments are automatically taken care of by Zope 2’s internal machinery. You never pass those arguments in directly; you just use them like push(ob) and pop().

So there it is - many ways to provide this simple “Stack” Interface. And I believe that both PEP 3119 and PEP 3124 are short-sighted by focusing on the class-instance relationship exclusively (or so it appears).

And since many objects, particularly instances, are mutable, one could compose an IStack implementation on the fly.

class Prototype(object):
    """ Can be anything... """

pstack = Prototype()
pstack._stack = []

def pstack_push(ob):
    pstack._stack.append(ob)

def pstack_pop():
    return pstack._stack.pop()

pstack.push = pstack_push
pstack.pop = pstack_pop

# Now we can say that this particular instance provides the IStack
# interface directly - has no impact on the `Prototype` class
zope.interface.directlyProvides(pstack, IStack)

pstack.push(1)
pstack.push(2)
print pstack.pop()
# 2

# We can remove support as well
del pstack.push
zope.interface.noLongerProvides(pstack, IStack)

An example of dynamically constructed objects in the real world: a network services client, particularly one for an overwrought distributed object system (CORBA, SOAP, and other things that make you cry in the night). Dynamic local ‘stub’ objects may be created at run time, but those can still be said to provide a certain interface.

So now let’s look at whether it matters that you’re dealing with a class or not:

@zope.interface.implementer(IStack)
def PStack():
    pstack = Prototype()
    pstack._stack = []

    def pstack_push(ob):
        pstack._stack.append(ob)

    def pstack_pop():
        return pstack._stack.pop()

    pstack.push = pstack_push
    pstack.pop = pstack_pop
    zope.interface.directlyProvides(pstack, IStack)

    return pstack

def StackFactory():
    # Returns a new `Stack` instance from the earlier example
    return Stack()

import mstack
import random

def RandomStatic():
    # chooses between the two class-based versions and the module
    return random.choice([StackedClass, StaticStack, mstack])

All three are factories that will return an object that provides IStack, which is exactly what the Stack class in the first example does: it claimed implements(IStack), so when the class is instantiated/called, a new object is made that provides the IStack interface. In Python, another thing that doesn’t really matter is whether something is a class or a function. All of the following lines of code yield a result that is the same to the consumer. The internal details of what is returned may vary, but the IStack interface works on all of them:

Stack()         # class
PStack()        # 'Prototype' dynamically constructed object
StackFactory()  # Wrapper around basic class
RandomStatic()  # Chooses one of the class/static method implementations.

And whether we’re looking at the class implementation, or any of the factory based implementations, the result should be the same:

print IStack.implementedBy(Stack)    # class
# True
print IStack.providedBy(Stack)
# False
print IStack.providedBy(Stack())
# True

print IStack.implementedBy(PStack)   # Factory
# True
print IStack.providedBy(PStack)
# False
print IStack.providedBy(PStack())
# True

No matter which method of instantiation is used, they should all pass the verifyObject check, which checks whether all of the specified members are provided and that the method/function signatures match the specification:

from functools import partial
from zope.interface.verify import verifyObject

verify_stack = partial(verifyObject, IStack)

all(verify_stack(s) for s in [Stack(), PStack(), StackFactory(), RandomStatic()])
# True
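
For a rough idea of what that signature check involves, here's a toy approximation in plain modern Python using the inspect module (toy_verify and ISTACK_SPEC are made-up names; zope.interface's real implementation differs):

```python
import inspect

# A hypothetical spelling of the IStack spec: member name -> parameter names.
ISTACK_SPEC = {'push': ['ob'], 'pop': []}

def toy_verify(spec, obj):
    """Check that members exist and their signatures match the spec."""
    for name, expected in spec.items():
        member = getattr(obj, name, None)
        if not callable(member):
            raise TypeError('missing member: ' + name)
        # Bound methods have already had self/class_ stripped by Python,
        # which is exactly why the spec never mentions a binding argument.
        params = list(inspect.signature(member).parameters)
        if params != expected:
            raise TypeError('%s has parameters %s, expected %s'
                            % (name, params, expected))
    return True

class GoodStack(object):
    def __init__(self):
        self._stack = []
    def push(self, ob):
        self._stack.append(ob)
    def pop(self):
        return self._stack.pop()

print(toy_verify(ISTACK_SPEC, GoodStack()))  # True
```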

Now the class-based options will fail the implementedBy check, because it's the Class itself that provides the implementation, not its instances (unlike Stack):

print IStack.implementedBy(StackedClass)
# False
print IStack.providedBy(StackedClass)
# True
print IStack.providedBy(StackedClass())
# False

“OK”, you might say, “but still, why does it matter? Why might we really care about whether these abstract specifications work only with classes? It seems smaller, simpler.”

The main advantage is that a specification should (generally) make no assumptions about implementation. If the specification, aka the “gentlemen’s agreement”, is met, it shouldn’t matter whether it’s provided by a class, an instance, a module, an extension module, or some dynamically constructed object. The specification language should be the same.

Going back to PEP 247, the “cryptographic hash API”: there is a specification in that document about what the ‘module’ must provide and what the hash objects must provide. Consider also the WSGI spec, the DB-API specs, and all of the other formal and informal specs that are floating around just in the PEPs. Using zope.interface, those specifications can be spelled out in the same fashion. WSGI just cares about a particular function name and signature. It can be provided by a single function in a simple module, or as a method from an object put together by a large system like the full Zope 3 application framework and server. It just wants a callable. This is a little bit ugly in zope.interface, but in practice I think it works. Here’s how it could be specified:

class IWSGIApplication(Interface):
    def __call__(environ, start_response):
        """ Document the function """
    # and/or use tagged values to set additional metadata

This just means that a WSGIApplication must be a callable object taking environ and start_response arguments. A callable object may be a function (taken from PEP 333):

def simple_app(environ, start_response):
    """Simplest possible application object"""
    status = '200 OK'
    response_headers = [('Content-type','text/plain')]
    start_response(status, response_headers)
    return ['Hello world!\n']

Or a class (here, __init__ is the callable that matches the spec). The WSGI spec might also state that the result “should be iterable (support __iter__)”. Maybe that’s loosely enforced, but the following example shows how a class can make separate declarations about what the class directly provides and what its instances implement. Instead of using any decorators or magic-ish “class decorators” (the implements and classProvides calls above), we’ll make the declarations for both AppClass and simple_app in the same manner, after the fact, which matches the style in PEP 3124.

class AppClass(object):
    def __init__(self, environ, start_response):
        self.environ = environ
        self.start = start_response

    def __iter__(self):
        status = '200 OK'
        response_headers = [('Content-type','text/plain')]
        self.start(status, response_headers)
        yield "Hello world!\n"

from zope.interface import directlyProvides, classImplements

# Both 'simple_app' and 'AppClass' are callable with the same arguments,
# so they both *provide* the IWSGIApplication interface

directlyProvides(simple_app, IWSGIApplication)
directlyProvides(AppClass, IWSGIApplication)

# And we can state that AppClass instances are iterable by supporting
# some phantom IIterable interface
classImplements(AppClass, IIterable)
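
Here's a self-contained sketch in plain modern Python of how a consumer would drive both shapes identically (the collect helper and its start_response stub are made up for illustration; they aren't a real WSGI server):

```python
# Both a plain function and a class are driven through the same
# IWSGIApplication-shaped call; the consumer can't tell them apart.

def simple_app(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    return ['Hello world!\n']

class AppClass(object):
    def __init__(self, environ, start_response):
        self.environ = environ
        self.start = start_response

    def __iter__(self):
        self.start('200 OK', [('Content-type', 'text/plain')])
        yield 'Hello world!\n'

def collect(app):
    """Call any IWSGIApplication-shaped object and gather its response."""
    seen = {}
    def start_response(status, headers):
        # A made-up test double standing in for a real server's callback.
        seen['status'] = status
    body = list(app({}, start_response))
    return seen['status'], body

print(collect(simple_app))  # ('200 OK', ['Hello world!\n'])
print(collect(AppClass))    # same result: the consumer can't tell the difference
```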

What are the benefits of this, beyond just having a common way of spelling specifications? Instead of, or in addition to, abstract base classes, the core Python libraries could include all of these specs, even if they don’t provide any concrete implementation. Then I could have a unit test in my code that uses verifyClass or verifyObject to ensure I stay in line with the specification:

from zope.interface.verify import verifyClass

def test_verifySpec(self):
    verifyClass(ICryptoHash, MyHashClass)
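
To tie that to PEP 247 concretely: the spec says a hashing object must offer update, digest, hexdigest, copy, and a digest_size attribute. The toy checker below (plain modern Python; looks_like_pep247_hash is a made-up name standing in for a verifyObject(ICryptoHash, obj) call) verifies that shape against the standard library's hashlib:

```python
import hashlib

# Members PEP 247 requires of a hashing object.
PEP247_METHODS = ('update', 'digest', 'hexdigest', 'copy')

def looks_like_pep247_hash(obj):
    """A toy structural check, standing in for verifyObject(ICryptoHash, obj)."""
    has_methods = all(callable(getattr(obj, m, None)) for m in PEP247_METHODS)
    has_size = isinstance(getattr(obj, 'digest_size', None), int)
    return has_methods and has_size

h = hashlib.sha256()
print(looks_like_pep247_hash(h))         # True
print(looks_like_pep247_hash(object()))  # False
```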

Then, if the specification changes in a new version of Python or in a new version of someone else’s library or framework, I can be notified.

Or if the specification undergoes a big change, a new spec could be written, such as IWSGI2Application. Then by process of adaptation (not covered in this post) or interface querying, a WSGI server could respond appropriately to implementations of the earlier spec:

if IWSGI2Application.providedBy(app):
    # Yay! We don't have to do anything extra!
    # ... do wsgi 2 work
elif IWSGIApplication.providedBy(app):
    # We have to set up the old `start_response` object
    # ... do wsgi 1 work
else:
    raise UnsupportedOrUndeclaredImplementation(app)

Adaptation could provide a means of doing the above… (still, not going into the details.. trying not to!)

def wsgi1_to_wsgi2(app):
    return wsgi2wrapper(app)

# And then, replacing the `if/elif/else` above:
wsgi_app = IWSGI2Application(app, None)
if wsgi_app is None:
    raise UnsupportedOrUndeclaredImplementation(app)
# ... do wsgi 2 work

When you have both specification and adaptation, then you can write your code against the spec. In the above example, the main code does IWSGI2Application(app, None) which means “for the object app, give me an object that provides IWSGI2Application, or None if there is no means of providing that interface.”

If app provides that interface directly, then app is returned directly. Otherwise an adaptation registry is found, and it’s queried for a callable object (an adapter) that will take ‘app’ as its argument and return an object that provides IWSGI2Application.
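
The query-or-adapt semantics can be sketched with a toy registry in plain modern Python (ToyInterface, the __provides__ attribute, and wsgi2wrapper are all made-up illustrations here, not zope.interface's actual machinery, which uses real adapter registries):

```python
# A toy adaptation registry illustrating the semantics of
# IWSGI2Application(app, default): return the object if it already
# provides the interface, else look up an adapter, else the default.

class ToyInterface(object):
    def __init__(self, name):
        self.name = name
        self._adapters = {}   # maps "provided interface" -> adapter callable

    def register_adapter(self, from_iface, adapter):
        self._adapters[from_iface] = adapter

    def __call__(self, obj, default=None):
        provided = getattr(obj, '__provides__', None)
        if provided is self:
            return obj                    # already provides: pass through
        adapter = self._adapters.get(provided)
        if adapter is not None:
            return adapter(obj)           # adapt from what it does provide
        return default                    # no means of providing the interface

IWSGIApplication = ToyInterface('IWSGIApplication')
IWSGI2Application = ToyInterface('IWSGI2Application')

def wsgi2wrapper(app):
    # Stand-in for real translation work between the two specs.
    wrapper = lambda environ: app
    wrapper.__provides__ = IWSGI2Application
    return wrapper

IWSGI2Application.register_adapter(IWSGIApplication, wsgi2wrapper)

old_app = lambda environ, start_response: ['Hello\n']
old_app.__provides__ = IWSGIApplication

wsgi_app = IWSGI2Application(old_app, None)
print(wsgi_app is not None)               # True: adapted via wsgi2wrapper
print(IWSGI2Application(object(), None))  # None: no declaration, no adapter
```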

Another example: Python 3000 is going to change a lot of core specifications and implementations, such as the attributes of function objects (func_code, func_defaults, etc). If an IPy2Function interface were made (and zope.interface or something like it were added to Python 2.x), then code that works with function object internals could program against its preferred spec by adding a line of code:

func = IPy2Function(func)
if my_sniffer(func.func_code):
    raise Unsafe(func)

On Python 2, you’d get the regular function straight through. In Python 3000 / 3.0, an adapter would translate __code__ into func_code, for example. I don’t expect this to happen in reality, but it’s an example of how migration paths could be made between two major software versions, allowing code to run in both.
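
As a sketch of what such an adapter could look like on Python 3 (Py2FunctionAdapter is a made-up name; nothing like this actually shipped):

```python
# On Python 3, functions expose __code__/__defaults__; code written
# against the old Python 2 spec expects func_code/func_defaults.
# A made-up adapter bridging the two attribute spellings:

class Py2FunctionAdapter(object):
    def __init__(self, func):
        self._func = func

    @property
    def func_code(self):
        return self._func.__code__

    @property
    def func_defaults(self):
        return self._func.__defaults__

    def __call__(self, *args, **kw):
        return self._func(*args, **kw)

def greet(name, punctuation='!'):
    return 'Hello, %s%s' % (name, punctuation)

func = Py2FunctionAdapter(greet)
print(func.func_code is greet.__code__)  # True
print(func.func_defaults)                # ('!',)
print(func('world'))                     # Hello, world!
```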

By taking advantage of this system, my company has seen more re-use with Zope 3 than at any time in our company history. And because (most of) Zope 3 is programmed against specification, we’ve been able to plug in or completely make over the whole system by providing alternative implementations of core specs. This is very hard to do in native Zope 2 (the CMF, on which Plone is based, was probably the first Zope system that started these concepts, which Plone and others were able to take advantage of by providing new tools that matched the provided spec).

At the heart of it, again, is the gentlemen’s agreement, but brought out in full: it doesn’t matter who you are or where you came from (ie, it doesn’t matter what classes are in your family tree or if you are a simple module), as long as you get the job done. There’s a simple contract, and as long as the contract is fulfilled, then everybody is happy.

But if the gentlemen involved can only come from the class system, then there’s still a nasty aristocracy that excludes a large chunk of the populace, all of whom can potentially fulfill the contract. Let’s not cause an uprising, OK?
