27.9.05. Easy Complex Content, nth try

Throughout the years, I've been fascinated by compound document technology. I still pine for OpenDoc. Most of my work has been on the web, and every few jobs I seem to end up doing content management. Whenever I do, it's seldom as simple as "entering some text on a page".

I have proposals, diagrams, and outlines going back to my previous major employer. But most of my memory of getting to really deal with the problem stems from my time when I came back to Utah. I continued to use the Zope platforms through various stages of minor and self employment before landing in my current situation (where Zope is still used). Three generations of working with building compound documents come to mind.

Generation 1 - CMF Based, OpenDoc Influenced, Dynamic Structure
We had a customer that provided us with the basic structural layout of the site that they needed to manage. They needed to be able to add new pages, and those pages needed to include images, lists, etc, as well as text. I decided this was a great time to try my compound document ideas. Taking a page from OpenDoc, I decided that a document was a very basic element that was just composed of parts. In fact, the Document basically held a 'root container' part. Container parts held other parts, such as text, images, and so on. Parts could, theoretically, be quite complex. Container parts would manage sub-part creation, access, and ordering. In implementation, I believe they derived from Zope's ObjectManager base class(es). For this customer, they could create a new document and then just add parts to it. They could only edit one part at a time, but its editor would display in-place with the rest of the document. Parts could also be re-ordered. Images could be dropped in between paragraphs and told to flow on the right, left, or center. The content editor did not have to type in any code to refer to images. Navigation to other documents was also managed by another drop-in part. Most folders have index pages that contain a navigation part (listing sibling and one-folder-down content) and a text part to give a folder table of contents some meaning.
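To make the parts idea concrete, here is a minimal sketch in plain Python. It is not the original CMF code: every class and method name below is made up for illustration, and the real container parts built on Zope's ObjectManager rather than a plain list.

```python
from dataclasses import dataclass, field


class Part:
    """Base class for anything that can live inside a document."""

    def render(self) -> str:
        raise NotImplementedError


@dataclass
class TextPart(Part):
    text: str

    def render(self) -> str:
        return f"<p>{self.text}</p>"


@dataclass
class ImagePart(Part):
    src: str
    flow: str = "center"  # "left", "right", or "center"

    def render(self) -> str:
        return f'<img src="{self.src}" class="flow-{self.flow}">'


@dataclass
class ContainerPart(Part):
    """Manages sub-part creation, access, and ordering."""
    parts: list = field(default_factory=list)

    def add(self, part, position=None):
        # Append by default, or drop the part in at a given position.
        if position is None:
            self.parts.append(part)
        else:
            self.parts.insert(position, part)
        return part

    def move(self, old_index, new_index):
        # Re-order a part without touching its content.
        self.parts.insert(new_index, self.parts.pop(old_index))

    def render(self) -> str:
        return "\n".join(part.render() for part in self.parts)


@dataclass
class Document:
    """A document is little more than a holder for its root container."""
    title: str
    root: ContainerPart = field(default_factory=ContainerPart)

    def render(self) -> str:
        return self.root.render()


doc = Document("Volunteering")
doc.root.add(TextPart("First paragraph."))
doc.root.add(TextPart("Second paragraph."))
# Drop an image in between the two paragraphs and flow it right.
doc.root.add(ImagePart("tutors.jpg", flow="right"), position=1)
print(doc.render())
```

The editor never types markup to refer to the image. The image is just another part, and its position in the container decides where it shows up.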

This was a pretty good system. Unfortunately, it was developed at a time when the CMF (Content Management Framework) had stalled out, it seemed, with releases. It's since picked up. But we had basically a munged version on our hands.

We were able to take advantage of much of what the CMF offered - indexing, workflow, user management, and a nice separation of UI code from content (which can sometimes be tricky with Zope 2 development). I still felt very frustrated with the experience though. It felt like I had to do too much work to get the CMF to get "out of the way" where I needed it to in order to satisfy my customer's needs. While the CMF paved the way for Zope 3 by providing some much needed new concepts to Zope development - replaceable components, service/policy objects that were easy to look up and use while also being configurable locally, the concept of "skins" to build layered and swappable user interfaces, the ability for many objects to participate on behalf of a basic content object so that a document could just be a document - it always felt quite heavy to me. Starting out with it from 'scratch' without the default content management application implementation was overwhelming. Hiding much of the default implementation was tedious. And to me - it's actually quite a nice system. It was often just too heavy for our customers' budgets or our own time constraints.

But that system that I built - which was for a nonprofit literacy group - is still in use today and has more content than I ever envisioned. My own struggles with getting my implementation going put aside, it's not a bad system and I'm glad that it still sees use long after most of those involved with the launch of the project have moved on. Later, when other small companies and organizations came to us needing simple content management, I felt that a different approach was needed.

Generation 2 - Pure Zope, 'CMF Lite' in a way, still OpenDoc based, fixed structure
For the second generation it was decided that we needed something that was more directly under our control and understanding. It didn't need to solve all problems at first, but it should be flexible for growth. I had recently gotten a lot of benefit from implementing and deploying D4 (Dynamic, Declarative, Data-Driven) architectures on Zope, whereby model designs were used to dynamically build user interfaces, provide validation, and so on (things that are increasingly taken for granted today, but seemed to take a long time to come around to). Most of our D4 solutions at that point were against relational databases, with one application tied to LDAP. I wanted to take what we had learned and apply it to these content documents. We had customers coming to us now that didn't have such free-form content, but instead had pages that held more common structure. I wanted it to be easy to define this content and store it in the ZODB while having my code avoid direct use of the Zope 2 framework if possible. The new solution consisted of:

Parts
Still the chunks that documents were made of - parts were separate but complete small content items. A part could be text, an image, a headline, or any kind of structured record (contact information, links and descriptions about last year's festival, etc).
Policies
There were policies for folders and for documents. This is where a lot of the pluggable and extensible architecture came from. Folder policies would set things like "what is the default document to show for this folder?" or "what kind of documents may be added here, and how many?". They also provided APIs for listing and organizing content. An events folder might plug in an event listing policy that would sort events on their starting time. Like CMF Tools, policies had interfaces and common names, and you could replace them at will. Unlike CMF Tools, folder policies were kept in a special containing object on folders. There were no global site settings (although I've been wanting to add this feature). Documents got their policies out of a document policy manager.
Document Schema
A document schema basically put the parts together that would make up a particular document type. A 'contact information' page might have a text part, a picture part, and an address part. The schema was just another policy on the document type. The schema policy was also responsible for getting form data out, validating it, and saving it.
Simple Events
The individual policies for a folder or document could respond to events that were passed on from the containing policy manager. The Schema policy would respond to a 'document added' event and populate the document with empty parts based on the schema definition.
Skins, and composable interfaces
Another common policy was the 'views policy', which said which template/view objects to use. These objects were given common names, like 'edit', and could be configured differently for different skins. The skins tool we used was written by Chris McDonough and was basically lifted directly out of the CMF. Skins are built into the CMF and Zope 3. A skin is composed of layers. View objects are assigned to these different layers, and basically the view object is looked up in each layer until it is found. Thus you can have both an 'admin' skin and a 'public' skin. In each case, objects are traversed to directly (this is Zope after all). But how they'll be rendered and controlled depends on the skin that's placed on top of them. It's not a great explanation, I know, but it's a pretty significant concept.
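Since the skins explanation above is admittedly rough, a small sketch may help. This is plain Python, not the CMF skins tool or the Zope 3 API. The `Skin` and `ViewRegistry` names and the layer names are invented for illustration. The only thing it tries to show is the lookup rule: walk the skin's layers in order and take the first view registered under the requested name.

```python
class Skin:
    """An ordered stack of layer names; earlier layers win."""

    def __init__(self, name, layers):
        self.name = name
        self.layers = list(layers)


class ViewRegistry:
    """Maps (layer, view name) pairs to view callables."""

    def __init__(self):
        self._views = {}

    def register(self, layer, view_name, view):
        self._views[(layer, view_name)] = view

    def lookup(self, skin, view_name):
        # Check each layer of the skin in turn until the view is found.
        for layer in skin.layers:
            view = self._views.get((layer, view_name))
            if view is not None:
                return view
        raise LookupError(f"no view {view_name!r} in skin {skin.name!r}")


registry = ViewRegistry()
registry.register("default", "index", lambda title: f"<h1>{title}</h1>")
registry.register("default", "print", lambda title: title)
registry.register("admin", "index", lambda title: f"<h1>{title}</h1> [edit]")

public = Skin("public", ["default"])
admin = Skin("admin", ["admin", "default"])

# Same object, same view name, different rendering depending on the skin.
print(registry.lookup(public, "index")("Home"))
print(registry.lookup(admin, "index")("Home"))
# The admin skin has no 'print' view of its own, so it falls through
# to the one in the 'default' layer.
print(registry.lookup(admin, "print")("Home"))
```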

Basically, what I had managed to do was mimic a lot of functionality that I liked about Zope 3 (which was still deep in pre-release development at that time). It's a successful system - I was even able to integrate it into our e-commerce solution which is a very different styled application. It even reuses a fair amount of code from the first generation, since many of the internals (documents and parts) were CMF agnostic.

The big difference between the second generation and the first was the move away from free-form documents. The system could allow them with some other policies plugged in. But the advantage of the rigid documents is that we could design certain pages more easily because we knew exactly what slots were going to be on there, and what slots might be on there. The customer still has flexibility with the content that they can enter, but the pages are crisper than they might be if all we could do to render it is loop over the parts and tell them to draw themselves.
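Here is a rough sketch of how a schema policy with known slots might hang together, including the 'document added' event described earlier. It is plain Python written for this post, not the Generation 2 code, and every name in it (`SchemaPolicy`, `PolicyManager`, `StructuredDocument`, the slot names) is hypothetical. Parts are reduced to plain values to keep it short.

```python
class SchemaPolicy:
    """Declares the named slots that make up one document type."""

    def __init__(self, slots):
        # slots maps slot name -> (factory for an empty part, required?)
        self.slots = dict(slots)

    def handle(self, event, document):
        # On 'document added', populate the document with empty parts.
        if event == "document added":
            for name, (factory, _required) in self.slots.items():
                document.parts[name] = factory()

    def validate(self, form):
        errors = {}
        for name, (_factory, required) in self.slots.items():
            if required and not form.get(name):
                errors[name] = "required"
        return errors

    def save(self, form, document):
        # Only write to the document when the whole form is valid.
        errors = self.validate(form)
        if not errors:
            for name in self.slots:
                if name in form:
                    document.parts[name] = form[name]
        return errors


class PolicyManager:
    """Passes events on to every policy that cares to handle them."""

    def __init__(self, policies):
        self.policies = list(policies)

    def notify(self, event, document):
        for policy in self.policies:
            handler = getattr(policy, "handle", None)
            if handler is not None:
                handler(event, document)


class StructuredDocument:
    def __init__(self):
        self.parts = {}


contact_schema = SchemaPolicy({
    "text": (str, True),
    "picture": (lambda: None, False),
    "address": (dict, False),
})
manager = PolicyManager([contact_schema])

page = StructuredDocument()
manager.notify("document added", page)
print(page.parts)
print(contact_schema.save({"text": ""}, page))
print(contact_schema.save({"text": "Call us any time."}, page))
```

Because the template knows a contact page always has a `text` slot and may have a `picture` slot, it can place each one deliberately instead of looping over whatever parts happen to exist.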

Generation 3 - Zope 3 based. No OpenDoc heritage. Lightweight, in less time
The main reason that I'm writing this entry is that I didn't realize until today that I had basically redone much of the work from Generation 2 in a customer application we're developing in Zope 3. The concept of parts is gone. Generation 2 schema was defined through the web. Generation 3 uses Zope 3 concepts. Instead of parts, Fields are used - even for complex objects. A page is defined through its Interface, a very strong Zope 3 concept. And that interface's schema is then used to build edit forms and has provided other exciting benefits as well. For example, I can take any of my content objects and find all of their fields that have text in them that can be used for indexing. The field in question might be simple - just a line or block of text. Or it might be complex - like a table with a title, a caption, and rows of cells. In fact - it's this table field that I wrote that made me realize that I had already made a more complicated 'part' than I had ever made for either of the previous generations. And it wasn't until this evening as I was showing progress to my boss that I realized that this set of work - much of it done just in the last three work days - pretty much blows away the 2nd generation.
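The "find every field with indexable text" trick can be sketched roughly like this. To keep it self-contained, this is a plain-Python stand-in and deliberately does not use zope.interface or zope.schema. The `Field` classes, the `IPricingPage` schema, and the `indexable_text` helper are all invented for illustration, so don't read them as the real Zope 3 API or as the actual table field.

```python
from types import SimpleNamespace


class Field:
    """A schema field; __set_name__ records which attribute it describes."""

    def __set_name__(self, owner, name):
        self.name = name

    def get(self, obj):
        return getattr(obj, self.name, None)

    def searchable_text(self, value):
        return []  # fields without text contribute nothing to the index


class TextField(Field):
    def searchable_text(self, value):
        return [value] if value else []


class IntField(Field):
    pass


class TableField(Field):
    """Value is a dict with a title, a caption, and rows of cells."""

    def __init__(self, columns):
        self.columns = tuple(columns)  # column definitions are fixed

    def validate(self, value):
        for row in value["rows"]:
            if len(row) != len(self.columns):
                raise ValueError(f"expected {len(self.columns)} cells: {row!r}")

    def searchable_text(self, value):
        if not value:
            return []
        chunks = [value["title"], value["caption"]]
        for row in value["rows"]:
            chunks.extend(row)
        return [str(chunk) for chunk in chunks if chunk]


class IPricingPage:
    """Schema for a page type: the fields define the page."""
    title = TextField()
    body = TextField()
    sort_order = IntField()
    prices = TableField(columns=("Item", "Price"))


def fields_of(schema):
    # Class attributes are kept in definition order.
    return [f for f in vars(schema).values() if isinstance(f, Field)]


def indexable_text(schema, obj):
    chunks = []
    for f in fields_of(schema):
        chunks.extend(f.searchable_text(f.get(obj)))
    return " ".join(chunks)


page = SimpleNamespace(
    title="Rates",
    body="What we charge.",
    sort_order=3,
    prices={"title": "2005", "caption": "", "rows": [["Tutoring", "$10"]]},
)
IPricingPage.prices.validate(page.prices)
print(indexable_text(IPricingPage, page))
```

The indexing code never needs to know about tables. It asks each field for its text, and a complex field answers for itself the same way a simple one does.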

Zope 3 delivers a lot of functionality that I can then build my own application stack on top of. It's very rich, like the CMF. But at the same time, it doesn't feel like it gets in the way nearly as much. It's still a rich and complex system - you can't really write a "20 minute wiki demo and video from scratch" on top of it without also having to explain the default user interface or skin configuration or URL namespaces. But at the same time, many of these concepts are (relatively) easy to deploy when setting up a new application. This one that I'm working on I've been at for less than a week now. And it's pretty much caught up to our first 'generation 2' site which took a few weeks, mostly because the Generation 2 compound document system was being written alongside it. This 3rd generation implementation is still missing some key elements that I'd like to have implemented, but I don't think they'll be that big of a concern. And the UI is much more impressive than anything we've done in the past. Thanks to MochiKit and much of the work done for the customer prior to this, we have a nice dynamic / AJAX based UI with sortable folder content listings (in JavaScript). Drag and drop content reordering. Inline renaming and retitling. In the past couple of days I've written a complex Table field for Zope 3 and given it an interface that allows for dynamic JavaScript editing (adding / removing rows - the column definitions are fixed). In less than an hour today I wrote a Textile field (basically a plain descendant of a text field), whose widget uses AJAX to allow for in-page previewing of the Textile output.

And all of this is within an architecture that I understand. Granted - there are a lot of Zope 3 concepts that one has to learn, and there are many that I haven't learned yet. But even my own "Generation 2" system I had trouble understanding in comparison to what we've done on Zope 3. The deep dark caverns of the "Generation 2" schema system - which is quite powerful - are scary places for even me, the author, to go.

What impresses me most about Zope 3 is how easy it is to build one's own layer of toolkits / frameworks on top of it all. A common stack we have, which came together within the first week or two of using Zope 3, is this:

  • Customer Specific - skins, components, utilities specific to this customer's needs. May build on the next layer.
  • Similar Customers - content and utility components that are common across a certain set of customers, which is the situation we're in right now with this first batch of Zope 3 work.
  • Our CMS / Framework - Provides the management UI, common components, configurations, etc, that are common across many sites that we expect to build.
  • zope.app - The Zope 3 application framework. Provides many of the basic components for a full Zope 3 application - containers, utilities, apis, etc. A pretty beefy layer.
  • Core Zope 3 systems (zope.interface, zope.schema, zope.component, persistence, transaction, etc). These are the basic layers underneath Zope. One could, in theory, use many of these outside of 'zope.app' to build a different application server that did not depend on or provide much of what zope.app requires. There is a 'bobo' branch, for example, which builds on just a few of these core things. Using zope.bobo, one could in theory write a 20 minute Wiki or 20 minute todo-list in a manner similar to what Rails or TurboGears shows off.

Along with all of this, there are other python libraries, including internal ones, and Zope 3 based libraries like Hurry, used as needed.

What surprises me most of all is how quickly this latest solution came together, almost without my seeing it. Nice.