Despite all the recent hoopla promoting dynamic languages as a silver bullet to lead us out of our productivity malaise, I’ve always wondered if dynamic languages could scale. I use the word ‘scale’ here to mean development scalability rather than system scalability. Development scalability means that we can build a solution with many teams coordinating in relative independence. The burning question is “Can a dynamic language be successfully used for developing large scale complex software systems?”
One of my prime examples of a large complex software system is the Eclipse project. One would be extremely hard pressed to find a project of similar scope and complexity. The Eclipse project is incomparable with regard to how it has created a component system that allows multiple development organizations to coordinate their independently developed solutions in a single framework (see: Callisto’s 7 million lines of code and Europa’s 17 million lines of code). This goes beyond a one-time integration event; the integration must continue as all the different components evolve independently. That is seriously an unprecedented accomplishment in terms of software engineering (read: “How to Build Damn Good Software”).
One of the biggest shortcomings of this blog is that it’s built on top of Plone. Plone is a web framework that’s built on Zope, which is in turn built on Python. Plone is appealing in that you get a lot of functionality right out of the box. Unfortunately, there is one major downside: as the Plone and Zope frameworks evolve, a lot of the plugins that are developed independently fall out of sync and become unusable. This situation is exacerbated by the fact that Plone, Zope and Python don’t have a sophisticated module system (this may have been fixed with the Archetypes system). If we could make the analogy with static versus dynamic typed systems, the Plone system was built on a mindset that makes evolution just plain difficult. Nuxeo has some interesting observations as to why they moved away from Zope and into Java.
System evolution is hard. It is so hard that one can’t expect the built-in language mechanisms (e.g. interfaces, dynamic class loading, etc.) to be sufficient for building large scale systems. It is clearly apparent that we need to build mechanisms such as Eclipse’s plugin framework or OSGi’s module system. Certainly a dynamic language like Ruby or Python can create a similar kind of system (see: Java is Not Python Either). Unfortunately, I’m generally convinced that the hack and slash mindset of dynamic languages is incompatible with the rigidness of strict interface/type checking. After all, why should one who discovers the freedom of dynamic languages force oneself into a straitjacket?
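To make the point about strict interface checking concrete, here is a minimal sketch of what such a straitjacket could look like in Python. Everything in it is hypothetical and invented for illustration (`Plugin`, `PluginRegistry`, `api_version`, and the two example views); it is not how Eclipse, OSGi, or Plone actually work. It only shows that a dynamic language can refuse plugins that break a declared contract or that were written against a different version of the host.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class Plugin(ABC):
    """Hypothetical extension-point contract; not an Eclipse or OSGi API."""

    # Host API version the plugin was written against (illustrative only).
    api_version: int = 1

    @abstractmethod
    def activate(self) -> str:
        """Called by the host when the plugin is loaded."""


class PluginRegistry:
    """Host-side registry that rejects plugins violating the contract."""

    def __init__(self, api_version: int) -> None:
        self.api_version = api_version
        self._plugins: Dict[str, Plugin] = {}

    def register(self, name: str, plugin_cls: type) -> None:
        # Check the declared interface before any plugin code runs.
        if not (isinstance(plugin_cls, type) and issubclass(plugin_cls, Plugin)):
            raise TypeError(f"{name}: does not implement Plugin")
        # Refuse plugins built against a different version of the host API.
        if plugin_cls.api_version != self.api_version:
            raise RuntimeError(
                f"{name}: built for API v{plugin_cls.api_version}, "
                f"host is v{self.api_version}"
            )
        # Instantiation raises TypeError if an abstract method is missing.
        self._plugins[name] = plugin_cls()

    def activate_all(self) -> List[str]:
        return [plugin.activate() for plugin in self._plugins.values()]


class CalendarView(Plugin):
    api_version = 2

    def activate(self) -> str:
        return "calendar view ready"


class LegacyView(Plugin):
    api_version = 1  # written against the older host API

    def activate(self) -> str:
        return "legacy view ready"
```

With a host created as `PluginRegistry(api_version=2)`, registering `CalendarView` succeeds, while registering `LegacyView` raises `RuntimeError` at load time rather than failing somewhere deep inside the host later on. Note that nothing in the language forces a plugin author to go through the registry; the discipline is entirely a matter of convention.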
We’ve all heard of Conway’s Law (i.e. “Any piece of software reflects the organizational structure that produced it”). Well, I think there might be a law that’s analogous to this: “Any piece of software reflects the social aspects of the language it is written in”. The preference for a programming language may have a lot to do with the preferred thinking style of the stakeholders.
PHP-based software excelled in the Web UI space because its developers focused solely on UI functionality rather than expending unnecessary effort trying to build the perfect catch-all web framework. Java developers never understood that plain JSP was good enough.
Unfortunately, PHP developers are depressingly weak with regard to modularity. For example, Ning’s mistake was to believe that a PHP developer could design modularity into its UI framework. Ning’s PHP UI framework is a disaster in terms of evolvability and extensibility.
Java was a non-starter in the systems administration space because of its large footprint, slow startup time and lack of brevity. Perl’s implicit pronouns encourage brevity, but with the undesired effect that code tends to be unreadable even for its original author, which makes refactoring difficult.
The cultural differences among Object Oriented languages are a bit more subtle. Beyond the common desire to find organizational structure in their code, Python developers tend to be more pragmatically minded.
Ruby developers tend towards brevity and elegance. Smalltalk developers are more Object Oriented purists. C++ developers tend to lack consensus on which of the many programming paradigms is best. C# developers are content as sharecroppers working within a narrow set of frameworks. Java developers tend to have a desire for building frameworks, but unfortunately have a penchant for unnecessary complexity.
Object Oriented languages are ideal languages for developing large scale systems. However, I generally believe that you hit the limit earlier with dynamically typed Object Oriented languages. The lack of static typing hinders not only code exploration but also the implementation of automated refactoring tools. The ease of employing meta-object protocols adds a further burden to understandability by making behaviors less traceable.
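The traceability problem is easy to illustrate. The following is a small sketch, with a made-up `Record` class and made-up data, of the kind of trick a dynamic language makes cheap: methods that are synthesized on demand through Python’s `__getattr__` hook and therefore never appear anywhere in the source.

```python
class Record:
    """Hypothetical record type whose finder methods exist only at runtime."""

    def __init__(self, rows):
        self._rows = rows

    def __getattr__(self, name):
        # Invoked only when normal attribute lookup fails, so every
        # find_by_<field> "method" is synthesized here on demand.
        if name.startswith("find_by_"):
            field = name[len("find_by_"):]

            def finder(value):
                return [row for row in self._rows if row.get(field) == value]

            return finder
        raise AttributeError(name)


contacts = Record([
    {"name": "Ada", "city": "London"},
    {"name": "Linus", "city": "Helsinki"},
])

# Works at runtime, yet no definition of find_by_city exists in the source
# for an editor, a text search, or a rename refactoring to locate.
londoners = contacts.find_by_city("London")
```

The call succeeds, but a search for `def find_by_city` comes up empty, and a tool that wants to rename the `city` field has no reliable way to know that this call site depends on it. That is the sense in which I mean behaviors become less traceable.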
One should not underestimate the need to make it easy for new developers to get on board. The key to successful architectures of participation is in the inclusion of the largest of audiences. Languages like Ruby and Scala introduce new programming constructs and paradigms that are beyond the cognitive abilities of the average programmer. These constructs raise the bar of participation, effectively reducing the size of the target audience. Unfortunately, there isn’t a language in existence that allows one to selectively deny features (does BlueJ do this?).
Now, going back to the title of this piece. The Chandler project (see: “Dreaming in Code”) was supposed to be a showcase project for Python. However, almost 3 1/2 years ago, I had some reservations about Chandler’s viability:
It’ll be interesting to see how bigger projects based on languages like Python will fare. So far, the Chandler project which was announced with a lot of fan fare is not progressing very briskly. May I suggest writing some sophisticated refactoring tools first (albeit in Java).
I haven’t really followed their development since and am unaware of the true reasons for the Chandler project’s demise. Scoble has a podcast on Chandler that may be worth watching. My suspicion is that not beginning with a stable and static substrate, in an environment with fluid and unconstrained requirements, is a recipe for disaster. Just ask the Mozilla folks, who spent years working on the original non-modular code of Netscape without much success, only to hit their stride once the XPCOM component model (note: XPCOM is unnecessary for Java with its dynamic component binding) was implemented. They began to succeed when they focused on a pure web browser (i.e. Firefox) and not a be-all information appliance.
What I find interesting is that Chandler started as a Python project with bold claims such as “Python programs are both concise and readable, this makes it excellent for rapid development by a distributed team” and “Chandler design goals: … design a platform that supports an extensible modular architecture”. I was hoping to see this project as a testament to Python’s viability for large scale development. Unfortunately, I discovered that the only remnants of Python could be found in the desktop client. The Chandler server appears to have gutted all Python and replaced it with an implementation based on Java technologies running on Tomcat:
Chandler Desktop is written in a mixture of Python and C-based extensions. Important components include wx/wxPython, Berkeley DB, PyLucene, ICU, twisted, vobject, and many others.
Chandler Desktop is extensible using simple Python plugins which can be distributed on Cheeseshop. Plugins can add new views and new data types using schemas defined in the plugin.
Chandler Server is both a “database for PIM data” and an Ajax web UI to managing that data. All access to PIM data is done via HTTP calls using various protocols like CalDAV, Atom, WebDAV, Morse Code, and others. Server technologies include Java, Hibernate, Spring, iCal4j, and Abdera. The web UI is developed using the Dojo framework.
It’s indeed strange that these two components use different storage engines. If I were to invoke Conway’s Law, it appears that the project had a schism between the Python and Java developers. I’m curious as to how the decision was arrived at to not develop the server side code in Python. Even if the project’s requirements were in constant contention, the least the developers could have done was to build a solid and useful framework. I certainly would have loved to see a flexible UI that can slice and dice data, or even a reusable storage engine for unstructured data and communication. I’ve downloaded the client and server, and it looks like a prototype that a competent developer could cobble together in a couple of weeks. It’s not something one would expect from a high visibility and heavily funded open source project.
Chandler started life as a very public software development effort. In the spirit of agile development, I would have liked to have seen a more public postmortem. That is, I would like to see some details of why a well funded, well staffed project could fail so miserably. Unfortunately, I am still left in the dark to speculate. Can large scale projects be successfully implemented around a dynamic programming language? The burden of proof was out there several years ago; now more than ever, the development community needs some incontrovertible evidence that dynamic programming languages can scale.
Update: Some have remarked that Eclipse may not be an apples to apples comparison. Well, would Zimbra be something comparable in scope with Chandler? Zimbra is an open source email and calendaring application whose core is written in Java, and it was successfully sold to Yahoo for 350 million dollars.
Certainly, there are many factors that come into play in software development, and these include the development methodologies used and the selection of development tools. Some have argued that the Chandler project made every classic mistake in the book, while at the same time failing to acknowledge that relying solely on a dynamic language could also have been a mistake.
One final note: I’ve found a good insider piece on what went wrong. The author sums it up as “There was no objective basis for decision-making. Thus, there was no unified design, architecture, vision, nothing. We had fiefdoms…”. To which I again have to opine, “large scale projects are all about managing fiefdoms”.