Meta-Programming Considered Harmful?

When the Human Genome Project began its quest to decode the human genome, it was with the optimism that if we could read the DNA code, we could decipher it and therefore cure the ills that afflict us. Unfortunately, in the years since its decoding, we’ve realized that the programming is much more complex than we had originally imagined. A majority of the DNA consists of junk code, essentially the equivalent of dead code bloat. Much of the code is littered with meta-programming constructs that filter out functionality only in certain contexts (e.g., stem cells). The double helix is a redundant copy of itself, and the code is cluttered with still more code whose only purpose is to provide further redundancy. Over the eons, viruses have hitched a ride with our DNA, and not only in detrimental ways but in ways that are now essential to us. Nor is the DNA itself the sole source of programming instructions: the DNA creates RNAs, which create proteins, which carry out the instructions. Unfortunately, the RNAs and proteins can in turn modify the DNA’s instructions.

Over the eons, the genome has evolved into an inscrutable meta-programming nightmare. Conventional software, however, needs to be different. Humans, the agents of software evolution, have fixed mental capacity and cannot wait forever. The trial-and-error methodology of biology does not transfer to the world of software engineering. The primary goal of software engineering is to control complexity as software evolves with new features. Our main tools for this are abstraction and structure. The haphazard mechanisms of biology are simply too intractable for us to build predictable systems from.

Programming languages have evolved to provide us with the tools to define the abstractions and structure that contain this complexity. These include the main tenets of object-oriented programming: encapsulation, polymorphism and identity. Functional programming has given us side-effect-free variables and code amenable to static analysis. Garbage collection provided a mechanism to allocate structures dynamically without bookkeeping errors. Introspection gave us a way to integrate third-party code in a well-behaved manner. Annotations permitted base language artifacts (i.e. classes, methods, attributes, etc.) to be decorated with custom semantics. Aspects provided a non-intrusive mechanism to augment behavior. Finally, meta-programming provided the mechanism to change the behavior of an existing system in a dynamic and sometimes unsuspected manner.

Meta-programming, programs that write programs, can be the most insidious of programming constructs. It is the kind of construct that can so easily lead to inscrutable programs, and it is precisely this construct that makes understanding the genome so complex. Coincidentally, it is also this construct that makes the new kinds of programming languages so powerful. The question that must be asked is something like this: “how much rope should I use so that I can’t hang myself with it?” What guidelines do we have so that we can leverage this powerful tool while still containing the complexity of our programs?

It is conceivable that one can construct programs that write programs that write programs, ad infinitum. That is what we see in the genome. There should be a strict limit to how ‘meta’ we allow ourselves to go. For example, in UML the metamodel is defined for only three levels. At the highest level sits the ‘meta-meta model’, which is used only to describe the UML meta-model. The UML meta-model defines the relationships between constructs like Class, Attribute and Operation. An instantiation of the meta-model yields the more familiar class modeling constructs that we see in programming languages.

In programming languages it is typical that the semantics are fixed at the meta-model. For example, one cannot define a new kind of inheritance semantics between two Classes. However, with the advent of Annotations, we are essentially decorating our meta-model constructs with additional semantics. The newer dynamic languages like Groovy and Ruby take this ability to the next level of convenience and ease of use. In fact, meta-programming has become the equivalent of Design Patterns in Java: everyone wants to use it whether they need to or not.
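To make the idea concrete, here is a minimal Java sketch of decorating a class with additional semantics via an annotation. The `@Auditable` annotation and the `Account` class are hypothetical names invented for illustration; the point is that a framework can discover the decoration at runtime and act on it, effectively extending the meta-model.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// A hypothetical annotation attaching custom semantics to a class.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Auditable {
    String level() default "basic";
}

// The base-language artifact, decorated with the new semantics.
@Auditable(level = "full")
class Account { }

public class AnnotationDemo {
    public static void main(String[] args) {
        // A framework discovers the decoration reflectively at runtime.
        Auditable a = Account.class.getAnnotation(Auditable.class);
        System.out.println(a != null ? a.level() : "none"); // prints "full"
    }
}
```

The language itself attaches no meaning to `@Auditable`; whatever code reads the annotation supplies the semantics, which is exactly where the extra, invisible behavior creeps in.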

This ease of use is clearly enough rope to hang oneself with. There should be some strict guidelines for its use, for example: “one should use meta-programming constructs only for writing Domain Specific Languages”. In other words, we should see its use only at the surface, never deep in the bowels of our code.

The JavaScript language, however, can build meta-models ad infinitum. That is because it makes no distinction between a model layer and an object layer. It is a prototype-based language where behavior can be modified dynamically via object-based prototype inheritance. This is what makes JavaScript so unsuspectingly powerful. You can see this in the many JavaScript frameworks that all seem to define their own unique object models. It is also what makes JavaScript programs unnecessarily intractable.
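To keep the examples in one language, here is a rough Java sketch of the prototype-delegation model JavaScript uses: an object is nothing but a bag of slots with a link to a parent, and lookups walk the chain. The `ProtoObject` class is an invented stand-in, not a real JavaScript engine, but it shows why mutating a prototype silently changes every object delegating to it.

```java
import java.util.HashMap;
import java.util.Map;

// A sketch of prototype-style delegation: an object is a property map
// plus an optional parent to which failed lookups are delegated.
class ProtoObject {
    private final Map<String, Object> slots = new HashMap<>();
    private final ProtoObject prototype; // may be null

    ProtoObject(ProtoObject prototype) { this.prototype = prototype; }

    void set(String name, Object value) { slots.put(name, value); }

    Object get(String name) {
        if (slots.containsKey(name)) return slots.get(name);
        if (prototype != null) return prototype.get(name); // walk the chain
        return null;
    }
}

public class PrototypeDemo {
    public static void main(String[] args) {
        ProtoObject base = new ProtoObject(null);
        base.set("greet", "hello");              // behavior lives on the prototype
        ProtoObject child = new ProtoObject(base);
        System.out.println(child.get("greet"));  // inherited: hello
        base.set("greet", "hi");                 // mutating the prototype...
        System.out.println(child.get("greet"));  // ...retroactively changes the child: hi
    }
}
```

There is no fixed class layer here at all; every object can serve as the “model” for another, which is precisely the ad-infinitum meta-modeling described above.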

Object-based extension mechanisms are not alien concepts in Java. The JavaBeans specification, originally influenced by the Delphi programming language, is in fact a meta-programming construct. A JavaBean is customized at runtime by setting up its instance variables and associating listeners with its event methods; the equivalent of defining a new Class is being done at runtime. This was of course all made possible by the introduction of introspection in Java 1.1.
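The mechanism in question is the `java.beans.Introspector`, which derives a bean’s properties from its getter/setter naming conventions. The `Person` bean below is a made-up example, but the introspection calls are the standard API; a framework configures the instance at runtime without ever referring to the class statically.

```java
import java.beans.BeanInfo;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;

public class BeanDemo {
    // A plain JavaBean: the "class definition" a framework rediscovers at runtime.
    public static class Person {
        private String name;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    // Discover the "name" property by convention and set it reflectively.
    public static Person configure(String value) throws Exception {
        BeanInfo info = Introspector.getBeanInfo(Person.class, Object.class);
        Person p = new Person();
        for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
            if (pd.getName().equals("name")) {
                pd.getWriteMethod().invoke(p, value); // customize at runtime
            }
        }
        return p;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(configure("Ada").getName()); // prints "Ada"
    }
}
```

Nothing in `configure` mentions `setName` by name; the binding between property and method is reconstructed at runtime, which is the meta-programming step.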

One does not require language support to go meta. Any code that acts like an interpreter is essentially going meta. That is, I can write a program that alters its behavior based on an instruction set I define. I can go further by adding variables to said program that alter how it interprets the instruction set. Those variables could maintain state such that the interpretive behavior changes as it progresses. This can go on ad infinitum. It is certainly a surefire way to obfuscate code and an avenue to dishonest programming.
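A small sketch of this, with an invented two-instruction language: the interpreter keeps a piece of state (`step`) that some instructions mutate, so the meaning of later instructions depends on what ran before.

```java
public class TinyInterp {
    // Interpret a made-up instruction set. The "double" instruction does not
    // touch the accumulator at all; it changes how future "inc"s behave.
    public static int run(String program) {
        int acc = 0;
        int step = 1; // interpreter state that the program itself can alter
        for (String instr : program.trim().split("\\s+")) {
            switch (instr) {
                case "inc":    acc += step; break;
                case "double": step *= 2;   break;
                default: throw new IllegalArgumentException("bad instr: " + instr);
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        // 1 + 1, then the step doubles, then + 2 = 4
        System.out.println(run("inc inc double inc")); // prints 4
    }
}
```

Even this toy is already hard to reason about locally: you cannot know what `inc` does without replaying everything that came before it.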

It really all starts with the introduction of state into our variables. The proponents of functional languages knew this and prohibited it outright, preferring instead languages whose behavior can be statically analyzed. Unfortunately, this kind of straitjacket, despite its engineering benefits, makes it difficult to build real-world solutions. State is required because the universe always moves forward in time; in other words, state needs to exist because we need to model time. Functional constructs are certainly valuable, even if the merits of a pure functional language are questionable.

The core problem of meta-programming is its intractability. Debuggers are useful tools when one has difficulty reasoning about code. Unfortunately, they are illuminating only at the object level. There are no debuggers in existence that make explicit what is happening at the meta-level. It is interesting to note that most Smalltalk IDEs included an object browser that could visually inspect the runtime objects of an executing program. I wonder if there is an equivalent in the Ruby, Python or JavaScript worlds?

The stack traces that occur in meta-programming are inscrutable. The stack trace of an interpreter obfuscates the error that exists in the program it is interpreting. As a general guideline, an interpreter should always have a reporting mechanism such that errors in one of its programs are brought to the surface. An excellent example of this in action is the “line precise error reporting” feature available in Tapestry. It is just unfortunate that this kind of functionality is too much of a burden for the casual meta-programmer to introduce.
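A minimal sketch of the guideline, using an invented one-instruction language: the interpreter reports failures in terms of the *interpreted* program (its line number and offending instruction), rather than letting its own stack trace be the only evidence. This is not Tapestry’s mechanism, just an illustration of the principle.

```java
public class ReportingInterp {
    // An error phrased in the vocabulary of the interpreted program,
    // not of the interpreter's internals.
    static class ProgramError extends RuntimeException {
        ProgramError(int line, String instr) {
            super("error at program line " + line
                  + ": unknown instruction '" + instr + "'");
        }
    }

    public static int run(String[] instructions) {
        int acc = 0;
        for (int line = 0; line < instructions.length; line++) {
            String instr = instructions[line];
            if (instr.equals("inc")) acc++;
            else throw new ProgramError(line + 1, instr); // surface the user's error
        }
        return acc;
    }

    public static void main(String[] args) {
        try {
            run(new String[] { "inc", "oops", "inc" });
        } catch (ProgramError e) {
            // prints: error at program line 2: unknown instruction 'oops'
            System.out.println(e.getMessage());
        }
    }
}
```

The cost is real: every construct the interpreter adds needs a corresponding piece of error-reporting plumbing, which is exactly the burden the casual meta-programmer skips.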

So if the meta-programmer doesn’t have the bandwidth to roll his own exception reporting mechanism, what then is the alternative? The alternative is what is well known as code generation (see “When to Choose Code Generation over Reflection”). It is meta-programming with the additional benefit that debugging is more tractable. With code generation you build a compiler rather than an interpreter. The target instruction set is your programming language, which happens to have the requisite debugger and exception reporting system that the meta-programmer so conveniently ignored when he went meta.
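As a sketch of the idea, here is a toy generator that emits ordinary Java source for a bean instead of configuring one reflectively at runtime. The `generateBean` helper and its output shape are invented for illustration; the point is that the generated class compiles, steps through a debugger, and throws stack traces like any hand-written code.

```java
public class BeanGenerator {
    // Emit plain Java source for a bean with the given String properties.
    public static String generateBean(String className, String... properties) {
        StringBuilder src = new StringBuilder();
        src.append("public class ").append(className).append(" {\n");
        for (String p : properties) {
            String cap = Character.toUpperCase(p.charAt(0)) + p.substring(1);
            src.append("    private String ").append(p).append(";\n");
            src.append("    public String get").append(cap)
               .append("() { return ").append(p).append("; }\n");
            src.append("    public void set").append(cap)
               .append("(String v) { ").append(p).append(" = v; }\n");
        }
        src.append("}\n");
        return src.toString();
    }

    public static void main(String[] args) {
        // The output is ordinary, debuggable Java source.
        System.out.println(generateBean("Person", "name", "email"));
    }
}
```

Compare this with the introspection example earlier: the same property logic exists in both, but here it runs once at build time and leaves behind transparent code rather than opaque runtime behavior.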

Gregor Kiczales, the original proponent of AspectJ, has a background in Scheme; in fact, he co-authored the book “The Art of the Metaobject Protocol”. What is particularly interesting about AspectJ is that rather than build a dynamic meta-object protocol like those found in Scheme, Kiczales designed a static compiler to implement the meta-programming features of AspectJ. The hope was that if the aspects were woven into the base code prior to execution, a more tractable debugging environment would be available. AspectJ could have been implemented more easily using dynamic methods, but Kiczales weighed the tradeoffs and selected the more difficult implementation route.

So, in summary, we have collected here several guidelines for meta-programming:

  1. Confine meta-programming to creating DSLs or a system’s surface interface.
  2. Try to stick to a single-assignment rule when possible.
  3. Build error reporting into one’s interpreters, and provide a way to debug interpretive structures.
  4. Employ code generation instead of reflection for complex logic.
  5. Use AspectJ in lieu of dynamic meta-programming.

I hope you find these guidelines helpful in avoiding the pitfalls of going meta.
