Bill de Hora spots a thesis “Making reliable distributed systems in the presence of software errors” by Joe Armstrong:
Isolation has several consequences:
- Processes have “share nothing” semantics. This is obvious since they are imagined to run on physically separated machines.
- Message passing is the only way to pass data between processes. Again since nothing is shared this is the only means possible to exchange data.
- Isolation implies that message passing is asynchronous. If process communication is synchronous then a software error in the receiver of a message could indefinitely block the sender of the message, destroying the property of isolation.
- Since nothing is shared, everything necessary to perform a distributed computation must be copied. Since nothing is shared, and the only way to communicate between processes is by message passing, then we will never know if our messages arrive (remember we said that message passing is inherently unreliable.) The only way to know if a message has been correctly sent is to send a confirmation message back.
What’s striking about these properties is that they seem to fit, almost like a glove, the definition I had previously proposed for Services (with the exception of dynamic introspection). In a recent blog conversation with Ted Neward, I was arguing for the inclusion of asynchronous communication in the definition of Services. Asynchronous communication makes extensibility via interception feasible, along with support for a common envelope (i.e. layer 6) that handles functional cross-cutting concerns in a spirit similar to aspect-oriented programming. Furthermore, as explained by Armstrong, asynchronous communication leads to better isolation and thus more reliability.
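To make the isolation properties above a bit more concrete, here is a minimal Python sketch of asynchronous message passing with a confirmation message. It is only a simulation: Python threads share memory, so the "share nothing" rule is a convention here, and the names `receiver`, `send`, `await_ack` and `demo` are my own inventions, not anything from the thesis. The point is that the sender never blocks on the receiver, and it only learns about delivery through an explicit acknowledgment that may never come.

```python
import queue
import threading


def receiver(inbox: queue.Queue, acks: queue.Queue) -> None:
    """Runs in its own thread; talks to the sender only through the two mailboxes."""
    while True:
        msg = inbox.get()
        if msg is None:  # sentinel used by this sketch to stop the receiver
            break
        msg_id, payload = msg
        try:
            if payload == "boom":
                raise ValueError("simulated software error in the receiver")
            acks.put(msg_id)  # confirmation message sent back to the sender
        except ValueError:
            # The error stays inside the receiver; no confirmation is sent.
            pass


def send(inbox: queue.Queue, msg_id: int, payload: str) -> None:
    """Asynchronous send: enqueue the message and return immediately."""
    inbox.put((msg_id, payload))


def await_ack(acks: queue.Queue, msg_id: int, timeout: float) -> bool:
    """Return True if a confirmation for msg_id arrives within `timeout` seconds."""
    try:
        return acks.get(timeout=timeout) == msg_id
    except queue.Empty:
        return False


def demo() -> tuple:
    inbox, acks = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=receiver, args=(inbox, acks), daemon=True)
    worker.start()

    send(inbox, 1, "hello")
    delivered = await_ack(acks, 1, timeout=1.0)

    send(inbox, 2, "boom")  # the receiver fails on this one
    failed_delivery = await_ack(acks, 2, timeout=0.2)

    inbox.put(None)
    worker.join(timeout=1.0)
    return delivered, failed_delivery
```

Calling `demo()` sends one message that is confirmed and one that triggers an error in the receiver. In the second case the sender is not blocked forever; it simply sees no confirmation within its timeout and can decide what to do next.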
Despite the obvious advantages of asynchronous communication, there are a lot of reservations about it. My guess is that there’s a belief that these systems are much more complex and difficult to develop. However, Joe Armstrong surprisingly argues, based on experience, that this may in fact not be the case.
It’s also extremely interesting how this all ties back to functional programming. Joe Armstrong’s thesis is, coincidentally, about the functional programming language Erlang. As noted earlier, function composition is related to decorators (an object-oriented pattern for supporting interception). This leads to a very intriguing spin on the definition of SOA, that is, SOA = message passing + functional semantics + introspection + meta-attributes. This definition comes back full circle to REST, that is, uniform interfaces and state transfer. It’s surprising that Services are, plain and simple, a functional language in a distributed context. Why haven’t others picked up on this relation?
It’s really nothing new; have we not seen this before in the languages E, Scala, Oz and Erlang? Does a functional language with message passing constructs make event-driven programming easier to do? I don’t know why it would be easier, so I’ll need to kick the tires a bit more to figure that out.
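The relation between function composition and interception mentioned above can be sketched in a few lines of Python. Everything here is hypothetical: the envelope shape (a dict with `headers` and `body`), the interceptors `with_trace` and `with_auth`, and the `echo_service` handler are invented for illustration. Each interceptor is just a function from handler to handler, so stacking cross-cutting concerns is ordinary function composition.

```python
from functools import reduce
from typing import Any, Callable, Dict

Envelope = Dict[str, Any]  # assumed shape: {"headers": {...}, "body": ...}
Handler = Callable[[Envelope], Envelope]
Interceptor = Callable[[Handler], Handler]


def with_trace(next_handler: Handler) -> Handler:
    """Interceptor that records its name in a 'trace' header before passing the message on."""
    def handler(env: Envelope) -> Envelope:
        trace = env["headers"].get("trace", []) + ["with_trace"]
        return next_handler({**env, "headers": {**env["headers"], "trace": trace}})
    return handler


def with_auth(next_handler: Handler) -> Handler:
    """Interceptor that rejects messages whose envelope carries no 'user' header."""
    def handler(env: Envelope) -> Envelope:
        if env["headers"].get("user") is None:
            return {**env, "body": "rejected"}
        return next_handler(env)
    return handler


def compose(*interceptors: Interceptor) -> Interceptor:
    """compose(f, g)(h) == f(g(h)), so the first interceptor sees the message first."""
    return lambda handler: reduce(lambda h, i: i(h), reversed(interceptors), handler)


def echo_service(env: Envelope) -> Envelope:
    """The actual service: a plain function from message to message."""
    return {**env, "body": env["body"].upper()}


service = compose(with_trace, with_auth)(echo_service)
```

The service itself knows nothing about tracing or authorization. Those concerns live in the envelope headers and in the interceptors wrapped around it, which is roughly what I mean by meta-attributes plus functional semantics.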
Speaking of reliability, the paper “Crash-Only Software” describes some interesting inter-component properties:
- Components have externally enforced boundaries.
- All interactions between components have a timeout.
- All resources are leased.
- Requests are entirely self-describing.
The first three properties are typically associated with building robust distributed systems; however, I’m not familiar with the last property in relation to reliability. It’s explained as follows:
Requests are entirely self-describing, by making the state and context needed for their processing explicit. This allows a fresh instance of a rebooted component to pick up a request and continue from where the previous instance left off. Requests also carry information on whether they are idempotent, along with a time-to-live; both idempotency and TTL information can initially be set at the system boundary, such as in the web tier. For example, the TTL may be determined by load or service level agreements, and idempotency flags can be based on application-specific information … Over the course of its lifetime, a request will split into multiple sub-operations, which may rejoin, in much the same way nested transactions do. Recovering from a failed idempotent sub-operation entails simply reissuing it; for non-idempotent operations, the system can either roll them back, apply compensating operations, or tolerate the inconsistency resulting from a retry. Such transparent recovery of the request stream can hide intra-system component failures from the end user.
That’s really surprising, and it gives further justification for the need for standard meta-attributes to embellish message passing.
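As a rough illustration of the self-describing request idea quoted above, here is a small Python sketch. The `Request` fields, the `ComponentCrashed` exception, and the helper functions are all my own assumptions, not anything defined in the paper. In particular, I model the TTL as a count of remaining attempts rather than wall-clock time, purely to keep the example deterministic.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Request:
    operation: str
    state: dict       # all state and context needed to process the request
    idempotent: bool  # meta-attribute: is it safe to simply reissue?
    ttl: int          # meta-attribute: remaining attempts (an assumption of this sketch)


class ComponentCrashed(Exception):
    """Raised when the (simulated) component dies while handling a request."""


def process(request: Request, component: Callable[[Request], str]) -> str:
    """Reissue an idempotent request until it succeeds or its TTL runs out."""
    current = request
    while current.ttl > 0:
        try:
            return component(current)
        except ComponentCrashed:
            if not current.idempotent:
                # The caller must roll back, compensate, or tolerate the inconsistency.
                raise
            current = replace(current, ttl=current.ttl - 1)
    raise TimeoutError("request TTL expired")


def make_flaky_component(failures: int) -> Callable[[Request], str]:
    """Hypothetical component that crashes `failures` times, then works from the request's explicit state."""
    remaining = {"n": failures}

    def component(req: Request) -> str:
        if remaining["n"] > 0:
            remaining["n"] -= 1
            raise ComponentCrashed()
        return f"{req.operation}:{req.state['cart_id']}"

    return component
```

Because the request carries its own state, the component holds nothing that is lost in a crash, and the retry logic needs only the two meta-attributes to decide what to do. For example, `process(Request("checkout", {"cart_id": 7}, True, 3), make_flaky_component(2))` survives two crashes and still completes.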