Design Patterns for Almost-Infinite Scalability


Pat Helland has written a very illuminating paper entitled “Life beyond Distributed Transactions: an Apostate’s Opinion”. In a former life, Pat Helland worked on TP monitors (the precursor to EJB) at Tandem and advocated SOA (the precursor to WS-*) at Microsoft. A few people have remarked that Pat’s paper shows that he rediscovered ReST the ‘hard way’. Pat never mentions ReST anywhere in his paper, but the similarities are apparent. However, there are other interesting observations in it that are worth noting down. These observations, I believe, take an additional step beyond the original ReST principles.

As a side note, I’ve been an advocate of ReST for years. However, I don’t believe it’s the be-all and end-all of building web-scale applications. A lot of useful patterns are being discovered every day that aren’t incompatible with ReST, but rather complement it.

First of all, I completely subscribe to the premise of his title: distributed transactions are, as he calls them, a ‘Maginot Line’. I’ve made similar arguments about transactions in a past blog entry. Pat Helland’s paper reveals some of the patterns for building what he calls ‘almost-infinite scalability’:

  • Entities are uniquely identified – each entity, which represents disjoint data (i.e. no overlap of data between entities), should have a unique key.
  • Multiple disjoint scopes of transactional serializability – in other words, each ‘entity’ is its own scope of serializability, and you cannot perform atomic transactions across entities.
  • At-Least-Once messaging – that is, an application must tolerate message retries and out-of-order arrival of messages.
  • Messages are addressed to entities – that is, the business logic cannot abstract away the existence of the unique keys used to address entities. Addressing, however, is independent of location.
  • Entities manage conversational state per party – that is, to ensure idempotency, an entity needs to remember that a message has been previously processed. Furthermore, in a world without atomic transactions, outcomes need to be ‘negotiated’ using some kind of workflow capability.
  • Alternate indexes cannot reside within a single scope of serializability – that is, one can’t assume that the indices or references to entities can be updated atomically. There is the potential that these indices may become out of sync.
  • Messaging between Entities is Tentative – that is, entities need to accept some level of uncertainty: messages that are sent are requests for commitment and may possibly be cancelled.
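
To make a few of these patterns concrete, here is a minimal Python sketch of an entity that is addressed by a unique key and keeps per-party conversational state so that at-least-once delivery stays idempotent. It is not taken from Pat's paper: the names `Message`, `AccountEntity`, `sender`, `msg_id` and `amount` are all hypothetical, and the sketch assumes each sender numbers its own messages.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    sender: str      # key of the sending party (hypothetical field)
    msg_id: int      # per-sender message number, assumed assigned by the sender
    entity_key: str  # unique key of the target entity, independent of location
    amount: int      # illustrative payload


@dataclass
class AccountEntity:
    key: str
    balance: int = 0
    # Conversational state per party: sender key -> set of message ids applied.
    seen: dict = field(default_factory=dict)

    def handle(self, msg: Message) -> bool:
        """Apply msg at most once; return False for a duplicate delivery."""
        if msg.entity_key != self.key:
            raise ValueError("message addressed to a different entity")
        applied = self.seen.setdefault(msg.sender, set())
        if msg.msg_id in applied:
            return False  # a retry of a message that was already processed
        self.balance += msg.amount
        applied.add(msg.msg_id)
        return True


account = AccountEntity(key="acct-42")
deliveries = [
    Message("order-7", 2, "acct-42", 5),   # arrives out of order
    Message("order-7", 1, "acct-42", 10),
    Message("order-7", 2, "acct-42", 5),   # retried delivery
]
results = [account.handle(m) for m in deliveries]
print(results, account.balance)  # [True, True, False] 15
```

The balance update and the record of the processed message id both live inside the one entity, which is the single scope of serializability. Out-of-order arrival is harmless here only because addition commutes; an operation that depends on ordering would need more conversational state than a set of ids.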

These are interesting observations that jibe quite well with my table of loosely coupled APIs, as well as my previous arguments about distributed transactions and the need for provisional information and contracts.
I believe that these patterns can be implemented on top of ReST in the manner I describe in my Speech Acts blog entry.
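
As a rough illustration of tentative messaging and provisional information, the sketch below treats a request as a reservation that can later be confirmed or cancelled, rather than as an atomic commit. The class `InventoryEntity`, its method names and the three states are assumptions made for this example; they are not prescribed by the paper or by ReST.

```python
class InventoryEntity:
    """A hypothetical entity that accepts tentative requests for stock."""

    def __init__(self, key: str, stock: int):
        self.key = key
        self.available = stock
        # Reservation id -> {"qty": int, "state": "tentative"|"confirmed"|"cancelled"}
        self.reservations = {}

    def reserve(self, rid: str, qty: int) -> bool:
        """Tentatively hold qty; a retried request is not applied twice."""
        if rid in self.reservations:
            return self.reservations[rid]["state"] != "cancelled"
        if qty > self.available:
            return False  # request declined, nothing is held
        self.available -= qty
        self.reservations[rid] = {"qty": qty, "state": "tentative"}
        return True

    def confirm(self, rid: str) -> bool:
        """Turn a tentative hold into a commitment."""
        res = self.reservations.get(rid)
        if res is None or res["state"] == "cancelled":
            return False
        res["state"] = "confirmed"
        return True

    def cancel(self, rid: str) -> bool:
        """Release a tentative hold; return True only if this call released it."""
        res = self.reservations.get(rid)
        if res is None or res["state"] != "tentative":
            return False
        res["state"] = "cancelled"
        self.available += res["qty"]
        return True


inventory = InventoryEntity("sku-1", stock=3)
inventory.reserve("r1", 2)   # tentative hold
inventory.reserve("r1", 2)   # retried message, no second hold
inventory.cancel("r1")       # the requester backs out
print(inventory.available)   # 3
```

Each step is a local decision inside one entity, so no transaction spans the requester and the inventory; the uncertainty is carried by the explicit `tentative` state instead.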

Pat Helland now works at Amazon. I’m unaware of how much influence he has had on the development of S3, but it’s interesting to contrast his patterns with some of the design principles of that product:

  • Decentralization: Use fully decentralized techniques to remove scaling bottlenecks and single points of failure.

  • Asynchrony: The system makes progress under all circumstances.

  • Autonomy: The system is designed such that individual components can make decisions based on local information.

  • Local responsibility: Each individual component is responsible for achieving its consistency; this is never the burden of its peers.

  • Controlled concurrency: Operations are designed such that no or limited concurrency control is required.

  • Failure tolerant: The system considers the failure of components to be a normal mode of operation, and continues operation with no or minimal interruption.

  • Controlled parallelism: Abstractions used in the system are of such granularity that parallelism can be used to improve performance and robustness of recovery or the introduction of new nodes.

  • Decompose into small well-understood building blocks: Do not try to provide a single service that does everything for every one, but instead build small components that can be used as building blocks for other services.

  • Symmetry: Nodes in the system are identical in terms of functionality, and require no or minimal node-specific configuration to function.

  • Simplicity: The system should be made as simple as possible (but no simpler).

There is clearly an emerging consensus on the architecture of ‘almost-infinitely scalable’ applications. Furthermore, insights from the experiences of Google, Amazon, Yahoo and eBay are shedding a lot of light on this subject.
