A Pattern Language for High Scalability

A couple of years ago (i.e. 2007), I wrote a short blog entry commenting on Pat Helland’s paper “Life beyond Distributed Transactions: an Apostate’s Opinion” (worthy of a second and third read). I found it curious that it was rediscovered by highscalability.com (see: “7 Design Patterns for Almost Infinite Scalability”). Highscalability.com is a treasure trove of implementation ideas for achieving high scalability, but it made me wonder whether anyone else had created a pattern language for high scalability. I have seen a few attempts, and this entry is a quick attempt to extend those and conjure up a new one. Hopefully it serves as a good starting point for further refinement and improvement.

At the most abstract level there is Daniel Abadi’s PACELC classification for distributed systems. IMHO, PACELC, as compared to Brewer’s CAP theorem, is a more pragmatic description of the trade-offs one makes when designing a distributed system. PACELC asks: if there is a network (P)artition, does the system favor (A)vailability or (C)onsistency; (E)lse, in the normal state, does it favor (L)atency or (C)onsistency?
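
As a rough illustration, the two PACELC questions can be captured as a pair of choices. This is only a sketch: the class names and the two example stores are hypothetical and do not describe any real product.

```python
from dataclasses import dataclass
from enum import Enum


class OnPartition(Enum):
    AVAILABILITY = "A"
    CONSISTENCY = "C"


class Otherwise(Enum):
    LATENCY = "L"
    CONSISTENCY = "C"


@dataclass(frozen=True)
class Pacelc:
    """The two choices PACELC asks about a distributed system."""
    on_partition: OnPartition  # what is favored during a network partition
    otherwise: Otherwise       # what is favored in normal operation

    def label(self) -> str:
        # Conventional shorthand such as "PA/EL" or "PC/EC".
        return f"P{self.on_partition.value}/E{self.otherwise.value}"


# Hypothetical systems, purely for illustration.
eventually_consistent_store = Pacelc(OnPartition.AVAILABILITY, Otherwise.LATENCY)
strictly_consistent_store = Pacelc(OnPartition.CONSISTENCY, Otherwise.CONSISTENCY)

print(eventually_consistent_store.label())  # PA/EL
print(strictly_consistent_store.label())    # PC/EC
```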

Cameron Purdy (founder of Oracle’s Coherence product) has a presentation where he proposes these building blocks for scaling-out:

  • Routing
  • Partitioning
  • Replication (for Availability)
  • Coordination
  • Messaging

This short list is rumored to cover every distributed system that can be encountered in the wild. If I apply PACELC to this classification, I can select Routing, Replication and Coordination techniques that favor either Consistency or Availability. Likewise, I can select Routing, Coordination and Messaging techniques that favor either Latency or Consistency.

Jonas Bonér, of whom I have been a big fan for a very long time (see: AspectWerkz), has a great slide deck that enumerates in detail the existing techniques for achieving scalability, with availability and stability thrown in for good measure. Shown below is how this list may be mapped into Purdy’s classification (I have taken the liberty of refining the original classification). I’ve marked which trade-off is favored, either Latency or Consistency, where I thought it made sense.

  • State Routing
    • Distributed Caching (Latency)
    • HTTP Caching (Latency)
  • Behavior Routing
    • Fire-forget (Latency)
    • Fire-Receive-Eventually (Latency)
    • ESB
    • Event Stream Processing (Latency)
    • CQRS (Consistency)
    • Dynamic Load Balancing
  • Behavior Partitioning
    • Loop Parallelism
    • Fork/Join
    • Map/Reduce
    • Round Robin Allocation
    • Random Allocation
    • Weighted Allocation
  • State Partitioning (Favors Latency)
    • Distributed Caching
    • HTTP Caching
    • Sharding
  • State Replication (Favors Availability in Partition Failure)
    • Master Slave-Synchronous (Consistency)
    • Master Slave-Asynch (Latency)
    • Master Master-Synchronous (Consistency)
    • Master Master-Asynch (Latency)
    • Buddy Replication-Synchronous (Consistency)
    • Buddy Replication-Asynch (Latency)
  • State Coordination
    • Message Passing Concurrency (Latency)
    • Software Transactional Memory (Consistency)
    • Shared State Concurrency (Consistency)
    • Service of Record (Consistency if Synchronous)
  • Behavior Coordination
    • SIMD
    • Master/Worker
    • Message Passing Concurrency
    • Dataflow Architecture
    • Tuple Space
    • Request Reply
  • Messaging
    • Publish-Subscribe (Latency)
    • Queuing (Consistency)
    • Request Reply (Latency)
    • Store-Forward (Consistency)

The trade-off between Consistency and Availability arises in the implementation of Replication, through the choice of a Synchronous versus an Asynchronous Messaging (or even Coordination) approach. Employing Partitioning favors Latency and never Consistency (this should be obvious). The remaining patterns of Routing, Coordination and Messaging provide the flexibility to choose either Latency or Consistency.
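
To make the Synchronous versus Asynchronous choice concrete, here is a minimal in-memory sketch. The `Primary` and `Replica` classes are hypothetical, and `flush()` merely stands in for whatever background process would ship writes in a real system.

```python
from collections import deque


class Replica:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class Primary:
    """Hypothetical primary node that replicates writes to one replica."""

    def __init__(self, replica, synchronous):
        self.data = {}
        self.replica = replica
        self.synchronous = synchronous
        self.pending = deque()  # writes not yet shipped (async mode only)

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # The caller waits for the replica: consistent, but slower.
            self.replica.apply(key, value)
        else:
            # The caller returns immediately: fast, but the replica lags.
            self.pending.append((key, value))

    def flush(self):
        # Stand-in for a background replication process.
        while self.pending:
            self.replica.apply(*self.pending.popleft())


sync_replica, async_replica = Replica(), Replica()
Primary(sync_replica, synchronous=True).write("k", 1)
async_primary = Primary(async_replica, synchronous=False)
async_primary.write("k", 1)

print(sync_replica.data)   # {'k': 1}
print(async_replica.data)  # {} until flush() runs
async_primary.flush()
print(async_replica.data)  # {'k': 1}
```

The asynchronous variant answers the caller sooner, at the price of a window in which a read from the replica returns stale data.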

For now this appears to be a workable starting point, although there’s a lot of room for improvement. For example, in the Replication category, Master-Master, or the more general form of Buddy Replication, clearly favors Consistency at the cost of Latency regardless of the choice of Synchronous or Asynchronous messaging and coordination strategy. I think the article “Concurrency Controls in Data Replication” provides a better classification of replication techniques.

There are also some inconsistencies that need further refinement. For example, the Fire and Forget Routing strategy appears to favor Latency in the sense that it is non-blocking (see: “Scalability Best Practices: Lessons from eBay”); however, the messaging pattern may involve a queue, which clearly favors Consistency over Latency. So it favors Latency from the caller’s perspective, but Consistency from the receiver’s side (i.e. everything is serialized). In general one may say that decoupling (or loose coupling) favors Latency while tight coupling favors Consistency. As an example, optimistic concurrency is loosely coupled and therefore favors Latency.
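
The optimistic concurrency example can be sketched as a version check at write time. This is a single-threaded illustration with a hypothetical `VersionedStore`; a real store would need to make the check-and-set step atomic.

```python
class VersionConflict(Exception):
    pass


class VersionedStore:
    """Hypothetical in-memory store with optimistic concurrency control."""

    def __init__(self):
        self._items = {}  # key -> (version, value)

    def read(self, key):
        return self._items.get(key, (0, None))

    def write(self, key, value, expected_version):
        # No lock is held between read and write; the writer only
        # proves that nothing changed since it last read.
        current_version, _ = self.read(key)
        if current_version != expected_version:
            raise VersionConflict(key)
        self._items[key] = (current_version + 1, value)


store = VersionedStore()
version, _ = store.read("cart")
store.write("cart", ["book"], expected_version=version)  # succeeds

try:
    store.write("cart", ["pen"], expected_version=version)  # stale version
except VersionConflict:
    print("conflict detected, caller must re-read and retry")
```

Readers and writers never block each other, which is where the latency benefit comes from; the cost is that a conflicting writer has to redo its work.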

To summarize, a lot of techniques have been developed over the past few decades. Concepts like Dataflow and Tuple Spaces and many other Parallel Computation techniques have been known since the ’70s. The question an architect can ask today, however (which wasn’t asked back then), is which technique to use given the trade-offs defined by PACELC. The shortcoming of this pattern language is that it does not provide a prescription for how to achieve high scalability. It only provides the patterns one would find in a high scalability system.

The selection of the architecture should be clearly driven by the use cases and requirements. That is, consider vertical (see: “Nuggets of Wisdom from eBay’s Architecture”) as well as horizontal partitioning. Finally, unless a service has a limited set of use cases, one can’t expect to build a one-size-fits-all architecture in the domain of high scalability.

P.S. I recently stumbled upon a very impressive paper by James Hamilton from Microsoft’s Live.com. He writes about the important considerations when designing a high scalability system from the operational perspective. This kind of insight is extremely hard to come by. Not many software developers have the intuition to understand what goes on in the data center. In my next entry, I’ll attempt to incorporate some of Hamilton’s ideas to improve this pattern language.

Some More SOA Design Patterns

I did some more googling around and uncovered a couple more noteworthy SOA patterns. These are from the following sources:

  • Agent Itinerary – Objectifies agent itineraries and routing among destinations.
  • Forward – Provides a way for a host to forward newly arrived agents automatically to another host
  • Ticket – Objectifies a destination address, and encapsulates the quality of service and permissions that are needed to dispatch an agent to a host address and execute it there
  • Delegation – The debtor of a commitment delegates it to a delegatee who may accept the delegation, thus creating a new commitment with the delegatee as the new debtor.
  • Escalation – Commitments may be canceled or otherwise violated. Under such circumstances, the creditor or the debtor of the commitment may send escalations to the context Org.
  • Preemption – To cancel a commitment based on conflicting demands.
  • Barrier – Guards an action and specifies (pre)conditions on its execution
  • Co-location – Two or more resources are to be co-located at a certain time and place for a specified duration.
  • Correspondence – Relating two pieces of information each owned by a different participant
  • Deadline – Some information is required for an action before a certain time after which an alternate action is taken
  • Expiration – Some information will become invalid at a certain point in time (not shown in figure)
  • Notification – On-state-change “pushing” of information to enforce Correspondence.
  • Query – On-demand periodic polling of information to enforce Correspondence
  • Retry – Retrying an action a number of times before resorting to an alternate action
  • Selection – Choosing from among similar service offerings from multiple participants according to some criteria
  • Solicitation – Gathering information about service offerings from participants
  • Token – Issuing a permission for executing an action to other participants
  • Saga – How can we get transaction-like behavior for complex interactions between services without transactions?
  • Obligation Management – Allow obligations relating to data processing to be transferred and managed when the data is shared
  • Sticky Policies – Bind policies to the data they refer to
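
As one illustration from the list above, the Saga idea is commonly sketched as a sequence of steps, each paired with a compensating action that undoes it. The `run_saga` helper and the seat/payment steps below are hypothetical names, not taken from any of the sources.

```python
def run_saga(steps):
    """Run (action, compensation) pairs; undo completed steps on failure."""
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            # Roll back in reverse order using the compensations.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True


log = []


def fail_payment():
    raise RuntimeError("payment declined")


ok = run_saga([
    (lambda: log.append("reserve seat"), lambda: log.append("release seat")),
    (fail_payment, lambda: log.append("refund payment")),
])
print(ok, log)  # False ['reserve seat', 'release seat']
```

No distributed transaction is involved; each service commits locally and the saga restores a consistent outcome after the fact.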

A couple of them are redundant with other patterns in other texts. You can find these patterns here:

In Search of a Pattern Language for SOA Intrinsic Interoperability

Any good Pattern Language should be based on a well-defined set of primitives (i.e. basic building blocks). Architectures and Design Patterns (referred to in the GOF book as micro-architectures) require a clear definition of constraints to be of any real value. Roy Fielding, for instance, defines REST in terms of constraints. In stark contrast, most SOA definitions that one can find, including the OASIS standard definition, fail to define the architectural constraints.

In previous posts I have formulated a set of attributes that provide the definition of Services. I further refined those to this current definition:

A Service Oriented approach satisfies the following:

  1. Decomposability – The approach helps in the task of decomposing a business problem into a small number of less complex subproblems, connected by a simple structure, and independent enough to allow further work to proceed independently on each item.
  2. Composability – The approach favors the production of Services which may then be freely combined with each other and produce new systems, possibly in an environment quite different from the one in which they were initially developed.
  3. Understandability – The approach helps produce software in which a human reader can understand each Service without having to know the others, or, at worst, by having to examine only a few of the others.
  4. Continuity – The approach yields a software architecture in which a small change in the problem specification will trigger a change of just one Service, or a small number of Services.
  5. Protection – The approach yields a software architecture in which the effect of an abnormal condition occurring at run time in a Service will remain confined to that Service, or at worst will only propagate to a few neighboring Services.
  6. Introspection – The approach yields an architecture that supports the search and inspection of data about Services (i.e. Service Meta-data).
  7. Remoteability – The approach yields an architecture that enables interaction between Services that reside in separate physical environments.
  8. Asynchronicity – The approach yields an architecture that does not require an immediate response from a Service interaction. In other words, it assumes that latency exists in either the network or the invoked Service.
  9. Document Orientedness – The approach yields an architecture where the messages exchanged in Service-to-Service interactions are explicitly defined and shared, and where there is no implicit state sharing between interactions.
  10. Decentralized Administration – The approach yields an architecture that does not assume a single administrator for all Services.

This extends Bertrand Meyer’s definition of Modularity. You can look at my previous post entitled “SOA and Modularity” to see how this compares with other definitions of SOA.

Now if we were to consult the “SOA Manifesto” and its value system, we could derive the following goal: “We believe in building modular systems through intrinsic interoperability and evolutionary refinement to achieve business value and satisfy strategic goals”. The key ingredient in this statement that is left ambiguous is “Intrinsic Interoperability”, and the key question for anyone employing SOA is how to achieve it. Modularity and Evolutionary Refinement are well-understood principles; Intrinsic Interoperability is not. One may believe that interoperability can be achieved by simply mandating a global standard. This can work in theory, but it rarely does in practice. Centralized planning is rarely a scalable approach; evolutionary refinement in fact demands a decentralized approach. The question one needs to ponder is how to build interoperable systems employing a decentralized approach. In the literature I have surveyed, I have yet to find a cohesive treatment of how this can be done.

Over the past decade many Design Patterns have been proposed to address the concerns that are introduced within a Service Oriented Architecture. The most notable collections have been the following:

I’ve taken the trouble to comb through these patterns and to identify which ones lead to improved intrinsic interoperability. One of the challenges in developing a pattern language is the creation of a categorization that covers the entire collection.

Service Identification Patterns

  • Dynamic Discovery – When a Service joins a network it might not have any knowledge about which other Services are available.
  • Absolute Object Reference – The notion of an identifier to a service that can be exchanged by other services and used to invoke the original service is a key ingredient for Service mobility.
  • Lookup – A Service is selected based on the query of Services in a directory. Provides an additional layer of indirection in identifying services.
  • Referral – A Service is selected based on the consultation of another Service. The difference from the previous pattern is that another service is responsible for making the selection.
  • Proxy – A service communicates with another service whose identity it does not have or that is unreachable.

Service Dependency Patterns

  • Termination Notification – A mechanism to indicate when a Service becomes permanently unavailable is necessary to manage the evolution of Services.
  • Lease Renewal – This mechanism is similar to the previous one; however, the onus is placed on the consuming service to renew its dependency.
  • Reminder – Removes the requirement for a Service to maintain its own scheduling service.
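
A minimal sketch of the Lease Renewal idea from the list above, assuming a hypothetical `LeaseRegistry` in which consumers must renew before a fixed time-to-live elapses. Time is passed in explicitly so the example stays deterministic.

```python
class LeaseRegistry:
    """Hypothetical registry where consumers must renew their leases."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._expires_at = {}  # consumer id -> expiry timestamp

    def renew(self, consumer, now):
        # The onus is on the consumer to call this before expiry.
        self._expires_at[consumer] = now + self.ttl

    def is_active(self, consumer, now):
        return self._expires_at.get(consumer, float("-inf")) > now

    def reap(self, now):
        # Drop every dependency whose lease was not renewed in time.
        expired = [c for c, t in self._expires_at.items() if t <= now]
        for consumer in expired:
            del self._expires_at[consumer]
        return expired


registry = LeaseRegistry(ttl_seconds=30)
registry.renew("billing-service", now=0)
print(registry.is_active("billing-service", now=10))  # True
print(registry.reap(now=31))                          # ['billing-service']
```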

Service Extension Patterns

  • Invocation Interceptor – Provides the capability of dynamically introducing new Service functionality.
  • Invocation Context – Permits new Service functionality to be added that is dependent on invocation context rather than Service definition
  • Protocol Plug-in – Provides an explicit mechanism for introducing a new communication protocol to an existing Service.
  • Location Forwarder – A specialization of Invocation Interceptor where the Forwarder sends an invocation to another Service.
  • Delegation – Where a Service allocates a task previously allocated to it to another Service.
  • Escalation – Where a Service attempts to progress a work item that has stalled by offering it to another Service.
  • Deallocation – Where a Service makes a previously started task available for offer and subsequent distribution.
  • Reallocation – Where a Service allocates a task that it has started to another Service. Can be stateful, where the current state of the task is retained, or stateless, where the task is restarted.
  • Suspension/resumption – Where a Service temporarily suspends execution of a task or recommences execution of a previously suspended task.

Service Negotiation Patterns: The Customer and Performer negotiate until they reach an agreement (commitment) about the work to be fulfilled.

  • Receiver Cancels – Receiving Service can cancel within certain timeframe.
  • Sender Cancels / Contingent Request – Sending Service can cancel within certain timeframe
  • Binding Request – A sending party sends an offer that it will agree to if the receiving party accepts.
  • Binding Offer – A sending party requests an offer, and the receiving party responds with one.
  • Resource-Initiated Allocation – The ability for a resource to commit to undertake a work item without needing to commence working on it immediately.
  • Resource-Initiated Execution – Offered Work Item – The ability for a resource to select a work item offered to it and commence work on it immediately.
  • Resource-Determined Work Queue Content – The ability for resources to specify the format and content of work items listed in the work queue for execution.
  • Selection Autonomy – The ability for resources to select a work item for execution based on its characteristics and their own preferences.

Service Performance Patterns: The Performer fulfills the agreement.

  • Role-Based Distribution – The selection of a service to perform a task is based on the role of a service.
  • Deferred Distribution – The selection of a service to perform a task is deferred to the time of the request.
  • Case Handling – The selection of a service to perform a task is based on the case of the request.
  • Capability-Based Distribution – The selection of a service to perform a task is based on the capability of the service.
  • History-Based Distribution – The selection of a service to perform a task is based on the Service’s handling history.
  • Organisational Distribution – The selection of a service to perform a task is based on the relationship of the service with other services.
  • Two Phase Execution – A service sends plan information prior to the start of execution.
  • Prepare to Start / Start – A service waits for a permission to start prior to the start of execution.
  • Interleaved Parallel Routing – A partial ordering of tasks is defined, and the tasks can be executed in any order that conforms to the partial ordering.
  • Deferred Choice – A point in a process where one of several branches is chosen based on interaction with the operating environment.

Service Reporting Patterns – The performer reports on the status of the execution of the agreement.

  • Fire-and-Forget – Invoke a Service without expecting a response.
  • Request-Response with Retry – Invoke a Service with the expectation that a retry does not alter the semantics of the previous invocation.
  • Polling – Periodically invoke a Service to derive status.
  • Subscribe-Notify – Subscribe to a Service to receive future notifications.
  • Quick Acknowledgment
  • Sync with Server – Provide a mechanism to synchronize with a Server’s data.
  • Result Callback – Provide a mechanism for the invoked Service to asynchronously return a response.
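
For Request-Response with Retry, one common way to ensure that a retry does not alter the semantics of the previous invocation is to deduplicate on a request id. The sketch below assumes that approach; `PaymentService`, `call_with_retry` and the request id are all hypothetical.

```python
class PaymentService:
    """Hypothetical service that deduplicates requests by request id."""

    def __init__(self):
        self.charges = []
        self._responses = {}  # request id -> response already produced

    def charge(self, request_id, amount):
        if request_id in self._responses:
            # A retry of a request we already handled: replay the answer.
            return self._responses[request_id]
        self.charges.append(amount)
        response = f"charged {amount}"
        self._responses[request_id] = response
        return response


def call_with_retry(operation, attempts=3):
    """Call operation until it succeeds or the attempts run out."""
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except ConnectionError as error:
            last_error = error
    raise last_error


service = PaymentService()
calls = {"n": 0}


def flaky_charge():
    # Simulates a lost reply: the first call is applied but then fails.
    calls["n"] += 1
    response = service.charge("req-1", 25)
    if calls["n"] == 1:
        raise ConnectionError("response lost")
    return response


result = call_with_retry(flaky_charge)
print(result, service.charges)  # charged 25 [25]
```

The caller retried once, yet only one charge was recorded.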

Service Acceptance (Satisfaction) – The Customer evaluates the work and either declares satisfaction or points out what remains to be done to fulfill the agreement.

  • Retry
  • Compensating Action

Clearly there’s a lot of interesting literature out there that can provide a lot of insight into the interoperation of Services. The above list is just a rough sketch and I’m hoping to provide a more cohesive set over time.

TBD: Conversation join, Conversation refactor, Initiate conversation, Follow conversation,
Leave conversation, Atomic consumption.

SOA Design Patterns Book – A Review

The problem with SOA is that it has always been too abstract. The SOA Manifesto that was signed in late 2009 confirms this. No longer is SOA a set of technologies or even a set of standards; it is simply the architecture that arises from applying service orientation, which is defined as building modular systems via evolutionary refinement to achieve business goals. This definition is a bit too abstract for me. Astonishingly, that’s as good a definition as you can find in the manifesto.

So I decided to dig deeper; possibly I could glean some knowledge by going through patterns discovered in practice and documented in the book “SOA Design Patterns” by Thomas Erl. This is a massive book with over 800 pages; the patterns in the book can also be found at www.soapatterns.org. The problem I have with almost all SOA books is that SOA is discussed in a manner that reveals little differentiation from any other distributed processing model.

The first hundred pages of the book cover introductory material on SOA and Design Patterns. There’s nothing new here that you can’t find in other books on the subjects. So let’s dive straight into the meat of the book, the Design Patterns themselves.

I’m a big fan of Design Patterns; however, I just abhor it when authors define a Design Pattern that is obviously implied by the domain they are defining patterns for. For example, if we take the Object Oriented Programming (OOP) domain, Polymorphism is not a Design Pattern, it is an attribute of OOP. When I see these kinds of patterns, it’s an indicator to me of a lack of rigor in vetting these patterns.

In the original GOF book, the OOP design patterns are categorized into 3 sets: Behavioral, Creational and Structural. In this book the categories are Service Inventory, Service Design and Service Composition. The “Service Inventory” category is the most difficult to grasp simply because it is too abstract and its definitions are very weak. The Service Design category covers concerns revolving around the design of a service by itself. The Service Composition category covers concerns about how services are composed together and how they interact. Service Inventory, however, seems to be describing services at a meta-level. That is, how would one describe services?

The book is a very difficult read because it avoids the more concise terminology commonly used in other computer science texts. Furthermore, it employs pattern names that, although they sound familiar, can lead to a lot of confusion. In my attempt to understand the book, I will be relating its Design Patterns to more commonly understood computer science terminology.

The first category of patterns, named “Service Inventory Patterns”, covers ways in which Services are to be described. It can be a bit confusing discussing ideas at a meta-level, or in other words attempting to describe how you describe things. That’s the main flaw of this section: it is not made apparent what is being talked about.

Chapter 6 covers “Foundational Inventory Patterns”. These patterns are simply about recording and categorizing services. The “Enterprise Inventory Pattern” says that services should be recorded in an inventory; said inventory can be further categorized into different domains (i.e. “Domain Inventory Pattern”) and into various interacting layers (“Service Layers”). Each service can be normalized to minimize overlap in functionality (i.e. “Service Normalization Pattern”), making sure to avoid redundant logic (“Logic Centralization Pattern”). A standard protocol (“Canonical Protocol Standard”) and standard schemas (“Canonical Schema”) may be defined in the inventories. There is nothing really informative in this chapter; it’s all about bookkeeping. Maintaining a metadata repository to track a system’s artifacts is nothing new. I personally would have condensed this into a “Meta Service Pattern” and shoved all the different aspects into a single pattern.

Chapter 7 covers “Logical Inventory Layer Patterns”, which in my opinion simply talks about the kinds of services that may be implemented (really just another categorization). That is, one can talk about Utility, Entity and Process focused services.

Chapter 8 covers “Inventory Centralization Patterns”. In general Processes, Schemas, Policies and Rules (which incidentally are all meta-data) can be positioned in a central location so as to avoid duplicate and inconsistent definitions. I would have just called this “Source of Truth Pattern”.

Chapter 9 covers “Inventory Implementation Patterns”, which would mean something along the lines of how you would implement ‘meta data’. Unfortunately I fail to see the logic behind why the patterns in this chapter are collected in this category. The category seems to consist mostly of patterns involving the sharing of compute resources across multiple services. The first pattern, “Dual Protocols”, doesn’t really belong here; it really is about supporting more than one protocol for a given service. I would in fact rename this a “Service Virtualization” pattern. “Canonical Resources” is about providing standard interfaces to compute resources. “State Repository” is about providing a utility service for storing service state. “Stateful Services” is, well, about Services that maintain their own state. “Service Grid” is some kind of service fabric that provides high scalability and fault tolerance for services that require state. I don’t know why this is a pattern; it seems to be more of a technology. “Inventory Endpoint” is a kind of service that acts like a facade to multiple services. “Cross Domain Utility Layer” provides utility services that span multiple domains, though this pattern seems to be a replay of a previously mentioned layering pattern.

Chapter 10 covers “Inventory Governance Patterns”, which would mean managing ‘meta data’. “Canonical Expression” states that there should be a standard way of defining contracts. “Metadata Centralization” states that there should be a registry to store services for discovery. I would rename this pattern “MetaData Discovery” to disambiguate it from the “Inventory Centralization Patterns”. The key point here is that meta-data should be discoverable by the services within the system. “Canonical Versioning” states that there should be a standard way of defining versions of services; this pattern in fact overlaps with a later pattern that describes the idea that there should be a language for versioning.

The next set of chapters covers Service Design.

Chapter 11 covers “Foundational Service Patterns”. The problem I have with this chapter is that it talks about fundamental concepts, which is apparently difficult to differentiate from talking about meta-data. In other words, if I can describe my vocabulary then I am in essence defining the foundations of what I’m describing. The chapter includes patterns that one would assume to be all too obvious. For example, the “Functional Decomposition” pattern states that a problem can be broken down into smaller problems. The inclusion of this kind of pattern is just plain absurd. There is “Service Encapsulation”, which has a misleading name; it is about exposing existing logic as a service that can be used outside of its original context. Here again Erl continues to state the obvious through complex pattern definitions. Finally there are two patterns, “Agnostic Context” and “Non-Agnostic Context”, which are all about identifying multi-purpose or single-purpose services. This chapter seems completely pointless in my opinion.

Chapter 12 covers “Service Implementation Patterns”. This is where there is finally some meat on the bones. However, some of the patterns here are miscategorized in that they are more about Service Composition; for example “Service Facade”, “Redundant Implementation” and “Service Data Replication” should be in the Service Composition category. In fact, it would have made better sense to categorize these patterns and those in chapter 9 under “State Handling Patterns”. The “Partial State Deferral” pattern is indeed an implementation detail of a service, in how it manages its runtime state. I personally am a bit ambivalent about SOA design patterns that concern themselves with resource optimization. These kinds of patterns belong elsewhere. The “Partial Validation” pattern permits services to focus on what’s relevant in the data and ignore the rest. This is a very useful capability that supports both versioning and interoperability. The “UI Mediator” pattern is likely the most unique pattern I’ve found in this book. It is about providing a mechanism for giving a user timely feedback on the progress of a service execution.
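
A small sketch of what “Partial Validation” can look like in practice, under the assumption that a consumer checks only the fields it depends on. The field names and the `partial_validate` helper are hypothetical and not taken from the book.

```python
REQUIRED_FIELDS = {"order_id": str, "quantity": int}  # hypothetical contract


def partial_validate(message):
    """Validate and keep only the fields this consumer needs."""
    relevant = {}
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in message:
            raise ValueError(f"missing field: {name}")
        if not isinstance(message[name], expected_type):
            raise ValueError(f"bad type for field: {name}")
        relevant[name] = message[name]
    return relevant  # unknown fields are ignored, not rejected


# A newer message version with an extra field still validates.
message_v2 = {"order_id": "A-17", "quantity": 2, "gift_wrap": True}
print(partial_validate(message_v2))  # {'order_id': 'A-17', 'quantity': 2}
```

Because unknown fields are tolerated, a provider can add fields without breaking existing consumers, which is the versioning benefit mentioned above.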

Chapter 13 covers “Service Security Patterns”. This is a very coherent category in that it restricts itself to the concern of handling the security of services. This is probably one of the better chapters, and the interesting coincidence is that none of the patterns are written by Erl. In fact, as a rule of thumb, patterns that were written by someone other than Erl tend to be more valuable. I in particular have high regard for the patterns written by David Orchard; these are non-obvious and quite insightful. However, Erl at times creates a pattern that appears to be a duplicate of one of Orchard’s patterns (i.e. Version Identification) and does a very poor job of presenting it (i.e. Canonical Versioning).

Chapter 14 covers Service Contract Design Patterns. This actually is a good categorization; however, I would have chosen a different name. I would label it “Contract Coupling” patterns. “Decoupled Contract” states that a contract should be decoupled from its implementation. This is actually a good practice worth emphasizing. “Contract Centralization” states that all access to a service is through its contract. A better name would be Service Encapsulation. “Contract Denormalization” seems counter to the Service Normalization pattern of chapter 6. The pattern states that redundancy in a contract may be required to reduce demands on consumers. These kinds of patterns that appear to be in conflict with other patterns are actually the ones that can be quite insightful. However, I would rename this “Contract Redundancy”. “Concurrent Contracts” is where a service defines different kinds of contracts depending on the target consumer, again running counter to Service Normalization. Finally, there is a very interesting pattern, “Validation Abstraction”, where validation logic is made portable from the service contract.

Chapter 15 covers Legacy Encapsulation Patterns. This is yet another bad chapter. It covers the “Legacy Wrapper” pattern, which clearly is the same as “Service Encapsulation”. The “Multi Channel Endpoint” pattern is intended to support multiple user access channels (e.g. laptop, mobile, etc.), which again is expounding on the obvious. Services are meant to be shareable across multiple contexts; is it not blindingly obvious that multiple access channels would share the same service? Finally there’s the “File Gateway” pattern, which is the same thing as “Protocol Bridging”, described in a later chapter. This chapter makes me wonder about the target audience’s level of technical sophistication.

Chapter 16 covers Service Governance Patterns. The “Compatible Change” pattern, written by David Orchard, discusses how to change a contract without affecting legacy consumers. It is a good example of a well-written and insightful pattern. The same goes for the next pattern, “Version Identification”, which describes the need to define a version vocabulary that identifies the compatibility constraints between versions. The “Termination Notification” pattern is another pattern that is all too easy to forget: there should be a mechanism for contracts to express service termination information. This is the point where the chapter takes a turn for the worse. The “Service Refactoring” pattern is an obvious consequence of “Service Decoupling”, described previously. The “Service Decomposition” pattern, well, this is just a refactoring technique, and the same goes for the “Proxy Capability” pattern. Now my head is beginning to hurt. Then you find the “Decomposed Capability” pattern, with the following description: “How can a service be designed to minimize the chances of capability logic deconstruction?” I’ve simply got little patience left to figure out what is meant here. The same goes for the “Distributed Capability” pattern. I’ll likely make another effort some other day.
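
To illustrate what a version vocabulary with compatibility constraints might look like, here is a sketch that assumes a simple “major.minor” convention: a change of major number breaks consumers, while a higher minor number is a compatible change. That convention is my assumption for the example, not something the pattern itself prescribes.

```python
def parse_version(text):
    """Split a 'major.minor' string into a pair of integers."""
    major, minor = text.split(".")
    return int(major), int(minor)


def is_compatible(consumer_expects, service_provides):
    """Assumed rule: same major, and service minor >= consumer minor."""
    expected_major, expected_minor = parse_version(consumer_expects)
    provided_major, provided_minor = parse_version(service_provides)
    return provided_major == expected_major and provided_minor >= expected_minor


print(is_compatible("2.1", "2.3"))  # True: compatible change
print(is_compatible("2.1", "3.0"))  # False: breaking change
```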

The next section covers Service Composition Patterns, which discusses patterns on how to compose existing services with each other.

Chapter 17 is a chapter that borders on the absurd. The “Capability Composition” pattern is about composing services out of other services. I’m a bit confused here, for all this time I had thought that services by definition were composable. “Capability Recomposition” is again obscured with a description like “How can the same capability be used to help solve multiple problems”. This is just plain and simple service instantiation. I can’t see how this is non-obvious. The book seems to repetitively recast well-known computer science concepts as patterns and furthermore recasts these as entire chapters. The entire chapter can be summarized in one sentence: “Services can be composed of services and Services can be instantiated and invoked in multiple contexts”. I’m beginning to get the feeling that Erl has trouble understanding basic concepts like ‘instantiation’.

Chapter 18 covers “Service Messaging” patterns. Hohpe’s “Enterprise Integration Patterns” provides a much better treatment of this subject area, and I refer you to his excellent book if you’re interested in this. To be brief, the following patterns are discussed: “Service Messaging”, “Messaging Metadata”, “Service Agent”, “Intermediate Routing”, “State Messaging”, “Service Callback”, “Service Instance Routing”, “Asynchronous Queuing”, “Reliable Messaging” and “Event Driven Messaging”. The author again tries to redefine a common term: he defines “Service Messaging”, which essentially is “Asynchronous Communication”, as a pattern.

Chapter 19 covers “Composition Implementation Patterns”. I find the title confusing; I simply don’t understand what ‘Implementation’ means in this context. The chapter covers an incoherent collection of patterns: “Agnostic SubController”, “Composition Autonomy”, “Atomic Service Transaction” and “Compensating Service Transaction”. I would think that a more appropriate title for this could be “Scope of Work Patterns”. It is as if Erl structures his chapters by creating combinations of the words “foundational” and “implementation” without giving any thought as to what they mean.

Chapter 20 covers Service Interaction Security Patterns which covers an interesting collection of patterns not written by Erl.

Chapter 21 covers Transformation Patterns. The chapter covers the “Data Model Transformation” pattern and the “Data Format Transformation” pattern which were written by Erl and the “Protocol Bridging” pattern written by Mark Little. The latter pattern is clearly the generalization of the former two.

In summary, the SOA Design Patterns book isn’t structured with the same rigor and coherence as other Design Patterns books. The content is unusually wordy and repetitive. There are a lot of diagrams, but a majority of them provide little insight. The book takes well-known concepts in computer science and regurgitates them as design patterns, essentially taking what is obvious and making it obscure. Despite the poor quality of most of the book, its saving grace is the few high-quality patterns that have been submitted by contributors.

However, considering the pervasively poor quality of SOA books in general, I’m going to say it is one of the more valuable SOA books. Even if this bar is extremely low, this is one of the few SOA books where you can indeed find some true nuggets of wisdom. (The book’s website has a lot more interesting patterns that weren’t published with the book.) However, you have to dig very hard and long to find them, because the map that is provided obscures the terrain and is more of a hindrance than an aid. Read it only if you know what to look for.

So you don’t have to waste your own valuable time, I’ve collected a reference list of patterns from the book that are of some value. Included is a quick explanation of my own and an alternative and hopefully more concise name.

  1. Canonical Protocol – It is always convenient to standardize on a common protocol to reduce bridging costs. Uniform Protocol.
  2. Dual Protocol – Supporting more than one protocol increases the number of compliant clients. Virtual Service.
  3. Canonical Expression – Specifications about meta-data should be standardized to avoid the cost of translation. Canonical Metadata Language.
  4. Metadata Centralization – SOA systems should support some kind of discovery of metadata services. Metadata Discovery.
  5. Partial Validation – Services should support non-strict validation of messages. Non-strict Validation.
  6. UI Mediator – Provide a capability to receive timely feedback when monitoring a service’s execution.
  7. Exception Shielding – To ensure security the implementation details of an exception should be hidden from a client.
  8. Message Screening
  9. Trusted Subsystem
  10. Service Perimeter Guard
  11. Partial State Deferral
  12. Contract Denormalization – Redundant specifications are sometimes necessary to reduce coupling. Redundant Contract.
  13. Validation Abstraction – A language for input validation should be introspect-able to permit flexibility in where validation is performed. Introspect-able Validation.
  14. Compatible Change – Service contract changes can be performed in a way to support backward compatibility.
  15. Version Identification – A versioning vocabulary should reveal the compatibility constraints between different versions of a service. Versioning Constraints.
  16. Termination Notification – A service should have a mechanism to express its availability.
  17. Messaging Metadata – There should be a mechanism to parse information about a message without having to read the entire message. Message Envelope.
  18. Intermediate Routing
  19. State Messaging – Conversational state can be stored in the message. Conversation State Messages.
  20. Service Instance Routing – Communication between services may be routed using logic that is dependent on the content of the message. Content Based Routing.
  21. Asynchronous Queuing – A client should not depend on the services it uses being available at the moment it sends a request. Asynchronous Communication.
  22. Reliable Messaging – Clients should not have to manage the reliable delivery of a communication to its destination.
  23. Event Driven Messaging – A service may not require the knowledge of the identity of its clients. Publish and Subscribe.
  24. Compensating Service Transaction – Actions performed by services should be undoable.
  25. Data Confidentiality
  26. Data Origin Authentication – A mechanism for discovering the provenance of data is essential. Non-forgeable Provenance.
  27. Broker Authentication – An intermediate broker may be required when there is no trust between two interacting services. Trust Broker.
  28. Protocol Bridging – SOA should allow the inclusion of a protocol mediator to translate communication between services. Protocol Mediator.
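
To make one of these concrete, here is a minimal Python sketch of Exception Shielding (item 7). It is my own illustration, not code from the book: the `ServiceFault` type, the `shielded` decorator and the `lookup_customer` operation are all hypothetical names. The idea is that the full exception is logged on the service side under an incident id, while the client only ever sees a generic fault carrying that id.

```python
import functools
import logging
import uuid

logger = logging.getLogger("service")


class ServiceFault(Exception):
    """Hypothetical fault type: the only error allowed to cross the service boundary."""

    def __init__(self, code: str, incident_id: str) -> None:
        super().__init__(f"{code} (incident {incident_id})")
        self.code = code
        self.incident_id = incident_id


def shielded(operation):
    """Wrap a service operation so internal exception details never reach the client."""

    @functools.wraps(operation)
    def wrapper(*args, **kwargs):
        try:
            return operation(*args, **kwargs)
        except ServiceFault:
            raise  # already safe to expose
        except Exception:
            incident_id = uuid.uuid4().hex
            # The full traceback stays in the service-side log, keyed by incident id.
            logger.exception("incident %s in %s", incident_id, operation.__name__)
            # "from None" keeps the original error out of the client-visible chain.
            raise ServiceFault("INTERNAL_ERROR", incident_id) from None

    return wrapper


@shielded
def lookup_customer(customer_id: str) -> dict:
    # Hypothetical failure whose message would leak a host and table name.
    raise RuntimeError(f"SELECT failed on db-internal-07.customers for {customer_id}")
```

A caller of `lookup_customer` gets a `ServiceFault` whose message contains only the error code and the incident id, which support staff can use to find the real traceback in the log.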

Everything else not listed here most likely consists of fluff and is best ignored. Do let me know, however, if I mistakenly ignored a good design pattern.

Design Patterns for Almost-Infinite Scalability

Pat Helland has written a very illuminating paper entitled “Life beyond Distributed Transactions: an Apostate’s Opinion“. Pat Helland in a former life worked on TP monitors (the precursor to EJB) at Tandem and advocated SOA (the precursor to WS-*) at Microsoft. A few people have remarked that Pat’s paper shows that he rediscovered ReST the ‘hard way‘. Pat never mentions ReST anywhere in his paper, but the similarities are clearly apparent. However, there are other interesting observations that are worth noting down. These observations, I believe, take an additional step forward beyond the original ReST principles.

As a side note, I’ve been an advocate for years in favor of ReST. However, I don’t believe it’s the end-all in building web scale applications. There are a lot of useful patterns being discovered everyday that aren’t incompatible with ReST, but rather can be used complementary to it.

First of all, I completely subscribe to his thesis that distributed transactions are, as he calls them, a ‘Maginot Line’. I’ve made similar arguments about transactions in a past blog entry. Pat Helland’s paper reveals some of the patterns for building what he calls ‘almost-infinite scalability’.

  • Entities are uniquely identified – each entity, which represents disjoint data (i.e. no overlap of data between entities), should have a unique key.
  • Multiple disjoint scopes of transactional serializability – in other words, each ‘entity’ is its own scope of serializability, and you cannot perform atomic transactions across entities.
  • At-Least-Once messaging – that is, an application must tolerate message retries and out-of-order arrival of messages.
  • Messages are addressed to entities – that is, one can’t abstract away from the business logic the existence of the unique keys for addressing entities. Addressing, however, is independent of location.
  • Entities manage conversational state per party – that is, to ensure idempotency an entity needs to remember that a message has been previously processed. Furthermore, in a world without atomic transactions, outcomes need to be ‘negotiated’ using some kind of workflow capability.
  • Alternate indexes cannot reside within a single scope of serializability – that is, one can’t assume the indices or references to entities can be updated atomically. There is the potential that these indices may become out of sync.
  • Messaging between Entities is Tentative – that is, entities need to accept some level of uncertainty: messages that are sent are requests for commitment and may possibly be cancelled.
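
To illustrate how the at-least-once and conversational-state points fit together, here is a minimal Python sketch. It is my own illustration under simplifying assumptions, not code from Helland’s paper: `Message`, `AccountEntity` and `EntityStore` are hypothetical names, everything lives in memory, and the sender is assumed to reuse the same `message_id` on every retry.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    entity_key: str   # unique key of the target entity, independent of location
    sender: str       # the party on the other side of the conversation
    message_id: str   # assumed to be reused verbatim by the sender on every retry
    amount: int


@dataclass
class AccountEntity:
    key: str
    balance: int = 0
    # Conversational state per party: the message ids already applied.
    seen: dict[str, set[str]] = field(default_factory=dict)

    def handle(self, msg: Message) -> bool:
        """Apply msg at most once; return False for a duplicate delivery."""
        processed = self.seen.setdefault(msg.sender, set())
        if msg.message_id in processed:
            return False
        # In a real store both updates would commit in one entity-scoped transaction.
        self.balance += msg.amount
        processed.add(msg.message_id)
        return True


class EntityStore:
    """Routes messages by entity key; callers never learn where an entity lives."""

    def __init__(self) -> None:
        self._entities: dict[str, AccountEntity] = {}

    def deliver(self, msg: Message) -> bool:
        entity = self._entities.setdefault(msg.entity_key, AccountEntity(msg.entity_key))
        return entity.handle(msg)

    def balance(self, key: str) -> int:
        return self._entities[key].balance
```

Delivering the same message twice to an `EntityStore` changes the balance only once, so a retrying messaging layer is harmless. Note that the sketch handles duplicates only; it does not attempt ordering, cancellation of tentative requests, or keeping alternate indexes in sync.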

These are interesting observations that jibe quite well with my table of loosely coupled APIs, as well as my previous arguments about distributed transactions and the need for provisional information and contracts.
I believe that these patterns can be implemented on top of ReST in the manner I prescribe in my Speech Acts blog entry.

Pat Helland now works at Amazon. I’m unaware of his influence on the development of S3, but it’s interesting to contrast his observations with some of the design principles of that product:

  • Decentralization: Use fully decentralized techniques to remove scaling bottlenecks and single points of failure.

  • Asynchrony: The system makes progress under all circumstances.

  • Autonomy: The system is designed such that individual components can make decisions based on local information.

  • Local responsibility: Each individual component is responsible for achieving its consistency; this is never the burden of its peers.

  • Controlled concurrency: Operations are designed such that no or limited concurrency control is required.

  • Failure tolerant: The system considers the failure of components to be a normal mode of operation, and continues operation with no or minimal interruption.

  • Controlled parallelism: Abstractions used in the system are of such granularity that parallelism can be used to improve performance and robustness of recovery or the introduction of new nodes.

  • Decompose into small well-understood building blocks: Do not try to provide a single service that does everything for every one, but instead build small components that can be used as building blocks for other services.

  • Symmetry: Nodes in the system are identical in terms of functionality, and require no or minimal node-specific configuration to function.

  • Simplicity: The system should be made as simple as possible (- but no simpler).

There clearly is an emerging consensus on the architecture of ‘almost-infinitely scalable’ applications. Furthermore, revealing insights from the experiences of Google, Amazon, Yahoo and eBay are shedding a lot of light on this subject.
