Technoracle (soon to be Canadian Cybertech): 30/10/2005

Friday, November 04, 2005

Anti-patterns for SOA: Part One

This is a quick paper to explain some of the core tenets of SOA and illustrate a potentially unhealthy practice of trying to make functionality that exists behind a service interface visible to service consumers. Sadly, this practice seems to be growing in popularity lately which makes me question if people really understand why we are trying to move to SOA in the first place.

I came across this situation as editor of a requirements document for what may be a service oriented architecture for content management. In an early draft, I had copied what someone wrote:

"..a derived requirement that services would need to exist for Search, Read, Write, Copy, Move, Update, Merge, Version, Lock, Access Control Introspect type methods for content management. "

I received a comment back that stated :

“Content transformation – might be overloaded on copy… alternatively might want to be able to create complex transformation pipelines (splits, branches and merge…)“

It turns out that this insightful comment was extrenmely useful and lead to an epiphany. I now believe that the set should be shortened to only the following services:

Search – to locate a reference from which the content may be retrieved.
Read/Retrieve – essentially the ability to retrieve a copy of content.
Write – a method that would make subsequent retrieval of the content via a service possible. Note that I did not state „to write it to some persistent media“. This will be explained more later.
Update – a method to update content without over-writing an existing version. This assumes some form of version control.
Move – the ability to associate content with an additional URI, IRI or similar reference pointer. Note that I did not state „to migrate content from one persistent location to another“. That makes assumptions of some specific functionality or mechanism behind a service interface.
Lock – lock permissions for content so no one else can invoke the other methods.
Delete – make the content no longer reference able via the service interface. This does not necessarily imply removing the binary sequence that represents the content.
Access Control Introspect – inspecting the access control policies associated with content.

The following should not be core services, but should be optional:

Copy – essentially, this is akin to a combination of read and a subsequent write. This could optionally be implemented but should not be core since it is redundant.
Move – implies the ability to migrate content from one persistent location to another and is nothing more that a combination of retrieve, write and delete. Note that in some cases, content may not actually physically move, it may simply just have a different reference. Many operating systems work this way.
Merge – merge is really retrieval of two or more bits of content, a port-retrieval join and then an write() of a new aggregate content. It can also take place entirely behind the service interface by the creation of a method that will enable the end result of the merge to be retrieved from a service call. This does not mean that the content is actually merged – it is simply possible to get aggregated content.

So you are probably seeing a pattern by now. The core tenet is the externally visible properties or effects of invoking an action VS. the functionality that enables the end result. In SOA, one should not try to focus on the latter, the focus is brought to the service itself. The implementation is not relevant to the consumer.

Case in point. Microsoft engineers have engineered windows so via the user interface, it appears that you can move something from one location on a hard disk to another. But did it really move? No. In fact, in many cases, all you are really doing is changing the pointers which results in a change in the way the view is created for the user via the GUI. Those of you who have programmed in C and CPP will be familiar with this concept.

One service should probably be out of scope for the first phase of the project:

Transform – implying custom manipulations of content.

So why should the first group be core and transform be not in scope?

Rationale:

Services are autonomous and manage their transparency. A service should remain as opaque as possible to mitigate creating dependencies from those who invoke the service. When you make a service request, you enter parameters based on what you want. If for example, a blob of data exists in XML format (let’s call it FOO), and you desire a copy of it in HTML format, you should ask for it like this:

Please retrieve FOO for me in HTML format?

get(FOO, html);

may represent a call to a java class. FOO in this case represents the thing you want and html may be an optional parameter to force the class to provide it in html rather than some default format.

When invoking this method, you should not care how the method gets you FOO in html format. It is not your job to question that. You only care about one thing. It either gives you FOO in html or it fails. Whether or not FOO.html was created by applying a complex transformation, aggregation or forced composition is irrelevant in most cases. Perhaps monkey’s banging out machine byte code randomly created it for you. It really doesn’t matter.

Bad Practice:

You should not be forced to ask for it in any manner that attempts to control processes behind the service interface like this:

“Please retrieve FOO for me and transform it from XML to HTML.”

The latter is bad architecture since it introduces unnecessary dependencies. When you ask for FOO in HTML format, you should not care whether or not it persists in HTML or some other format, only that you get what the interface promises you can retrieve.

The same principle applies for versioning and many other functions. Versions are merely parameters that can be entered into the server, but you should not specify how the functionality behind the service interface evaluates which one is the latest version or how it finds and serializes a specific version. All you need to know is that you either get the version you requested or a notice that the service failed.

Resolution:

There are multiple correct ways to architect a solution for allowing transformations. The first is based on the transformation happening behind the service interface. The service description can declare this is happening but should only do this if that information is necessary for the service consumer to know. Keep in mind that the Service Consumer cannot and should not see anything behind the service interface.

Example One – Good Practice:

Explanation:

The service consumer makes a call to the service and asks for a specific chunk of content in a specific format. In this case, it asks for FOO and specifies a parameter stating what format it wants foo in (HTML). The service takes the request, then retrieves a copy of FOO in xml format. Before passing it back to the service consumer, it calls an internal (and invisible to the service consumer) method to transform foo.xml into foo.html. It gets back the result then passes it back to the service consumer.

In this diagram, everything above the service consumer or below the service itself should be out of scope since there are no externally visible properties nor should assumptions be made about them. To introduce these dependencies would introduce unnecessary dependencies and assumptions and violate the core axioms of SOA.

Example Two – Good Practice

The second variation involves post–retrieval transformations and is also acceptable. Note that the patterns of interaction between service consumer and service provider are not broken.

In this case, the service is only capable of retrieving the FOO content in xml and passes the xml back to the service consumer. The service consumer subsequently transforms it to the HTML format.

In this scenario, the post retrieval use of the content is clearly out of scope for this particular SOA since it’s job is to facilitate the retrieval. There are an unlimited number of things that the user of the service consumer may do once it has retrieved the content. In Phase II, we MAY tackle a few of these however at present, all post retrieval uses should be out of scope until we have the core layer defined since there is a dependency on the base layer.

It should be noted that in this example, it is possible the Transformation component’s interface is visible to the service. That leads us to the third example.

Example Three – suboptimal

In this final scenario, the service consumer calls the service, and requests that the service redirect the response to a transformation service, with instructions to tell the transformation service to do something with it then return the result back to the service consumer.

This introduces several things that I would fire an architect for.

First – it mixes two tasks into one process. If something goes wrong, the service consumer will have great difficulty knowing where things went wrong. The service consumer also cannot accurately know the state of the service call since it would likely have no idea when the service passed the work to the transformation component.

Secondly, it also blurs two unique functions into one. This lessens the probability of uncomplicated reuse but building unnecessarily complex interfaces. The service interface, instead of handling simple requests, now has to handle requests overloaded with details of where to forward things, what protocol to use and what instructions to forward. Inelegant.

This scenario also introduces another potentially unhealthy risk – the service does not know if the transformation service is even available when it accepts the service request. The Service consumer may get some form of acknowledgement back from the service leading it to believe its request is being processed when in fact it cannot be completed. This would become much more complex in the real world if you layered protocols like WS-RX over top of the transactions.

Finally - As with OO programming and analysis, object should break large jobs down into several smaller tasks which can be completed and orchestrated at will. The third example above is akin to procedural web services. It also introduces dependencies between three components which may result in problems maintaining the system.

Conclusion:

A core tenet of SOA is managed transparency. Consumers should not make assumptions about what happens behind the service, nor should they demand to see details of such they do not really need.

So what do you think? I wm interested in hearing some counter points but please feel free to agree also.

Wednesday, November 02, 2005

What are Open Standards?

The topic of open standards and open source is often confusing. A colleague of mine, Klaus Dieter-naujoks, once tried in vain to impart his wisdom on this subject to me. Being young and naive, I did not let it sink in until years later.

The official definition from Wikipedia

1. Open standards are publicly available specifications for achieving a specific task. By allowing anyone to use the standard, they increase compatibility between various hardware and software components since anyone with the technical know-how and the necessary equipment to implement solutions can build something that works together with those of other vendors.
en.wikipedia.org/wiki/Open_standard

Seems reasonable.

2. A standard that is easy for companies to produce and market products or services conforming to the standard. Essentially there is a competitive market without monopoly profits in products or services conforming to the standard. Some standards that are marketed as "open" aren't open in this sense. ...
www.jmcgowan.com/avigloss.html

This implies that there is FUD in the marketplace, which I agree with.

Many people do not understand the difference between open standards and open source. This is a source of frustration for those of us who try to explain it. I know understand why Klaus's face was often red when he spoke with me ;-)

A standard is open if anyone can obtain, read and implement the standard. So who writes the standard? There are two veins of standards - du jure and de facto. These are also sub-types of open standards.

Du jure - these are standards that are developed when a bunch of smart people get together in a room and follow a process to develop a standard. ISO, ITU, UN/CEFACT and others all fall into this category.

De Facto - in latin this means "by the fact itself". If enough people do the same thing, it becomes a de facto standard. This does not invalidate it or in any way mean it is technically superior or inferior.

The main difference is that one has a formal and openly document process for defining and maintaining the standard while the other makes that optional. Even though some specifications are not part of a formal standards body like ISO or ITU, they are still legitimate standards. Those who write them cannot simply do whatever they want. The user communities always have to be considered in maintaining the standard.

The main principle at play is that with all standards, the stakeholders have influence in how the standards are maintained.

Why are Open Standards desirable?

It avoids a situation where those who buy products are locked in to one specific vendor. It is often more important to ascertain that there is good vendor support for a specific standard than the standardÂ’s process to meet this goal. This is generally applicable to many things. Light bulbs, cars, toasters etc. all follow standards. The software industry is no different.

Major Wine find!!

I picked up a case of vintage Bordeaux today in Vancouver. The wines have lots of colour, good concentration and fine tannic structure.They are without doubt good keeping wines whatever the vintage. Robert Parker Jr. includes Château Dalem as one of the “Top Châteaux of Bordeaux” and considers it to be one of the best wines of the appellation. Just looking at this picture makes me drool.

I tasted on of the 1983's and it s amazing. A full blooded classic Bordeaux blend if I ever tasted one.

Reference - http://www.seacove.com/frcdalem.htm