I was fortunate to attend the QCon London 2012 conference this year. Firstly, I was delighted to be invited as a speaker, and then, when that did not pan out, they still let me in as a VIP guest. I would like to say a special thanks to the QCon / Trifork staff for the awesome gift. My busy schedule allowed me to attend just two days, the Thursday and Friday.
Thursday started with Rich Hickey’s Simple Made Easy keynote, in the huge Fleming room, for a British and international audience. He had already given this talk at the Strange Loop conference in the USA, and my American friends said it was “an awesome talk”. I concur with them on the content, which I had already seen in the InfoQ videos. In person, though, I felt it could have been improved with more concrete notes on exactly how you can simplify your software development.
What was my first talk of QCon London? The answer was Progressive Architectures at the Royal Bank of Scotland, in the Mountbatten Room at the top of the Queen Elizabeth II building in Westminster, which provided terrific views of the London Eye, Westminster Abbey and the Houses of Parliament. I found some of the ideas of this RBS talk really only applicable to banks, and the ODS storage had already been discussed in previous years at QCon London. However, this talk did have multiple speakers, who shared the platform by taking turns to speak. The manager laid the foundation of the business and front-office architecture, and the challenge of new technology in financial services, such as FpML, AMQP and the so-called Race to Zero [latency] – which is as oxymoronic a phrase as there ever was, rather like saying, “let’s race to infinity!”. The talk focused on the Hadoop Distributed File System, and moving away from message-based architectures. Actually, I found that Ben Stopford had just started to get into interesting architecture, with some explanation of Big Data versus Big Database, and also Oracle Coherence, but then I realised I had misplaced my mobile and immediately had to rush off and find it. Ah well! My impression was that the talk was a little aloof for the audience.
Guarding For Inevitable Failure
Thanks to the Las Receptionistas of QCon, on the fifth floor, I was reunited with my beloved mobile phone, to much relief, because I thought that I was done for. The next talk was Architecting for Failure at the Guardian.co.uk, which I thought was much more interesting than the RBS talk. The speaker did a splendid job of persuading us of the inevitable fact that all systems are going to fail. Michael Brunton-Spall implored us to design well in advance to support recovery from failure. Future application architectures need to deal with mitigation, availability and robustness.
Load Balancer -> Apache Servers -> Load Balancer -> Other Services
Unfortunately, they could not scale a relational database this way behind load balancers, so the Guardian had two databases in a redundant architecture, mirrored across twin data centres. These were the statistics:
- Serves 3.5m unique daily browsers
- Over 1.6m unique pieces of content
- Supports hundreds of editorial staff
- Creates articles, audio, video, galleries, interactive sub-sites and micro-sites
The Guardian rightly discovered their biggest drawback: their entire architecture was one massive, monolithic content management system (CMS). To get around this lack of scalability, they introduced Micro-Apps, separation of systems and SSI-like [Server Side Includes] technology, and they deliberately standardised on the HTTP protocol for communication between all of these disparate internal systems.
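To make the micro-app idea concrete, here is a toy sketch of the composition approach in Java. The class name, the `{{fragment}}` placeholder syntax and the suppliers are my own inventions standing in for the SSI-like includes; in the real system each fragment would be fetched over HTTP from a separate micro-app.

```java
import java.util.Map;
import java.util.function.Supplier;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MicroAppComposer {

    // Replace each {{name}} placeholder in the page template with the
    // output of the matching micro-app. A missing or failed micro-app
    // yields an empty fragment, so one broken component cannot take
    // down the whole page.
    public static String compose(String template, Map<String, Supplier<String>> microApps) {
        Matcher m = Pattern.compile("\\{\\{(\\w+)\\}\\}").matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Supplier<String> app = microApps.get(m.group(1));
            m.appendReplacement(out, app != null ? Matcher.quoteReplacement(app.get()) : "");
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Supplier<String>> microApps = Map.of(
                "nav", () -> "<nav>sections</nav>",
                "article", () -> "<p>story</p>");
        // Each supplier stands in for an HTTP call to a micro-app.
        System.out.println(compose("<html>{{nav}}{{article}}</html>", microApps));
    }
}
```

Because each fragment arrives over plain HTTP, each micro-app can be coded, cached and released independently, which is exactly the benefit described next.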
The benefit of this re-architecture from a monolith to a polylith was lots of small, simple applications, which they could code, release and test in isolation. It also allowed them to cache data and configure responses for each of those micro applications.
The drawbacks were increased architectural complexity and a higher operational cost compared with the monolith. The Guardian needed a huge cache (30GB of static HTML in memcached), and it was not easy to see what was actually cached – to peek into the cache – without disturbing or retrieving the data.
The latency of micro-apps was their biggest problem. For the Guardian, failure was not a problem, but slowness was (in caching terminology: stale-while-revalidate). The reason was that JVM garbage collector pauses (full and partial) coincided with the continuous arrival of incoming requests. While the GC acted to recycle memory in a collection, and especially during a full [stop-the-world] GC, thousands of requests were still being received. These incoming requests represented a large queue of memory allocations, to be created as new request objects once the garbage collector had finished, which conspired to produce a classic hysteresis loop problem. A performance headache at peak times!
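That queueing effect can be modelled with a toy simulation (the arrival and service rates below are made-up numbers, not the Guardian's): requests arrive steadily; while a simulated stop-the-world pause blocks processing, the backlog grows, and the extra allocations needed to drain it afterwards are precisely what invites the next pause.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PauseBacklog {

    /** Returns { backlog when the pause ends, backlog at the end of the run }. */
    public static int[] simulate() {
        Deque<Integer> queue = new ArrayDeque<>();
        int arrivalsPerMs = 5, serviceRatePerMs = 6; // normally we keep up
        int backlogAtPauseEnd = 0;
        for (int ms = 0; ms < 1000; ms++) {
            // Requests keep arriving whether or not we are paused.
            for (int i = 0; i < arrivalsPerMs; i++) queue.add(ms);
            boolean gcPause = ms >= 500 && ms < 600; // a 100ms "full GC"
            if (!gcPause) {
                for (int i = 0; i < serviceRatePerMs && !queue.isEmpty(); i++) queue.poll();
            }
            if (ms == 599) backlogAtPauseEnd = queue.size();
        }
        return new int[] { backlogAtPauseEnd, queue.size() };
    }

    public static void main(String[] args) {
        int[] r = simulate();
        System.out.println("backlog when the GC pause ends: " + r[0]); // prints 500
        System.out.println("backlog 400ms later:            " + r[1]); // prints 100
    }
}
```

Even with a service rate above the arrival rate, the 100ms pause leaves a 500-request backlog that takes a further 400ms of headroom to drain – slowness, not failure.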
The Guardian also had to face what it called Emergency Mode situations. They realised that dynamic pages were expensive to create, and they also had to deal with peak news traffic, which was often unpredictable. These viral news stories usually hit a small subset of functionality, which could rarely be predicted beforehand: one page was handling 2,000 requests at peak in 2011 (a cute Indonesian rat that happened to be three feet long). As a large dynamic page resource, the Guardian realised their architecture had both an infrastructure problem and a content problem. Dynamism was really a nice-to-have feature, but they knew that sometimes speed was very important. Emergency mode was conceived for serving pressed pages, which were repopulated from a full page cache and converted into temporary static files. They traded dynamism for speed, with the knowledge that “Users do not mind if the news is slightly delayed for a minute or so”.
Here are some other salient points from this interesting talk:
- Cache by URL
- Cache – what’s important
- Content – when modified
- Navigation – we just go around the site every 2 weeks and then cache it
- Monitoring is the most important thing you can do on the site
- Error detection – What has gone wrong?
- Error detection – Where did it go wrong?
- Error analysis – How is it going wrong?
- Aggregate stats
- Provide Automatic switches
- Create Release valves in the architecture
- Provide an Emergency mode
- Provide a Database-off mode
Camel and Clouds
Riding the Camel into the Cloud was my expected highlight of the day, even before the conference. James Strachan did not disappoint at all. I found it a very nice introduction to Camel, the enterprise integration framework. To start with, Mr Strachan explained that enterprise integration is hard and referenced the popular Enterprise Integration Patterns books, such as Gregor Hohpe and Bobby Woolf’s classic tome.
He explained that Apache Camel http://camel.apache.org/ is an open source integration framework based on the known enterprise integration patterns. It’s small, it’s simple and it’s lightweight. He showed us some fluent programming code for a Message Filter. Camel is essentially based on URI references – in fact lots of URIs – endpoints, and active components. Strachan said there are currently well over 120 Camel components, which can talk to just about any hardware or technology, and if there is not a component available, you can write your own. http://camel.apache.org/components/
Strachan digressed very briefly into another project, Apache MINA, which is useful for building highly scalable networking applications over UDP/TCP with NIO. Camel endpoints are just names – logical names – and Camel prefers convention-over-configuration: a URI is resolved by asking the matching component to supply the implementation. In the Java DSL such a route would be stored in an ordinary Java file; Camel also supports a Scala Domain Specific Language, in which the same route looks very concise, whether written in the short style or in a longer style that shows the actual closures.
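The original code samples did not survive this writeup, so here is a hedged sketch of a Message Filter route and how one might run it from a program, using Camel's Java DSL. The endpoint URIs (`file:inbox`, `jms:queue:orders`) and the XPath predicate are my own illustrative assumptions, not the exact code from the talk, and the snippet assumes camel-core plus a JMS component on the classpath.

```java
import org.apache.camel.CamelContext;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.impl.DefaultCamelContext;

public class FilterRoute {
    public static void main(String[] args) throws Exception {
        CamelContext context = new DefaultCamelContext();
        context.addRoutes(new RouteBuilder() {
            @Override
            public void configure() {
                // A Message Filter from the EIP book: only real orders pass through.
                // Each endpoint is just a URI; Camel resolves "file:" and "jms:"
                // to the matching component by convention.
                from("file:inbox")
                    .filter(xpath("/order[@test='false']"))
                    .to("jms:queue:orders");
            }
        });
        context.start();     // routes run until the context is stopped
        Thread.sleep(5000);  // let the route do some work
        context.stop();
    }
}
```

The route definition is really integration configuration: swap the URIs and the same program talks to completely different transports.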
Camel attempts to hide all the middleware APIs by providing annotations such as @XPath and @Header for binding message content to your code. By default, Camel can use the Spring transaction manager, e.g. Spring JMS transactions, where the transaction is tied to the location of the endpoint.
James Strachan recommended the following books, namely Camel in Action and ActiveMQ in Action, as well as his own company’s website http://www.fusesource.com/ for an all-encompassing IDE.
Finally, in the last 10 minutes of his talk, we got to the part about virtualisation and cloud. He traced the beginnings of the Cloud: we just had operating systems from 50 years ago until the middle of the 1990s, then we suddenly had virtual processes (JVM / .NET / LLVM). We then created application servers, then we had virtualisation of the hardware; now we have clouds, and for the future James postulated about multi-clouds(!?).
He said that the Cloud changes the dynamics of scale: going from 2 machines to 1,000 machines is now distinctly possible, if you have the expenditure and if, of course, it makes business sense. In today’s new cloud and virtualised world, we can no longer wire names to hard-coded IP addresses, port numbers, hosts and domain names. We are going to have to be more dynamic than that, for flexibility, availability and extensibility. We are now demanding loose coupling! Messaging works best when the infrastructure and architecture are built with loose coupling – in both time and location!
But James Strachan rhetorically asked a delicate question: what if you can’t [do this now]? With virtualisation and cloud computing, the mechanics of discovery, load balancing and coordination can be hard for your organisation to build. You may only need to connect to the message broker, and that is all you should have to care about. In a cloud environment, systems come and go because of elasticity, yet you can still send messages to other systems even if they are not there.
So, like many cloud providers, James Strachan introduced his own product, FuseFabric – http://fuse.fusesource.org/fabric/ , on GitHub, an open source PaaS under the Apache 2.0 licence, “a simple framework for running on lots of machines”.
- Registries for configuration, runtime and dynamic provisioning in the cloud
- Zookeeper is great for geographic clustering
- Always check for duplicates
- Transactions begin at the consuming component and commit at the end of the process (or roll back)
- Prefer big transaction boundaries to optimise the wait
James Strachan was asked about Camel in comparison with Spring Integration. He said that, from 30,000 feet, the two frameworks are conceptually similar. Camel works with Spring – this is an advantage – and Camel is more focused on the DSL: when you run a Camel route you can inspect the channels, raising the abstraction level. The biggest difference is the community involvement; Camel has hundreds of components already written, including a component for Spring Integration 2.0.
I found his answer interesting, as I now have some experience with Spring Integration. I think that if you want an integration that sits very close to the Spring Framework, then you might lean towards Spring Integration, especially if you want your configuration in XML. On the other hand, writing in a Domain Specific Language in Java or Scala is a slight attraction, and if you can wire an integration in a dynamic language like Groovy, JRuby or Clojure, that may be a boon for integrators. The consultant’s classic answer, as always, is: it depends on your situation and circumstances.
To Be Lock Free or Not
The final talk of Thursday was a sell-out – I mean that the Real Logic talk, Accelerating Software: Lock-Free Algorithms, was way oversubscribed. Michael Barker and Martin Thompson started by talking about modern hardware: memory is very hierarchical on processors such as the Intel Nehalem.
Access to memory was outlined in terms of today’s speeds:
- Register: < 1ns
- L1: ~4 cycles, ~1ns
- L2: ~12 cycles, ~3ns
- L3: ~45 cycles, ~15ns
Sandy Bridge can load and store addresses on different execution-unit ports, doubling the speed. Divide operations are still expensive, because a divide stalls all of the other execution units in the CPU.
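The talk's code was best seen live, but the core building block of most lock-free algorithms is the compare-and-swap (CAS) retry loop, which the JVM exposes through the atomic classes. This is my own minimal sketch, not code from the talk: instead of taking a lock, each thread optimistically computes the new value and retries if another thread got there first.

```java
import java.util.concurrent.atomic.AtomicLong;

public class CasCounter {
    private final AtomicLong value = new AtomicLong();

    /** Lock-free increment: retry the CAS until no other thread interferes. */
    public long increment() {
        long current, next;
        do {
            current = value.get();  // read the current value
            next = current + 1;     // compute the desired value
        } while (!value.compareAndSet(current, next)); // retry if we lost the race
        return next;
    }

    public long get() { return value.get(); }

    public static void main(String[] args) throws InterruptedException {
        CasCounter c = new CasCounter();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 100_000; j++) c.increment();
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        System.out.println(c.get()); // prints 400000 - no increments lost, no locks taken
    }
}
```

No thread ever blocks: a failed CAS just loops, so there is no lock convoy and no stalled execution units waiting on a mutex, which is the point the speakers were making about mechanical sympathy with the hardware.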
This talk was very involved, and it was best to see all of it live; with performance-tuning material, the good talks are best experienced with the presenter discussing the ideas. I think I will leave you with an outstanding quote:
“The most amazing achievement of the computer software industry is its continuing cancellation of the steady and staggering gains made by the computer hardware industry.”
- Henry Petroski