Monday, 14 July 2014

From framework to platform

When I started my career as a Java developer close to 10 years ago, the industry is going through a revolutionary change. Spring framework, which was released in 2003, was quickly gaining ground and became a serious challenger to the bulky J2EE platform. Having gone through the transition time, I quickly found myself in favour of Spring framework instead of J2EE platform, even the earlier versions of Spring are very tedious to declare beans.

What happened next is the revamping of J2EE standard, which was later renamed to Java EE. Still, dominating of this era is the use of opensource framework over the platform proposed by Sun. This practice gives developers full control over the technologies they used but inflating the deployment size. Slowly, when cloud application become the norm for modern applications, I observed the trend of moving the infrastructure service from framework to platform again. However, this time, it is not motivated by Cloud application.

Framework vs Platform

I have never heard of or had to used any framework in school. However, after joining the industry, it is tough to build scalable and configurable software without the help of any framework.

From my understanding, any application is consist of codes that implement business logic and some other codes that are helpers, utilities or to setup infrastructure. The codes that are not related to business logic, being used repetitively in many projects, can be generalised and extracted for reuse.  The output of this extraction process is framework.

To make it shorter, framework is any codes that is not related to business logic but helps to dress common concerns in applications and fit to be reused.

If following this definition then MVC, Dependency Injection, Caching, JDBC Template, ORM are all consider frameworks.

Platform is similar to framework as it also helps to dress common concerns in applications but in contrast to framework, the service is provided outside the application. Therefore, a common service endpoint can serve multiple applications at the same time. The services provided by JEE application server or Amazon Web Services are sample of platforms.

Compare the two approaches, platform is more scalable, easier to use than framework but it also offers less control. Because of these advantage, platform seem to be the better approach to use when we build Cloud Application.

When should we use platform over framework

Moving toward platform does not guarantee that developers will get rid of framework. Rather, platform only complements framework in building applications. However, one some special occasions we have a choice to use platform or framework to achieve final goal.  From my personal opinion, platform is greater that framework when following conditions are matched:
  • Framework is tedious to use and maintain
  • The service has some common information to be shared among instances.
  • Can utilize additional hardware to improve performance.
In office, we still uses Spring framework, Play framework or RoR in our applications and this will not change any time soon. However, to move to Cloud era, we migrated some of our existing products from internal hosting to Amazon EC2 servers. In order to make the best use of Amazon infrastructure and improve software quality, we have done some major refactoring to our current software architecture. 

Here are some platforms that we are integrating our product to:

Amazon Simple Storage Service (Amazon S3) &  Amazon Cloud Front

We found that Amazon Cloud Front is pretty useful to boost average response time for our applications. Previously, we host most of the applications in our internal server farms, which located in UK and US. This lead to noticeable increase in response time for customers in other continents. Fortunately, Amazon has much greater infrastructure with server farms built all around the worlds. That helps to guarantee a constant delivery time for package, no matter customer locations.

Currently, due to manual effort to setup new instance for applications, we feel that the best use for Amazon Cloud Front is with static contents, which we host separately from application in Amazon S3. This practice give us double benefit in performance with more consistent delivery time offered by the CDN plus the separate connection count in browser for the static content.

Amazon Elastic Cache

Caching has never been easy on cluster environment. The word "cluster" means that your object will not be stored and retrieve from system memory. Rather, it was sent and retrieved over the network. This task was quite tricky in the past because developers need to sync the records from one node to another node. Unfortunately, not all caching framework support this feature automatically. Our best framework for distributed caching was Terracotta.

Now, we turned to Amazon Elastic Cache because it is cheap, reliable and save us the huge effort for setting up and maintain distributed cache. It is worth to highlight that distributed caching is never mean to replace local cache. The difference in performance suggest that we should only use distributed caching over local caching when user need to access real-time temporary data.

Event Logging for Data Analytics

In the past, we used Google Analytics for analysing user behaviour but later decided to build internal data warehouse. One of the motivation is the ability to track events from both browsers and servers. The Event Tracking system uses MongoDB as the database as it allow us to quickly store huge amount of events.

To simplify the creation and retrieval of events, we choose JSON as the format for events. We cannot simply send this event directly to event tracking server due to browser prevention of cross-domain attack. For this reason, Google Analytic send the events to server under the form of a GET request for static resource. As we have the full control over how the application was built, we choose to let the events send back to application server first and route to event tracking server later. This approach is much more convenient and powerful.

Knowledge Portal

In the past, applications access data from database or internal file repository. However, to be able to scale better, we gathered all knowledge to build a knowledge portal. We also built query language to retrieve knowledge from this portal. This approach add one additional layer to the knowledge retrieval process but fortunately for us, our system does not need to serve real time data. Therefore, we can utilize caching to improve performance.

Conclusion

Above is some of our experience on transforming software architecture when moving to the Cloud. Please share with us your experience and opinion.

3 comments: