Create reliable and highly available applications using 3 crucial design patterns!

THURSDAY, MAY 7, 2020 12:34 PM    

I strongly believe that design patterns serve as guiding principles that allow us to make sounder decisions, and I treat learning them as a form of personal development. Even in our daily lives, as we make decisions at almost every juncture, we find ourselves more at peace when we are able to fall back on a set of principles that govern our lives. A simple example to illustrate the point: as I grew up, I inculcated a principle of being thrifty. When confronted with a choice between a comfortable private-hire ride and public transport, I find myself choosing the latter. To further immerse myself in a lifestyle that conforms to that principle, I plan activities around these choices. For example, I slot reading time into my daily commute or, if reading is not possible, I listen to a podcast while walking home from the bus terminal.

Similarly, you can understand application or architecture design patterns in the same way. They are guiding principles that seek to bring clarity, structure, and coherence to your understanding and thought process as you approach software development work.

Now in this article, I will be going through another 3 design patterns:

  1. Circuit Breaker Pattern (relevant to the current situation here in Singapore, where we are currently having a circuit breaker),
  2. Federated Identity Pattern, and
  3. Command and Query Responsibility Segregation Pattern (CQRS).

1. Circuit Breaker (CB) Pattern

You want to consider this pattern if you want to add stability to your application during failures, specifically failures that take much longer to recover from than a simple reboot would. The Circuit Breaker pattern creates a layer that serves as a proxy (I’ve written about proxies before, you can check it out here). This proxy talks, on behalf of the application, to an external resource that the application wishes to communicate with. The pattern focuses on providing stability while the system recovers from a failure, and on minimising the impact on performance. It may be ‘too much’, incurring more maintenance cost than its intended benefits, if the application does not require high availability or if a simple ‘unavailable’ page would suffice. But for applications that wish to provide a cleaner and better user experience, the CB pattern can help preserve system performance by immediately rejecting a request when it knows that the remote service is currently unavailable. That gives a relatively better experience than waiting for the request to time out or never getting a response at all. Ever remember an incident where you waited for 15 seconds, which felt like forever, only to be informed that the service is down? (What a bummer!!)

Some failures are quick to recover from, and a retry pattern can easily be put in place to complete the attempted operation. Other failures might take much longer to fix due to unanticipated events. In these situations it is pointless for an application to continually retry an operation that is unlikely to succeed; instead, the application should quickly accept that the operation has failed and handle the failure accordingly. Notice how the circuit breaker covers a failure case that the retry pattern is not best suited to.
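For contrast, here is roughly what the retry side of that trade-off could look like. This is only a minimal sketch: `TransientError`, `call_with_retry`, and the backoff parameters are names and values I made up for illustration, not part of any specific library.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are expected to clear quickly."""


def call_with_retry(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                # Out of attempts: a longer outage needs a different strategy.
                raise
            # Wait base_delay, then 2x, then 4x ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

The bounded `max_attempts` is the important part: retrying is only cheap when the fault is short-lived, which is exactly the assumption that breaks down during a long outage.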

In the midst of a failure, blocked requests may end up hoarding shared resources, causing other, possibly unrelated, parts of the system to fail. In such cases, it is better for the operation to fail immediately, and to attempt to invoke the service again only if it is likely to succeed.

In the CB pattern, the ‘proxy’ checks the availability of the service. It can be implemented as a state machine, which simply keeps track of the current condition of the service: working, not working, or trying to get it to work. We call this recorded condition the state. In the usual terminology, ‘working’ is the closed state, ‘not working’ is the open state, and ‘trying to get it to work’ is the half-open state. When the recorded state is open, the proxy blocks and immediately rejects operations, because they are highly unlikely to succeed. In the half-open state, the CB proxy checks whether the service has been restored by throttling incoming requests and testing whether a limited number of operations succeed over a time interval. When this limited number of operations succeeds within the defined interval, the proxy moves the state back to closed and lifts the throttle.
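To make the state machine concrete, here is a minimal single-threaded sketch. It assumes the breaker trips after a number of consecutive failures and moves from open to half-open once a recovery timeout has elapsed; those rules, the thresholds, and every class and parameter name are my own assumptions for illustration. It also does not throttle concurrent requests in the half-open state, it simply counts trial calls one at a time.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without contacting the service."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, recovery_timeout=30.0,
                 success_threshold=2, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # failures before tripping
        self.recovery_timeout = recovery_timeout    # seconds to stay open
        self.success_threshold = success_threshold  # trial successes to close
        self.clock = clock
        self.state = "closed"
        self.failures = 0
        self.successes = 0
        self.opened_at = None

    def call(self, operation):
        if self.state == "open":
            if self.clock() - self.opened_at < self.recovery_timeout:
                # Fail fast instead of waiting for a timeout.
                raise CircuitOpenError("service unavailable, failing fast")
            # Timeout elapsed: start letting trial calls through.
            self.state = "half-open"
            self.successes = 0
        try:
            result = operation()
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self):
        self.failures += 1
        # Any failure during a trial, or too many in a row, opens the circuit.
        if self.state == "half-open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()
            self.failures = 0

    def _on_success(self):
        if self.state == "half-open":
            self.successes += 1
            if self.successes >= self.success_threshold:
                self.state = "closed"
                self.failures = 0
        else:
            self.failures = 0
```

Usage would be along the lines of `breaker.call(lambda: fetch_from_remote())`, where `fetch_from_remote` stands in for whatever remote call your application makes. Passing the clock in as a parameter is just a convenience so the timeout can be tested without actually waiting.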

While the service is down, remember that users may still be sending requests. So in this pattern, you can record these requests as your application receives them, and replay them when the service is available again. Remember, you can consider this pattern if you wish to prevent an application from trying to invoke a remote service or access a shared resource when the operation is likely to fail, thereby freeing the resources that would otherwise be locked by requests that are ‘known’ to fail. When you return an error immediately to the user, the error can also indicate the expected delay before the service recovers, as a way to manage users’ expectations.
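The record-and-replay idea, together with an error that carries a recovery hint, might be sketched like this. Everything here is hypothetical: `DeferredRequests`, `ServiceUnavailable`, `submit`, and the 60-second default are illustrative names and values I chose, and a real system would normally persist the journal rather than keep it in memory.

```python
from collections import deque


class ServiceUnavailable(Exception):
    """Error returned to the caller, carrying a hint for when to retry."""

    def __init__(self, retry_after_seconds):
        super().__init__(f"Service unavailable, retry in about {retry_after_seconds}s")
        self.retry_after_seconds = retry_after_seconds


class DeferredRequests:
    """In-memory journal of requests received while the service is down."""

    def __init__(self):
        self._pending = deque()

    def record(self, request):
        self._pending.append(request)

    def replay(self, handler):
        """Send recorded requests to handler in arrival order; return the count."""
        replayed = 0
        while self._pending:
            handler(self._pending[0])  # peek first, so a failure keeps the request
            self._pending.popleft()
            replayed += 1
        return replayed


def submit(request, service_available, handler, journal, expected_recovery_seconds=60):
    """Handle the request now, or record it and fail fast with a retry hint."""
    if service_available:
        return handler(request)
    journal.record(request)
    raise ServiceUnavailable(expected_recovery_seconds)
```

Once the service is back, calling `journal.replay(handler)` drains the recorded requests in the order they arrived.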

While every design pattern serves a particular intention, avoid using this pattern if the resource the application is trying to access is a local private resource, such as an in-memory data structure. That would be overkill and add overhead to the system (aka more cost than benefit). Also, note that the circuit breaker pattern is not a substitute for handling exceptions in the business logic of your application. It addresses the issue of accessing remote resources that may fail, and stretching it beyond its intended scenarios is always a big no-no. It serves as a proxy between your client and the remotely accessed resource, conveying end-user-friendly errors when the circuit is half-open (in recovery mode) or open (disrupted).

Reference: Circuit Breaker

2. Federated Identity Pattern

You want to consider this design pattern when you want to offload and delegate user authentication to an external identity provider. This can help to simplify application development and minimise user administration, improving the user’s overall experience when using multiple applications. You should also consider it when you want to implement single sign-on in a company where an employee has to use multiple applications. If you need to authenticate both corporate employees and business partners who are not in the corporate directory, this separation of concerns gives you that flexibility.

Usually in a bigger company, there are multiple applications hosted by different organisations. Needing a different credential for each application causes a disjointed user experience. It also increases security risk: when users leave the organisation, deprovisioning their access can be overlooked. Disjointed authentication further complicates user management, because a single user has different credentials to manage for different applications.

Now this is where this design pattern comes into play. We separate the authentication logic from the application layer, delegating it to a trusted identity provider. This separation allows you to clearly decouple authentication from authorisation. (NEAT!) It also supports a single sign-on implementation, where the different applications just have to trust the tokens issued by the identity provider. Notice that in this implementation the user’s credentials are no longer managed by the different applications; they are hidden from every application and known only to the identity provider. The applications merely need to trust the token and use the claims in it to verify authorisation.

Delving deeper, such a design pattern allows us to perform claims-based access control. With this approach, applications and services can authorise the user’s access to features and functionalities based on the claims contained in the token provided.
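As a rough illustration of claims-based access control, the sketch below has a stand-in identity provider sign a set of claims and an application verify the signature before reading them. It is deliberately simplified and rests on assumptions of mine: the token format, the shared HMAC key, and the `roles` claim are all made up for the example. Real federated setups normally rely on standards such as SAML or OpenID Connect, usually with asymmetric keys, rather than a hand-rolled token like this one.

```python
import base64
import hashlib
import hmac
import json

SHARED_KEY = b"demo-key-not-for-production"  # assumption: key shared with the IdP


def issue_token(claims, key=SHARED_KEY):
    """What a (hypothetical) identity provider would do: sign a set of claims."""
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def verify_token(token, key=SHARED_KEY):
    """What each application does: check the signature, then read the claims."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        raise PermissionError("token signature is not valid")
    return json.loads(base64.urlsafe_b64decode(payload))


def can_access(claims, required_role):
    """Claims-based access control: authorise from the claims alone."""
    return required_role in claims.get("roles", [])
```

Note that the application never sees a password: it only checks that the token really came from the provider it trusts, then makes its authorisation decision from the claims inside.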

Note that with this pattern, you have a single point of failure. Once your authentication service is down, all the applications that depend on it stop being functional. So if you deploy applications to multiple data centres, consider deploying the identity management mechanism to the same data centres to maintain application reliability and availability.

Reference: Federated Identity Pattern

3. Command and Query Responsibility Segregation (CQRS) pattern

You want to consider this design pattern when you find a need to separate the read and write operations on your data store. This segregation is not needed for applications that only require basic CRUD operations; adding it there would be ‘overkill’, creating more maintenance cost than benefit. But if you find your application scaling up (and kudos to such an application and the company behind it), adopting this design pattern will help to maximise your application’s performance, scalability, and security. Always remember, a good and relevant design pattern allows you to create a pit of success, where ‘falling’ into success gets a whole lot easier. With the CQRS pattern, your application stays flexible enough to evolve over time, and write operations are prevented from clashing at the domain level.

In a large company or a scaled application, you often find that the read and write workloads are asymmetrical, with vastly different performance and scaling requirements. For example, write operations have to deal with data transformation and validation on top of the database write itself, whereas read operations often just return a data transfer object. If the two share the same resources, a write workload beyond the performance threshold may start to impact read operations. That coupling between otherwise independent operations feels unnecessarily ‘sticky’, and that’s where we may start to consider this design pattern.

Where data is concerned, one should naturally consider the implications from a security standpoint. Remember, if you are appropriately considering this design pattern, your application is probably dealing with a workload that basic CRUD operations can no longer handle. Here’s where you have to consider managing security and permissions at the operation level (i.e. an entity that reads should not have permission to write, so that data cannot accidentally be updated in the wrong context).

Some benefits of this design pattern include independent scaling of the read and write components within the data layer. Oftentimes, you may want more read-only replicas to improve query performance under load. This helps in a distributed architecture where read-only replicas are located close to the application instances. Another benefit is optimised data schemas. Some schemas are better optimised for reads, others for writes. With the segregation, you can now optimise at the schema level, which is great. Remember, large-scale application = every optimisation counts. Tighter security can also be put in place to ensure that entities that can read may not write, better protecting the integrity of your data. One of my favourite benefits is the separation of concerns: the segregation provides more flexibility, with models and logic that are more maintainable. In a more comprehensive application, write operations can be complicated, with input validation, business validation, and business logic, while reads can be relatively simple. Also, with a materialised view, the application can avoid complicated joins when querying, reaping the benefits of a schema best suited to the job.
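Here is a small sketch of how the split between the two sides could look in code. It is an in-memory toy under my own assumptions: the order domain, `CreateOrder`, `OrderWriteModel`, and `OrderReadModel` are invented for illustration, and the read side is updated synchronously through a callback, whereas a real deployment would more likely use separate stores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreateOrder:
    """A command: expresses the intent to change state."""
    order_id: str
    customer: str
    amount: float


class OrderWriteModel:
    """Write side: validation and business rules live here."""

    def __init__(self, on_event):
        self._orders = {}
        self._on_event = on_event  # callback used to notify the read side

    def handle(self, command):
        if command.amount <= 0:
            raise ValueError("amount must be positive")
        if command.order_id in self._orders:
            raise ValueError("order already exists")
        self._orders[command.order_id] = command
        self._on_event({"type": "OrderCreated", "order_id": command.order_id,
                        "customer": command.customer, "amount": command.amount})


class OrderReadModel:
    """Read side: a denormalised view shaped for one query, like a materialised view."""

    def __init__(self):
        self._totals_by_customer = {}

    def apply(self, event):
        if event["type"] == "OrderCreated":
            customer = event["customer"]
            self._totals_by_customer[customer] = (
                self._totals_by_customer.get(customer, 0) + event["amount"])

    def total_for(self, customer):
        # No joins or aggregation at query time: the answer is precomputed.
        return self._totals_by_customer.get(customer, 0)
```

Wiring it up is a matter of `OrderWriteModel(on_event=read_model.apply)`. The validation sits entirely on the write side, while the query is a plain dictionary lookup, which mirrors the asymmetry between the two workloads.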

Note that all design patterns come with limitations:

As rich as the benefits may be, such an implementation introduces complexity into the architecture, especially if it uses the event sourcing pattern. CQRS is commonly implemented using messages to process commands and publish update events. Messaging certainly brings perks of its own, such as enabling the competing consumers pattern, but it comes with a limitation: the architecture must now handle message failures and duplicated messages. Lastly, if you choose to separate the read and write databases, the read database may become stale. To address this, the read database must be updated to reflect changes to the write model store. Such data synchronisation can be tedious, and it is difficult to detect when a user issues a request based on stale read data.
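One common way to cope with duplicated messages is to make the consumer idempotent, for example by remembering which event ids it has already applied. The sketch below assumes every event carries a unique `id`, which is my assumption rather than something the pattern guarantees; `BalanceProjector` and the event fields are hypothetical names.

```python
class BalanceProjector:
    """Updates a read view from events, applying each event id at most once."""

    def __init__(self):
        self.balances = {}
        self._seen_ids = set()

    def handle(self, event):
        """Apply the event once; return False if it was a duplicate delivery."""
        if event["id"] in self._seen_ids:
            return False
        account = event["account"]
        self.balances[account] = self.balances.get(account, 0) + event["delta"]
        self._seen_ids.add(event["id"])  # mark only after a successful update
        return True
```

Without the id check, a message delivered twice would be counted twice in the read view. In a real system the set of seen ids would have to be stored durably alongside the view, which is part of the extra complexity mentioned above.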

Reference: Commands and Queries Separation Pattern