Rewrites and Transitional Architecture

Rewrites

It was tempting to rewrite the system at the time. In hindsight, I am really glad we didn’t. Later in my career, I learned that this temptation had a name -

Transitional Architecture

Transitional architecture is really a pretty way of saying iterative development: take small steps forward. Going back to this system, we decided to take an iterative approach to fixing the mess we had inherited, putting aside our desire to rewrite it. We had a big hill to climb, but we had to take the first step. First, we looked at all the issues we had and created a chart breaking out the problems. The Pareto chart looked something like this …

Iteration 1

Our first iteration of the rate limiting system was a static file built into the package that was deployed to production. This meant that to change a limit we had to update the file, check it in for code review, build it, then deploy it to production, all in the middle of an incident! This sounds nasty, but it worked! We didn’t reduce the number of incidents, but the mean time to resolution (MTTR) of our incidents dropped by half. This was a huge win.
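To make the idea concrete, here is a minimal sketch of what a limiter driven by a static, bundled set of limits could look like. It is not the code we ran: the customer names, the `DEFAULT_LIMIT` value, the one-second fixed window, and the `FixedWindowLimiter` class are all assumptions made for illustration.

```python
import time

# Hypothetical static limits, standing in for the file bundled into the package.
# Changing any value here means a code review, a build, and a deployment.
STATIC_LIMITS = {"customer-a": 100, "customer-b": 5}
DEFAULT_LIMIT = 50  # assumed limit for customers that are not listed


class FixedWindowLimiter:
    """Counts requests per customer in fixed one-second windows."""

    def __init__(self, limits, default_limit, clock=time.monotonic):
        self.limits = limits
        self.default_limit = default_limit
        self.clock = clock
        self.counts = {}  # customer -> (window_start_second, request_count)

    def allow(self, customer):
        now = int(self.clock())
        window, count = self.counts.get(customer, (now, 0))
        if window != now:
            # A new one-second window has started, so the count resets.
            window, count = now, 0
        limit = self.limits.get(customer, self.default_limit)
        if count >= limit:
            self.counts[customer] = (window, count)
            return False
        self.counts[customer] = (window, count + 1)
        return True


limiter = FixedWindowLimiter(STATIC_LIMITS, DEFAULT_LIMIT)
```

The limiting logic itself is tiny. The operational cost sits entirely in how the limits get changed, which is what the next iteration went after.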

Iteration 2

Next we needed to figure out how to apply configuration changes to production without having to build the code and then deploy it. We replaced the static config file with a reference to a file that sat in an S3 bucket. We then built a separate tool that could push changes from a developer's desktop to this file (with the appropriate checks and balances), and we updated the service to reload this file every few minutes. On the surface this sounds easy, but we had to think it through more deeply. We had to make sure the service would work even if S3 failed. We had to decide whether the service would fail open or apply a default limit across all customers, how to handle eventual consistency across the fleet, and so on. These were hard problems to solve, but we took our time answering them, knowing we had iteration 1 in production already. This bought us a ton of time.
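One way to handle the "what if S3 fails" question is to keep serving the last known good configuration and only fall back to a default limit when nothing has ever loaded. The sketch below assumes that policy. The `ReloadingLimits` class, the JSON file format, and the `fetch` callable (which in a real service might wrap an S3 read) are hypothetical and not what we actually shipped.

```python
import json


class ReloadingLimits:
    """Holds per-customer limits and refreshes them from a remote file."""

    def __init__(self, fetch, default_limit):
        self.fetch = fetch  # hypothetical callable returning the config as JSON text
        self.default_limit = default_limit  # applied when a customer has no entry
        self.limits = {}  # last known good configuration

    def reload(self):
        """Try to refresh; on any failure keep the last known good limits."""
        try:
            parsed = json.loads(self.fetch())
            if not isinstance(parsed, dict):
                raise ValueError("expected a JSON object of customer -> limit")
            new_limits = {str(k): int(v) for k, v in parsed.items()}
        except Exception:
            # Remote store unavailable or file malformed: do not touch self.limits.
            return False
        self.limits = new_limits
        return True

    def limit_for(self, customer):
        return self.limits.get(customer, self.default_limit)
```

In a running service, something like a background timer would call `reload()` every few minutes. Because each host reloads on its own schedule, hosts can briefly disagree about a limit after a change, which is the eventual consistency concern mentioned above.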

Iteration 3

Eventually we answered all the hard questions and got iteration 2 out to production. Our appetite grew at this point. We had reduced MTTR even more, but engineers wished we had a dynamic system that would rate limit incoming requests automatically. We felt we NEEDED this! We HAD to have this system to reduce our operational load even more! Such a system would have required:

  1. A low latency data distribution protocol to report request-rate-by-customer from hosts across the fleet
  2. A semi-decentralised service that can make decisions based on heuristics
  3. An automated system to update request limits
  4. Oh, and this system had to be highly resilient and fault tolerant

The perfect system

I reflect on my time on this team from time to time, and specifically on the rate limiting project. We knew we needed to get to iteration 3 at some point. If we had started there, we would have built a highly complex system that checked all the boxes, cost us a lot of time, and delivered only some of the value we were after, while creating new problems along the way.

The takeaways

My time on this team taught me a lot of valuable lessons. Two of the most important ones are:

  1. Avoid the temptation to rewrite: On the surface, rewriting a system feels like the right way to escape its inherent complexities. Once you appreciate the value of working systems and the lessons they embody, rewrites become less appealing. (Sometimes you have to rewrite a system. That’s ok. When you do, stick to replicating what works and make the smallest change necessary.)
  2. Transitional architecture: North stars are good to have, but often they should be left as just that, a vision. An iterative or transitional architecture can help eliminate unnecessary complexity and deliver a solution that is good enough.
