Internet Scale Design: Part One

Posted in: Technical Track

Whether your company is a small, bootstrapped startup or a rapidly growing, well-funded tier 2 or 3 business, you probably dream of having ‘Internet scale’ systems that are highly automated, anti-fragile, self-healing, and inexpensive to maintain.

The problem, however, is that those types of complex systems are only within the reach of well-funded companies that can afford to hire elite technical talent and buy expensive, leading-edge technology, right?

Wrong!

Recently, I was working with one of my smaller retail clients. Their business is highly seasonal: they do the vast majority of their business during two separate two-month periods each year. During these periods, the demand on their system is extremely high, and downtime would be catastrophic because so much of their annual revenue is generated during those times.

They run on a LAMP stack (Linux, Apache, MySQL, PHP), hosted on a mix of shared VMs and colocation equipment. Their hosting costs are fixed year-round, even though for most of the year they don’t need two-thirds of that capacity.

They were beginning their annual budget review when we brought up what their next steps should be. About a year earlier, I had begun introducing the idea of moving to the Amazon Web Services (AWS) cloud and taking more of a site reliability engineering (SRE) approach to their business, so we scheduled a conference call with their Chief Financial Officer, a few VPs, and one of my SREs.

“We would like to move you 100% into the Amazon Cloud,” I said. “In doing so, we will automate all of your systems using image and configuration management, and set up detailed monitoring, graphing, and trending systems. We will work with your developers to redesign your applications to take advantage of the platform and its flexibility. The end result will be a system that is considerably more anti-fragile, runs in multiple regions, offers significantly faster recovery in the event of major failures, and can scale down to very little and up to a much larger scale at a moment’s notice.” We assured them that the systems could be self-healing and would require very little management.

The Sr. VP of Product Management’s response surprised me.

“Bill, this sounds awesome. But we’re a pretty small company, and we couldn’t possibly afford that sort of system.” I laughed a bit and responded, “Actually, not only can you afford it, but it will save you tens of thousands of dollars.”

I explained the benefits of auto-scaling and the cost savings it would bring to their business model. We discussed how those cost savings would fund the expenses of having Pythian build and manage the new service.
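To make the arithmetic behind that claim concrete, here is a rough sketch of the comparison. It uses only two facts from the story: the peak load lasts four months a year (two two-month periods), and for the rest of the year roughly one third of the capacity would do. The monthly cost figure is purely hypothetical, and the sketch assumes that capacity costs the same per unit in both models and ignores migration and management fees, so treat the result as an illustration and not as the client's actual numbers.

```python
from fractions import Fraction


def annual_costs(monthly_peak_cost, peak_months, offpeak_fraction, months=12):
    """Compare fixed provisioning with capacity that follows demand.

    fixed:  pay for peak capacity every month of the year.
    scaled: pay for peak capacity only in peak months, and for
            `offpeak_fraction` of it during the remaining months.
    """
    offpeak_months = months - peak_months
    fixed = monthly_peak_cost * months
    scaled = monthly_peak_cost * (peak_months + offpeak_months * offpeak_fraction)
    return fixed, scaled


# Hypothetical figure, not taken from the client engagement.
MONTHLY_PEAK_COST = 9000  # dollars per month for peak-sized capacity

fixed, scaled = annual_costs(
    MONTHLY_PEAK_COST,
    peak_months=4,                     # two separate two-month periods
    offpeak_fraction=Fraction(1, 3),   # two-thirds of capacity unused off-peak
)
savings = fixed - scaled

print(f"Fixed capacity:  ${float(fixed):,.0f} per year")
print(f"Scaled capacity: ${float(scaled):,.0f} per year")
print(f"Savings:         ${float(savings):,.0f} ({float(savings / fixed):.1%})")
```

Under these assumptions the scaled model costs a little over half of the fixed one. Real savings depend on actual cloud pricing and on how closely capacity can track demand.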

His response? “This is amazing – why isn’t everyone doing this?”

The answer is twofold. I’ll be sharing my insights in part two, so stay tuned…
