As a software application scales, it invariably becomes more complex. With that added complexity comes a greater risk of problems that could affect the application’s availability.
Take, for example, the case of a well-known monitoring company that suffered from serious availability problems while it was growing from a small to a midsize company. Its traffic was increasing dramatically, but its infrastructure couldn’t keep up. Worse yet, the company didn’
Analytics are essential to the successful operation of every modern SaaS application. Effectively managing a SaaS application requires continuous tracking of its performance, what’s going on inside the application, and whether or not it’s accomplishing its goals.
However, there is a wide variety of analytics that need to be monitored and tracked to successfully run an application. The purpose, value, accuracy, and reliability of those analytics vary greatly depending on how they are
Seldom do emerging SaaS startups consider the scalability of their applications from the outset. While they may anticipate future expansion and incorporate growth into their financial strategies, their primary emphasis tends to be on developing marketable features rather than designing their applications for scalability.
However, it’s important to think about scalability right from the start, even before landing your first customer. As the company introduces one feature after another and a
To ensure that a modern, high-performance application can operate smoothly even if a data center experiences an outage, it’s crucial to distribute individual application instances across multiple data centers. This approach is widely recognized as a best practice within the industry and is an essential characteristic to incorporate into your application architecture to increase resilience against potential data center issues.
When constructing an application in the cloud, a similar princip
No matter how smoothly your services normally run, outages can happen to the best of us. The truth is that occasional incidents are unavoidable. Dealing with those incidents is both an art and a science, and there are many products, systems, and procedures that can help you build incident response processes that reduce the impact of incidents when they do happen to your application.
But what about after the incident? What then? Once an incident is finished, it’s just as importan
Imagine you and your friends have been eagerly anticipating the season premiere of your favorite HBO show all year long. You decide to throw a viewing party, excited to show off your brand new 75-inch 4K super deluxe TV. The drinks are cold, the snacks are all set out, and everyone is excited. The show is just about to start, when suddenly your internet connection goes down and all that enormous new TV screen displays is one big, high-definition error message.
This was the last thing you expecte
Availability and reliability are two similar but different concepts. When building a highly scaled, highly available web application, it is important to understand the difference between the two.
Reliability generally refers to the quality of a system. Typically, it means the ability of a system to consistently perform according to specifications. You speak of software as reliable if it passes its test suites and generally does what you expect it to do. Reliability answers questions such as
What do technology expert Ken Gavranovic and I have in common? We are both featured in the GOTO Book Club video this month. The GOTO Book Club series brings in experts and authors to interview each other, with a focus on newly released and classic dev books. In the latest episode, Ken and I discuss some of the topics I cover in the new second edition of my book, Architecting for Scale.
“Chaos shouldn’t be feared. Chaos is value. Chaos is an opportunity to learn.” –
I learned to fly radio-controlled airplanes when I was a kid, and one of the most important rules I remember was “Always keep your airplane at least ‘two mistakes’ high.” When you are learning to fly a model airplane, especially when you begin to attempt acrobatics, you learn this lesson quickly because mistakes equal altitude. You make a mistake, you lose altitude. As you can imagine, losing too much altitude makes for a very bad day for your airplane. So what does this h
What does it mean to “architect for scale” and why do you need to do so? Architecting for scale is about building and updating critical applications so they deliver what your increasingly demanding digital customers expect. Remember, your application’s performance will, more and more, be compared with the likes of Amazon, Instagram, and Facebook. Architecting for scale is a way of thinking, designing, planning for, and executing so your applications meet the needs and demands
What are SLAs? What are SLOs?
SLAs, or Service Level Agreements, are commitments to provide a given level of reliability and performance. In software terms, an SLA is usually expressed as a commitment to customers about the availability and operational readiness of a software application or system.
SLAs are used to create a solid contractual relationship between service owners and customers.
An overnight delive