As a software application scales, it invariably becomes more complex, and that added complexity increases the risk of problems that affect the application’s availability.
Take, for example, the case of a well-known monitoring company that suffered from serious availability problems while it was growing from a small to a midsize company. Its traffic was increasing dramatically, but its infrastructure couldn’t keep up. Worse yet, the company didn’t
To keep a modern, high-performance application running smoothly through a data center outage, distribute individual application instances across multiple data centers. This approach is widely recognized as an industry best practice, and building it into your application architecture increases resilience against data center issues.
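As a minimal sketch of the idea, the following Python spreads instances across data centers in round-robin order and then checks which instances survive a single-site outage. The data center names (`dc-east`, `dc-west`, `dc-central`), the `app-N` instance names, and both helper functions are hypothetical and serve only as illustration; a real deployment would rely on its own orchestration tooling.

```python
from collections import Counter
from itertools import cycle


def place_instances(instance_count, data_centers):
    """Assign instances to data centers round-robin so no single site holds them all."""
    if not data_centers:
        raise ValueError("at least one data center is required")
    sites = cycle(data_centers)
    # Instance names are illustrative placeholders.
    return {f"app-{i}": next(sites) for i in range(instance_count)}


def surviving_instances(placement, failed_data_center):
    """Return the instances that keep running when one data center goes down."""
    return sorted(
        name for name, site in placement.items() if site != failed_data_center
    )


# Hypothetical data center names, chosen only for this example.
placement = place_instances(6, ["dc-east", "dc-west", "dc-central"])
per_site = Counter(placement.values())
survivors = surviving_instances(placement, "dc-east")

print(per_site)
print(survivors)
```

With six instances spread over three sites, losing any one site still leaves four instances serving traffic, which is the property the practice is meant to guarantee.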
When constructing an application in the cloud, a similar princip
Availability and reliability are similar but distinct concepts. When building a highly scaled, highly available web application, it is important to understand the difference between them.
Reliability generally refers to the quality of a system. Typically, it means the ability of a system to consistently perform according to specifications. You speak of software as reliable if it passes its test suites and generally does what you expect it to do. Reliability answers questions such as
If the qualifications for playing in the Big Game were based entirely on regular season records, then the championship should’ve been between Tennessee and Green Bay. And yet, neither team made it to the game. In fact, both of the league’s best teams were gone by the end of the second round.
The lesson is clear: Things that look good on paper don’t always play out well in practice.
This lesson extends far beyond sports. In fact, it’s a crucial one for businesses th
I learned to fly radio-controlled airplanes when I was a kid, and one of the most important rules I remember was “Always keep your airplane at least ‘two mistakes’ high.” When you are learning to fly a model airplane, especially when you begin to attempt acrobatics, you learn this lesson quickly because mistakes equal altitude. You make a mistake, you lose altitude. As you can imagine, losing too much altitude makes for a very bad day for your airplane. So what does this h
On Monday, Facebook and its other networks, Instagram and WhatsApp, suffered their largest outage since 2008. By midday, The Verge speculated that DNS had caused the problem, and referred back to Slack’s outage last week to claim that “it’s always DNS.” We’re not going to speculate on what caused Facebook’s misfortune, but we will answer some of the most common questions about DNS.
What is DNS?
DNS stands for Domain Name System, which is akin to the internet’s
What does it mean to “architect for scale,” and why do you need to do so? Architecting for scale is about building and updating critical applications so they deliver what your increasingly demanding digital customers expect. Remember, your application’s performance will increasingly be compared with the likes of Amazon, Instagram, and Facebook. Architecting for scale is a way of thinking, designing, planning for, and executing so your applications meet the needs and demands
What are SLAs? What are SLOs?
SLAs, or service-level agreements, define a commitment to a given level of reliability and performance. In software terms, SLAs are usually expressed as commitments made to customers about the availability and operational readiness of a software application or system.
SLAs are a commitment to provide a given level of reliability and performance. They are used to create a solid contractual relationship between service owners and customers.
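To make such a commitment concrete, it can help to translate an availability percentage into the downtime it allows. The sketch below assumes the SLA is stated as an availability percentage over a 30-day period; the 99.9% figure, the 30-day window, and the `allowed_downtime` helper are illustrative assumptions, not terms taken from any particular agreement.

```python
from datetime import timedelta


def allowed_downtime(availability_percent, period=timedelta(days=30)):
    """Return the downtime budget implied by an availability target over a period."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    # The unavailable fraction of the period is the downtime budget.
    return period * (1 - availability_percent / 100)


# Hypothetical target: 99.9% availability over a 30-day period.
budget = allowed_downtime(99.9)
print(f"{budget.total_seconds() / 60:.1f} minutes of downtime allowed")
```

Under these assumptions, a 99.9% commitment leaves roughly 43 minutes of downtime per 30 days, which shows how quickly a single incident can consume the whole budget.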
An overnight delive