Learning Notes #8 – SLI, SLA, SLO

25 December 2024 at 16:11

In this post, I write about SLIs, SLOs, and SLAs. I got a refresher on the topic from a podcast, https://open.spotify.com/episode/2Ags7x1WrxaFLRd3KBU50K?si=vbYtW_YVQpOi8HwT9AOM1g, and this post is based on it.

In the world of service reliability and performance, the terms SLI, SLO, and SLA are often used interchangeably, but they have distinct meanings. This post explains each term, why it matters, and how the three relate to each other, with practical examples.

1. What are SLIs, SLOs, and SLAs?

Service Level Indicators (SLIs)

An SLI is a metric that quantifies the level of service provided by a system. It measures specific aspects of performance or reliability, such as response time, uptime, or error rate.

Example:

  • Percentage of successful HTTP requests over a time window.
  • Average latency of API responses.
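To make the two example SLIs concrete, here is a minimal sketch that computes them from a list of request records. The `Request` class, its field names, and the rule that any non-5xx status counts as a success are my own assumptions for illustration; a real system would usually read these values from its monitoring stack.

```python
from dataclasses import dataclass


@dataclass
class Request:
    status: int        # HTTP status code (hypothetical field name)
    latency_ms: float  # time taken to serve the request


def success_rate(requests: list[Request]) -> float:
    """SLI: percentage of requests in the window that succeeded.

    Assumption: anything that is not a 5xx response counts as a success.
    """
    if not requests:
        raise ValueError("no requests in the window")
    ok = sum(1 for r in requests if r.status < 500)
    return 100.0 * ok / len(requests)


def average_latency_ms(requests: list[Request]) -> float:
    """SLI: mean latency of the requests in the window."""
    if not requests:
        raise ValueError("no requests in the window")
    return sum(r.latency_ms for r in requests) / len(requests)


window = [
    Request(200, 120.0),
    Request(200, 80.0),
    Request(500, 400.0),
    Request(200, 200.0),
]
print(success_rate(window))        # 75.0
print(average_latency_ms(window))  # 200.0
```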

Service Level Objectives (SLOs)

An SLO is a target value or range for an SLI. It defines what “acceptable” performance or reliability looks like from the perspective of the service provider or user.

Example:

  • “99.9% of HTTP requests must succeed within 500ms.”
  • “The application should have 99.95% uptime per quarter.”
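An uptime target is easier to reason about once it is turned into a downtime allowance. The sketch below does that arithmetic for the 99.95% per quarter example. The function name is hypothetical, and the 90-day quarter is an assumption (real quarters run 90 to 92 days).

```python
def allowed_downtime_minutes(target_pct: float, period_days: int) -> float:
    """Minutes of downtime a period can absorb while still meeting the target."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - target_pct) / 100.0


# Assumption: a quarter is treated as 90 days.
budget = allowed_downtime_minutes(99.95, 90)
print(round(budget, 1))  # 64.8
```

So a 99.95% quarterly target leaves roughly an hour of downtime for the whole quarter.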

Service Level Agreements (SLAs)

An SLA is a formal contract between a service provider and a customer that specifies the agreed-upon SLOs and the consequences of failing to meet them, such as penalties or compensation.

Example:

  • “If the uptime drops below 99.5% in a calendar month, the customer will receive a 10% credit on their monthly bill.”
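The SLA clause above is a simple rule, so it can be written as a small function. This is only a sketch: `sla_credit` and its parameter names are hypothetical, and the default threshold and credit values are taken from the example clause, not from any real contract.

```python
def sla_credit(
    uptime_pct: float,
    monthly_bill: float,
    threshold_pct: float = 99.5,  # from the example clause
    credit_pct: float = 10.0,     # from the example clause
) -> float:
    """Credit owed to the customer for one calendar month."""
    if uptime_pct < threshold_pct:
        return monthly_bill * credit_pct / 100.0
    return 0.0


print(sla_credit(99.2, 200.0))  # 20.0
print(sla_credit(99.7, 200.0))  # 0.0
```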

2. Relationship Between SLIs, SLOs, and SLAs

  • SLIs are the metrics measured.
  • SLOs are the targets set for those metrics.
  • SLAs are agreements that formalize SLOs and include penalties or incentives.

SLI: Latency of API requests, measured as the percentage that complete in under 200ms.
SLO: 95% of API requests should have latency under 200ms.
SLA: If latency exceeds the SLO for two consecutive weeks, the provider will issue service credits.
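Here is a rough sketch of how the three layers stack up for this latency example. To compare against the 95% target, it measures the SLI as the percentage of requests under 200ms in each week. All function names are hypothetical, and reading "two consecutive weeks" as two missed weekly SLOs in a row is my interpretation of the example.

```python
SLO_TARGET_PCT = 95.0     # SLO: 95% of requests ...
LATENCY_LIMIT_MS = 200.0  # ... should have latency under 200ms


def weekly_sli(latencies_ms: list[float]) -> float:
    """SLI: percentage of the week's requests with latency under the limit."""
    if not latencies_ms:
        raise ValueError("no requests recorded for the week")
    fast = sum(1 for ms in latencies_ms if ms < LATENCY_LIMIT_MS)
    return 100.0 * fast / len(latencies_ms)


def slo_met(sli_pct: float) -> bool:
    """SLO: the weekly SLI must reach the target."""
    return sli_pct >= SLO_TARGET_PCT


def credits_due(weekly_slis: list[float], weeks_in_a_row: int = 2) -> bool:
    """SLA: credits are owed once the SLO is missed for consecutive weeks."""
    streak = 0
    for sli in weekly_slis:
        streak = 0 if slo_met(sli) else streak + 1
        if streak >= weeks_in_a_row:
            return True
    return False


print(weekly_sli([100.0, 150.0, 250.0, 199.0]))  # 75.0
print(credits_due([96.0, 94.0, 97.0, 93.5]))     # False
print(credits_due([96.0, 94.0, 93.0, 97.0]))     # True
```

The metric feeds the target check, and the target check feeds the contract rule, which is the same SLI to SLO to SLA chain described above.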

3. Practical Examples

Example 1: Web Hosting Service

  • SLI: Percentage of time the website is available.
  • SLO: The website must be available 99.9% of the time per month.
  • SLA: If uptime falls below 99.9%, the customer will receive a refund of 20% of their monthly fee.

Example 2: Cloud Storage Service

  • SLI: Time taken to retrieve a file from storage.
  • SLO: 95% of retrieval requests must complete within 300ms.
  • SLA: If retrieval times exceed 300ms for more than 5% of requests in a billing cycle, customers will get free additional storage for the next month.
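A target like "95% of requests within 300ms" is often checked as "the 95th percentile latency is at most 300ms". The sketch below uses the nearest-rank percentile definition, under which the two statements are equivalent. The function names are hypothetical, and other percentile definitions (for example, interpolating ones) can give slightly different answers.

```python
def percentile_nearest_rank(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: the value at rank ceil(pct/100 * n)."""
    if not values:
        raise ValueError("no values")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range 1..100")
    ordered = sorted(values)
    # Integer form of ceil(pct * n / 100), which avoids float rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def retrieval_slo_met(latencies_ms: list[float], limit_ms: float = 300.0) -> bool:
    """True when at least 95% of retrievals completed within the limit."""
    return percentile_nearest_rank(latencies_ms, 95) <= limit_ms


good_cycle = [100.0] * 19 + [900.0]  # 1 slow request out of 20 (5%)
bad_cycle = [100.0] * 18 + [900.0] * 2  # 2 slow requests out of 20 (10%)
print(retrieval_slo_met(good_cycle))  # True
print(retrieval_slo_met(bad_cycle))   # False
```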

Example 3: API Service

  • SLI: Error rate of API responses.
  • SLO: The error rate must stay below 0.1% of all requests each day.
  • SLA: If the error rate exceeds 0.1% for more than three days in a row, the customer is entitled to a credit worth 5% of their monthly subscription fee.
