Parotta Salna
Learning Notes #76 – Specifying Virtual Users (VUs) and Test duration in K6
16 February 2025 at 05:13

By: Mr.ParottaSalna

When running load tests with K6, two fundamental aspects that shape test execution are the number of Virtual Users (VUs) and the test duration. These parameters help simulate realistic user behavior and measure system performance under different load conditions.

In this blog, I jot down notes on setting virtual users and test duration through options, and on using them to ramp users up and down.

Defining VUs and Duration in K6

K6 offers multiple ways to define VUs and test duration, primarily through options in the test script or the command line.

Basic VU and Duration Configuration

The simplest way to specify VUs and test duration is by setting them in the options object of your test script.

import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  vus: 10, // Number of virtual users
  duration: '30s', // Duration of the test
};

export default function () {
  http.get('https://test.k6.io/');
  sleep(1);
}

This script runs a load test with 10 virtual users for 30 seconds, making requests to the specified URL.

Specifying VUs and Duration from the Command Line

You can also set the VUs and duration dynamically using command-line arguments without modifying the script.

k6 run --vus 20 --duration 1m script.js

This command runs the test with 20 virtual users for 1 minute.

Ramp Up and Ramp Down with Stages

Instead of a fixed number of VUs, you can simulate user load variations over time using stages. This helps to gradually increase or decrease the load on the system.

export const options = {
  stages: [
    { duration: '30s', target: 10 }, // Ramp up to 10 VUs
    { duration: '1m', target: 50 },  // Ramp up to 50 VUs
    { duration: '30s', target: 10 }, // Ramp down to 10 VUs
    { duration: '20s', target: 0 },  // Ramp down to 0 VUs
  ],
};

This test ramps the load up in two steps (to 10, then to 50 VUs) and then back down in two steps (to 10, then to 0), approximating how real-world traffic builds and fades.
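To see what these stages mean in numbers, here is a small Python sketch (not part of K6) that linearly interpolates the VU target over time, which is my understanding of how K6 ramps between stage targets. The helper names parse_duration and vus_at are hypothetical, and the starting VU count is passed explicitly as an assumption. K6 works with whole VUs, so treat the values as approximations.

```python
def parse_duration(text):
    """Convert a simple K6-style duration such as '30s' or '1m' to seconds."""
    units = {"s": 1, "m": 60, "h": 3600}
    return int(text[:-1]) * units[text[-1]]


def vus_at(stages, elapsed, start_vus=0):
    """Return the interpolated VU target `elapsed` seconds into the test."""
    current = start_vus
    for stage in stages:
        length = parse_duration(stage["duration"])
        if elapsed <= length:
            # Linear ramp from the previous target to this stage's target.
            return current + (stage["target"] - current) * elapsed / length
        elapsed -= length
        current = stage["target"]
    return current  # After the last stage, stay at the final target.


STAGES = [
    {"duration": "30s", "target": 10},
    {"duration": "1m", "target": 50},
    {"duration": "30s", "target": 10},
    {"duration": "20s", "target": 0},
]

for second in (15, 30, 60, 90, 120, 140):
    print(second, vus_at(STAGES, second))
```

Halfway through the first stage the target is 5 VUs, and halfway through the second it is 30.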

Custom Execution Scenarios

For more advanced load testing strategies, K6 supports scenarios, allowing fine-grained control over execution behavior.

Syntax of Custom Execution Scenarios

A scenarios object defines different execution strategies. Each scenario consists of

executor: Defines how the test runs (e.g., ramping-vus, constant-arrival-rate, etc.).
vus: Number of virtual users (for certain executors).
duration: How long the scenario runs.
iterations: Number of iterations to run: the total shared across all VUs in shared-iterations, or the count per VU in per-vu-iterations.
stages: Used in ramping-vus to define load variations over time.
rate: Number of iterations started per timeUnit in constant-arrival-rate.
preAllocatedVUs: Number of VUs initialized before the test starts (arrival-rate executors).

Different Executors in K6

K6 provides several executors that define how virtual users (VUs) generate load

shared-iterations – Distributes a fixed number of iterations across multiple VUs.
per-vu-iterations – Each VU runs a specific number of iterations independently.
constant-vus – Maintains a fixed number of VUs for a set duration.
ramping-vus – Increases or decreases the number of VUs over time.
constant-arrival-rate – Starts a constant number of iterations per time unit, independent of how many VUs are currently busy.
ramping-arrival-rate – Gradually increases or decreases the iteration rate over time.
externally-controlled – Allows VUs to be scaled at runtime through K6's REST API or CLI.

Example: Ramping VUs Scenario

import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  scenarios: {
    ramping_users: {
      executor: 'ramping-vus',
      startVUs: 0,
      stages: [
        { duration: '30s', target: 20 },
        { duration: '1m', target: 100 },
        { duration: '30s', target: 0 },
      ],
    },
  },
};

export default function () {
  http.get('https://test-api.example.com');
  sleep(1);
}

Example: Constant Arrival Rate Scenario

import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  scenarios: {
    constant_request_rate: {
      executor: 'constant-arrival-rate',
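      rate: 50, // Start 50 iterations per timeUnit
      timeUnit: '1s', // Per second
      duration: '1m', // Test duration
      preAllocatedVUs: 60, // Each iteration takes just over 1s (sleep + request), so 50/s needs more than 50 VUs
      maxVUs: 100, // Upper bound K6 may scale to if the pre-allocated VUs are all busy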
    },
  },
};

export default function () {
  http.get('https://test-api.example.com');
  sleep(1);
}
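With arrival-rate executors, the number of VUs needed depends on how long one iteration takes. Below is a rough Python sketch (not K6 code) of that arithmetic using Little's law. The function name required_vus and the 0.25 s request time are my assumptions for illustration.

```python
import math


def required_vus(rate, time_unit_s, iteration_s):
    """Estimate the VUs needed to sustain `rate` iterations per `time_unit_s`."""
    iterations_per_second = rate / time_unit_s
    # Little's law: concurrency = arrival rate x time spent per iteration.
    return math.ceil(iterations_per_second * iteration_s)


# 50 iterations/s, each taking 1 s of sleep plus an assumed 0.25 s request.
print(required_vus(rate=50, time_unit_s=1, iteration_s=1.25))
```

If the estimate is higher than preAllocatedVUs, K6 cannot start iterations fast enough and the achieved rate will fall below the configured one.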

Example: Per VU Iteration Scenario

import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  scenarios: {
    per_vu_iterations: {
      executor: 'per-vu-iterations',
      vus: 10,
      iterations: 50, // Each VU executes 50 iterations
      maxDuration: '1m', // Hard cap: with sleep(1), 50 iterations need over 50s, so slow responses may hit this limit
    },
  },
};

export default function () {
  http.get('https://test-api.example.com');
  sleep(1);
}

Choosing the Right Configuration

Use fixed VUs and duration for simple, constant load testing.
Use stages for ramping up and down load gradually.
Use scenarios for more complex and controlled testing setups.

References

Scenarios & Executors – https://grafana.com/docs/k6/latest/using-k6/scenarios/

Parotta Salna
Golden Feedbacks for Python Sessions 1.0 from last year (2024)
13 February 2025 at 08:49

By: Mr.ParottaSalna

Many Thanks to Shrini for documenting it last year. This serves as a good reference to improve my skills. Hope it will help many.

What Participants wanted to improve

🚶‍♂️ Go a bit slower so that everyone can understand clearly without feeling rushed.

Provide more basics and examples to make learning easier for beginners.

Spend the first week explaining programming basics so that newcomers don’t feel lost.

Teach flowcharting methods to help participants understand the logic behind coding.

Try teaching Scratch as an interactive way to introduce programming concepts.

Offer weekend batches for those who prefer learning on weekends.

Encourage more conversations so that participants can actively engage in discussions.

Create sub-groups to allow participants to collaborate and support each other.

Get “cheerleaders” within the team to make the classes more fun and interactive.

Increase promotion efforts to reach a wider audience and get more participants.

Provide better examples to make concepts easier to grasp.

Conduct more Q&A sessions so participants can ask and clarify their doubts.

Ensure that each participant gets a chance to speak and express their thoughts.

Showing your face in videos can help in building a more personal connection with the learners.

Organize mini-hackathons to provide hands-on experience and encourage practical learning.

Foster more interactions and connections between participants to build a strong learning community.

Encourage participants to write blogs daily to document their learning and share insights.

Motivate participants to give talks in class and other communities to build confidence.

Other Learnings & Suggestions

Avoid creating WhatsApp groups for communication, as the 1024 member limit makes it difficult to manage multiple groups.

Telegram works fine for now, but explore using mailing lists as an alternative for structured discussions.

Mute groups when necessary to prevent unnecessary messages like “Hi, Hello, Good Morning.”

Teach participants how to join mailing lists like ChennaiPy and KanchiLUG and guide them on asking questions in forums like Tamil Linux Community.

Show participants how to create a free blog on platforms like dev.to or WordPress to share their learning journey.

Avoid spending too much time explaining everything in-depth, as participants should start coding a small project by the 5th or 6th class.

Present topics as solutions to project ideas or real-world problem statements instead of just theory.

Encourage using names when addressing people, rather than calling them “Sir” or “Madam,” to maintain an equal and friendly learning environment.

Zoom is costly, and since only around 50 people complete the training, consider alternatives like Jitsi or Google Meet for better cost-effectiveness.

Will try to incorporate these learnings in our upcoming sessions.

Let’s make this learning experience engaging, interactive, and impactful!

Parotta Salna
Learning Notes #68 – Buildpacks and Dockerfile
2 February 2025 at 09:32

By: Mr.ParottaSalna

Over the last few days, I have been exploring Buildpacks. I am impressed by how much this tool reduces developer pain. In this blog, I jot down my experience with Buildpacks.

Before trying Buildpacks, we need to understand what an OCI image is.

What is an OCI Image?

An OCI Image (Open Container Initiative Image) is a standard format for container images, defined by the Open Container Initiative (OCI) to ensure interoperability across different container runtimes (Docker, Podman, containerd, etc.).

It consists of,

Manifest – Metadata describing the image (layers, config, etc.).
Config JSON – Information about how the container should run (CMD, ENV, etc.).
Filesystem Layers – The actual file system of the container.

OCI Image Specification ensures that container images built once can run on any OCI-compliant runtime.
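To make the three parts concrete, here is a minimal Python sketch of what an OCI image manifest looks like when loaded as a dictionary. The media types are the ones defined by the OCI Image Specification. The digests and sizes are placeholders, not real values, and the helper summarize is a hypothetical name.

```python
# Illustrative OCI image manifest; digests and sizes are placeholders.
MANIFEST = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "config": {
        "mediaType": "application/vnd.oci.image.config.v1+json",
        "digest": "sha256:<config-digest>",
        "size": 1234,
    },
    "layers": [
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "digest": "sha256:<layer-1-digest>",
            "size": 5678,
        },
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "digest": "sha256:<layer-2-digest>",
            "size": 9012,
        },
    ],
}


def summarize(manifest):
    """Return (role, mediaType) pairs for the config and each layer."""
    parts = [("config", manifest["config"]["mediaType"])]
    parts += [("layer", layer["mediaType"]) for layer in manifest["layers"]]
    return parts


for role, media_type in summarize(MANIFEST):
    print(role, media_type)
```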

Does Docker Create OCI Images?

Yes, in practice. Docker 1.10 introduced the Image Manifest V2, Schema 2 format, and the OCI Image Specification was later based on it, so images built by Docker are compatible with OCI tooling.

When you build an image with docker build, it follows the OCI Image format.
When you push/pull images to registries like Docker Hub, they follow the OCI Image Specification.

However, Docker also supports its legacy Docker Image format, which existed before OCI was introduced. Most modern registries and runtimes (Kubernetes, Podman, containerd) support OCI images natively.

What is a Buildpack ?

A buildpack is a framework for transforming application source code into a runnable image by handling dependencies, compilation, and configuration. Buildpacks are widely used in cloud environments like Heroku, Cloud Foundry, and Kubernetes (via Cloud Native Buildpacks).

Overview of Buildpack Process

The buildpack process consists of two primary phases

Detection Phase: Determines if the buildpack should be applied based on the app’s dependencies.
Build Phase: Executes the necessary steps to prepare the application for running in a container.

Buildpacks work with a lifecycle manager (e.g., Cloud Native Buildpacks’ lifecycle) that orchestrates the execution of multiple buildpacks in an ordered sequence.
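The two phases can be pictured with a toy Python model. This is only a sketch of the idea, not how the real lifecycle or any Paketo buildpack is implemented. The buildpack names and marker files below are hypothetical simplifications.

```python
from pathlib import Path
import tempfile

# Hypothetical buildpacks: name -> file whose presence makes detection pass.
BUILDPACKS = [
    ("python", "requirements.txt"),
    ("nodejs", "package.json"),
    ("go", "go.mod"),
]


def detect(app_dir):
    """Detection phase: return the buildpacks that apply to this source tree."""
    return [name for name, marker in BUILDPACKS if (Path(app_dir) / marker).exists()]


def build(app_dir):
    """Build phase: run each detected buildpack in order (here we only record steps)."""
    return [f"{name}: install dependencies and set start command" for name in detect(app_dir)]


# Try it on a throwaway directory that looks like a Python project.
with tempfile.TemporaryDirectory() as app_dir:
    (Path(app_dir) / "requirements.txt").write_text("fastapi\n")
    print(detect(app_dir))
    print(build(app_dir))
```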

Builder: The Image That Executes the Build

A builder is an image that contains all necessary components to run a buildpack.

Components of a Builder Image

Build Image – Used during the build phase (includes compilers, dependencies, etc.).
Run Image – A minimal environment for running the final built application.
Lifecycle – The core mechanism that executes buildpacks, orchestrates the process, and ensures reproducibility.

Stack: The Combination of Build and Run Images

Build Image + Run Image = Stack
Build Image: Base OS with tools required for building (e.g., Ubuntu, Alpine).
Run Image: Lightweight OS with only the runtime dependencies for execution.

Installation and Initial Setups

For Installation: https://buildpacks.io/docs/app-journey/
For App Developers: https://buildpacks.io/docs/for-app-developers/

Basic Build of an Image (Python Project)

Project Source: https://github.com/syedjaferk/gh_action_docker_build_push_fastapi_app

Building an image using buildpack

Before running these commands, ensure you have Pack CLI (pack) installed.

a) Get builder suggestions

pack builder suggest

b) Build the image

pack build my-python-app --builder paketobuildpacks/builder:base

c) Run the image locally


docker run -p 8080:8080 my-python-app

Building an Image using Dockerfile

a) Dockerfile


FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .

RUN pip install -r requirements.txt

COPY ./random_id_generator ./random_id_generator
COPY app.py app.py

EXPOSE 8080

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8080"]

b) Build and Run


docker build -t my-python-app .
docker run -p 8080:8080 my-python-app

Unique Benefits of Buildpacks

No Need for a `Dockerfile` (Auto-Detection)

Buildpacks automatically detect the language and dependencies, removing the need for Dockerfile.


pack build my-python-app --builder paketobuildpacks/builder:base

It detects Python, installs dependencies, and builds the app into a container. Docker requires a Dockerfile, which developers must manually configure and maintain.

Automatic Security Updates

Buildpacks let you patch the OS layers of an image for security vulnerabilities without a full rebuild.

If there’s a CVE in the OS layer, you can swap in a patched run image without rebuilding the app.


pack rebase my-python-app

No need to rebuild! It replaces only the OS layers while keeping the app the same.

Standardized & Reproducible Builds

Ensures consistent images across environments (dev, CI/CD, production). Example: Running the same build locally and on Heroku/Cloud Run,


pack build my-app

Extensibility: Custom Buildpacks

Developers can create custom Buildpacks to add special dependencies.

Example: Packaging a custom buildpack (for instance, one that adds ffmpeg to a Python app) from the current directory,


pack buildpack package my-custom-python-buildpack --path .

Generating SBOM in Buildpacks

a) Using `pack` CLI to Generate SBOM

After building an image with pack, run,


pack sbom download my-python-app --output-dir ./sbom

This fetches the SBOM for your built image.
The SBOM is saved in the ./sbom/ directory.

Supported formats:

SPDX (sbom.spdx.json)
CycloneDX (sbom.cdx.json)

b) Generate SBOM in Docker


trivy image --format cyclonedx -o sbom.json my-python-app

Both are helpful in creating images. It’s all about the trade-offs.

Parotta Salna
Learning Notes #67 – Build and Push to a Registry (Docker Hub) with GH-Actions
28 January 2025 at 02:30

By: Mr.ParottaSalna

GitHub Actions is a powerful tool for automating workflows directly in your repository. In this blog, we’ll explore how to efficiently set up GitHub Actions to handle Docker workflows with environments, secrets, and protection rules.

Why Use GitHub Actions for Docker?

My code base is on GitHub, and I want to try out GitHub Actions to build and push images to Docker Hub seamlessly.

Setting Up GitHub Environments

GitHub Environments let you define settings specific to deployment stages. Here’s how to configure them:

1. Create an Environment

Go to your GitHub repository and navigate to Settings > Environments. Click New environment, name it (e.g., production), and save.

2. Add Secrets and Variables

Inside the environment settings, click Add secret to store sensitive information like DOCKER_USERNAME and DOCKER_TOKEN.

Use Variables for non-sensitive configuration, such as the Docker image name.

3. Optional: Set Protection Rules

Enforce rules like requiring manual approval before deployments. Restrict deployments to specific branches (e.g., main).

Sample Workflow for Building and Pushing Docker Images

Below is a GitHub Actions workflow for automating the build and push of a Docker image based on a minimal FastAPI app.

Workflow: .github/workflows/docker-build-push.yml


name: Build and Push Docker Image

on:
  push:
    branches:
      - main  # Trigger workflow on pushes to the `main` branch

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    environment: production  # Specify the environment to use

    steps:
      # Checkout the repository
      - name: Checkout code
        uses: actions/checkout@v3

      # Log in to Docker Hub using environment secrets
      - name: Log in to Docker Hub
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_TOKEN }}

      # Build the Docker image using an environment variable
      - name: Build Docker image
        env:
          DOCKER_IMAGE_NAME: ${{ vars.DOCKER_IMAGE_NAME }}
        run: |
          docker build -t ${{ secrets.DOCKER_USERNAME }}/$DOCKER_IMAGE_NAME:${{ github.run_id }} .

      # Push the Docker image to Docker Hub
      - name: Push Docker image
        env:
          DOCKER_IMAGE_NAME: ${{ vars.DOCKER_IMAGE_NAME }}
        run: |
          docker push ${{ secrets.DOCKER_USERNAME }}/$DOCKER_IMAGE_NAME:${{ github.run_id }}

See the Actions runs live: https://github.com/syedjaferk/gh_action_docker_build_push_fastapi_app/actions

Parotta Salna
Learning Notes #66 – What is SBOM ? Software Bill of Materials
26 January 2025 at 09:16

By: Mr.ParottaSalna

Yesterday, I came to know about SBOM from my friend Prasanth Baskar. Let’s say you’re building a website.

You decide to use a popular open-source tool to handle user logins. Here’s the catch,

That tool uses another library to store data.

That library depends on yet another library to handle passwords.

Now, if one of those libraries has a bug or security issue, how do you even know it’s there? In this blog, I will jot down my understanding of SBOM with Trivy.

What is SBOM ?

A Software Bill of Materials (SBOM) is a list of everything that makes up a piece of software.

Think of it as,

A shopping list for all the tools, libraries, and pieces used to build the software.
A recipe card showing what’s inside and how it’s structured.

For software, this means,

Components: These are the “ingredients,” such as open-source libraries, frameworks, and tools.
Versions: Just like you might want to know if the cake uses almond flour or regular flour, knowing the version of a software component matters.
Licenses: Did the baker follow the rules for the ingredients they used? Software components also come with licenses that dictate how they can be used.

So Why Is It Important?

1. Understanding What You’re Using

When you download or use software, especially something complex, you often don’t know what’s inside. An SBOM helps you understand which components are being used: are they secure? Are they trustworthy?

2. Finding Problems Faster

If someone discovers that a specific ingredient is bad—like flour with bacteria in it—you’d want to know if that’s in your cake. Similarly, if a software library has a security issue, an SBOM helps you figure out if your software is affected and needs fixing.

For example,

When the Log4j vulnerability made headlines, companies that had SBOMs could quickly identify whether they used Log4j and take action.

3. Building Trust

Imagine buying food without a label or list of ingredients.

You’d feel doubtful, right ? Similarly, an SBOM builds trust by showing users exactly what’s in the software they’re using.

4. Avoiding Legal Trouble

Some software components come with specific rules or licenses about how they can be used. An SBOM ensures these rules are followed, avoiding potential legal headaches.

How to Create an SBOM?

For many developers, creating an SBOM manually would be impossible because modern software can have hundreds (or even thousands!) of components.

Thankfully, there are tools and standard formats that make SBOM generation automatic. Examples include,

Trivy: A lightweight tool to generate SBOMs and find vulnerabilities.
CycloneDX: A popular SBOM format supported by many tools https://cyclonedx.org/
SPDX: Another SBOM format designed to make sharing SBOMs easier https://spdx.dev/

These tools can scan your software and automatically list out every component, its version, and its dependencies.

We will see an example of generating an SBOM file for nginx using Trivy.

How Does Trivy Work?

On running a Trivy scan,

1. It downloads the Trivy DB, which includes vulnerability information.

2. It pulls the layers missing from the cache.

3. It analyzes the layers and stores the information in the cache.

4. It detects security issues and writes to the SBOM file.

Note: a CVE refers to a Common Vulnerabilities and Exposures identifier. A CVE is a unique code used to catalog and track publicly known security vulnerabilities and exposures in software or systems.

How to Generate SBOMs with Trivy

Step 1: Install Trivy in Ubuntu

sudo apt-get install wget gnupg
wget -qO - https://aquasecurity.github.io/trivy-repo/deb/public.key | gpg --dearmor | sudo tee /usr/share/keyrings/trivy.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/trivy.gpg] https://aquasecurity.github.io/trivy-repo/deb generic main" | sudo tee -a /etc/apt/sources.list.d/trivy.list
sudo apt-get update
sudo apt-get install trivy

More on Installation: https://github.com/aquasecurity/trivy/blob/main/docs/getting-started/installation.md

Step 2: Generate an SBOM

Trivy allows you to create SBOMs in formats like CycloneDX or SPDX.

trivy image --format cyclonedx --output sbom.json nginx:latest

It generates the SBOM file.
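Once the file exists, it is plain JSON and easy to inspect. Here is a minimal Python sketch that lists components and checks whether a given library is present, along the lines of the Log4j example earlier. The sample data is made up for illustration (a real sbom.json from Trivy has many more fields), and the helper names list_components and find_component are hypothetical.

```python
import json

# Made-up, trimmed-down CycloneDX document for illustration only.
SAMPLE_SBOM = """
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {"type": "library", "name": "openssl", "version": "3.0.11"},
    {"type": "library", "name": "zlib", "version": "1.2.13"}
  ]
}
"""


def list_components(sbom):
    """Return (name, version) for every component in a CycloneDX document."""
    return [(c["name"], c.get("version", "")) for c in sbom.get("components", [])]


def find_component(sbom, name):
    """Return the versions of the components whose name matches `name`."""
    return [version for comp, version in list_components(sbom) if comp == name]


sbom = json.loads(SAMPLE_SBOM)
print(list_components(sbom))
print(find_component(sbom, "log4j-core"))
```

For a real file, replace json.loads(SAMPLE_SBOM) with json.load on the opened sbom.json.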

It can be incorporated into a GitHub CI/CD pipeline.

Parotta Salna
Event Summary: FOSS United Chennai Meetup – 25-01-2025
26 January 2025 at 04:53

By: Mr.ParottaSalna

Attended the FOSS United Chennai Meetup yesterday!

After attending the Grafana & Friends Meetup, I went straight to the FOSS United Chennai Meetup at YuniQ in Taramani.

I had a chance to meet my friends face to face after a long time: Sakhil Ahamed E., Dhanasekar T, Dhanasekar Chellamuthu, Thanga Ayyanar, Parameshwar Arunachalam, Guru Prasath S, Krisha, Gopinathan Asokan.

Talks Summary,

1. Ansh Arora gave a tour of FOSS United: how it was formed, its motto, FOSS Hack, and FOSS Clubs.

2. Karthikeyan A K gave a talk on his open source product injee (the no-configuration instant database for frontend developers). He gave me a personal demo. It’s a great tool with a lot of potential, and I would like to contribute!

3. Justin Benito shared how they celebrated New Year with https://tamilnadu.tech
It’s a single go-to page for events in Tamil Nadu. If you are interested, go to the repo https://lnkd.in/geKFqnFz and contribute.

At Kaniyam Foundation, we have long maintained a Google Calendar of tech events happening in Tamil Nadu: https://lnkd.in/gbmGMuaa.

4. Prasanth Baskar gave a talk on Harbor, an OSS container registry with SBOM support and more. SBOM was new to me.

5. Thanga Ayyanar gave a talk on Static Site Generation with Emacs.

At the end, we had a group photo and went for tea. I got to meet my juniors from St. Joseph’s Institute of Technology at this meetup. I had a discussion with Parameshwar Arunachalam about his BuildToLearn experience. They have started prototyping a Tinder-style app for Tamil words. After that, we had a small discussion on our Feb 8th GLUG inauguration at St. Joseph’s Institute of Technology (Dr. KARTHI M).

Happy to see so many people travelling from different districts to attend this meetup.

Parotta Salna
Event Summary: Grafana & Friends Meetup Chennai – 25-01-2025
26 January 2025 at 04:47

By: Mr.ParottaSalna

Attended the Grafana & Friends Meetup yesterday!

I have long had a question: as a developer, I have logs, so isn’t that enough? With a curious mind, I attended the Grafana & Friends Chennai meetup (Jan 25th, 2025).

Had an awesome time meeting fellow tech enthusiasts (DevOps engineers) and learning about cool ways to monitor and understand data better.
Big shoutout to the Grafana Labs community and Presidio for hosting such a great event!

The sandwich and juice were nice.

Talk Summary,

1⃣ Making Data Collection Easier with Grafana Alloy
Dinesh J. and Krithika R shared how Grafana Alloy, combined with OpenTelemetry, makes it super simple to collect and manage data for better monitoring.

2⃣ Running Grafana in Kubernetes
Lakshmi Narasimhan Parthasarathy (https://lnkd.in/gShxtucZ) showed how to set up Grafana in Kubernetes in 4 different ways (vanilla, helm chart, grafana operator, kube-prom-stack). He is building a SaaS product https://lnkd.in/gSS9XS5m (Heroku on your own servers).

3⃣ Observability for Frontend Apps with Grafana Faro
Selvaraj Kuppusamy showed how Grafana Faro can help frontend developers monitor what’s happening on websites and apps in real time. This makes it easier to spot and fix issues quickly. We were able to see core web vitals and traces too, which surprised me.

Techies I interacted with,

Prasanth Baskar, who is an open source contributor at Cloud Native Computing Foundation (CNCF) on project https://lnkd.in/gmHjt9Bs. I was also happy to know that he knows **parottasalna** (that’s me) and has read some of my blogs. Happy to hear that.

Selvaraj Kuppusamy, a DevOps engineer, is also conducting the Grafana and Friends chapter in Coimbatore on Feb 1. I will attend that as well.

Saranraj Chandrasekaran, who is also a DevOps engineer. I had a chat with him on DevOps and related topics.

To all of them, I shared about KanchiLUG (https://lnkd.in/gasCnxXv), Parottasalna (https://parottasalna.com/), and my tech channel https://lnkd.in/gKcyE-b5.

Thanks to Achanandhi M for organising this wonderful meetup. You did well. I came to know Achanandhi M through Medium. He regularly writes blogs on cloud-related topics. Check out his blog: https://lnkd.in/ghUS-GTc

Also, he shared some tasks for us,

1. Create your first Grafana dashboard.
Objective: Create a basic Grafana dashboard to visualize data in various formats such as tables, charts and graphs. Also, try to connect to multiple data sources to get diverse data for your dashboard.

2. Monitor your Linux system’s health with Prometheus, Node Exporter and Grafana.
Objective: Use Prometheus, Node Exporter and Grafana to monitor your Linux machine’s health by tracking key metrics like CPU, memory and disk usage.

3. Use Grafana Faro to track user actions (like button clicks) and identify the most used features.

Give these a try.

Parotta Salna
RSVP for RabbitMQ: Build Scalable Messaging Systems in Tamil
24 January 2025 at 11:21

By: Mr.ParottaSalna

Hi All,

Invitation to RabbitMQ Session

Topic: RabbitMQ: Asynchronous Communication
Date: Feb 2 Sunday
Time: 10:30 AM to 1 PM
Venue: Online. Will be shared in mail after RSVP.

Join us for an in-depth session on RabbitMQ in தமிழ், where we’ll explore,

Message queuing fundamentals
Connections, channels, and virtual hosts
Exchanges, queues, and bindings
Publisher confirmations and consumer acknowledgments
Use cases and live demos

Whether you’re a developer, DevOps enthusiast, or curious learner, this session will empower you with the knowledge to build scalable and efficient messaging systems.

Don’t miss this opportunity to level up your messaging skills!

RSVP closed !

Our Previous Monthly meets – https://www.youtube.com/watch?v=cPtyuSzeaa8&list=PLiutOxBS1MizPGGcdfXF61WP5pNUYvxUl&pp=gAQB

Parotta Salna
Learning Notes #65 – Application Logs, Metrics, MDC
21 January 2025 at 05:45

By: Mr.ParottaSalna

I am a big fan of logs and would like to log everything: every request and response of an API. But is that correct? Though logs helped our team greatly during this new year, I wanted to know whether there is a better approach to logging. That search led to this blog, in which I jot down notes on logging. Let’s log it.

Throughout this blog, I try to generalize rather than favor a particular language, though here and there you will see my bias towards Python. Also, this is my opinion, not a hard rule.

Which is the best logger?

I’m not here to argue about which logger is the best, they all have their problems. But the worst one is usually the one you build yourself. Sure, existing loggers aren’t perfect, but trying to create your own is often a much bigger mistake.

1. Why Logging Matters

Logging provides visibility into your application’s behavior, helping to,

Diagnose and troubleshoot issues (this is the most common use case)
Monitor application health and performance (Metrics)
Meet compliance and auditing requirements (Audit Logs)
Enable debugging in production environments (we all do this.)

However, poorly designed logging strategies can lead to excessive log volumes, higher costs, and difficulty in pinpointing actionable insights.

2. Logging Best Practices

a. Use Structured Logs

Long story short, instead of unstructured plain text, use JSON or other structured formats. This makes parsing and querying easier, especially in log aggregation tools.


{
  "timestamp": "2025-01-20T12:34:56Z",
  "level": "INFO",
  "message": "User login successful",
  "userId": 12345,
  "sessionId": "abcde12345"
}
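In Python, one way to get output shaped like this is a custom formatter on top of the standard logging module. This is a minimal sketch, not the only way to do it. The class name JsonFormatter and the choice of extra fields (userId, sessionId) are my assumptions, taken from the sample above.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Fields passed via `extra=` become attributes on the record.
        for key in ("userId", "sessionId"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("structured_demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("User login successful", extra={"userId": 12345, "sessionId": "abcde12345"})
```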

b. Leverage Logging Levels

Define and adhere to appropriate logging levels to avoid log bloat:

DEBUG: Detailed information for debugging.
INFO: General operational messages.
WARNING: Indications of potential issues.
ERROR: Application errors that require immediate attention.
CRITICAL: Severe errors leading to application failure.

c. Avoid Sensitive Data

Sanitize your logs to exclude sensitive information like passwords, PII, or API keys. Instead, mask or hash such data. Don’t log tokens, even for testing.
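As a safety net, masking can be done in a logging filter so that it applies to every message passing through a handler. Below is a rough Python sketch, assuming sensitive values appear as key=value pairs. The class name RedactFilter and the list of keys are hypothetical, and a real system would need patterns that match its own data.

```python
import logging
import re

# Assumed key names; extend this for your own payloads.
SENSITIVE = re.compile(r"(password|token|api_key)=(\S+)", re.IGNORECASE)


class RedactFilter(logging.Filter):
    """Mask values of known sensitive keys before the record is emitted."""

    def filter(self, record):
        record.msg = SENSITIVE.sub(r"\1=***", record.getMessage())
        record.args = ()  # The message is already fully formatted.
        return True


handler = logging.StreamHandler()
handler.addFilter(RedactFilter())
logger = logging.getLogger("redact_demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("login user=%s password=%s", "bob", "hunter2")
```

A filter like this only catches what the pattern knows about, so it complements rather than replaces not logging the data in the first place.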

d. Include Contextual Information

Incorporate metadata like request IDs, user IDs, or transaction IDs to trace specific events effectively.
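Java loggers call this a Mapped Diagnostic Context (MDC). In Python, a similar effect can be sketched with contextvars plus a logging filter, so the request ID is attached to every line without being passed around by hand. The names request_id_var, RequestIdFilter and handle_request are hypothetical.

```python
import contextvars
import logging

# Holds the current request ID; "-" is used when no request is active.
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Copy the current request ID onto every record."""

    def filter(self, record):
        record.request_id = request_id_var.get()
        return True


handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(levelname)s [%(request_id)s] %(message)s"))
handler.addFilter(RequestIdFilter())
logger = logging.getLogger("mdc_demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)


def handle_request(request_id):
    """Set the ID for the duration of one request, then restore the old value."""
    token = request_id_var.set(request_id)
    try:
        logger.info("Processing payment")
    finally:
        request_id_var.reset(token)


handle_request("req-42")
```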

3. Log Ingestion at Scale

As applications scale, log ingestion can become a bottleneck. Here’s how to manage it,

a. Centralized Logging

Stream logs to centralized systems like Elasticsearch, Logstash, Kibana (ELK), or cloud-native services like AWS CloudWatch, Azure Monitor, or Google Cloud Logging.

b. Optimize Log Volume

Log only necessary information.
Use log sampling to reduce verbosity in high-throughput systems.
Rotate logs to limit disk usage.
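Log sampling can be sketched as a logging filter that lets only every Nth low-severity record through while never dropping warnings and errors. This is a simple counter-based sketch. The class name SampleFilter is hypothetical, and real systems often sample per request or per message type instead.

```python
import logging


class SampleFilter(logging.Filter):
    """Pass every Nth record below WARNING; always pass WARNING and above."""

    def __init__(self, every_n):
        super().__init__()
        self.every_n = every_n
        self.seen = 0

    def filter(self, record):
        if record.levelno >= logging.WARNING:
            return True
        self.seen += 1
        # Keep the 1st, (N+1)th, (2N+1)th ... low-severity record.
        return (self.seen - 1) % self.every_n == 0


handler = logging.StreamHandler()
handler.addFilter(SampleFilter(every_n=10))
logger = logging.getLogger("sampling_demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

for i in range(30):
    logger.info("cache hit for key %d", i)  # Only 3 of these 30 lines are emitted.
```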

c. Use Asynchronous Logging

Asynchronous loggers improve application performance by delegating logging tasks to separate threads or processes. (This is not always suitable; it comes with its own problems.)
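In Python, the standard library covers this with QueueHandler and QueueListener: the application thread only puts records on a queue, and a background thread does the slow I/O. Here is a minimal sketch. The helper name make_async_logger is hypothetical.

```python
import logging
import logging.handlers
import queue


def make_async_logger(name, target_handler):
    """Return a logger that enqueues records, plus the listener that drains them."""
    log_queue = queue.Queue(-1)  # Unbounded; a bounded queue needs a drop or block policy.
    logger = logging.getLogger(name)
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    logger.setLevel(logging.INFO)
    # The listener runs target_handler on its own background thread.
    listener = logging.handlers.QueueListener(log_queue, target_handler)
    listener.start()
    return logger, listener


logger, listener = make_async_logger("async_demo", logging.StreamHandler())
logger.info("Order placed")
listener.stop()  # Drains the queue; call this at shutdown or records may be lost.
```

One of the problems hinted at above is visible here: if the process dies before stop() runs, whatever is still in the queue never gets written.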

d. Method return values are usually important

If you have a log in the method and don’t include the return value of the method, you’re missing important information. Make an effort to include that at the expense of slightly less elegant looking code.

e. Include filename in error messages

Mention the path/to/file:line-number to pinpoint the location of the issue.

4. Logging Don’ts

a. Don’t Log Everything at the Same Level

Logging all messages at the INFO or DEBUG level creates noise and makes it difficult to identify critical issues.

b. Don’t Hardcode Log Messages

Avoid static, vague, or generic log messages. Use dynamic and descriptive messages that include relevant context.

# Bad Example
Error occurred.

# Good Example
Error occurred while processing payment for user_id=12345, transaction_id=abc-6789.

c. Don’t Log Sensitive or Regulated Data

Exposing personally identifiable information (PII), passwords, or other sensitive data in logs can lead to compliance violations (e.g., GDPR, HIPAA).

d. Don’t Ignore Log Rotation

Failing to implement log rotation can result in disk space exhaustion, especially in high traffic systems (Log Retention).

e. Don’t Overlook Log Correlation

Logs without request IDs, session IDs, or contextual metadata make it difficult to correlate related events.

f. Don’t Forget to Monitor Log Costs

Logging everything without considering storage and processing costs can lead to financial inefficiency in large-scale systems.

g. Keep the log message short

Long and verbose messages are a cost. The cost is in reading time and ingestion time.

h. Avoid logging inside loops

This might seem obvious, but just to be clear: logging inside a loop, even at a level that isn’t visible by default, can still hurt performance. It’s best to avoid this whenever possible.

If you absolutely need to log something at a hidden level and decide to break this guideline, keep it short and straightforward.

i. Log only items you already “have”

We should avoid code like this,


logger.info("Reached X and value of method is {}", method());

Here, we are calling method() again just for the purpose of logging. Even if the method is cheap, you’re effectively running it regardless of the configured logging level!
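A Python sketch of two ways around this. expensive_summary is a hypothetical stand-in for any costly call, and the CALLS counter exists only to show how often it runs.

```python
import logging

logger = logging.getLogger("orders")
CALLS = {"count": 0}  # illustration only: counts how often the costly call runs


def expensive_summary() -> str:
    """Hypothetical stand-in for a costly call."""
    CALLS["count"] += 1
    return "42 items"


def process() -> str:
    summary = expensive_summary()  # value the code needs anyway
    logger.info("Reached X, summary=%s", summary)  # reuse it; no second call
    return summary


def debug_only_detail() -> None:
    # If a value is needed only for logging, guard the call with a level check
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("detail=%s", expensive_summary())
```

With the logger at INFO, calling process() and then debug_only_detail() runs expensive_summary only once.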

j. Don’t log iterables

Avoid this even if it’s a small list. The concern is that the list might grow and “overcrowd” the log. Writing the contents of the list to the log can balloon it and slow processing noticeably. It also wastes time during debugging.

k. Don’t Log What the Framework Logs for You

There are great things to log. E.g. the name of the current thread, the time, etc. But those are already written into the log by default almost everywhere. Don’t duplicate these efforts.

l. Don’t log Method Entry/Exit

Log only important events in the system. Entering or exiting a method isn’t an important event. E.g. if I have a method that enables feature X the log should be “Feature X enabled” and not “enable_feature_X entered”. I have done this a lot.

m. Don’t fill the method

A complex method might include multiple points of failure, so it seems sensible to place logs at multiple points in the method to detect a failure along the way. Unfortunately, this leads to duplicate logging and verbosity.

Errors typically map to error-handling code, which should log them generically, so all error conditions should already be covered.

As a result, we sometimes need to change the flow or behavior of the code so that logging becomes more elegant.

n. Don’t use AOP logging

AOP (Aspect-Oriented Programming) logging allows you to automatically add logs at specific points in your application, such as when methods are entered or exited.

In Python, AOP-style logging can be implemented using decorators or middleware that inject logs into specific points, such as method entry and exit. While it might seem appealing for detailed tracing, the same problems apply as in other languages like Java.


import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def log_method_entry_exit(func):
    def wrapper(*args, **kwargs):
        logger.info(f"Entering: {func.__name__} with args={args} kwargs={kwargs}")
        result = func(*args, **kwargs)
        logger.info(f"Exiting: {func.__name__} with result={result}")
        return result
    return wrapper

# Example usage
@log_method_entry_exit
def example_function(x, y):
    return x + y

example_function(5, 3)

Why Avoid AOP Logging in Python

Performance Impact:
- Injecting logs into every method increases runtime overhead, especially if used extensively in large-scale systems.
- In Python, where function calls already add some overhead, this can significantly affect performance.
Log Verbosity:
- If this decorator is applied to every function or method in a system, it produces an enormous amount of log data.
- Debugging becomes harder because the meaningful logs are lost in the noise of entry/exit logs.
Limited Usefulness:
- During local development, tools like Python debuggers (pdb), profilers (cProfile, line_profiler), or tracing libraries like trace are far more effective for inspecting function behavior and performance.
CI Issues:
- Enabling such verbose logging during CI test runs can make tracking test failures more difficult because the logs are flooded with entry/exit messages, obscuring the root cause of failures.

Use Python-specific tools like pdb, ipdb, or IDE-integrated debuggers to inspect code locally.

o. Don’t Double Log

It’s pretty common to log an error when we’re about to throw an error. However, since most error code is generic, it’s likely there’s a log in the generic error handling code.

4. Ensuring Scalability

To keep your logging system robust and scalable,

Monitor Log Storage: Set alerts for log storage thresholds.
Implement Compression: Compress log files to reduce storage costs.
Automate Archival and Deletion: Regularly archive old logs and purge obsolete data.
Benchmark Logging Overhead: Measure the performance impact of logging on your application.

5. Logging for Metrics

Below is the list of items that I wish to log for metrics.

General API Metrics

General API Metrics on HTTP methods, status codes, latency/duration, request size.
Total requests per endpoint over time. Requests per minute/hour.
Frequency and breakdown of 4XX and 5XX errors.
User ID or API client making the request.


{
  "timestamp": "2025-01-20T12:34:56Z",
  "endpoint": "/projects",
  "method": "POST",
  "status_code": 201,
  "user_id": 12345,
  "request_size_bytes": 512,
  "response_size_bytes": 256,
  "duration_ms": 120
}
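A record like the one above could be emitted with the standard logging module and a small JSON formatter. This is a sketch under assumptions: the names JsonFormatter and log_request and the "metrics" key passed through extra= are choices made here for illustration.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        payload = {
            "timestamp": created.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "message": record.getMessage(),
        }
        # Values passed via extra={"metrics": {...}} become record attributes
        payload.update(getattr(record, "metrics", {}))
        return json.dumps(payload)


def log_request(logger: logging.Logger, **metrics) -> None:
    """Log one API request with its metric fields."""
    logger.info("api_request", extra={"metrics": metrics})
```

A call such as log_request(logger, endpoint="/projects", method="POST", status_code=201, duration_ms=120) then produces one JSON line per request, provided a handler on that logger uses JsonFormatter.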

Business Specific Metrics

Objects (session) creations: No. of projects created (daily/weekly)
Average success/failure rate.
Average time to create a session.
Frequency of each action on top of session.


{
  "timestamp": "2025-01-20T12:35:00Z",
  "endpoint": "/projects/12345/actions",
  "action": "edit",
  "status_code": 200,
  "user_id": 12345,
  "duration_ms": 98
}

Performance Metrics

Database query metrics on execution time, no. of queries per request.
Third party service metrics on time spent, success/failure rates of external calls.


{
  "timestamp": "2025-01-20T12:37:15Z",
  "endpoint": "/projects/12345",
  "db_query_time_ms": 45,
  "external_api_time_ms": 80,
  "status_code": 200,
  "duration_ms": 130
}

Scalability Metrics

Concurrency metrics on the maximum number of requests handled.
Request queue times during load.
System Metrics on CPU and Memory usage during request processing (this will be auto captured).

Usage Metrics

Traffic analysis on peak usage times.
Most/Least used endpoints.

6. Mapped Diagnostic Context (MDC)

MDC is the feature I longed for the most. I also ran into trouble by implementing it without a middleware.

Mapped Diagnostic Context (MDC) is a feature provided by many logging frameworks, such as Logback, Log4j, and SLF4J. It allows developers to attach contextual information (key-value pairs) to the logging events, which can then be automatically included in log messages.

This context helps in differentiating and correlating log messages, especially in multi-threaded applications.

Why Use MDC?

Enhanced Log Clarity: By adding contextual information like user IDs, session IDs, or transaction IDs, MDC enables logs to provide more meaningful insights.
Easier Debugging: When logs contain thread-specific context, tracing the execution path of a specific transaction or user request becomes straightforward.
Reduced Log Ambiguity: MDC ensures that logs from different threads or components do not get mixed up, avoiding confusion.

Common Use Cases

Web Applications: Logging user sessions, request IDs, or IP addresses to trace the lifecycle of a request.
Microservices: Propagating correlation IDs across services for distributed tracing.
Background Tasks: Tracking specific jobs or tasks in asynchronous operations.

Limitations (curated from other blogs; I haven’t tried these yet)

Thread Boundaries: MDC is thread-local, so its context does not automatically propagate across threads (e.g., in asynchronous executions). For such scenarios, you may need to manually propagate the MDC context.
Overhead: Adding and managing MDC context introduces a small runtime overhead, especially in high-throughput systems.
Configuration Dependency: Proper MDC usage often depends on correctly configuring the logging framework.
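The thread-boundary limitation can be reproduced in Python with a thread-local store, which is the same idea MDC is built on. This is a toy sketch, not a logging-framework API: mdc_put, mdc_snapshot, and run_in_thread are hypothetical helpers.

```python
import threading

_mdc = threading.local()  # each thread sees its own attributes on this object


def mdc_put(**kv) -> None:
    """Add key-value pairs to the current thread's context."""
    ctx = getattr(_mdc, "ctx", {})
    _mdc.ctx = {**ctx, **kv}


def mdc_snapshot() -> dict:
    """Return a copy of the current thread's context."""
    return dict(getattr(_mdc, "ctx", {}))


def run_in_thread(target, propagate: bool) -> dict:
    """Run target in a new thread, optionally copying the caller's context."""
    parent = mdc_snapshot()
    result = {}

    def runner():
        if propagate:
            mdc_put(**parent)  # manual propagation across the thread boundary
        result.update(target())

    worker = threading.Thread(target=runner)
    worker.start()
    worker.join()
    return result
```

After mdc_put(request_id="abc") in the main thread, run_in_thread(mdc_snapshot, propagate=False) returns an empty dict, while propagate=True returns the parent's context.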


2025-01-21 14:22:15.123 INFO  [thread-1] [userId=12345, transactionId=abc123] Starting transaction
2025-01-21 14:22:16.456 DEBUG [thread-1] [userId=12345, transactionId=abc123] Processing request
2025-01-21 14:22:17.789 ERROR [thread-1] [userId=12345, transactionId=abc123] Error processing request: Invalid input
2025-01-21 14:22:18.012 INFO  [thread-1] [userId=12345, transactionId=abc123] Transaction completed

In FastAPI, we can implement this via a middleware,


import logging
import uuid
from contextvars import ContextVar

from fastapi import FastAPI, Request
from starlette.middleware.base import BaseHTTPMiddleware

# Context variables hold the per-request values (the MDC equivalent in Python)
user_id_var: ContextVar[str] = ContextVar("user_id", default="unknown")
transaction_id_var: ContextVar[str] = ContextVar("transaction_id", default="-")

# Filter that copies the current context onto every log record
class ContextFilter(logging.Filter):
    def filter(self, record):
        record.user_id = user_id_var.get()
        record.transaction_id = transaction_id_var.get()
        return True

# Set the logging format with MDC keys
formatter = logging.Formatter(
    "%(asctime)s %(levelname)s [%(threadName)s] [userId=%(user_id)s, transactionId=%(transaction_id)s] %(message)s"
)

# Apply the formatter and the context filter to the handler
console_handler = logging.StreamHandler()
console_handler.setFormatter(formatter)
console_handler.addFilter(ContextFilter())

# Configure the logger
logger = logging.getLogger("uvicorn")
logger.setLevel(logging.INFO)
logger.addHandler(console_handler)

# FastAPI application
app = FastAPI()

# Custom Middleware to add MDC context
class RequestContextMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        # Add MDC info before handling the request
        user_token = user_id_var.set(request.headers.get("X-User-ID", "default-user"))
        txn_token = transaction_id_var.set(str(uuid.uuid4()))
        logger.info("Request started")
        try:
            response = await call_next(request)
            # Optionally, log additional information when the response is done
            logger.info("Request finished")
            return response
        finally:
            # Reset the context so values do not leak beyond this request
            user_id_var.reset(user_token)
            transaction_id_var.reset(txn_token)

# Add custom middleware to the FastAPI app
app.add_middleware(RequestContextMiddleware)

@app.get("/")
async def read_root():
    logger.info("Handling the root endpoint.")
    return {"message": "Hello, World!"}

@app.get("/items/{item_id}")
async def read_item(item_id: int):
    logger.info(f"Fetching item with ID {item_id}")
    return {"item_id": item_id}

Hope you now have a better idea about logging.

Parotta Salna
Learning Notes #62 – Serverless – Just like riding a taxi
19 January 2025 at 04:55

By: Mr.ParottaSalna

What is Serverless Computing?

Serverless computing allows developers to run applications without having to manage the underlying infrastructure. You write code, deploy it, and the cloud provider takes care of the rest, from provisioning servers to scaling applications.

Popular serverless platforms include AWS Lambda, Azure Functions, and Google Cloud Functions.

The Taxi Analogy

Imagine traveling to a destination. There are multiple ways to get there,

Owning a Car (Traditional Servers): You own and maintain your car. This means handling maintenance, fuel, insurance, parking, and everything else that comes with it. It’s reliable and gives you control, but it’s also time-consuming and expensive to manage.
Hiring a Taxi (Serverless): With a taxi, you simply book a ride when you need it. You don’t worry about maintaining the car, fueling it, or where it’s parked afterward. You pay only for the distance traveled, and the service scales to your needs whether you’re alone or with friends.

Why Serverless is Like Taking a Taxi?

No Infrastructure Management – With serverless, you don’t have to manage or worry about servers, just like you don’t need to maintain a taxi.
Pay-As-You-Go – In a taxi, you pay only for the distance traveled. Similarly, in serverless, you’re billed only for the compute time your application consumes.
On-Demand Availability – Need a ride at midnight? A taxi is just a booking away. Serverless functions work the same way: they are available whenever you need them, scaling up or down as required.
Scalability – Whether you’re a solo traveler or part of a group, taxis can adapt by providing a small car or a larger vehicle. Serverless computing scales resources automatically based on traffic, ensuring optimal performance.
Focus on the Destination – When you take a taxi, you focus on reaching your destination without worrying about the vehicle. Serverless lets you concentrate on writing and deploying code rather than worrying about servers.

Key Benefits of Serverless (and Taxi Rides)

Cost-Effectiveness – Avoid upfront costs. No need to buy servers (or cars) you might not fully utilize.
Flexibility – Serverless platforms support multiple programming languages and integrations. Taxis, too, come in various forms: regular cars, SUVs, and even luxury rides for special occasions.
Reduced Overhead – Free yourself from maintenance tasks, whether it’s patching servers or checking tire pressure.

When Not to Choose Serverless (or a Taxi)

Predictable, High-Volume Usage – Owning a car might be cheaper if you’re constantly on the road. Similarly, for predictable and sustained workloads, traditional servers or containers might be more cost-effective than serverless.
Special Requirements – Need a specific type of vehicle, like a truck for moving furniture? Owning one might make sense. Similarly, applications with unique infrastructure requirements may not be a perfect fit for serverless.
Latency Sensitivity – Taxis take time to arrive after booking. Likewise, serverless functions may experience cold starts, adding slight delays. For ultra-low-latency applications, other architectures may be preferable.

Parotta Salna
Learning Notes #57 – Partial Indexing in Postgres
16 January 2025 at 14:36

By: Mr.ParottaSalna

Today, I learnt about partial indexing in Postgres and how it optimizes indexing for queries that filter a subset of a table. In this blog, I jot down notes on partial indexing.

Partial indexing in PostgreSQL is a powerful feature that provides a way to optimize database performance by creating indexes that apply only to a subset of a table’s rows. This selective indexing can result in reduced storage space, faster index maintenance, and improved query performance, especially when queries frequently involve filters or conditions that only target a portion of the data.

An index in PostgreSQL, like in other relational database management systems, is a data structure that improves the speed of data retrieval operations. However, creating an index on an entire table can sometimes be inefficient, especially when dealing with very large datasets where queries often focus on specific subsets of the data. This is where partial indexing becomes invaluable.

Unlike a standard index that covers every row in a table, a partial index only includes rows that satisfy a specified condition. This condition is defined using a WHERE clause when the index is created.

To understand the mechanics, let us consider a practical example.

Suppose you have a table named orders that stores details about customer orders, including columns like order_id, customer_id, order_date, status, and total_amount. If the majority of your queries focus on pending orders (those where the status is pending), creating a partial index specifically for these rows can significantly improve performance.

Example 1:

Here’s how you can create such an index,

CREATE INDEX idx_pending_orders
ON orders (order_date)
WHERE status = 'pending';

In this example, the index idx_pending_orders includes only the rows where status equals pending. This means that a query that filters by status = 'pending' and uses the order_date column can leverage this index. For instance, the following query would benefit from the partial index,

SELECT *
FROM orders
WHERE status = 'pending'
AND order_date > '2025-01-01';

The benefits of this approach are significant. By indexing only the rows with status = 'pending', the size of the index is much smaller compared to a full table index.

This reduction in size not only saves disk space but also speeds up the process of scanning the index, as there are fewer entries to traverse. Furthermore, updates or modifications to rows that do not meet the WHERE condition are excluded from index maintenance, thereby reducing the overhead of maintaining the index and improving performance for write operations.

Example 2:

Let us explore another example. Suppose your application frequently queries orders that exceed a certain total amount. You can create a partial index tailored to this use case,

CREATE INDEX idx_high_value_orders
ON orders (customer_id)
WHERE total_amount > 1000;

This index would optimize queries like the following,

SELECT *
FROM orders
WHERE total_amount > 1000
AND customer_id = 123;

The key advantage here is that the index only includes rows where total_amount > 1000. For datasets with a wide range of order amounts, this can dramatically reduce the number of indexed entries. Queries that filter by high-value orders become faster because the database does not need to sift through irrelevant rows.

Additionally, as with the previous example, index maintenance is limited to the subset of rows matching the condition, improving overall performance for insertions and updates.

Partial indexes are also useful for enforcing constraints in a selective manner. Consider a scenario where you want to ensure that no two active promotions exist for the same product. You can achieve this using a unique partial index,

CREATE UNIQUE INDEX idx_unique_active_promotion
ON promotions (product_id)
WHERE is_active = true;

This index guarantees that only one row with is_active = true can exist for each product_id.
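The constraint can be tried out without a PostgreSQL server. The sketch below uses Python's built-in sqlite3 module purely as a stand-in, on the assumption that SQLite's partial unique indexes behave the same way for this case; the table layout is simplified and is_active is stored as 0/1.

```python
import sqlite3


def demo_unique_active_promotion() -> list:
    """Show which inserts a unique partial index accepts and rejects."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE promotions "
        "(id INTEGER PRIMARY KEY, product_id INTEGER, is_active INTEGER)"
    )
    conn.execute(
        "CREATE UNIQUE INDEX idx_unique_active_promotion "
        "ON promotions (product_id) WHERE is_active = 1"
    )
    events = []
    # Inactive rows are outside the index, so duplicates are allowed
    conn.execute("INSERT INTO promotions (product_id, is_active) VALUES (1, 0)")
    conn.execute("INSERT INTO promotions (product_id, is_active) VALUES (1, 0)")
    events.append("two inactive rows accepted")
    conn.execute("INSERT INTO promotions (product_id, is_active) VALUES (1, 1)")
    events.append("first active row accepted")
    try:
        conn.execute("INSERT INTO promotions (product_id, is_active) VALUES (1, 1)")
    except sqlite3.IntegrityError:
        events.append("second active row rejected")
    conn.close()
    return events
```

Any number of inactive promotions can share a product_id, but only one active row per product gets through.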

In conclusion, partial indexing in PostgreSQL offers a flexible and efficient way to optimize database performance by targeting specific subsets of data.

Parotta Salna
Learning Notes #54 – Architecture Decision Records
14 January 2025 at 02:35

By: Mr.ParottaSalna

For the last few days, I have been learning how to make accountable decisions on technical matters. Then I came across ADRs. So far, I haven’t used them or seen them used by our team. I think this is a necessary step to incorporate in order to make accountable decisions. In this blog, I share details on ADRs for my future reference.

What is an ADR?

An Architectural Decision Record (ADR) is a concise document that captures a single architectural decision, its context, the reasoning behind it, and its consequences. ADRs help teams document, share, and revisit architectural choices, ensuring transparency and better collaboration.

Why Use ADRs?

Documentation: ADRs serve as a historical record of why certain decisions were made.
Collaboration: They promote better understanding across teams.
Traceability: ADRs link architectural decisions to specific project requirements and constraints.
Accountability: They clarify who made a decision and when.
Change Management: ADRs help evaluate the impact of changes and facilitate discussions around reversals or updates.

ADR Structure

A typical ADR document follows a standard format with the following sections:

Title: A clear and concise title describing the decision.
Context: Background information explaining the problem or opportunity.
Decision: A summary of the chosen solution.
Consequences: The positive and negative outcomes of the decision.
Status: Indicates whether the decision is proposed, accepted, superseded, or deprecated.

Example:

Optimistic locking on MongoDB https://docs.google.com/document/d/1olCbicQeQzYpCxB0ejPDtnri9rWb2Qhs9_JZuvANAxM/edit?usp=sharing

Parotta Salna
Learning Notes #53 – The Expiration Time Can Be Unexpectedly Lost While Using Redis SET EX
12 January 2025 at 09:14

By: Mr.ParottaSalna

Redis, a high-performance in-memory key-value store, is widely used for caching, session management, and various other scenarios where fast data retrieval is essential. One of its key features is the ability to set expiration times for keys. However, when using the SET command with the EX option, developers might encounter unexpected behaviors where the expiration time is seemingly lost. Let’s explore this issue in detail.

Understanding `SET` with `EX`

The Redis SET command with the EX option allows you to set a key’s value and specify its expiration time in seconds. For instance,


SET key value EX 60

This command sets the key key to the value value and sets an expiration time of 60 seconds.

The Problem

In certain cases, the expiration time might be unexpectedly lost. This typically happens when subsequent operations overwrite the key without specifying a new expiration. For example,


SET key value1 EX 60
SET key value2

In the above sequence,

The first SET command assigns a value to key and sets an expiration of 60 seconds.
The second SET command overwrites the value of key but does not include an expiration time, resulting in the key persisting indefinitely.

This behavior can lead to subtle bugs, especially in applications that rely on key expiration for correctness or resource management.

Why Does This Happen?

The Redis SET command is designed to replace the entire state of a key, including its expiration. When you use SET without an expiration option (EX, PX, EXAT, or PXAT) and without KEEPTTL (available since Redis 6.0), any existing expiration is removed, and the key becomes persistent. This behavior aligns with the principle that SET is a complete update operation.

When using Redis SET with EX, be mindful of operations that might overwrite keys without reapplying expiration. Understanding Redis’s behavior and implementing robust patterns can save you from unexpected issues, ensuring your application remains efficient and reliable.
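To make the semantics concrete without a running server, here is a toy in-memory model in Python. It is not a Redis client and it never actually expires keys; it only tracks what happens to the expiry on each SET. The keepttl flag models the KEEPTTL option of SET (Redis 6.0 and later), and ttl mirrors the reply convention of the TTL command (-2 for a missing key, -1 for a key with no expiry).

```python
import time


class ToyRedis:
    """Toy in-memory model of SET/TTL semantics; not a Redis client."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at or None)

    def set(self, key, value, ex=None, keepttl=False):
        old = self._data.get(key)
        if ex is not None:
            expires_at = time.monotonic() + ex  # SET ... EX: new expiry
        elif keepttl and old is not None:
            expires_at = old[1]  # SET ... KEEPTTL: retain the existing expiry
        else:
            expires_at = None  # plain SET: any previous expiry is discarded
        self._data[key] = (value, expires_at)

    def ttl(self, key):
        """Return -2 if the key is missing, -1 if it has no expiry, else seconds left."""
        if key not in self._data:
            return -2
        expires_at = self._data[key][1]
        if expires_at is None:
            return -1
        return max(0, round(expires_at - time.monotonic()))
```

In this model, set("key", "value1", ex=60) followed by set("key", "value2") leaves ttl("key") at -1, while passing keepttl=True on the second call keeps the remaining time. Re-sending EX on every write is the other common way to avoid losing the expiry.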

Parotta Salna
Learning Notes #52 – Hybrid Origin Failover Pattern
12 January 2025 at 06:29

By: Mr.ParottaSalna

Today, I learnt about failover patterns from AWS (https://aws.amazon.com/blogs/networking-and-content-delivery/three-advanced-design-patterns-for-high-available-applications-using-amazon-cloudfront/). In this blog, I jot down my understanding of this pattern for future reference.

Hybrid origin failover is a strategy that combines two distinct approaches to handle origin failures effectively, balancing speed and resilience.

The Need for Origin Failover

When an application’s primary origin server becomes unavailable, the ability to reroute traffic to a secondary origin ensures continuity. The failover process determines how quickly and effectively this switch happens. Broadly, there are two approaches to implement origin failover:

Stateful Failover with DNS-based Routing
Stateless Failover with Application Logic

Each has its strengths and limitations, which the hybrid approach aims to mitigate.

Approach 1: Stateful Failover with DNS-based Routing

Stateful failover is a system that allows a standby server to take over for a failed server and continue active sessions. It’s used to create a resilient network infrastructure and avoid service interruptions.

This method relies on a DNS service with health checks to detect when the primary origin is unavailable. Here’s how it works,

Health Checks: The DNS service continuously monitors the health of the primary origin using health checks (e.g., HTTP, HTTPS).
DNS Failover: When the primary origin is marked unhealthy, the DNS service resolves the origin’s domain name to the secondary origin’s IP address.
TTL Impact: The failover process honors the DNS Time-to-Live (TTL) settings. A low TTL ensures faster propagation, but even in the most optimal configurations, this process introduces a delay—often around 60 to 70 seconds.
Stateful Behavior: Once failover occurs, all traffic is routed to the secondary origin until the primary origin is marked healthy again.

Implementation from AWS (as-is from aws blog)

The first approach is using Amazon Route 53 Failover routing policy with health checks on the origin domain name that’s configured as the origin in CloudFront. When the primary origin becomes unhealthy, Route 53 detects it, and then starts resolving the origin domain name with the IP address of the secondary origin. CloudFront honors the origin DNS TTL, which means that traffic will start flowing to the secondary origin within the DNS TTLs. The most optimal configuration (Fast Check activated, a failover threshold of 1, and 60 second DNS TTL) means that the failover will take 70 seconds at minimum to occur. When it does, all of the traffic is switched to the secondary origin, since it’s a stateful failover. Note that this design can be further extended with Route 53 Application Recovery Control for more sophisticated application failover across multiple AWS Regions, Availability Zones, and on-premises.

The second approach is using origin failover, a native feature of CloudFront. This capability of CloudFront tries for the primary origin of every request, and if a configured 4xx or 5xx error is received, then CloudFront attempts a retry with the secondary origin. This approach is simple to configure and provides immediate failover. However, it’s stateless, which means every request must fail independently, thus introducing latency to failed requests. For transient origin issues, this additional latency is an acceptable tradeoff with the speed of failover, but it’s not ideal when the origin is completely out of service. Finally, this approach only works for the GET/HEAD/OPTIONS HTTP methods, because other HTTP methods are not allowed on a CloudFront cache behavior with Origin Failover enabled.

Advantages

Works for all HTTP methods and request types.
Ensures complete switchover, minimizing ongoing failures.

Disadvantages

Relatively slower failover due to DNS propagation time.
Requires a reliable health-check mechanism.

Approach 2: Stateless Failover with Application Logic

This method handles failover at the application level. If a request to the primary origin fails (e.g., due to a 4xx or 5xx HTTP response), the application or CDN immediately retries the request with the secondary origin.

How It Works

Primary Request: The application sends a request to the primary origin.
Failure Handling: If the response indicates a failure (configurable for specific error codes), the request is retried with the secondary origin.
Stateless Behavior: Each request operates independently, so failover happens on a per-request basis without waiting for a stateful switchover.
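The three steps above can be sketched as a small Python function. This is an illustration, not CloudFront's implementation: the fetch callable is injected so no network access is needed, and the set of status codes that trigger a retry is an assumption standing in for the configurable error codes.

```python
from typing import Callable, Tuple

# (origin, path) -> (status_code, body); injected so the sketch needs no network
Fetch = Callable[[str, str], Tuple[int, str]]

RETRY_STATUSES = {500, 502, 503, 504}  # assumed set of statuses that trigger failover


def fetch_with_failover(
    fetch: Fetch, primary: str, secondary: str, path: str
) -> Tuple[str, int, str]:
    """Try the primary origin; on a configured error status, retry once on the secondary."""
    status, body = fetch(primary, path)
    if status in RETRY_STATUSES:
        # Stateless: nothing is remembered, so the next request starts at the primary again
        status, body = fetch(secondary, path)
        return secondary, status, body
    return primary, status, body
```

Because no state is kept between calls, every request pays for one failed attempt for as long as the primary stays down.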

Implementation from AWS (as-is from aws blog)

The hybrid origin failover pattern combines both approaches to get the best of both worlds. First, you configure both of your origins with a Failover Policy in Route 53 behind a single origin domain name. Then, you configure an origin failover group with the single origin domain name as primary origin, and the secondary origin domain name as secondary origin. This means that when the primary origin becomes unavailable, requests are immediately retried with the secondary origin until the stateful failover of Route 53 kicks in within tens of seconds, after which requests go directly to the secondary origin without any latency penalty. Note that this pattern only works with the GET/HEAD/OPTIONS HTTP methods.

Advantages

Near-instantaneous failover for failed requests.
Simple to configure and doesn’t depend on DNS TTL.

Disadvantages

Adds latency for failed requests due to retries.
Limited to specific HTTP methods like GET, HEAD, and OPTIONS.
Not suitable for scenarios where the primary origin is entirely down, as every request must fail first.

The Hybrid Origin Failover Pattern

The hybrid origin failover pattern combines the strengths of both approaches, mitigating their individual limitations. Here’s how it works:

DNS-based Stateful Failover: A DNS service with health checks monitors the primary origin and switches to the secondary origin if the primary becomes unhealthy. This ensures a complete and stateful failover within tens of seconds.
Application-level Stateless Failover: Simultaneously, the application or CDN is configured to retry failed requests with a secondary origin. This provides an immediate failover mechanism for transient or initial failures.

Implementation Steps

DNS Configuration
- Set up health checks on the primary origin.
- Define a failover policy in the DNS service, which resolves the origin domain name to the secondary origin when the primary is unhealthy.
Application Configuration
- Configure the application or CDN to use an origin failover group.
- Specify the primary origin domain as the primary origin and the secondary origin domain as the backup.

Behavior

Initially, if the primary origin encounters issues, requests are retried immediately with the secondary origin.
Meanwhile, the DNS failover switches all traffic to the secondary origin within tens of seconds, eliminating retry latencies for subsequent requests.
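The behavior described above can be modelled with a toy Python class. It is a simplification under stated assumptions: real DNS health checks run independently of request traffic and are subject to TTLs, whereas here a counter of failed requests (unhealthy_after) stands in for the health check, and the fetch callable is injected instead of making HTTP calls.

```python
class HybridFailover:
    """Toy model: per-request retry plus a DNS-like switch after repeated failures."""

    def __init__(self, fetch, primary, secondary, unhealthy_after=3):
        self.fetch = fetch  # callable (origin, path) -> HTTP status code
        self.primary = primary
        self.secondary = secondary
        self.unhealthy_after = unhealthy_after  # assumed health-check threshold
        self.failures = 0
        self.resolved = primary  # origin the "DNS name" currently points to

    def request(self, path):
        """Return (origin_that_served, status, retried)."""
        status = self.fetch(self.resolved, path)
        if status < 500:
            return self.resolved, status, False
        # Stateless part: retry this request on the secondary right away
        self.failures += 1
        if self.failures >= self.unhealthy_after:
            # Stateful part: later requests go straight to the secondary
            self.resolved = self.secondary
        return self.secondary, self.fetch(self.secondary, path), True
```

With a primary that always returns 503 and unhealthy_after=2, the first two requests are served by the secondary after a retry, and the following requests reach the secondary directly without the retry penalty.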

Benefits of Hybrid Origin Failover

Faster Failover: Immediate retries for failed requests minimize initial impact, while DNS failover ensures long-term stability.
Reduced Latency: After DNS failover, subsequent requests don’t experience retry delays.
High Resilience: Combines stateful and stateless failover for robust redundancy.
Simplicity and Scalability: Leverages existing DNS and application/CDN features without complex configurations.

Limitations and Considerations

HTTP Method Constraints: Stateless failover works only for GET, HEAD, and OPTIONS methods, limiting its use for POST or PUT requests.
TTL Impact: Low TTLs reduce propagation delays but increase DNS query rates, which could lead to higher costs.
Configuration Complexity: Combining DNS and application-level failover requires careful setup and testing to avoid misconfigurations.
Secondary Origin Capacity: Ensure the secondary origin can handle full traffic loads during failover.

Parotta Salna
Learning Notes #51 – Postgres as a Queue using SKIP LOCKED
11 January 2025 at 06:56

By: Mr.ParottaSalna

Yesterday, I came across a blog from inferable.ai (https://www.inferable.ai/blog/posts/postgres-skip-locked) that walks through using Postgres as a queue. In this blog, I jot down notes on using Postgres as a queue for future reference.

PostgreSQL is a robust relational database that can be used for more than just storing structured data. With the SKIP LOCKED feature introduced in PostgreSQL 9.5, you can efficiently turn a PostgreSQL table into a job queue for distributed processing.

Why Use PostgreSQL as a Queue?

Using PostgreSQL as a queue can be advantageous because,

Familiarity: If you’re already using PostgreSQL, there’s no need for an additional message broker.
Durability: PostgreSQL ensures ACID compliance, offering reliability for your job processing.
Simplicity: No need to manage another component like RabbitMQ or Kafka.

Implementing a Queue with SKIP LOCKED

1. Create a Queue Table

To start, you need a table to store the jobs,


CREATE TABLE job_queue (
    id SERIAL PRIMARY KEY,
    job_data JSONB NOT NULL,
    status TEXT DEFAULT 'pending',
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

This table has the following columns,

id: A unique identifier for each job.
job_data: The data or payload for the job.
status: Tracks the job’s state (‘pending’, ‘in_progress’, or ‘completed’).
created_at: Timestamp of job creation.

2. Insert Jobs into the Queue

Adding jobs is straightforward,


INSERT INTO job_queue (job_data)
VALUES ('{"task": "send_email", "email": "user@example.com"}');

3. Fetch Jobs for Processing with SKIP LOCKED

Workers will fetch jobs from the queue using SELECT ... FOR UPDATE SKIP LOCKED to avoid contention,

WITH next_job AS (
    SELECT id, job_data
    FROM job_queue
    WHERE status = 'pending'
    FOR UPDATE SKIP LOCKED
    LIMIT 1
)
UPDATE job_queue
SET status = 'in_progress'
FROM next_job
WHERE job_queue.id = next_job.id
RETURNING job_queue.id, job_queue.job_data;

Key Points:

FOR UPDATE locks the selected row to prevent other workers from picking it up.
SKIP LOCKED ensures locked rows are skipped, enabling concurrent workers to operate without waiting.
LIMIT 1 processes one job at a time per worker.
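
To make the key points above concrete, here is a toy in-memory model of the idea, not how PostgreSQL implements it. A `threading.Lock` per row stands in for the row-level lock, and the names `Row` and `claim_next` are made up for this sketch.

```python
import threading


class Row:
    """A queue row; the lock stands in for PostgreSQL's row-level lock."""

    def __init__(self, job_id, payload):
        self.id = job_id
        self.payload = payload
        self.status = "pending"
        self.lock = threading.Lock()


def claim_next(rows):
    """Claim the first pending row whose lock is free.

    Rows locked by someone else are skipped instead of waited on,
    which mirrors FOR UPDATE SKIP LOCKED with LIMIT 1.
    The caller keeps the lock until it "commits" by releasing it.
    """
    for row in rows:
        if row.status != "pending":
            continue
        if row.lock.acquire(blocking=False):  # never wait for a busy row
            row.status = "in_progress"
            return row
    return None


rows = [Row(1, "send_email"), Row(2, "resize_image"), Row(3, "build_report")]
rows[0].lock.acquire()      # another worker already holds row 1
job = claim_next(rows)      # row 1 is skipped, not waited on
print(job.id, job.status)
```

A second call to `claim_next(rows)` in this model would hand out row 3, and a third would return `None` because row 1 is still locked by the other worker.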

4. Mark Jobs as Completed

Once a worker finishes processing a job, it should update the job’s status,


UPDATE job_queue
SET status = 'completed'
WHERE id = $1; -- Replace $1 with the job ID

5. Delete Old or Processed Jobs

To keep the table clean, you can periodically remove completed jobs,


DELETE FROM job_queue
WHERE status = 'completed' AND created_at < NOW() - INTERVAL '30 days';

Example Worker Implementation

Here’s an example of a worker implemented in Python using psycopg2


import time  # needed for time.sleep() below

import psycopg2
from psycopg2.extras import RealDictCursor

connection = psycopg2.connect("dbname=yourdb user=youruser")

while True:
    with connection.cursor(cursor_factory=RealDictCursor) as cursor:
        cursor.execute(
            """
            WITH next_job AS (
                SELECT id, job_data
                FROM job_queue
                WHERE status = 'pending'
                FOR UPDATE SKIP LOCKED
                LIMIT 1
            )
            UPDATE job_queue
            SET status = 'in_progress'
            FROM next_job
            WHERE job_queue.id = next_job.id
            RETURNING job_queue.id, job_queue.job_data;
            """
        )

        job = cursor.fetchone()
        if job:
            print(f"Processing job {job['id']}: {job['job_data']}")

            # Simulate job processing
            cursor.execute("UPDATE job_queue SET status = 'completed' WHERE id = %s", (job['id'],))

        else:
            print("No jobs available. Sleeping...")
            time.sleep(5)

    connection.commit()

Considerations

Transaction Isolation: The default READ COMMITTED level works well for this pattern. Use REPEATABLE READ or SERIALIZABLE cautiously, since workers may then hit serialization failures that they have to retry.
Row Locking: SKIP LOCKED only skips rows locked by other transactions, not those locked within the same transaction.
Performance: Regularly archive or delete old jobs to prevent the table from growing indefinitely. Consider indexing the status column to improve query performance.
Fault Tolerance: Ensure that workers handle crashes or timeouts gracefully. Use a timeout mechanism to revert jobs stuck in the ‘in_progress’ state.
Scaling: Distribute workers across multiple nodes to handle a higher job throughput.
The SKIP LOCKED clause only applies to row-level locks – the required ROW SHARE table-level lock is still taken normally.
Using SKIP LOCKED provides an inconsistent view of the data by design. This is why it’s perfect for queue-like tables where we want to distribute work, but not suitable for general purpose work where consistency is required.
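
The timeout mechanism mentioned under Fault Tolerance could look roughly like the sketch below. It works on plain dictionaries rather than a live database, and it assumes a `started_at` field that the job_queue table above does not have; the function name `requeue_stale` and the 5 minute timeout are also just assumptions for illustration.

```python
from datetime import datetime, timedelta


def requeue_stale(jobs, now, timeout=timedelta(minutes=5)):
    """Put jobs back to 'pending' if they were in progress for too long.

    `started_at` is a hypothetical field recording when a worker
    claimed the job. Returns the ids of the jobs that were requeued.
    """
    requeued = []
    for job in jobs:
        stuck = (
            job["status"] == "in_progress"
            and job["started_at"] is not None
            and now - job["started_at"] > timeout
        )
        if stuck:
            job["status"] = "pending"
            job["started_at"] = None
            requeued.append(job["id"])
    return requeued


now = datetime(2025, 1, 11, 12, 0, 0)
jobs = [
    {"id": 1, "status": "in_progress", "started_at": now - timedelta(minutes=30)},
    {"id": 2, "status": "in_progress", "started_at": now - timedelta(minutes=1)},
    {"id": 3, "status": "completed", "started_at": now - timedelta(hours=2)},
]
print(requeue_stale(jobs, now))
```

In a real table the same rule would be a periodic UPDATE, and the job handlers would need to tolerate running twice, since a slow worker may still finish a job that was requeued.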

Parotta Salna
Learning Notes #50 – Fixed Partition Pattern | Distributed Pattern
9 January 2025 at 16:51

By: Mr.ParottaSalna

Today, I learnt about the fixed partition pattern, which balances data among servers without moving large amounts of data. In this blog, I jot down notes on how fixed partitions help solve the problem.

This entire blog is inspired by https://www.linkedin.com/pulse/distributed-systems-design-pattern-fixed-partitions-retail-kumar-v-c34pc/?trackingId=DMovSwEZSfCzKZEKa7yJrg%3D%3D

Problem Statement

In a distributed key-value store system, data items need to be mapped to a set of cluster nodes to ensure efficient storage and retrieval. The system must satisfy the following requirements,

Uniform Distribution: Data should be evenly distributed across all cluster nodes to avoid overloading any single node.
Deterministic Mapping: Given a data item, the specific node responsible for storing it should be determinable without querying all the nodes in the cluster.

A common approach to achieve these goals is to use hashing with a modulo operation. For example, if there are three nodes in the cluster, the key is hashed, and the hash value modulo the number of nodes determines the node to store the data. However, this method has a critical drawback,

Rebalancing Issue: When the cluster size changes (e.g., nodes are added or removed), the mapping for most keys changes. This requires the system to move almost all the data to new nodes, leading to significant overhead in terms of time and resources, especially when dealing with large data volumes.
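
A quick way to see the rebalancing issue is to count how many keys change nodes when a cluster grows from 3 to 4 nodes under hash-modulo mapping. This is a small sketch; the key names and the choice of MD5 as a stable hash are assumptions for the demo only.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Hash that gives the same value on every run (unlike built-in hash())."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


def node_for(key: str, node_count: int) -> int:
    """Naive mapping: hash the key, then take it modulo the cluster size."""
    return stable_hash(key) % node_count


keys = [f"account-{i}" for i in range(10_000)]

# A key has to move whenever its node under 3 nodes differs from its node under 4.
moved = sum(1 for key in keys if node_for(key, 3) != node_for(key, 4))
fraction_moved = moved / len(keys)

print(f"{fraction_moved:.0%} of keys change nodes when going from 3 to 4 nodes")
```

For a uniform hash, a key keeps its node only when the hash gives the same remainder modulo 3 and modulo 4, which happens for about one key in four. So roughly three quarters of the data would have to move for a single added node.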

Challenge: How can we design a mapping mechanism that minimizes data movement during cluster size changes while maintaining uniform distribution and deterministic mapping?

Solution

There is a concept of Fixed Partitioning,

What Is Fixed Partitioning?

This pattern organizes data into a predefined number of fixed partitions that remain constant over time. Data is assigned to these partitions using a hashing algorithm, ensuring that the mapping of data to partitions is permanent. The system separates the fixed partitioning of data from the physical servers managing these partitions, enabling seamless scaling.

Key Features of Fixed Partitioning

Fixed Number of Partitions
- The number of partitions is determined during system initialization (e.g., 8 partitions).
- Data is assigned to these partitions based on a hash of the key (for example, the hash value modulo the fixed number of partitions).
Stable Data Mapping
- Each piece of data is permanently mapped to a specific partition.
- This eliminates the need for large-scale data reshuffling when scaling the system.
Adjustable Partition-to-Server Mapping
- Partitions can be reassigned to different servers as the system scales.
- Only the physical location of the partitions changes; the fixed mapping remains intact.
Balanced Load Distribution
- Partitions are distributed evenly across servers to balance the workload.
- Adding new servers involves reassigning partitions without moving or reorganizing data within the partitions.
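
The features above boil down to two lookups: key to partition (fixed forever) and partition to server (adjustable). Here is a minimal sketch of that separation; the server names, the MD5-based hash, and the helper names are assumptions for illustration.

```python
import hashlib

NUM_PARTITIONS = 8  # fixed at system initialization, never changes


def partition_for(key: str) -> int:
    """Key -> partition. Depends only on the key and the fixed partition count."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % NUM_PARTITIONS


def server_for(key: str, assignment: dict) -> str:
    """Key -> partition -> server, using the current assignment table."""
    return assignment[partition_for(key)]


def reassign(assignment: dict, new_server: str, partitions: list) -> dict:
    """Return a new assignment with the given partitions moved to new_server."""
    updated = dict(assignment)
    for partition in partitions:
        updated[partition] = new_server
    return updated


# Partition -> server table for a 4-server cluster.
before = {0: "server-1", 1: "server-1", 2: "server-2", 3: "server-2",
          4: "server-3", 5: "server-3", 6: "server-4", 7: "server-4"}

# Scaling out: hand two partitions to the new server.
after = reassign(before, "server-5", [5, 7])

moved_partitions = [p for p in before if before[p] != after[p]]
print(moved_partitions)
```

Only the partitions listed in `moved_partitions` need to be copied. Every key still hashes to the same partition as before, so nothing inside a partition is reshuffled.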

Naive Example

We have a banking system with transactions stored in 8 fixed partitions, distributed based on a customer’s account ID.


CREATE TABLE transactions (
    id SERIAL,
    account_id INT NOT NULL,
    transaction_amount NUMERIC(10, 2) NOT NULL,
    transaction_date DATE NOT NULL,
    -- A primary key on a partitioned table must include the partition key
    PRIMARY KEY (id, account_id)
) PARTITION BY HASH (account_id);

1. Create Partition


DO $$
BEGIN
    FOR i IN 0..7 LOOP
        EXECUTE format(
            'CREATE TABLE transactions_p%s PARTITION OF transactions FOR VALUES WITH (modulus 8, remainder %s);',
            i, i
        );
    END LOOP;
END $$;

This creates 8 partitions (transactions_p0 to transactions_p7) based on the hash remainder of account_id modulo 8.

2. Inserting Data

When inserting data into the transactions table, PostgreSQL automatically places it into the correct partition based on the account_id.


INSERT INTO transactions (account_id, transaction_amount, transaction_date)
VALUES (12345, 500.00, '2025-01-01');

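PostgreSQL hashes the value 12345 with its internal hash function and uses the remainder modulo 8 to pick the target partition (e.g., transactions_p5). Note that this is not the same as computing 12345 % 8 directly.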

3. Querying Data

Querying the base table works transparently across all partitions


SELECT * FROM transactions WHERE account_id = 12345;

PostgreSQL automatically routes the query to the correct partition.

4. Scaling by Adding Servers

Initial Setup:

Suppose we have 4 servers managing the partitions,

Server 1: transactions_p0, transactions_p1
Server 2: transactions_p2, transactions_p3
Server 3: transactions_p4, transactions_p5
Server 4: transactions_p6, transactions_p7

Adding a New Server:

When a 5th server is added, we reassign only as many partitions as needed to give it a share of the load,

Server 1: transactions_p0, transactions_p1
Server 2: transactions_p2, transactions_p3
Server 3: transactions_p4
Server 4: transactions_p6
Server 5: transactions_p5, transactions_p7

Partition Migration

During the migration, transactions_p5 is copied from Server 3 to Server 5.
Once the migration is complete, Server 5 becomes responsible for transactions_p5.

Benefits:

Minimal Data Movement – When scaling, only the partitions being reassigned are copied to new servers. Data within partitions remains stable.
Optimized Performance – Queries are routed directly to the relevant partition, minimizing scan times.
Scalability – Adding servers is straightforward, as it involves reassigning partitions, not reorganizing data.

What happens when a new server is added? Don’t we need to copy the data?

When a partition is moved to a new server (e.g., partition_b from server_A to server_B), the data in the partition must be copied to the new server. However,

The copying is limited to the partition being reassigned.
No data within the partition is reorganized.
Once the partition is fully migrated, the original copy is typically deleted.

For example, in PostgreSQL,

Export the Partition: pg_dump -t partition_b -h server_A -U postgres -d mydb > partition_b.sql
Import on New Server: psql -h server_B -U postgres -d mydb < partition_b.sql

Parotta Salna
Learning Notes #49 – Pitfall of Implicit Default Values in APIs
9 January 2025 at 14:00

By: Mr.ParottaSalna

Today, we faced a bug in our workflow caused by an implicit default value in a third-party API. In this blog, I share the experience for future reference.

Understanding the Problem

Consider an API where some fields are optional, and a default value is used when those fields are not provided by the client. This design is common and seemingly harmless. However, problems arise when,

Unexpected Categorization: The default value influences logic, such as category assignment, in ways the client did not intend.
Implicit Assumptions: The API assumes a default value aligns with the client’s intention, leading to misclassification or incorrect behavior.
Debugging Challenges: When issues occur, clients and developers spend significant time tracing the problem because the default behavior is not transparent.

Here’s an example of how this might manifest,


POST /items
{
  "name": "Sample Item",
  "category": "premium"
}

If the category field is optional and a default value of "basic" is applied when it’s omitted, the following request,


POST /items
{
  "name": "Another Item"
}

might incorrectly classify the item as basic, even if the client intended it to be uncategorized.

Why This is a Code Smell

Implicit default handling for optional fields often signals poor design. Let’s break down why,

Violation of the Principle of Least Astonishment: Clients may be unaware of default behavior, leading to unexpected outcomes.
Hidden Logic: The business logic embedded in defaults is not explicit in the API’s contract, reducing transparency.
Coupling Between API and Business Logic: When defaults dictate core behavior, the API becomes tightly coupled to specific business rules, making it harder to adapt or extend.
Inconsistent Behavior: If the default logic changes in future versions, existing clients may experience breaking changes.

Best Practices to Avoid the Trap

Make Default Behavior Explicit
- Clearly document default values in the API specification (but we still missed it.)
- For example, use OpenAPI/Swagger to define optional fields and their default values explicitly
Avoid Implicit Defaults
- Instead of applying defaults server-side, require the client to explicitly provide values, even if they are defaults.
- This ensures the client is fully aware of the data being sent and its implications.
Use Null or Explicit Indicators
- Allow optional fields to be explicitly null or undefined, and handle these cases appropriately.
- In this case, the API can handle null as “no category specified” rather than applying a default.
Fail Fast with Validation
- Use strict validation to reject ambiguous requests, encouraging clients to provide clear inputs.


{
  "error": "Field 'category' must be provided explicitly."
}

Version Your API Thoughtfully
- Document changes and provide clear migration paths for clients.
- If you must change default behaviors, ensure backward compatibility through versioning.
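
The "explicit null" and "fail fast" practices above can be sketched as a small request parser. This is only an illustration for the /items example; the function name `parse_item` and the exact error messages are assumptions, not the third-party API's real behavior.

```python
_MISSING = object()  # sentinel to tell "field absent" apart from "field is null"


def parse_item(payload: dict) -> dict:
    """Validate an /items request body without applying hidden defaults.

    - A missing 'category' is rejected (fail fast).
    - An explicit null (None) is accepted and means "no category specified".
    """
    if not payload.get("name"):
        raise ValueError("Field 'name' must be provided explicitly.")

    category = payload.get("category", _MISSING)
    if category is _MISSING:
        raise ValueError("Field 'category' must be provided explicitly.")

    return {"name": payload["name"], "category": category}


print(parse_item({"name": "Sample Item", "category": "premium"}))
print(parse_item({"name": "Another Item", "category": None}))

try:
    parse_item({"name": "Another Item"})
except ValueError as error:
    print({"error": str(error)})
```

The sentinel is the important part of the design: with a plain `payload.get("category")`, a missing field and an explicit null would look identical, and the server would be back to guessing the client's intent.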

Implicit default values for optional fields can lead to unintended consequences, obscure logic, and hard-to-debug issues. Recognizing this pattern as a code smell is the first step to building more robust APIs. By adopting explicitness, transparency, and rigorous validation, you can create APIs that are easier to use, understand, and maintain.

Parotta Salna
Learning Notes #48 – Common Pitfalls in Event Driven Architecture
8 January 2025 at 15:04

By: Mr.ParottaSalna

Today, I came across Raul Junco’s post on mistakes in Event Driven Architecture – https://www.linkedin.com/posts/raul-junco_after-years-building-event-driven-systems-activity-7278770394046631936-zu3-?utm_source=share&utm_medium=member_desktop. In this blog, I highlight the same points for future reference.

Event-driven architectures are awesome, but they come with their own set of challenges. Missteps can lead to unreliable systems, inconsistent data, and frustrated users. Let’s explore some of the most common pitfalls and how to address them effectively.

1. Duplication

Idempotent APIs – https://parottasalna.com/2025/01/08/learning-notes-47-idempotent-post-requests/

Events often get re-delivered due to retries or system failures. Without proper handling, duplicate events can,

Charge a customer twice for the same transaction: Imagine a scenario where a payment service retries a payment event after a temporary network glitch, resulting in a duplicate charge.
Cause duplicate inventory updates: For example, an e-commerce platform might update stock levels twice for a single order, leading to overestimating available stock.
Create inconsistent or broken system states: Duplicates can cascade through downstream systems, introducing mismatched or erroneous data.

Solution:

Assign unique IDs: Ensure every event has a globally unique identifier. Consumers can use these IDs to detect and discard duplicates.
Design idempotent processing: Structure your operations so they produce the same outcome even when executed multiple times. For instance, an API updating inventory could always set stock levels to a specific value rather than incrementing or decrementing.
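
Both ideas from the solution can be combined in a consumer, sketched below. The event fields (`event_id`, `sku`, `stock_level`) and the class name are assumptions, and the set of seen ids lives in memory here; a real consumer would keep it in durable storage.

```python
class InventoryConsumer:
    """Consumer that ignores re-delivered events and applies idempotent updates."""

    def __init__(self):
        self.seen_event_ids = set()  # in-memory only for this sketch
        self.stock = {}

    def handle(self, event: dict) -> bool:
        """Apply the event once. Returns False if it was a duplicate."""
        if event["event_id"] in self.seen_event_ids:
            return False  # duplicate delivery, discard

        # Idempotent write: set the absolute level instead of incrementing.
        self.stock[event["sku"]] = event["stock_level"]
        self.seen_event_ids.add(event["event_id"])
        return True


consumer = InventoryConsumer()
event = {"event_id": "evt-001", "sku": "book-42", "stock_level": 7}

print(consumer.handle(event))  # first delivery is applied
print(consumer.handle(event))  # retry of the same event is discarded
print(consumer.stock)
```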

2. Not Guaranteeing Order

Events can arrive out of order when distributed across partitions or queues. This can lead to,

Processing a refund before the payment: If a refund event is processed before the corresponding payment event, the system might show a negative balance or fail to reconcile properly.
Breaking logic that relies on correct sequence: Certain workflows, such as assembling logs or transactional data, depend on a strict event order to function correctly.

Solution

Use brokers with ordering guarantees: Message brokers like Apache Kafka support partition-level ordering. Design your topics and partitions to align with entities requiring ordered processing (e.g., user or account ID).
Add sequence numbers or timestamps: Include metadata in events to indicate their position in a sequence. Consumers can use this data to reorder events if necessary, ensuring logical consistency.
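
The sequence number approach can be sketched as a small buffer that holds back events until the gap before them is filled. The `seq` field and the class name `SequenceBuffer` are assumptions for this example, which handles a single entity such as one account.

```python
class SequenceBuffer:
    """Releases events strictly in sequence order, buffering early arrivals."""

    def __init__(self, first_seq: int = 1):
        self.next_seq = first_seq
        self.pending = {}  # seq -> event that arrived too early

    def accept(self, event: dict) -> list:
        """Store the event and return every event that is now ready, in order."""
        self.pending[event["seq"]] = event
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready


buffer = SequenceBuffer()

# The refund (seq 2) arrives before the payment (seq 1).
print(buffer.accept({"seq": 2, "type": "refund"}))   # held back
print(buffer.accept({"seq": 1, "type": "payment"}))  # releases payment, then refund
```

A real consumer would also need a policy for gaps that never fill, for example a timeout after which the missing event is requested again or skipped.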

3. The Dual Write Problem

Outbox Pattern: https://parottasalna.com/2025/01/03/learning-notes-31-outbox-pattern-cloud-pattern/

When writing to a database and publishing an event, one might succeed while the other fails. This can,

Lose events: If the event is not published after the database write, downstream systems might remain unaware of critical changes, such as a new order or a status update.
Cause mismatched states: For instance, a transaction might be logged in a database but not propagated to analytical or monitoring systems, creating inconsistencies.

Solution

Use the Transactional Outbox Pattern: In this pattern, events are written to an “outbox” table within the same database transaction as the main data write. A separate process then reads from the outbox and publishes events reliably.
Adopt Change Data Capture (CDC) tools: CDC tools like Debezium can monitor database changes and publish them as events automatically, ensuring no changes are missed.
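
Here is a minimal sketch of the Transactional Outbox Pattern using an in-memory SQLite database, so it runs without any setup. The table names, the `order_placed` event, and the `publish` callback are assumptions for illustration; a production setup would use the service's own database and a real broker client.

```python
import json
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the business table and the outbox."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "published INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def place_order(conn, order_id: int, item: str) -> None:
    """Write the order and its event in ONE transaction: both or neither."""
    event = {"type": "order_placed", "order_id": order_id}
    with conn:  # commits on success, rolls back if any statement fails
        conn.execute("INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item))
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))


def relay_outbox(conn, publish) -> int:
    """Separate process: publish unpublished events, then mark them as done."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


conn = open_db()
place_order(conn, 1, "book")
sent = []
print(relay_outbox(conn, sent.append), sent)
```

Note that the relay publishes before it marks the row, so a crash in between leads to the event being published again. That is why this pattern is usually paired with the duplicate handling from pitfall 1.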

4. Non-Backward-Compatible Changes

Changing event schemas without considering existing consumers can break systems. For example:

Removing a field: A consumer relying on this field might encounter null values or fail altogether.
Renaming or changing field types: This can lead to deserialization errors or misinterpretation of data.

Solution:

Maintain versioned schemas: Introduce new schema versions incrementally and ensure consumers can continue using older versions during the transition.
Use schema evolution-friendly formats: Formats like Avro or Protobuf natively support schema evolution, allowing you to add fields or make other non-breaking changes easily.
Add adapters for compatibility: Build adapters or translators that transform events from new schemas to older formats, ensuring backward compatibility for legacy systems.
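
As a sketch of the adapter idea, suppose a hypothetical version 2 of an order event replaced the `amount` field with `amount_cents` and added `currency`. Both schemas and the function name are invented for this example. An adapter can translate new events back for consumers that still expect version 1.

```python
def v2_to_v1(event_v2: dict) -> dict:
    """Translate a (hypothetical) v2 order event into the older v1 shape.

    v1: {"schema_version": 1, "order_id", "amount"}            amount in currency units
    v2: {"schema_version": 2, "order_id", "amount_cents", "currency"}
    """
    if event_v2.get("schema_version") != 2:
        raise ValueError("Expected a schema_version 2 event.")
    return {
        "schema_version": 1,
        "order_id": event_v2["order_id"],
        "amount": event_v2["amount_cents"] / 100,  # v1 consumers expect units, not cents
    }


new_event = {
    "schema_version": 2,
    "order_id": 7,
    "amount_cents": 1250,
    "currency": "USD",
}
print(v2_to_v1(new_event))
```

The `currency` field is simply dropped here, which is only safe if version 1 consumers always assumed a single currency. That kind of assumption is worth writing down next to the adapter.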

Parotta Salna
Learning Notes #41 – Shared Lock and Exclusive Locks | Postgres
6 January 2025 at 14:07

By: Mr.ParottaSalna

Today, I learnt about various locking mechanisms that prevent double updates. In this blog, I make notes on Shared Locks and Exclusive Locks for my future self.

What Are Locks in Databases?

Locks are mechanisms used by a DBMS to control access to data. They ensure that transactions are executed in a way that maintains the ACID (Atomicity, Consistency, Isolation, Durability) properties of the database. Locks can be classified into several types, including

Shared Locks (S Locks): Allow multiple transactions to read a resource simultaneously but prevent any transaction from writing to it.
Exclusive Locks (X Locks): Allow a single transaction to modify a resource, preventing both reading and writing by other transactions.
Intent Locks: Used to signal the type of lock a transaction intends to acquire at a lower level.
Deadlock Prevention Locks: Special locks aimed at preventing deadlock scenarios.

Shared Lock

A shared lock is used when a transaction needs to read a resource (e.g., a database row or table) without altering it. Multiple transactions can acquire a shared lock on the same resource simultaneously. However, as long as one or more shared locks exist on a resource, no transaction can acquire an exclusive lock on that resource.


-- Transaction A: Acquire a shared lock on a row
BEGIN;
SELECT * FROM employees WHERE id = 1 FOR SHARE;
-- Transaction B: Acquire a shared lock on the same row
BEGIN;
SELECT * FROM employees WHERE id = 1 FOR SHARE;
-- Both transactions can read the row concurrently
-- Transaction C: Attempt to update the same row
BEGIN;
UPDATE employees SET salary = salary + 1000 WHERE id = 1;
-- Transaction C will be blocked until Transactions A and B release their locks

Key Characteristics of Shared Locks

1. Concurrent Reads

Shared locks allow multiple transactions to read the same resource at the same time.
This is ideal for operations like SELECT queries that do not modify data.

2. Write Blocking

While a shared lock is active, no transaction can modify the locked resource.
Prevents dirty writes and ensures read consistency.

3. Compatibility

Shared locks are compatible with other shared locks but not with exclusive locks.
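
The compatibility rule can be written down as a tiny lookup table. This is a simplified two-mode model ("S" for shared, "X" for exclusive) for illustration; real databases have more lock modes than this.

```python
# (held mode, requested mode) -> can both exist on the same resource?
COMPATIBLE = {
    ("S", "S"): True,   # many readers can share
    ("S", "X"): False,  # a writer must wait for readers
    ("X", "S"): False,  # a locking reader must wait for the writer
    ("X", "X"): False,  # only one writer at a time
}


def can_grant(requested: str, held_modes: list) -> bool:
    """A lock is granted only if it is compatible with every lock already held."""
    return all(COMPATIBLE[(held, requested)] for held in held_modes)


print(can_grant("S", ["S", "S"]))  # another reader joins
print(can_grant("X", ["S"]))       # writer is blocked by a reader
print(can_grant("X", []))          # nothing held, writer proceeds
```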

When Are Shared Locks Used?

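Shared locks are typically employed in read operations under certain isolation levels in lock-based DBMSs. PostgreSQL instead serves plain reads from MVCC snapshots and takes shared row locks only when asked to, for example with SELECT ... FOR SHARE. For instance,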

1. Read Committed Isolation Level:

Shared locks are held for the duration of the read operation.
Prevents dirty reads by ensuring the data being read is not modified by other transactions during the read.

2. Repeatable Read Isolation Level:

Shared locks are held until the transaction completes.
Ensures that the data read during a transaction remains consistent and unmodified.

3. Snapshot Isolation:

Shared locks may not be explicitly used, as the DBMS creates a consistent snapshot of the data for the transaction.

Exclusive Locks

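An exclusive lock is used when a transaction needs to modify a resource. Only one transaction can hold an exclusive lock on a resource at a time, ensuring no other transaction can modify the resource or acquire a conflicting lock on it. In PostgreSQL, plain SELECT statements are not blocked by row-level exclusive locks; thanks to MVCC they read the last committed version of the row.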


-- Transaction X: Acquire an exclusive lock to update a row
BEGIN;
UPDATE employees SET salary = salary + 1000 WHERE id = 2;
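-- Transaction Y: Attempt a locking read of the same row
BEGIN;
SELECT * FROM employees WHERE id = 2 FOR SHARE;
-- Transaction Y will be blocked until Transaction X completes
-- (in PostgreSQL, a plain SELECT without FOR SHARE would not block;
--  it would return the last committed version of the row)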
-- Transaction Z: Attempt to update the same row
BEGIN;
UPDATE employees SET salary = salary + 500 WHERE id = 2;
-- Transaction Z will also be blocked until Transaction X completes

Key Characteristics of Exclusive Locks

1. Write Operations: Exclusive locks are essential for operations like INSERT, UPDATE, and DELETE.

2. Blocking Reads and Writes: While an exclusive lock is active, no other transaction can write to the resource or perform a locking read (such as SELECT ... FOR SHARE or FOR UPDATE) on it.

3. Isolation: Ensures that changes made by one transaction are not visible to others until the transaction is complete.

When Are Exclusive Locks Used?

Exclusive locks are typically employed in write operations or any operation that modifies the database. For instance:

1. Transactional Updates – A transaction that updates a row acquires an exclusive lock to ensure no other transaction can access or modify the row during the update.

2. Table Modifications – When altering a table structure, the DBMS may place an exclusive lock on the entire table.

Benefits of Shared and Exclusive Locks

Benefits of Shared Locks

Consistency in Multi-User Environments – Ensure that data being read is not altered by other transactions, preserving consistency.
Concurrency Support – Allow multiple transactions to read data simultaneously, improving system performance.
Data Integrity – Prevent dirty reads and writes, ensuring that operations yield reliable results.

Benefits of Exclusive Locks

Data Integrity During Modifications – Prevents other transactions from accessing data being modified, ensuring changes are applied safely.
Isolation of Transactions – Ensures that modifications by one transaction are not visible to others until committed.

Limitations and Challenges

Shared Locks

Potential for Deadlocks – Deadlocks can occur if two transactions simultaneously hold shared locks and attempt to upgrade to exclusive locks.
Blocking Writes – Shared locks can delay write operations, potentially impacting performance in write-heavy systems.
Lock Escalation – In systems with high concurrency, shared locks may escalate to table-level locks, reducing granularity and concurrency.

Exclusive Locks

Reduced Concurrency – Exclusive locks prevent other transactions from accessing the locked resource, which can lead to bottlenecks in highly concurrent systems.
Risk of Deadlocks – Deadlocks can occur if two transactions attempt to acquire exclusive locks on resources held by each other.
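
The deadlock risk described above is a cycle in the "who waits for whom" relationship. PostgreSQL detects such cycles on its own and aborts one of the transactions; the sketch below only illustrates the idea with a simplified wait-for graph in which each transaction waits on at most one other. The function name and transaction labels are made up.

```python
def find_deadlock(waits_for: dict):
    """Return the transactions forming a cycle, or None if there is no deadlock.

    waits_for maps a transaction to the single transaction it is waiting on.
    """
    for start in waits_for:
        chain = [start]
        current = waits_for.get(start)
        # Follow the chain of waits until it ends or repeats.
        while current is not None and current not in chain:
            chain.append(current)
            current = waits_for.get(current)
        if current == start:
            return chain  # we came back to where we started: a cycle
    return None


# T1 holds row A and wants row B; T2 holds row B and wants row A.
print(find_deadlock({"T1": "T2", "T2": "T1"}))

# T1 simply waits for T2, which is not waiting on anyone.
print(find_deadlock({"T1": "T2"}))
```

A common way to avoid the cycle in application code is to always lock rows in the same order, for example by ascending id.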
