🎯 PostgreSQL Zero to Hero with Parottasalna – 2 Day Bootcamp (FREE!) 🚀

2 March 2025 at 07:09

Databases power the backbone of modern applications, and PostgreSQL is one of the most powerful open-source relational databases trusted by top companies worldwide. Whether you’re a beginner or a developer looking to sharpen your database skills, this FREE bootcamp will take you from Zero to Hero in PostgreSQL!

What You’ll Learn?

✅ PostgreSQL fundamentals & installation

✅ Postgres Architecture
✅ Writing optimized queries
✅ Indexing & performance tuning
✅ Transactions & locking mechanisms
✅ Advanced joins, CTEs & subqueries
✅ Real-world best practices & hands-on exercises

This intensive, hands-on bootcamp is designed for developers, DBAs, and tech enthusiasts who want to master PostgreSQL from scratch and apply it in real-world scenarios.

Who Should Attend?

🔹 Beginners eager to learn databases
🔹 Developers & Engineers working with PostgreSQL
🔹 Anyone looking to optimize their SQL skills

📅 Date: March 22, 23 -> (Moved to April 5, 6)
⏰ Time: Will be finalized later.
📍 Location: Online
💰 Cost: 100% FREE 🎉

🔗 RSVP Here

Prerequisite

  1. Check out this playlist of our previous Postgres sessions: https://www.youtube.com/playlist?list=PLiutOxBS1Miy3PPwxuvlGRpmNo724mAlt

🎉 This bootcamp is completely FREE – Learn without any cost! 🎉

💡 Spots are limited – RSVP now to reserve your seat!

Why Skeleton Screens Improve Perceived Loading in Apps like LinkedIn

18 February 2025 at 11:55

Introduction

What do Reddit, Discord, Medium, and LinkedIn have in common? They use what’s called a skeleton loading screen for their applications. A skeleton screen is essentially a wireframe of the application. The wireframe is a placeholder until the application finally loads.

skeleton loader image

Rise of the skeleton loader

The term “skeleton screen” was introduced in 2013 by product designer Luke Wroblewski in a blog post about reducing perceived wait time. In that post (lukew.com/ff/entry.asp?1797), he explains how gradually revealing page content shifts the user’s attention to the content being loaded and away from the loading time itself.

Skeleton Loader

Skeleton loading screens improve your application’s user experience and make it feel more performant. The skeleton loading screen essentially mimics the original layout.

This lets the user know what’s happening on the screen: the application is booting up and the content is loading.

In the simplest terms, a skeleton loader is a static or animated placeholder for the information that is still loading. It mimics the structure and look of the entire view.

Why not just a loading spinner ?

Instead of showing a loading spinner, we could show a skeleton screen that makes the user see that there is progress happening when launching and navigating the application.

They let the user know that some content is loading and, more importantly, provide an indication of what is loading, whether it’s an image, text, card, and so on.

This gives the user the impression that the website is faster because they already know what type of content is loading before it appears. This is referred to as perceived performance.

Skeleton screens don’t really make pages load faster. Instead, they are designed to make it feel like pages are loading faster.

When to use ?

  1. Use on high-traffic pages where resources take a while to load, such as an account dashboard.
  2. Use when a component contains a good amount of information, such as a list or card.
  3. A skeleton can replace a spinner in most situations and usually provides a better user experience.
  4. Use when more than one element is loading at the same time and requires an indicator.
  5. Use when you need to load multiple images at once; a skeleton screen makes a good placeholder. For these pages, consider implementing lazy loading first, which is a similar technique for decreasing perceived load time.

When not to use ?

  1. Don’t use it for a long-running process, e.g., importing or manipulating data (operations in data-intensive applications).
  2. Don’t use it for fast processes that take less than half a second.
  3. Users still associate video buffering with spinners, so avoid skeleton screens any time a video is loading on your page.
  4. For longer processes (uploads, downloads, file manipulation), use a progress bar instead of a skeleton loader.
  5. Don’t use it as a replacement for poor performance: if you can further optimize your website to actually load content more quickly, always pursue that first.

Let’s design a simple skeleton loading

index.html

<!DOCTYPE html>
<html>
<head>
	<meta charset="utf-8">
	<meta name="viewport" content="width=device-width, initial-scale=1">
	<title>Skeleton Loading</title>

	<style>
		.card {
			width:  250px;
			height:  150px;
			background-color: #fff;
			padding: 10px;
			border-radius: 5px;
			border:  1px solid gray;
		}

		.card-skeleton {
			background-image: linear-gradient(90deg, #ccc 0px, rgb(229 229 229 / 90%) 40px, #ccc 80px);
			background-size: 300%;
			background-position: 100% 0;
			border-radius: inherit;
			animation: shimmer 1.5s infinite;
		}

		.title {
			height: 15px;
			margin-bottom: 15px;
		}

		.description {
			height: 100px;
		}

		@keyframes shimmer{
			to {
				background-position: -100% 0;
			}
		}
	</style>

</head>
<body>
	<div class="card">
		<div class="card-skeleton title"></div>
		<div class="card-skeleton description"></div>
	</div>

</body>
</html>


Suggestions to keep in mind

  1. The goal is to design for a perception of decreased waiting time.
  2. Content should replace the skeleton exactly in position and size as soon as it loads.
  3. Using motion that moves from left to right (wave) gives a better perception of decreased waiting time than fading in and out (pulse).
  4. Using motion that is slow and steady gives a perception of decreased waiting time.

Avoid Cache Pitfalls: Key Problems and Fixes

16 February 2025 at 09:22

Caching is an essential technique for improving application performance and reducing the load on databases. However, improper caching strategies can lead to serious issues.

I was inspired by this ByteByteGo post: https://www.linkedin.com/posts/bytebytego_systemdesign-coding-interviewtips-activity-7296767687978827776-Dizz

In this blog, we will discuss four common cache problems: Thundering Herd Problem, Cache Penetration, Cache Breakdown, and Cache Crash, along with their causes, consequences, and solutions.

  1. Thundering Herd Problem
    1. What is it?
    2. Example Scenario
    3. Solutions
  2. Cache Penetration
    1. What is it?
    2. Example Scenario
    3. Solutions
  3. Cache Breakdown
    1. What is it?
    2. Example Scenario
    3. Solutions
  4. Cache Crash
    1. What is it?
    2. Example Scenario
    3. Solutions

Thundering Herd Problem

What is it?

The Thundering Herd Problem occurs when a large number of keys in the cache expire at the same time. When this happens, all requests bypass the cache and hit the database simultaneously, overwhelming it and causing performance degradation or even a system crash.

Example Scenario

Imagine an e-commerce website where product details are cached for 10 minutes. If all the products’ cache expires at the same time, thousands of users sending requests will cause an overwhelming load on the database.

Solutions

  1. Staggered Expiration: Instead of setting a fixed expiration time for all keys, introduce a random expiry variation (see the sketch after this list).
  2. Allow Only Core Business Queries: Limit direct database access only to core business data, while returning stale data or temporary placeholders for less critical data.
  3. Lazy Rebuild Strategy: Instead of all requests querying the database, the first request fetches data and updates the cache while others wait.
  4. Batch Processing: Queue multiple requests and process them in batches to reduce database load.
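
A minimal sketch of staggered expiration (solution 1), assuming a Redis cache accessed through the redis-py client; the key names and base TTL are illustrative, not from the original post.

import random
import redis

r = redis.Redis(host='localhost', port=6379)

BASE_TTL = 600  # 10 minutes, as in the example scenario

def cache_product(product_id, data):
    # Add up to 10% random jitter so keys do not all expire together
    ttl = BASE_TTL + random.randint(0, BASE_TTL // 10)
    r.setex(f'product:{product_id}', ttl, data)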

Cache Penetration

What is it?

Cache Penetration occurs when requests are made for keys that neither exist in the cache nor in the database. Since these requests always hit the database, they put excessive pressure on the system.

Example Scenario

A malicious user could attempt to query random user IDs that do not exist, forcing the system to repeatedly query the database and skip the cache.

Solutions

  1. Cache Null Values: If a key does not exist in the database, store a null value in the cache to prevent unnecessary database queries (see the sketch after this list).
  2. Use a Bloom Filter: A Bloom filter helps check whether a key exists before querying the database. If the Bloom filter does not contain the key, the request is discarded immediately.
  3. Rate Limiting: Implement request throttling to prevent excessive access to non-existent keys.
  4. Data Prefetching: Predict and load commonly accessed data into the cache before it is needed.
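
A minimal sketch of caching null values (solution 1), again with redis-py; the sentinel value, short TTL, and the db.find_user helper are illustrative assumptions.

import redis

r = redis.Redis()
NULL_SENTINEL = b'__NULL__'  # marks keys known to be missing

def get_user(user_id, db):
    key = f'user:{user_id}'
    cached = r.get(key)
    if cached is not None:
        return None if cached == NULL_SENTINEL else cached
    row = db.find_user(user_id)  # hypothetical database lookup
    if row is None:
        # Cache the miss briefly so repeated bad IDs skip the database
        r.setex(key, 60, NULL_SENTINEL)
        return None
    r.setex(key, 600, row)
    return row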

Cache Breakdown

What is it?

Cache Breakdown is similar to the Thundering Herd Problem, but it occurs specifically when a single hot key (a frequently accessed key) expires. This results in a surge of database queries as all users try to retrieve the same data.

Example Scenario

A social media platform caches trending hashtags. If the cache expires, millions of users will query the same hashtag at once, hitting the database hard.

Solutions

  1. Never Expire Hot Keys: Keep hot keys permanently in the cache unless an update is required.
  2. Preload the Cache: Refresh the cache asynchronously before expiration by setting a background task to update the cache regularly.
  3. Mutex Locking: Ensure only one request updates the cache, while others wait for the update to complete (see the sketch after this list).
  4. Double Buffering: Maintain a secondary cache layer to serve requests while the primary cache is being refreshed.
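
A minimal sketch of mutex locking (solution 3) with redis-py, using SET NX as the lock; the key names, TTLs, and fetch_from_db callable are illustrative assumptions.

import time
import redis

r = redis.Redis()

def get_trending(fetch_from_db):
    value = r.get('trending')
    if value is not None:
        return value
    # Only the request that wins the lock rebuilds the cache
    if r.set('trending:lock', 1, nx=True, ex=10):
        try:
            value = fetch_from_db()
            r.setex('trending', 300, value)
        finally:
            r.delete('trending:lock')
        return value
    # Everyone else waits briefly, then retries the cache
    time.sleep(0.05)
    return get_trending(fetch_from_db)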

Cache Crash

What is it?

A Cache Crash occurs when the cache service itself goes down. When this happens, all requests fall back to the database, overloading it and causing severe performance issues.

Example Scenario

If a Redis instance storing session data for a web application crashes, all authentication requests will be forced to hit the database, leading to a potential outage.

Solutions

  1. Cache Clustering: Use a cluster of cache nodes instead of a single instance to ensure high availability.
  2. Persistent Storage for Cache: Enable persistence modes like Redis RDB or AOF to recover data quickly after a crash.
  3. Automatic Failover: Configure automated failover with tools like Redis Sentinel to ensure availability even if a node fails.
  4. Circuit Breaker Mechanism: Prevent the application from directly accessing the database if the cache is unavailable, reducing the impact of a crash.

class CircuitBreaker:
    def __init__(self, failure_threshold=5):
        self.failure_count = 0
        self.failure_threshold = failure_threshold

    def call(self, func, *args, **kwargs):
        # Once failures cross the threshold, stop calling the backend
        if self.failure_count >= self.failure_threshold:
            return "Service unavailable"
        try:
            result = func(*args, **kwargs)
            self.failure_count = 0  # a success closes the circuit again
            return result
        except Exception:
            self.failure_count += 1
            return "Error"

Caching is a powerful mechanism to improve application performance, but improper strategies can lead to severe bottlenecks. Problems like Thundering Herd, Cache Penetration, Cache Breakdown, and Cache Crash can significantly degrade system reliability if not handled properly.

Learning Notes #77 – Smoke Testing with K6

16 February 2025 at 07:12

In this blog, I jot down notes on what a smoke test is, how it got its name, and how to approach it in K6.

The term smoke testing originates from hardware testing, where engineers would power on a circuit or device and check if smoke appeared.

If smoke was detected, it indicated a fundamental issue, and further testing was halted. This concept was later adapted to software engineering.

What is Smoke Testing?

Smoke testing is a subset of test cases executed to verify that the major functionalities of an application work as expected. If a smoke test fails, the build is rejected, preventing further testing of a potentially unstable application. This test helps catch major defects early, saving time and effort.

Key Characteristics

  • Ensures that the application is not broken in major areas.
  • Runs quickly and is not exhaustive.
  • Usually automated as part of a CI/CD pipeline.

Writing a Basic Smoke Test with K6

A basic smoke test using K6 typically checks API endpoints for HTTP 200 responses and acceptable response times.

import http from 'k6/http';
import { check } from 'k6';

export let options = {
    vus: 1, // 1 virtual user
    iterations: 5, // Runs the test 5 times
};

export default function () {
    let res = http.get('https://example.com/api/health');
    check(res, {
        'is status 200': (r) => r.status === 200,
        'response time < 500ms': (r) => r.timings.duration < 500,
    });
}

Advanced Smoke Test Example

import http from 'k6/http';
import { check, sleep } from 'k6';

export let options = {
    vus: 2, // 2 virtual users
    iterations: 10, // Runs the test 10 times
};

export default function () {
    let res = http.get('https://example.com/api/login');
    check(res, {
        'status is 200': (r) => r.status === 200,
        'response time < 400ms': (r) => r.timings.duration < 400,
    });
    sleep(1);
}

Running and Analyzing Results

Execute the test using

k6 run smoke-test.js

Sample Output

checks...
✔ is status 200
✔ response time < 500ms

If any of the checks fail, K6 will report an error, signaling an issue in the application.

Smoke testing with K6 is an effective way to ensure that key functionalities in your application work as expected. By integrating it into your CI/CD pipeline, you can catch major defects early, improve application stability, and streamline your development workflow.

Learning Notes #76 – Specifying Virtual Users (VUs) and Test duration in K6

16 February 2025 at 05:13

When running load tests with K6, two fundamental aspects that shape test execution are the number of Virtual Users (VUs) and the test duration. These parameters help simulate realistic user behavior and measure system performance under different load conditions.

In this blog, I jot down notes on configuring virtual users and test duration in the options object, which also lets us ramp users up and down.

  1. Defining VUs and Duration in K6
  2. Basic VU and Duration Configuration
  3. Specifying VUs and Duration from the Command Line
  4. Ramp Up and Ramp Down with Stages
  5. Custom Execution Scenarios
    1. Syntax of Custom Execution Scenarios
    2. Different Executors in K6
    3. Example: Ramping VUs Scenario
    4. Example: Constant Arrival Rate Scenario
    5. Example: Per VU Iteration Scenario
  6. Choosing the Right Configuration
  7. References

Defining VUs and Duration in K6

K6 offers multiple ways to define VUs and test duration, primarily through options in the test script or the command line.

Basic VU and Duration Configuration

The simplest way to specify VUs and test duration is by setting them in the options object of your test script.

import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  vus: 10, // Number of virtual users
  duration: '30s', // Duration of the test
};

export default function () {
  http.get('https://test.k6.io/');
  sleep(1);
}

This script runs a load test with 10 virtual users for 30 seconds, making requests to the specified URL.

Specifying VUs and Duration from the Command Line

You can also set the VUs and duration dynamically using command-line arguments without modifying the script.

k6 run --vus 20 --duration 1m script.js

This command runs the test with 20 virtual users for 1 minute.

Ramp Up and Ramp Down with Stages

Instead of a fixed number of VUs, you can simulate user load variations over time using stages. This helps to gradually increase or decrease the load on the system.

export const options = {
  stages: [
    { duration: '30s', target: 10 }, // Ramp up to 10 VUs
    { duration: '1m', target: 50 },  // Ramp up to 50 VUs
    { duration: '30s', target: 10 }, // Ramp down to 10 VUs
    { duration: '20s', target: 0 },  // Ramp down to 0 VUs
  ],
};

This test gradually increases the load, sustains it, and then reduces it, simulating real-world traffic patterns.

Custom Execution Scenarios

For more advanced load testing strategies, K6 supports scenarios, allowing fine-grained control over execution behavior.

Syntax of Custom Execution Scenarios

A scenarios object defines different execution strategies. Each scenario consists of

  • executor: Defines how the test runs (e.g., ramping-vus, constant-arrival-rate, etc.).
  • vus: Number of virtual users (for certain executors).
  • duration: How long the scenario runs.
  • iterations: Total number of iterations per VU (for certain executors).
  • stages: Used in ramping-vus to define load variations over time.
  • rate: Defines the number of iterations per time unit in constant-arrival-rate.
  • preAllocatedVUs: Number of VUs reserved for the test.

Different Executors in K6

K6 provides several executors that define how virtual users (VUs) generate load

  1. shared-iterations – Distributes a fixed number of iterations across multiple VUs.
  2. per-vu-iterations – Each VU runs a specific number of iterations independently.
  3. constant-vus – Maintains a fixed number of VUs for a set duration.
  4. ramping-vus – Increases or decreases the number of VUs over time.
  5. constant-arrival-rate – Ensures a constant number of requests per time unit, independent of VUs.
  6. ramping-arrival-rate – Gradually increases or decreases the request rate over time.
  7. externally-controlled – Allows dynamic control of VUs via an external API.

Example: Ramping VUs Scenario

import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  scenarios: {
    ramping_users: {
      executor: 'ramping-vus',
      startVUs: 0,
      stages: [
        { duration: '30s', target: 20 },
        { duration: '1m', target: 100 },
        { duration: '30s', target: 0 },
      ],
    },
  },
};

export default function () {
  http.get('https://test-api.example.com');
  sleep(1);
}

Example: Constant Arrival Rate Scenario

import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  scenarios: {
    constant_request_rate: {
      executor: 'constant-arrival-rate',
      rate: 50, // 50 iterations per second
      timeUnit: '1s', // Per second
      duration: '1m', // Test duration
      preAllocatedVUs: 20, // Number of VUs to allocate
    },
  },
};

export default function () {
  http.get('https://test-api.example.com');
  sleep(1);
}

Example: Per VU Iteration Scenario

import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  scenarios: {
    per_vu_iterations: {
      executor: 'per-vu-iterations',
      vus: 10,
      iterations: 50, // Each VU executes 50 iterations
      maxDuration: '1m',
    },
  },
};

export default function () {
  http.get('https://test-api.example.com');
  sleep(1);
}

Choosing the Right Configuration

  • Use fixed VUs and duration for simple, constant load testing.
  • Use stages for ramping up and down load gradually.
  • Use scenarios for more complex and controlled testing setups.

References

  1. Scenarios & Executors – https://grafana.com/docs/k6/latest/using-k6/scenarios/

Golden Feedbacks for Python Sessions 1.0 from last year (2024)

13 February 2025 at 08:49

Many thanks to Shrini for documenting it last year. This serves as a good reference to improve my skills. Hope it will help many.

📢 What Participants wanted to improve

🚶‍♂️ Go a bit slower so that everyone can understand clearly without feeling rushed.

📚 Provide more basics and examples to make learning easier for beginners.

🖥 Spend the first week explaining programming basics so that newcomers don’t feel lost.

📊 Teach flowcharting methods to help participants understand the logic behind coding.

🕹 Try teaching Scratch as an interactive way to introduce programming concepts.

🗓 Offer weekend batches for those who prefer learning on weekends.

🗣 Encourage more conversations so that participants can actively engage in discussions.

👥 Create sub-groups to allow participants to collaborate and support each other.

🎉 Get “cheerleaders” within the team to make the classes more fun and interactive.

📢 Increase promotion efforts to reach a wider audience and get more participants.

🔍 Provide better examples to make concepts easier to grasp.

❓ Conduct more Q&A sessions so participants can ask and clarify their doubts.

🎙 Ensure that each participant gets a chance to speak and express their thoughts.

📹 Showing your face in videos can help in building a more personal connection with the learners.

🏆 Organize mini-hackathons to provide hands-on experience and encourage practical learning.

🔗 Foster more interactions and connections between participants to build a strong learning community.

✍ Encourage participants to write blogs daily to document their learning and share insights.

🎤 Motivate participants to give talks in class and other communities to build confidence.

📝 Other Learnings & Suggestions

📵 Avoid creating WhatsApp groups for communication, as the 1024 member limit makes it difficult to manage multiple groups.

✉ Telegram works fine for now, but explore using mailing lists as an alternative for structured discussions.

🔕 Mute groups when necessary to prevent unnecessary messages like “Hi, Hello, Good Morning.”

📢 Teach participants how to join mailing lists like ChennaiPy and KanchiLUG and guide them on asking questions in forums like Tamil Linux Community.

📝 Show participants how to create a free blog on platforms like dev.to or WordPress to share their learning journey.

🛠 Avoid spending too much time explaining everything in-depth, as participants should start coding a small project by the 5th or 6th class.

📌 Present topics as solutions to project ideas or real-world problem statements instead of just theory.

👤 Encourage using names when addressing people, rather than calling them “Sir” or “Madam,” to maintain an equal and friendly learning environment.

💸 Zoom is costly, and since only around 50 people complete the training, consider alternatives like Jitsi or Google Meet for better cost-effectiveness.

Will try to incorporate these learnings in our upcoming sessions.

🚀 Let’s make this learning experience engaging, interactive, and impactful! 🎯

20 Essential Git Command-Line Tricks Every Developer Should Know

5 February 2025 at 16:14

Git is a powerful version control system that every developer should master. Whether you’re a beginner or an experienced developer, knowing a few handy Git command-line tricks can save you time and improve your workflow. Here are 20 essential Git tips and tricks to boost your efficiency.

1. Undo the Last Commit (Without Losing Changes)

git reset --soft HEAD~1

If you made a commit but want to undo it while keeping your changes, this command resets the last commit but retains the modified files in your staging area.

This is useful when you realize you need to make more changes before committing.

If you also want to remove the changes from the staging area but keep them in your working directory, use,

git reset HEAD~1

2. Discard Unstaged Changes

git checkout -- <file>

Use this to discard local changes in a file before staging. Be careful, as this cannot be undone! If you want to discard all unstaged changes in your working directory, use,

git reset --hard HEAD

3. Delete a Local Branch

git branch -d branch-name

Removes a local branch safely if it’s already merged. If it’s not merged and you still want to delete it, use -D

git branch -D branch-name

4. Delete a Remote Branch

git push origin --delete branch-name

Deletes a branch from the remote repository, useful for cleaning up old feature branches. If you mistakenly deleted the branch and want to restore it, you can use

git checkout -b branch-name origin/branch-name

if it still exists remotely.

5. Rename a Local Branch

git branch -m old-name new-name

Useful when you want to rename a branch locally without affecting the remote repository. To update the remote reference after renaming, push the renamed branch and delete the old one,

git push origin -u new-name
git push origin --delete old-name

6. See the Commit History in a Compact Format

git log --oneline --graph --decorate --all

A clean and structured way to view Git history, showing branches and commits in a visual format. If you want to see a detailed history with diffs, use

git log -p

7. Stash Your Changes Temporarily

git stash

If you need to switch branches but don’t want to commit yet, stash your changes and retrieve them later with

git stash pop

To see all stashed changes

git stash list

8. Find the Author of a Line in a File

git blame file-name

Shows who made changes to each line in a file. Helpful for debugging or reviewing historical changes. If you want to ignore whitespace changes

git blame -w file-name

9. View a File from a Previous Commit

git show commit-hash:path/to/file

Useful for checking an older version of a file without switching branches. If you want to restore the file from an old commit

git checkout commit-hash -- path/to/file

10. Reset a File to the Last Committed Version

git checkout HEAD -- file-name

Restores the file to the last committed state, removing any local changes. If you want to reset all files

git reset --hard HEAD

11. Clone a Specific Branch

git clone -b branch-name --single-branch repository-url

Instead of cloning the entire repository, this fetches only the specified branch, saving time and space. If you want all branches but don’t want to check them out initially:

git clone --mirror repository-url

12. Change the Last Commit Message

git commit --amend -m "New message"

Use this to correct a typo in your last commit message before pushing. Be cautious—if you’ve already pushed, use

git push --force-with-lease

13. See the List of Tracked Files

git ls-files

Displays all files being tracked by Git, which is useful for auditing your repository. To see ignored files

git ls-files --others --ignored --exclude-standard

14. Check the Difference Between Two Branches

git diff branch-1..branch-2

Compares changes between two branches, helping you understand what has been modified. To see only file names that changed

git diff --name-only branch-1..branch-2

15. Add a Remote Repository

git remote add origin repository-url

Links a remote repository to your local project, enabling push and pull operations. To verify remote repositories

git remote -v

16. Remove a Remote Repository

git remote remove origin

Unlinks your repository from a remote source, useful when switching remotes.

17. View the Last Commit Details

git show HEAD

Shows detailed information about the most recent commit, including the changes made. To see only the commit message

git log -1 --pretty=%B

18. Check What’s Staged for Commit

git diff --staged

Displays changes that are staged for commit, helping you review before finalizing a commit.

19. Fetch and Rebase from a Remote Branch

git pull --rebase origin main

Combines fetching and rebasing in one step, keeping your branch up-to-date cleanly. If conflicts arise, resolve them manually and continue with

git rebase --continue

20. View All Git Aliases

git config --global --list | grep alias

If you’ve set up aliases, this command helps you see them all. Aliases can make your Git workflow faster by shortening common commands. For example

git config --global alias.co checkout

allows you to use git co instead of git checkout.

Try these tricks in your daily development to level up your Git skills!

RSVP for K6 : Load Testing Made Easy in Tamil

5 February 2025 at 10:57

Ensuring your applications perform well under high traffic is crucial. Join us for an interactive K6 Bootcamp, where we’ll explore performance testing, load testing strategies, and real-world use cases to help you build scalable and resilient systems.

🎯 What is K6 and Why Should You Learn It?

Modern applications must handle thousands (or millions!) of users without breaking. K6 is an open-source, developer-friendly performance testing tool that helps you

✅ Simulate real-world traffic and identify performance bottlenecks.
✅ Write tests in JavaScript – no need for complex tools!
✅ Run efficient load tests on APIs, microservices, and web applications.
✅ Integrate with CI/CD pipelines to automate performance testing.
✅ Gain deep insights with real-time performance metrics.

By mastering K6, you’ll gain the skills to predict failures before they happen, optimize performance, and build systems that scale with confidence!

📌 Bootcamp Details

📅 Date: Feb 23, 2025 – Sunday
🕒 Time: 10:30 AM
🌐 Mode: Online (Link Will be shared in Email after RSVP)
🗣 Language: தமிழ்

🎓 Who Should Attend?

  • Developers – Ensure APIs and services perform well under load.
  • QA Engineers – Validate system reliability before production.
  • SREs / DevOps Engineers – Continuously test performance in CI/CD pipelines.

RSVP Now

🔥 Don’t miss this opportunity to master load testing with K6 and take your performance engineering skills to the next level!

Got questions? Drop them in the comments or reach out to me. See you at the bootcamp! 🚀

Our Previous Monthly meets – https://www.youtube.com/watch?v=cPtyuSzeaa8&list=PLiutOxBS1MizPGGcdfXF61WP5pNUYvxUl&pp=gAQB

Our Previous Sessions,

  1. Python – https://www.youtube.com/watch?v=lQquVptFreE&list=PLiutOxBS1Mizte0ehfMrRKHSIQcCImwHL&pp=gAQB
  2. Docker – https://www.youtube.com/watch?v=nXgUBanjZP8&list=PLiutOxBS1Mizi9IRQM-N3BFWXJkb-hQ4U&pp=gAQB
  3. Postgres – https://www.youtube.com/watch?v=04pE5bK2-VA&list=PLiutOxBS1Miy3PPwxuvlGRpmNo724mAlt&pp=gAQB

Learning Notes #68 – Buildpacks and Dockerfile

2 February 2025 at 09:32

  1. What is an OCI ?
  2. Does Docker Create OCI Images?
  3. What is a Buildpack ?
  4. Overview of Buildpack Process
  5. Builder: The Image That Executes the Build
    1. Components of a Builder Image
    2. Stack: The Combination of Build and Run Images
  6. Installation and Initial Setups
  7. Basic Build of an Image (Python Project)
    1. Building an image using buildpack
    2. Building an Image using Dockerfile
  8. Unique Benefits of Buildpacks
    1. No Need for a Dockerfile (Auto-Detection)
    2. Automatic Security Updates
    3. Standardized & Reproducible Builds
    4. Extensibility: Custom Buildpacks
  9. Generating SBOM in Buildpacks
    1. a) Using pack CLI to Generate SBOM
    2. b) Generate SBOM in Docker

Over the last few days, I was exploring Buildpacks. I am amazed at this tool’s features for reducing developers’ pain. In this blog, I jot down my experience with Buildpacks.

Before trying Buildpacks, we need to understand what an OCI is.

What is an OCI ?

An OCI Image (Open Container Initiative Image) is a standard format for container images, defined by the Open Container Initiative (OCI) to ensure interoperability across different container runtimes (Docker, Podman, containerd, etc.).

It consists of,

  1. Manifest – Metadata describing the image (layers, config, etc.).
  2. Config JSON – Information about how the container should run (CMD, ENV, etc.).
  3. Filesystem Layers – The actual file system of the container.

OCI Image Specification ensures that container images built once can run on any OCI-compliant runtime.

Does Docker Create OCI Images?

Yes, Docker creates OCI-compliant images. Since Docker v1.10+, Docker has been aligned with the OCI Image Specification, and all Docker images are OCI-compliant by default.

  • When you build an image with docker build, it follows the OCI Image format.
  • When you push/pull images to registries like Docker Hub, they follow the OCI Image Specification.

However, Docker also supports its legacy Docker Image format, which existed before OCI was introduced. Most modern registries and runtimes (Kubernetes, Podman, containerd) support OCI images natively.

What is a Buildpack ?

A buildpack is a framework for transforming application source code into a runnable image by handling dependencies, compilation, and configuration. Buildpacks are widely used in cloud environments like Heroku, Cloud Foundry, and Kubernetes (via Cloud Native Buildpacks).

Overview of Buildpack Process

The buildpack process consists of two primary phases

  • Detection Phase: Determines if the buildpack should be applied based on the app’s dependencies.
  • Build Phase: Executes the necessary steps to prepare the application for running in a container.

Buildpacks work with a lifecycle manager (e.g., Cloud Native Buildpacks’ lifecycle) that orchestrates the execution of multiple buildpacks in an ordered sequence.

Builder: The Image That Executes the Build

A builder is an image that contains all necessary components to run a buildpack.

Components of a Builder Image

  1. Build Image – Used during the build phase (includes compilers, dependencies, etc.).
  2. Run Image – A minimal environment for running the final built application.
  3. Lifecycle – The core mechanism that executes buildpacks, orchestrates the process, and ensures reproducibility.

Stack: The Combination of Build and Run Images

  • Build Image + Run Image = Stack
  • Build Image: Base OS with tools required for building (e.g., Ubuntu, Alpine).
  • Run Image: Lightweight OS with only the runtime dependencies for execution.

Installation and Initial Setups

Basic Build of an Image (Python Project)

Project Source: https://github.com/syedjaferk/gh_action_docker_build_push_fastapi_app

Building an image using buildpack

Before running these commands, ensure you have Pack CLI (pack) installed.

a) Suggest a builder

pack builder suggest

b) Build the image

pack build my-python-app --builder paketobuildpacks/builder:base

c) Run the image locally


docker run -p 8080:8080 my-python-app

Building an Image using Dockerfile

a) Dockerfile


FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .

RUN pip install -r requirements.txt

COPY ./random_id_generator ./random_id_generator
COPY app.py app.py

EXPOSE 8080

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8080"]

b) Build and Run


docker build -t my-python-app .
docker run -p 8080:8080 my-python-app

Unique Benefits of Buildpacks

No Need for a Dockerfile (Auto-Detection)

Buildpacks automatically detect the language and dependencies, removing the need for a Dockerfile.


pack build my-python-app --builder paketobuildpacks/builder:base

It detects Python, installs dependencies, and builds the app into a container. 🚀 Docker, in contrast, requires a Dockerfile, which developers must manually configure and maintain.

Automatic Security Updates

Buildpacks automatically patch base images for security vulnerabilities.

If there’s a CVE in the OS layer, Buildpacks update the base image without rebuilding the app.


pack rebase my-python-app

No need to rebuild! It replaces only the OS layers while keeping the app the same.

Standardized & Reproducible Builds

Ensures consistent images across environments (dev, CI/CD, production). Example: Running the same build locally and on Heroku/Cloud Run,


pack build my-app

Extensibility: Custom Buildpacks

Developers can create custom Buildpacks to add special dependencies.

Example: Adding ffmpeg to a Python buildpack,


pack buildpack package my-custom-python-buildpack --path .

Generating SBOM in Buildpacks

a) Using pack CLI to Generate SBOM

After building an image with pack, run,


pack sbom download my-python-app --output-dir ./sbom

  • This fetches the SBOM for your built image.
  • The SBOM is saved in the ./sbom/ directory.

✅ Supported formats:

  • SPDX (sbom.spdx.json)
  • CycloneDX (sbom.cdx.json)

b) Generate SBOM in Docker


trivy image --format cyclonedx -o sbom.json my-python-app

Both are helpful in creating images; it’s all about the trade-offs.

RabbitMQ – All You Need To Know To Start Building Scalable Platforms

1 February 2025 at 02:39

  1. Introduction
  2. What is a Message Queue ?
  3. So Problem Solved !!! Not Yet
  4. RabbitMQ: Installation
  5. RabbitMQ: An Introduction (Optional)
    1. What is RabbitMQ?
    2. Why Use RabbitMQ?
    3. Key Features and Use Cases
  6. Building Blocks of Message Broker
    1. Connection & Channels
    2. Queues – Message Store
    3. Exchanges – Message Distributor and Binding
  7. Producing, Consuming and Acknowledging
  8. Problem #1 – Task Queue for Background Job Processing
    1. Context
    2. Problem
    3. Proposed Solution
  9. Problem #2 – Broadcasting NEWS to all subscribers
    1. Problem
    2. Solution Overview
    3. Step 1: Producer (Publisher)
    4. Step 2: Consumers (Subscribers)
      1. Consumer 1: Mobile App Notifications
      2. Consumer 2: Email Alerts
      3. Consumer 3: Web Notifications
      4. How It Works
  10. Intermediate Resources
    1. Prefetch Count
    2. Request Reply Pattern
    3. Dead Letter Exchange
    4. Alternate Exchanges
    5. Lazy Queues
    6. Quorum Queues
    7. Change Data Capture
    8. Handling Backpressure in Distributed Systems
    9. Choreography Pattern
    10. Outbox Pattern
    11. Queue Based Loading
    12. Two Phase Commit Protocol
    13. Competing Consumer
    14. Retry Pattern
    15. Can We Use Database as a Queue
  11. Let’s Connect

Introduction

Let’s take the example of an online food ordering system like Swiggy or Zomato. Suppose a user places an order through the mobile app. If the application follows a synchronous approach, it would first send the order request to the restaurant’s system and then wait for confirmation. If the restaurant is busy, the app will have to keep waiting until it receives a response.

If the restaurant’s system crashes or temporarily goes offline, the order will fail, and the user may have to restart the process.

This approach leads to a poor user experience, increases the chances of failures, and makes the system less scalable, as multiple users waiting simultaneously can cause a bottleneck.

In a traditional synchronous communication model, one service directly interacts with another and waits for a response before proceeding. While this approach is simple and works for small-scale applications, it introduces several challenges, especially in systems that require high availability and scalability.

The main problems with synchronous communication include slow performance, system failures, and scalability issues. If the receiving service is slow or temporarily unavailable, the sender has no choice but to wait, which can degrade the overall performance of the application.

Moreover, if the receiving service crashes, the entire process fails, leading to potential data loss or incomplete transactions.

In this book, we are going to see how this can be solved with a message queue.

What is a Message Queue ?

A message queue is a system that allows different parts of an application (or different applications) to communicate with each other asynchronously by sending and receiving messages.

It acts like a buffer or an intermediary where messages are stored until the receiving service is ready to process them.

How It Works

  1. A producer (sender) creates a message and sends it to the queue.
  2. The message sits in the queue until a consumer (receiver) picks it up.
  3. The consumer processes the message and removes it from the queue.

This process ensures that the sender does not have to wait for the receiver to be available, making the system faster, more reliable, and scalable.

Real-Life Example

Imagine a fast-food restaurant where customers place orders at the counter. Instead of waiting at the counter for their food, customers receive a token number and move aside. The kitchen prepares the order in the background, and when it’s ready, the token number is called for pickup.

In this analogy,

  • The counter is the producer (sending orders).
  • The queue is the token system (storing orders).
  • The kitchen is the consumer (processing orders).
  • The customer picks up the food when ready (message is consumed).

Similarly, in applications, a message queue helps decouple systems, allowing them to work at their own pace without blocking each other. RabbitMQ, Apache Kafka, and Redis are popular message queue systems used in modern software development. 🚀

So Problem Solved !!! Not Yet

It seems like the problem is solved, but the message life cycle in the queue still needs to be handled.

  • Message Routing & Binding (Optional) – How is a message routed? If an exchange is used, the message is routed based on predefined rules.
  • Message Storage (Queue Retention) – How long a message stays in the queue. The message stays in the queue until a consumer picks it up.
  • If the consumer successfully processes the message, it sends an acknowledgment (ACK), and the message is removed. If the consumer fails, the message requeues or moves to a dead-letter queue (DLQ).
  • Messages that fail multiple times, are not acknowledged, or expire may be moved to a Dead-Letter Queue for further analysis.
  • Messages stored only in memory can be lost if RabbitMQ crashes.
  • Messages not consumed within their TTL expire.
  • If a consumer fails to acknowledge a message, it may be redelivered and processed more than once.
  • Messages failing multiple times may be moved to a DLQ.
  • Too many messages in the queue due to slow consumers can cause system slowdowns.
  • Network failures can disrupt message delivery between producers, RabbitMQ, and consumers.
  • Messages with corrupt or bad data may cause repeated consumer failures.

To handle all the above problems, we need a tool. Stable, Battle tested, Reliable tool. RabbitMQ is one kind of that tool. In this book we will cover the basics of RabbitMQ.

RabbitMQ: Installation

For RabbitMQ Installation please refer to https://www.rabbitmq.com/docs/download. In this book we will go with RabbitMQ docker.

docker run -it --rm --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:4.0-management


RabbitMQ: An Introduction (Optional)

What is RabbitMQ?

Imagine you’re sending messages between friends, but instead of delivering them directly, you drop them in a mailbox, and your friend picks them up when they are ready. RabbitMQ acts like this mailbox, but for computer programs. It helps applications communicate asynchronously, meaning they don’t have to wait for each other to process data.

RabbitMQ is a message broker, which means it handles and routes messages between different parts of an application. It ensures that messages are delivered efficiently, even when some components are running at different speeds or go offline temporarily.

Why Use RabbitMQ?

Modern applications often consist of multiple services that need to exchange data. Sometimes, one service produces data faster than another can consume it. Instead of forcing the slower service to catch up or making the faster service wait, RabbitMQ allows the fast service to place messages in a queue. The slow service can then process them at its own pace.

Some key benefits of using RabbitMQ include,

  • Decoupling services: Components communicate via messages rather than direct calls, reducing dependencies.
  • Scalability: RabbitMQ allows multiple consumers to process messages in parallel.
  • Reliability: It supports message durability and acknowledgments, preventing message loss.
  • Flexibility: Works with many programming languages and integrates well with different systems.
  • Efficient Load Balancing: Multiple consumers can share the message load to prevent overload on a single component.

Key Features and Use Cases

RabbitMQ is widely used in different applications, including

  • Chat applications: Messages are queued and delivered asynchronously to users.
  • Payment processing: Orders are placed in a queue and processed sequentially.
  • Event-driven systems: Used for microservices communication and event notification.
  • IoT systems: Devices publish data to RabbitMQ, which is then processed by backend services.
  • Job queues: Background tasks such as sending emails or processing large files.

Building Blocks of Message Broker

Connection & Channels

In RabbitMQ, connections and channels are fundamental concepts for communication between applications and the broker,

Connections: A connection is a TCP link between a client (producer or consumer) and the RabbitMQ broker. Each connection consumes system resources and is relatively expensive to create and maintain.

Channels: A channel is a virtual communication path inside a connection. It allows multiple logical streams of data over a single TCP connection, reducing overhead. Channels are lightweight and preferred for performing operations like publishing and consuming messages.

Queues – Message Store

A queue is a message buffer that temporarily holds messages until a consumer retrieves and processes them.

1. Queues operate on a FIFO (First In, First Out) basis, meaning messages are processed in the order they arrive (unless priorities or other delivery strategies are set).

2. Queues persist messages if they are declared as durable and the messages are marked as persistent, ensuring reliability even if RabbitMQ restarts.

3. Multiple consumers can subscribe to a queue, and messages can be distributed among them in a round-robin manner.

4. If no consumers are available, messages remain in the queue until a consumer connects.

Analogy: Think of a queue as a to-do list where tasks (messages) are stored until someone (a worker/consumer) picks them up and processes them.

Exchanges – Message Distributor and Binding

An exchange is responsible for routing messages to one or more queues based on routing rules.

When a producer sends a message, it doesn’t go directly to a queue but first reaches an exchange, which decides where to forward it.

The link between the exchange and the queue is called a binding; it guides messages to the right place.

RabbitMQ supports different types of exchanges

Direct Exchange (direct)

  • Routes messages to queues based on an exact match between the routing key and the queue’s binding key.
  • Example: Sending messages to a specific queue based on a severity level (info, error, warning).


Fanout Exchange (fanout)

  • Routes messages to all bound queues, ignoring routing keys.
  • Example: Broadcasting notifications to multiple services at once.

Topic Exchange (topic)

  • Routes messages based on pattern matching using * (matches one word) and # (matches multiple words).
  • Example: Routing logs where log.info goes to one queue, log.error goes to another, and log.* captures all (see the sketch below).
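
A minimal sketch of topic bindings with pika, assuming a broker on localhost; the exchange and queue names are illustrative.

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

channel.exchange_declare(exchange='logs_topic', exchange_type='topic')

channel.queue_declare(queue='error_logs')
channel.queue_bind(exchange='logs_topic', queue='error_logs', routing_key='log.error')

channel.queue_declare(queue='all_logs')
channel.queue_bind(exchange='logs_topic', queue='all_logs', routing_key='log.*')

# 'log.error' matches both bindings; 'log.info' matches only 'log.*'
channel.basic_publish(exchange='logs_topic', routing_key='log.error', body='disk full')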

Headers Exchange (headers)

  • Routes messages based on message headers instead of routing keys.
  • Example: Delivering messages based on metadata like device: mobile or region: US.

Analogy: An exchange is like a traffic controller that decides which road (queue) a vehicle (message) should take based on predefined rules.

Binding

A binding is a link between an exchange and a queue that defines how messages should be routed.

  • When a queue is bound to an exchange with a binding key, messages with a matching routing key are delivered to that queue.
  • A queue can have multiple bindings to different exchanges, allowing it to receive messages from multiple sources.

Example:

  • A queue named error_logs can be bound to a direct exchange with a binding key error.
  • Another queue, all_logs, can be bound to the same exchange with a binding key # (wildcard in a topic exchange) to receive all logs.

Analogy: A binding is like a GPS route guiding messages (vehicles) from the exchange (traffic controller) to the right queue (destination).

Producing, Consuming and Acknowledging

RabbitMQ follows the producer-exchange-queue-consumer model,

  • Producing messages (Publishing): A producer creates a message and sends it to RabbitMQ, which routes it to the correct queue.
  • Consuming messages (Subscribing): A consumer listens for messages from the queue and processes them.
  • Acknowledgment: The consumer sends an acknowledgment (ack) after successfully processing a message.
  • Durability: Ensures messages and queues survive RabbitMQ restarts.

Why do we need an Acknowledgement ?

  1. Ensures message reliability – Prevents messages from being lost if a consumer crashes.
  2. Prevents message loss – Messages are redelivered if no ACK is received.
  3. Avoids unintentional message deletion – Messages stay in the queue until properly processed.
  4. Supports at-least-once delivery – Ensures every message is processed at least once.
  5. Enables load balancing – Distributes messages fairly among multiple consumers.
  6. Allows manual control – Consumers can acknowledge only after successful processing.
  7. Handles redelivery – Messages can be requeued and sent to another consumer if needed.

Problem #1 – Task Queue for Background Job Processing

Context

A company runs an image processing application where users upload images that need to be resized, watermarked, and optimized before they can be served. Processing these images synchronously would slow down the user experience, so the company decides to implement an asynchronous task queue using RabbitMQ.

Problem

  • Users upload large images that require multiple processing steps.
  • Processing each image synchronously blocks the application, leading to slow response times.
  • High traffic results in queue buildup, making it challenging to scale the system efficiently.

Proposed Solution

1. Producer Service

  • Publishes image processing tasks to a RabbitMQ exchange (task_exchange).
  • Sends the image filename as the message body to the queue (image_queue).

2. Worker Consumers

  • Listen for new image processing tasks from the queue.
  • Process each image (resize, watermark, optimize, etc.).
  • Acknowledge completion to ensure no duplicate processing.

3. Scalability

  • Multiple workers can run in parallel to process images faster.

producer.py

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# Declare exchange and queue
channel.exchange_declare(exchange='task_exchange', exchange_type='direct')
channel.queue_declare(queue='image_queue')

# Bind queue to exchange
channel.queue_bind(exchange='task_exchange', queue='image_queue', routing_key='image_task')

# List of images to process
images = ["image1.jpg", "image2.jpg", "image3.jpg"]

for image in images:
    channel.basic_publish(exchange='task_exchange', routing_key='image_task', body=image)
    print(f" [x] Sent {image}")

connection.close()

consumer.py

import pika
import time

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# Declare exchange and queue
channel.exchange_declare(exchange='task_exchange', exchange_type='direct')
channel.queue_declare(queue='image_queue')

# Bind queue to exchange
channel.queue_bind(exchange='task_exchange', queue='image_queue', routing_key='image_task')

def process_image(ch, method, properties, body):
    print(f" [x] Processing {body.decode()}")
    time.sleep(2)  # Simulate processing time
    print(f" [x] Finished {body.decode()}")
    ch.basic_ack(delivery_tag=method.delivery_tag)

# Start consuming
channel.basic_consume(queue='image_queue', on_message_callback=process_image)
print(" [*] Waiting for image tasks. To exit press CTRL+C")
channel.start_consuming()

Problem #2 – Broadcasting NEWS to all subscribers

Problem

A news application wants to send breaking news alerts to all subscribers, regardless of their location or interest.

Use a fanout exchange (news_alerts_exchange) to broadcast messages to all connected queues, ensuring all users receive the alert.

🔹 Example

  • mobile_app_queue (for users receiving push notifications)
  • email_alert_queue (for users receiving email alerts)
  • web_notification_queue (for users receiving notifications on the website)

Solution Overview

  • We create a fanout exchange called news_alerts_exchange.
  • Multiple queues (mobile_app_queue, email_alert_queue, and web_notification_queue) are bound to this exchange.
  • A producer publishes messages to the exchange.
  • Each consumer listens to its respective queue and receives the alert.

Step 1: Producer (Publisher)

This script publishes a breaking news alert to the fanout exchange.

import pika

# Establish connection
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Declare a fanout exchange
channel.exchange_declare(exchange="news_alerts_exchange", exchange_type="fanout")

# Publish a message
message = "Breaking News: Major event happening now!"
channel.basic_publish(exchange="news_alerts_exchange", routing_key="", body=message)

print(f" [x] Sent: {message}")

# Close connection
connection.close()

Step 2: Consumers (Subscribers)

Each consumer listens to its respective queue and processes the alert.

Consumer 1: Mobile App Notifications

import pika

# Establish connection
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Declare exchange
channel.exchange_declare(exchange="news_alerts_exchange", exchange_type="fanout")

# Declare a queue for mobile push notifications
queue_name = "mobile_app_queue"
channel.queue_declare(queue=queue_name)
channel.queue_bind(exchange="news_alerts_exchange", queue=queue_name)

# Callback function
def callback(ch, method, properties, body):
    print(f" [Mobile App] Received: {body.decode()}")

# Consume messages
channel.basic_consume(queue=queue_name, on_message_callback=callback, auto_ack=True)
print(" [*] Waiting for news alerts...")
channel.start_consuming()

Consumer 2: Email Alerts

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

channel.exchange_declare(exchange="news_alerts_exchange", exchange_type="fanout")

queue_name = "email_alert_queue"
channel.queue_declare(queue=queue_name)
channel.queue_bind(exchange="news_alerts_exchange", queue=queue_name)

def callback(ch, method, properties, body):
    print(f" [Email Alert] Received: {body.decode()}")

channel.basic_consume(queue=queue_name, on_message_callback=callback, auto_ack=True)
print(" [*] Waiting for news alerts...")
channel.start_consuming()

Consumer 3: Web Notifications

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

channel.exchange_declare(exchange="news_alerts_exchange", exchange_type="fanout")

queue_name = "web_notification_queue"
channel.queue_declare(queue=queue_name)
channel.queue_bind(exchange="news_alerts_exchange", queue=queue_name)

def callback(ch, method, properties, body):
    print(f" [Web Notification] Received: {body.decode()}")

channel.basic_consume(queue=queue_name, on_message_callback=callback, auto_ack=True)
print(" [*] Waiting for news alerts...")
channel.start_consuming()

How It Works

  1. The producer sends a news alert to the fanout exchange (news_alerts_exchange).
  2. All queues (mobile_app_queue, email_alert_queue, web_notification_queue) bound to the exchange receive the message.
  3. Each consumer listens to its queue and processes the alert.

This setup ensures all users receive the alert simultaneously across different platforms. 🚀

Intermediate Resources

Prefetch Count

Prefetch is a mechanism that defines how many messages can be delivered to a consumer at a time before the consumer sends an acknowledgment back to the broker. This ensures that the consumer does not get overwhelmed with too many unprocessed messages, which could lead to high memory usage and potential performance issues.
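
In pika, this is set with basic_qos. A minimal sketch, reusing the channel and process_image callback from the consumer.py example earlier:

# Deliver at most 10 unacknowledged messages to this consumer at a time
channel.basic_qos(prefetch_count=10)
channel.basic_consume(queue='image_queue', on_message_callback=process_image)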

To Know More: https://parottasalna.com/2024/12/29/learning-notes-16-prefetch-count-rabbitmq/

Request Reply Pattern

The Request-Reply Pattern is a fundamental communication style in distributed systems, where a requester sends a message to a responder and waits for a reply. It’s widely used in systems that require synchronous communication, enabling the requester to receive a response for further processing.

To Know More: https://parottasalna.com/2024/12/28/learning-notes-15-request-reply-pattern-rabbitmq/
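
A minimal sketch of the requester side with pika, assuming a responder service is listening on a queue named rpc_queue (both queue names are illustrative),

import uuid
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Exclusive, auto-named queue to receive the reply on
result = channel.queue_declare(queue="", exclusive=True)
callback_queue = result.method.queue

corr_id = str(uuid.uuid4())

# Send the request, telling the responder where and how to reply
channel.basic_publish(
    exchange="",
    routing_key="rpc_queue",  # assumed name of the responder's queue
    properties=pika.BasicProperties(reply_to=callback_queue, correlation_id=corr_id),
    body="compute this",
)

# Wait for a reply whose correlation_id matches our request
for method, properties, body in channel.consume(callback_queue, auto_ack=True):
    if properties.correlation_id == corr_id:
        print(f"Got reply: {body.decode()}")
        break
channel.cancel()
connection.close()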

Dead Letter Exchange

A dead letter is a message that cannot be delivered to its intended queue or is rejected by a consumer. Common scenarios where messages are dead lettered include,

  1. Message Rejection: A consumer explicitly rejects a message without requeuing it.
  2. Message TTL (Time-To-Live) Expiry: The message remains in the queue longer than its TTL.
  3. Queue Length Limit: The queue has reached its maximum capacity, and new messages are dropped.
  4. Routing Failures: Messages that cannot be routed to any queue from an exchange.

To Know More: https://parottasalna.com/2024/12/28/learning-notes-14-dead-letter-exchange-rabbitmq/
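
A small pika sketch of wiring a queue to a dead letter exchange; the exchange and queue names here are illustrative,

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Exchange + queue where dead-lettered messages end up
channel.exchange_declare(exchange="dlx", exchange_type="fanout")
channel.queue_declare(queue="dead_letter_queue")
channel.queue_bind(exchange="dlx", queue="dead_letter_queue")

# Main queue: rejected or expired messages are re-published to "dlx"
channel.queue_declare(
    queue="orders_queue",
    arguments={
        "x-dead-letter-exchange": "dlx",
        "x-message-ttl": 60000,  # messages expire after 60 seconds
    },
)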

Alternate Exchanges

An alternate exchange in RabbitMQ is a fallback exchange configured for another exchange. If a message cannot be routed to any queue bound to the primary exchange, RabbitMQ will publish the message to the alternate exchange instead. This mechanism ensures that undeliverable messages are not lost but can be processed in a different way, such as logging, alerting, or storing them for later inspection.

To Know More: https://parottasalna.com/2024/12/27/learning-notes-12-alternate-exchanges-rabbitmq/
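
A minimal sketch with pika, assuming illustrative exchange names,

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Fallback exchange and a queue to collect unroutable messages
channel.exchange_declare(exchange="unroutable_exchange", exchange_type="fanout")
channel.queue_declare(queue="unroutable_queue")
channel.queue_bind(exchange="unroutable_exchange", queue="unroutable_queue")

# Primary exchange: anything it cannot route goes to the alternate exchange
channel.exchange_declare(
    exchange="orders_exchange",
    exchange_type="direct",
    arguments={"alternate-exchange": "unroutable_exchange"},
)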

Lazy Queues

  • Lazy Queues are designed to store messages primarily on disk rather than in memory.
  • They are optimized for use cases involving large message backlogs where minimizing memory usage is critical.

To Know More: https://parottasalna.com/2024/12/26/learning-notes-10-lazy-queues-rabbitmq/
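
A small sketch declaring a classic queue in lazy mode with pika (note that recent RabbitMQ releases make classic queues behave this way by default, so the argument mainly matters on older versions; the queue name is illustrative),

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Classic queue that keeps messages on disk instead of RAM
channel.queue_declare(
    queue="bulk_import_queue",
    durable=True,
    arguments={"x-queue-mode": "lazy"},
)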

Quorum Queues

  • Quorum Queues are distributed queues built on the Raft consensus algorithm.
  • They are designed for high availability, durability, and data safety by replicating messages across multiple nodes in a RabbitMQ cluster.
  • It’s a replacement for Mirrored Queues.

To Know More: https://parottasalna.com/2024/12/25/learning-notes-9-quorum-queues-rabbitmq/
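
Declaring one with pika is a one-liner; quorum queues must be durable (the queue name is illustrative),

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Quorum queues are replicated via Raft and must be declared durable
channel.queue_declare(
    queue="payments_queue",
    durable=True,
    arguments={"x-queue-type": "quorum"},
)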

Change Data Capture

CDC stands for Change Data Capture. It’s a technique that listens to a database and captures every change that happens in it. These changes can then be sent to other systems to,

  • Keep data in sync across multiple databases.
  • Power real-time analytics dashboards.
  • Trigger notifications for certain database events.
  • Process data streams in real time.

To Know More: https://parottasalna.com/2025/01/19/learning-notes-63-change-data-capture-what-does-it-do/

Handling Backpressure in Distributed Systems

Backpressure occurs when a downstream system (consumer) cannot keep up with the rate of data being sent by an upstream system (producer). In distributed systems, this can arise in scenarios such as

  • A message queue filling up faster than it is drained.
  • A database struggling to handle the volume of write requests.
  • A streaming system overwhelmed by incoming data.

To Know More: https://parottasalna.com/2025/01/07/learning-notes-45-backpressure-handling-in-distributed-systems/

Choreography Pattern

In the Choreography Pattern, services communicate directly with each other via asynchronous events, without a central controller. Each service is responsible for a specific part of the workflow and responds to events produced by other services. This pattern allows for a more autonomous and loosely coupled system.

To Know More: https://parottasalna.com/2025/01/05/learning-notes-38-choreography-pattern-cloud-pattern/

Outbox Pattern

When a service must update its database and publish an event, a failure between the two steps can leave them inconsistent (the dual-write problem). The Outbox Pattern is a proven architectural solution to this problem, helping developers manage data consistency, especially when dealing with events, messaging systems, or external APIs.

To Know More: https://parottasalna.com/2025/01/03/learning-notes-31-outbox-pattern-cloud-pattern/

Queue Based Loading

The Queue-Based Loading Pattern leverages message queues to decouple and coordinate tasks between producers (such as applications or services generating data) and consumers (services or workers processing that data). By using queues as intermediaries, this pattern allows systems to manage workloads efficiently, ensuring seamless and scalable operation.

To Know More: https://parottasalna.com/2025/01/03/learning-notes-30-queue-based-loading-cloud-patterns/

Two Phase Commit Protocol

The Two-Phase Commit (2PC) protocol is a distributed algorithm used to ensure atomicity in transactions spanning multiple nodes or databases. Atomicity ensures that either all parts of a transaction are committed or none are, maintaining consistency in distributed systems.

To Know More: https://parottasalna.com/2025/01/03/learning-notes-29-two-phase-commit-protocol-acid-in-distributed-systems/

Competing Consumer

The competing consumer pattern involves multiple consumers that independently compete to process messages or tasks from a shared queue. This pattern is particularly effective in scenarios where the rate of incoming tasks is variable or high, as it allows multiple consumers to process tasks concurrently.

To Know More: https://parottasalna.com/2025/01/01/learning-notes-24-competing-consumer-messaging-queue-patterns/

Retry Pattern

The Retry Pattern is a design strategy used to manage transient failures by retrying failed operations. Instead of immediately failing an operation after an error, the pattern retries it with an optional delay or backoff strategy. This is particularly useful in distributed systems where failures are often temporary.

To Know More: https://parottasalna.com/2024/12/31/learning-notes-23-retry-pattern-cloud-patterns/
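
A minimal Python sketch of retry with exponential backoff and jitter (the flaky operation is a stand-in for a real network call),

import time
import random

def retry(operation, max_attempts=5, base_delay=1.0):
    """Retry `operation` with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, 0.5)
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay:.1f}s")
            time.sleep(delay)

# Example: a flaky call that succeeds eventually (may still raise after 5 tries)
def flaky():
    if random.random() < 0.7:
        raise ConnectionError("transient failure")
    return "ok"

print(retry(flaky))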

Can We Use Database as a Queue

Developers try to use their RDBMS as a way to do background processing or service communication. While this can often appear to ‘get the job done’, there are a number of limitations and concerns with this approach.

There are two divisions to any asynchronous processing: the service(s) that create processing tasks and the service(s) that consume and process these tasks accordingly.

To Know More: https://parottasalna.com/2024/06/15/can-we-use-database-as-queue-in-asynchronous-process/

Let’s Connect

Telegram: https://t.me/parottasalna/1

LinkedIn: https://www.linkedin.com/in/syedjaferk/

Whatsapp Channel: https://whatsapp.com/channel/0029Vavu8mF2v1IpaPd9np0s

Youtube: https://www.youtube.com/@syedjaferk

Github: https://github.com/syedjaferk/

Learning Notes #67 – Build and Push to a Registry (Docker Hub) with GH-Actions

28 January 2025 at 02:30

GitHub Actions is a powerful tool for automating workflows directly in your repository. In this blog, we’ll explore how to efficiently set up GitHub Actions to handle Docker workflows with environments, secrets, and protection rules.

Why Use GitHub Actions for Docker?

My code base is in GitHub, and I want to try out GitHub Actions to build and push images to Docker Hub seamlessly.

Setting Up GitHub Environments

GitHub Environments let you define settings specific to deployment stages. Here’s how to configure them:

1. Create an Environment

Go to your GitHub repository and navigate to Settings > Environments. Click New environment, name it (e.g., production), and save.

2. Add Secrets and Variables

Inside the environment settings, click Add secret to store sensitive information like DOCKER_USERNAME and DOCKER_TOKEN.

Use Variables for non-sensitive configuration, such as the Docker image name.

3. Optional: Set Protection Rules

Enforce rules like requiring manual approval before deployments. Restrict deployments to specific branches (e.g., main).

Sample Workflow for Building and Pushing Docker Images

Below is a GitHub Actions workflow for automating the build and push of a Docker image based on a minimal Flask app.

Workflow: .github/workflows/docker-build-push.yml


name: Build and Push Docker Image

on:
  push:
    branches:
      - main  # Trigger workflow on pushes to the `main` branch

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    environment: production  # Specify the environment to use

    steps:
      # Checkout the repository
      - name: Checkout code
        uses: actions/checkout@v3

      # Log in to Docker Hub using environment secrets
      - name: Log in to Docker Hub
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_TOKEN }}

      # Build the Docker image using an environment variable
      - name: Build Docker image
        env:
          DOCKER_IMAGE_NAME: ${{ vars.DOCKER_IMAGE_NAME }}
        run: |
          docker build -t ${{ secrets.DOCKER_USERNAME }}/$DOCKER_IMAGE_NAME:${{ github.run_id }} .

      # Push the Docker image to Docker Hub
      - name: Push Docker image
        env:
          DOCKER_IMAGE_NAME: ${{ vars.DOCKER_IMAGE_NAME }}
        run: |
          docker push ${{ secrets.DOCKER_USERNAME }}/$DOCKER_IMAGE_NAME:${{ github.run_id }}

To see the Actions run live: https://github.com/syedjaferk/gh_action_docker_build_push_fastapi_app/actions

Learning Notes #66 – What is SBOM ? Software Bill of Materials

26 January 2025 at 09:16

Yesterday, I came to know about SBOM from my friend Prasanth Baskar. Let’s say you’re building a website.

You decide to use a popular open-source tool to handle user logins. Here’s the catch,

  • That library uses another library to store data.
  • That tool depends on another library to handle passwords.

Now, if one of those libraries has a bug or security issue, how do you even know it’s there? In this blog, I will jot down my understanding of SBOM with Trivy.

What is SBOM ?

A Software Bill of Materials (SBOM) is a list of everything that makes up a piece of software.

Think of it as,

  • A shopping list for all the tools, libraries, and pieces used to build the software.
  • A recipe card showing what’s inside and how it’s structured.

For software, this means,

  • Components: These are the “ingredients,” such as open-source libraries, frameworks, and tools.
  • Versions: Just like you might want to know if the cake uses almond flour or regular flour, knowing the version of a software component matters.
  • Licenses: Did the baker follow the rules for the ingredients they used? Software components also come with licenses that dictate how they can be used.

So How Come It’s Important ?

1. Understanding What You’re Using

When you download or use software, especially something complex, you often don’t know what’s inside. An SBOM helps you understand what components are being used. Are they secure? Are they trustworthy?

2. Finding Problems Faster

If someone discovers that a specific ingredient is bad—like flour with bacteria in it—you’d want to know if that’s in your cake. Similarly, if a software library has a security issue, an SBOM helps you figure out if your software is affected and needs fixing.

For example,

When the Log4j vulnerability made headlines, companies that had SBOMs could quickly identify whether they used Log4j and take action.

3. Building Trust

Imagine buying food without a label or list of ingredients.

You’d feel doubtful, right ? Similarly, an SBOM builds trust by showing users exactly what’s in the software they’re using.

4. Avoiding Legal Trouble

Some software components come with specific rules or licenses about how they can be used. An SBOM ensures these rules are followed, avoiding potential legal headaches.

How to Create an SBOM?

For many developers, creating an SBOM manually would be impossible because modern software can have hundreds (or even thousands!) of components.

Thankfully, there are tools that automatically create SBOMs. Examples include,

  • Trivy: A lightweight tool to generate SBOMs and find vulnerabilities.
  • CycloneDX: A popular SBOM format supported by many tools https://cyclonedx.org/
  • SPDX: Another format designed to make sharing SBOMs easier https://spdx.dev/

These tools can scan your software and automatically list out every component, its version, and its dependencies.

We will see an example of generating an SBOM file for nginx using Trivy.

How Trivy Works ?

On running a Trivy scan,

1. It downloads the Trivy DB, including vulnerability information.

2. It pulls missing layers into the cache.

3. It analyzes the layers and stores the information in the cache.

4. It detects security issues and writes them to the SBOM file.

Note: a CVE refers to a Common Vulnerabilities and Exposures identifier. A CVE is a unique code used to catalog and track publicly known security vulnerabilities and exposures in software or systems.

How to Generate SBOMs with Trivy

Step 1: Install Trivy in Ubuntu

sudo apt-get install wget gnupg
wget -qO - https://aquasecurity.github.io/trivy-repo/deb/public.key | gpg --dearmor | sudo tee /usr/share/keyrings/trivy.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/trivy.gpg] https://aquasecurity.github.io/trivy-repo/deb generic main" | sudo tee -a /etc/apt/sources.list.d/trivy.list
sudo apt-get update
sudo apt-get install trivy

More on Installation: https://github.com/aquasecurity/trivy/blob/main/docs/getting-started/installation.md

Step 2: Generate an SBOM

Trivy allows you to create SBOMs in formats like CycloneDX or SPDX.

trivy image --format cyclonedx --output sbom.json nginx:latest

It generates the SBOM file.

It can be incorporated into Github CI/CD.
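
The generated SBOM can itself be fed back into Trivy for a vulnerability scan, which is handy in CI (the file name follows the command above),

trivy sbom sbom.json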

Event Summary: FOSS United Chennai Meetup – 25-01-2025

26 January 2025 at 04:53

🚀 Attended the FOSS United Chennai Meetup Yesterday! 🚀

After attending the Grafana & Friends Meetup, I went straight to the FOSS United Chennai Meetup at YuniQ in Taramani.

Had a chance to meet my friends face to face after a long time. Sakhil Ahamed E., Dhanasekar T, Dhanasekar Chellamuthu, Thanga Ayyanar, Parameshwar Arunachalam, Guru Prasath S, Krisha, Gopinathan Asokan

Talks Summary,

1. Ansh Arora gave a tour of FOSS United: how it’s formed, its motto, FOSS Hack, and FOSS Clubs.

2. Karthikeyan A K gave a talk on his open source product injee (the no-configuration instant database for frontend developers). He gave a personal demo for me. It’s a great tool with a lot of potential. Would like to contribute!

3. Justin Benito shared how they celebrated New Year with https://tamilnadu.tech
It’s a single go-to page for events in Tamil Nadu. If you are interested, go to the repo https://lnkd.in/geKFqnFz and contribute.

At Kaniyam Foundation, we have been maintaining a Google Calendar of tech events happening in Tamil Nadu for a long time: https://lnkd.in/gbmGMuaa.

4. Prasanth Baskar, gave a talk on Harbor, OSS Container Registry with SBOM and more functionalities. SBOM was new to me.

5. Thanga Ayyanar, gave a talk on Static Site Generation with Emacs.

At the end, we had a group photo and went for tea. Got to meet my juniors from St. Joseph’s Institute of Technology in this meet. Had a discussion with Parameshwar Arunachalam on his BuildToLearn experience. They started prototyping a Tinder-style app for Tamil words. After that, had a small discussion on our Feb 8th GLUG inauguration at St. Joseph’s Institute of Technology with Dr. KARTHI M.

Happy to see a lot of minds travelling from different districts to attend this meet.

Event Summary: Grafana & Friends Meetup Chennai – 25-01-2025

26 January 2025 at 04:47

🚀 Attended the Grafana & Friends Meetup Yesterday! 🚀

I usually have a question: as a developer, I have logs; isn’t that enough? With a curious mind, I attended the Grafana & Friends Chennai meetup (Jan 25th, 2025).

Had an awesome time meeting fellow tech enthusiasts (DevOps engineers) and learning about cool ways to monitor and understand data better.
Big shoutout to the Grafana Labs community and Presidio for hosting such a great event!

The sandwich and juice were nice 😋

Talk Summary,

1⃣ Making Data Collection Easier with Grafana Alloy
Dinesh J. and Krithika R shared how Grafana Alloy, combined with Open Telemetry, makes it super simple to collect and manage data for better monitoring.

2⃣ Running Grafana in Kubernetes
Lakshmi Narasimhan Parthasarathy (https://lnkd.in/gShxtucZ) showed how to set up Grafana in Kubernetes in 4 different ways (vanilla, helm chart, grafana operator, kube-prom-stack). He is building a SaaS product https://lnkd.in/gSS9XS5m (Heroku on your own servers).

3⃣ Observability for Frontend Apps with Grafana Faro
Selvaraj Kuppusamy showed how Grafana Faro can help frontend developers monitor what’s happening on websites and apps in real time. This makes it easier to spot and fix issues quickly. We were able to see Core Web Vitals and traces too. I was surprised about this.

Techies i had interaction with,

Prasanth Baskar, who is an open source contributor at Cloud Native Computing Foundation (CNCF) on the project https://lnkd.in/gmHjt9Bs. I was also happy to know that he knows **parottasalna** (that’s me) and has read some of the blogs. Happy to hear that.

Selvaraj Kuppusamy, DevOps engineer, who is also conducting the Grafana and Friends chapter in Coimbatore on Feb 1. I will attend that as well.

Saranraj Chandrasekaran, who is also a DevOps engineer. Had a chat with him on DevOps and related stuff.

To all of them, I shared about KanchiLUG (https://lnkd.in/gasCnxXv), Parottasalna (https://parottasalna.com/) and my channel on tech https://lnkd.in/gKcyE-b5.

Thanks Achanandhi M for organising this wonderful meetup. You did well. I came to know Achanandhi M from Medium. He regularly writes blogs on cloud-related stuff. https://lnkd.in/ghUS-GTc Checkout his blog.

Also, he shared some tasks for us,

1. Create your first Grafana dashboard.
Objective: Create a basic Grafana dashboard to visualize data in various formats such as tables, charts and graphs. Also, try to connect to multiple data sources to get diverse data for your dashboard.

2. Monitor your Linux system’s health with Prometheus, Node Exporter and Grafana.
Objective: Use Prometheus, Node Exporter and Grafana to monitor your Linux machine’s health by tracking key metrics like CPU, memory and disk usage.


3. Using Grafana Faro to track User Actions (Like Button Clicks) and Identify the Most Used Features.

Give these a try.

RSVP for RabbitMQ: Build Scalable Messaging Systems in Tamil

24 January 2025 at 11:21

Hi All,

Invitation to RabbitMQ Session

🔹 Topic: RabbitMQ: Asynchronous Communication
🔹 Date: Feb 2 Sunday
🔹 Time: 10:30 AM to 1 PM
🔹 Venue: Online. Will be shared in mail after RSVP.

Join us for an in-depth session on RabbitMQ in தமிழ், where we’ll explore,

  • Message queuing fundamentals
  • Connections, channels, and virtual hosts
  • Exchanges, queues, and bindings
  • Publisher confirmations and consumer acknowledgments
  • Use cases and live demos

Whether you’re a developer, DevOps enthusiast, or curious learner, this session will empower you with the knowledge to build scalable and efficient messaging systems.

📌 Don’t miss this opportunity to level up your messaging skills!

RSVP closed !

Our Previous Monthly Meets: https://www.youtube.com/watch?v=cPtyuSzeaa8&list=PLiutOxBS1MizPGGcdfXF61WP5pNUYvxUl&pp=gAQB

Our Previous Sessions,

  1. Python – https://www.youtube.com/watch?v=lQquVptFreE&list=PLiutOxBS1Mizte0ehfMrRKHSIQcCImwHL&pp=gAQB
  2. Docker – https://www.youtube.com/watch?v=nXgUBanjZP8&list=PLiutOxBS1Mizi9IRQM-N3BFWXJkb-hQ4U&pp=gAQB
  3. Postgres – https://www.youtube.com/watch?v=04pE5bK2-VA&list=PLiutOxBS1Miy3PPwxuvlGRpmNo724mAlt&pp=gAQB

Learning Notes #65 – Application Logs, Metrics, MDC

21 January 2025 at 05:45

I am a big fan of logs. I would like to log everything: all the requests and responses of an API. But is it correct? Though logs helped our team greatly during this new year, I wanted to know whether there is a better approach to logging. That search made this blog. In this blog, I jot down my notes on logging. Let’s log it.

Throughout this blog, I try to generalize things, not biased to a particular language. But here and there you can see me biased towards Python. Also, this is my opinion, not a hard rule.

Which is a best logger ?

I’m not here to argue about which logger is the best, they all have their problems. But the worst one is usually the one you build yourself. Sure, existing loggers aren’t perfect, but trying to create your own is often a much bigger mistake.

1. Why Logging Matters

Logging provides visibility into your application’s behavior, helping to,

  • Diagnose and troubleshoot issues (This is most common usecase)
  • Monitor application health and performance (Metrics)
  • Meet compliance and auditing requirements (Audit Logs)
  • Enable debugging in production environments (we all do this.)

However, poorly designed logging strategies can lead to excessive log volumes, higher costs, and difficulty in pinpointing actionable insights.

2. Logging Best Practices

a. Use Structured Logs

Long story short, instead of unstructured plain text, use JSON or other structured formats. This makes parsing and querying easier, especially in log aggregation tools.


{
  "timestamp": "2025-01-20T12:34:56Z",
  "level": "INFO",
  "message": "User login successful",
  "userId": 12345,
  "sessionId": "abcde12345"
}

b. Leverage Logging Levels

Define and adhere to appropriate logging levels to avoid log bloat:

  • DEBUG: Detailed information for debugging.
  • INFO: General operational messages.
  • WARNING: Indications of potential issues.
  • ERROR: Application errors that require immediate attention.
  • CRITICAL: Severe errors leading to application failure.
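
For instance, a minimal Python setup where the configured level filters out the noisier messages,

import logging

logging.basicConfig(level=logging.INFO)  # DEBUG messages are filtered out
logger = logging.getLogger("payments")

logger.debug("Cart contents: %s", {"item": 1})   # suppressed at INFO level
logger.info("Payment initiated for user_id=%s", 12345)
logger.warning("Retrying payment gateway call")
logger.error("Payment failed for user_id=%s", 12345)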

c. Avoid Sensitive Data

Sanitize your logs to exclude sensitive information like passwords, PII, or API keys. Instead, mask or hash such data. Don’t add tokens, even for testing.
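
A small sketch of masking before logging; mask_token is a hypothetical helper, not a library function,

import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def mask_token(token: str, visible: int = 4) -> str:
    """Hypothetical helper: hide all but the last few characters."""
    return "*" * (len(token) - visible) + token[-visible:]

api_key = "sk-live-abcdef123456"  # dummy value for illustration
logger.info("Calling provider with key=%s", mask_token(api_key))
# Logs: Calling provider with key=****************3456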


d. Include Contextual Information

Incorporate metadata like request IDs, user IDs, or transaction IDs to trace specific events effectively.


3. Log Ingestion at Scale

As applications scale, log ingestion can become a bottleneck. Here’s how to manage it,

a. Centralized Logging

Stream logs to centralized systems like Elasticsearch, Logstash, Kibana (ELK), or cloud-native services like AWS CloudWatch, Azure Monitor, or Google Cloud Logging.

b. Optimize Log Volume

  • Log only necessary information.
  • Use log sampling to reduce verbosity in high-throughput systems.
  • Rotate logs to limit disk usage.

c. Use Asynchronous Logging

Asynchronous loggers improve application performance by delegating logging tasks to separate threads or processes. (Not suitable all the time; it has its own problems.)
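
A minimal sketch using Python’s standard QueueHandler and QueueListener, which move the actual writing to a background thread,

import logging
import queue
from logging.handlers import QueueHandler, QueueListener

log_queue = queue.Queue(-1)  # unbounded queue between app and writer threads

# The application thread only enqueues records; a background thread writes them
queue_handler = QueueHandler(log_queue)
console_handler = logging.StreamHandler()
listener = QueueListener(log_queue, console_handler)

logger = logging.getLogger("app")
logger.setLevel(logging.INFO)
logger.addHandler(queue_handler)

listener.start()
logger.info("Handled on a background thread, not the request path")
listener.stop()  # flushes remaining records on shutdown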

d. Method return values are usually important

If you have a log in the method and don’t include the return value of the method, you’re missing important information. Make an effort to include that at the expense of slightly less elegant looking code.

e. Include filename in error messages

Mention the path/to/file:line-number to pinpoint the location of the issue.

4. Logging Don’ts

a. Don’t Log Everything at the Same Level

Logging all messages at the INFO or DEBUG level creates noise and makes it difficult to identify critical issues.

b. Don’t Hardcode Log Messages

Avoid static, vague, or generic log messages. Use dynamic and descriptive messages that include relevant context.

# Bad Example
Error occurred.

# Good Example
Error occurred while processing payment for user_id=12345, transaction_id=abc-6789.

c. Don’t Log Sensitive or Regulated Data

Exposing personally identifiable information (PII), passwords, or other sensitive data in logs can lead to compliance violations (e.g., GDPR, HIPAA).

d. Don’t Ignore Log Rotation

Failing to implement log rotation can result in disk space exhaustion, especially in high-traffic systems (log retention).
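
In Python, the standard library already covers this; a minimal sketch with RotatingFileHandler,

import logging
from logging.handlers import RotatingFileHandler

# Keep at most 5 files of ~10 MB each; the oldest file is deleted on rollover
handler = RotatingFileHandler("app.log", maxBytes=10 * 1024 * 1024, backupCount=5)
logger = logging.getLogger("app")
logger.setLevel(logging.INFO)
logger.addHandler(handler)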

e. Don’t Overlook Log Correlation

Logs without request IDs, session IDs, or contextual metadata make it difficult to correlate related events.

f. Don’t Forget to Monitor Log Costs

Logging everything without considering storage and processing costs can lead to financial inefficiency in large-scale systems.

g. Keep the log message short

Long and verbose messages are a cost. The cost is in reading time and ingestion time.

h. Never log inside a loop

This might seem obvious, but just to be clear -> logging inside a loop, even if the log level isn’t visible by default, can still hurt performance. It’s best to avoid this whenever possible.

If you absolutely need to log something at a hidden level and decide to break this guideline, keep it short and straightforward.

i. Log items you already “have”

We should avoid this,


logger.info("Reached X and value of method is {}", method());

Here, just for logging purposes, we are calling method() again, even if the method is cheap. You’re effectively running the method regardless of the logging level!

j. Don’t log iterables

Even if it’s a small list. The concern is that the list might grow and “overcrowd” the log. Writing the content of the list to the log can balloon it up and slow processing noticeably. Also kills time in debugging.

k. Don’t Log What the Framework Logs for You

There are great things to log. E.g. the name of the current thread, the time, etc. But those are already written into the log by default almost everywhere. Don’t duplicate these efforts.

l. Don’t log Method Entry/Exit

Log only important events in the system. Entering or exiting a method isn’t an important event. E.g. if I have a method that enables feature X, the log should be “Feature X enabled” and not “enable_feature_X entered”. I have done this a lot.

m. Don’t fill the method

A complex method might include multiple points of failure, so it makes sense that we’d place logs at multiple points in the method so we can detect the failure along the way. Unfortunately, this leads to duplicate logging and verbosity.

Errors will typically map to error handling code, which should be logged generically. So all error conditions should already be covered.

This creates situations where we sometimes need to change the flow/behavior of the code so that logging will be more elegant.

n. Don’t use AOP logging

AOP (Aspect-Oriented Programming) logging allows you to automatically add logs at specific points in your application, such as when methods are entered or exited.

In Python, AOP-style logging can be implemented using decorators or middleware that inject logs into specific points, such as method entry and exit. While it might seem appealing for detailed tracing, the same problems apply as in other languages like Java.


import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def log_method_entry_exit(func):
    def wrapper(*args, **kwargs):
        logger.info(f"Entering: {func.__name__} with args={args} kwargs={kwargs}")
        result = func(*args, **kwargs)
        logger.info(f"Exiting: {func.__name__} with result={result}")
        return result
    return wrapper

# Example usage
@log_method_entry_exit
def example_function(x, y):
    return x + y

example_function(5, 3)

Why Avoid AOP Logging in Python

  1. Performance Impact:
    • Injecting logs into every method increases runtime overhead, especially if used extensively in large-scale systems.
    • In Python, where function calls already add some overhead, this can significantly affect performance.
  2. Log Verbosity:
    • If this decorator is applied to every function or method in a system, it produces an enormous amount of log data.
    • Debugging becomes harder because the meaningful logs are lost in the noise of entry/exit logs.
  3. Limited Usefulness:
    • During local development, tools like Python debuggers (pdb), profilers (cProfile, line_profiler), or tracing libraries like trace are far more effective for inspecting function behavior and performance.
  4. CI Issues:
    • Enabling such verbose logging during CI test runs can make tracking test failures more difficult because the logs are flooded with entry/exit messages, obscuring the root cause of failures.

Use Python-specific tools like pdb, ipdb, or IDE-integrated debuggers to inspect code locally.

o. Don’t double log

It’s pretty common to log an error when we’re about to throw an error. However, since most error code is generic, it’s likely there’s a log in the generic error handling code.

5. Ensuring Scalability

To keep your logging system robust and scalable,

  • Monitor Log Storage: Set alerts for log storage thresholds.
  • Implement Compression: Compress log files to reduce storage costs.
  • Automate Archival and Deletion: Regularly archive old logs and purge obsolete data.
  • Benchmark Logging Overhead: Measure the performance impact of logging on your application.

6. Logging for Metrics

Below is the list of items that I wish could be logged for metrics.

General API Metrics

  1. General API Metrics on HTTP methods, status codes, latency/duration, request size.
  2. Total requests per endpoint over time. Requests per minute/hour.
  3. Frequency and breakdown of 4XX and 5XX errors.
  4. User ID or API client making the request.

{
  "timestamp": "2025-01-20T12:34:56Z",
  "endpoint": "/projects",
  "method": "POST",
  "status_code": 201,
  "user_id": 12345,
  "request_size_bytes": 512,
  "response_size_bytes": 256,
  "duration_ms": 120
}

Business Specific Metrics

  1. Objects (session) creations: No. of projects created (daily/weekly)
  2. Average success/failure rate.
  3. Average time to create a session.
  4. Frequency of each action on top of session.

{
  "timestamp": "2025-01-20T12:35:00Z",
  "endpoint": "/projects/12345/actions",
  "action": "edit",
  "status_code": 200,
  "user_id": 12345,
  "duration_ms": 98
}

Performance Metrics

  1. Database query metrics on execution time, no. of queries per request.
  2. Third party service metrics on time spent, success/failure rates of external calls.

{
  "timestamp": "2025-01-20T12:37:15Z",
  "endpoint": "/projects/12345",
  "db_query_time_ms": 45,
  "external_api_time_ms": 80,
  "status_code": 200,
  "duration_ms": 130
}

Scalability Metrics

  1. Concurrency metrics on max request handled.
  2. Request queue times during load.
  3. System Metrics on CPU and Memory usage during request processing (this will be auto captured).

Usage Metrics

  1. Traffic analysis on peak usage times.
  2. Most/Least used endpoints.

7. Mapped Diagnostic Context (MDC)

MDC is the one I longed for most. I also ran into trouble by implementing it without a middleware.

Mapped Diagnostic Context (MDC) is a feature provided by many logging frameworks, such as Logback, Log4j, and SLF4J. It allows developers to attach contextual information (key-value pairs) to the logging events, which can then be automatically included in log messages.

This context helps in differentiating and correlating log messages, especially in multi-threaded applications.

Why Use MDC?

  1. Enhanced Log Clarity: By adding contextual information like user IDs, session IDs, or transaction IDs, MDC enables logs to provide more meaningful insights.
  2. Easier Debugging: When logs contain thread-specific context, tracing the execution path of a specific transaction or user request becomes straightforward.
  3. Reduced Log Ambiguity: MDC ensures that logs from different threads or components do not get mixed up, avoiding confusion.

Common Use Cases

  1. Web Applications: Logging user sessions, request IDs, or IP addresses to trace the lifecycle of a request.
  2. Microservices: Propagating correlation IDs across services for distributed tracing.
  3. Background Tasks: Tracking specific jobs or tasks in asynchronous operations.

Limitations (curated from other blogs; I haven’t tried these yet)

  1. Thread Boundaries: MDC is thread-local, so its context does not automatically propagate across threads (e.g., in asynchronous executions). For such scenarios, you may need to manually propagate the MDC context.
  2. Overhead: Adding and managing MDC context introduces a small runtime overhead, especially in high-throughput systems.
  3. Configuration Dependency: Proper MDC usage often depends on correctly configuring the logging framework.


2025-01-21 14:22:15.123 INFO  [thread-1] [userId=12345, transactionId=abc123] Starting transaction
2025-01-21 14:22:16.456 DEBUG [thread-1] [userId=12345, transactionId=abc123] Processing request
2025-01-21 14:22:17.789 ERROR [thread-1] [userId=12345, transactionId=abc123] Error processing request: Invalid input
2025-01-21 14:22:18.012 INFO  [thread-1] [userId=12345, transactionId=abc123] Transaction completed

In FastAPI, we can implement this via a middleware, using contextvars so that each request’s values stay isolated even under concurrency,


import contextvars
import logging
import uuid

from fastapi import FastAPI, Request
from starlette.middleware.base import BaseHTTPMiddleware

# Context variables carry the MDC values; unlike attributes stuck on a shared
# logger object, they are safe across concurrent requests.
user_id_var = contextvars.ContextVar("user_id", default="unknown")
transaction_id_var = contextvars.ContextVar("transaction_id", default="unknown")

# Configure the logger
logger = logging.getLogger("uvicorn")
logger.setLevel(logging.INFO)

# Custom formatter that injects the MDC values into every record
class CustomFormatter(logging.Formatter):
    def format(self, record):
        record.user_id = user_id_var.get()
        record.transaction_id = transaction_id_var.get()
        return super().format(record)

# Set the logging format with MDC keys
formatter = CustomFormatter(
    "%(asctime)s %(levelname)s [%(threadName)s] [userId=%(user_id)s, transactionId=%(transaction_id)s] %(message)s"
)

# Apply the formatter to the handler
console_handler = logging.StreamHandler()
console_handler.setFormatter(formatter)
logger.addHandler(console_handler)

# FastAPI application
app = FastAPI()

# Custom middleware that populates the MDC context for every request
class RequestContextMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        user_id_var.set(request.headers.get("X-User-ID", "default-user"))
        transaction_id_var.set(str(uuid.uuid4()))

        logger.info("Request started")
        response = await call_next(request)
        logger.info("Request finished")
        return response

# Add custom middleware to the FastAPI app
app.add_middleware(RequestContextMiddleware)

@app.get("/")
async def read_root():
    logger.info("Handling the root endpoint.")
    return {"message": "Hello, World!"}

@app.get("/items/{item_id}")
async def read_item(item_id: int):
    logger.info(f"Fetching item with ID {item_id}")
    return {"item_id": item_id}

Hope you now have a better idea of logging.

Learning Notes #64 – E-Tags and Last-Modified Headers

20 January 2025 at 16:57

This morning, I started with a video on E-Tags (it came up first in my YouTube suggestions). In this blog, I jot down my notes on E-Tags and how they help in saving bandwidth, and also how the Last-Modified header compares to E-Tags.

In the world of web development, ensuring efficient resource management and improved performance is crucial. Two key mechanisms that help in achieving this are E-Tags (Entity Tags) and the Last-Modified header.

These HTTP features facilitate caching and conditional requests, reducing bandwidth usage and improving user experience.

What is an E-Tag?

An Entity Tag (E-Tag) is an HTTP header used for web cache validation. It acts as a unique identifier for a specific version of a resource on the server. When a resource changes, its E-Tag also changes, enabling clients (e.g., browsers) to determine if their cached version of the resource is still valid.

How E-Tags Work

1. Response with E-Tag: When a client requests a resource, the server responds with the resource and an E-Tag in the HTTP header.


HTTP/1.1 200 OK
ETag: "abc123"
Content-Type: application/json
Content-Length: 200

2. Subsequent Requests: On subsequent requests, the client includes the E-Tag in the If-None-Match header.


GET /resource HTTP/1.1
If-None-Match: "abc123"

3. Server Response

If the resource hasn’t changed, the server responds with a 304 Not Modified status, saving bandwidth,


HTTP/1.1 304 Not Modified

If the resource has changed, the server responds with a 200 OK status and a new E-Tag,


HTTP/1.1 200 OK
ETag: "xyz789"

Benefits of E-Tags

  • Precise cache validation based on resource version.
  • Reduced bandwidth usage as unchanged resources are not re-downloaded.
  • Improved user experience with faster loading times for unchanged resources.

What is the Last-Modified Header?

The Last-Modified header indicates the last time a resource was modified on the server. It’s a simpler mechanism compared to E-Tags but serves a similar purpose in caching and validation.

How Last-Modified Works

1. Response with Last-Modified: When a client requests a resource, the server includes the Last-Modified header in its response,


HTTP/1.1 200 OK
Last-Modified: Wed, 17 Jan 2025 10:00:00 GMT
Content-Type: image/png
Content-Length: 1024

2. Subsequent Requests: On future requests, the client includes the If-Modified-Since header.


GET /image.png HTTP/1.1
If-Modified-Since: Wed, 17 Jan 2025 10:00:00 GMT

3. Server Response

If the resource hasn’t changed, the server responds with a 304 Not Modified status,


HTTP/1.1 304 Not Modified

If the resource has changed, the server sends the updated resource with a new Last-Modified value,


HTTP/1.1 200 OK
Last-Modified: Thu, 18 Jan 2025 12:00:00 GMT

E-Tags and Last-Modified headers are powerful tools for improving web application performance. By enabling conditional requests and efficient caching, they reduce server load and bandwidth usage while enhancing the user experience. Remember, these two are pretty old mechanisms, and they are still used to date.
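
To make the flow concrete, here is a hedged FastAPI sketch of the server side; the /resource endpoint, the in-memory payload, and the hash-based E-Tag are illustrative choices, not the only way to compute one,

import hashlib
import json

from fastapi import FastAPI, Request, Response

app = FastAPI()

# Illustrative in-memory resource; a real app would read from a database
RESOURCE = {"message": "Hello, World!"}

def compute_etag(body: bytes) -> str:
    # Strong E-Tag derived from the response body
    return '"' + hashlib.sha256(body).hexdigest() + '"'

@app.get("/resource")
async def get_resource(request: Request):
    body = json.dumps(RESOURCE).encode()
    etag = compute_etag(body)

    # Client already has this version: reply 304 and skip the body
    if request.headers.get("If-None-Match") == etag:
        return Response(status_code=304, headers={"ETag": etag})

    return Response(content=body, media_type="application/json",
                    headers={"ETag": etag})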

Learning Notes #63 – Change Data Capture. What does it do ?

19 January 2025 at 16:22

A few days back, I came across the concept of CDC. It’s like a notifier of database events. Instead of polling, this enables events to be available in a queue, which can be consumed by many consumers. In this blog, I try to explain the concepts and types in a theoretical manner.

You run a library. Every day, books are borrowed, returned, or new books are added. What if you wanted to keep a live record of all these activities so you always know the exact state of your library?

This is essentially what Change Data Capture (CDC) does for your databases. It’s a way to track changes (like inserts, updates, or deletions) in your database tables and send them to another system, like a live dashboard or a backup system. (Might be a bad example. Don’t lose hope. Continue …)

CDC is widely used in modern technology to power,

  • Real-Time Analytics: Live dashboards that show sales, user activity, or system performance.
  • Data Synchronization: Keeping multiple databases or microservices in sync.
  • Event-Driven Architectures: Triggering notifications, workflows, or downstream processes based on database changes.
  • Data Pipelines: Streaming changes to data lakes or warehouses for further processing.
  • Backup and Recovery: Incremental backups by capturing changes instead of full data dumps.

It’s a critical part of tools like Debezium, Kafka, and cloud services such as AWS Database Migration Service (DMS) and Azure Data Factory. CDC enables companies to move towards real-time data-driven decision-making.

What is CDC?

CDC stands for Change Data Capture. It’s a technique that listens to a database and captures every change that happens in it. These changes can then be sent to other systems to,

  • Keep data in sync across multiple databases.
  • Power real-time analytics dashboards.
  • Trigger notifications for certain database events.
  • Process data streams in real time.

In short, CDC ensures your data is always up-to-date wherever it’s needed.

Why is CDC Useful?

Imagine you have an online store. Whenever someone,

  • Places an order,
  • Updates their shipping address, or
  • Cancels an order,

you need these changes to be reflected immediately across,

  • The shipping system.
  • The inventory system.
  • The email notification service.

Instead of having all these systems constantly query the database (which is slow and inefficient, and one of the main reasons CDC exists), CDC automatically streams these changes to the relevant systems.

This means,

  1. Real-Time Updates: Systems receive changes instantly.
  2. Improved Performance: Your database isn’t overloaded with repeated queries.
  3. Consistency: All systems stay in sync without manual intervention.

How Does CDC Work?

Note: I haven’t yet tried all of these, but I have a conceptual feel for them.

CDC relies on tracking changes in your database. There are a few ways to do this,

1. Query-Based CDC

This method repeatedly checks the database for changes. For example:

  • Every 5 minutes, it queries the database: “What changed since my last check?”
  • Any new or modified data is identified and processed.

Drawbacks: This can miss changes if the timing isn’t right, and it’s not truly real-time (it’s effectively polling).

2. Log-Based CDC

Most modern databases (like PostgreSQL or MySQL) keep logs of every operation. Log-based CDC listens to these logs and captures changes as they happen.

Advantages

  • It’s real-time.
  • It’s lightweight since it doesn’t query the database directly.

3. Trigger-Based CDC

In this method, the database uses triggers to log changes into a separate table. Whenever a change occurs, a trigger writes a record of it.

Advantages: Simple to set up.

Drawbacks: Can slow down the database if not carefully managed.

Tools That Make CDC Easy

Several tools simplify CDC implementation. Some popular ones are,

  1. Debezium: Open-source and widely used for log-based CDC with databases like PostgreSQL, MySQL, and MongoDB.
  2. Striim: A commercial tool for real-time data integration.
  3. AWS Database Migration Service (DMS): A cloud-based CDC service.
  4. StreamSets: Another tool for real-time data movement.

These tools integrate with databases, capture changes, and deliver them to systems like RabbitMQ, Kafka, or cloud storage.

To help visualize CDC, think of,

  • Social Media Feeds: When someone likes or comments on a post, you see the update instantly. This is CDC in action.
  • Bank Notifications: Whenever you make a transaction, your bank app updates instantly. Another example of CDC.

In upcoming blogs, I will include a Debezium implementation of CDC.

Learning Notes #62 – Serverless – Just like riding a taxi

19 January 2025 at 04:55

What is Serverless Computing?

Serverless computing allows developers to run applications without having to manage the underlying infrastructure. You write code, deploy it, and the cloud provider takes care of the rest, from provisioning servers to scaling applications.

Popular serverless platforms include AWS Lambda, Azure Functions, and Google Cloud Functions.

The Taxi Analogy

Imagine traveling to a destination. There are multiple ways to get there,

  1. Owning a Car (Traditional Servers): You own and maintain your car. This means handling maintenance, fuel, insurance, parking, and everything else that comes with it. It’s reliable and gives you control, but it’s also time-consuming and expensive to manage.
  2. Hiring a Taxi (Serverless): With a taxi, you simply book a ride when you need it. You don’t worry about maintaining the car, fueling it, or where it’s parked afterward. You pay only for the distance traveled, and the service scales to your needs whether you’re alone or with friends.

Why Serverless is Like Taking a Taxi ?

  1. No Infrastructure Management – With serverless, you don’t have to manage or worry about servers, just like you don’t need to maintain a taxi.
  2. Pay-As-You-Go – In a taxi, you pay only for the distance traveled. Similarly, in serverless, you’re billed only for the compute time your application consumes.
  3. On-Demand Availability – Need a ride at midnight? A taxi is just a booking away. Serverless functions work the same way: available whenever you need them, scaling up or down as required.
  4. Scalability – Whether you’re a solo traveler or part of a group, taxis can adapt by providing a small car or a larger vehicle. Serverless computing scales resources automatically based on traffic, ensuring optimal performance.
  5. Focus on the Destination – When you take a taxi, you focus on reaching your destination without worrying about the vehicle. Serverless lets you concentrate on writing and deploying code rather than worrying about servers.

Key Benefits of Serverless (and Taxi Rides)

  • Cost-Effectiveness – Avoid upfront costs. No need to buy servers (or cars) you might not fully utilize.
  • Flexibility – Serverless platforms support multiple programming languages and integrations. Taxis, too, come in various forms: regular cars, SUVs, and even luxury rides for special occasions.
  • Reduced Overhead – Free yourself from maintenance tasks, whether it’s patching servers or checking tire pressure.

When Not to Choose Serverless (or a Taxi)

  1. Predictable, High-Volume Usage – Owning a car might be cheaper if you’re constantly on the road. Similarly, for predictable and sustained workloads, traditional servers or containers might be more cost-effective than serverless.
  2. Special Requirements – Need a specific type of vehicle, like a truck for moving furniture? Owning one might make sense. Similarly, applications with unique infrastructure requirements may not be a perfect fit for serverless.
  3. Latency Sensitivity – Taxis take time to arrive after booking. Likewise, serverless functions may experience cold starts, adding slight delays. For ultra-low-latency applications, other architectures may be preferable.