
Neuralink: Send humans to the next level

28 December 2024 at 07:26

Hi guys, today's topic feels like something straight out of a science-fiction movie, but it has become a reality through Elon Musk's company “Neuralink”. Elon Musk is a person who always thinks differently, and that different thinking leads to different achievements. One such achievement is “Neuralink”, a Brain-Computer Interface, alias “BCI”.

Neuralink is a neurotechnology company co-founded by Elon Musk in 2016. Guys, it's neurotechnology, not the nanotechnology founded and used by Iron Man, alias Tony Stark. Ok, jokes apart. The primary goal of the company is to create a direct communication pathway between the human brain and a computer to enhance human capabilities. Put simply, Neuralink is a small, advanced chip designed to be implanted in the cerebrum of the human brain, giving the brain a pathway to a computer via Bluetooth.

How do they implant the Chip in the Brain?

Humans make mistakes, but a mistake here would endanger the patient's life. For this reason, the company specially developed and manufactured a machine called the R1 Robot. It handles the entire operation and is programmed precisely for this surgery.

For more details about the implantation of the chip in the brain, watch the video below 👇

How does it work?

Inside the chip are ultra-thin, flexible threads, 1,024 or more, that detect the electrical signals generated by neurons. You might be wondering, “Why detect electrical signals generated by neurons?” Because neurons are what transmit data all over the human body, like electrical wires carrying current. When our brain says to walk, the neurons transmit the signal to the legs, and it happens.

Did you know that AI neural networks were actually created with inspiration from humans?

The electrical signals are then sent to the chip. The chip processes the signals and converts them into digital data, transmitted wirelessly to external devices via Bluetooth. This allows bidirectional communication, where the brain can send signals to devices and receive input in return. The system uses advanced algorithms and machine learning to decode the neural data.

What are the use cases?

Neuralink has enormous use cases. Not everyone in the world is born with perfect vision, speech, or physical abilities. Many individuals face challenges from birth, such as blindness, deafness, and paralysis; for some, these came from accidents. Come on guys, try to feel that for a moment. How painful must it be? Today we are barely ready to live without our mobile phones; how must this world feel for those people?

But this can be corrected by Neuralink's technology. How?

With Neuralink, a specialized camera acts as an eye for blind people, and an external microphone acts as an ear for deaf people. Likewise, it offers solutions for many other physical impairments.

Now many of you may wonder how it could be useful for the rest of us. Just imagine, guys, like in Hollywood movies: the eye acts as a camera, and when humans forget anything, they can simply recall it. Imagine how it would feel if secondary memory lived not in a laptop or system but in our own brain, where we save details just as we save them on our various devices.

Reading this feels like a sci-fi movie, but it is becoming reality. Nowadays many people fear that AI and humanoids will become like the “Terminator” movies. That is not really possible, because anyone working on AI should first learn about the ethical concerns in AI. There are standalone courses and training on it, and AI should only be trained within those ethical boundaries. So don't worry, guys; back to the topic.

Cons of Neuralink?

Now let's see the dark side of Neuralink. Every technology has a light side and a dark side, right? The best example is in our hands: the Internet. The Internet is a good tool for learning anything in the world, but it is also a tool that can harm people. Likewise, Neuralink has its dark side.

If we had to sum it up in a few words, it would be “safety, privacy, and control”.

In the end, it's a technology, right?
Every technology is hackable and vulnerable. If the chip integrated into your brain were hacked, we would become toys for the hackers; they could do anything using us. It is entirely possible.
Then the next concern: it affects our privacy just as easily.
Because the eye becomes a camera, everything we see could be recorded. That's a threat, right? If the company wanted to look into any of our private moments, it easily could, because Neuralink is not operated by a single person; it is operated by a whole company. So, it's a big question mark for privacy.

Likewise, there are several other problems with it.

I hope this blog explained Neuralink in a good manner.

If you find this content valuable, follow me for more upcoming blogs.

Connect with Me:

Achieving Better User Engagement via Realistic Load Testing in K6

1 March 2025 at 05:55

Introduction

Load testing is essential to evaluate how a system behaves under expected and peak loads. Traditionally, we rely on metrics like requests per second (RPS), response time, and error rates. However, an insightful approach called Average Load Testing has been discussed recently. This blog explores that concept in detail, providing practical examples to help you apply it effectively.

Understanding Average Load Testing

Average Load Testing focuses on simulating real-world load patterns rather than traditional peak-load tests. Instead of looping a fixed pool of virtual users as fast as possible, this approach:

  • Generates requests based on the average concurrency over time.
  • More accurately reflects real-world traffic patterns.
  • Helps identify performance bottlenecks in a realistic manner.

Setting Up Load Testing with K6

K6 is an excellent tool for implementing Average Load Testing. Let’s go through practical examples of setting up such tests.

Install K6

brew install k6  # macOS
sudo apt install k6  # Ubuntu/Debian
docker pull grafana/k6  # Using Docker

Example 1: Basic K6 Script for Average Load Testing

import http from 'k6/http';
import { sleep } from 'k6';

export let options = {
  scenarios: {
    avg_load: {
      executor: 'constant-arrival-rate',
      rate: 10, // 10 requests per second
      timeUnit: '1s',
      duration: '2m',
      preAllocatedVUs: 20,
      maxVUs: 50,
    },
  },
};

export default function () {
  let res = http.get('https://test-api.example.com');
  console.log(`Response time: ${res.timings.duration}ms`);
  sleep(1);
}

Explanation

  • The constant-arrival-rate executor ensures a steady request rate.
  • rate: 10 sends 10 requests per second.
  • duration: '2m' runs the test for 2 minutes.
  • preAllocatedVUs: 20 and maxVUs: 50 define virtual users needed to sustain the load.
  • The script logs response times to the console.

Example 2: Testing with Varying Load

To better reflect real-world scenarios, we can use the ramping arrival rate executor to simulate gradual increases in traffic:

import http from 'k6/http';
import { sleep } from 'k6';

export let options = {
  scenarios: {
    ramping_load: {
      executor: 'ramping-arrival-rate',
      startRate: 5, // Start with 5 requests/sec
      timeUnit: '1s',
      preAllocatedVUs: 50,
      maxVUs: 100,
      stages: [
        { duration: '1m', target: 20 },
        { duration: '2m', target: 50 },
        { duration: '3m', target: 100 },
      ],
    },
  },
};

export default function () {
  let res = http.get('https://test-api.example.com');
  console.log(`Response time: ${res.timings.duration}ms`);
  sleep(1);
}

Explanation

  • The ramping-arrival-rate gradually increases requests per second over time.
  • The stages array defines a progression from 5 to 100 requests/sec over 6 minutes.
  • Logs response times to help analyze system performance.

Example 3: Load Testing with Multiple Endpoints

In real applications, multiple endpoints are often tested simultaneously. Here's how to test different API routes:

import http from 'k6/http';
import { check, sleep } from 'k6';

export let options = {
  scenarios: {
    multiple_endpoints: {
      executor: 'constant-arrival-rate',
      rate: 15, // 15 requests per second
      timeUnit: '1s',
      duration: '2m',
      preAllocatedVUs: 30,
      maxVUs: 60,
    },
  },
};

export default function () {
  let urls = [
    'https://test-api.example.com/users',
    'https://test-api.example.com/orders',
    'https://test-api.example.com/products'
  ];
  
  let res = http.get(urls[Math.floor(Math.random() * urls.length)]);
  check(res, {
    'is status 200': (r) => r.status === 200,
  });
  console.log(`Response time: ${res.timings.duration}ms`);
  sleep(1);
}

Explanation

  • The script randomly selects an API endpoint to test different routes.
  • Uses check to ensure status codes are 200.
  • Logs response times for deeper insights.

Analyzing Results

To analyze test results, you can store logs or metrics in a database or monitoring tool and visualize trends over time. Some popular options include:

  • Prometheus for time-series data storage.
  • InfluxDB for handling large-scale performance metrics.
  • ELK Stack (Elasticsearch, Logstash, Kibana) for log-based analysis.
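Analysis goals can also be encoded in the test itself with k6's thresholds feature, which marks a run as failed when built-in metrics such as http_req_duration or http_req_failed violate your targets. The snippet below shows the idea as a plain object so it can be read outside the k6 runtime; in a real script it would be the exported `options` alongside a scenario.

```javascript
// Threshold criteria for a k6 test: the run is marked failed if these
// service-level objectives are violated. In a real k6 script this object
// would be exported as `options` next to the scenarios shown above.
const options = {
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95% of requests must finish under 500ms
    http_req_failed: ['rate<0.01'],   // error rate must stay below 1%
  },
};

console.log(Object.keys(options.thresholds).join(','));
```

With thresholds in place, a CI pipeline can fail the build on a slow or error-prone deploy without anyone reading dashboards.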

Average Load Testing provides a more realistic way to measure system performance. By leveraging K6, you can create flexible, real-world simulations to optimize your applications effectively.

Golden Feedbacks for Python Sessions 1.0 from last year (2024)

13 February 2025 at 08:49

Many Thanks to Shrini for documenting it last year. This serves as a good reference to improve my skills. Hope it will help many.

What Participants wanted to improve

  • Go a bit slower so that everyone can understand clearly without feeling rushed.
  • Provide more basics and examples to make learning easier for beginners.
  • Spend the first week explaining programming basics so that newcomers don't feel lost.
  • Teach flowcharting methods to help participants understand the logic behind coding.
  • Try teaching Scratch as an interactive way to introduce programming concepts.
  • Offer weekend batches for those who prefer learning on weekends.
  • Encourage more conversations so that participants can actively engage in discussions.
  • Create sub-groups to allow participants to collaborate and support each other.
  • Get “cheerleaders” within the team to make the classes more fun and interactive.
  • Increase promotion efforts to reach a wider audience and get more participants.
  • Provide better examples to make concepts easier to grasp.
  • Conduct more Q&A sessions so participants can ask and clarify their doubts.
  • Ensure that each participant gets a chance to speak and express their thoughts.
  • Showing your face in videos can help in building a more personal connection with the learners.
  • Organize mini-hackathons to provide hands-on experience and encourage practical learning.
  • Foster more interactions and connections between participants to build a strong learning community.
  • Encourage participants to write blogs daily to document their learning and share insights.
  • Motivate participants to give talks in class and other communities to build confidence.

πŸ“ Other Learnings & Suggestions

πŸ“΅ Avoid creating WhatsApp groups for communication, as the 1024 member limit makes it difficult to manage multiple groups.


βœ‰ Telegram works fine for now, but explore using mailing lists as an alternative for structured discussions.


πŸ”• Mute groups when necessary to prevent unnecessary messages like β€œHi, Hello, Good Morning.”


πŸ“’ Teach participants how to join mailing lists like ChennaiPy and KanchiLUG and guide them on asking questions in forums like Tamil Linux Community.


πŸ“ Show participants how to create a free blog on platforms like dev.to or WordPress to share their learning journey.


πŸ›  Avoid spending too much time explaining everything in-depth, as participants should start coding a small project by the 5th or 6th class.


πŸ“Œ Present topics as solutions to project ideas or real-world problem statements instead of just theory.


πŸ‘€ Encourage using names when addressing people, rather than calling them β€œSir” or β€œMadam,” to maintain an equal and friendly learning environment.


πŸ’Έ Zoom is costly, and since only around 50 people complete the training, consider alternatives like Jitsi or Google Meet for better cost-effectiveness.

I will try to incorporate these learnings in our upcoming sessions.

🚀 Let's make this learning experience engaging, interactive, and impactful! 🎯

Learn REST-API

By: krishna
5 February 2025 at 06:01

REST API

A REST API (Representational State Transfer) is a web service that allows different applications to communicate over the internet using standard HTTP methods like GET, POST, PUT, and DELETE. It follows REST principles, making it lightweight, scalable, and easy to use.

REST API Principles

Client-Server Architecture

A REST API follows a client-server architecture, where the client and server are separate. The client sends requests, and the server processes them and responds, allowing different clients like web and mobile apps to communicate with the same backend.

Statelessness

Before understanding statelessness, you need to understand statefulness.

  • Statefulness: The server stores and manages user session data, such as authentication details and recent activities.
  • Statelessness: The server does not store any information. Each request is independent, and the client must include all necessary data, like authentication details and query parameters, in every request.

This behavior makes the server scalable and reliable, and it reduces server load.
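To make the contrast concrete, here is a small framework-free sketch (the `buildRequest` helper is hypothetical): in the stateless style, every request carries its own credentials and parameters, so any server instance can handle it without remembering earlier requests.

```javascript
// Stateless style: each request is self-contained. The token and query
// parameters travel with every request instead of living in a server session.
function buildRequest(path, token, params) {
  return {
    path,
    headers: { Authorization: `Bearer ${token}` }, // auth sent every time
    params,                                        // all needed data included
  };
}

// Two independent requests; neither relies on the server remembering the other.
const r1 = buildRequest('/api/customers', 'abc123', { page: 1 });
const r2 = buildRequest('/api/customers', 'abc123', { page: 2 });

console.log(r1.headers.Authorization, r2.params.page);
```

Because nothing is stored between requests, the two calls could be served by two different server instances behind a load balancer.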

Caching

Caching improves performance by storing responses that can be reused, reducing the need for repeated requests to the server. This minimizes response time and server load.
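As a rough sketch of the idea (an in-memory map, not a specific HTTP caching implementation), repeated requests for the same resource can be answered from a cache so only the first one reaches the server:

```javascript
// A tiny response cache: repeated requests for the same key are served
// from memory instead of hitting the (simulated) server again.
const cache = new Map();
let serverHits = 0;

function fetchWithCache(url) {
  if (cache.has(url)) return cache.get(url); // reuse the stored response
  serverHits += 1;                           // only cache misses reach the server
  const response = `data for ${url}`;
  cache.set(url, response);
  return response;
}

fetchWithCache('/api/customers');
fetchWithCache('/api/customers'); // second call is served from the cache
console.log(serverHits); // 1
```

Real REST caching works the same way in spirit, but is driven by response headers such as Cache-Control rather than an application-level map.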

Uniform Interface

A uniform interface ensures consistency by using standard HTTP methods like GET, POST, PUT, and DELETE. This makes the API predictable and easy to use.

Layered System

A layered system divides tasks into separate layers, like security, caching, and user requests. The client only interacts with the top layer, making the system easier to manage and more flexible.

Start To Code

I use Node.js and some popular packages:

  • Express: A Node.js web framework used for building web applications and APIs.
  • Joi: A package used to validate user input, ensuring data integrity and security.

basic code

const express = require("express");
const app = express();
const joi = require("joi");
app.use(express.json());

// data
const customers = [
    { name: "user1", id: 1 },
    { name: "user2", id: 2 },
    { name: "user3", id: 3 }
];

// listen
const port = process.env.PORT || 8080;
app.listen(port, () => console.log("listening on", port));

// function
function validateUserName(customer) {
    const schema = joi.object({
        name: joi.string().min(3).required()
    });
    return schema.validate(customer);
}

GET

GET is used to retrieve data from the server. The response code is 200 if successful.

app.get('/api/customers',(req,res)=>{
    res.send(customers);
});

get specific user details

app.get('/api/customer/:id', (req,res)=>{
    const user_details = customers.find(user => req.params.id == user.id );
    if(!user_details){
        res.status(404).send("Data Not Found");
    }else{
        res.status(200).send(user_details)
    }
});

POST

The POST method is used to send new data to the server. The response code is 201 on success, and I used the validateUserName function to validate the username.

app.post('/api/customer/add', (req, res) => {
    const { error } = validateUserName(req.body);
    if (error) {
        res.status(400).send(error.details[0].message);
    } else {
        const customer = {
            name: req.body.name,
            id: customers.length + 1
        };
        customers.push(customer);
        res.status(201).send("data inserted successfully");
    }
});


PATCH

The PATCH method is used to partially update existing data. To replace the entire user record, the PUT method should be used.

app.patch('/api/customer/:id', (req, res) => {
    const customer = customers.find(user => user.id == req.params.id);
    const { error } = validateUserName(req.body);
    if (!customer) {
        res.status(404).send("Data Not Found");
    } else if (error) {
        res.status(400).send(error.details[0].message);
    } else {
        customer.name = req.body.name;
        res.status(200).send("successfully updated");
    }
});
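The post notes that PUT replaces the entire record but does not show it. Below is a framework-free sketch of the PUT semantics (the `putCustomer` helper and its inline validation rule are hypothetical stand-ins for the Express route and Joi schema used above), highlighting the replace-everything behaviour that distinguishes PUT from PATCH.

```javascript
// PUT semantics sketch: unlike PATCH, PUT replaces the whole resource.
// `customers` mirrors the in-memory array used in this post.
const customers = [
  { name: 'user1', id: 1 },
  { name: 'user2', id: 2 },
];

function putCustomer(id, body) {
  const index = customers.findIndex(user => user.id === id);
  if (index === -1) return { status: 404, message: 'Data Not Found' };
  if (!body.name || body.name.length < 3) {
    return { status: 400, message: 'name is required (min 3 chars)' };
  }
  // Full replacement: any field the client did not send is gone afterwards.
  customers[index] = { id, name: body.name };
  return { status: 200, message: 'successfully updated' };
}

console.log(putCustomer(2, { name: 'renamed' }).status); // 200
console.log(putCustomer(9, { name: 'ghost' }).status);   // 404
```

In an Express route this logic would sit inside `app.put('/api/customer/:id', ...)`, with Joi doing the validation as in the handlers above.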


DELETE

The DELETE method is used to remove user data.

app.delete('/api/customer/:id', (req, res) => {
    const user = customers.find(user => user.id == req.params.id);
    if (!user) {
        res.status(404).send("Data Not Found");
    } else {
        const index = customers.indexOf(user);
        customers.splice(index, 1);
        res.status(200).send("successfully deleted");
    }
});


What I Learned

CRUD Operations with REST API

I learned the basics of REST API and CRUD operations, including the uniform methods GET, POST, PUT, PATCH, and DELETE.

Status Codes

REST APIs strictly follow status codes:

  • 200 – OK
  • 201 – Created
  • 204 – No content
  • 400 – Bad request
  • 404 – Not found

Joi Package

For server-side validation, the Joi package is used. It helps verify user data easily.

Middleware

Using app.use(express.json()) as middleware ensures that for POST, PATCH, and PUT methods, JSON-formatted user data is parsed into an object accessible via req.body.
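Conceptually (a simplified sketch, not the actual Express internals), the middleware just parses the raw JSON body text into an object and attaches it as req.body before the route handler runs:

```javascript
// What express.json() does conceptually: turn the raw request body text
// into an object and attach it as req.body for the route handler.
function jsonBodyParser(req) {
  if (req.headers['content-type'] === 'application/json' && req.rawBody) {
    req.body = JSON.parse(req.rawBody);
  }
  return req;
}

const req = jsonBodyParser({
  headers: { 'content-type': 'application/json' },
  rawBody: '{"name":"user4"}',
});
console.log(req.body.name); // user4
```

Without this step, `req.body` would be undefined in the POST and PATCH handlers above.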


Learning Notes #68 – Buildpacks and Dockerfile

2 February 2025 at 09:32

  1. What is an OCI?
  2. Does Docker Create OCI Images?
  3. What is a Buildpack?
  4. Overview of Buildpack Process
  5. Builder: The Image That Executes the Build
    1. Components of a Builder Image
    2. Stack: The Combination of Build and Run Images
  6. Installation and Initial Setups
  7. Basic Build of an Image (Python Project)
    1. Building an image using buildpack
    2. Building an Image using Dockerfile
  8. Unique Benefits of Buildpacks
    1. No Need for a Dockerfile (Auto-Detection)
    2. Automatic Security Updates
    3. Standardized & Reproducible Builds
    4. Extensibility: Custom Buildpacks
  9. Generating SBOM in Buildpacks
    1. a) Using pack CLI to Generate SBOM
    2. b) Generate SBOM in Docker

Over the last few days, I was exploring Buildpacks. I am amazed at this tool's features for reducing developers' pain. In this blog, I jot down my experience with Buildpacks.

Before trying Buildpacks, we need to understand what an OCI is.

What is an OCI?

An OCI Image (Open Container Initiative Image) is a standard format for container images, defined by the Open Container Initiative (OCI) to ensure interoperability across different container runtimes (Docker, Podman, containerd, etc.).

It consists of:

  1. Manifest – Metadata describing the image (layers, config, etc.).
  2. Config JSON – Information about how the container should run (CMD, ENV, etc.).
  3. Filesystem Layers – The actual file system of the container.

OCI Image Specification ensures that container images built once can run on any OCI-compliant runtime.

Does Docker Create OCI Images?

Yes, Docker creates OCI-compliant images. Since Docker v1.10+, Docker has been aligned with the OCI Image Specification, and all Docker images are OCI-compliant by default.

  • When you build an image with docker build, it follows the OCI Image format.
  • When you push/pull images to registries like Docker Hub, they follow the OCI Image Specification.

However, Docker also supports its legacy Docker Image format, which existed before OCI was introduced. Most modern registries and runtimes (Kubernetes, Podman, containerd) support OCI images natively.

What is a Buildpack ?

A buildpack is a framework for transforming application source code into a runnable image by handling dependencies, compilation, and configuration. Buildpacks are widely used in cloud environments like Heroku, Cloud Foundry, and Kubernetes (via Cloud Native Buildpacks).

Overview of Buildpack Process

The buildpack process consists of two primary phases:

  • Detection Phase: Determines if the buildpack should be applied based on the app’s dependencies.
  • Build Phase: Executes the necessary steps to prepare the application for running in a container.

Buildpacks work with a lifecycle manager (e.g., Cloud Native Buildpacks’ lifecycle) that orchestrates the execution of multiple buildpacks in an ordered sequence.

Builder: The Image That Executes the Build

A builder is an image that contains all necessary components to run a buildpack.

Components of a Builder Image

  1. Build Image – Used during the build phase (includes compilers, dependencies, etc.).
  2. Run Image – A minimal environment for running the final built application.
  3. Lifecycle – The core mechanism that executes buildpacks, orchestrates the process, and ensures reproducibility.

Stack: The Combination of Build and Run Images

  • Build Image + Run Image = Stack
  • Build Image: Base OS with tools required for building (e.g., Ubuntu, Alpine).
  • Run Image: Lightweight OS with only the runtime dependencies for execution.

Installation and Initial Setups

Basic Build of an Image (Python Project)

Project Source: https://github.com/syedjaferk/gh_action_docker_build_push_fastapi_app

Building an image using buildpack

Before running these commands, ensure you have Pack CLI (pack) installed.

a) Get a builder suggestion

pack builder suggest

b) Build the image

pack build my-python-app --builder paketobuildpacks/builder:base

c) Run the image locally


docker run -p 8080:8080 my-python-app

Building an Image using Dockerfile

a) Dockerfile


FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .

RUN pip install -r requirements.txt

COPY ./random_id_generator ./random_id_generator
COPY app.py app.py

EXPOSE 8080

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8080"]

b) Build and Run


docker build -t my-python-app .
docker run -p 8080:8080 my-python-app

Unique Benefits of Buildpacks

No Need for a Dockerfile (Auto-Detection)

Buildpacks automatically detect the language and dependencies, removing the need for Dockerfile.


pack build my-python-app --builder paketobuildpacks/builder:base

It detects Python, installs dependencies, and builds the app into a container. 🚀 Docker, by contrast, requires a Dockerfile, which developers must manually configure and maintain.

Automatic Security Updates

Buildpacks automatically patch base images for security vulnerabilities.

If there’s a CVE in the OS layer, Buildpacks update the base image without rebuilding the app.


pack rebase my-python-app

No need to rebuild! It replaces only the OS layers while keeping the app the same.

Standardized & Reproducible Builds

Ensures consistent images across environments (dev, CI/CD, production). Example: Running the same build locally and on Heroku/Cloud Run,


pack build my-app

Extensibility: Custom Buildpacks

Developers can create custom Buildpacks to add special dependencies.

Example: Adding ffmpeg to a Python buildpack,


pack buildpack package my-custom-python-buildpack --path .

Generating SBOM in Buildpacks

a) Using pack CLI to Generate SBOM

After building an image with pack, run,


pack sbom download my-python-app --output-dir ./sbom

  • This fetches the SBOM for your built image.
  • The SBOM is saved in the ./sbom/ directory.

✅ Supported formats:

  • SPDX (sbom.spdx.json)
  • CycloneDX (sbom.cdx.json)

b) Generate SBOM in Docker


trivy image --format cyclonedx -o sbom.json my-python-app

Both are helpful for creating images. It's all about the trade-offs.

About SQL

By: vsraj80
31 January 2025 at 16:15

Structured Query Language

Relational Database Management System

MySQL is free and open-source software.

MySQL Client – front end; MySQL Server – back end

Functions of SQL Client

  1. Validating the password and authenticating the user
  2. Receiving input from the client end, converting it into tokens, and sending them to the SQL server
  3. Returning the results from the SQL server to the user

Functions of SQL Server

The SQL server consists of two major parts. It receives requests from the client and returns responses after processing.

1. Management Layer

a. Decoding the data

b. Validating and parsing (analyzing) the data

c. Sending the cached queries to the Storage Engine

2. Storage Engine

a. Managing databases, tables, and indexes

b. Sending data to other shared SQL servers

Install SQL in Ubuntu

sudo apt-get install mysql-server

To make the installation secure, configure it as below:

sudo mysql_secure_installation

1. Removes anonymous users

2. Allows root login only from the local host

3. Removes the test database

MySQL Configuration options

/etc/mysql is the MySQL configuration directory

To Start MySQL

sudo service mysql start

To Stop MySQL

sudo service mysql stop

To Restart MySQL

sudo service mysql restart

MySQL Clients

Normally we will use mysql in command line

But in linux we can access through following GUI

MySQL Work Bench

sudo apt-get install mysql-workbench

MySQL Navigator

sudo apt-get install mysql-navigator

EMMA

sudo apt-get install emma

PHP MYAdmin

sudo aptitude install phpmyadmin

MySQL Admin

sudo apt-get install mysql-admin

Kinds of MySQL

1. GUI-based desktop applications

2. Web-based applications

3. Shell-based applications (text-only)

To connect the server with MySQL client

mysql -u root -p

To connect with a particular host, user name, and database name, use the -h, -u, and -p options together:

mysql -h <host> -u <user> -p

If the host, user name, or password is not given, it defaults to the local server, the current Linux user name, and no password for authentication.

To find more options for mysql:

mysql -?

To disconnect the client from the server:

exit

Pages 33 to 39 need to be read and understood again.

Minimal Typing Practice Application in Python

By: krishna
30 January 2025 at 09:40

Introduction

This is a Python-based single-file application designed for typing practice. It provides a simple interface to improve typing accuracy and speed. Over time, this minimal program has gradually increased my typing skill.

What I Learned from This Project

  • 2D Array Validation
    I first used a simple 1D array to store user input, but I noticed some issues. After implementing a 2D array, I understood why it was more appropriate for handling user inputs.
  • Tkinter
    I wanted to visually see and update correct, wrong, and incomplete typing inputs, but I didn't know how to implement that in the terminal. So, I used a simple Tkinter GUI window.

Run This Program

It depends on the following applications:

  • Python 3
  • python3-tk

Installation Command on Debian-Based Systems

sudo apt install python3 python3-tk

Clone repository and run program

git clone https://github.com/github-CS-krishna/TerminalTyping
cd TerminalTyping
python3 terminalType.py

Links

For more details, refer to the README documentation on GitHub.

This will help you understand how to use it.

Source code (GitHub)

Learning Notes #67 – Build and Push to a Registry (Docker Hub) with GH-Actions

28 January 2025 at 02:30

GitHub Actions is a powerful tool for automating workflows directly in your repository. In this blog, we'll explore how to efficiently set up GitHub Actions to handle Docker workflows with environments, secrets, and protection rules.

Why Use GitHub Actions for Docker?

My code base is on GitHub, and I wanted to try out GitHub Actions to build and push images to Docker Hub seamlessly.

Setting Up GitHub Environments

GitHub Environments let you define settings specific to deployment stages. Here’s how to configure them:

1. Create an Environment

Go to your GitHub repository and navigate to Settings > Environments. Click New environment, name it (e.g., production), and save.

2. Add Secrets and Variables

Inside the environment settings, click Add secret to store sensitive information like DOCKER_USERNAME and DOCKER_TOKEN.

Use Variables for non-sensitive configuration, such as the Docker image name.

3. Optional: Set Protection Rules

Enforce rules like requiring manual approval before deployments. Restrict deployments to specific branches (e.g., main).

Sample Workflow for Building and Pushing Docker Images

Below is a GitHub Actions workflow for automating the build and push of a Docker image based on a minimal Flask app.

Workflow: .github/workflows/docker-build-push.yml


name: Build and Push Docker Image

on:
  push:
    branches:
      - main  # Trigger workflow on pushes to the `main` branch

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    environment: production  # Specify the environment to use

    steps:
      # Checkout the repository
      - name: Checkout code
        uses: actions/checkout@v3

      # Log in to Docker Hub using environment secrets
      - name: Log in to Docker Hub
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_TOKEN }}

      # Build the Docker image using an environment variable
      - name: Build Docker image
        env:
          DOCKER_IMAGE_NAME: ${{ vars.DOCKER_IMAGE_NAME }}
        run: |
          docker build -t ${{ secrets.DOCKER_USERNAME }}/$DOCKER_IMAGE_NAME:${{ github.run_id }} .

      # Push the Docker image to Docker Hub
      - name: Push Docker image
        env:
          DOCKER_IMAGE_NAME: ${{ vars.DOCKER_IMAGE_NAME }}
        run: |
          docker push ${{ secrets.DOCKER_USERNAME }}/$DOCKER_IMAGE_NAME:${{ github.run_id }}

To see the Actions live: https://github.com/syedjaferk/gh_action_docker_build_push_fastapi_app/actions

Event Summary: Grafana & Friends Meetup Chennai – 25-01-2025

26 January 2025 at 04:47

🚀 Attended the Grafana & Friends Meetup Yesterday! 🚀

I always had a question: as a developer, I have logs; isn't that enough? With a curious mind, I attended the Grafana & Friends Chennai meetup (Jan 25th, 2025).

Had an awesome time meeting fellow tech enthusiasts (devops engineers) and learning about cool ways to monitor and understand data better.
Big shoutout to the Grafana Labs community and Presidio for hosting such a great event!

The sandwich and juice were nice 😋

Talk Summary:

1⃣ Making Data Collection Easier with Grafana Alloy
Dinesh J. and Krithika R shared how Grafana Alloy, combined with OpenTelemetry, makes it super simple to collect and manage data for better monitoring.

2⃣ Running Grafana in Kubernetes
Lakshmi Narasimhan Parthasarathy (https://lnkd.in/gShxtucZ) showed how to set up Grafana in Kubernetes in 4 different ways (vanilla, helm chart, grafana operator, kube-prom-stack). He is building a SaaS product https://lnkd.in/gSS9XS5m (Heroku on your own servers).

3⃣ Observability for Frontend Apps with Grafana Faro
Selvaraj Kuppusamy showed how Grafana Faro can help frontend developers monitor what's happening on websites and apps in real time, making it easier to spot and fix issues quickly. We were able to see Core Web Vitals and traces too. I was surprised by this.

Techies I had interactions with:

Prasanth Baskar, an open-source contributor at the Cloud Native Computing Foundation (CNCF) on the project https://lnkd.in/gmHjt9Bs. I was also happy to learn that he knows **parottasalna** (that's me) and has read some of my blogs. Happy to hear that.

Selvaraj Kuppusamy, DevOps Engineer, is also conducting a Grafana and Friends chapter meetup in Coimbatore on Feb 1. I will attend that as well.

Saranraj Chandrasekaran, also a DevOps engineer; I had a chat with him about DevOps and related topics.

To all of them, I talked about KanchiLUG (https://lnkd.in/gasCnxXv), Parottasalna (https://parottasalna.com/), and my tech channel https://lnkd.in/gKcyE-b5.

Thanks Achanandhi M for organising this wonderful meetup. You did well. I came to know Achanandhi M from Medium. He regularly writes blogs on cloud related stuff. https://lnkd.in/ghUS-GTc Check out his blog.

Also, he shared some tasks for us,

1. Create your First Grafana Dashboard.
Objective: Create a basic Grafana dashboard to visualize data in various formats such as tables, charts and graphs. Also, try to connect to multiple data sources to get diverse data for your dashboard.

2. Monitor your Linux system's health with Prometheus, Node Exporter and Grafana.
Objective: Use Prometheus, Node Exporter and Grafana to monitor your Linux machine's health by tracking key metrics like CPU, memory and disk usage.


3. Using Grafana Faro to track User Actions (Like Button Clicks) and Identify the Most Used Features.

Give a try on these.

Learning Notes #63 – Change Data Capture. What does it do ?

19 January 2025 at 16:22

A few days back I came across the concept of CDC, which acts like a notifier of database events. Instead of polling, it makes each event available in a queue, which can be consumed by many consumers. In this blog, I try to explain the concepts and types in a theoretical manner.

You run a library. Every day, books are borrowed, returned, or new books are added. What if you wanted to keep a live record of all these activities so you always know the exact state of your library?

This is essentially what Change Data Capture (CDC) does for your databases. It’s a way to track changes (like inserts, updates, or deletions) in your database tables and send them to another system, like a live dashboard or a backup system. (Might be a bad example. Don’t lose hope. Continue …)

CDC is widely used in modern technology to power,

  • Real-Time Analytics: Live dashboards that show sales, user activity, or system performance.
  • Data Synchronization: Keeping multiple databases or microservices in sync.
  • Event-Driven Architectures: Triggering notifications, workflows, or downstream processes based on database changes.
  • Data Pipelines: Streaming changes to data lakes or warehouses for further processing.
  • Backup and Recovery: Incremental backups by capturing changes instead of full data dumps.

It’s a critical part of tools like Debezium, Kafka, and cloud services such as AWS Database Migration Service (DMS) and Azure Data Factory. CDC enables companies to move towards real-time data-driven decision-making.

What is CDC?

CDC stands for Change Data Capture. It’s a technique that listens to a database and captures every change that happens in it. These changes can then be sent to other systems to,

  • Keep data in sync across multiple databases.
  • Power real-time analytics dashboards.
  • Trigger notifications for certain database events.
  • Process data streams in real time.

In short, CDC ensures your data is always up-to-date wherever it’s needed.

Why is CDC Useful?

Imagine you have an online store. Whenever someone,

  • Places an order,
  • Updates their shipping address, or
  • Cancels an order,

you need these changes to be reflected immediately across,

  • The shipping system.
  • The inventory system.
  • The email notification service.

Instead of having all these systems query the database constantly (which is slow and inefficient, and one of the main reasons for CDC), CDC automatically streams these changes to the relevant systems.

This means,

  1. Real-Time Updates: Systems receive changes instantly.
  2. Improved Performance: Your database isn’t overloaded with repeated queries.
  3. Consistency: All systems stay in sync without manual intervention.

How Does CDC Work?

Note: I haven't tried all of these yet, but I have a conceptual feel for them.

CDC relies on tracking changes in your database. There are a few ways to do this,

1. Query-Based CDC

This method repeatedly checks the database for changes. For example:

  • Every 5 minutes, it queries the database: β€œWhat changed since my last check?”
  • Any new or modified data is identified and processed.

Drawbacks: This can miss changes if the timing isn’t right, and it’s not truly real-time (Long Polling).
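The polling approach can be sketched in a few lines. This is an illustrative example only (a SQLite table with a monotonically increasing version column as the high-water mark; all names are invented), not a production CDC setup:

```python
import sqlite3

# Query-based CDC sketch: repeatedly ask "what changed since my last check?"
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, version INTEGER)")

def poll_changes(conn, last_version):
    """Return rows changed since the last poll and the new high-water mark."""
    rows = conn.execute(
        "SELECT id, status, version FROM orders WHERE version > ? ORDER BY version",
        (last_version,),
    ).fetchall()
    new_mark = rows[-1][2] if rows else last_version
    return rows, new_mark

conn.execute("INSERT INTO orders (status, version) VALUES ('pending', 1)")
conn.execute("INSERT INTO orders (status, version) VALUES ('shipped', 2)")

changes, mark = poll_changes(conn, 0)        # first poll sees both rows
changes_again, _ = poll_changes(conn, mark)  # nothing new since the last poll
```

Each poll only sees rows past the saved watermark, which is exactly why a badly timed update (or a clock/version mix-up) can be missed.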

2. Log-Based CDC

Most modern databases (like PostgreSQL or MySQL) keep logs of every operation. Log-based CDC listens to these logs and captures changes as they happen.

Advantages

  • It’s real-time.
  • It’s lightweight since it doesn’t query the database directly.

3. Trigger-Based CDC

In this method, the database uses triggers to log changes into a separate table. Whenever a change occurs, a trigger writes a record of it.

Advantages: Simple to set up.

Drawbacks: Can slow down the database if not carefully managed.
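A minimal sketch of trigger-based CDC, using SQLite triggers and an invented audit table so it stays self-contained (real setups would do the same with PostgreSQL or MySQL triggers):

```python
import sqlite3

# Trigger-based CDC sketch: triggers copy every change into a separate audit table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
CREATE TABLE orders_audit (order_id INTEGER, action TEXT, status TEXT);
CREATE TRIGGER orders_ins AFTER INSERT ON orders
BEGIN
    INSERT INTO orders_audit VALUES (NEW.id, 'INSERT', NEW.status);
END;
CREATE TRIGGER orders_upd AFTER UPDATE ON orders
BEGIN
    INSERT INTO orders_audit VALUES (NEW.id, 'UPDATE', NEW.status);
END;
""")

conn.execute("INSERT INTO orders (status) VALUES ('pending')")
conn.execute("UPDATE orders SET status = 'shipped' WHERE id = 1")

# The audit table now holds one INSERT and one UPDATE record.
audit = conn.execute("SELECT * FROM orders_audit").fetchall()
```

Every write now costs two writes (the row plus its audit record), which is where the slowdown mentioned above comes from.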

Tools That Make CDC Easy

Several tools simplify CDC implementation. Some popular ones are,

  1. Debezium: Open-source and widely used for log-based CDC with databases like PostgreSQL, MySQL, and MongoDB.
  2. Striim: A commercial tool for real-time data integration.
  3. AWS Database Migration Service (DMS): A cloud-based CDC service.
  4. StreamSets: Another tool for real-time data movement.

These tools integrate with databases, capture changes, and deliver them to systems like RabbitMQ, Kafka, or cloud storage.

To help visualize CDC, think of,

  • Social Media Feeds: When someone likes or comments on a post, you see the update instantly. This is CDC in action.
  • Bank Notifications: Whenever you make a transaction, your bank app updates instantly. Another example of CDC.

In upcoming blogs, I will include a Debezium implementation of CDC.

Learning Notes #62 – Serverless – Just like riding a taxi

19 January 2025 at 04:55

What is Serverless Computing?

Serverless computing allows developers to run applications without having to manage the underlying infrastructure. You write code, deploy it, and the cloud provider takes care of the rest from provisioning servers to scaling applications.

Popular serverless platforms include AWS Lambda, Azure Functions, and Google Cloud Functions.
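As a tiny illustration, a serverless function is usually just a handler that the platform invokes for you. The sketch below follows the AWS Lambda handler convention; the event shape here is an assumption, since real events depend on the trigger (API Gateway, S3, and so on):

```python
import json

# A minimal AWS Lambda-style handler sketch. The platform calls handler(event,
# context); you never provision or manage the server it runs on.
def handler(event, context):
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }

# Locally, it can be invoked like any ordinary function:
resp = handler({"name": "serverless"}, None)
```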

The Taxi Analogy

Imagine traveling to a destination. There are multiple ways to get there,

  1. Owning a Car (Traditional Servers): You own and maintain your car. This means handling maintenance, fuel, insurance, parking, and everything else that comes with it. It’s reliable and gives you control, but it’s also time-consuming and expensive to manage.
  2. Hiring a Taxi (Serverless): With a taxi, you simply book a ride when you need it. You don’t worry about maintaining the car, fueling it, or where it’s parked afterward. You pay only for the distance traveled, and the service scales to your needs whether you’re alone or with friends.

Why Serverless is Like Taking a Taxi ?

  1. No Infrastructure Management – With serverless, you don’t have to manage or worry about servers, just like you don’t need to maintain a taxi.
  2. Pay-As-You-Go – In a taxi, you pay only for the distance traveled. Similarly, in serverless, you’re billed only for the compute time your application consumes.
  3. On-Demand Availability – Need a ride at midnight? A taxi is just a booking away. Serverless functions work the same way available whenever you need them, scaling up or down as required.
  4. Scalability – Whether you’re a solo traveler or part of a group, taxis can adapt by providing a small car or a larger vehicle. Serverless computing scales resources automatically based on traffic, ensuring optimal performance.
  5. Focus on the Destination – When you take a taxi, you focus on reaching your destination without worrying about the vehicle. Serverless lets you concentrate on writing and deploying code rather than worrying about servers.

Key Benefits of Serverless (and Taxi Rides)

  • Cost-Effectiveness – Avoid upfront costs. No need to buy servers (or cars) you might not fully utilize.
  • Flexibility – Serverless platforms support multiple programming languages and integrations.
    Taxis, too, come in various forms: regular cars, SUVs, and even luxury rides for special occasions.
  • Reduced Overhead – Free yourself from maintenance tasks, whether it’s patching servers or checking tire pressure.

When Not to Choose Serverless (or a Taxi)

  1. Predictable, High-Volume Usage – Owning a car might be cheaper if you’re constantly on the road. Similarly, for predictable and sustained workloads, traditional servers or containers might be more cost-effective than serverless.
  2. Special Requirements – Need a specific type of vehicle, like a truck for moving furniture? Owning one might make sense. Similarly, applications with unique infrastructure requirements may not be a perfect fit for serverless.
  3. Latency Sensitivity – Taxis take time to arrive after booking. Likewise, serverless functions may experience cold starts, adding slight delays. For ultra-low-latency applications, other architectures may be preferable.

Learning Notes #57 – Partial Indexing in Postgres

16 January 2025 at 14:36

Today, I learnt about partial indexing in Postgres and how it optimizes the indexing process to filter a subset of a table more efficiently. In this blog, I jot down notes on partial indexing.

Partial indexing in PostgreSQL is a powerful feature that provides a way to optimize database performance by creating indexes that apply only to a subset of a table’s rows. This selective indexing can result in reduced storage space, faster index maintenance, and improved query performance, especially when queries frequently involve filters or conditions that only target a portion of the data.

An index in PostgreSQL, like in other relational database management systems, is a data structure that improves the speed of data retrieval operations. However, creating an index on an entire table can sometimes be inefficient, especially when dealing with very large datasets where queries often focus on specific subsets of the data. This is where partial indexing becomes invaluable.

Unlike a standard index that covers every row in a table, a partial index only includes rows that satisfy a specified condition. This condition is defined using a WHERE clause when the index is created.

To understand the mechanics, let us consider a practical example.

Suppose you have a table named orders that stores details about customer orders, including columns like order_id, customer_id, order_date, status, and total_amount. If the majority of your queries focus on pending orders, that is, those where the status is pending, creating a partial index specifically for these rows can significantly improve performance.

Example 1:

Here’s how you can create such an index,

CREATE INDEX idx_pending_orders
ON orders (order_date)
WHERE status = 'pending';

In this example, the index idx_pending_orders includes only the rows where status equals pending. This means that any query that involves filtering by status = 'pending' and utilizes the order_date column will leverage this index. For instance, the following query would benefit from the partial index,

SELECT *
FROM orders
WHERE status = 'pending'
AND order_date > '2025-01-01';

The benefits of this approach are significant. By indexing only the rows with status = 'pending', the size of the index is much smaller compared to a full table index.

This reduction in size not only saves disk space but also speeds up the process of scanning the index, as there are fewer entries to traverse. Furthermore, updates or modifications to rows that do not meet the WHERE condition are excluded from index maintenance, thereby reducing the overhead of maintaining the index and improving performance for write operations.

Example 2:

Let us explore another example. Suppose your application frequently queries orders that exceed a certain total amount. You can create a partial index tailored to this use case,

CREATE INDEX idx_high_value_orders
ON orders (customer_id)
WHERE total_amount > 1000;

This index would optimize queries like the following,

SELECT *
FROM orders
WHERE total_amount > 1000
AND customer_id = 123;

The key advantage here is that the index only includes rows where total_amount > 1000. For datasets with a wide range of order amounts, this can dramatically reduce the number of indexed entries. Queries that filter by high-value orders become faster because the database does not need to sift through irrelevant rows.

Additionally, as with the previous example, index maintenance is limited to the subset of rows matching the condition, improving overall performance for insertions and updates.

Partial indexes are also useful for enforcing constraints in a selective manner. Consider a scenario where you want to ensure that no two active promotions exist for the same product. You can achieve this using a unique partial index

CREATE UNIQUE INDEX idx_unique_active_promotion
ON promotions (product_id)
WHERE is_active = true;

This index guarantees that only one row with is_active = true can exist for each product_id.

In conclusion, partial indexing in PostgreSQL offers a flexible and efficient way to optimize database performance by targeting specific subsets of data.

Learning Notes #53 – The Expiration Time Can Be Unexpectedly Lost While Using Redis SET EX

12 January 2025 at 09:14

Redis, a high-performance in-memory key-value store, is widely used for caching, session management, and various other scenarios where fast data retrieval is essential. One of its key features is the ability to set expiration times for keys. However, when using the SET command with the EX option, developers might encounter unexpected behaviors where the expiration time is seemingly lost. Let’s explore this issue in detail.

Understanding SET with EX

The Redis SET command with the EX option allows you to set a key’s value and specify its expiration time in seconds. For instance


SET key value EX 60

This command sets the key key to the value value and sets an expiration time of 60 seconds.

The Problem

In certain cases, the expiration time might be unexpectedly lost. This typically happens when subsequent operations overwrite the key without specifying a new expiration. For example,


SET key value1 EX 60
SET key value2

In the above sequence,

  1. The first SET command assigns a value to key and sets an expiration of 60 seconds.
  2. The second SET command overwrites the value of key but does not include an expiration time, resulting in the key persisting indefinitely.

This behavior can lead to subtle bugs, especially in applications that rely on key expiration for correctness or resource management.

Why Does This Happen?

The Redis SET command is designed to replace the entire state of a key, including its expiration. When you use SET without the EX, PX, or EXAT options, the expiration is removed, and the key becomes persistent. This behavior aligns with the principle that SET is a complete update operation.
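If you do need to update a value while keeping its existing expiration, Redis (6.0 and above) provides the KEEPTTL option for SET:

```
SET key value1 EX 60
SET key value2 KEEPTTL
```

Here the second SET changes the value but preserves the remaining TTL. Otherwise, re-specify EX (or PX) on every write that should stay volatile.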

When using Redis SET with EX, be mindful of operations that might overwrite keys without reapplying expiration. Understanding Redis’s behavior and implementing robust patterns can save you from unexpected issues, ensuring your application remains efficient and reliable.

Learning Notes #52 – Hybrid Origin Failover Pattern

12 January 2025 at 06:29

Today, I learnt about failover patterns from AWS https://aws.amazon.com/blogs/networking-and-content-delivery/three-advanced-design-patterns-for-high-available-applications-using-amazon-cloudfront/ . In this blog, I jot down my understanding of this pattern for future reference,

Hybrid origin failover is a strategy that combines two distinct approaches to handle origin failures effectively, balancing speed and resilience.

The Need for Origin Failover

When an application’s primary origin server becomes unavailable, the ability to reroute traffic to a secondary origin ensures continuity. The failover process determines how quickly and effectively this switch happens. Broadly, there are two approaches to implement origin failover:

  1. Stateful Failover with DNS-based Routing
  2. Stateless Failover with Application Logic

Each has its strengths and limitations, which the hybrid approach aims to mitigate.

Approach 1: Stateful Failover with DNS-based Routing

Stateful failover is a system that allows a standby server to take over for a failed server and continue active sessions. It’s used to create a resilient network infrastructure and avoid service interruptions.

This method relies on a DNS service with health checks to detect when the primary origin is unavailable. Here’s how it works,

  1. Health Checks: The DNS service continuously monitors the health of the primary origin using health checks (e.g., HTTP, HTTPS).
  2. DNS Failover: When the primary origin is marked unhealthy, the DNS service resolves the origin’s domain name to the secondary origin’s IP address.
  3. TTL Impact: The failover process honors the DNS Time-to-Live (TTL) settings. A low TTL ensures faster propagation, but even in the most optimal configurations, this process introduces a delayβ€”often around 60 to 70 seconds.
  4. Stateful Behavior: Once failover occurs, all traffic is routed to the secondary origin until the primary origin is marked healthy again.

Implementation from AWS (as-is from aws blog)

The first approach is usingΒ Amazon Route 53 Failover routing policy with health checks on the origin domain name that’s configured as the origin in CloudFront. When the primary origin becomes unhealthy, Route 53 detects it, and then starts resolving the origin domain name with the IP address of the secondary origin. CloudFront honors the origin DNS TTL, which means that traffic will start flowing to the secondary origin within the DNS TTLs.Β The most optimal configuration (Fast Check activated, a failover threshold of 1, and 60 second DNS TTL) means that the failover will take 70 seconds at minimum to occur. When it does, all of the traffic is switched to the secondary origin, since it’s a stateful failover. Note that this design can be further extended with Route 53 Application Recovery Control for more sophisticated application failover across multiple AWS Regions, Availability Zones, and on-premises.

The second approach is using origin failover, a native feature of CloudFront. This capability of CloudFront tries for the primary origin of every request, and if a configured 4xx or 5xx error is received, then CloudFront attempts a retry with the secondary origin. This approach is simple to configure and provides immediate failover. However, it’s stateless, which means every request must fail independently, thus introducing latency to failed requests. For transient origin issues, this additional latency is an acceptable tradeoff with the speed of failover, but it’s not ideal when the origin is completely out of service. Finally, this approach only works for the GET/HEAD/OPTIONS HTTP methods, because other HTTP methods are not allowed on a CloudFront cache behavior with Origin Failover enabled.

Advantages

  • Works for all HTTP methods and request types.
  • Ensures complete switchover, minimizing ongoing failures.

Disadvantages

  • Relatively slower failover due to DNS propagation time.
  • Requires a reliable health-check mechanism.

Approach 2: Stateless Failover with Application Logic

This method handles failover at the application level. If a request to the primary origin fails (e.g., due to a 4xx or 5xx HTTP response), the application or CDN immediately retries the request with the secondary origin.

How It Works

  1. Primary Request: The application sends a request to the primary origin.
  2. Failure Handling: If the response indicates a failure (configurable for specific error codes), the request is retried with the secondary origin.
  3. Stateless Behavior: Each request operates independently, so failover happens on a per-request basis without waiting for a stateful switchover.
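The per-request retry logic above can be sketched generically. The fetch functions below are stand-ins for real HTTP calls, not any CDN API:

```python
# Stateless failover sketch: try the primary origin; on a retryable error,
# immediately retry the request against the secondary origin.
RETRYABLE_STATUSES = {500, 502, 503, 504}

def fetch_with_failover(fetch_primary, fetch_secondary):
    status, body = fetch_primary()
    if status in RETRYABLE_STATUSES:
        return fetch_secondary()  # per-request retry, no shared state
    return status, body

# Simulated origins: primary is down, secondary is healthy.
primary = lambda: (503, "primary unavailable")
secondary = lambda: (200, "hello from secondary")

status, body = fetch_with_failover(primary, secondary)
# The request succeeds, served by the secondary origin.
```

Note how every failed request pays the primary-attempt latency first, which is the drawback called out below when the primary is completely down.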

Implementation from AWS (as-is from aws blog)

The hybrid origin failover pattern combines both approaches to get the best of both worlds. First, you configure both of your origins with a Failover Policy in Route 53 behind a single origin domain name. Then, you configure an origin failover group with the single origin domain name as primary origin, and the secondary origin domain name as secondary origin. This means that when the primary origin becomes unavailable, requests are immediately retried with the secondary origin until the stateful failover of Route 53 kicks in within tens of seconds, after which requests go directly to the secondary origin without any latency penalty. Note that this pattern only works with the GET/HEAD/OPTIONS HTTP methods.

Advantages

  • Near-instantaneous failover for failed requests.
  • Simple to configure and doesn’t depend on DNS TTL.

Disadvantages

  • Adds latency for failed requests due to retries.
  • Limited to specific HTTP methods like GET, HEAD, and OPTIONS.
  • Not suitable for scenarios where the primary origin is entirely down, as every request must fail first.

The Hybrid Origin Failover Pattern

The hybrid origin failover pattern combines the strengths of both approaches, mitigating their individual limitations. Here’s how it works:

  1. DNS-based Stateful Failover: A DNS service with health checks monitors the primary origin and switches to the secondary origin if the primary becomes unhealthy. This ensures a complete and stateful failover within tens of seconds.
  2. Application-level Stateless Failover: Simultaneously, the application or CDN is configured to retry failed requests with a secondary origin. This provides an immediate failover mechanism for transient or initial failures.

Implementation Steps

  1. DNS Configuration
    • Set up health checks on the primary origin.
    • Define a failover policy in the DNS service, which resolves the origin domain name to the secondary origin when the primary is unhealthy.
  2. Application Configuration
    • Configure the application or CDN to use an origin failover group.
    • Specify the primary origin domain as the primary origin and the secondary origin domain as the backup.

Behavior

  • Initially, if the primary origin encounters issues, requests are retried immediately with the secondary origin.
  • Meanwhile, the DNS failover switches all traffic to the secondary origin within tens of seconds, eliminating retry latencies for subsequent requests.

Benefits of Hybrid Origin Failover

  1. Faster Failover: Immediate retries for failed requests minimize initial impact, while DNS failover ensures long-term stability.
  2. Reduced Latency: After DNS failover, subsequent requests don’t experience retry delays.
  3. High Resilience: Combines stateful and stateless failover for robust redundancy.
  4. Simplicity and Scalability: Leverages existing DNS and application/CDN features without complex configurations.

Limitations and Considerations

  1. HTTP Method Constraints: Stateless failover works only for GET, HEAD, and OPTIONS methods, limiting its use for POST or PUT requests.
  2. TTL Impact: Low TTLs reduce propagation delays but increase DNS query rates, which could lead to higher costs.
  3. Configuration Complexity: Combining DNS and application-level failover requires careful setup and testing to avoid misconfigurations.
  4. Secondary Origin Capacity: Ensure the secondary origin can handle full traffic loads during failover.

Learning Notes #50 – Fixed Partition Pattern | Distributed Pattern

9 January 2025 at 16:51

Today, I learnt about fixed partitioning, which balances data among servers without heavy movement of data. In this blog, I jot down notes on how fixed partitioning helps in solving the problem.

This entire blog is inspired by https://www.linkedin.com/pulse/distributed-systems-design-pattern-fixed-partitions-retail-kumar-v-c34pc/?trackingId=DMovSwEZSfCzKZEKa7yJrg%3D%3D

Problem Statement

In a distributed key-value store system, data items need to be mapped to a set of cluster nodes to ensure efficient storage and retrieval. The system must satisfy the following requirements,

  1. Uniform Distribution: Data should be evenly distributed across all cluster nodes to avoid overloading any single node.
  2. Deterministic Mapping: Given a data item, the specific node responsible for storing it should be determinable without querying all the nodes in the cluster.

A common approach to achieve these goals is to use hashing with a modulo operation. For example, if there are three nodes in the cluster, the key is hashed, and the hash value modulo the number of nodes determines the node to store the data. However, this method has a critical drawback,

Rebalancing Issue: When the cluster size changes (e.g., nodes are added or removed), the mapping for most keys changes. This requires the system to move almost all the data to new nodes, leading to significant overhead in terms of time and resources, especially when dealing with large data volumes.

Challenge: How can we design a mapping mechanism that minimizes data movement during cluster size changes while maintaining uniform distribution and deterministic mapping?

Solution

There is a concept of Fixed Partitioning,

What Is Fixed Partitioning?

This pattern organizes data into a predefined number of fixed partitions that remain constant over time. Data is assigned to these partitions using a hashing algorithm, ensuring that the mapping of data to partitions is permanent. The system separates the fixed partitioning of data from the physical servers managing these partitions, enabling seamless scaling.

Key Features of Fixed Partitioning

  1. Fixed Number of Partitions
    • The number of partitions is determined during system initialization (e.g., 8 partitions).
    • Data is assigned to these partitions based on a consistent hashing algorithm.
  2. Stable Data Mapping
    • Each piece of data is permanently mapped to a specific partition.
    • This eliminates the need for large-scale data reshuffling when scaling the system.
  3. Adjustable Partition-to-Server Mapping
    • Partitions can be reassigned to different servers as the system scales.
    • Only the physical location of the partitions changes; the fixed mapping remains intact.
  4. Balanced Load Distribution
    • Partitions are distributed evenly across servers to balance the workload.
    • Adding new servers involves reassigning partitions without moving or reorganizing data within the partitions.
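The core idea can be sketched in a few lines of Python: the key-to-partition mapping is permanent, while the partition-to-server assignment (server names below are invented) is the only thing that changes during scaling:

```python
import hashlib

# Fixed partitioning sketch: 8 permanent partitions; scaling only rewrites the
# partition-to-server map, never the key-to-partition mapping.
NUM_PARTITIONS = 8

def partition_for(key: str) -> int:
    digest = hashlib.sha256(key.encode()).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS

# Initial partition-to-server assignment across 4 servers (illustrative).
assignment = {0: "s1", 1: "s1", 2: "s2", 3: "s2",
              4: "s3", 5: "s3", 6: "s4", 7: "s4"}

part = partition_for("account:12345")

# Scale out: a 5th server takes over some partitions. Only those partitions'
# data moves; every key still hashes to the same partition as before.
assignment.update({5: "s5", 6: "s5"})

assert partition_for("account:12345") == part  # stable mapping
server = assignment[part]
```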

Naive Example

We have a banking system with transactions stored in 8 fixed partitions, distributed based on a customer’s account ID.


CREATE TABLE transactions (
    id SERIAL PRIMARY KEY,
    account_id INT NOT NULL,
    transaction_amount NUMERIC(10, 2) NOT NULL,
    transaction_date DATE NOT NULL
) PARTITION BY HASH (account_id);

1. Create Partition


DO $$
BEGIN
    FOR i IN 0..7 LOOP
        EXECUTE format(
            'CREATE TABLE transactions_p%s PARTITION OF transactions FOR VALUES WITH (modulus 8, remainder %s);',
            i, i
        );
    END LOOP;
END $$;

This creates 8 partitions (transactions_p0 to transactions_p7) based on the hash remainder of account_id modulo 8.

2. Inserting Data

When inserting data into the transactions table, PostgreSQL automatically places it into the correct partition based on the account_id.


INSERT INTO transactions (account_id, transaction_amount, transaction_date)
VALUES (12345, 500.00, '2025-01-01');

The hash of 12345 % 8 determines the target partition (e.g., transactions_p5).

3. Querying Data

Querying the base table works transparently across all partitions


SELECT * FROM transactions WHERE account_id = 12345;

PostgreSQL automatically routes the query to the correct partition.

4. Scaling by Adding Servers

Initial Setup:

Suppose we have 4 servers managing the partitions,

  • Server 1: transactions_p0, transactions_p1
  • Server 2: transactions_p2, transactions_p3
  • Server 3: transactions_p4, transactions_p5
  • Server 4: transactions_p6, transactions_p7

Adding a New Server:

When a 5th server is added, we redistribute partitions,

  • Server 1: transactions_p0
  • Server 2: transactions_p1
  • Server 3: transactions_p2, transactions_p3
  • Server 4: transactions_p4
  • Server 5: transactions_p5, transactions_p6, transactions_p7

Partition Migration

  • During the migration, transactions_p5 is copied from Server 3 to Server 5.
  • Once the migration is complete, Server 5 becomes responsible for transactions_p5.

Benefits:

  1. Minimal Data Movement – When scaling, only the partitions being reassigned are copied to new servers. Data within partitions remains stable.
  2. Optimized Performance – Queries are routed directly to the relevant partition, minimizing scan times.
  3. Scalability – Adding servers is straightforward, as it involves reassigning partitions, not reorganizing data.

What happens when a new server is added? Don't we need to copy the data?

When a partition is moved to a new server (e.g., partition_b from server_A to server_B), the data in the partition must be copied to the new server. However,

  1. The copying is limited to the partition being reassigned.
  2. No data within the partition is reorganized.
  3. Once the partition is fully migrated, the original copy is typically deleted.

For example, in PostgreSQL,

  • Export the Partition pg_dump -t partition_b -h server_A -U postgres > partition_b.sql
  • Import on New Server: psql -h server_B -U postgres -d mydb < partition_b.sql

Connect Postman to Salesforce

3 January 2025 at 16:27

Today, I want to capture notes that I learnt from Trailhead Academy on connecting Postman to a Salesforce org.

To let Postman make changes to a Salesforce org, we have to enable a CORS policy in Salesforce. See below for what CORS means.

CORS: Cross-Origin Resource Sharing

It is a browser feature that controls how resources are requested from one site by another site. By configuring CORS, we grant special permissions for external websites to access our Salesforce data. In this case, we are enabling CORS so Postman can access Salesforce.

  • From Setup ==> search for CORS ==> add the https://*.postman.co and https://*.postman.com URLs.
  • After that, in Postman desktop, do the below steps one by one.
  • Create a separate workspace for Salesforce APIs to play around in.
  • Search for Salesforce APIs. It lists out all the available collections.
  • Fork β€œSalesforce Platform API” and it will be available in your local Postman workspace.
  • After that, go to β€œAuthorization”, click on β€œGenerate token”, and copy the β€œinstance” URL.
  • Configure the β€œ_endpoint” value in the variables tab as the β€œinstance” URL.
  • All set, that’s it. You can play around with whatever requests are available.


Learning Notes #30 – Queue Based Loading | Cloud Patterns

3 January 2025 at 14:47

Today, I learnt about the Queue-Based Loading pattern, which helps manage intermittent peak load to a service via queues, basically decoupling tasks from services. In this blog, I jot down notes on this pattern for my future self.

In today’s digital landscape, applications are expected to handle large-scale operations efficiently. Whether it’s processing massive data streams, ensuring real-time responsiveness, or integrating with multiple third-party services, scalability and reliability are paramount. One pattern that elegantly addresses these challenges is the Queue-Based Loading Pattern.

What Is the Queue-Based Loading Pattern?

The Queue-Based Loading Pattern leverages message queues to decouple and coordinate tasks between producers (such as applications or services generating data) and consumers (services or workers processing that data). By using queues as intermediaries, this pattern allows systems to manage workloads efficiently, ensuring seamless and scalable operation.

Key Components of the Pattern

  1. Producers: Producers are responsible for generating tasks or data. They send these tasks to a message queue instead of directly interacting with consumers. Examples include:
    • Web applications logging user activity.
    • IoT devices sending sensor data.
  2. Message Queue: The queue acts as a buffer, storing tasks until consumers are ready to process them. Popular tools for implementing queues include RabbitMQ, Apache Kafka, AWS SQS, and Redis.
  3. Consumers: Consumers retrieve messages from the queue and process them asynchronously. They are typically designed to handle tasks independently and at their own pace.
  4. Processing Logic: This is the core functionality that processes the tasks retrieved by consumers. For example, resizing images, sending notifications, or updating a database.

How It Works

  1. Task Generation: Producers push tasks to the queue as they are generated.
  2. Message Storage: The queue stores tasks in a structured manner (FIFO, priority-based, etc.) and ensures reliable delivery.
  3. Task Consumption: Consumers pull tasks from the queue, process them, and optionally acknowledge completion.
  4. Scalability: New consumers can be added dynamically to handle increased workloads, ensuring the system remains responsive.
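The flow above can be sketched in a few lines. In production the queue would be RabbitMQ, Kafka, or SQS; this in-process sketch uses Python's `queue.Queue` just to show producers, the buffer, and independently scaled consumers working together:

```python
import queue
import threading

task_queue = queue.Queue()  # the message queue: buffer between producers and consumers
results = []

def producer(n_tasks):
    # Producer: pushes tasks to the queue as they are generated.
    for i in range(n_tasks):
        task_queue.put(f"task-{i}")

def consumer():
    # Consumer: pulls tasks from the queue and processes them at its own pace.
    while True:
        task = task_queue.get()
        if task is None:  # sentinel value signals "no more work"
            task_queue.task_done()
            break
        results.append(task.upper())  # stand-in for real processing logic
        task_queue.task_done()

# Scalability: handle more load simply by starting more consumers.
workers = [threading.Thread(target=consumer) for _ in range(3)]
for w in workers:
    w.start()

producer(10)
for _ in workers:        # one sentinel per consumer so all of them stop
    task_queue.put(None)
for w in workers:
    w.join()

print(len(results))
```

Note how the producer never waits for processing to finish; it only enqueues and moves on, which is exactly the decoupling the pattern promises.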

Benefits of the Queue-Based Loading Pattern

  1. Decoupling: Producers and consumers operate independently, reducing tight coupling and improving system maintainability.
  2. Scalability: By adding more consumers, systems can easily scale to handle higher workloads.
  3. Fault Tolerance: If a consumer fails, messages remain in the queue, ensuring no data is lost.
  4. Load Balancing: Tasks are distributed evenly among consumers, preventing any single consumer from becoming a bottleneck.
  5. Asynchronous Processing: Consumers can process tasks in the background, freeing producers to continue generating data without delay.

Issues and Considerations

  1. Rate Limiting: Implement logic to control the rate at which services handle messages to prevent overwhelming the target resource. Test the system under load and adjust the number of queues or service instances to manage demand effectively.
  2. One-Way Communication: Message queues are inherently one-way. If tasks require responses, you may need to implement a separate mechanism for replies.
  3. Autoscaling Challenges: Be cautious when autoscaling consumers, as it can lead to increased contention for shared resources, potentially reducing the effectiveness of load leveling.
  4. Traffic Variability: Consider the variability of incoming traffic to avoid situations where tasks pile up faster than they are processed, creating a perpetual backlog.
  5. Queue Persistence: Ensure your queue is durable and capable of persisting messages. Crashes or system limits could lead to dropped messages, risking data loss.

Use Cases

  1. Email and Notification Systems: Sending bulk emails or push notifications without overloading the main application.
  2. Data Pipelines: Ingesting, transforming, and analyzing large datasets in real-time or batch processing.
  3. Video Processing: Queues facilitate tasks like video encoding and thumbnail generation.
  4. Microservices Communication: Ensures reliable and scalable communication between microservices.

Best Practices

  1. Message Durability: Configure your queue to persist messages to disk, ensuring they are not lost during system failures.
  2. Monitoring and Metrics: Use monitoring tools to track queue lengths, processing rates, and consumer health.
  3. Idempotency: Design consumers to handle duplicate messages gracefully.
  4. Error Handling and Dead Letter Queues (DLQs): Route failed messages to DLQs for later analysis and reprocessing.
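Two of these practices, idempotency and dead letter queues, can be illustrated with a small sketch (the in-memory `processed_ids` set and the `None` payload failure are illustrative stand-ins; a real system would use a persistent store and real error conditions):

```python
processed_ids = set()      # in production: a persistent store (e.g. Redis or a DB table)
dead_letter_queue = []     # failed messages parked for later analysis/reprocessing

def handle_message(msg):
    # Idempotency: skip messages we have already processed (safe against redelivery).
    if msg["id"] in processed_ids:
        return "duplicate"
    try:
        if msg["payload"] is None:  # stand-in for a real processing failure
            raise ValueError("bad payload")
        processed_ids.add(msg["id"])
        return "processed"
    except ValueError:
        # Dead letter queue: route the failed message aside instead of losing it.
        dead_letter_queue.append(msg)
        return "dead-lettered"

print(handle_message({"id": 1, "payload": "a"}))   # processed
print(handle_message({"id": 1, "payload": "a"}))   # duplicate (redelivery is harmless)
print(handle_message({"id": 2, "payload": None}))  # dead-lettered
```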

Learning Notes #25 – Valet Key Pattern | Cloud Patterns

1 January 2025 at 17:20

Today, I learnt about the Valet Key Pattern, which lets clients access resources directly, without routing through the server, by using a token. In this blog, I jot down notes on the Valet Key Pattern for better understanding.

The Valet Key Pattern is a security design pattern used to provide limited access to a resource or service without exposing full access credentials or permissions. It is akin to a physical valet key for a car, which allows the valet to drive the car without accessing the trunk or glove box. This pattern is widely employed in distributed systems, cloud services, and API design to ensure secure and controlled resource sharing.

Why Use the Valet Key Pattern?

Modern systems often require sharing access to specific resources while minimizing security risks. For instance:

  • A mobile app needs to upload files to a storage bucket but shouldn’t manage the entire bucket.
  • A third-party service requires temporary access to a user’s resource, such as a document or media file.
  • A system needs to allow time-bound or operation-restricted access to sensitive data.

In these scenarios, the Valet Key Pattern provides a practical solution by issuing a scoped, temporary, and revocable token (valet key) that grants specific permissions.

Core Principles of the Valet Key Pattern

  1. Scoped Access: The valet key grants access only to specific resources or operations.
  2. Time-Limited: The access token is typically valid for a limited duration to minimize exposure.
  3. Revocable: The issuing entity can revoke the token if necessary.
  4. Minimal Permissions: Permissions are restricted to the least privilege required to perform the intended task.

How the Valet Key Pattern Works

1. Resource Owner Issues a Valet Key

The resource owner (or controlling entity) generates a token with limited permissions. This token is often a signed JSON Web Token (JWT) or a pre-signed URL in the case of cloud storage.

2. Token Delivery to the Client

The token is securely delivered to the client or third-party application requiring access. For instance, the token might be sent via HTTPS or embedded in an API response.

3. Client Uses the Valet Key

The client includes the token in subsequent requests to access the resource. The resource server validates the token, checks its permissions, and allows or denies the requested operation accordingly.

4. Expiry or Revocation

Once the token expires or is revoked, it becomes invalid, ensuring the client can no longer access the resource.

Examples of the Valet Key Pattern in Action

1. Cloud Storage (Pre-signed URLs)

Amazon S3, Google Cloud Storage, and Azure Blob Storage allow generating pre-signed URLs that enable temporary, scoped access to specific files. For example, a user can upload a file using a URL valid for 15 minutes without needing direct access credentials.

2. API Design

APIs often issue temporary access tokens for limited operations. OAuth 2.0 tokens, for instance, can be scoped to allow access to specific endpoints or resources.

3. Media Sharing Platforms

Platforms like YouTube or Dropbox use the Valet Key Pattern to provide limited access to files. A shareable link often embeds permissions and expiration details.

Implementation Steps

1. Define Permissions Scope

Identify the specific operations or resources the token should allow. Use the principle of least privilege to limit permissions.

2. Generate Secure Tokens

Create tokens with cryptographic signing to ensure authenticity. Include metadata such as:

  • Resource identifiers
  • Permissions
  • Expiry time
  • Issuer information

3. Validate Tokens

The resource server must validate incoming tokens by checking the signature, expiration, and permissions.

4. Monitor and Revoke

Maintain a mechanism to monitor token usage and revoke them if misuse is detected.

Best Practices

  1. Use HTTPS: Always transmit tokens over secure channels to prevent interception.
  2. Minimize Token Lifetime: Short-lived tokens reduce the risk of misuse.
  3. Implement Auditing: Log token usage for monitoring and troubleshooting.
  4. Employ Secure Signing: Use robust cryptographic algorithms to sign tokens and prevent tampering.

Challenges

  • Token Management: Requires robust infrastructure for token generation, validation, and revocation.
  • Revocation Delays: Invalidation mechanisms may not instantly propagate in distributed systems.
