
Golden Feedbacks for Python Sessions 1.0 from last year (2024)

13 February 2025 at 08:49

Many thanks to Shrini for documenting it last year. This serves as a good reference to improve my skills. Hope it helps many.

📢 What Participants Wanted to Improve

🚶‍♂️ Go a bit slower so that everyone can understand clearly without feeling rushed.
📚 Provide more basics and examples to make learning easier for beginners.
🖥 Spend the first week explaining programming basics so that newcomers don’t feel lost.
📊 Teach flowcharting methods to help participants understand the logic behind coding.
🕹 Try teaching Scratch as an interactive way to introduce programming concepts.
🗓 Offer weekend batches for those who prefer learning on weekends.
🗣 Encourage more conversations so that participants can actively engage in discussions.
👥 Create sub-groups to allow participants to collaborate and support each other.
🎉 Get “cheerleaders” within the team to make the classes more fun and interactive.
📢 Increase promotion efforts to reach a wider audience and get more participants.
🔍 Provide better examples to make concepts easier to grasp.
❓ Conduct more Q&A sessions so participants can ask and clarify their doubts.
🎙 Ensure that each participant gets a chance to speak and express their thoughts.
📹 Show your face in videos to build a more personal connection with the learners.
🏆 Organize mini-hackathons to provide hands-on experience and encourage practical learning.
🔗 Foster more interactions and connections between participants to build a strong learning community.
✍ Encourage participants to write blogs daily to document their learning and share insights.
🎤 Motivate participants to give talks in class and other communities to build confidence.

📝 Other Learnings & Suggestions

📵 Avoid creating WhatsApp groups for communication, as the 1024-member limit makes it difficult to manage multiple groups.
✉ Telegram works fine for now, but explore using mailing lists as an alternative for structured discussions.
🔕 Mute groups when necessary to prevent unnecessary messages like “Hi, Hello, Good Morning.”
📢 Teach participants how to join mailing lists like ChennaiPy and KanchiLUG and guide them on asking questions in forums like Tamil Linux Community.
📝 Show participants how to create a free blog on platforms like dev.to or WordPress to share their learning journey.
🛠 Avoid spending too much time explaining everything in depth; participants should start coding a small project by the 5th or 6th class.
📌 Present topics as solutions to project ideas or real-world problem statements instead of just theory.
👤 Encourage using names when addressing people, rather than “Sir” or “Madam,” to maintain an equal and friendly learning environment.
💸 Zoom is costly, and since only around 50 people complete the training, consider alternatives like Jitsi or Google Meet for better cost-effectiveness.

Will try to incorporate these learnings in our upcoming sessions.

🚀 Let’s make this learning experience engaging, interactive, and impactful! 🎯

A Step-by-Step Guide to LLM Function Calling in Python

By: angu10
12 February 2025 at 23:06

Function calling allows Claude to interact with external functions and tools in a structured way. This guide will walk you through implementing function calling with Claude using Python, complete with examples and best practices.

Prerequisites

To get started, you'll need:

  • Python 3.7+
  • anthropic Python package
  • A valid API key from Anthropic

Basic Setup

from anthropic import Anthropic
import json
# Initialize the client
anthropic = Anthropic(api_key='your-api-key')

Defining Functions

function_schema = {
    "name": "get_weather",
    "description": "Get the current weather for a specific location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "City name or coordinates"
            },
            "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "Temperature unit"
            }
        },
        "required": ["location"]
    }
}

Making Function Calls

def get_weather(location, unit="celsius"):
    # Mock implementation - swap in a call to a real weather API here
    return {
        "location": location,
        "temperature": 22 if unit == "celsius" else 72,
        "conditions": "sunny"
    }

def process_function_call(name, arguments):
    try:
        # Dispatch to the appropriate function
        if name == "get_weather":
            return json.dumps(get_weather(**arguments))
        raise ValueError(f"Unknown function: {name}")
    except Exception as e:
        return json.dumps({"error": str(e)})

# Example conversation with function calling
messages = [
    {
        "role": "user",
        "content": "What's the weather like in Paris?"
    }
]
while True:
    response = anthropic.messages.create(
        model="claude-3-5-haiku-latest",
        max_tokens=1024,
        messages=messages,
        tools=[function_schema]
    )
    # Check if Claude wants to call a function
    if response.stop_reason == "tool_use":
        # Keep Claude's turn (including its tool_use block) in the history
        messages.append({"role": "assistant", "content": response.content})
        for block in response.content:
            if block.type == "tool_use":
                # Execute the function
                result = process_function_call(block.name, block.input)
                # Add the function result to the conversation
                messages.append({
                    "role": "user",
                    "content": [{
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result
                    }]
                })
    else:
        # Normal response - print and break
        print(response.content[0].text)
        break

Best Practices

  1. Clear Function Descriptions
  • Write detailed descriptions for your functions
  • Specify parameter types and constraints clearly
  • Include examples in the descriptions when helpful
  2. Input Validation
  • Validate all function inputs before processing
  • Return meaningful error messages
  • Handle edge cases gracefully
  3. Response Formatting
  • Return consistent JSON structures
  • Include status indicators in responses
  • Format error messages uniformly
  4. Security Considerations

  • Validate and sanitize all inputs
  • Implement rate limiting if needed
  • Use appropriate authentication
  • Don't expose sensitive information in function descriptions

Conclusion

Function calling with Claude enables powerful integrations between the language model and external tools. By following these best practices and implementing proper error handling, you can create robust and reliable function-calling implementations.

Effortless Git Repo Switching with a Simple Bash Function!

By: krishna
12 February 2025 at 10:36

Why I Created This Function

At first, I used an alias

alias gitdir="cd ~/Git/"

(This means gitdir switches to the ~/Git/ directory, but I wanted it to switch directly to a repository.)
So, I wrote a Bash function.

Write Code to .bashrc File

The .bashrc file runs when a new terminal window is opened.
So, we need to write the function inside this file.

Code

gitrepo() {
    # Exact match: the repository name passed as $1 exists under ~/Git
    repoList=$(ls "$HOME/Git")
    if [ -n "$(echo "$repoList" | grep -w "$1")" ]; then
        cd "$HOME/Git/$1"
    else
        # Relevant match: first case-insensitive hit
        getRepoName=$(echo "$repoList" | grep -i -m 1 "$1")

        if [ -n "$getRepoName" ]; then
            cd "$HOME/Git/$getRepoName"
        else
            echo "Repository Not Found"
            cd "$HOME/Git"
        fi

    fi
}
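
A quick way to test it (assuming a repository named TerminalTyping exists under ~/Git):

source ~/.bashrc          # reload the config so the new function is available
gitrepo TerminalTyping    # exact match: jumps to ~/Git/TerminalTyping
gitrepo terminal          # relevant match: first case-insensitive hit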

Code Explanation

The $repoList variable stores the list of directories inside the Git folder.

Function Logic Has Two Parts:

  • Exact Match
  • Relevant Match

Exact Match

if [ -n "$(echo "$repoList" | grep -w "$1")" ]; then
    cd "$HOME/Git/$1"

The if condition pipes $repoList into grep:

  • grep -w matches only whole words.
  • $1 is the first argument passed to the function.
  • -n checks that a string is not empty. Example syntax:
    [ "$var" != "" ] is equivalent to [ -n "$var" ]

Relevant Match

getRepoName=$(echo "$repoList" | grep -i -m 1 "$1")
if [ -n "$getRepoName" ]; then
    cd "$HOME/Git/$getRepoName"

Relevant search: If no Exact Match is found, this logic is executed next.

getRepoName="$repoList" | grep -i -m 1 $1
  • -i ignores case sensitivity.
  • -m 1 returns only the first match.

Example of -m with grep:
ls | grep i3
It returns i3WM and i3status, but -m 1 ensures only i3WM is selected.

No Match

If no match is found, it simply changes the directory to the Git folder.

else
    echo "Repository Not Found"
    cd "$HOME/Git"

What I Learned

  • Basics of Bash functions
  • How to use .bashrc and reload changes.

📢 Python Learning 2.0 in Tamil – Call for Participants! 🚀

10 February 2025 at 07:58

After an incredible year of Python learning (watch our journey here), we’re back with an all-new approach for 2025!

If you haven’t subscribed to our channel yet, don’t miss out – support us by subscribing.

This time, we’re shifting gears from theory to practice with mini projects that will help you build real-world solutions. Study materials will be shared beforehand, and you’ll work hands-on to solve practical problems building actual projects that showcase your skills.

🔑 What’s New?

✅ Real-world mini projects
✅ Task-based shortlisting process
✅ Limited seats for focused learning
✅ Dedicated WhatsApp group for discussions & mentorship
✅ Live streaming of sessions for wider participation
✅ Study materials, quizzes, surprise gifts, and more!

📋 How to Join?

  1. Fill out the RSVP form below – open for 20 days (till March 2) only!
  2. After RSVP closes, shortlisted participants will receive tasks via email.
  3. Complete the tasks to get shortlisted.
  4. Selected students will be added to an exclusive WhatsApp group for intensive training.
  5. It’s COST-FREE learning. We only ask for your time, effort, and support.
  6. Course start date will be announced after RSVP.

📜 RSVP Form

☎ How to Contact Us for Queries

If you have any queries, feel free to message on WhatsApp, Telegram, or Signal at 9176409201.

You can also mail me at learnwithjafer@gmail.com

Follow us for more opportunities, updates, and more…

Don’t miss this chance to level up your Python skills Cost Free with hands-on projects and exciting rewards! RSVP now and be part of Python Learning 2.0! 🚀

Our Previous Monthly meets – https://www.youtube.com/watch?v=cPtyuSzeaa8&list=PLiutOxBS1MizPGGcdfXF61WP5pNUYvxUl&pp=gAQB

Our Previous Sessions,

Postgres – https://www.youtube.com/watch?v=04pE5bK2-VA&list=PLiutOxBS1Miy3PPwxuvlGRpmNo724mAlt&pp=gAQB

Python – https://www.youtube.com/watch?v=lQquVptFreE&list=PLiutOxBS1Mizte0ehfMrRKHSIQcCImwHL&pp=gAQB

Docker – https://www.youtube.com/watch?v=nXgUBanjZP8&list=PLiutOxBS1Mizi9IRQM-N3BFWXJkb-hQ4U&pp=gAQB

Note: If you wish to support me for this initiative please share this with your friends, students and those who are in need.

Learning Notes #70 – RUFF An extremely fast Python linter and code formatter, written in Rust.

9 February 2025 at 11:00

In the field of Python development, maintaining clean, readable, and efficient code is essential.

Ruff is a fast linter and code formatter designed to boost code quality and developer productivity. Written in Rust, it stands out for its blazing speed and comprehensive feature set.

This blog will delve into Ruff’s features, usage, and how it compares to other popular Python linters and formatters like flake8, pylint, and black.

What is Ruff?

Ruff is an extremely fast Python linter and code formatter that provides linting, code formatting, and static code analysis in a single package. It supports a wide range of rules out of the box, covering various Python standards and style guides.

Key Features of Ruff

  1. Lightning-fast Performance: Written in Rust, Ruff is significantly faster than traditional Python linters.
  2. All-in-One Tool: Combines linting, formatting, and static analysis.
  3. Extensive Rule Support: Covers rules from flake8, isort, pyflakes, pylint, and more.
  4. Customizable: Allows configuration of rules to fit specific project needs.
  5. Seamless Integration: Works well with CI/CD pipelines and popular code editors.

Installing Ruff


# Using pip
pip install ruff

# Using Homebrew (macOS/Linux)
brew install ruff

# Using UV
uv add ruff

Basic Usage

1. Linting a python file

# Lint a single file
ruff check app.py

# Lint an entire directory
ruff check src/

2. Auto Fixing Issues

ruff check src/ --fix

3. Formatting Code

While Ruff primarily focuses on linting, it also handles some formatting tasks

ruff format src/

Configuration

Ruff can be configured using a pyproject.toml file

[tool.ruff]
line-length = 88
exclude = ["migrations"]
select = ["E", "F", "W"]  # Enable specific rule categories
ignore = ["E501"]          # Ignore specific rules

Examples

import sys
import os

print("Hello World !")


def add(a, b):
    result = a + b
    return a

x= 1
y =2
print(x+y)

def append_to_list(value, my_list=[]):
    my_list.append(value)
    return my_list

def append_to_list(value, my_list=[]):
    my_list.append(value)
    return my_list

  1. Identifying Unused Imports
  2. Auto-fixing Imports
  3. Sorting Imports
  4. Detecting Unused Variables
  5. Enforcing Code Style (PEP 8 Violations)
  6. Detecting Mutable Default Arguments
  7. Fixing Line Length Issues
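
After acting on those findings – Ruff auto-fixes some of them (for example, unused imports via ruff check --fix), while others such as mutable default arguments are flagged for manual fixes – the sample file might look roughly like this:

print("Hello World!")


def add(a, b):
    result = a + b
    return result


x = 1
y = 2
print(x + y)


def append_to_list(value, my_list=None):
    # A fresh list per call avoids the shared mutable default pitfall
    if my_list is None:
        my_list = []
    my_list.append(value)
    return my_list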

Integrating Ruff with Pre-commit

To ensure code quality before every commit, integrate Ruff with pre-commit

Step 1: Install Pre-Commit

pip install pre-commit

Step 2: Create a .pre-commit-config.yaml file

repos:
  - repo: https://github.com/charliermarsh/ruff-pre-commit
    rev: v0.1.0  # Use the latest version
    hooks:
      - id: ruff

Step 3: Install the Pre-commit Hook

pre-commit install

Step 4: Test the Hook

pre-commit run --all-files

This setup ensures that Ruff automatically checks your code for linting issues before every commit, maintaining consistent code quality.

When to Use Ruff

  • Large Codebases: Ideal for projects with thousands of files due to its speed.
  • CI/CD Pipelines: Reduces linting time, accelerating build processes.
  • Code Reviews: Ensures consistent coding standards across teams.
  • Open Source Projects: Simplifies code quality management.
  • Pre-commit Hooks: Ensures code quality before committing changes.

Integrating Ruff with CI/CD

name: Lint Code

on: [push, pull_request]

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v2
    - name: Set up Python
      uses: actions/setup-python@v2
      with:
        python-version: '3.10'
    - name: Install Ruff
      run: pip install ruff
    - name: Lint Code
      run: ruff check .

Ruff is a game-changer in the Python development ecosystem. Its unmatched speed, comprehensive rule set, and ease of use make it a powerful tool for developers aiming to maintain high code quality.

Whether you’re working on small scripts or large-scale applications, Ruff can streamline your linting and formatting processes, ensuring clean, efficient, and consistent code.

Python variable

By: krishna
7 February 2025 at 06:34

This blog explores Python variable usage and functionalities.

Store All Data Types

First, let’s list the data types supported in Python.

Python Supported Data Types

  • Primitive Data Types:
    • Integer
    • Float
    • String
    • Boolean
    • None
  • Non-Primitive Data Types:
    • List
    • Tuple
    • Set
    • Dictionary
    • Bytes and ByteArray
    • Complex
    • Class Object

Store and Experiment with All Data Types

phone = 1234567890
pi    = 3.14
name  = "python programming"
using_python   = True
example_none   = None
simple_list    = [1,2,3]
simple_dict    = {"name":"python"}
simple_set     = {1,1,1,1,1.0}
simple_tuple   = (1,2,3)
complex_number = 3+5j

print("number     = ",phone)
print("float      = ",pi)
print("boolean    = ",using_python)
print("None       = ",example_none)
print("list       = ",simple_list)
print("dictionary = ",simple_dict)
print("set        = ",simple_set)
print("tuple      = ",simple_tuple)
print("complex    = ",complex_number)

Output

number     =  1234567890
float      =  3.14
boolean    =  True
None       =  None
list       =  [1, 2, 3]
dictionary =  {'name': 'python'}
set        =  {1}
tuple      =  (1, 2, 3)
complex    =  (3+5j)

Type Function

This function is used to print the data type of a variable.

print(type(phone))
print(type(pi))
print(type(using_python))
print(type(simple_list))
print(type(simple_dict))
print(type(simple_set))
print(type(simple_tuple))
print(type(complex_number))
print(type(example_none))

Output

<class 'int'>
<class 'float'>
<class 'bool'>
<class 'list'>
<class 'dict'>
<class 'set'>
<class 'tuple'>
<class 'complex'>
<class 'NoneType'>

Bytes and ByteArray

These data types are useful for modifying large binary datasets, such as in audio and image processing.

The memoryview() function allows reading and modifying this data without copying it (zero-copy access to memory).

bytes is immutable, whereas bytearray is mutable.

bytes

b1 = bytes([1,2,3,4])
bo = memoryview(b1)
print("byte example :",bo[0])

Output

byte example : 1
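
Because bytes is immutable, a memoryview over a bytes object is read-only, so writing through it fails. A quick sketch:

b1 = bytes([1, 2, 3, 4])
bo = memoryview(b1)
# bo[0] = 5  # raises TypeError: cannot modify read-only memory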

bytearray

b2 = bytearray(b"aaaa")
bo = memoryview(b2)
print("byte array :",b2)
bo[1] = 98
bo[2] = 99
bo[3] = 100
print("byte array :",b2)

Output

byte array : bytearray(b'aaaa')
byte array : bytearray(b'abcd')

Other Data Types

Frozenset

A frozenset is similar to a normal set, but it is immutable.

test_frozenset = frozenset([1,2,1])
for i in test_frozenset:
    print("frozen set item :",i)

Output

frozen set item : 1
frozen set item : 2

Range

The range data type specifies a number range (e.g., 1-10).

It is mostly used in loops and mathematical operations.

a = range(3)
print("a = ",a)
print("type = ",type(a))

Output

a =  range(0, 3)
type =  <class 'range'>

Type Casting

Type casting converts a value from one data type to another. Try explicit type casting to change data types.

value = 1
numbers = [1,2,3]
print("type casting int     : ",type(int(value)))
print("type casting float   : ",type(float(value)))
print("type casting string  : ",type(str(value)))
print("type casting boolean : ",type(bool("True")))
print("type casting list    : ",type(list(numbers)))
print("type casting set     : ",type(set(numbers)))
print("type casting tuple   : ",type(tuple(numbers)))

Output

type casting int     :  <class 'int'>
type casting float   :  <class 'float'>
type casting string  :  <class 'str'>
type casting boolean :  <class 'bool'>
type casting list    :  <class 'list'>
type casting set     :  <class 'set'>
type casting tuple   :  <class 'tuple'>

Delete Variable

Delete an existing variable using Python’s del keyword.

temp = 1
print("temp variable is : ",temp)
del temp
# print(temp)  => throw NameError : temp not defined

Output

temp variable is :  1

Find Variable Memory Address

Use the id() function to find the memory address of a variable (in CPython, id() returns the object’s memory address).

temp = "hi"
print("address of temp variable : ",id(temp))

Output

address of temp variable :  140710894284672

Constants

Python does not have a direct keyword for constants, but namedtuple can be used to create constants.

from collections import namedtuple
const = namedtuple("const",["PI"])
math = const(3.14)

print("namedtuple PI = ",math.PI)
print("namedtuple type =",type(math))

Output

namedtuple PI =  3.14
namedtuple type = <class '__main__.const'>

Global Keyword

Before understanding the global keyword, understand function-scoped variables.

  • Variables created inside a function are stored in stack memory.
  • A function can read an outer (global) variable directly, but cannot reassign it.
  • To reassign a global variable inside a function, use the global keyword.

message = "Hi"
def dummy():
    global message 
    message = message+" all"

dummy()
print(message)

Output

Hi all

Explicit Type Hint

Type hints are mostly used for function parameters and return values.

They improve code readability and help developers understand variable types.

-> float indicates the return data type.

def area_of_circle(radius :float) -> float:
    PI :float = 3.14
    return PI * (radius * radius)

print("calculate area of circle = ",area_of_circle(2))

Output

calculate area of circle =  12.56

20 Essential Git Command-Line Tricks Every Developer Should Know

5 February 2025 at 16:14

Git is a powerful version control system that every developer should master. Whether you’re a beginner or an experienced developer, knowing a few handy Git command-line tricks can save you time and improve your workflow. Here are 20 essential Git tips and tricks to boost your efficiency.

1. Undo the Last Commit (Without Losing Changes)

git reset --soft HEAD~1

If you made a commit but want to undo it while keeping your changes, this command resets the last commit but retains the modified files in your staging area.

This is useful when you realize you need to make more changes before committing.

If you also want to remove the changes from the staging area but keep them in your working directory, use,

git reset HEAD~1

2. Discard Unstaged Changes

git checkout -- <file>

Use this to discard local changes in a file before staging. Be careful, as this cannot be undone! If you want to discard all unstaged changes in your working directory, use,

git reset --hard HEAD

3. Delete a Local Branch

git branch -d branch-name

Removes a local branch safely if it’s already merged. If it’s not merged and you still want to delete it, use -D

git branch -D branch-name

4. Delete a Remote Branch

git push origin --delete branch-name

Deletes a branch from the remote repository, useful for cleaning up old feature branches. If you mistakenly deleted the branch and want to restore it, you can use

git checkout -b branch-name origin/branch-name

if it still exists remotely.

5. Rename a Local Branch

git branch -m old-name new-name

Useful when you want to rename a branch locally without affecting the remote repository. To update the remote reference after renaming, push the renamed branch and delete the old one,

git push origin -u new-name
git push origin --delete old-name

6. See the Commit History in a Compact Format

git log --oneline --graph --decorate --all

A clean and structured way to view Git history, showing branches and commits in a visual format. If you want to see a detailed history with diffs, use

git log -p

7. Stash Your Changes Temporarily

git stash

If you need to switch branches but don’t want to commit yet, stash your changes and retrieve them later with

git stash pop

To see all stashed changes

git stash list

8. Find the Author of a Line in a File

git blame file-name

Shows who made changes to each line in a file. Helpful for debugging or reviewing historical changes. If you want to ignore whitespace changes

git blame -w file-name

9. View a File from a Previous Commit

git show commit-hash:path/to/file

Useful for checking an older version of a file without switching branches. If you want to restore the file from an old commit

git checkout commit-hash -- path/to/file

10. Reset a File to the Last Committed Version

git checkout HEAD -- file-name

Restores the file to the last committed state, removing any local changes. If you want to reset all files

git reset --hard HEAD

11. Clone a Specific Branch

git clone -b branch-name --single-branch repository-url

Instead of cloning the entire repository, this fetches only the specified branch, saving time and space. If you want all branches but don’t want to check them out initially:

git clone --mirror repository-url

12. Change the Last Commit Message

git commit --amend -m "New message"

Use this to correct a typo in your last commit message before pushing. Be cautious—if you’ve already pushed, use

git push --force-with-lease

13. See the List of Tracked Files

git ls-files

Displays all files being tracked by Git, which is useful for auditing your repository. To see ignored files

git ls-files --others --ignored --exclude-standard

14. Check the Difference Between Two Branches

git diff branch-1..branch-2

Compares changes between two branches, helping you understand what has been modified. To see only file names that changed

git diff --name-only branch-1..branch-2

15. Add a Remote Repository

git remote add origin repository-url

Links a remote repository to your local project, enabling push and pull operations. To verify remote repositories

git remote -v

16. Remove a Remote Repository

git remote remove origin

Unlinks your repository from a remote source, useful when switching remotes.

17. View the Last Commit Details

git show HEAD

Shows detailed information about the most recent commit, including the changes made. To see only the commit message

git log -1 --pretty=%B

18. Check What’s Staged for Commit

git diff --staged

Displays changes that are staged for commit, helping you review before finalizing a commit.

19. Fetch and Rebase from a Remote Branch

git pull --rebase origin main

Combines fetching and rebasing in one step, keeping your branch up-to-date cleanly. If conflicts arise, resolve them manually and continue with

git rebase --continue

20. View All Git Aliases

git config --global --list | grep alias

If you’ve set up aliases, this command helps you see them all. Aliases can make your Git workflow faster by shortening common commands. For example

git config --global alias.co checkout

allows you to use git co instead of git checkout.

Try these tricks in your daily development to level up your Git skills!

Learn REST-API

By: krishna
5 February 2025 at 06:01

REST API

A REST API (Representational State Transfer) is a web service that allows different applications to communicate over the internet using standard HTTP methods like GET, POST, PUT, and DELETE. It follows REST principles, making it lightweight, scalable, and easy to use.

REST API Principles

Client-Server Architecture

A REST API follows a client-server architecture, where the client and server are separate. The client sends requests, and the server processes them and responds, allowing different clients like web and mobile apps to communicate with the same backend.

Statelessness

Before understanding statelessness, you need to understand statefulness.

  • Statefulness: The server stores and manages user session data, such as authentication details and recent activities.
  • Statelessness: The server does not store any information. Each request is independent, and the client must include all necessary data, like authentication details and query parameters, in every request.

This behavior makes the server scalable, reliable, and reduces server load.

Caching

Caching improves performance by storing responses that can be reused, reducing the need for repeated requests to the server. This minimizes response time and server load.

Uniform Interface

A uniform interface ensures consistency by using standard HTTP methods like GET, POST, PUT, and DELETE. This makes the API predictable and easy to use.

Layered System

A layered system divides tasks into separate layers, like security, caching, and user requests. The client only interacts with the top layer, making the system easier to manage and more flexible.

Start To Code

I use Node.js and some popular packages:

  • Express: A Node.js web framework used for building web applications and APIs.
  • Joi: A package used to validate user input, ensuring data integrity and security.

basic code

const express = require("express");
const app = express();
const joi = require("joi");
app.use(express.json());

//data
const customers = [
    {name : "user1", id : 1},
    {name : "user2", id : 2},
    {name : "user3", id : 3}
]

//listen
const port = process.env.PORT || 8080;
app.listen(port, ()=> console.log("listening on ",port));

//function
function validateUserName(customer){
    const schema = joi.object({
	name : joi.string().min(3).required()
    });
    return schema.validate(customer)
}

GET

GET is used to retrieve data from the server. The response code is 200 if successful.

app.get('/api/customers',(req,res)=>{
    res.send(customers);
});

get specific user details

app.get('/api/customer/:id', (req,res)=>{
    const user_details = customers.find(user => req.params.id == user.id );
    if(!user_details){
        res.status(404).send("Data Not Found");
    }else{
        res.status(200).send(user_details)
    }
});

POST

The POST method is used to upload data to the server. The response code is 201, and I used the validateUserName function to validate a username.

app.post('/api/customer/add',(req, res)=>{
    const {error} = validateUserName(req.body);
    if(error){	
	res.status(400).send(error.details[0].message);
    }
    else{
	const customer = {
	    name : req.body.name,
	    id   : customers.length + 1
	}
	customers.push(customer);
	res.status(201).send("data inserted successfully");
    }
});


PATCH

The PATCH method is used to partially update existing data. To replace the entire user record, use the PUT method (a sketch follows the PATCH example below).

app.patch('/api/customer/:id', (req, res)=>{
    const customer = customers.find(user => user.id == req.params.id);
    const {error} = validateUserName(req.body);
    if(!customer){
	res.status(404).send("Data Not Found");
    }
    else if(error){
	console.log(error)
	res.status(400).send(error.details[0].message);
    }
    else{
	customer.name = req.body.name;
	res.status(200).send("successfully updated");
    }
});
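
For completeness, a PUT handler that replaces the whole record could look like the following. This is a sketch in the same style, not from the original post; with only a name field, it differs from PATCH mainly in intent (full replacement vs partial update).

app.put('/api/customer/:id', (req, res)=>{
    const customer = customers.find(user => user.id == req.params.id);
    const {error} = validateUserName(req.body);
    if(!customer){
        res.status(404).send("Data Not Found");
    }
    else if(error){
        res.status(400).send(error.details[0].message);
    }
    else{
        // Replace the entire resource, keeping only the id
        customer.name = req.body.name;
        res.status(200).send("successfully replaced");
    }
});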


DELETE

The DELETE method is used to remove user data.

app.delete('/api/customer/:id', (req,res)=>{
    const user = customers.find(user => user.id == req.params.id);
    const index = customers.indexOf(user);
    if(!user){
	console.log("test")
	res.status(404).send("Data Not Found");
    }
    else{
	customers.splice(index,1);
	res.status(200).send("successfully deleted");
    }
});


What I Learned

CRUD Operations with REST API

I learned the basics of REST APIs and CRUD operations, including the standard HTTP methods GET, POST, PUT, PATCH, and DELETE.

Status Codes

REST APIs follow standard HTTP status codes:

  • 200 – OK
  • 201 – Created successfully
  • 400 – Bad request
  • 204 – No content
  • 404 – Page not found

Joi Package

For server-side validation, the Joi package is used. It helps verify user data easily.

Middleware

Using app.use(express.json()) as middleware ensures that for POST, PATCH, and PUT methods, JSON-formatted user data is parsed into an object accessible via req.body.


Minimal Typing Practice Application in Python

By: krishna
30 January 2025 at 09:40

Introduction

This is a Python-based single-file application designed for typing practice. It provides a simple interface to improve typing accuracy and speed. Over time, this minimal program has gradually increased my typing skill.

What I Learned from This Project

  • 2D Array Validation
    I first used a simple 1D array to store user input, but noticed some issues. After implementing a 2D array, I understood why it was more appropriate for handling user input.
  • Tkinter
    I wanted to visually see and update correct, wrong, and incomplete typing inputs, but I didn’t know how to implement that in the terminal. So, I used a simple Tkinter GUI window.

Run This Program

It depends on the following applications:

  • Python 3
  • python3-tk

Installation Command on Debian-Based Systems

sudo apt install python3 python3-tk

Clone repository and run program

git clone https://github.com/github-CS-krishna/TerminalTyping
cd TerminalTyping
python3 terminalType.py

Links

For more details, refer to the README documentation on GitHub.

This will help you understand how to use it.

Source code (GitHub)

SelfHost #2 | BugSink – An Error Tracking Tool

26 January 2025 at 16:41

I am a regular follower of https://selfh.st/ . Last week they showcased BugSink, a self-hostable tool to track errors in your applications. It’s easy to install and use, is compatible with the Sentry SDK, and is scalable and reliable.

When an application breaks, finding and fixing the root cause quickly is critical. Hosted error tracking tools often make you trade privacy for convenience, and they can be expensive. On the other hand, self-hosted solutions are an alternative, but they are often a pain to set up and maintain.

What Is Error Tracking?

When code is deployed in production, errors are inevitable. They can arise from a variety of reasons like bugs in the code, network failures, integration mismatches, or even unforeseen user behavior. To ensure smooth operation and user satisfaction, error tracking is essential.

Error tracking involves monitoring and recording errors in your application code, particularly in production environments. A good error tracker doesn’t just log errors; it contextualizes them, offering insights that make troubleshooting straightforward.

Here are the key benefits of error tracking

  • Early Detection: Spot issues before they snowball into critical outages.
  • Context-Rich Reporting: Understand the “what, when, and why” of an error.
  • Faster Debugging: Detailed stack traces make it easier to pinpoint root causes.

Effective error tracking tools allow developers to respond to errors proactively, minimizing user impact.

Why Bugsink?

Bugsink takes error tracking to a new level by prioritizing privacy, simplicity, and compatibility.

1. Built for Self-Hosting

Unlike many hosted error tracking tools that require sensitive data to be shared with third-party servers, Bugsink is self-hosted. This ensures you retain full control over your data, a critical aspect for privacy-conscious teams.

2. Easy to Set Up and Manage

Whether you’re deploying it on your local server or in the cloud, the experience is smooth.

3. Resource Efficiency

Bugsink is designed to be lightweight and efficient. It doesn’t demand hefty server resources, making it an ideal choice for startups, small teams, or resource-constrained environments.

4. Compatible with Sentry

If you’ve used Sentry before, you’ll feel right at home with Bugsink. It offers Sentry compatibility, allowing you to migrate effortlessly or use it alongside existing tools. This compatibility also means you can leverage existing SDKs and integrations.

5. Proactive Notifications

Bugsink ensures you’re in the loop as soon as something goes wrong. Email notifications alert you the moment an error occurs, enabling swift action. This proactive approach reduces the mean time to resolution (MTTR) and keeps users happy.

Docs: https://www.bugsink.com/docs/

In this blog, I jot down my experience using BugSink with Python.

1. Run using Docker

There are many ways proposed for BugSink installation (https://www.bugsink.com/docs/installation/). In this blog, I try the Docker route.


docker pull bugsink/bugsink:latest

docker run \
  -e SECRET_KEY=ab4xjs5wfnP2XrUwRJPtmk1sEnMcx9d2mta8vtbdZ4oOtvy5BJ \
  -e CREATE_SUPERUSER=admin:admin \
  -e PORT=8000 \
  -p 8000:8000 \
  bugsink/bugsink

2. Log In, Create a Team, Project

The Application will run at port 8000.

Login using admin/admin. Create a new team, by clicking the top right button.

Give a name to the team,

then create a project, under this team,

After creating a project, you will be able to see like below,

You will get an individual DSN, like http://9d0186dd7b854205bed8d60674f349ea@localhost:8000/1.

3. Attaching DSN to python app



import sentry_sdk

sentry_sdk.init(
    "http://d76bc0ccf4da4423b71d1fa80d6004a3@localhost:8000/1",

    send_default_pii=True,
    max_request_body_size="always",
    traces_sample_rate=0,
)

def divide(num1, num2):
    return num1/num2

divide(1, 0)


The above program will throw a ZeroDivisionError, which will be reflected in the BugSink application.

The best part is you will get the value of variables at that instance. In this example, you can see values of num1 and num2.

There are lot more awesome features out there https://www.bugsink.com/docs/.

RSVP for RabbitMQ: Build Scalable Messaging Systems in Tamil

24 January 2025 at 11:21

Hi All,

Invitation to RabbitMQ Session

🔹 Topic: RabbitMQ: Asynchronous Communication
🔹 Date: Feb 2 Sunday
🔹 Time: 10:30 AM to 1 PM
🔹 Venue: Online. Will be shared in mail after RSVP.

Join us for an in-depth session on RabbitMQ in தமிழ், where we’ll explore,

  • Message queuing fundamentals
  • Connections, channels, and virtual hosts
  • Exchanges, queues, and bindings
  • Publisher confirmations and consumer acknowledgments
  • Use cases and live demos

Whether you’re a developer, DevOps enthusiast, or curious learner, this session will empower you with the knowledge to build scalable and efficient messaging systems.

📌 Don’t miss this opportunity to level up your messaging skills!

RSVP closed !

Our Previous Monthly Meets – https://www.youtube.com/watch?v=cPtyuSzeaa8&list=PLiutOxBS1MizPGGcdfXF61WP5pNUYvxUl&pp=gAQB

Our Previous Sessions,

  1. Python – https://www.youtube.com/watch?v=lQquVptFreE&list=PLiutOxBS1Mizte0ehfMrRKHSIQcCImwHL&pp=gAQB
  2. Docker – https://www.youtube.com/watch?v=nXgUBanjZP8&list=PLiutOxBS1Mizi9IRQM-N3BFWXJkb-hQ4U&pp=gAQB
  3. Postgres – https://www.youtube.com/watch?v=04pE5bK2-VA&list=PLiutOxBS1Miy3PPwxuvlGRpmNo724mAlt&pp=gAQB

Our Social Handles,

Learning Notes #61 – Undo a git pull

18 January 2025 at 16:04

Today, I came across a blog about undoing a git pull. In this post, I restate it in my own words.

Mistakes happen. You run a git pull and suddenly find your repository in a mess. Maybe conflicts arose, or perhaps the changes merged from the remote branch aren’t what you expected.

Fortunately, Git’s reflog comes to the rescue, allowing you to undo a git pull and restore your repository to its previous state. Here’s how you can do it.

Understanding Reflog


Reflog is a powerful feature in Git that logs every update made to the tips of your branches and references. Even actions like resets or rebases leave traces in the reflog. This makes it an invaluable tool for troubleshooting and recovering from mistakes.

Whenever you perform a git pull, Git updates the branch pointer, and the reflog records this action. By examining the reflog, you can identify the exact state of your branch before the pull and revert to it if needed.

Step By Step Guide to UNDO a git pull

1. Check Your Current State

Ensure you’re aware of the current state of your branch. If you have uncommitted changes, stash or commit them to avoid losing any work.


git stash
# or
git add . && git commit -m "Save changes before undoing pull"

2. Inspect the Reflog

View the recent history of your branch using the reflog,


git reflog

This command will display a list of recent actions, showing commit hashes and descriptions. For example,


0a1b2c3 (HEAD -> main) HEAD@{0}: pull origin main: Fast-forward
4d5e6f7 HEAD@{1}: commit: Add new feature
8g9h0i1 HEAD@{2}: checkout: moving from feature-branch to main

3. Identify the Pre-Pull Commit

Locate the commit hash of your branch’s state before the pull. In the above example, it’s 4d5e6f7, which corresponds to the commit made before the git pull.

4. Reset to the Previous Commit

Use the git reset command to move your branch back to its earlier state,


git reset <commit-hash>

By default, git reset performs a mixed reset, so the changes aren’t removed – they are kept in your working directory, unstaged.
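
If you also want to discard those changes so the working tree matches the pre-pull commit exactly, use a hard reset (careful – this is destructive; restore stashed work afterwards if you stashed in step 1),

git reset --hard 4d5e6f7   # move the branch back and discard working-tree changes
git stash pop              # bring back the work stashed earlier, if any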

The next time a pull operation goes awry, don’t panic—let the reflog guide you back to safety!

Learning Notes #56 – Push vs Pull Architecture

15 January 2025 at 16:16

Today, I learnt about push vs pull architecture. The choice between push and pull architectures can significantly influence system performance, scalability, and user experience. Both approaches have their unique advantages and trade-offs, and understanding their ideal use cases helps developers and architects make informed decisions.

What is Push Architecture?

Push architecture is a communication pattern where the server actively sends data to clients as soon as it becomes available. This approach eliminates the need for clients to repeatedly request updates.

How it Works

  • The server maintains a connection with the client.
  • When new data is available, the server “pushes” it to the connected clients.
  • In a message queue context, producers send messages to a queue, and the queue actively delivers these messages to subscribed consumers without explicit requests.

Examples

  • Notifications in Mobile Apps: Users receive instant updates, such as chat messages or alerts.
  • Stock Price Updates: Financial platforms use push to provide real-time market data.
  • Message Queues with Push Delivery: Systems like RabbitMQ or Kafka configured to push messages to consumers.
  • Server-Sent Events (SSE) and WebSockets: These are common implementations of push.

Advantages

  • Low Latency: Clients receive updates instantly, improving responsiveness.
  • Reduced Redundancy: No need for clients to poll servers frequently, reducing bandwidth consumption.

Challenges

  • Complexity: Maintaining open connections, especially for many clients, can be resource-intensive.
  • Scalability: Requires robust infrastructure to handle large-scale deployments.
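
To make the push model concrete, here is a minimal Server-Sent Events sketch in Python using Flask (Flask is an assumption for illustration; the pattern is framework-agnostic):

from flask import Flask, Response
import time

app = Flask(__name__)

@app.route("/stream")
def stream():
    def events():
        # Push: the server emits an event whenever data is ready,
        # over one long-lived HTTP connection.
        while True:
            yield f"data: tick {time.time()}\n\n"  # SSE wire format
            time.sleep(1)
    return Response(events(), mimetype="text/event-stream")

if __name__ == "__main__":
    app.run(port=5000)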

What is Pull Architecture?

Pull architecture involves clients actively requesting data from the server. This pattern is often used when real-time updates are not critical or predictable intervals suffice.

How it Works

  • The client periodically sends requests to the server.
  • The server responds with the requested data.
  • In a message queue context, consumers actively poll the queue to retrieve messages when ready.

Examples

  • Web Browsing: A browser sends HTTP requests to fetch pages and resources.
  • API Data Fetching: Applications periodically query APIs to update information.
  • Message Queues with Pull Delivery: Systems like SQS or Kafka where consumers poll for messages.
  • Polling: Regularly checking a server or queue for updates.

Advantages

  • Simpler Implementation: No need for persistent connections; standard HTTP requests or queue polling suffice.
  • Server Load Control: The server can limit the frequency of client requests to manage resources better.

Challenges

  • Latency: Updates are only received when the client requests them, which might lead to delays.
  • Increased Bandwidth: Frequent polling can waste resources if no new data is available.
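
And a pull-model counterpart – a client that polls an endpoint on a fixed interval (the URL and interval are placeholders, not from the original notes):

import time
import requests  # third-party HTTP client, assumed for illustration

while True:
    # Pull: the client decides when to ask for new data
    resp = requests.get("https://example.com/api/updates")  # hypothetical endpoint
    if resp.ok and resp.json():
        print("new data:", resp.json())
    time.sleep(10)  # a longer interval saves bandwidth but adds latency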

Aspect               | Push Architecture                              | Pull Architecture
Latency              | Low – real-time updates                        | Higher – dependent on polling frequency
Complexity           | Higher – requires persistent connections       | Lower – simple request-response model
Bandwidth Efficiency | Efficient – updates sent only when needed      | Less efficient – redundant polling possible
Scalability          | Challenging – high client connection overhead  | Easier – controlled client request intervals
Message Queue Flow   | Messages actively delivered to consumers       | Consumers poll the queue for messages
Use Cases            | Real-time applications (e.g., chat, live data) | Non-critical updates (e.g., periodic reports)

Learning Notes #54 – Architecture Decision Records

14 January 2025 at 02:35

For the last few days, I have been learning how to make accountable decisions on technical matters. That’s when I came across ADRs. So far I haven’t used them, nor seen our team use them, but I think they are a necessary step toward accountable decisions. In this blog I share details on ADRs for my future reference.

What is an ADR?

An Architectural Decision Record (ADR) is a concise document that captures a single architectural decision, its context, the reasoning behind it, and its consequences. ADRs help teams document, share, and revisit architectural choices, ensuring transparency and better collaboration.

Why Use ADRs?

  1. Documentation: ADRs serve as a historical record of why certain decisions were made.
  2. Collaboration: They promote better understanding across teams.
  3. Traceability: ADRs link architectural decisions to specific project requirements and constraints.
  4. Accountability: They clarify who made a decision and when.
  5. Change Management: ADRs help evaluate the impact of changes and facilitate discussions around reversals or updates.

ADR Structure

A typical ADR document follows a standard format. Here’s an example:

  1. Title: A clear and concise title describing the decision.
  2. Context: Background information explaining the problem or opportunity.
  3. Decision: A summary of the chosen solution.
  4. Consequences: The positive and negative outcomes of the decision.
  5. Status: Indicates whether the decision is proposed, accepted, superseded, or deprecated.

Example:

Optimistic locking on MongoDB https://docs.google.com/document/d/1olCbicQeQzYpCxB0ejPDtnri9rWb2Qhs9_JZuvANAxM/edit?usp=sharing
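
For illustration, a minimal ADR following the structure above might read (a made-up example, not from the linked document):

Title: Use PostgreSQL as the primary datastore
Context: The service needs relational queries, strong consistency, and a team already familiar with SQL.
Decision: Use PostgreSQL for all persistent application data.
Consequences: ACID guarantees and mature tooling, at the cost of operational work (backups, upgrades) and deferred horizontal write scaling.
Status: Accepted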

References

  1. https://cognitect.com/blog/2011/11/15/documenting-architecture-decisions
  2. https://www.infoq.com/podcasts/architecture-advice-process/
  3. Recommended: https://github.com/joelparkerhenderson/architecture-decision-record/tree/main

Learning Notes #53 – The Expiration Time Can Be Unexpectedly Lost While Using Redis SET EX

12 January 2025 at 09:14

Redis, a high-performance in-memory key-value store, is widely used for caching, session management, and various other scenarios where fast data retrieval is essential. One of its key features is the ability to set expiration times for keys. However, when using the SET command with the EX option, developers might encounter unexpected behaviors where the expiration time is seemingly lost. Let’s explore this issue in detail.

Understanding SET with EX

The Redis SET command with the EX option allows you to set a key’s value and specify its expiration time in seconds. For instance


SET key value EX 60

This command sets the key key to the value value and sets an expiration time of 60 seconds.

The Problem

In certain cases, the expiration time might be unexpectedly lost. This typically happens when subsequent operations overwrite the key without specifying a new expiration. For example,


SET key value1 EX 60
SET key value2

In the above sequence,

  1. The first SET command assigns a value to key and sets an expiration of 60 seconds.
  2. The second SET command overwrites the value of key but does not include an expiration time, resulting in the key persisting indefinitely.

This behavior can lead to subtle bugs, especially in applications that rely on key expiration for correctness or resource management.

Why Does This Happen?

The Redis SET command is designed to replace the entire state of a key, including its expiration. When you use SET without the EX, PX, or EXAT options, the expiration is removed, and the key becomes persistent. This behavior aligns with the principle that SET is a complete update operation.
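
If you need to change the value while preserving the expiration, Redis 6.0+ offers the KEEPTTL option; alternatively, reapply the expiration explicitly,

SET key value2 KEEPTTL   # keeps the existing TTL (Redis >= 6.0)
SET key value2 EX 60     # or set a fresh expiration explicitly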

When using Redis SET with EX, be mindful of operations that might overwrite keys without reapplying expiration. Understanding Redis’s behavior and implementing robust patterns can save you from unexpected issues, ensuring your application remains efficient and reliable.

Learning Notes #52 – Hybrid Origin Failover Pattern

12 January 2025 at 06:29

Today, I learnt about failover patterns from AWS: https://aws.amazon.com/blogs/networking-and-content-delivery/three-advanced-design-patterns-for-high-available-applications-using-amazon-cloudfront/ . In this blog I jot down my understanding of this pattern for future reference.

Hybrid origin failover is a strategy that combines two distinct approaches to handle origin failures effectively, balancing speed and resilience.

The Need for Origin Failover

When an application’s primary origin server becomes unavailable, the ability to reroute traffic to a secondary origin ensures continuity. The failover process determines how quickly and effectively this switch happens. Broadly, there are two approaches to implement origin failover:

  1. Stateful Failover with DNS-based Routing
  2. Stateless Failover with Application Logic

Each has its strengths and limitations, which the hybrid approach aims to mitigate.

Approach 1: Stateful Failover with DNS-based Routing

Stateful failover is a system that allows a standby server to take over for a failed server and continue active sessions. It’s used to create a resilient network infrastructure and avoid service interruptions.

This method relies on a DNS service with health checks to detect when the primary origin is unavailable. Here’s how it works,

  1. Health Checks: The DNS service continuously monitors the health of the primary origin using health checks (e.g., HTTP, HTTPS).
  2. DNS Failover: When the primary origin is marked unhealthy, the DNS service resolves the origin’s domain name to the secondary origin’s IP address.
  3. TTL Impact: The failover process honors the DNS Time-to-Live (TTL) settings. A low TTL ensures faster propagation, but even in the most optimal configurations, this process introduces a delay—often around 60 to 70 seconds.
  4. Stateful Behavior: Once failover occurs, all traffic is routed to the secondary origin until the primary origin is marked healthy again.

Implementation from AWS (as-is from aws blog)

The first approach is using Amazon Route 53 Failover routing policy with health checks on the origin domain name that’s configured as the origin in CloudFront. When the primary origin becomes unhealthy, Route 53 detects it, and then starts resolving the origin domain name with the IP address of the secondary origin. CloudFront honors the origin DNS TTL, which means that traffic will start flowing to the secondary origin within the DNS TTLs. The most optimal configuration (Fast Check activated, a failover threshold of 1, and 60 second DNS TTL) means that the failover will take 70 seconds at minimum to occur. When it does, all of the traffic is switched to the secondary origin, since it’s a stateful failover. Note that this design can be further extended with Route 53 Application Recovery Control for more sophisticated application failover across multiple AWS Regions, Availability Zones, and on-premises.

The second approach is using origin failover, a native feature of CloudFront. This capability of CloudFront tries for the primary origin of every request, and if a configured 4xx or 5xx error is received, then CloudFront attempts a retry with the secondary origin. This approach is simple to configure and provides immediate failover. However, it’s stateless, which means every request must fail independently, thus introducing latency to failed requests. For transient origin issues, this additional latency is an acceptable tradeoff with the speed of failover, but it’s not ideal when the origin is completely out of service. Finally, this approach only works for the GET/HEAD/OPTIONS HTTP methods, because other HTTP methods are not allowed on a CloudFront cache behavior with Origin Failover enabled.

Advantages

  • Works for all HTTP methods and request types.
  • Ensures complete switchover, minimizing ongoing failures.

Disadvantages

  • Relatively slower failover due to DNS propagation time.
  • Requires a reliable health-check mechanism.

Approach 2: Stateless Failover with Application Logic

This method handles failover at the application level. If a request to the primary origin fails (e.g., due to a 4xx or 5xx HTTP response), the application or CDN immediately retries the request with the secondary origin.

How It Works

  1. Primary Request: The application sends a request to the primary origin.
  2. Failure Handling: If the response indicates a failure (configurable for specific error codes), the request is retried with the secondary origin.
  3. Stateless Behavior: Each request operates independently, so failover happens on a per-request basis without waiting for a stateful switchover.

Implementation from AWS (as-is from aws blog)

The hybrid origin failover pattern combines both approaches to get the best of both worlds. First, you configure both of your origins with a Failover Policy in Route 53 behind a single origin domain name. Then, you configure an origin failover group with the single origin domain name as primary origin, and the secondary origin domain name as secondary origin. This means that when the primary origin becomes unavailable, requests are immediately retried with the secondary origin until the stateful failover of Route 53 kicks in within tens of seconds, after which requests go directly to the secondary origin without any latency penalty. Note that this pattern only works with the GET/HEAD/OPTIONS HTTP methods.

Advantages

  • Near-instantaneous failover for failed requests.
  • Simple to configure and doesn’t depend on DNS TTL.

Disadvantages

  • Adds latency for failed requests due to retries.
  • Limited to specific HTTP methods like GET, HEAD, and OPTIONS.
  • Not suitable for scenarios where the primary origin is entirely down, as every request must fail first.

The Hybrid Origin Failover Pattern

The hybrid origin failover pattern combines the strengths of both approaches, mitigating their individual limitations. Here’s how it works:

  1. DNS-based Stateful Failover: A DNS service with health checks monitors the primary origin and switches to the secondary origin if the primary becomes unhealthy. This ensures a complete and stateful failover within tens of seconds.
  2. Application-level Stateless Failover: Simultaneously, the application or CDN is configured to retry failed requests with a secondary origin. This provides an immediate failover mechanism for transient or initial failures.

Implementation Steps

  1. DNS Configuration
    • Set up health checks on the primary origin.
    • Define a failover policy in the DNS service, which resolves the origin domain name to the secondary origin when the primary is unhealthy.
  2. Application Configuration
    • Configure the application or CDN to use an origin failover group.
    • Specify the primary origin domain as the primary origin and the secondary origin domain as the backup.

Behavior

  • Initially, if the primary origin encounters issues, requests are retried immediately with the secondary origin.
  • Meanwhile, the DNS failover switches all traffic to the secondary origin within tens of seconds, eliminating retry latencies for subsequent requests.

Benefits of Hybrid Origin Failover

  1. Faster Failover: Immediate retries for failed requests minimize initial impact, while DNS failover ensures long-term stability.
  2. Reduced Latency: After DNS failover, subsequent requests don’t experience retry delays.
  3. High Resilience: Combines stateful and stateless failover for robust redundancy.
  4. Simplicity and Scalability: Leverages existing DNS and application/CDN features without complex configurations.

Limitations and Considerations

  1. HTTP Method Constraints: Stateless failover works only for GET, HEAD, and OPTIONS methods, limiting its use for POST or PUT requests.
  2. TTL Impact: Low TTLs reduce propagation delays but increase DNS query rates, which could lead to higher costs.
  3. Configuration Complexity: Combining DNS and application-level failover requires careful setup and testing to avoid misconfigurations.
  4. Secondary Origin Capacity: Ensure the secondary origin can handle full traffic loads during failover.

Learning Notes #51 – Postgres as a Queue using SKIP LOCKED

11 January 2025 at 06:56

Yesterday, I came across a blog from inferable.ai (https://www.inferable.ai/blog/posts/postgres-skip-locked) that walks through using Postgres as a queue. In this post, I jot down notes on it for future reference.

PostgreSQL is a robust relational database that can be used for more than just storing structured data. With the SKIP LOCKED feature introduced in PostgreSQL 9.5, you can efficiently turn a PostgreSQL table into a job queue for distributed processing.

Why Use PostgreSQL as a Queue?

Using PostgreSQL as a queue can be advantageous because,

  • Familiarity: If you’re already using PostgreSQL, there’s no need for an additional message broker.
  • Durability: PostgreSQL ensures ACID compliance, offering reliability for your job processing.
  • Simplicity: No need to manage another component like RabbitMQ or Kafka.

Implementing a Queue with SKIP LOCKED

1. Create a Queue Table

To start, you need a table to store the jobs,


CREATE TABLE job_queue (
    id SERIAL PRIMARY KEY,
    job_data JSONB NOT NULL,
    status TEXT DEFAULT 'pending',
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

This table has the following columns,

  • id: A unique identifier for each job.
  • job_data: The data or payload for the job.
  • status: Tracks the job’s state (‘pending’, ‘in_progress’, or ‘completed’).
  • created_at: Timestamp of job creation.

2. Insert Jobs into the Queue

Adding jobs is straightforward,


INSERT INTO job_queue (job_data)
VALUES ('{"task": "send_email", "email": "user@example.com"}');

3. Fetch Jobs for Processing with SKIP LOCKED

Workers will fetch jobs from the queue using SELECT ... FOR UPDATE SKIP LOCKED to avoid contention,

WITH next_job AS (
    SELECT id, job_data
    FROM job_queue
    WHERE status = 'pending'
    FOR UPDATE SKIP LOCKED
    LIMIT 1
)
UPDATE job_queue
SET status = 'in_progress'
FROM next_job
WHERE job_queue.id = next_job.id
RETURNING job_queue.id, job_queue.job_data;

Key Points:

  • FOR UPDATE locks the selected row to prevent other workers from picking it up.
  • SKIP LOCKED ensures locked rows are skipped, enabling concurrent workers to operate without waiting.
  • LIMIT 1 processes one job at a time per worker.

4. Mark Jobs as Completed

Once a worker finishes processing a job, it should update the job’s status,


UPDATE job_queue
SET status = 'completed'
WHERE id = $1; -- Replace $1 with the job ID

5. Delete Old or Processed Jobs

To keep the table clean, you can periodically remove completed jobs,


DELETE FROM job_queue
WHERE status = 'completed' AND created_at < NOW() - INTERVAL '30 days';

Example Worker Implementation

Here’s an example of a worker implemented in Python using psycopg2.


import time

import psycopg2
from psycopg2.extras import RealDictCursor

connection = psycopg2.connect("dbname=yourdb user=youruser")

while True:
    with connection.cursor(cursor_factory=RealDictCursor) as cursor:
        cursor.execute(
            """
            WITH next_job AS (
                SELECT id, job_data
                FROM job_queue
                WHERE status = 'pending'
                FOR UPDATE SKIP LOCKED
                LIMIT 1
            )
            UPDATE job_queue
            SET status = 'in_progress'
            FROM next_job
            WHERE job_queue.id = next_job.id
            RETURNING job_queue.id, job_queue.job_data;
            """
        )

        job = cursor.fetchone()
        if job:
            print(f"Processing job {job['id']}: {job['job_data']}")

            # Simulate job processing
            cursor.execute("UPDATE job_queue SET status = 'completed' WHERE id = %s", (job['id'],))

        else:
            print("No jobs available. Sleeping...")
            time.sleep(5)

    connection.commit()

Considerations

  1. Transaction Isolation: Use the REPEATABLE READ or SERIALIZABLE isolation level cautiously to avoid unnecessary locks.
  2. Row Locking: SKIP LOCKED only skips rows locked by other transactions, not those locked within the same transaction.
  3. Performance: Regularly archive or delete old jobs to prevent the table from growing indefinitely. Consider indexing the status column to improve query performance.
  4. Fault Tolerance: Ensure that workers handle crashes or timeouts gracefully. Use a timeout mechanism to revert jobs stuck in the ‘in_progress’ state (a maintenance sketch for points 3 and 4 follows this list).
  5. Scaling: Distribute workers across multiple nodes to handle a higher job throughput.
  6. The SKIP LOCKED clause only applies to row-level locks – the required ROW SHARE table-level lock is still taken normally.
  7. Using SKIP LOCKED provides an inconsistent view of the data by design. This is why it’s perfect for queue-like tables where we want to distribute work, but not suitable for general purpose work where consistency is required.
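
To make considerations 3 and 4 concrete, here is a small maintenance sketch. It assumes the job_queue schema above; the index name and the 10-minute timeout are arbitrary choices, and a production system would track a dedicated started_at timestamp instead of reusing created_at.


import psycopg2

connection = psycopg2.connect("dbname=yourdb user=youruser")

with connection.cursor() as cursor:
    # Consideration 3: a partial index so workers scan only pending rows.
    cursor.execute(
        """
        CREATE INDEX IF NOT EXISTS idx_job_queue_pending
        ON job_queue (created_at)
        WHERE status = 'pending';
        """
    )

    # Consideration 4: revert jobs stuck 'in_progress' so another worker
    # can retry them (created_at stands in for a real started_at column).
    cursor.execute(
        """
        UPDATE job_queue
        SET status = 'pending'
        WHERE status = 'in_progress'
          AND created_at < NOW() - INTERVAL '10 minutes';
        """
    )

connection.commit()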

POTD #19 – Count the number of possible triangles | Geeks For Geeks

8 January 2025 at 15:08

Problem Statement

Geeks For Geeks – https://www.geeksforgeeks.org/problems/count-possible-triangles-1587115620/1

Given an integer array arr[]. Find the number of triangles that can be formed with three different array elements as lengths of three sides of the triangle.  A triangle with three given sides is only possible if sum of any two sides is always greater than the third side.


Input: arr[] = [4, 6, 3, 7]
Output: 3
Explanation: There are three triangles possible [3, 4, 6], [4, 6, 7] and [3, 6, 7]. Note that [3, 4, 7] is not a possible triangle.  


Input: arr[] = [10, 21, 22, 100, 101, 200, 300]
Output: 6
Explanation: There can be 6 possible triangles: [10, 21, 22], [21, 100, 101], [22, 100, 101], [10, 100, 101], [100, 101, 200] and [101, 200, 300]

My Approach

Sort the array; then, fixing arr[itr] as the largest side, use two pointers to count pairs (left, right) with arr[left] + arr[right] > arr[itr]. Since the array is sorted, the other two triangle inequalities hold automatically, so the solution runs in O(n²) after the O(n log n) sort.

class Solution:
    # Function to count the number of possible triangles.
    def countTriangles(self, arr):
        # Sort so that, for a fixed largest side arr[itr], only the check
        # arr[left] + arr[right] > arr[itr] is needed.
        arr.sort()
        n = len(arr)
        cnt = 0
        for itr in range(2, n):  # treat arr[itr] as the largest side
            left = 0
            right = itr - 1

            while left < right:
                if arr[left] + arr[right] > arr[itr]:
                    # Every index in [left, right) also pairs with right,
                    # since the array is sorted in ascending order.
                    cnt += right - left
                    right -= 1
                else:
                    left += 1
        return cnt
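
A quick sanity check against the two samples from the problem statement:


# Verifying against the examples above
sol = Solution()
print(sol.countTriangles([4, 6, 3, 7]))                       # 3
print(sol.countTriangles([10, 21, 22, 100, 101, 200, 300]))   # 6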

Learning Notes #41 – Shared Lock and Exclusive Locks | Postgres

6 January 2025 at 14:07

Today, I learnt about the various locking mechanisms used to prevent double updates. In this blog, I make notes on Shared Locks and Exclusive Locks for my future self.

What Are Locks in Databases?

Locks are mechanisms used by a DBMS to control access to data. They ensure that transactions are executed in a way that maintains the ACID (Atomicity, Consistency, Isolation, Durability) properties of the database. Locks can be classified into several types, including

  • Shared Locks (S Locks): Allow multiple transactions to read a resource simultaneously but prevent any transaction from writing to it.
  • Exclusive Locks (X Locks): Allow a single transaction to modify a resource, preventing both reading and writing by other transactions.
  • Intent Locks: Used to signal the type of lock a transaction intends to acquire at a lower level.
  • Deadlock Prevention Locks: Special locks aimed at preventing deadlock scenarios.

Shared Lock

A shared lock is used when a transaction needs to read a resource (e.g., a database row or table) without altering it. Multiple transactions can acquire a shared lock on the same resource simultaneously. However, as long as one or more shared locks exist on a resource, no transaction can acquire an exclusive lock on that resource.


-- Transaction A: Acquire a shared lock on a row
BEGIN;
SELECT * FROM employees WHERE id = 1 FOR SHARE;
-- Transaction B: Acquire a shared lock on the same row
BEGIN;
SELECT * FROM employees WHERE id = 1 FOR SHARE;
-- Both transactions can read the row concurrently
-- Transaction C: Attempt to update the same row
BEGIN;
UPDATE employees SET salary = salary + 1000 WHERE id = 1;
-- Transaction C will be blocked until Transactions A and B release their locks
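
The same behaviour can be reproduced end to end in Python. Below is a minimal sketch, assuming psycopg2, a local dbname=yourdb, and the employees table above (all placeholders): two readers acquire FOR SHARE concurrently, while a writer’s UPDATE blocks until both readers commit.


import threading
import time

import psycopg2

def reader(name):
    conn = psycopg2.connect("dbname=yourdb user=youruser")
    with conn.cursor() as cur:
        cur.execute("SELECT * FROM employees WHERE id = 1 FOR SHARE")
        print(f"{name}: acquired shared lock")
        time.sleep(3)  # hold the lock; it is released only at COMMIT
    conn.commit()
    conn.close()

def writer():
    time.sleep(1)  # let both readers grab their shared locks first
    conn = psycopg2.connect("dbname=yourdb user=youruser")
    with conn.cursor() as cur:
        start = time.time()
        cur.execute("UPDATE employees SET salary = salary + 1000 WHERE id = 1")  # blocks
        print(f"writer: unblocked after {time.time() - start:.1f}s")
    conn.commit()
    conn.close()

threads = [threading.Thread(target=reader, args=(n,)) for n in ("reader-1", "reader-2")]
threads.append(threading.Thread(target=writer))
for t in threads:
    t.start()
for t in threads:
    t.join()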

Key Characteristics of Shared Locks

1. Concurrent Reads

  • Shared locks allow multiple transactions to read the same resource at the same time.
  • This is ideal for operations like SELECT queries that do not modify data.

2. Write Blocking

  • While a shared lock is active, no transaction can modify the locked resource.
  • Prevents dirty writes and ensures read consistency.

3. Compatibility

  • Shared locks are compatible with other shared locks but not with exclusive locks.

When Are Shared Locks Used?

Shared locks are typically employed in read operations under certain isolation levels. (Note: in PostgreSQL, plain SELECTs never take row-level shared locks thanks to MVCC; shared row locks come from explicit SELECT ... FOR SHARE, and the behaviour below is typical of lock-based engines such as SQL Server.) For instance,

1. Read Committed Isolation Level:

  • Shared locks are held for the duration of the read operation.
  • Prevents dirty reads by ensuring the data being read is not modified by other transactions during the read.

2. Repeatable Read Isolation Level:

  • Shared locks are held until the transaction completes.
  • Ensures that the data read during a transaction remains consistent and unmodified.

3. Snapshot Isolation:

  • Shared locks may not be explicitly used, as the DBMS creates a consistent snapshot of the data for the transaction.

Exclusive Locks

An exclusive lock is used when a transaction needs to modify a resource. Only one transaction can hold an exclusive lock on a resource at a time, and no other transaction can acquire any lock (shared or exclusive) on it. (In PostgreSQL, plain SELECTs are still served from MVCC snapshots, so only writers and explicit lockers actually block; the read-blocking behaviour below applies to lock-based engines.)


-- Transaction X: Acquire an exclusive lock to update a row
BEGIN;
UPDATE employees SET salary = salary + 1000 WHERE id = 2;
-- Transaction Y: Attempt to read the same row
BEGIN;
SELECT * FROM employees WHERE id = 2;
-- In a lock-based engine, Transaction Y is blocked until Transaction X completes.
-- In PostgreSQL, this plain SELECT succeeds (use SELECT ... FOR SHARE to wait).
-- Transaction Z: Attempt to update the same row
BEGIN;
UPDATE employees SET salary = salary + 500 WHERE id = 2;
-- Transaction Z will be blocked until Transaction X completes

Key Characteristics of Exclusive Locks

1. Write Operations: Exclusive locks are essential for operations like INSERT, UPDATE, and DELETE.

2. Blocking Reads and Writes: While an exclusive lock is active, no other transaction can write to the resource (or, in lock-based engines, read it).

3. Isolation: Ensures that changes made by one transaction are not visible to others until the transaction is complete.

When Are Exclusive Locks Used?

Exclusive locks are typically employed in write operations or any operation that modifies the database. For instance:

1. Transactional Updates – A transaction that updates a row acquires an exclusive lock to ensure no other transaction can modify the row during the update.

2. Table Modifications – When altering a table structure, the DBMS may place an exclusive lock on the entire table.

Benefits of Shared and Exclusive Locks

Benefits of Shared Locks

1. Consistency in Multi-User Environments – Ensure that data being read is not altered by other transactions, preserving consistency.
2. Concurrency Support – Allow multiple transactions to read data simultaneously, improving system performance.
3. Data Integrity – Prevent dirty reads and writes, ensuring that operations yield reliable results.

Benefits of Exclusive Locks

1. Data Integrity During Modifications – Prevents other transactions from accessing data being modified, ensuring changes are applied safely.
2. Isolation of Transactions – Ensures that modifications by one transaction are not visible to others until committed.

Limitations and Challenges

Shared Locks

1. Potential for Deadlocks – Deadlocks can occur if two transactions simultaneously hold shared locks and attempt to upgrade to exclusive locks.
2. Blocking Writes – Shared locks can delay write operations, potentially impacting performance in write-heavy systems.
3. Lock Escalation – In systems with high concurrency, shared locks may escalate to table-level locks, reducing granularity and concurrency.

Exclusive Locks

1. Reduced Concurrency – Exclusive locks prevent other transactions from modifying the locked resource, which can lead to bottlenecks in highly concurrent systems.
2. Risk of Deadlocks – Deadlocks can occur if two transactions attempt to acquire exclusive locks on resources held by each other. A runnable sketch of this scenario follows.
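
A minimal, hypothetical sketch of that deadlock in Python, assuming psycopg2, a local dbname=yourdb, and the employees table from the examples above. Each session locks one row, then requests the row the other session holds; PostgreSQL detects the cycle and aborts one transaction.


import threading
import time

import psycopg2
from psycopg2 import errors

def transfer(first_id, second_id):
    conn = psycopg2.connect("dbname=yourdb user=youruser")
    try:
        with conn.cursor() as cur:
            # Take an exclusive row lock on the first row.
            cur.execute("UPDATE employees SET salary = salary + 1 WHERE id = %s", (first_id,))
            time.sleep(1)  # give the other session time to lock its row
            # Request the row the other session holds -> lock cycle.
            cur.execute("UPDATE employees SET salary = salary + 1 WHERE id = %s", (second_id,))
        conn.commit()
    except errors.DeadlockDetected:
        print("Deadlock detected; PostgreSQL rolled this transaction back.")
        conn.rollback()
    finally:
        conn.close()

# Two sessions lock rows 1 and 2 in opposite orders.
t1 = threading.Thread(target=transfer, args=(1, 2))
t2 = threading.Thread(target=transfer, args=(2, 1))
t1.start(); t2.start()
t1.join(); t2.join()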

Lock Compatibility

                 Shared Lock    Exclusive Lock
Shared Lock          ✅              ❌
Exclusive Lock       ❌              ❌

Shared locks are compatible with each other; any pairing that involves an exclusive lock is incompatible.