
HAProxy EP 9: Load Balancing with Weighted Round Robin

11 September 2024 at 14:39

Load balancing helps distribute client requests across multiple servers to ensure high availability, performance, and reliability. Weighted Round Robin Load Balancing is an extension of the round-robin algorithm, where each server is assigned a weight based on its capacity or performance capabilities. This approach ensures that more powerful servers handle more traffic, resulting in a more efficient distribution of the load.

What is Weighted Round Robin Load Balancing?

Weighted Round Robin Load Balancing assigns a weight to each server. The weight determines how many requests each server should handle relative to the others. Servers with higher weights receive more requests compared to those with lower weights. This method is useful when backend servers have different processing capabilities or resources.
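To build intuition before touching HAProxy, here is a minimal Python sketch of the idea (an illustration only, not HAProxy's actual scheduler; the server names and weights are made up): each server is repeated in the rotation according to its weight.

from itertools import cycle

# Illustrative weights: C is the most powerful server, B the least.
servers = {"A": 2, "B": 1, "C": 3}

# Naive expansion: repeat each server name according to its weight,
# then rotate through the expanded list forever.
schedule = cycle([name for name, weight in servers.items() for _ in range(weight)])

# The first 12 requests land as: A A B C C C A A B C C C
for request_id in range(12):
    print(f"request {request_id} -> server {next(schedule)}")

HAProxy spreads the weighted picks more evenly rather than grouping them back to back, but the long-run proportions are the same.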

Step-by-Step Implementation with Docker

Step 1: Create the Flask Applications

We’ll use the same three Flask applications (app1.py, app2.py, and app3.py) as in previous examples.

  • Flask App 1 (app1.py):

from flask import Flask

app = Flask(__name__)

@app.route("/")
def home():
    return "Hello from Flask App 1!"

@app.route("/data")
def data():
    return "Data from Flask App 1!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5001)

  • Flask App 2 (app2.py):

from flask import Flask

app = Flask(__name__)

@app.route("/")
def home():
    return "Hello from Flask App 2!"

@app.route("/data")
def data():
    return "Data from Flask App 2!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5002)

  • Flask App 3 (app3.py):

from flask import Flask

app = Flask(__name__)

@app.route("/")
def home():
    return "Hello from Flask App 3!"

@app.route("/data")
def data():
    return "Data from Flask App 3!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5003)

Step 2: Create Dockerfiles for Each Flask Application

Create Dockerfiles for each of the Flask applications:

  • Dockerfile for Flask App 1 (Dockerfile.app1):

# Use the official Python image from Docker Hub
FROM python:3.9-slim

# Set the working directory inside the container
WORKDIR /app

# Copy the application file into the container
COPY app1.py .

# Install Flask inside the container
RUN pip install Flask

# Expose the port the app runs on
EXPOSE 5001

# Run the application
CMD ["python", "app1.py"]

  • Dockerfile for Flask App 2 (Dockerfile.app2):

FROM python:3.9-slim
WORKDIR /app
COPY app2.py .
RUN pip install Flask
EXPOSE 5002
CMD ["python", "app2.py"]

  • Dockerfile for Flask App 3 (Dockerfile.app3):

FROM python:3.9-slim
WORKDIR /app
COPY app3.py .
RUN pip install Flask
EXPOSE 5003
CMD ["python", "app3.py"]

Step 3: Create the HAProxy Configuration File

Create an HAProxy configuration file (haproxy.cfg) to implement Weighted Round Robin Load Balancing


global
    log stdout format raw local0
    daemon

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms

frontend http_front
    bind *:80
    default_backend servers

backend servers
    balance roundrobin
    server server1 app1:5001 weight 2 check
    server server2 app2:5002 weight 1 check
    server server3 app3:5003 weight 3 check

Explanation:

  • The balance roundrobin directive tells HAProxy to use the Round Robin load balancing algorithm; the per-server weights turn it into Weighted Round Robin.
  • The weight option on each server line sets that server's relative share of the traffic:
    • server1 (App 1) has a weight of 2.
    • server2 (App 2) has a weight of 1.
    • server3 (App 3) has a weight of 3.
  • Requests will be distributed based on these weights: App 3 will receive the most requests, App 2 the least, and App 1 will be in between.

Step 4: Create a Dockerfile for HAProxy

Create a Dockerfile for HAProxy (Dockerfile.haproxy):


# Use the official HAProxy image from Docker Hub
FROM haproxy:latest

# Copy the custom HAProxy configuration file into the container
COPY haproxy.cfg /usr/local/etc/haproxy/haproxy.cfg

# Expose the port for HAProxy
EXPOSE 80

Step 5: Create a docker-compose.yml File

To manage all the containers together, create a docker-compose.yml file

version: '3'

services:
  app1:
    build:
      context: .
      dockerfile: Dockerfile.app1
    container_name: flask_app1
    ports:
      - "5001:5001"

  app2:
    build:
      context: .
      dockerfile: Dockerfile.app2
    container_name: flask_app2
    ports:
      - "5002:5002"

  app3:
    build:
      context: .
      dockerfile: Dockerfile.app3
    container_name: flask_app3
    ports:
      - "5003:5003"

  haproxy:
    build:
      context: .
      dockerfile: Dockerfile.haproxy
    container_name: haproxy
    ports:
      - "80:80"
    depends_on:
      - app1
      - app2
      - app3


Explanation:

  • The docker-compose.yml file defines the services (app1, app2, app3, and haproxy) and their respective configurations.
  • HAProxy depends on the three Flask applications to be up and running before it starts.

Step 6: Build and Run the Docker Containers

Run the following command to build and start all the containers


docker-compose up --build

This command builds Docker images for all three Flask apps and HAProxy, then starts them.

Step 7: Test the Load Balancer

Open your browser or use curl to make requests to the HAProxy server


curl http://localhost/
curl http://localhost/data

Observation:

  • With Weighted Round Robin Load Balancing, you should see that requests are distributed according to the weights specified in the HAProxy configuration.
  • For example, App 3 should receive three times as many requests as App 2, and App 1 twice as many. A quick way to verify this is sketched below.
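The sketch assumes the stack above is running on localhost and that each app names itself in its response body:

from collections import Counter

import requests

counts = Counter()
for _ in range(60):
    # Each response body identifies the app that served it,
    # e.g. "Hello from Flask App 3!"
    counts[requests.get("http://localhost/").text] += 1

# With weights 2:1:3, expect roughly 20/10/30 out of 60 requests.
for body, count in counts.most_common():
    print(f"{count:3d}  {body}")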

Conclusion

By implementing Weighted Round Robin Load Balancing with HAProxy, you can distribute traffic more effectively according to the capacity or performance of each backend server. This approach helps optimize resource utilization and ensures a balanced load across servers.

HAProxy EP 7: Load Balancing with Source IP Hash, URI – Consistent Hashing

11 September 2024 at 13:55

Load balancing helps distribute traffic across multiple servers, enhancing performance and reliability. One common strategy is Source IP Hash load balancing, which ensures that requests from the same client IP are consistently directed to the same server.

This method is particularly useful for applications requiring session persistence, such as shopping carts or user sessions. In this blog, we’ll implement Source IP Hash load balancing using Flask and HAProxy, all within Docker containers.

What is Source IP Hash Load Balancing?

Source IP Hash Load Balancing is a technique that uses a hash function on the client’s IP address to determine which server should handle the request. This guarantees that a particular client will always be directed to the same backend server, ensuring session persistence and stateful behavior.
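Conceptually, the balancer computes a deterministic hash of the client address and maps it onto the list of servers. A simplified Python sketch of the idea (HAProxy's actual hash function differs):

import hashlib

servers = ["app1:5001", "app2:5002", "app3:5003"]

def pick_server(client_ip: str) -> str:
    # Hash the client IP deterministically, then map it onto the server list.
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

# The same client IP always maps to the same backend.
print(pick_server("203.0.113.7"))
print(pick_server("203.0.113.7"))   # identical to the previous line
print(pick_server("198.51.100.42")) # may differ, but is also stable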

Consistent Hashing: https://parottasalna.com/2024/06/17/why-do-we-need-to-maintain-same-hash-in-load-balancer/

Step-by-Step Implementation with Docker

Step 1: Create Flask Application

We’ll create three separate Dockerfiles, one for each Flask app.

Flask App 1 (app1.py)

from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello from Flask App 1!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5001)


Flask App 2 (app2.py)

from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello from Flask App 2!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5002)


Flask App 3 (app3.py)

from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello from Flask App 3!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5003)

Each Flask app listens on a different port (5001, 5002, 5003).

Step 2: Create Dockerfiles for Each Flask Application

Dockerfile for Flask App 1 (Dockerfile.app1)

# Use the official Python image from the Docker Hub
FROM python:3.9-slim

# Set the working directory inside the container
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY app1.py .

# Install Flask inside the container
RUN pip install Flask

# Expose the port the app runs on
EXPOSE 5001

# Run the application
CMD ["python", "app1.py"]

Dockerfile for Flask App 2 (Dockerfile.app2)

FROM python:3.9-slim
WORKDIR /app
COPY app2.py .
RUN pip install Flask
EXPOSE 5002
CMD ["python", "app2.py"]

Dockerfile for Flask App 3 (Dockerfile.app3)

FROM python:3.9-slim
WORKDIR /app
COPY app3.py .
RUN pip install Flask
EXPOSE 5003
CMD ["python", "app3.py"]

Step 3: Create the HAProxy Configuration File (haproxy.cfg)

global
    log stdout format raw local0
    daemon

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms

frontend http_front
    bind *:80
    default_backend servers

backend servers
    balance source
    hash-type consistent
    server server1 app1:5001 check
    server server2 app2:5002 check
    server server3 app3:5003 check

Explanation:

  • The balance source directive tells HAProxy to use Source IP Hashing as the load balancing algorithm.
  • The hash-type consistent directive ensures consistent hashing, which is essential for minimizing disruption when backend servers are added or removed.
  • The server directives define the backend servers and their ports.

Step 4: Create a Dockerfile for HAProxy

Create a Dockerfile for HAProxy (Dockerfile.haproxy)

# Use the official HAProxy image from Docker Hub
FROM haproxy:latest

# Copy the custom HAProxy configuration file into the container
COPY haproxy.cfg /usr/local/etc/haproxy/haproxy.cfg

# Expose the port for HAProxy
EXPOSE 80

Step 5: Create a docker-compose.yml File

To manage all the containers together, create a docker-compose.yml file

version: '3'

services:
  app1:
    build:
      context: .
      dockerfile: Dockerfile.app1
    container_name: flask_app1
    ports:
      - "5001:5001"

  app2:
    build:
      context: .
      dockerfile: Dockerfile.app2
    container_name: flask_app2
    ports:
      - "5002:5002"

  app3:
    build:
      context: .
      dockerfile: Dockerfile.app3
    container_name: flask_app3
    ports:
      - "5003:5003"

  haproxy:
    build:
      context: .
      dockerfile: Dockerfile.haproxy
    container_name: haproxy
    ports:
      - "80:80"
    depends_on:
      - app1
      - app2
      - app3

Explanation:

  • The docker-compose.yml file defines four services: app1, app2, app3, and haproxy.
  • Each Flask app is built from its respective Dockerfile and runs on its port.
  • HAProxy is configured to wait (depends_on) for all three Flask apps to be up and running.

Step 6: Build and Run the Docker Containers

Run the following commands to build and start all the containers:

# Build and run the containers
docker-compose up --build

This command will build Docker images for all three Flask apps and HAProxy and start them.

Step 7: Test the Load Balancer

Open your browser or use a tool like curl to make requests to the HAProxy server:

curl http://localhost

Observation:

  • With Source IP Hash load balancing, each unique IP address (e.g., your local IP) should always be directed to the same backend server.
  • If you access the HAProxy from different IPs (e.g., using different devices or by simulating different client IPs), you will see that requests are consistently sent to the same server for each IP.
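From a single machine, every request shares one source IP, so a simple sanity check is to fire many requests and confirm they all land on the same backend (a sketch assuming the stack above is running):

import requests

# Collect the distinct response bodies from 20 requests.
responses = {requests.get("http://localhost/").text for _ in range(20)}

# With source IP hashing, all requests from this one client IP
# should have been served by a single backend.
print(responses)
assert len(responses) == 1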

For URI-based hashing, we just need to change the balance directive:

global
    log stdout format raw local0
    daemon

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms

frontend http_front
    bind *:80
    default_backend servers

backend servers
    balance uri
    hash-type consistent
    server server1 app1:5001 check
    server server2 app2:5002 check
    server server3 app3:5003 check


Explanation:

  • The balance uri directive tells HAProxy to use URI Hashing as the load balancing algorithm.
  • The hash-type consistent directive ensures consistent hashing to minimize disruption when backend servers are added or removed.
  • The server directives define the backend servers and their ports.
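To see why hash-type consistent matters, here is a toy hash ring in Python. It only illustrates the idea (it is not HAProxy's implementation): each server owns many points on a ring, a request is served by the first server point at or after its hash, and removing a server only remaps the keys that pointed at it.

import bisect
import hashlib

def h(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

def build_ring(servers, points_per_server=100):
    # Place each server at many pseudo-random points on the ring.
    return sorted((h(f"{s}#{i}"), s) for s in servers for i in range(points_per_server))

def lookup(ring, key: str) -> str:
    # The key is served by the first server point at or after its hash.
    hashes = [point for point, _ in ring]
    return ring[bisect.bisect(hashes, h(key)) % len(ring)][1]

full = build_ring(["app1", "app2", "app3"])
smaller = build_ring(["app1", "app3"])  # app2 removed

keys = [f"/page/{i}" for i in range(1000)]
moved = sum(lookup(full, k) != lookup(smaller, k) for k in keys)
# Only the keys that previously mapped to app2 move (roughly a third);
# a plain modulo hash would remap about two thirds of them.
print(f"{moved} of {len(keys)} keys remapped after removing app2")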

HAProxy EP 6: Load Balancing With Least Connection

11 September 2024 at 13:32

Load balancing is crucial for distributing incoming network traffic across multiple servers, ensuring optimal resource utilization and improving application performance. One useful dynamic algorithm is Least Connection. In this blog, we’ll explore how to implement Least Connection load balancing using Flask as our backend application and HAProxy as our load balancer.

What is Least Connection Load Balancing?

Least Connection Load Balancing is a dynamic algorithm that distributes requests to the server with the fewest active connections at any given time. This method ensures that servers with lighter loads receive more requests, preventing any single server from becoming a bottleneck.

Step-by-Step Implementation with Docker

Step 1: Create Dockerfiles for Each Flask Application

We’ll create three separate Dockerfiles, one for each Flask app.

Flask App 1 (app1.py) – Introduced Slowness by adding sleep

from flask import Flask
import time

app = Flask(__name__)

@app.route("/")
def hello():
    time.sleep(5)
    return "Hello from Flask App 1!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5001)


Flask App 2 (app2.py)

from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello from Flask App 2!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5002)


Flask App 3 (app3.py) – Introduced Slowness by adding sleep.

from flask import Flask
import time

app = Flask(__name__)

@app.route("/")
def hello():
    time.sleep(5)
    return "Hello from Flask App 3!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5003)

Each Flask app listens on a different port (5001, 5002, 5003).

Step 2: Create Dockerfiles for Each Flask Application

Dockerfile for Flask App 1 (Dockerfile.app1)

# Use the official Python image from the Docker Hub
FROM python:3.9-slim

# Set the working directory inside the container
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY app1.py .

# Install Flask inside the container
RUN pip install Flask

# Expose the port the app runs on
EXPOSE 5001

# Run the application
CMD ["python", "app1.py"]

Dockerfile for Flask App 2 (Dockerfile.app2)

FROM python:3.9-slim
WORKDIR /app
COPY app2.py .
RUN pip install Flask
EXPOSE 5002
CMD ["python", "app2.py"]

Dockerfile for Flask App 3 (Dockerfile.app3)

FROM python:3.9-slim
WORKDIR /app
COPY app3.py .
RUN pip install Flask
EXPOSE 5003
CMD ["python", "app3.py"]

Step 3: Create the HAProxy Configuration File (haproxy.cfg)

global
    log stdout format raw local0
    daemon

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms

frontend http_front
    bind *:80
    default_backend servers

backend servers
    balance leastconn
    server server1 app1:5001 check
    server server2 app2:5002 check
    server server3 app3:5003 check

Explanation:

  • frontend http_front: Defines the entry point for incoming traffic. It listens on port 80.
  • backend servers: Specifies the servers HAProxy will distribute traffic across: the three Flask apps (app1, app2, app3). The balance leastconn directive selects the Least Connection load balancing algorithm.
  • server directives: Lists the backend servers with their IP addresses and ports. The check option allows HAProxy to monitor the health of each server.

Step 4: Create a Dockerfile for HAProxy

Create a Dockerfile for HAProxy (Dockerfile.haproxy)

# Use the official HAProxy image from Docker Hub
FROM haproxy:latest

# Copy the custom HAProxy configuration file into the container
COPY haproxy.cfg /usr/local/etc/haproxy/haproxy.cfg

# Expose the port for HAProxy
EXPOSE 80

Step 5: Create a docker-compose.yml File

To manage all the containers together, create a docker-compose.yml file

version: '3'

services:
  app1:
    build:
      context: .
      dockerfile: Dockerfile.app1
    container_name: flask_app1
    ports:
      - "5001:5001"

  app2:
    build:
      context: .
      dockerfile: Dockerfile.app2
    container_name: flask_app2
    ports:
      - "5002:5002"

  app3:
    build:
      context: .
      dockerfile: Dockerfile.app3
    container_name: flask_app3
    ports:
      - "5003:5003"

  haproxy:
    build:
      context: .
      dockerfile: Dockerfile.haproxy
    container_name: haproxy
    ports:
      - "80:80"
    depends_on:
      - app1
      - app2
      - app3

Explanation:

  • The docker-compose.yml file defines four services: app1, app2, app3, and haproxy.
  • Each Flask app is built from its respective Dockerfile and runs on its port.
  • HAProxy is configured to wait (depends_on) for all three Flask apps to be up and running.

Step 6: Build and Run the Docker Containers

Run the following commands to build and start all the containers:

# Build and run the containers
docker-compose up --build

This command will build Docker images for all three Flask apps and HAProxy and start them.

Step 7: Test the Load Balancer

Open your browser or use a tool like curl to make requests to the HAProxy server:

curl http://localhost

You should see all three responses ("Hello from Flask App 1!", "Hello from Flask App 2!", and "Hello from Flask App 3!"), distributed according to the Least Connection strategy: because App 1 and App 3 are artificially slowed, App 2 will serve a larger share of concurrent requests.
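To make the effect visible, fire requests in parallel: while App 1 and App 3 are busy sleeping, new connections drain to App 2. A sketch assuming the stack above is running:

from collections import Counter
from concurrent.futures import ThreadPoolExecutor

import requests

def fetch(_):
    return requests.get("http://localhost/").text

# Issue 30 requests with 10 in flight at a time.
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(fetch, range(30)))

# App 2 answers quickly, so its connection count stays low and
# leastconn keeps steering new requests toward it.
for body, count in Counter(results).most_common():
    print(f"{count:3d}  {body}")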

HAProxy EP 4: Understanding ACL – Access Control List

10 September 2024 at 23:46

Imagine you are managing a busy highway with multiple lanes, and you want to direct specific types of vehicles to particular lanes: trucks to one lane, cars to another, and motorcycles to yet another. In the world of web traffic, this is similar to what Access Control Lists (ACLs) in HAProxy doβ€”they help you direct incoming requests based on specific criteria.

Let’s dive into what ACLs are in HAProxy, why they are essential, and how you can use them effectively with some practical examples.

What are ACLs in HAProxy?

Access Control Lists (ACLs) in HAProxy are rules or conditions that allow you to define patterns to match incoming requests. These rules help you make decisions about how to route or manage traffic within your infrastructure.

Think of ACLs as powerful filters or guards that analyze incoming HTTP requests based on headers, IP addresses, URL paths, or other attributes. By defining ACLs, you can control how requests are handledβ€”for example, sending specific traffic to different backends, applying security rules, or denying access under certain conditions.

Why Use ACLs in HAProxy?

Using ACLs offers several advantages:

  1. Granular Control Over Traffic: You can filter and route traffic based on very specific criteria, such as the content of HTTP headers, cookies, or request methods.
  2. Security: ACLs can block unwanted traffic, enforce security policies, and prevent malicious access.
  3. Performance Optimization: By directing traffic to specific servers optimized for certain types of content, ACLs can help balance the load and improve performance.
  4. Flexibility and Scalability: ACLs allow dynamic adaptation to changing traffic patterns or new requirements without significant changes to your infrastructure.

How ACLs Work in HAProxy

ACLs in HAProxy are defined in the configuration file (haproxy.cfg). The syntax is straightforward


acl <name> <criteria>

  • <name>: The name you give to your ACL rule, which you will use to reference it in further configuration.
  • <criteria>: The condition or match pattern, such as a path, header, method, or IP address.

When evaluated against a request, an ACL returns either true or false.

Examples of ACLs in HAProxy

Let’s look at some practical examples to understand how ACLs work.

Example 1: Routing Traffic Based on URL Path

Suppose you have a web application that serves both static and dynamic content. You want to route all requests for static files (like images, CSS, and JavaScript) to a server optimized for static content, while all other requests should go to a dynamic content server.

Configuration:


frontend http_front
    bind *:80
    acl is_static path_beg /static
    use_backend static_backend if is_static
    default_backend dynamic_backend

backend static_backend
    server static1 127.0.0.1:5001 check

backend dynamic_backend
    server dynamic1 127.0.0.1:5002 check

  • ACL Definition: acl is_static path_beg /static : checks if the request URL starts with /static.
  • Usage: use_backend static_backend if is_static routes the traffic to the static_backend if the ACL is_static matches. All other requests are routed to the dynamic_backend.

Example 2: Blocking Traffic from Specific IP Addresses

Let’s say you want to block traffic from a range of IP addresses that are known to be malicious.

Configurations

frontend http_front
    bind *:80
    acl block_ip src 192.168.1.0/24
    http-request deny if block_ip
    default_backend web_backend

backend web_backend
    server web1 127.0.0.1:5003 check


ACL Definition: acl block_ip src 192.168.1.0/24 defines an ACL that matches any source IP from the range 192.168.1.0/24.

Usage: http-request deny if block_ip denies the request if it matches the block_ip ACL.

Example 3: Redirecting Traffic Based on Request Method

You might want to redirect all POST requests to a different backend for further processing.

Configurations


frontend http_front
    bind *:80
    acl is_post_method method POST
    use_backend post_backend if is_post_method
    default_backend general_backend

backend post_backend
    server post1 127.0.0.1:5006 check

backend general_backend
    server general1 127.0.0.1:5007 check

Example 4: Redirect Traffic Based on User Agent

Imagine you want to serve a different version of your website to mobile users versus desktop users. You can achieve this by using ACLs that check the User-Agent header in the HTTP request.

Configuration:


frontend http_front
    bind *:80
    acl is_mobile_user_agent req.hdr(User-Agent) -i -m sub Mobile
    use_backend mobile_backend if is_mobile_user_agent
    default_backend desktop_backend

backend mobile_backend
    server mobile1 127.0.0.1:5008 check

backend desktop_backend
    server desktop1 127.0.0.1:5009 check

ACL Definition: acl is_mobile_user_agent req.hdr(User-Agent) -i -m sub Mobile checks if the User-Agent header contains the substring "Mobile" (case-insensitive).

Usage: use_backend mobile_backend if is_mobile_user_agent directs mobile users to mobile_backend and all other users to desktop_backend.

Example 5: Restrict Access to Admin Pages by IP Address

Let’s say you want to allow access to the /admin page only from a specific IP address or range, such as your company’s internal network.


frontend http_front
    bind *:80
    acl is_admin_path path_beg /admin
    acl is_internal_network src 192.168.10.0/24
    http-request deny if is_admin_path !is_internal_network
    default_backend web_backend

backend web_backend
    server web1 127.0.0.1:5015 check

Example with a Flask Application

Let’s see how you can use ACLs with a Flask application to enforce different rules.

Flask Application Setup

You have two Flask apps: app1.py for general requests and app2.py for special requests like form submissions.

app1.py

from flask import Flask

app = Flask(__name__)

@app.route('/')
def index():
    return "Welcome to the main page!"

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5003)

app2.py:

from flask import Flask

app = Flask(__name__)

@app.route('/submit', methods=['POST'])
def submit_form():
    return "Form submitted successfully!"

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5004)


HAProxy Configuration with ACLs


frontend http_front
    bind *:80
    acl is_post_method method POST
    acl is_submit_path path_beg /submit
    use_backend post_backend if is_post_method is_submit_path
    default_backend general_backend

backend post_backend
    server app2 127.0.0.1:5004 check

backend general_backend
    server app1 127.0.0.1:5003 check

ACLs:

  • is_post_method checks for the POST method.
  • is_submit_path checks if the path starts with /submit.

Traffic Handling: The traffic is directed to post_backend if both the ACLs match, otherwise, it goes to general_backend.
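You can verify the routing with two scripted requests (a sketch assuming HAProxy is listening on port 80 with the configuration above):

import requests

# A plain GET to / matches neither ACL, so it reaches app1 via general_backend.
print(requests.get("http://localhost/").text)         # "Welcome to the main page!"

# A POST to /submit matches both ACLs, so it reaches app2 via post_backend.
print(requests.post("http://localhost/submit").text)  # "Form submitted successfully!"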

HAProxy EP 2: TCP Proxy for Flask Application

10 September 2024 at 16:56

Meet Jafer, a backend engineer tasked with ensuring the new microservice they are building can handle high traffic smoothly. The microservice is a Flask application that needs to be accessed over TCP, and Jafer decided to use HAProxy to act as a TCP proxy to manage incoming traffic.

This guide will walk you through how Jafer sets up HAProxy to work as a TCP proxy for a sample Flask application.

Why Use HAProxy as a TCP Proxy?

HAProxy as a TCP proxy operates at Layer 4 (Transport Layer) of the OSI model. It forwards raw TCP connections from clients to backend servers without inspecting the contents of the packets. This is ideal for scenarios where:

  • You need to handle non-HTTP traffic, such as databases or other TCP-based applications.
  • You want to perform load balancing without application-level inspection.
  • Your services are using protocols other than HTTP/HTTPS.

At this layer, HAProxy can’t inspect the contents of the packets, but it can still see the client’s IP address.

Step 1: Set Up a Sample Flask Application

First, Jafer created a simple Flask application that listens on a TCP port. Let’s create a file named app.py

from flask import Flask, request

app = Flask(__name__)

@app.route('/', methods=['GET'])
def home():
    return "Hello from Flask over TCP!"

if __name__ == "__main__":
    app.run(host='0.0.0.0', port=5000)  # Run the app on port 5000


Step 2: Dockerize the Flask Application

To make the Flask app easy to deploy, Jafer decided to containerize it using Docker.

Create a Dockerfile

# Use an official Python runtime as a parent image
FROM python:3.9-slim

# Set the working directory
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . /app

# Install Flask inside the container
RUN pip install flask

# Make port 5000 available to the world outside this container
EXPOSE 5000

# Run app.py when the container launches
CMD ["python", "app.py"]


To build and run the Docker container, use the following commands

docker build -t flask-app .
docker run -d -p 5000:5000 flask-app

This will start the Flask application on port 5000.

Step 3: Configure HAProxy as a TCP Proxy

Now, Jafer needs to configure HAProxy to act as a TCP proxy for the Flask application.

Create an HAProxy configuration file named haproxy.cfg

global
    log stdout format raw local0
    maxconn 4096

defaults
    mode tcp  # Operating in TCP mode
    log global
    option tcplog
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms

frontend tcp_front
    bind *:4000  # Bind to port 4000 for incoming TCP traffic
    default_backend flask_backend

backend flask_backend
    balance roundrobin  # Use round-robin load balancing
    server flask1 127.0.0.1:5000 check  # Proxy to Flask app running on port 5000

In this configuration:

  • Mode TCP: HAProxy is set to work in TCP mode.
  • Frontend: Listens on port 4000 and forwards incoming TCP traffic to the backend.
  • Backend: Contains a single server (flask1) where the Flask app is running. Note that because HAProxy runs in its own container here, 127.0.0.1 refers to that container itself; on Docker Desktop you can point it at host.docker.internal instead, or put both containers on a shared Docker network and reference the Flask container by name.

Step 4: Run HAProxy with the Configuration

To start HAProxy with the above configuration, you can use Docker to run HAProxy in a container.

Create a Dockerfile for HAProxy

FROM haproxy:2.4

# Copy the HAProxy configuration file to the container
COPY haproxy.cfg /usr/local/etc/haproxy/haproxy.cfg

Build and run the HAProxy Docker container

docker build -t haproxy-tcp .
docker run -d -p 4000:4000 haproxy-tcp

This will start HAProxy on port 4000, which is configured to proxy TCP traffic to the Flask application running on port 5000.

Step 5: Test the TCP Proxy Setup

To test the setup, open a web browser or use curl to send a request to the HAProxy server

curl http://localhost:4000/

You should see the response

Hello from Flask over TCP!

This confirms that HAProxy is successfully proxying TCP traffic to the Flask application.
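Because HAProxy is proxying raw TCP, you can also bypass HTTP clients entirely and talk to it with a plain socket; here is a sketch that hand-writes a minimal HTTP request over the proxied connection:

import socket

with socket.create_connection(("localhost", 4000)) as sock:
    # HAProxy forwards these raw bytes to the Flask backend untouched.
    sock.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n")
    response = b""
    while chunk := sock.recv(4096):
        response += chunk

# Prints the status line and headers, then "Hello from Flask over TCP!"
print(response.decode())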

Step 6: Scaling Up

If Jafer wants to scale the application to handle more traffic, he can add more backend servers to the haproxy.cfg file

backend flask_backend
    balance roundrobin
    server flask1 127.0.0.1:5000 check
    server flask2 127.0.0.1:5001 check

Jafer could run another instance of the Flask application on a different port (5001), and HAProxy would balance the TCP traffic between the two instances.

Conclusion

By configuring HAProxy as a TCP proxy, Jafer could efficiently manage and balance incoming traffic to their Flask application. This setup ensures scalability and reliability for any TCP-based service, not just HTTP-based ones.

HAProxy EP 1: Traffic Police for Web

9 September 2024 at 16:59

In the world of web applications, imagine you’re running a very popular pizza place. Every evening, customers line up for a delicious slice of pizza. But if your single cashier can’t handle all the orders at once, customers might get frustrated and leave.

What if you could have a system that ensures every customer gets served quickly and efficiently? Enter HAProxy, a tool that helps manage and balance the flow of web traffic so that no single server gets overwhelmed.

Here’s a straightforward guide to understanding HAProxy, installing it, and setting it up to make your web application run smoothly.

What is HAProxy?

HAProxy stands for High Availability Proxy. It’s like a traffic director for your web traffic. It takes incoming requests (like people walking into your pizza place) and decides which server (or pizza station) should handle each request. This way, no single server gets too busy, and everything runs more efficiently.

Why Use HAProxy?

  • Handles More Traffic: Distributes incoming traffic across multiple servers so no single one gets overloaded.
  • Increases Reliability: If one server fails, HAProxy directs traffic to the remaining servers.
  • Improves Performance: Ensures that users get faster responses because the load is spread out.

Installing HAProxy

Here’s how you can install HAProxy on a Linux system:

  1. Open a Terminal: You’ll need to access your command line interface to install HAProxy.
  2. Install HAProxy: Type the following command and hit enter

sudo apt-get update
sudo apt-get install haproxy

3. Check Installation: Once installed, you can verify that HAProxy is running by typing


sudo systemctl status haproxy

This command shows you the current status of HAProxy, ensuring it’s up and running.

Configuring HAProxy

HAProxy’s configuration file is where you set up how it should handle incoming traffic. This file is usually located at /etc/haproxy/haproxy.cfg. Let’s break down the main parts of this configuration file,

1. The global Section

The global section is like setting the rules for the entire pizza place. It defines general settings for HAProxy itself, such as how it should operate, what kind of logging it should use, and what resources it needs. Here’s an example of what you might see in the global section


global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660
    user haproxy
    group haproxy
    daemon

Let’s break it down line by line:

  • log /dev/log local0: This line tells HAProxy to send log messages to the system log at /dev/log and to use the local0 logging facility. Logs help you keep track of what’s happening with HAProxy.
  • log /dev/log local1 notice: Similar to the previous line, but it uses the local1 logging facility and sets the log level to notice, which is a type of log message indicating important events.
  • chroot /var/lib/haproxy: This line tells HAProxy to run in a restricted area of the file system (/var/lib/haproxy). It’s a security measure to limit access to the rest of the system.
  • stats socket /run/haproxy/admin.sock mode 660: This sets up a special socket (a kind of communication endpoint) for administrative commands. The mode 660 part defines the permissions for this socket, allowing specific users to manage HAProxy.
  • user haproxy: Specifies that HAProxy should run as the user haproxy. Running as a specific user helps with security.
  • group haproxy: Similar to the user directive, this specifies that HAProxy should run under the haproxy group.
  • daemon: This tells HAProxy to run as a background service, rather than tying up a terminal window.

2. The defaults Section

The defaults section sets up default settings for HAProxy’s operation and is like defining standard procedures for the pizza place. It applies default configurations to both the frontend and backend sections unless overridden. Here’s an example of a defaults section


defaults
    log     global
    option  httplog
    option  dontlognull
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms

Here’s what each line means:

  • log global: Tells HAProxy to use the logging settings defined in the global section for logging.
  • option httplog: Enables HTTP-specific logging. This means HAProxy will log details about HTTP requests and responses, which helps with troubleshooting and monitoring.
  • option dontlognull: Prevents logging of connections that don’t generate any data (null connections). This keeps the logs cleaner and more relevant.
  • timeout connect 5000ms: Sets the maximum time HAProxy will wait when trying to connect to a backend server to 5000 milliseconds (5 seconds). If the connection takes longer, it will be aborted.
  • timeout client 50000ms: Defines the maximum time HAProxy will wait for data from the client to 50000 milliseconds (50 seconds). If the client doesn’t send data within this time, the connection will be closed.
  • timeout server 50000ms: Similar to timeout client, but it sets the maximum time to wait for data from the server to 50000 milliseconds (50 seconds).

3. Frontend Section

The frontend section defines how HAProxy listens for incoming requests. Think of it as the entrance to your pizza place.


frontend http_front
    bind *:80
    default_backend http_back

  • frontend http_front: This is a name for your frontend configuration.
  • bind *:80: Tells HAProxy to listen for traffic on port 80 (the standard port for web traffic).
  • default_backend http_back: Specifies where the traffic should be sent (to the backend section).

4. Backend Section

The backend section describes where the traffic should be directed. Think of it as the different pizza stations where orders are processed.


backend http_back
    balance roundrobin
    server app1 192.168.1.2:5000 check
    server app2 192.168.1.3:5000 check
    server app3 192.168.1.4:5000 check

  • backend http_back: This is a name for your backend configuration.
  • balance roundrobin: Distributes traffic evenly across servers.
  • server app1 192.168.1.2:5000 check: Specifies a server (app1) at IP address 192.168.1.2 on port 5000. The check option ensures HAProxy checks if the server is healthy before sending traffic to it.
  • server app2 and server app3: Additional servers to handle traffic.

Testing Your Configuration

After setting up your configuration, you’ll need to restart HAProxy to apply the changes:


sudo systemctl restart haproxy

To check if everything is working, you can use a web browser or a tool like curl to send requests to HAProxy and see if it correctly distributes them across your servers.
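For example, a short Python loop makes the rotation easy to see (a sketch assuming HAProxy is reachable on port 80 and each backend’s response identifies it):

import requests

for i in range(6):
    # With balance roundrobin, responses should cycle
    # through app1, app2, and app3 in turn.
    print(i, requests.get("http://localhost/").text)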

Mastering Request Retrying in Python with Tenacity: A Developer’s Journey

7 September 2024 at 01:49

Meet Jafer, a talented developer (self boast) working at a fast-growing tech company. His team is building an innovative app that fetches data from multiple third-party APIs in real time to provide users with up-to-date information.

Everything is going smoothly until one day, a spike in traffic causes their app to face a wave of “HTTP 500” and “Timeout” errors. Requests start failing left and right, and users are left staring at the dreaded “Data Unavailable” message.

Jafer realizes that he needs a way to make their app more resilient against these unpredictable network hiccups. That’s when he discovers Tenacity, a powerful Python library designed to help developers handle retries gracefully.

Join Jafer as he dives into Tenacity and learns how to turn his app from fragile to robust with just a few lines of code!

Step 0: Mock Flask API

from flask import Flask, jsonify, make_response
import random
import time

app = Flask(__name__)

# Scenario 1: Random server errors
@app.route('/random_error', methods=['GET'])
def random_error():
    if random.choice([True, False]):
        return make_response(jsonify({"error": "Server error"}), 500)  # Simulate a 500 error randomly
    return jsonify({"message": "Success"})

# Scenario 2: Timeouts
@app.route('/timeout', methods=['GET'])
def timeout():
    time.sleep(5)  # Simulate a long delay that can cause a timeout
    return jsonify({"message": "Delayed response"})

# Scenario 3: 404 Not Found error
@app.route('/not_found', methods=['GET'])
def not_found():
    return make_response(jsonify({"error": "Not found"}), 404)

# Scenario 4: Rate-limiting (simulated with a fixed chance)
@app.route('/rate_limit', methods=['GET'])
def rate_limit():
    if random.randint(1, 10) <= 3:  # 30% chance to simulate rate limiting
        return make_response(jsonify({"error": "Rate limit exceeded"}), 429)
    return jsonify({"message": "Success"})

# Scenario 5: Empty response
@app.route('/empty_response', methods=['GET'])
def empty_response():
    if random.choice([True, False]):
        return make_response("", 204)  # Simulate an empty response with 204 No Content
    return jsonify({"message": "Success"})

if __name__ == '__main__':
    app.run(host='localhost', port=5000, debug=True)

To run the Flask app, use the command,

python mock_server.py

Step 1: Introducing Tenacity

Jafer decides to start with the basics. He knows that Tenacity will allow him to retry failed requests without cluttering his codebase with complex loops and error handling. So, he installs the library,

pip install tenacity

With Tenacity ready, Jafer decides to tackle his first problem, retrying a request that fails due to server errors.

Step 2: Retrying on Exceptions

He writes a simple function that fetches data from an API and wraps it with Tenacity’s @retry decorator

import requests
import logging
from tenacity import before_log, after_log
from tenacity import retry, stop_after_attempt, wait_fixed

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@retry(stop=stop_after_attempt(3),
        wait=wait_fixed(2),
        before=before_log(logger, logging.INFO),
        after=after_log(logger, logging.INFO))
def fetch_random_error():
    response = requests.get('http://localhost:5000/random_error')
    response.raise_for_status()  # Raises an HTTPError for 4xx/5xx responses
    return response.json()
 
if __name__ == '__main__':
    try:
        data = fetch_random_error()
        print("Data fetched successfully:", data)
    except Exception as e:
        print("Failed to fetch data:", str(e))

This code will attempt the request up to 3 times, waiting 2 seconds between each try. Jafer feels confident that this will handle the occasional hiccup. However, he soon realizes that he needs more control over which exceptions trigger a retry.

Step 3: Handling Specific Exceptions

Jafer’s app sometimes receives a β€œ404 Not Found” error, which should not be retried because the resource doesn’t exist. He modifies the retry logic to handle only certain exceptions,

import requests
import logging
from tenacity import before_log, after_log
from requests.exceptions import HTTPError, Timeout
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_fixed
 

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@retry(stop=stop_after_attempt(3),
        wait=wait_fixed(2),
        retry=retry_if_exception_type((HTTPError, Timeout)),
        before=before_log(logger, logging.INFO),
        after=after_log(logger, logging.INFO))
def fetch_data():
    response = requests.get('http://localhost:5000/timeout', timeout=2)  # Set a short timeout to simulate failure
    response.raise_for_status()
    return response.json()

if __name__ == '__main__':
    try:
        data = fetch_data()
        print("Data fetched successfully:", data)
    except Exception as e:
        print("Failed to fetch data:", str(e))

Now, the function retries only on HTTPError or Timeout, avoiding unnecessary retries for a β€œ404” error. Jafer’s app is starting to feel more resilient!

Step 4: Implementing Exponential Backoff

A few days later, the team notices that they’re still getting rate-limited by some APIs. Jafer recalls the concept of exponential backoff, a strategy where the wait time between retries increases exponentially, reducing the load on the server and preventing further rate limiting.

He decides to implement it,

import requests
import logging
from tenacity import before_log, after_log
from tenacity import retry, stop_after_attempt, wait_exponential

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


@retry(stop=stop_after_attempt(5),
       wait=wait_exponential(multiplier=1, min=2, max=10),
       before=before_log(logger, logging.INFO),
       after=after_log(logger, logging.INFO))
def fetch_rate_limit():
    response = requests.get('http://localhost:5000/rate_limit')
    response.raise_for_status()
    return response.json()
 
if __name__ == '__main__':
    try:
        data = fetch_rate_limit()
        print("Data fetched successfully:", data)
    except Exception as e:
        print("Failed to fetch data:", str(e))

With this code, the wait time starts at 2 seconds and doubles with each retry, up to a maximum of 10 seconds. Jafer’s app is now much less likely to be rate-limited!

Step 5: Retrying Based on Return Values

Jafer encounters another issue: some APIs occasionally return an empty response (204 No Content). These cases should also trigger a retry. Tenacity makes this easy with the retry_if_result feature,

import requests
import logging
from tenacity import before_log, after_log

from tenacity import retry, stop_after_attempt, retry_if_result

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
  

@retry(retry=retry_if_result(lambda x: x is None),
       stop=stop_after_attempt(3),
       before=before_log(logger, logging.INFO),
       after=after_log(logger, logging.INFO))
def fetch_empty_response():
    response = requests.get('http://localhost:5000/empty_response')
    if response.status_code == 204:
        return None  # Simulate an empty response
    response.raise_for_status()
    return response.json()
 
if __name__ == '__main__':
    try:
        data = fetch_empty_response()
        print("Data fetched successfully:", data)
    except Exception as e:
        print("Failed to fetch data:", str(e))

Now, the function retries when it receives an empty response, ensuring that users get the data they need.

Step 6: Combining Multiple Retry Conditions

But Jafer isn’t done yet. Some situations require combining multiple conditions. He wants to retry on HTTPError, Timeout, or a None return value. With Tenacity’s retry_any feature, he can do just that,

import requests
import logging
from tenacity import before_log, after_log

from requests.exceptions import HTTPError, Timeout
from tenacity import retry_any, retry, retry_if_exception_type, retry_if_result, stop_after_attempt
 
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@retry(retry=retry_any(retry_if_exception_type((HTTPError, Timeout)),
                       retry_if_result(lambda x: x is None)),
       stop=stop_after_attempt(3),
       before=before_log(logger, logging.INFO),
       after=after_log(logger, logging.INFO))
def fetch_data():
    response = requests.get("http://localhost:5000/timeout")
    if response.status_code == 204:
        return None
    response.raise_for_status()
    return response.json()

if __name__ == '__main__':
    try:
        data = fetch_data()
        print("Data fetched successfully:", data)
    except Exception as e:
        print("Failed to fetch data:", str(e))

This approach covers all his bases, making the app even more resilient!

Step 7: Logging and Tracking Retries

As the app scales, Jafer wants to keep an eye on how often retries happen and why. He decides to add logging,

import logging
import requests
from tenacity import before_log, after_log
from tenacity import retry, stop_after_attempt, wait_fixed

 
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
 
@retry(stop=stop_after_attempt(2), wait=wait_fixed(2),
       before=before_log(logger, logging.INFO),
       after=after_log(logger, logging.INFO))
def fetch_data():
    response = requests.get("http://localhost:5000/timeout", timeout=2)
    response.raise_for_status()
    return response.json()

if __name__ == '__main__':
    try:
        data = fetch_data()
        print("Data fetched successfully:", data)
    except Exception as e:
        print("Failed to fetch data:", str(e))

This logs messages before and after each retry attempt, giving Jafer full visibility into the retry process. Now, he can monitor the app’s behavior in production and quickly spot any patterns or issues.

The Happy Ending

With Tenacity, Jafer has transformed his app into a resilient powerhouse that gracefully handles intermittent failures. Users are happy, the servers are humming along smoothly, and Jafer’s team has more time to work on new features rather than firefighting network errors.

By mastering Tenacity, Jafer has learned that handling network failures gracefully can turn a fragile app into a robust and reliable one. Whether it’s dealing with flaky APIs, network blips, or rate limits, Tenacity is his go-to tool for retrying operations in Python.

So, the next time your app faces unpredictable network challenges, remember Jafer’s story and give Tenacity a try; you might just save the day!

Postgres Ep 2 : Amutha Hotel and Issues with Flat Files

23 August 2024 at 01:29

Once upon a time in Ooty, there was a small business called “Amutha Hotel,” run by a passionate baker named Saravanan. Saravanan’s bakery was famous for its delicious sambar, and as his customer base grew, he needed to keep track of orders, customer information, and inventory.

Being a techie, he decided to store all this information in a flat file, a simple spreadsheet named “HotelData.csv.”

The Early Days: Simple and Sweet

At first, everything was easy. Saravanan’s flat file had only a few columns, OrderID, CustomerName, Product, Quantity, and Price. Each row represented a new order, and it was simple enough to manage. Saravanan could quickly find orders, calculate totals, and even check his inventory by filtering the file.

The Business Grows: Complexity Creeps In

As the business boomed, Saravanan started offering new products, special discounts, and loyalty programs. He added more columns to his flat file, like Discount, LoyaltyPoints, and DeliveryAddress. His once-simple file began to swell with information.

Then, Saravanan decided to start tracking customer preferences and order history. He began adding multiple rows for the same customer, each representing a different order. His flat file now had repeating groups of data for each customer, and it became harder and harder to find the information he needed.

His flat file was getting out of hand. For every new order from a returning customer, he had to re-enter all their information

CustomerName, DeliveryAddress, LoyaltyPoints

over and over again. This duplication wasn’t just tedious; it started to cause mistakes. One day, he accidentally typed “John Smyth” instead of “John Smith,” and suddenly, his loyal customer was split into two different entries.

On a Busy Saturday

One busy Saturday, Saravanan opened his flat file to update the day’s orders, but instead of popping up instantly as it used to, it took several minutes to load. As he scrolled through the endless rows, his computer started to lag, and the spreadsheet software even crashed a few times. The file had become too large and cumbersome for him to handle efficiently.

Customers were waiting longer for their orders to be processed because Saravanan was struggling to find their previous details and apply the right discounts. The flat file that once served him so well was now slowing him down, and it was affecting his business.

The Journaling

Techie Saravanan started noting these issues down in a notepad. He badly wanted a solution that would solve these problems, so he began listing out the problems, with examples, to look for a solution.

His journal continues …

Before databases became common for data storage, flat files (such as CSVs or text files) were often used to store and manage data. Such a file has no inherent structure; it’s just lines of text that mean something to the particular application that reads it.

However, these flat files posed several challenges, particularly when dealing with repeating groups, which are essentially sets of related fields that repeat multiple times within a record. Here are some of the key problems associated with repeating groups in flat files,

1. Data Redundancy

  • Description: Repeating groups can lead to significant redundancy, as the same data might need to be repeated across multiple records.
  • Example: If an employee can have multiple skills, a flat file might need to repeat the employee’s name, ID, and other details for each skill.
  • Problem: This not only increases the file size but also makes data entry, updates, and deletions more prone to errors.

Eg: Suppose you are maintaining a flat file to track employees and their skills. Each employee can have multiple skills, which you store as repeating groups in the file.

EmployeeID, EmployeeName, Skill1, Skill2, Skill3, Skill4
1, John Doe, Python, SQL, Java, 
2, Jane Smith, Excel, PowerPoint, Python, SQL

If an employee has four skills, you need to add four columns (Skill1, Skill2, Skill3, Skill4). If an employee has more than four skills, you must either add more columns or create a new row with repeated employee details.

2. Data Inconsistency

  • Description: Repeating groups can lead to inconsistencies when data is updated.
  • Example: If an employee’s name changes, and it’s stored multiple times in different rows because of repeating skills, it’s easy for some instances to be updated while others are not.
  • Problem: This can lead to situations where the same employee is listed under different names or IDs in the same file.

Eg: Suppose you are maintaining a flat file to track employees and their skills. Each employee can have multiple skills, which you store as repeating groups in the file.

EmployeeID, EmployeeName, Skill1, Skill2, Skill3, Skill4
1, John Doe, Python, SQL, Java, 
2, Jane Smith, Excel, PowerPoint, Python, SQL

If John’s name changes to β€œJohn A. Doe,” you must manually update each occurrence of β€œJohn Doe” across all rows, which increases the chance of inconsistencies.

3. Difficulty in Querying

  • Description: Querying data in flat files with repeating groups can be cumbersome and inefficient.
  • Example: Extracting a list of unique employees with their respective skills requires complex scripting or manual processing.
  • Problem: Unlike relational databases, which use joins to simplify such queries, flat files require custom logic to manage and extract data, leading to slower processing and more potential for errors.

Eg: Suppose you are maintaining a flat file to track employees and their skills. Each employee can have multiple skills, which you store as repeating groups in the file.

EmployeeID, EmployeeName, Skill1, Skill2, Skill3, Skill4
1, John Doe, Python, SQL, Java, 
2, Jane Smith, Excel, PowerPoint, Python, SQL

Extracting a list of all employees proficient in “Python” requires you to search across multiple skill columns (Skill1, Skill2, etc.), which is cumbersome compared to a relational database, where you can use a simple JOIN on a normalized EmployeeSkills table.
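The awkwardness is easy to see in code. Here is what “find every employee who knows Python” looks like against this flat file (a sketch assuming it is saved as employees.csv):

import csv

python_devs = []
with open("employees.csv") as f:
    for row in csv.DictReader(f, skipinitialspace=True):
        # The skill could be hiding in any of the four columns,
        # so every query has to scan all of them.
        skills = [row.get(f"Skill{i}", "") for i in range(1, 5)]
        if "Python" in skills:
            python_devs.append(row["EmployeeName"])

print(python_devs)  # ['John Doe', 'Jane Smith']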

4. Limited Scalability

  • Description: Flat files do not scale well when the number of repeating groups or the size of the data grows.
  • Example: A file with multiple repeating fields can become extremely large and difficult to manage as the number of records increases.
  • Problem: This can lead to performance issues, such as slow read/write operations and difficulty in maintaining the file over time.

Eg: You are storing customer orders in a flat file where each customer can place multiple orders.

CustomerID, CustomerName, Order1ID, Order1Date, Order2ID, Order2Date, Order3ID, Order3Date
1001, Alice Brown, 5001, 2023-08-01, 5002, 2023-08-15, 
1002, Bob White, 5003, 2023-08-05, 

If Alice places more than three orders, you’ll need to add more columns (Order4ID, Order4Date, etc.), leading to an unwieldy file with many empty cells for customers with fewer orders.

5. Challenges in Data Integrity

  • Description: Ensuring data integrity in flat files with repeating groups is difficult.
  • Example: Enforcing rules like “an employee can only have unique skills” is nearly impossible in a flat file format.
  • Problem: This can result in duplicated or invalid data, which is hard to detect and correct without a database system.

Eg: You are storing customer orders in a flat file where each customer can place multiple orders.

CustomerID, CustomerName, Order1ID, Order1Date, Order2ID, Order2Date, Order3ID, Order3Date
1001, Alice Brown, 5001, 2023-08-01, 5002, 2023-08-15, 
1002, Bob White, 5003, 2023-08-05,

There’s no easy way to enforce that each order ID is unique and corresponds to the correct customer, which could lead to errors or duplicated orders.

6. Complex File Formats

  • Description: Managing and processing flat files with repeating groups often requires complex file formats.
  • Example: Custom delimiters or nested formats might be needed to handle repeating groups, making the file harder to understand and work with.
  • Problem: This increases the likelihood of errors during data entry, processing, or when the file is read by different systems.

Eg: You are storing customer orders in a flat file where each customer can place multiple orders.

CustomerID, CustomerName, Order1ID, Order1Date, Order2ID, Order2Date, Order3ID, Order3Date
1001, Alice Brown, 5001, 2023-08-01, 5002, 2023-08-15, 
1002, Bob White, 5003, 2023-08-05, 

As the number of orders grows, the file format becomes increasingly complex, requiring custom scripts to manage and extract order data for each customer.

7. Lack of Referential Integrity

  • Description: Flat files lack mechanisms to enforce referential integrity between related groups of data.
  • Example: Ensuring that a skill listed in one file corresponds to a valid skill ID in another file requires manual checks or complex logic.
  • Problem: This can lead to orphaned records or mismatches between related data sets.

Eg: A fleet management company tracks maintenance records for each vehicle in a flat file. Each vehicle can have multiple maintenance records.

VehicleID, VehicleType, Maintenance1Date, Maintenance1Type, Maintenance2Date, Maintenance2Type
V001, Truck, 2023-01-15, Oil Change, 2023-03-10, Tire Rotation
V002, Van, 2023-02-20, Brake Inspection, , 

There’s no way to ensure that the Maintenance1Type and Maintenance2Type fields are valid maintenance types or that the dates are in correct chronological order.

8. Difficulty in Data Modification

  • Description: Modifying data in flat files with repeating groups can be complex and error-prone.
  • Example: Adding or removing an item from a repeating group might require extensive manual edits across multiple records.
  • Problem: This increases the risk of errors and makes data management time-consuming.

Eg: A university maintains a flat file to record student enrollments in courses. Each student can enroll in multiple courses.

StudentID, StudentName, Course1, Course2, Course3, Course4, Course5
2001, Charlie Green, Math101, Physics102, , , 
2002, Dana Blue, History101, Math101, Chemistry101, , 

If a student drops a course or switches to a different one, manually editing the file can easily lead to errors, especially as the number of students and courses increases.
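
To feel how painful this gets in practice, here is a minimal Python sketch (using hypothetical data that matches the table above) that drops one course for one student. Notice how much column shuffling a single change requires:

import csv
from io import StringIO

data = """StudentID,StudentName,Course1,Course2,Course3,Course4,Course5
2001,Charlie Green,Math101,Physics102,,,
2002,Dana Blue,History101,Math101,Chemistry101,,
"""

rows = list(csv.reader(StringIO(data)))
header, records = rows[0], rows[1:]

# Dropping a course means rebuilding the whole row by hand:
# collect the non-empty courses, remove one, then re-pad to five slots.
for rec in records:
    if rec[0] == "2001":
        courses = [c for c in rec[2:] if c]
        courses.remove("Physics102")
        rec[2:] = courses + [""] * (5 - len(courses))

print(records[0])
# ['2001', 'Charlie Green', 'Math101', '', '', '', '']

A normalized layout (one enrollment per row) would turn this whole dance into a single row deletion.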

After listing out all these problems, Saravanan started looking into solutions. His search goes on…

Tool: Serial Activity – Remote SSH Manager

20 August 2024 at 02:16

Why was this tool created?

During our college days, we had a crash course on Machine Learning. Our coordinators arranged for an ML engineer to take classes for 3 days. He insisted that we install some packages to get hands-on experience, but unfortunately many of us were not sure how to install them. So we needed a way to install all the necessary packages on every machine.

The scenario was convenient: every machine had the same user account with the same password. So we figured that if we could automate the setup on one machine, it would be easy to repeat for the rest (just a for-loop iterating from x.0.0.1 to x.0.0.255). This is the birthplace of this tool.

Code:

#!/usr/bin/env python
import sys
import os.path
from multiprocessing.pool import ThreadPool

import paramiko

BASE_ADDRESS = "192.168.7."
USERNAME = "t1"
PASSWORD = "uni1"


def create_client(hostname):
    """Create a SSH connection to a given hostname."""
    ssh_client = paramiko.SSHClient()
    ssh_client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh_client.connect(hostname=hostname, username=USERNAME, password=PASSWORD)
    ssh_client.invoke_shell()
    return ssh_client


def kill_computer(ssh_client):
    """Power off a computer."""
    ssh_client.exec_command("poweroff")


def install_python_modules(ssh_client):
    """Install the programs specified in requirements.txt"""
    ftp_client = ssh_client.open_sftp()

    # Move over get-pip.py
    local_getpip = os.path.expanduser("~/lab_freak/get-pip.py")
    remote_getpip = "/home/%s/Documents/get-pip.py" % USERNAME
    ftp_client.put(local_getpip, remote_getpip)

    # Move over requirements.txt
    local_requirements = os.path.expanduser("~/lab_freak/requirements.txt")
    remote_requirements = "/home/%s/Documents/requirements.txt" % USERNAME
    ftp_client.put(local_requirements, remote_requirements)

    ftp_client.close()

    # Install pip and the desired modules.
    # Note: exec_command returns immediately; the installs keep running
    # on the remote machine after this call.
    ssh_client.exec_command("python %s --user" % remote_getpip)
    ssh_client.exec_command("python -m pip install --user -r %s" % remote_requirements)


def worker(action, hostname):
    try:
        ssh_client = create_client(hostname)

        if action == "kill":
            kill_computer(ssh_client)
        elif action == "install":
            install_python_modules(ssh_client)
        else:
            raise ValueError("Unknown action %r" % action)
    except BaseException as e:
        # Report which action failed on which host, along with the exception.
        print("Running %r on %r failed with %r" % (action, hostname, e))


def main():
    if len(sys.argv) < 2:
        print("USAGE: python kill.py ACTION")
        sys.exit(1)

    hostnames = [str(BASE_ADDRESS) + str(i) for i in range(30, 60)]

    with ThreadPool() as pool:
        pool.map(lambda hostname: worker(sys.argv[1], hostname), hostnames)


if __name__ == "__main__":
    main()
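
A quick usage note (assuming the script is saved as kill.py, matching its USAGE string): pass the action name as the first argument.

python kill.py install   # copy get-pip.py and requirements.txt over, then install
python kill.py kill      # power off every machine in the range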


Docker Ep 7: Diving Deeper into Docker with Detached Mode, Naming, and Inspections

15 August 2024 at 05:04

In our last adventure with Docker, we successfully ran our first container using the BusyBox image. It was a moment of triumph, seeing that "hello world" echo back at us from the depths of our new container.

Today, we delve even deeper into the Dockerverse, learning how to run containers in detached mode, name them, and inspect their inner workings.

The Background Wizardry: Detached Mode

Imagine you’re a wizard, conjuring a spell to summon a creature (a.k.a. a Docker container). But what if you wanted to summon this creature to do your bidding in the background while you continued your magical studies in the foreground? This is where the art of detached mode comes in.

To cast this spell, we use the -d flag with our docker run command, telling Docker to send the container off to the background.


docker run -d busybox:1.36 sleep 1000

With a flick of our wands (or rather, the Enter key), the spell is cast, and Docker returns a long string of magical symbols: a container ID. This ID is proof that our BusyBox creature is now running somewhere in the background, quietly counting sheep for 1000 seconds.

But how do we know our creature is truly running, and not just wandering off in the Docker forest? Fear not, young wizard; there is a way to check.

Casting the docker ps Spell

To see the creatures currently roaming the Dockerverse, we use the docker ps spell. This spell reveals a list of all the active containers, along with some essential details like their ID, image name, and what they’re currently up to.


docker ps

Behold! There it is: our BusyBox container, happily sleeping as commanded. The spell even shows us the short version of our container's ID, which is the first few characters of the long ID Docker gave us earlier.

But wait, what if you wanted to see all the creatures that have ever existed in your Dockerverse, even the ones that have now perished? For that, we have another spell.

Summoning the Ghosts: docker ps -a

The docker ps -a spell lets you commune with the ghosts of containers past. These are containers that once roamed your Dockerverse but have since exited, leaving behind only memories and a few logs.


docker ps -a

With this spell, you’ll see a list of all the containers that have ever run on your system, whether they’re still alive or not.

The Vanishing Trick: Automatically Removing Containers

Sometimes, we summon creatures for a quick task, like retrieving a lost scroll, and we don’t want them to linger once the task is done. For this, we have a trick to make them vanish as soon as they’re no longer needed. We simply add the --rm flag to our summoning command.


docker run --rm busybox:1.36 sleep 1

In this case, our BusyBox creature wakes up, sleeps for 1 second, and then vanishes into the ether, leaving no trace behind. If you run the docker ps -a spell afterward, you'll see that the container is gone, completely erased from existence.

Naming Your Creatures: The --name Flag

By default, Docker gives your creatures amusing names like "boring_rosaline" or "kickass_hopper." But as a wizard of Docker, you have the power to name your creatures whatever you like. This is done using the --name flag when summoning them.

docker run --name hello_world busybox:1.36

Now, when you run the docker ps -a spell, you'll see your BusyBox container proudly bearing the name "hello_world". Naming your containers makes it easier to keep track of them, especially when you're managing an entire menagerie of Docker creatures.

The All-Seeing Eye: docker inspect

Finally, we come to one of the most powerful spells in your Docker arsenal: docker inspect. This spell allows you to peer into the very soul of a container, revealing all sorts of low-level details that are hidden from the casual observer.

After summoning a new creature in detached mode, like so:


docker run -d busybox:1.36 sleep 1000

You can use the docker inspect spell to see everything about it:


docker inspect <container_id>

With this spell, you can uncover the container’s IP address, MAC address, image ID, log paths, and much more. It’s like having x-ray vision, allowing you to see every detail about your container.
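
Since the full output is one big JSON blob, it helps to filter it. As a small aside (this is standard Docker CLI behavior, not something introduced in this episode), the --format flag accepts a Go template to pull out a single field, for example the IP address on the default bridge network:

docker inspect --format '{{ .NetworkSettings.IPAddress }}' <container_id>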

And so, our journey continues, with new spells and powers at your disposal. You’ve learned how to run containers in the background, name them, and inspect their deepest secrets. As you grow in your mastery of Docker, these tools will become invaluable in managing your containers with the precision and skill of a true Docker wizard.

Docker Ep 4 : The Digital Tea Kadai – Client Server Architecture & Docker

12 August 2024 at 14:25

The Client-Server Architecture

Once upon a time in the electronic city of Bangalore, there was a popular digital tea kadai. This cafe was unique because it didn't serve traditional coffee or pastries. Instead, it served data and services to its customers: developers, businesses, and tech enthusiasts who were hungry for information and resources.

The Client:

One day, a young developer named Dinesh walked into tea kadai. He was working on a new app and needed to fetch some data from the cafe’s servers. In this story, Dinesh represents the client. As a client, his role was to request specific services and data from the cafe. He approached the counter and handed over his order slip, detailing what he needed.

The Server:

Behind the counter was Syed, the tea master, representing the server. Syed’s job was to take Dinesh’s request, process it, and deliver the requested data back to him.

Syed had access to a vast array of resources stored in the cafe’s back room, where all the data was kept. When Dinesh made his request, Syed quickly went to the back, gathered the data, and handed it back to Dinesh.

The client-server architecture at Tea Kadai worked seamlessly.

Dinesh, as the client, could make requests whenever he needed, and

Syed, as the server, would respond by providing the requested data.

This interaction was efficient, allowing many clients to be served by a single server at the cafe.

Docker’s Client-Server Technology

As Tea Kadai grew in popularity, it decided to expand its services to deliver data more efficiently and flexibly. To do this, they adopted a new technology called Docker, which helped them manage their operations more effectively.

Docker Client:

In the world of Docker at Tea Kadai, Dinesh still played the role of the client. But now, instead of just making simple data requests, he could request entire environments where he could test and run his applications.

These environments, called containers, were like personalized booths in the cafe where Dinesh could have his own setup with everything he needed to work on his app.

Dinesh used a special tool called the Docker Client to place his order. With this tool, he could specify exactly what he wanted in his container, such as the operating system, libraries, and applications needed for his app. The Docker Client was his interface for communicating with the cafe's new backend system.

Docker Server (Daemon):

Behind the scenes, Tea Kadai had installed a powerful system known as the Docker Daemon, which acted as the server in this setup. The Docker Daemon was responsible for creating, running, and managing the containers requested by clients like Dinesh.

When Dinesh sent his container request using the Docker Client, the Docker Daemon received it, built the container environment, and handed it back to Dinesh for use.
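
If you want to see this client-server split on your own machine, the standard docker version command prints a Client section and a Server section (the daemon), mirroring the two roles in this story:

docker version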

Docker Images:

The Tea Kadai had a collection of premade recipes called Docker Images. These images were like blueprints for creating containers, containing all the necessary ingredients and instructions.

When Dinesh requested a new container, the Docker Daemon used these images to quickly prepare the environment.
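
To browse the recipe shelf yourself, the standard commands are docker pull (fetch an image) and docker images (list the blueprints stored locally); here is a small sketch using the BusyBox image from the earlier episodes:

docker pull busybox:1.36
docker images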

Flexibility and Isolation:

The beauty of Docker at Tea Kadai was that it allowed multiple clients like Dinesh to have their containers running simultaneously, each isolated from the others. This isolation ensured that one client’s work wouldn’t interfere with another’s, just like having separate booths in the cafe for each customer. Dinesh could run, test, and even destroy his environment without affecting anyone else.

At the end,

In the vibrant city of Bangalore, Tea Kadai thrived by adopting client-server architecture and Docker's client-server technology. This approach allowed them to efficiently serve data and services while providing flexible, isolated environments for their clients. Dinesh and many others continued to come to Tea Kadai, knowing they could always get what they needed in a reliable and innovative way.

Virtual Machine and the grand HOUSE

4 August 2024 at 06:39

A Tale of Two Living Arrangements

Once upon a time, in an old town, there was a grand house that many families shared. This large house came with shared kitchens, bathrooms, utilities, and backyards.

While this communal living saved money, it also led to clashes and conflicts. Each family had different needs, and the shared space couldn’t always accommodate them all. Eventually, these families found themselves unhappy with the constant disagreements and lack of privacy.

In the digital world, this grand house scenario is akin to hosting multiple pieces of software on a single physical server. Just like the families had varied needs, different software applications often require different runtime environments. This shared environment led to conflicts and inefficiencies, causing frequent failures.

The Move to Single-Family Houses

Displeased with the conflicts, the families decided to move into their own single-family houses. Each house was self-contained, and the families enjoyed privacy and control over their living conditions.

With this isolation, each software application runs on its own machine, complete with its own operating system. This separation reduces conflicts and ensures that each application has the resources it needs.

However, just as maintaining separate houses can be costly and labor-intensive, managing multiple machines also comes with its own set of challenges.

So they all moved into an apartment complex, and here is why it suited everyone:

1. Independence and Privacy

Apartment Analogy: In an apartment, you have your own private space with your own front door, allowing you to live independently from your neighbors. You control your own environment, including how you decorate and maintain it.

VM Justification: Similarly, a virtual machine provides an isolated environment where your software runs independently of others. Each VM has its own operating system, ensuring that different applications or services do not interfere with each other. This setup offers a high level of privacy and control over the software environment.

2. Resource Allocation

Apartment Analogy: Just like an apartment building has multiple units, each with its own utilities and space, you get a designated portion of resources such as electricity and water. This separation means you’re not competing for resources with other tenants.

VM Justification: With VMs, each instance is allocated a specific share of the host server’s resources (CPU, memory, disk space). This means that your application operates with a guaranteed set of resources, reducing the risk of performance issues caused by other applications running on the same physical server.

3. Customization

Apartment Analogy: In an apartment, you have the ability to make certain customizations to your living space. You can adjust the thermostat, choose your own paint colors, and arrange furniture as you please.

VM Justification: Virtual machines offer similar flexibility. You can customize each VM’s operating system and environment according to the specific needs of your application. This customization helps in optimizing performance and configuring the system to handle specific tasks or software dependencies effectively.

4. Cost Management

Apartment Analogy: Renting an apartment often involves paying a monthly rent that covers utilities and maintenance. You have a fixed cost structure, and you can budget accordingly.

VM Justification: Using VMs can also offer predictable cost management. You pay for the resources you allocate to each VM, often with the flexibility to scale up or down based on demand. This allows you to manage expenses more effectively compared to maintaining a large physical server.

5. Shared Infrastructure

Apartment Analogy: While each apartment is independent, they share the building’s infrastructure, such as the roof, walls, and common areas. This shared infrastructure helps keep costs down for everyone.

VM Justification: Virtual machines share the same physical hardware but provide separate, isolated environments. This shared infrastructure helps reduce costs compared to having dedicated physical servers for each application or service.

6. Scalability

Apartment Analogy: If your needs change, you can move to a different apartment with more space or different amenities. The apartment complex can accommodate your changing requirements without requiring a complete overhaul.

VM Justification: Virtual machines are highly scalable. If you need more resources or additional instances, you can easily create new VMs or adjust the resources of existing ones. This flexibility allows you to adapt quickly to changing demands or workloads.

So they all lived happily.

The Botanical Garden and Rose Garden: Understanding Sets

3 August 2024 at 09:36

Introduction to the Botanical Garden

We are planning to open a botanical garden filled with flowers that will attract people to visit.

Morning: Planting Unique Flowers

One morning, we decide to plant flowers in the garden, making sure that each flower we plant is unique.


botanical_garden = {"Rose", "Lily", "Sunflower"}

Noon: Adding More Flowers

At noon, we find some more flowers and add them to the garden; the set makes sure that only flowers that aren't already there get added.

Adding Elements to a Set:


# Adding more unique flowers to the botanical garden
botanical_garden.add("Jasmine")
botanical_garden.add("Hibiscus")
print(botanical_garden)
# Output (set order may vary): {'Hibiscus', 'Rose', 'Lily', 'Sunflower', 'Jasmine'}

Afternoon: Trying to Plant Duplicate Flowers

In the afternoon, we accidentally try to plant another Rose, but the garden's rule prevents any duplicates from being added.

Adding Duplicate Elements:


# Attempting to add a duplicate flower
botanical_garden.add("Rose")
print(botanical_garden)
# Output (unchanged, the duplicate is ignored): {'Hibiscus', 'Rose', 'Lily', 'Sunflower', 'Jasmine'}

Evening: Removing Unwanted Plants

As evening approaches, we decide to remove some flowers we no longer want in the garden.

Removing Elements from a Set:


# Removing a flower from the botanical garden
botanical_garden.remove("Lily")
print(botanical_garden)
# Output: {'Hibiscus', 'Rose', 'Sunflower', 'Jasmine'}

Night: Checking Flower Types

Before going to bed, we check if certain flowers are present in the botanical garden.

Checking Membership:


# Checking if certain flowers are in the garden
is_rose_in_garden = "Rose" in botanical_garden
is_tulip_in_garden = "Tulip" in botanical_garden

print(f"Is Rose in the garden? {is_rose_in_garden}")
print(f"Is Tulip in the garden? {is_tulip_in_garden}")

# Output
# Is Rose in the garden? True
# Is Tulip in the garden? False

Midnight: Comparing with Rose Garden

Late at night, we compare the botanical garden with the rose garden to see which flowers they have in common and which are unique to each garden.

Set Operations:

Intersections:


# The rose garden
rose_garden = {"Rose", "Lavender"}

# Flowers in both gardens (Intersection)
common_flowers = botanical_garden.intersection(rose_garden)
print(f"Common flowers: {common_flowers}")

# Output
# Common flowers: {'Rose'}

Difference:



# Flowers unique to the botanical garden (Difference)
unique_flowers = botanical_garden.difference(rose_garden)
print(f"Unique flowers: {unique_flowers}")

# Output
# Unique flowers: {'Sunflower', 'Jasmine', 'Hibiscus'}


Union:



# All unique flowers from both gardens (Union)
all_unique_flowers = botanical_garden.union(rose_garden)
print(f"All unique flowers: {all_unique_flowers}")
# Output: All unique flowers: {'Sunflower', 'Jasmine', 'Hibiscus', 'Lavender', 'Rose'}

ANNACHI KADAI – The Dictionary

3 August 2024 at 09:23

In a vibrant town in Tamil Nadu, there is a popular grocery store called Annachi Kadai. This store is always bustling with fresh deliveries of items.

The store owner, Pandian, uses a special inventory system to track the products. This system functions like a dictionary in Python, where each item is labeled with its name, and the quantity available is recorded.

Morning: Delivering Items to the Store

One bright morning, a new delivery truck arrives at the grocery store, packed with fresh items. Pandian records these new items in his inventory list.

Creating and Updating the Inventory:


# Initial delivery of items to the store
inventory = {
    "apples": 20,
    "bananas": 30,
    "carrots": 15,
    "milk": 10
}

print("Initial Inventory:", inventory)
# Output: Initial Inventory: {'apples': 20, 'bananas': 30, 'carrots': 15, 'milk': 10}

Noon: Additional Deliveries

As the day progresses, more deliveries arrive with additional items that need to be added to the inventory. Pandian updates the system with these new arrivals.

Adding New Items to the Inventory:


# Adding more items from the delivery
inventory["bread"] = 25
inventory["eggs"] = 50

print("Updated Inventory:", inventory)
# Output: Updated Inventory: {'apples': 20, 'bananas': 30, 'carrots': 15, 'milk': 10, 'bread': 25, 'eggs': 50}

Afternoon: Stocking the Shelves

In the afternoon, Pandian notices that some items are running low and restocks them by updating the quantities in the inventory system.

Updating Quantities:


# Updating item quantities after restocking shelves
inventory["apples"] += 10  # 10 more apples added
inventory["milk"] += 5     # 5 more bottles of milk added

print("Inventory after Restocking:", inventory)
# Output: Inventory after Restocking: {'apples': 30, 'bananas': 30, 'carrots': 15, 'milk': 15, 'bread': 25, 'eggs': 50}

Evening: Removing Sold-Out Items

As evening falls, some items are sold out, and Pandian needs to remove them from the inventory to reflect their unavailability.

Removing Items from the Inventory:


# Removing sold-out items
del inventory["carrots"]

print("Inventory after Removal:", inventory)
# Output: Inventory after Removal: {'apples': 30, 'bananas': 30, 'milk': 15, 'bread': 25, 'eggs': 50}

Night: Checking Inventory

Before closing the store, Pandian checks the inventory to ensure that all items are accurately recorded and none are missing.

Checking for Items:

# Checking if specific items are in the inventory
is_bananas_in_stock = "bananas" in inventory
is_oranges_in_stock = "oranges" in inventory

print(f"Are bananas in stock? {is_bananas_in_stock}")
print(f"Are oranges in stock? {is_oranges_in_stock}")
# Output: Are bananas in stock? True
# Output: Are oranges in stock? False
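
One handy extra here (plain Python dict behavior, not something Pandian's script above relies on): .get() reads a quantity without risking a KeyError when an item is missing.

# Safe lookups with .get(): returns the default instead of raising KeyError
print(inventory.get("bananas", 0))   # 30
print(inventory.get("oranges", 0))   # 0 (oranges were never stocked)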


Midnight: Reviewing Inventory

After a busy day, Pandian reviews the entire inventory to ensure all deliveries and sales are accurately recorded.

Iterating Over the Inventory:


# Reviewing the final inventory
for item, quantity in inventory.items():
    print(f"Item: {item}, Quantity: {quantity}")

# Output:
# Item: apples, Quantity: 30
# Item: bananas, Quantity: 30
# Item: milk, Quantity: 15
# Item: bread, Quantity: 25
# Item: eggs, Quantity: 50

Python-FUNDAMENTALS: The Print()

5 July 2024 at 07:09

Welcome to the world of ParottaSalna!

One of the first things you’ll learn in any programming language is how to display output on the screen.

In Python, we do this using the print function. It’s simple, yet powerful. Let’s explore the various ways you can use print in Python.

1. The Basics: Printing a String

To print text, you simply put the text inside double or single quotes and pass it to the print function.

print("Hello, world!")

This will display: Hello, world!

2. Printing Variables

Variables are used to store data. You can print variables by passing them to the print function.

name = "Parotta Salna"
print(name)

This will display: Parotta Salna

3. Printing Multiple Items

You can print multiple items by separating them with commas. Python will add a space between each item.

age = 25
city = "New York"
name = "Parotta Salna"
print("Name:", name, "Age:", age, "City:", city)

This will display: Name: Parotta Salna Age: 25 City: New York

4. Formatted Strings with f-strings

An f-string is a way to format strings in Python. You can insert variables directly into the string by prefixing it with an f and using curly braces {} around the variables.

age = 25
city = "New York"
name = "Parotta Salna"
print(f"Name: {name}, Age: {age}, City: {city}")

This will display: Name: Parotta Salna, Age: 25, City: New York

5. Concatenation of Strings

You can also combine (concatenate) strings using the + operator.

main_dish = "Idly"
side_dish = "Sambar"
print(main_dish + " " + side_dish + "!")

This will display: Idly Sambar!

6. Using Escape Sequences

Escape sequences allow you to include special characters in a string. For example, \n adds a new line.

print("Line1\nLine2\nLine3")

This will display:

Line1
Line2
Line3

7. Printing Quotes Inside Strings

To print quotes inside a string, you can use either single or double quotes to enclose the string and the other type of quotes inside it.

print('He said, "Hello, world!"')

This will display: He said, "Hello, world!"

8. Raw Strings to Ignore Escape Sequences

Prefix the string with r to treat backslashes as literal characters.

print(r"C:\Users\Name")

This will display: C:\Users\Name

9. Printing Numbers

You can print numbers directly without quotes.

print(12345)

This will display: 12345

10. Printing Results of Expressions

You can also print the result of an expression.

print(5 + 3)

This will display: 8

11. Printing Lists and Dictionaries

You can print entire lists and dictionaries.

fruits = ["apple", "banana", "cherry"]
print(fruits)

person = {"name": "Alice", "age": 25, "city": "New York"}
print(person)

This will display:

['apple', 'banana', 'cherry']

{'name': 'Alice', 'age': 25, 'city': 'New York'}

12. Using sep and end Parameters

The sep parameter changes the separator between items, and end changes the ending character (default is newline).

print("Hello", "world", sep="-", end="!")


This will display: Hello-world!

13. Multiline Strings with Triple Quotes

Triple quotes allow you to print multiline strings easily.

print("""This is a
multiline
string""")

This will display:

This is a
multiline
string

14. Printing in a Loop

You can use a loop to print multiple lines.

for i in range(5):
    print("Iteration", i)

This will display:

Iteration 0
Iteration 1
Iteration 2
Iteration 3
Iteration 4

15. String Multiplication

Multiply a string by an integer to repeat it.

print("Hello " * 3)

This will display: Hello Hello Hello

16. Printing Boolean Values

Print boolean values directly.

is_active = True
print(is_active)

This will display: True

17. Printing None

Print the None value directly.

value = None
print(value)

This will display: None

18. Combining Strings and Variables

Combine strings and variables using + for simple cases or formatted strings for more complex scenarios.

temperature = 22.5
print("The temperature is " + str(temperature) + " degrees Celsius.")

This will display: The temperature is 22.5 degrees Celsius.

19. Using print for Debugging

You can use print to debug your code by printing variable values at different points.

def add(a, b):
    print(f"Adding {a} and {b}")
    return a + b

result = add(5, 3)
print("Result:", result)

This will display:

Adding 5 and 3
Result: 8

20. Printing with .format()

Use the .format() method for string formatting.

print("Name: {}, Age: {}, City: {}".format(name, age, city))

This will display: Name: Parotta Salna, Age: 25, City: New York

Exercises:

Quiz:

https://docs.google.com/forms/d/e/1FAIpQLSeW7dGCYrvPXBK7llexbwa_yImFQWFiHHE4c4ATOk-NwJWxIw/viewform?usp=sf_link

Infographics:
